packlib.c: make dictionaries independent of byte order#41
packlib.c: make dictionaries independent of byte order#41kanavin wants to merge 1 commit intocracklib:mainfrom
Conversation
The previous dict files are NOT byte-order independent, in fact they are probably arch-specific. Create the dict files in big endian, and convert to host endian while loading them. This should fix the endian issues on multiple platforms. We can't use the endian.h, htobe* and be*toh functions because they are not available on older versions of glibc, such as that found in RHEL 5.9. Change to checking endian and directly calling bswap_* as defined in byteswap.h. Signed-off-by: Hongxu Jia <hongxu.jia@windriver.com> Signed-off-by: Mark Hatle <mark.hatle@windriver.com> Signed-off-by: Lei Maohui <leimaohui@cn.fujitsu.com>
|
@yixiangzhike @jandd @rra @vapier @drfiemost @srcshelton I think this request is useful - however, would want to see some input from others on whether it will cause any impact to compatibility, both across platforms, and for existing deployments where folks have built their own indexes. Will this read existing indexes as-is, or will they have to be rebuilt? It looks (without going deep into changes) that you're using existing header, and it should just work, but would want to make sure of that before merging. |
|
@nneul I do think that this is a good idea, but it will definitely require maintainer scripts in the Debian package to rewrite the dictionary files, I think this is possible but we should mark this as a breaking change in release notes. |
|
Just to be clear, I did not write the patch, @hongxu-jia did, almost 10 years ago. I only made the upstream submission as an ongoing effort to reduce the amount of custom patches yocto project carries. This one in particular is maintained here: |
that isn't what the git commit metadata says. you set the author to "Lei Maohui". as written, it has aliasing problems, uses old/non-standard APIs, and really could use simplification (e.g. define "tocpu" helpers instead of inlining the same "is LE then bswap" logic everywhere). i would also argue that, if we're picking a single endian format for everyone, it really should be little endian. i get that "network endian" is big endian, but i think it's safe to say that the vast majority of systems that will be parsing these data files are going to be little endian. x86 & arm(LE) are by far the dominant desktop/server beasts. i'm not too worried about the format changing ... did we commit to ABI stability anywhere ? we already have to regenerate when new dictionary files are installed right ? so distros are already doing this. Gentoo regens everytime it updates cracklib. |
|
to be clear, I'm in favor of the idea, just not this patch/implementation |
|
Patch authorship was incorrectly reassinged here: I'm fine if it's not acceptable as it is. We submit patches without expectation that they merge; the minimum goal is to make upstreams aware of the issues downstream is facing. |
|
Any progress on this PR? |
|
it sounds like the reporter has no intention of fixing issues, just throwing patches they didn't author over the wall. so let's close this out. |
|
The intention is to make upstream aware of the issues downstream is facing. I guess opening a ticket would be better received? As things stand we have to continue carrying and rebasing the patch, which isn't optimal for you or us. |
|
as i said above, the idea is fine, but the patches (both content & metadata) need improving. if you're spending time rebasing and want to merge things upstream, then the issues need addressing. if you don't intend on addressing the problems, then opening an issue is fine. i'll note again that the way you're handling these patches, especially the metadata, is not kosher. this isn't specific to cracklib -- not correctly attributing authorship & ownership is antithetical to open source. |
|
Ticket filed. If you can lay out a sketch of what the eventual patch should do (in the ticket), would be appreciated. Sadly, I simply don't have time to understand the cracklib codebase and work on making this patch better. There's still just over 250 patches (of varying quality) we need to submit to various upstreams, on top of everything else that needs maintenance. Some time ago it was double that amount, so overall progress is happening despite a few local snags like this one. None of the patches are written by me (I follow strict upstream-first rule in all of my work), and their authors have by and large vanished since. Yes, sometimes it means that sloppy work gets submitted without commitment to drive it to merging, and upstreams get annoyed. It's still better than not submitting: it prevents the mass of patches from growing uncontrollably to the point where we can't sustain it. Regarding ownership/authorship, we keep component patches as files in a git tree, and a 3rd party contributor rewrote the content of the patch headers in the file, while fixing some other issue with what the patch does. Authorship was incorrectly reassigned by that, and certainly wasn't a deliberate act. This change then had slipped through review, and all traces of who was the original author vanished, many many years ago: |
The previous dict files are NOT byte-order independent, in fact they are
probably arch-specific.
Create the dict files in big endian, and convert to host endian while
loading them. This should fix the endian issues on multiple platforms.
We can't use the endian.h, htobe* and be*toh functions because they are
not available on older versions of glibc, such as that found in RHEL
5.9.
Change to checking endian and directly calling bswap_* as defined in
byteswap.h.
Signed-off-by: Hongxu Jia hongxu.jia@windriver.com
Signed-off-by: Mark Hatle mark.hatle@windriver.com
Signed-off-by: Lei Maohui leimaohui@cn.fujitsu.com