Portability: add support for big endian and need-aligned-load-store architectures #26
Comments
Just a little factoid to create additional motivation to fix this issue: I just noticed that you have become a dependency of git-annex, which is a popular tool on little NAS devices which often run arm or mips. |
See https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=796800 |
I applied that patch to the Debian directory, and it seems to help on arm64, armel, kfreebsd-i386, ppc64el; compare https://buildd.debian.org/status/logs.php?pkg=haskell-cryptonite&ver=0.6-1&suite=experimental with https://buildd.debian.org/status/logs.php?pkg=haskell-cryptonite&ver=0.6-2&suite=experimental. It still fails for the big-endian architectures: powerpc, s390x, ppc64. It also uncovered a bug that quickcheck seems to find only occasionally and that seems to be independent of this patch; I opened issue #29 for it. |
@egrimley: the intent of the arch section detecting the endianness is for endian-constant architectures like x86/amd64 to not have to go through any 'if byteOrder then x else y' branching code. Sadly, in some personal experiments on an unrelated but similar case, GHC was not able to optimise the endianness test away. For non-endian-constant architectures like mips/arm, the goal was to do this 'if byteOrder then .. else ..' business (not to autodetect/hardcode the endianness of the arch). Also, one more thing: cryptonite shouldn't depend on any extra packages. |
If I understand correctly, you're suggesting that the cabal file distinguish three cases (little-endian arch, big-endian arch, variable-endian arch) and then the Haskell code, likewise, have three implementations. That sounds rather complex and difficult to maintain. It might be worth checking if you can, somehow, get GHC to optimise away the endianness test, and then include the appropriate part of the byteorder package (which is tiny) if you don't want the additional dependency. |
It seems that the functionality is also available in your |
Honestly, expectation and reality often mismatch in the case of GHC (especially considering the broad range of versions supported); I've seen some pretty horrible stuff happening, so I've learned to be paranoid/careful. |
From https://github.com/vincenthz/hs-memory/blob/master/memory.cabal I get the impression that memory might have the same problems as cryptonite. If there is an easy way of getting GHC to optimise away endianness tests then it doesn't seem to be widely known. This page only suggests template Haskell: http://stackoverflow.com/questions/29349835/conditionally-compiling-based-on-endianness So perhaps a good approach would be to use an inefficient but portable run-time endianness test but push the test up the call tree so that it doesn't hurt performance (much). So instead of having be32Prim test for endianness you'd get mutableArray32FromAddrBE to test for endianness and select between two implementations, with and without byteswapping. |
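To illustrate the "push the test up the call tree" idea from the comment above, here is a minimal sketch in C rather than Haskell. The names `host_is_little_endian`, `swap32` and `words32_from_be` are hypothetical and are not taken from cryptonite or memory; the point is only that the endianness check runs once per buffer instead of once per word.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Run-time endianness probe: look at the first byte of a known 32-bit value. */
static int host_is_little_endian(void)
{
    const uint32_t probe = 1;
    unsigned char first;
    memcpy(&first, &probe, 1);
    return first == 1;
}

/* Reverse the byte order of one 32-bit word. */
static uint32_t swap32(uint32_t w)
{
    return (w >> 24) | ((w >> 8) & 0x0000ff00u)
         | ((w << 8) & 0x00ff0000u) | (w << 24);
}

/* Hypothetical helper: copy n big-endian 32-bit words from src into dst,
   converting them to host byte order. The endianness test is done once
   per call, outside the loop, rather than inside a per-word primitive. */
static void words32_from_be(uint32_t *dst, const unsigned char *src, size_t n)
{
    memcpy(dst, src, n * sizeof *dst);   /* works for any alignment of src */
    if (host_is_little_endian()) {
        for (size_t i = 0; i < n; i++)
            dst[i] = swap32(dst[i]);
    }
}
```

On a big-endian host the loop is skipped entirely, so the only cost of portability is one cheap test per call.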
@egrimley I've never seen gcc turn unoptimised 8-bit accesses into an optimised 32-bit access, but I haven't done any benchmarks of that sort of optimisation in many years ;). Also, by default I would just write the optimised case without really thinking about it. GHC is much more archaic in those low-level machine optimisations. |
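For reference, the "unoptimised 8-bit access" style under discussion looks roughly like the sketch below. The function names `load_be32` and `store_be32` are illustrative, not taken from cryptonite. The code is independent of host endianness and alignment; whether a given compiler merges the four byte accesses into one word access is an assumption to verify per compiler and version, not something this sketch guarantees.

```c
#include <stdint.h>

/* Read a big-endian 32-bit value one byte at a time.
   Correct on any host byte order and for any alignment of p. */
static uint32_t load_be32(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16)
         | ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Write a 32-bit value in big-endian order one byte at a time. */
static void store_be32(unsigned char *p, uint32_t v)
{
    p[0] = (unsigned char)(v >> 24);
    p[1] = (unsigned char)(v >> 16);
    p[2] = (unsigned char)(v >> 8);
    p[3] = (unsigned char)v;
}
```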
Ok, I checked: This code
optimizes the check away, leaving |
@vincenthz I hope you realise that the "optimised" C code you're referring to is probably incorrect and that a grumpy C compiler could do something very nasty with it, like assume that the bad code is unreachable and optimise the entire program on that assumption, along the lines of if (x) *(uint32_t *)p = ... "That memory access would be illegal, so now we know that x must be zero, so..." A few years ago you wouldn't have to worry about that but modern C compilers really do stuff like that. |
@nomeata So Data.Memory.Endian does the job, and it seems to do it by #including "MachDeps.h", which is supplied by GHC. That's the quick and easy solution, then. Thanks. |
@egrimley: it's incorrect to use {,double,quad}word memory access? Can you point me to the relevant information? |
In general you're not allowed to dereference a pointer unless the pointer value was obtained by taking the address of an object with a compatible type. So I was thinking of a situation in which the compiler can see that p was not obtained thus. You're pretty safe, in practice, if p is an argument passed from code that was generated separately, though if you have char *p and you use both *(int *)p and *(int *)(p + 15) then the compiler can deduce that one of those is unaligned and therefore undefined (regardless of whether the hardware allows unaligned memory access). Undefined behaviour invalidates the entire execution, which is why the compiler can assume that x is zero not just after the if (x) *(uint32_t *)p = ... but beforehand, too. By the way, I'm not an expert language lawyer, so I might be erring on the side of caution here... |
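One common way to sidestep the question entirely is to go through `memcpy` instead of casting the pointer. The sketch below uses hypothetical names (`load_u32_unaligned`, `store_u32_unaligned`) and is not code from cryptonite. It relies on two points of the C standard as I understand them: converting a pointer to a type for which it is not correctly aligned is undefined (C11 6.3.2.3), whereas copying the object representation with `memcpy` is defined for any address.

```c
#include <stdint.h>
#include <string.h>

/* Load a 32-bit value in host byte order from an arbitrary address.
   No pointer cast to uint32_t* is involved, so neither alignment nor
   the effective-type (aliasing) rules are violated. */
static uint32_t load_u32_unaligned(const void *p)
{
    uint32_t v;
    memcpy(&v, p, sizeof v);
    return v;
}

/* Store a 32-bit value in host byte order to an arbitrary address. */
static void store_u32_unaligned(void *p, uint32_t v)
{
    memcpy(p, &v, sizeof v);
}
```

The result is in host byte order, so a byte swap is still needed where the data format and the host disagree.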
Unaligned access is never undefined. The compiler could maybe decide to change the access method (switch to instructions that use bytes), but I don't really see the compiler removing instructions because it thinks they are invalid. |
@nomeata I think most issues should be fixed; any chance you can check that the latest version works properly on second-tier archs? |
cryptonite-v0.7 seems to require memory >=0.8 (for B.snoc and
Base64OpenBSD) while Debian only has haskell-memory 0.7-3 in experimental
so it might be easier to wait a bit.
|
tracking issue for supporting more architectures
#24 #25