Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix alignment restrictions #269

Merged
merged 3 commits into from
Jul 3, 2016
Merged

Conversation

iliastsi
Copy link
Contributor

The testsuite for conduit-extra fails on armhf architectures because of unaligned memory access.

According to the docs, the peek and poke functions might require properly aligned addresses to function correctly. This is architecture dependent; thus, portable code should ensure that when peeking or poking values of some type a, the alignment constraint for a, as given by the function alignment is fulfilled.

Here I provide two patches that ensure that alignment constraints are fulfilled. The implementation requires one extra copy, but makes the code portable.

I have patched the code in two steps, first the sinkStorable implementation and then the testsuite. This hopefully demonstrates that the patches introduce no regressions.

@snoyberg
Copy link
Owner

I'm not that familiar with the code in question (I inherited it from elsewhere). Two questions:

  1. I'm surprised that the test suite needed patching. Does this mean that the bytestring functions do not, in fact, allocate aligned buffers?
  2. This will negatively impact performance on all platform that do not require alignment, correct? Is there any way to use conditional compilation to avoid that overhead?

@snoyberg
Copy link
Owner

Correction to the previous comment, I am actually the author of this code, I thought I was looking at something else. Still, original concern remains: can we avoid the performance hit of copying on systems that don't require alignment?

@iliastsi
Copy link
Contributor Author

Hi,

I'm surprised that the test suite needed patching. Does this mean that the bytestring functions do not, in fact, allocate aligned buffers?

In the test suite, the pokeByteOff function is used in order to fill the bytestring with a list of SomeStorable. This causes unaligned access in case, for example, the first item is of type Word8 and the second of type Double. This has nothing to do with the alignment of the bytestring itself.

This will negatively impact performance on all platform that do not require alignment, correct? Is there any way to use conditional compilation to avoid that overhead?

I am not an expert on this, so I am not sure how bad the impact on the performance would be. From what I understand, all architectures impose alignment constraints on memory access, but some of them are able to correct an unaligned one transparently, with a performance cost. Other times, the kernel may handle the exception and fall-back to copying data (see also unaligned-memory-access from Linux kernel).

I couldn't find a list of architectures where the above properties are described, so I thought it would be safer to enforce aligned access to all of them. If we could find such a document, then we could enable this feature only for the affected architectures, using the preprocessor.

What do you think?

@snoyberg
Copy link
Owner

In our performance testing for Store, we found that on x86 architectures at least, unaligned access did not negatively hurt performance. I'd be happy to go with a whitelist approach here and just say that x86 (32- and 64-bit) continue with the code as-is, and all other architectures use the copy/align logic you've implemented here.

@iliastsi
Copy link
Contributor Author

Yes, that sounds great! Do you want me to implement the whitelisting and update the PR, or would you like to do it yourself?

I also noticed that mapM_BS uses the peek function as well, but since the testsuite now passes, I didn't change that. Do you believe we should change that as well?

P.S.: We applied the patch in Debian and the testsuite now passes for all architectures.

@snoyberg
Copy link
Owner

If you could update it, that would be great. For the mapM_BS function, I think alignment isn't an issue, since we're just peeking one byte, which should have an alignment of 1.

The effects of performing an unaligned memory access vary from
architecture to architecture. Add the ALLOW_UNALIGNED_ACCESS macro that
denotes that unaligned memory access is allowed for the architecture for
which we are building. For now, unaligned memory access is only allowed
for the x86_64 and i386 architectures.
sinkStorable uses the peek function to read a value from the given
memory location but doesn't pay attention to the alignment constraints.
This makes the code to fail under the armhf architecture (thus making
the code non-portable).

Change the sinkStorableHelper function to use a safer version of the
peek function which first copies the data byte-for-byte to a new,
properly aligned location, and then reads the value. The code reverts to
the old implementation in case ALLOW_UNALIGNED_ACCESS is defined.
Modify the someStorables function to take into account the alignment
constraints when using the pokeByteOff function.
@iliastsi
Copy link
Contributor Author

Sorry for the delay, I have updated this PR.
Feel free to make any changes you like in order to match you personal coding style.

Also, I am a little worried about the use of the inlinePerformIO function. The docs seem to suggest that this function should not be used. Please consider replacing it with the unsafePerformIO.

@snoyberg
Copy link
Owner

snoyberg commented Jul 3, 2016

Before getting too deep into inlinePerformIO, I'd recommend reading more about it. I actually wrote a fairly long article for the GHC wiki on the subject: https://wiki.haskell.org/Evaluation_order_and_state_tokens

@snoyberg snoyberg merged commit 46e293b into snoyberg:master Jul 3, 2016
@iliastsi iliastsi deleted the feature-alignment branch July 4, 2016 06:52
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants