Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Eliminate seed from XXH32_state_s #79

Closed
stbrumme opened this issue Aug 4, 2016 · 2 comments
Closed

Eliminate seed from XXH32_state_s #79

stbrumme opened this issue Aug 4, 2016 · 2 comments

Comments

@stbrumme
Copy link

stbrumme commented Aug 4, 2016

The seed variable is stored in the XXH32_state_s struct because it's used when XXH32_digest_endian() is called after less than 16 bytes. However, that means that state->v1...v4 are is still unmodified and especially state->v3 has it's initial value which was state->seed.
If you replace line 665 by "h32 = state->v3 + PRIME32_5;" then we can get rid of state->seed.

See my xxHash implementation at http://create.stephan-brumme.com/xxhash/ (written from scratch) for a proof-of-concept.

I haven't analyzed xxHash64 yet, but after a quick look at your code it seems we could apply the same trick.

@Cyan4973
Copy link
Owner

Cyan4973 commented Aug 6, 2016

Thanks @stbrumme, this is an excellent idea.

While it's certainly possible to implement it, I'm nonetheless worried about a possible ABI break when reducing structure size.
A bad scenario would be as follows :

  • Program uses the new API/ABI with smaller structure size. It allocates enough space for the new structure.
  • It then links to a library which expects the older, larger, size. During initialization, library writes / read beyond allocated buffer. Segfault.

This scenario is supposed to be explicitly forbidden in xxhash.h,
but it's difficult to guarantee everyone follows the rules.

So a cautious transition scenario could be as follows :

  • Implement your idea, remove seed, to be replaced by state->v3 instead.
  • Preserve the structure size, by adding a "reserved" field at the end
    • Make sure the "reserved" field is never read nor written within library
  • Sometimes in the future, reduce structure size, by removing the "reserved" field

It should reduce risks in case of version mismatch, by allowing an "overlap" period, where the structure is compatible with older implementations, and implementation is compatible with future structure.

For discussion

Cyan4973 added a commit that referenced this issue Aug 6, 2016
@Cyan4973 Cyan4973 mentioned this issue Aug 10, 2016
@Cyan4973
Copy link
Owner

Partly implemented in v0.6.2.
Tail space in now "reserved", neither read nor written.
Should be possible to remove it completely in a few versions.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants