Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Map B=6 support. #65

Merged
merged 12 commits into from
Nov 25, 2018
Merged

Map B=6 support. #65

merged 12 commits into from
Nov 25, 2018

Conversation

fedormatantsev
Copy link
Contributor

I still have to refine the implementation, but the idea is clear.
I have some issues with building tests, will continue tomorrow.

Looking forward for the review :)

using bitmap_t = std::uint32_t;
using count_t = std::uint32_t;
using shift_t = std::uint32_t;
};
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @fedormatantsev!
One thing that is important: only bitmap_t depends on B! The other types should remain outside of this types template. In this way, you can also keep the bits_t type, which you have removed. Also in that way you need to change way less things overall in the code. There are very few mentions to bitmap_t in the code, which is the only type dependent on B.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, sounds reasonable.

#endif
}

inline count_t<6> popcount(bitmap_t<6> x)
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of overloading popcount for different bitmap_t<X>, I would suggest defining the overloads in terms of the underlying representation types. This is, to provide functions:

std::uint32_t popcount(std::uint32_t x);
std::uint64_t popcount(std::uint64_t x);
...

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Otherwise you are unnecesarily coupling a general function (popcount) to implementation details of hamts (the notion of bitmap). Does this make sense?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good suggestion, thanks!

# else
return __builtin_popcount(x);
# endif
#else
// More alternatives:
// https://en.wikipedia.org/wiki/Hamming_weight
// http://wm.ite.pl/articles/sse-popcount.html
// http://graphics.stanford.edu/~seander/bithacks.html#CountBitsSetParallel
x = x - ((x >> 1) & 0x55555555);
x = (x & 0x33333333) + ((x >> 2) & 0x33333333);
return ((x + (x >> 4) & 0xF0F0F0F) * 0x1010101) >> 24;
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Be careful, this fallback only works for 32 bits, not for any size of bitmap!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I will adjust constants based on wiki page
https://en.wikipedia.org/wiki/Hamming_weight

int popcount64c(uint64_t x)
{
    x -= (x >> 1) & m1;             //put count of each 2 bits into those 2 bits
    x = (x & m2) + ((x >> 2) & m2); //put count of each 4 bits into those 4 bits 
    x = (x + (x >> 4)) & m4;        //put count of each 8 bits into those 8 bits 
    return (x * h01) >> 56;  //returns left 8 bits of x + (x<<8) + (x<<16) + (x<<24) + ... 
}

@codecov-io
Copy link

codecov-io commented Nov 24, 2018

Codecov Report

Merging #65 into master will increase coverage by 0.04%.
The diff coverage is 100%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master      #65      +/-   ##
==========================================
+ Coverage   94.45%   94.49%   +0.04%     
==========================================
  Files          84       84              
  Lines        6901     6902       +1     
==========================================
+ Hits         6518     6522       +4     
+ Misses        383      380       -3
Impacted Files Coverage Δ
immer/set.hpp 100% <ø> (ø) ⬆️
immer/map.hpp 91.17% <ø> (ø) ⬆️
immer/detail/hamts/champ.hpp 93.98% <100%> (+1.38%) ⬆️
immer/detail/hamts/bits.hpp 100% <100%> (ø) ⬆️
test/map/generic.ipp 100% <100%> (ø) ⬆️
immer/detail/hamts/node.hpp 97.27% <100%> (ø) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 3eb9a0c...a5051cd. Read the comment docs.

@fedormatantsev
Copy link
Contributor Author

At this point I'd like to discuss a special test to check if two values are not colliding when their hash differs only in the last 32 bits. I basically have no idea how to make it, but it actually makes sense to me since tests shouldn't fail even without my changes... I think it requires some kind of introspection for internal data structure

@arximboldi
Copy link
Owner

Hmmm, I don't think that a test needs to instrospect the internal data-structure. I prefer tests that only use the observable interface. There are already tests that generate collisions by overriding the hash function passed. I think it is ok to just reuse the same test-suite and use it for different B's. The tests are already generic, you can look at immer/test/set/B3.cpp to see how the tests are instantiated for B=3, just create a new file in that folder B6.cpp with the appropriate instantiation.

Once again, thanks for doing this work!

@fedormatantsev
Copy link
Contributor Author

fedormatantsev commented Nov 25, 2018

I already did B6 tests for map/set =) But the point is not to check if values with hash collision work, but, on the contrary, values with different hashes don't collide

I agree test shouldn't know too much about internals

@fedormatantsev fedormatantsev changed the title [WIP] Map B=6 support. Map B=6 support. Nov 25, 2018
@arximboldi
Copy link
Owner

Nice! This is starting to look very good. It looks to me like it is almost ready, right?

There are a couple of suggestions that I still have. The first one is more stylistic: can you define an alias to the right using bitmap_t = ... inside struct champ and another one inside struct node? In this way, you don't have to tell in all the places bitmap_t<B> but just bitmap_t, reducing the amount of change but also the amount of places that need to be concerned with dealing with B, reducing possibility of mistakes.

The other thing that I don't like a lot is the get_bit function. I like to be able to see the bit math directly in the code. I would suggest to just use bitmap_t{1} << ... directly in the code instead of going through a function.

@fedormatantsev
Copy link
Contributor Author

Yeah, I think it's ready to be merged.
Some iterations over it might be useful anyway (and I'm ready to iterate(: ).

Feel free to squash&merge the PR whenever you think it's OK.

@arximboldi
Copy link
Owner

This is looking great! I am happy to merge if you want, but I can also wait if you have some other changes in mind. One potential line improvement would be to use 16bit bitmaps for B=4 and 8bit bitmaps for B=3, but it is not a condition for merge. Great work!

@fedormatantsev
Copy link
Contributor Author

I think I'd like to implement your suggestion in scope of another weekend-PR. We can file an issue and assign it to me so I'll not forget to do it.

Let's wait for tests to complete and merge it!

@arximboldi arximboldi merged commit 1c8fdac into arximboldi:master Nov 25, 2018
@arximboldi
Copy link
Owner

Done, thanks a lot!

@fedormatantsev
Copy link
Contributor Author

Cool, thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants