Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Guava Table support and smaller fixes #108

Merged
merged 6 commits into from Jan 26, 2019
Merged

Guava Table support and smaller fixes #108

merged 6 commits into from Jan 26, 2019

Conversation

maraswrona
Copy link
Contributor

TL;DR;

I noticed that the Guava's Table structure is not directly supported by this library. I need this in my project, so I decided to share my work.

What's inside:

  • serializers for all guava's Table implementations, including: ArrayTable, HashBasedTable, TreeBasedTable and ImmutableTable
  • a fix to TreeMultimapSerializer to include key and value comparators (this allows to serialize a TreeMultimap together with non-standard comparators)
  • a fix to ImmutableMapSerializer to include some private implementations used by DenseImmutableTable

Details

So I'm using kryo-serializers + subzero + hazelcast + guava. I noticed I couldn't serialize the ImmutableTable properly so I decided to investigate why. Surprisingly some of the Table implementations are serialized just fine (e.g. HashBasedTable) but that is only because they are handled by the generic FieldSerializer, so their private fields are basically inferred and serialized. The reason why this doesn't work for ImmutableTable implementation is that it internally uses some private implementation of ImmutableMap (e.g. Column or Row) which in turn are serialized by the FieldSerializer as regular Map, and fail during deserialization.

This PR fixes both sides of this problem: it adds direct support for Table implementations (including ImmutableTable) as well as registers additional private implementations of ImmutableMap with ImmutableMapSerializer.

Additionally I noticed that the TreeMultimapSerializer didn't cover non-default comparators. I added support for those in TreeBasedTableSerializer, so decided to fix it for TreeMultimapSerializer too.

Please take a look and review before merging. I'll appreciate any feedback!

@coveralls
Copy link

Coverage Status

Coverage increased (+1.2%) to 87.158% when pulling e7d7ce7 on maraswrona:master into c8cc4f8 on magro:master.

@coveralls
Copy link

coveralls commented Jan 21, 2019

Coverage Status

Coverage increased (+1.3%) to 87.256% when pulling 655686f on maraswrona:master into c8cc4f8 on magro:master.

@magro
Copy link
Owner

magro commented Jan 21, 2019

Awesome! Can you please also update the readme?

@maraswrona
Copy link
Contributor Author

I updated the readme.

I also changed the implementation of Table serialization strategy. The first attempt was to serialize a collection of cells as R + C + V triplets. I've run some benchmarks and the output produced by this approach is 10% bigger than what you get from the default FieldSerializer. It's especially inefficient for very dense tables which are filled e.g. 100%. Then basically we serialize rows and columns multiple times unnecessarily.

Now the strategy goes through the rows and through the columns for each row in nested loop and saves row first, and then its colums and values. It is 5% more efficient than the FieldSerializer and 12% more efficient than the previous implementation.

I keep quite a lot of these Table objects in my Hazelcast cluster, so it matters to me if they don't occupy to much memory, so I wanted to measure this. I hope I didn't make any mistakes.

@magro magro merged commit eae2cf6 into magro:master Jan 26, 2019
@magro
Copy link
Owner

magro commented Jan 26, 2019

Great, thanks! I just pushed version 0.44 to maven central.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants