As requested, I provide a pull request for the issue #73
It compiles, but I could not test, because I don't have a local hadoop installation.
It would be nice if someone could give it a try and report any issues that may pop up during testing.
It would be also very interesting to know if there are any performance improvements.
Note: Right now I use UnsafeInput and UnsafeOutput constructors which do not use explicit buffer-size parameter and therefore Kryo will use the default size of 4096. It could be interesting to experiment with different buffer sizes to see how it affects performance.
Make use of latest Kryo snapshot which provides Unsafe-based streams
It would be good to add a couple of basic Hadoop tests using the HadoopSerializationFactory here to make sure that all's well. I'll add an issue.
Thanks for doing this.
Worried about bumping the Kryo version to a SNAPSHOT since we have a lot of internal code on this.
Is 2.22 the only one with Unsafe?
Yes, 2.22-SNAPSHOT is the first one with Unsafe. And of course I do not propose to release Chill version depending on a SNAPSHOT version of Kryo.
Let me explain why this Kryo snapshot it is not released as a version yet. IMHO, since the Unsafe-support introduces a lot of changes into Kryo code-base , Nate is a bit afraid of making it an official release before some of the bigger projects using Kryo tried it out and confirmed that it does not break existing code and really provides performance improvements. And many users wait for 2.22 to be released before they can test. It is a catch22 situation ;-)
Ironically, some other projects "copy&pasted" this implementation and already put it into their trunks and releases, e.g. Avro and Hazelcast.
Therefore, my suggestion is to test chill with my changes using this pull request. It could be done even on a dedicated branch. Should it work fine, we could report it back to Nate and it may make it easier for him to release 2.22. And then Chill Unsafe-support can be merged into Chill trunk as well.
What do you think?
I'm curious, now that 2.22 is out, do you guys plan to switch to it in Chill 0.3.4?
@mateiz we have been running a fixed snapshot in production, but we have not yet tested 2.22 final. Have you guys tried it? If it is not binary compatible, we need to push to 0.4.0.
pull req here with a bump would speed us up. :)
We haven't tested it yet. I think the Unsafe based format in 2.22 is not compatible with the normal Kryo one, but the normal one is compatible with 2.21. So people who want to use Unsafe won't be able to load old data. Anyway, I'll let you know if we experiment with it.