[jvm-packages] enable deterministic repartitioning when checkpoint is enabled #4807
trams left a comment
LGTM in general.
Why do we need to add a feature value at all to our hash function?
If adding a value from this vector makes the hashing better, why doesn't Murmur do this? Is it because Murmur is essentially streaming?
If this method does not make the hashing "better", is there some other reason for it that I am missing?
Why can't we just use a hash value and HashPartitioner? That should simplify the logic.
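
To make the last question concrete, here is a minimal sketch of what "a hash value plus HashPartitioner" amounts to. It is plain Java with no Spark dependency and is not the code in this PR: `rowHash` and `partitionFor` are hypothetical names, and `Arrays.hashCode` stands in for whatever hash (for example Murmur) the PR actually uses. The only Spark behavior assumed is that `HashPartitioner` assigns a key to the non-negative modulo of its `hashCode` by the number of partitions.

```java
import java.util.Arrays;

public class DeterministicPartitionSketch {
    // Hypothetical helper: hash the whole feature vector, so rows with the
    // same content always get the same key regardless of arrival order.
    // Arrays.hashCode is a stand-in for the hash used in the PR.
    static int rowHash(float[] features) {
        return Arrays.hashCode(features);
    }

    // Non-negative modulo of the hash, the rule HashPartitioner applies
    // to key.hashCode when choosing a partition.
    static int partitionFor(int hash, int numPartitions) {
        return Math.floorMod(hash, numPartitions);
    }

    public static void main(String[] args) {
        float[] row = {0.5f, 1.25f, -3.0f};
        int first = partitionFor(rowHash(row), 4);
        int second = partitionFor(rowHash(row.clone()), 4);
        // Same content gives the same partition on every run.
        System.out.println(first == second);
    }
}
```

With this approach the partition depends only on the row's content, so a rerun after a checkpoint restore would place each row in the same partition without mixing an extra feature value into the key.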