We have noticed that Rubix performs better on Spark than on Presto. We also need to get a sense of similar projects such as Raptor, the LLAP cache, and Spark RDDs. This will help us match and then surpass the performance of these projects.
Design a Rubix client that can interact with Rubix without needing an engine such as Spark or Presto. Such a client would also be useful in product tests. The client should be forward compatible, so that it can keep working as the servers become more sophisticated.
The node that a file is allocated to is decided by consistent hashing. The main purpose of consistent hashing is to minimize changes in membership (the mapping between files and nodes) when a new node is added or an existing node is removed.
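To make the membership property concrete, here is a minimal sketch of a consistent hash ring in Python. It is an illustration of the general technique, not Rubix's actual implementation: the `HashRing` class, the MD5-based `ring_hash` helper, the use of 100 virtual points per node, and the node and file names are all assumptions made for this example.

```python
import bisect
import hashlib


def ring_hash(key: str) -> int:
    """Map a string to a position on the ring (first 8 bytes of its MD5)."""
    return int.from_bytes(hashlib.md5(key.encode("utf-8")).digest()[:8], "big")


class HashRing:
    """Hypothetical consistent hash ring with virtual points per node."""

    def __init__(self, nodes=(), replicas=100):
        self.replicas = replicas  # virtual points per node, smooths the load
        self._points = []         # sorted ring positions
        self._owner = {}          # ring position -> node name
        for node in nodes:
            self.add_node(node)

    def add_node(self, node):
        for i in range(self.replicas):
            point = ring_hash(f"{node}#{i}")
            if point in self._owner:
                continue  # extremely unlikely collision: keep the existing owner
            self._owner[point] = node
            bisect.insort(self._points, point)

    def remove_node(self, node):
        for i in range(self.replicas):
            point = ring_hash(f"{node}#{i}")
            if self._owner.get(point) == node:
                del self._owner[point]
                self._points.remove(point)

    def node_for(self, file_path):
        """Return the node owning the first ring point after the file's hash."""
        if not self._points:
            raise ValueError("ring has no nodes")
        idx = bisect.bisect_right(self._points, ring_hash(file_path))
        return self._owner[self._points[idx % len(self._points)]]


# Usage: add a fourth node and see which files change owner.
files = [f"s3://bucket/part-{i}.orc" for i in range(1000)]
ring = HashRing(["node-1", "node-2", "node-3"])
before = {f: ring.node_for(f) for f in files}

ring.add_node("node-4")
after = {f: ring.node_for(f) for f in files}
moved = [f for f in files if before[f] != after[f]]

# Only files now owned by the new node changed; all others kept their node.
print(f"{len(moved)} of {len(files)} files moved")
```

With a plain `hash(file) % node_count` scheme, going from three nodes to four would reassign most files. On the ring, the only files that change owner are the ones picked up by the new node, and removing that node again restores the previous mapping.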