How can we achieve 1 million TPS with Omid?
The design in the MegaOmid branch achieves this by partitioning transactions among multiple status oracles. Transactions that span partitions are handled through a global status oracle. The global status oracle is needed only for global transactions; transactions local to a single status oracle do not need the global status oracle to make progress.
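The routing decision above can be sketched in a few lines of Java. This is a minimal illustration, not MegaOmid's actual implementation: the class and method names are invented here, and the hash-modulo partitioning scheme is an assumption (the real prototype partitions by row-key ranges, as described later).

```java
// Illustrative sketch only: maps row keys to status oracles and decides
// whether a transaction can stay local. Names and the hash-based
// partitioning are assumptions, not MegaOmid's real API.
import java.util.Set;
import java.util.TreeSet;

public class RoutingSketch {
    public static final int NUM_STATUS_ORACLES = 10;

    /** Map a row key to the status oracle that owns its partition. */
    public static int oracleFor(String rowKey) {
        return Math.floorMod(rowKey.hashCode(), NUM_STATUS_ORACLES);
    }

    /**
     * A transaction whose rows are all owned by one status oracle is local
     * and never contacts the global status oracle; otherwise it must run
     * as a global transaction.
     */
    public static boolean isLocal(Set<String> rowKeys) {
        Set<Integer> oracles = new TreeSet<>();
        for (String key : rowKeys) {
            oracles.add(oracleFor(key));
        }
        return oracles.size() <= 1;
    }
}
```

The key property this captures is that the local/global decision can be made entirely on the client side, from the keys a transaction touches.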
The current prototype lacks the following features:
The recovery procedure for global transactions is not integrated with the recovery of local transactions.
Dynamic partitioning and dynamic status oracle membership are not implemented.
We ran experiments with synthetic traffic generated by 30 clients. The setup uses one global status oracle and 10 normal status oracles. Inspired by the TPC-E workload, 80% of the traffic is read-only. The average transaction size is 8 rows for single-SO transactions and 17 rows for global transactions. Each status oracle handles 80K TPS of normal traffic, and the global status oracle handles 73K TPS of global traffic, which adds up to 873K TPS (10 × 80K + 73K).
Here, we evaluate the overhead of global transactions when MegaOmid is integrated into HBase. Running a global transaction is more expensive because the client has to communicate with all the (responding) status oracles, which essentially defeats the partitioning. The overall performance therefore depends on how well the traffic splits among the partitions. In this experiment, we modify YCSB to vary the ratio of global transactions from 0% to 100%.
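The YCSB modification amounts to marking each generated transaction as cross-partition with some probability. The sketch below shows one way that could look; the class and parameter names are hypothetical and are not YCSB's actual extension points.

```java
// Hypothetical sketch of the workload knob described above: each generated
// transaction is cross-partition with probability globalRatio. Not part of
// the real YCSB or MegaOmid code.
import java.util.Random;

public class GlobalRatioSketch {
    public final double globalRatio; // fraction of cross-partition requests, 0.0..1.0
    public final Random rng;

    public GlobalRatioSketch(double globalRatio, long seed) {
        this.globalRatio = globalRatio;
        this.rng = new Random(seed);
    }

    /** Decide whether the next generated transaction spans partitions. */
    public boolean nextIsCrossPartition() {
        // nextDouble() is uniform in [0, 1), so the comparison fires with
        // probability globalRatio.
        return rng.nextDouble() < globalRatio;
    }
}
```

Sweeping `globalRatio` from 0.0 to 1.0 reproduces the 0% to 100% axis of the experiment.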
The clients are oblivious to the existence of multiple status oracles; all the details of managing them are hidden in the transaction library, so developers are not burdened with them. The transaction manager always favors local transactions. The partition for a local transaction is selected by a most-frequently-accessed policy. If the transaction accesses a row beyond the partition's range, the library throws an exception indicating that the client should retry the transaction. On the next attempt, the library automatically switches to a global transaction that covers the range the transaction requires.
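The fallback protocol just described (try local first, retry as global on a range violation) can be sketched as follows. The names here (`Txn`, `OutOfPartitionException`, `execute`) are illustrative placeholders, not the transaction library's real API.

```java
// Hedged sketch of the client-side fallback: a transaction first runs
// locally against one status oracle; if it touches a row outside that
// partition's range, the library retries it as a global transaction.
public class FallbackSketch {
    /** Thrown when a transaction touches a row outside its partition's range. */
    public static class OutOfPartitionException extends Exception {}

    /** Placeholder for a transaction body; `global` selects the execution mode. */
    public interface Txn {
        void run(boolean global) throws OutOfPartitionException;
    }

    /**
     * Try a cheap local transaction first; on a range violation, retry as
     * a global transaction. Returns true iff the global path was taken.
     */
    public static boolean execute(Txn txn) {
        try {
            txn.run(false);          // local: talks to a single status oracle
            return false;            // committed as a local transaction
        } catch (OutOfPartitionException e) {
            try {
                txn.run(true);       // global: coordinates with all status oracles
            } catch (OutOfPartitionException unreachable) {
                // A global transaction covers the full key range by construction.
                throw new AssertionError(unreachable);
            }
            return true;             // committed as a global transaction
        }
    }

    public static void main(String[] args) {
        // A transaction that crosses partitions fails locally, then succeeds globally.
        boolean wasGlobal = execute(global -> {
            if (!global) throw new OutOfPartitionException();
        });
        System.out.println(wasGlobal ? "global" : "local"); // prints "global"
    }
}
```

This keeps the retry logic entirely inside the library, which is what lets application code stay unaware of the partitioning.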
The figure above depicts the throughput of local and global transactions as we increase the probability of cross-partition requests from 0% to 100%. The load is generated by 47 clients and split over 5 status oracles. In the ideal case, where all requests are confined to their designated partitions, the system delivers 296K TPS, consisting entirely of local transactions. As the ratio of cross-partition requests grows, more local transactions fail and fall back to global transactions. With 100% cross-partition requests, the rate of executed global transactions rises to 62K TPS, while the rate of local transactions drops to 9K TPS. (The reason this rate is still not zero is that the partitions are chosen randomly, so a small fraction of requests happens to be confined to a single partition.)