
Some questions #1940

Open
Davidc2525 opened this issue May 10, 2024 · 17 comments

@Davidc2525

I have some questions

Is its scalability linear? I mean, does adding more nodes to the cluster let it process more requests?

Can I have, for example, more than 50 nodes or 100 nodes in a cluster?

馃馃

@emschwartz
Contributor

Hi @Davidc2525 that's a great question!

TigerBeetle can handle a huge number of transactions with a very high throughput. However, it is intentionally not horizontally scalable. You could theoretically run more than the 6 recommended nodes, but it won't increase the throughput.

One of the insights behind TigerBeetle's design is that business transactions don't shard well. Horizontally scaling a database seems like it would help process more transactions, but often there are a few hot accounts that are involved in a large percentage of the transactions. You'd then need to handle updates to these accounts using row locks, they'd become the bottleneck, and that would eliminate the gains you'd expect from sharding the system.

Instead, TigerBeetle is built to squeeze the maximum amount of performance out of a single CPU core. Transactions are submitted in large batches and it uses techniques like direct I/O, io_uring, and zero-copy parsing to get the most out of the CPU. Because transactions are processed by a single core, there is no need for row locks, which further improves performance.
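To make the batching point concrete, here is a minimal Go sketch of the pattern (the `maxBatchSize` constant and `submitBatch` function are illustrative, not TigerBeetle's actual client API):

```go
package main

import "fmt"

// Transfer is a simplified stand-in for a ledger transfer event.
type Transfer struct {
	DebitAccountID  uint64
	CreditAccountID uint64
	Amount          uint64
}

// maxBatchSize matches the "8 thousand transfers per batch" figure
// discussed later in this thread; the exact cap depends on the version.
const maxBatchSize = 8000

// submitBatch stands in for one round trip to the cluster. Amortizing
// the per-request overhead over thousands of transfers is the point.
func submitBatch(batch []Transfer) {
	fmt.Printf("submitted %d transfers in one request\n", len(batch))
}

func main() {
	pending := make([]Transfer, 0, maxBatchSize)
	for i := 0; i < 20_000; i++ {
		pending = append(pending, Transfer{DebitAccountID: 1, CreditAccountID: 2, Amount: 1})
		if len(pending) == maxBatchSize {
			submitBatch(pending)
			pending = pending[:0]
		}
	}
	if len(pending) > 0 {
		submitBatch(pending) // flush the remainder
	}
}
```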

In the TigerBeetle cluster, the 6 nodes are there for high availability, fault tolerance, and automated failover, rather than to increase the transaction throughput.

Does that make sense? Let me know if you have more questions about this!

You might also be interested in this paper: https://www.usenix.org/system/files/conference/hotos15/hotos15-paper-mcsherry.pdf

@Davidc2525
Author

Davidc2525 commented May 11, 2024

Well, I did a test with a single node and it seems very fast. But for a system that is planned to be large, a cluster of only 6 nodes handling the entire load seems like very little to me. Look at these statistics:

https://apple.github.io/foundationdb/performance.html

Do you think you can achieve that performance with TigerBeetle?

I am looking to implement a system, and being limited in the scalability of one part of the system does not seem appropriate to me. I know that TigerBeetle and FoundationDB are not the same, but in terms of scalability FoundationDB is very scalable and linear. Of course, I know that on the same machine the 8 thousand transactions are done faster in TigerBeetle than in FoundationDB, but when it comes to scaling, can it reach the same performance? Take a look and tell me what you think.

@emschwartz
Contributor

Great questions again.

First of all, FoundationDB is a great system, so this is not a knock against it. However, it's built for a different type of workload. That benchmark you referenced uses a workload of "90% read and 10% write". If you're building a system that's meant to handle a large number of incoming business transactions, the bottleneck will actually be the write performance. TigerBeetle was specifically designed to handle a very write-heavy workload.

If you have hot accounts that are accessed by many write transactions, it's likely that the write performance of a distributed DB will be even worse than a single-core one because it'll need to lock those hot accounts during transactions.

Even for reads, FoundationDB will have trouble with very hot accounts. Their limitations page says:

However, it does not currently increase the replication factor of keys that are frequently read. As a result, the aggregate read performance of any given key or small contiguous set of keys in a triple-replicated system is limited to the total performance of three server processes (typically on the order of 100,000 reads per second).

One more difference that's worth calling out is how the transaction throughput is achieved. The benchmark you mention achieves 8 million TPS with 384 cores. TigerBeetle is designed to handle 1 million TPS on a single core (+5 for fault tolerance).

A lot of very high-performance transaction systems also emphasize single-core designs. For example, TigerBeetle was very much inspired by the LMAX exchange architecture, which was built for super high throughput and low latency. One of the key takeaways from this presentation is that you get better performance with a single-core architecture: https://www.infoq.com/presentations/LMAX/
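For intuition, here is a hypothetical single-writer sketch in Go (not TigerBeetle's code): one goroutine owns all the balances, so even hot accounts need no locks:

```go
package main

import "fmt"

// Transfer is a simplified stand-in for a double-entry transfer.
type Transfer struct {
	Debit, Credit uint64 // account IDs
	Amount        int64
}

func main() {
	events := make(chan []Transfer)
	done := make(chan map[uint64]int64)

	// Single writer: only this goroutine ever touches the balances,
	// so no row locks or mutexes are needed, even for hot accounts
	// that appear in most transfers.
	go func() {
		balances := make(map[uint64]int64)
		for batch := range events {
			for _, t := range batch {
				balances[t.Debit] -= t.Amount
				balances[t.Credit] += t.Amount
			}
		}
		done <- balances
	}()

	events <- []Transfer{
		{Debit: 1, Credit: 2, Amount: 10},
		{Debit: 1, Credit: 3, Amount: 5},
	}
	close(events)
	fmt.Println((<-done)[1]) // -15: account 1 was debited twice
}
```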

Hope this helps!

@Davidc2525
Author

Hello, as always, thank you for taking the time to respond.

(TB) = TigerBeetle
(FDB) = FoundationDb

I know that (FDB) is not very fast at writing, and I know that it is nothing like (TB). I can implement what I need in (FDB), although I want to use (TB). But as you mentioned:

One more difference that's worth calling out is how the transaction throughput is achieved. The benchmark you mention achieves 8 million TPS with 384 cores. TigerBeetle is designed to handle 1 million TPS on a single core (+5 for fault tolerance).

What if I need more than 1 million TPS?

Another thing is that I have not yet been able to test (TB) thoroughly. I only did a test to check how long it took to process the 8 thousand transfers per batch that (TB) accepts, and in comparison with (FDB), in the tests that I run locally on comfortable hardware, there is no way to beat (TB).

I had planned to set up an interface so I can create 2 implementations, one using (TB) and the other (FDB).

For the (FDB) implementation I want to make a 3-layer system:

1. a cache (all transfers and balance changes are processed here), with ACID operation capability

2. a scheduler that sends those changes to (FDB)

3. the (FDB) layer

and the other implementation using TigerBeetle; see the sketch below.
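As a rough sketch of that common interface in Go (all names here are hypothetical):

```go
package ledger

import "context"

// Transfer is the minimal transfer record shared by both backends.
type Transfer struct {
	ID              uint64
	DebitAccountID  uint64
	CreditAccountID uint64
	Amount          uint64
}

// Ledger is the common interface: one implementation would wrap the
// TigerBeetle client directly; the other would sit on top of the
// cache -> scheduler -> FoundationDB layers described above.
type Ledger interface {
	CreateTransfers(ctx context.Context, transfers []Transfer) error
	Balance(ctx context.Context, accountID uint64) (int64, error)
}
```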

One more difference that's worth calling out is how the transaction throughput is achieved. The benchmark you mention achieves 8 million TPS with 384 cores. TigerBeetle is designed to handle 1 million TPS on a single core (+5 for fault tolerance).

Again, out of 6 nodes, only one processes the transfer batches?

The other 5 don't receive requests?

The other 5 are just replicas? (I think so)

If a node dies, does another of the remaining 5 start processing the transfer batches?

I actually have a lot of questions.

@emschwartz
Contributor

What's the use case where you'd need more than 1 million TPS?

Out of the 6 nodes, one is the leader and that's the only one whose processing of the transactions really "counts". The others are replicas and they are processing the transactions as well, but only to follow along and provide fault tolerance for the leader's state. If a node dies, the others will keep processing. If there's a problem with the leader, one of the other replicas will automatically become the leader and the system will keep going.
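In practice this failover is transparent to clients. A minimal sketch using the tigerbeetle-go client, assuming its current NewClient signature (it has changed between versions, so check your client's docs):

```go
package main

import (
	"log"

	tb "github.com/tigerbeetle/tigerbeetle-go"
	tbt "github.com/tigerbeetle/tigerbeetle-go/pkg/types"
)

func main() {
	// Pass every replica's address: the client locates the current
	// leader and keeps working if leadership moves to another replica.
	client, err := tb.NewClient(tbt.ToUint128(0), []string{
		"3001", "3002", "3003", "3004", "3005", "3006",
	})
	if err != nil {
		log.Fatal(err)
	}
	defer client.Close()
	// ... CreateTransfers / LookupAccounts as usual ...
}
```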

You mention that you know you can implement what you need in FDB but you're not sure about TB. Is that purely about performance or are there other features you're talking about?

@Davidc2525
Author

I don't know what use case would reach that level, but you have to be prepared; at least, I like to plan ahead for everything, that's all.

I have to do some tests. Thank you for taking the time to respond. If I have any other questions, I know I can find a solution here 😊

@Davidc2525
Author

Can TB actually do 1 million transactions per second?

@emschwartz
Copy link
Contributor

With only primary indexes enabled, TB can do 988k TPS.

By default, TB comes with around 20 secondary indexes enabled. With this configuration, the automatic benchmarks run on CI with shared infrastructure show it doing around 160k TPS. Note that a dedicated, physical server would be faster.

There are still more optimizations that have not yet been implemented that would raise these numbers further.

@Davidc2525
Author

What about indexes?

The one million TPS is achieved on a server with good performance, I assume. Have you conducted such tests on a server?

@Davidc2525
Author

Tomorrow I'm going to do some tests, and I'll tell you the results.

@Davidc2525
Author

Davidc2525 commented May 15, 2024

Hello, I did some tests, but on Codeanywhere, since I still do not have access to a local server. I do not know the specs of this service's servers, but here are the results in some images.

It usually executes a batch of 8,000 transactions in between 90 and 150 ms, and it takes about 30 seconds to complete the million transactions.
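That works out to roughly 53k–89k TPS within a batch (8,000 / 0.150 s ≈ 53k; 8,000 / 0.090 s ≈ 89k), and about 33k TPS overall (1,000,000 / 30 s).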

There is a small program in Go, in case you can run the test on a server, if one is available; I will add more code to the program later.

It is already compiled; you can run it using ./main. You must have a TB server running on port 3000.

To see the commands:

./main help

Repo here

https://github.com/Davidc2525/test_tb

I also leave some screenshots

  • img 1: Screenshot from 2024-05-15 10-21-31

  • img 2: Screenshot from 2024-05-15 10-19-55

  • img 3: Screenshot from 2024-05-15 10-17-18

  • img 4: Screenshot from 2024-05-15 10-16-35

  • img 5: Screenshot from 2024-05-15 10-13-29

Let me know what you think.

@Davidc2525
Author

I did the same test on my PC but using Aerospike instead of TigerBeetle.

6 minutes for 1 million transactions. I suppose that on a much more powerful server it would get better times, just like with TB.
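That is roughly 2,800 TPS (1,000,000 / 360 s).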

Aerospike is an in-memory DB with ACID.

(screenshot: aero)

@Davidc2525
Author

@emschwartz hey

@Davidc2525
Author

Davidc2525 commented May 17, 2024

I ran it on another server and the time for the million transfers dropped to 7 seconds :)
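That is roughly 143k TPS (1,000,000 / 7 s), in line with the ~160k TPS figure mentioned above for the default configuration.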

(screenshot: 8 seconds)

@Davidc2525
Author

Hey @emschwartz @jorangreef

@emschwartz
Contributor

Hi @Davidc2525, is that TB server running using the default configuration? If it has the secondary indexes enabled, that sounds like the expected performance.

@Davidc2525
Author

Davidc2525 commented May 21, 2024

Hi @emschwartz
Yes, it is the default configuration. I will compile it for production to see how it goes. How do I enable the indexes you mention?

Well, the truth is that going from 30 seconds to 8 is amazing, but I'm going to compile for production, and I would like it to go down to 3 seconds or less.
