
Write-through inter-block cache #1947

Closed
cwgoes opened this issue Aug 9, 2018 · 15 comments · Fixed by #4748

@cwgoes
Contributor

cwgoes commented Aug 9, 2018

Many values in state are frequently read but infrequently written, such as global parameters, or written at the end of one block and read in the next, such as governance proposal queues. We should have a write-through inter-block (persistent) cache which prevents these kinds of reads from hitting the disk.
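
To make this concrete, here's a minimal sketch of what such a cache could look like, written over a simplified key-value interface. This is an illustration only: the `KVStore` interface and all names below are assumptions for the sketch, not the SDK's actual store types.

```go
package cache

import "sync"

// KVStore is a simplified stand-in for the underlying store interface;
// the SDK's real interface differs.
type KVStore interface {
	Get(key []byte) []byte
	Set(key, value []byte)
	Delete(key []byte)
}

// WriteThroughCache serves repeated reads from memory while writing
// every update through to the underlying (persistent) store, so the
// disk is always as fresh as the cache. Because it is not reset
// between blocks, a value written at the end of block N is read from
// memory in block N+1 instead of hitting the disk.
type WriteThroughCache struct {
	mtx    sync.RWMutex
	parent KVStore
	cache  map[string][]byte
}

func NewWriteThroughCache(parent KVStore) *WriteThroughCache {
	return &WriteThroughCache{parent: parent, cache: make(map[string][]byte)}
}

func (c *WriteThroughCache) Get(key []byte) []byte {
	c.mtx.RLock()
	if v, ok := c.cache[string(key)]; ok {
		c.mtx.RUnlock()
		return v // cache hit: no disk read
	}
	c.mtx.RUnlock()

	v := c.parent.Get(key) // cache miss: read through and remember
	c.mtx.Lock()
	c.cache[string(key)] = v
	c.mtx.Unlock()
	return v
}

func (c *WriteThroughCache) Set(key, value []byte) {
	c.mtx.Lock()
	c.cache[string(key)] = value
	c.mtx.Unlock()
	c.parent.Set(key, value) // write through: parent is always current
}

func (c *WriteThroughCache) Delete(key []byte) {
	c.mtx.Lock()
	delete(c.cache, string(key))
	c.mtx.Unlock()
	c.parent.Delete(key)
}
```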

@ValarDragon
Contributor

Seems like something we should absolutely do. It can be done in a non-breaking manner as well, since it's just caching in memory (hence I've tagged it postlaunch).

@rigelrozanski
Contributor

Trying to understand what's being proposed. Are you suggesting that we add a write-through cache which reflects certain state normally held in the block state, but persists in between blocks (unless, in a rare situation, one of those parameters is updated)? So for each block the cache would still need to be verified once anyway - or maybe we could have special update functionality for any params in this cache.

@cwgoes
Contributor Author

cwgoes commented Aug 28, 2018

So for each block the cache would still need to be verified once anyway.

No, because it's a write-through cache (the underlying IAVL tree is written to as well), and blocks are executed in order. The cache can be implemented below the ABCI abstraction, just over the GoMemDB. The main advantage is in speeding up reads and iteration over frequently accessed parameters.

(maybe I misunderstand your question?)

I actually think we might want to make this prelaunch, it's not too hard and could speed up block execution substantially.

@rigelrozanski
Contributor

If it's easy to do, and you think it will speed things up substantially, then yes, it sounds like a priority - but still a last prelaunch priority which happens after we're feature complete.

@ValarDragon
Contributor

I actually think we might want to make this prelaunch, it's not too hard and could speed up block execution substantially.

Let's get the state machine complete, and mostly bug-free, then consider performance improvements. Those can easily be punted to postlaunch.

Currently we don't even have benchmarks for IAVL or our block execution. Thus I'm extremely hesitant for us to spend a while optimizing code prelaunch, when we don't even have evidence saying it's the problem. I'm currently leaning toward making any backwards-compatible performance improvement postlaunch, as our current block execution time is sufficient, and we don't have evidence of what our bottlenecks are.

The biggest bottleneck in the entire system is our current mempool design, which we have decided to fix postlaunch. The best this change would do is speed up block processing time, by a currently unknown factor, which is not at all a prelaunch concern imo. The best that faster blocks provide is faster network block time. Given that the first 3 weeks will have zero load, block time is not a concern at all, and therefore we could develop this change in that 3-week interim.

@cwgoes
Contributor Author

cwgoes commented Aug 28, 2018

Let's get the state machine complete, and mostly bug-free, then consider performance improvements.

Agreed on benchmarks first. I think we want to avoid needing to push changes within a short timeframe after launch, because we might need to react to unexpected bugs. If we can figure out how much of an issue this will be short-term, that would better inform the pre/post-launch decision.

@ValarDragon
Contributor

Do you have a particular preference for this being a write-through cache vs a write-back cache? I think it'd be a better design to make it write-back (i.e. all writes are batched in EndBlock or Commit). While we may not have a batch-write operation in IAVL, it's reasonable to think that one day we might.
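
For comparison, here is a hedged sketch of the write-back variant, reusing the illustrative `KVStore` interface from the sketch above: writes are buffered in memory and only pushed to the parent as a batch when `Flush` is called (e.g. from EndBlock or Commit). None of this is the SDK's actual API.

```go
// WriteBackCache buffers writes and flushes them to the parent store
// in one batch. Reads consult the dirty set first so uncommitted
// writes remain visible.
type WriteBackCache struct {
	mtx    sync.Mutex
	parent KVStore
	dirty  map[string][]byte // pending writes; nil marks a delete
}

func NewWriteBackCache(parent KVStore) *WriteBackCache {
	return &WriteBackCache{parent: parent, dirty: make(map[string][]byte)}
}

func (c *WriteBackCache) Get(key []byte) []byte {
	c.mtx.Lock()
	defer c.mtx.Unlock()
	if v, ok := c.dirty[string(key)]; ok {
		return v
	}
	return c.parent.Get(key)
}

func (c *WriteBackCache) Set(key, value []byte) {
	c.mtx.Lock()
	defer c.mtx.Unlock()
	c.dirty[string(key)] = value // buffered; nothing hits disk yet
}

func (c *WriteBackCache) Delete(key []byte) {
	c.mtx.Lock()
	defer c.mtx.Unlock()
	c.dirty[string(key)] = nil
}

// Flush writes the whole batch to the parent and clears the buffer.
// If the process crashes before Flush, the buffered writes are lost,
// which is the durability trade-off discussed in this thread.
func (c *WriteBackCache) Flush() {
	c.mtx.Lock()
	defer c.mtx.Unlock()
	for k, v := range c.dirty {
		if v == nil {
			c.parent.Delete([]byte(k))
		} else {
			c.parent.Set([]byte(k), v)
		}
	}
	c.dirty = make(map[string][]byte)
}
```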

@cwgoes
Contributor Author

cwgoes commented Oct 18, 2018

A write-back cache could make sense, depending on what operations the IAVL tree supports (presumably we could change the kind of cache without breaking the state machine?) - I think the more important part is just to cache reads of data that is frequently read (e.g. parameters, validator info) but not updated between blocks, which AFAIK we don't do at all at the moment.

@jackzampolin
Member

Did we implement this? Also applicable for @ultraeric

@rigelrozanski
Contributor

rigelrozanski commented May 28, 2019

not that I'm aware of

@alexanderbez
Contributor

@jackzampolin No, we do not currently have this implemented, and we can and should introduce such functionality soon (in a non-breaking manner, of course).

My general understanding is as follows:

- write-through: Slower writes, but fast/efficient reads. Best for situations that write and then (re)read data frequently.
- write-back: Faster writes (only blocks on I/O to the cache) and efficient reads, but can potentially suffer from data loss if the cache is corrupted/lost and data is not written to disk in time. Best for situations that require low latency and frequent writes.

Given that we want to keep hits to I/O at a minimum, it seems a write-back cache is more suitable. Data loss should not be a problem if we can batch deletes and writes in store/cachekv.Store#Write. But even without batching, this can still be done afaict. Correct me if I'm wrong.

Perhaps this intra-block cache can reside in the BaseApp and be fed into the context when starting a block, then committed when the block finishes.
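
Roughly, that wiring could look like the following; `baseApp`, `beginBlock`, and `commit` here are hypothetical stand-ins, not the SDK's actual BaseApp API, and the types come from the sketches above.

```go
// baseApp is a hypothetical holder for the block-scoped cache.
type baseApp struct {
	db    KVStore         // persistent backing store
	cache *WriteBackCache // created at block start, flushed at the end
}

// beginBlock creates the block's cache and hands it to whatever
// builds the context, so all state access during the block goes
// through memory.
func (app *baseApp) beginBlock() KVStore {
	app.cache = NewWriteBackCache(app.db)
	return app.cache
}

// commit batches the block's accumulated writes down to disk.
func (app *baseApp) commit() {
	app.cache.Flush()
}
```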

@ValarDragon
Contributor

This is an issue for an inter-block cache, and in that setting I think a write-through cache is more suitable; I don't think write-back is suitable there. It's not fine for a node to crash at block N + x with Tendermint state at block N + x while its state machine state is still at block N. That problem is avoided with a write-through cache.

I do agree with a write-back intra-block cache (which you suggested), but I wanted to clarify, since this is an inter-block cache issue. I do think the inter-block cache is more critical, as intra-block caching is at least semi-achieved by L3/L4 caches in servers.

@alexanderbez
Contributor

Ahhh yes, I misunderstood then. If commits aren't happening on a per-block basis, then write-through is more suitable for safety guarantees. I think we can easily achieve this through some caching type in the BaseApp that is passed along to contexts.

But, as I think of it, we already have intra-block caching, don't we? The BaseApp's deliverState and multi-store cache wrapping provide this unless I'm mistaken.
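
(For reference, the intra-block pattern described here is essentially branch-then-write: cache-wrap the store per block and per transaction, and merge changes downward only on success. Below is a sketch using the illustrative types from earlier in the thread; the SDK's real deliverState/cachekv machinery differs.)

```go
// deliverBlock illustrates nested cache wrapping: failed transactions
// simply drop their cache, leaving the block (and disk) state intact.
func deliverBlock(db KVStore, txs [][]byte, apply func(KVStore, []byte) error) {
	block := NewWriteBackCache(db) // block-level cache wrap
	for _, tx := range txs {
		txCache := NewWriteBackCache(block) // per-transaction wrap
		if err := apply(txCache, tx); err != nil {
			continue // failed tx: its writes are discarded
		}
		txCache.Flush() // successful tx: merge writes into the block cache
	}
	block.Flush() // end of block: write everything down to the parent
}
```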

@cwgoes
Contributor Author

cwgoes commented Jul 4, 2019

Bump; I'm pretty sure this will increase performance quite a bit (ref #4641).

What do you think @alexanderbez?

@alexanderbez
Contributor

Yes, this is on the top of my priority list -- plan on getting it started this weekend.

@alexanderbez alexanderbez self-assigned this Jul 5, 2019
@fedekunze fedekunze added this to the v0.37.0 milestone Jul 19, 2019
chillyvee pushed a commit to chillyvee/cosmos-sdk that referenced this issue Mar 1, 2024