Conceptual: Substrate Storage #255

shawntabrizi · 2019-09-24T14:26:14Z

Should this be in node or core?
Maybe needs a lot more content if there are relevant topics to touch on. Need storage expert to chime in.
More references
Diagram

shawntabrizi · 2019-09-24T15:42:02Z

Notes from @cheme

trie abstraction
I do not know if you are interested in technical details here, like what rocksdb inner storage looks like:

For the trie, for instance, there are some relevant points for people implementing this kind of thing:

All trie nodes are stored in RocksDB. Part of the trie state can get pruned (key-values are deleted from state once they fall out of the pruning window/range, for non-archive nodes).
There is no reference counting for the trie nodes stored in RocksDB.
Trie nodes are encoded and stored in RocksDB at 'encoded trie path up to node ++ hash(encoded_node)'. This allows any kind of key while still avoiding key collisions.
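As a rough illustration of the key layout described in the last point, here is a minimal sketch. The names `toy_hash` and `node_db_key` are hypothetical, and the std `DefaultHasher` is only a stand-in (an assumption, to keep the example std-only) for the cryptographic hash a real node would use.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Stand-in hash (assumption): a real node would use a cryptographic hash,
/// DefaultHasher is used here only so the sketch needs nothing beyond std.
fn toy_hash(data: &[u8]) -> [u8; 8] {
    let mut hasher = DefaultHasher::new();
    data.hash(&mut hasher);
    hasher.finish().to_be_bytes()
}

/// Builds the database key for a trie node:
/// encoded trie path up to the node ++ hash(encoded_node).
fn node_db_key(encoded_path: &[u8], encoded_node: &[u8]) -> Vec<u8> {
    let mut key = Vec::with_capacity(encoded_path.len() + 8);
    key.extend_from_slice(encoded_path);
    key.extend_from_slice(&toy_hash(encoded_node));
    key
}
```

With this layout, two byte-identical nodes sitting at different positions in the trie get different database keys, so deleting one while pruning cannot remove the other.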
A word about the fact that the trie is good at maintaining the history of values:
A nice thing to note is that this kind of structure allows storing the history of block state. I mean that sharing state between blocks is inherent to the way the trie is defined (you don't have a state trie per block, but a trie root hash that points to nodes from the previous block's state).
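To make the sharing concrete, here is a small sketch of a content-addressed node store holding the state of two consecutive blocks. Everything in it is an assumption made for illustration: the binary tree shape, the toy node encoding, the `DefaultHasher` stand-in for a cryptographic hash, and the helper names.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

/// Content-addressed node store: key = hash(encoded node).
type Db = HashMap<u64, Vec<u8>>;

fn insert_node(db: &mut Db, encoded: Vec<u8>) -> u64 {
    let mut hasher = DefaultHasher::new(); // stand-in for a cryptographic hash
    encoded.hash(&mut hasher);
    let hash = hasher.finish();
    db.insert(hash, encoded);
    hash
}

/// Toy leaf encoding: [0] ++ value.
fn leaf(db: &mut Db, value: &[u8]) -> u64 {
    let mut encoded = vec![0u8];
    encoded.extend_from_slice(value);
    insert_node(db, encoded)
}

/// Toy branch encoding: [1] ++ left hash ++ right hash.
fn branch(db: &mut Db, left: u64, right: u64) -> u64 {
    let mut encoded = vec![1u8];
    encoded.extend_from_slice(&left.to_be_bytes());
    encoded.extend_from_slice(&right.to_be_bytes());
    insert_node(db, encoded)
}

/// Stores the state of two consecutive blocks and returns both roots.
fn two_blocks(db: &mut Db) -> (u64, u64) {
    // Block 1: four values, seven nodes in total.
    let a = leaf(db, b"a");
    let b = leaf(db, b"b");
    let c = leaf(db, b"c");
    let d = leaf(db, b"d");
    let left = branch(db, a, b);
    let right = branch(db, c, d);
    let root1 = branch(db, left, right);

    // Block 2: only the last value changes. Just the nodes on its path are
    // new (leaf, branch, root); `left` and `c` are reused from block 1.
    let d2 = leaf(db, b"d2");
    let right2 = branch(db, c, d2);
    let root2 = branch(db, left, right2);
    (root1, root2)
}
```

After `two_blocks` the store holds ten nodes rather than fourteen, and both roots remain resolvable, which is the history-keeping property described above.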

Also you could say that, other than the state trie, there are multiple places where tries are used (but not stored in RocksDB with their full state: only the key-values are needed because we do not modify the trie): block extrinsics root, change tries and maybe others.

For storage it means: block storage (right now I do not know where blocks are stored, I should probably look at clientdb), non-canonical block execution data/delta storage (that is state_db code, it does not use the trie), and storage of keys pending pruning (I am not sure, it may also be from state_db code, it does not use the trie).

Main trie
I would call it the 'State trie'; it actually has one root hash per block (in the block header).
It is used to verify state at any block.

Technical detail could go into the fact that the RocksDB node content is only for the canonical chain (so no branches), and there is a 'state_db' layer that maintains trie state with reference counting, in memory only, for everything that is non-canonical (loaded from stored deltas for blocks: see journaldb).

Not having reference counting in the persistent db is for performance reasons.

child trie
Child tries are identical to the main trie, except that their root is stored and updated in the main trie instead of the header.

Therefore proving inclusion of a key-value at a given state involves:

proving inclusion of the child trie root in the main trie for a given block header hash
proving inclusion of the key value in a trie with the previous child trie root value as root
Technical details are subject to change here (there is a PR in progress to isolate the RocksDB key-values between child tries).
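The two steps above can be sketched as follows. This is a deliberately degenerate model, not Substrate's API: each "trie" is a single node holding all of its pairs, so the "proof" is simply that node, the root is a `DefaultHasher` digest standing in for a cryptographic hash, and all names (`ToyTrie`, `read_checked`, `read_child_checked`) are hypothetical.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::BTreeMap;
use std::hash::{Hash, Hasher};

type Root = u64;
/// Toy "trie": one node holding every pair. Real tries have many nodes.
type ToyTrie = BTreeMap<Vec<u8>, Vec<u8>>;

fn root_of(trie: &ToyTrie) -> Root {
    let mut hasher = DefaultHasher::new(); // stand-in for a cryptographic hash
    trie.hash(&mut hasher);
    hasher.finish()
}

/// Checks that `proof` hashes to the trusted `root`, then reads `key` from it.
fn read_checked(root: Root, proof: &ToyTrie, key: &[u8]) -> Option<Vec<u8>> {
    if root_of(proof) != root {
        return None;
    }
    proof.get(key).cloned()
}

/// Step 1: prove the child root is in the main trie under `state_root`.
/// Step 2: prove the key-value is in a trie whose root is that child root.
fn read_child_checked(
    state_root: Root,
    main_proof: &ToyTrie,
    child_storage_key: &[u8],
    child_proof: &ToyTrie,
    key: &[u8],
) -> Option<Vec<u8>> {
    let bytes = read_checked(state_root, main_proof, child_storage_key)?;
    if bytes.len() != 8 {
        return None;
    }
    let mut buf = [0u8; 8];
    buf.copy_from_slice(&bytes);
    read_checked(Root::from_be_bytes(buf), child_proof, key)
}
```

The only trusted input is `state_root` (taken from the block header); the child root is never trusted on its own, it is read out of the verified main trie.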

The interesting point is that trie-encoded values all use the same RocksDB collection, so the RocksDB keys need to be prefixed to avoid key collisions (this is related to the fact that we do not use reference counting).

I remember there was some discussion and benchmarking about the complexity and cost of using a trie.

So it should be worth writing somewhere that access to trie data is costly: for a single key-value query we need to go through every node on the path leading to that value in the trie. Therefore a key-value cache is suitable (and one is in place, even if I have not had the occasion to verify it works correctly).
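A minimal sketch of that idea, under stated assumptions: `Backend`, `SlowTrie` and `Cached` are hypothetical names, and the trie is modelled only by a counter that charges `depth` node reads per query. It is not the cache that is actually in place.

```rust
use std::collections::HashMap;

/// Hypothetical backend trait standing in for trie-backed state access.
trait Backend {
    fn get(&mut self, key: &[u8]) -> Option<Vec<u8>>;
}

/// Models a trie lookup: every query walks `depth` nodes from the root.
struct SlowTrie {
    data: HashMap<Vec<u8>, Vec<u8>>,
    depth: usize,
    node_reads: usize,
}

impl Backend for SlowTrie {
    fn get(&mut self, key: &[u8]) -> Option<Vec<u8>> {
        self.node_reads += self.depth;
        self.data.get(key).cloned()
    }
}

/// Key-value cache in front of the backend. Misses are cached as well.
struct Cached<B: Backend> {
    inner: B,
    cache: HashMap<Vec<u8>, Option<Vec<u8>>>,
}

impl<B: Backend> Cached<B> {
    fn get(&mut self, key: &[u8]) -> Option<Vec<u8>> {
        if let Some(hit) = self.cache.get(key) {
            return hit.clone();
        }
        let value = self.inner.get(key);
        self.cache.insert(key.to_vec(), value.clone());
        value
    }
}
```

Repeated reads of the same key then cost one trie traversal instead of one per read. A real cache would also have to be invalidated on writes and kept consistent across blocks, which this sketch ignores.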

I think change tries could get their own description, but I do not think the feature is really used yet. There are two important things there:

The tries are in memory only (only the key-values are stored). The values are the indexes of key changes, so the change trie can be used to see which extrinsics touch a key in a block.
There is also a log digest that is built into the change trie for checking state. I won't try to describe the purpose (too complicated for me, but Rob did a presentation about it on the first day of EthCC).
There is also a bunch of metadata, but that may be a bit unrelated.

A description of light clients and offchain workers could be interesting too, but I am really not up to date with the Substrate light client, and part of the offchain worker is still in progress (I am working on a storage for it that keeps track of blockchain changes; that is why, even if very inefficient, the trie storage property of keeping all its history indexed is really good).

docs/conceptual/core/storage.md

kianenigma · 2019-09-27T07:39:20Z

docs/conceptual/core/storage.md

+
+## Trie Abstraction
+
+One advantage of using a simple key-value store is that you are able to easily abstract other storage structures on top.


Can we put a bit more content on how a Trie is abstracted on top of a K-V store? I have worked with storage from the highest level (SRML) to the lowest (raw RPC to an encoded key) and, honestly, I never really felt like I was working with a Trie here. It always looked more like a bare-bones KV store to me.

Would need to know what that information would look like...

Implementation details like this could be above the level of "conceptual" docs, but I agree we would want this information in the reference docs.

I would say it is worth keeping in mind as a runtime developer that the trie is way more costly than a standard k-v store.
The general idea is that you should rather store a serialized struct with two fields at one storage location than each field at a different location, unless you want to reduce the size of the proof.
That becomes quite clear when you apply a storage cost (if you have a base cost plus a variable cost for the size stored).
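As a worked example of that cost argument, here is a sketch with a made-up cost model. The constants `BASE_COST` and `COST_PER_BYTE` and the function names are assumptions chosen for illustration, not real Substrate fees.

```rust
/// Hypothetical cost model: a base cost per storage item plus a cost per byte.
const BASE_COST: u64 = 100;
const COST_PER_BYTE: u64 = 1;

fn item_cost(value_len: usize) -> u64 {
    BASE_COST + COST_PER_BYTE * value_len as u64
}

/// Two fields stored at two different storage locations.
fn cost_separate(a_len: usize, b_len: usize) -> u64 {
    item_cost(a_len) + item_cost(b_len)
}

/// Both fields serialized into one struct at a single storage location.
fn cost_combined(a_len: usize, b_len: usize) -> u64 {
    item_cost(a_len + b_len)
}
```

For a 4-byte and an 8-byte field, `cost_separate(4, 8)` is 212 while `cost_combined(4, 8)` is 112: the base cost is paid once instead of twice. The trade-off is the one noted above: a proof for either field must then carry the whole combined value.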

docs/conceptual/core/storage.md

shawntabrizi · 2019-09-27T07:59:51Z

@cheme Can you give a final approval when you are happy?

docs/conceptual/core/storage.md

joepetrowski

A lot of confusing paragraphs and not always clear what the point is. The "Trie Abstraction" part starts out with a few paragraphs on how tries can be used to verify state agreement, but then suddenly starts talking about performance and pruning.

Still a lot of English to be cleaned up:

The Substrate uses a simple a key-value data store
Tries are important tool for
it is still easy to verify of the complete node state

docs/conceptual/core/storage.md

JoshOrndorff · 2019-09-27T18:33:41Z

docs/conceptual/core/storage.md

+
+Substrate has a single main trie, called the state trie, whose changing root
+hash is placed in each block header. This is used to easily verify the state of
+the blockchain and provide a basis for light clients to verify proofs.


I would be interested to know more about the light clients and what exactly is proved to them. I know they only read block headers (not full blocks) so they get to see the state root before and after each block, but how do they actually know that the transactions in the block were executed correctly and lead to the resulting state root?

What is proved to the light client is the validity of an operation over a state of the chain. Say the light client does not have the full state: it gets the info it needs along with a proof that it comes from a valid full state.

A simpler example would be:

Querying keys from the state trie. The request will be (from core/network/src/protocol/message.rs):

```rust
#[derive(Debug, PartialEq, Eq, Clone, Encode, Decode)]
/// Remote storage read request.
pub struct RemoteReadRequest<H> {
    /// Unique request id.
    pub id: RequestId,
    /// Block at which to perform call.
    pub block: H,
    /// Storage key.
    pub keys: Vec<Vec<u8>>,
}
```

then the reply will be

```rust
#[derive(Debug, PartialEq, Eq, Clone, Encode, Decode)]
/// Remote read response.
pub struct RemoteReadResponse {
    /// Id of a request this response was made for.
    pub id: RequestId,
    /// Read proof.
    pub proof: Vec<Vec<u8>>,
}
```

To make sense of this reply, we fetch the request by id and the state trie root of the requested block hash, then we query the keys over the proof.
The proof here is a subset of the memorydb the state trie is built upon.
This subset contains only the trie nodes that the full client recorded when running the query on its side (for every key in the request).
Then the light client runs the same key queries over a trie built from the block state root (which it knows from the CHT) and the trie nodes it just received (the proof field of the response). From this incomplete state trie it can get the resulting value of every key in the input (and since every node of the trie refers to the others through a cryptographic hash, this proves the values are in the chain state).
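The record-then-replay flow described above can be sketched with a toy trie. Everything here is an assumption made for illustration and not Substrate's implementation: the binary tree keyed by a path of booleans, the node encoding, the `DefaultHasher` stand-in for a cryptographic hash, and the function names.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

type NodeHash = u64;
type Db = HashMap<NodeHash, Vec<u8>>;

fn hash_of(encoded: &[u8]) -> NodeHash {
    let mut hasher = DefaultHasher::new(); // stand-in for a cryptographic hash
    encoded.hash(&mut hasher);
    hasher.finish()
}

/// Toy encodings: leaf = [0] ++ value, branch = [1] ++ left ++ right hashes.
fn encode_leaf(value: &[u8]) -> Vec<u8> {
    let mut encoded = vec![0u8];
    encoded.extend_from_slice(value);
    encoded
}

fn encode_branch(left: NodeHash, right: NodeHash) -> Vec<u8> {
    let mut encoded = vec![1u8];
    encoded.extend_from_slice(&left.to_be_bytes());
    encoded.extend_from_slice(&right.to_be_bytes());
    encoded
}

fn insert_node(db: &mut Db, encoded: Vec<u8>) -> NodeHash {
    let hash = hash_of(&encoded);
    db.insert(hash, encoded);
    hash
}

/// Walks from `root` along `path` (false = left, true = right) and records
/// every encoded node it visits. Returns None if a node is missing.
fn lookup(db: &Db, root: NodeHash, path: &[bool], recorder: &mut Vec<Vec<u8>>) -> Option<Vec<u8>> {
    let mut current = root;
    let mut steps = path.iter();
    loop {
        let node = db.get(&current)?;
        recorder.push(node.clone());
        let (tag, rest) = node.split_first()?;
        match *tag {
            0 => return if steps.next().is_none() { Some(rest.to_vec()) } else { None },
            1 if rest.len() == 16 => {
                let go_right = *steps.next()?;
                let half = if go_right { &rest[8..] } else { &rest[..8] };
                let mut buf = [0u8; 8];
                buf.copy_from_slice(half);
                current = NodeHash::from_be_bytes(buf);
            }
            _ => return None,
        }
    }
}

/// Light client side: rebuild a partial db from the proof, keyed by the hash
/// of each received node, and run the same query from the trusted root.
fn check_read_proof(state_root: NodeHash, proof: &[Vec<u8>], path: &[bool]) -> Option<Vec<u8>> {
    let db: Db = proof.iter().map(|node| (hash_of(node), node.clone())).collect();
    lookup(&db, state_root, path, &mut Vec::new())
}
```

The full client calls `lookup` on its complete db and sends the recorded nodes as the proof; the light client calls `check_read_proof` with the state root it already trusts. A tampered node hashes to a different key, so the walk from the root never reaches it and the check returns `None`.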

But this way of recording some operation on a full client and re-executing it over this record on a light client can apply to many things (if you look in message.rs there are a few queries; call, for instance, uses the executor on the light blockchain).
A possible future design for the Substrate light client seems to be allowing evaluation of some wasm code, which could be a better solution than chaining query results like it was done in eth (better in that it reduces the number of query round trips).

I am not 100% sure on the following point, but proving the transition between two consecutive blocks is not something we do (I think); we just rely on the fact that the blocks got validated and we know they are chained. Surely a full client can execute the transition by running a full block on its state, so it should be possible for it to send a light client all the db keys accessed during the full block execution and next root calculation, and the light client would then be able to execute the block on those keys and produce the next root. But I guess the proof would be really massive, so relying on the network having validated those states may be enough for the light client use case.

But you can already run any call (see RemoteCallRequest) or query the deltas between blocks through RemoteChangesRequest.

I think a slightly simpler, but hopefully still fully accurate, explanation of light client proofs comes from Merkle proofs in general.

from here: https://www.quora.com/Cryptography-How-does-a-Merkle-proof-actually-work

The image hopefully shows that if you want to prove TX3 is in the trie, you don't need ALL the nodes, just the full branch that leads to TX3 and all the nodes next to the nodes on that branch. Obviously WAY less data on big tries like the ones on blockchains.
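Here is a minimal sketch of that kind of proof, assuming a plain binary Merkle tree with a power-of-two number of leaves (Substrate's state trie is not binary). `DefaultHasher` stands in for a cryptographic hash and all function names are hypothetical.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

fn hash_leaf(data: &[u8]) -> u64 {
    let mut hasher = DefaultHasher::new(); // stand-in for a cryptographic hash
    data.hash(&mut hasher);
    hasher.finish()
}

fn hash_pair(left: u64, right: u64) -> u64 {
    let mut hasher = DefaultHasher::new();
    (left, right).hash(&mut hasher);
    hasher.finish()
}

/// Builds every level of the tree, leaves first, root last.
/// Assumption: the number of leaves is a power of two.
fn build_levels(leaves: &[&[u8]]) -> Vec<Vec<u64>> {
    assert!(leaves.len().is_power_of_two());
    let mut levels = vec![leaves.iter().map(|leaf| hash_leaf(leaf)).collect::<Vec<u64>>()];
    while levels.last().unwrap().len() > 1 {
        let next: Vec<u64> = levels
            .last()
            .unwrap()
            .chunks(2)
            .map(|pair| hash_pair(pair[0], pair[1]))
            .collect();
        levels.push(next);
    }
    levels
}

/// Collects the sibling hash at each level on the way from leaf `index` to the root.
fn prove(levels: &[Vec<u64>], mut index: usize) -> Vec<u64> {
    let mut proof = Vec::new();
    for level in &levels[..levels.len() - 1] {
        proof.push(level[index ^ 1]);
        index /= 2;
    }
    proof
}

/// Recomputes the root from the leaf and the sibling hashes.
fn verify(root: u64, leaf: &[u8], mut index: usize, proof: &[u64]) -> bool {
    let mut acc = hash_leaf(leaf);
    for sibling in proof {
        acc = if index % 2 == 0 { hash_pair(acc, *sibling) } else { hash_pair(*sibling, acc) };
        index /= 2;
    }
    acc == root
}
```

For four transactions the proof for TX3 is two hashes: the hash of TX4 and the hash covering TX1 and TX2. In general the proof grows with the depth of the tree rather than with the number of leaves.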

Full nodes act as the providers of the proofs that light clients need. They do so pretty selflessly, but there are thoughts about having light clients make micro-payments to full nodes for their services.

JoshOrndorff · 2019-09-27T18:35:39Z

I read the whole thing and learned while reading. I would really like to see an example of how and why to use a child trie. I guess the how doesn't go in conceptual docs, but it should go somewhere. The why probably does go here.

shawntabrizi · 2019-09-27T23:17:30Z

> I read the whole thing and learned while reading. I would really like to see an example of how and why to use a child trie. I guess the how doesn't go in conceptual docs, but it should go somewhere. The why probably does go here.

That makes sense. The why, afaik, is that you want your own trie with its own root hash that you can use to verify the state of that child trie.

A trie only has a single "root hash" which describes the whole trie. Subsections of the trie do not have some hash which represents their "sub-content". But maybe you want that, only for a subsection of data.... so you make a child trie.


shawntabrizi · 2019-09-28T00:35:36Z

@joepetrowski I would like to merge this in having addressed every issue except:

> A lot of confusing paragraphs and not always clear what the point is. The "Trie Abstraction" part starts out with a few paragraphs on how tries can be used to verify state agreement, but then suddenly starts talking about performance and pruning.

I think the best option is to just remove this section:

All trie nodes are stored in RocksDB and part of the trie state can get pruned,
i.e. a key-value pair can be deleted from the storage when it is out of pruning
range for non archive nodes. We do not use reference counting for performance
reasons.

This is entirely implementation detail, but I would want to hear from engineering that it is not important to teach.

joepetrowski

It's close. Mostly grammar nits and a few content suggestions. I should caution that tries/storage are a weak point in my CS understanding, so I think this is generally OK as in I learned something, but I'm not the authority on its correctness.

docs/conceptual/core/storage.md


joepetrowski

As long as @cheme approves for accuracy, I'm good with it.

shawntabrizi added 2 commits September 24, 2019 14:39

Update storage.md

368c7bc

skeleton of storage doc

ac6d78b

shawntabrizi added A3-inprogress S0-Conceptual Documentation about learning T2-Page Documentation which should live as a detailed external page labels Sep 24, 2019

shawntabrizi added 2 commits September 24, 2019 16:27

typo

09fb5dd

Update storage.md

7b8b959

shawntabrizi changed the title ~~Conceptual: Client Storage~~ Conceptual: Substrate Storage Sep 24, 2019

Integrate feedback

97e85f5

cheme reviewed Sep 27, 2019


shawntabrizi added 2 commits September 27, 2019 08:34

fixes

402bba0

clarify kind of node

89c720c

kianenigma reviewed Sep 27, 2019


Clarify

d77a40b

bkchr reviewed Sep 27, 2019


typo, fix line width

999d39f

shawntabrizi added 2 commits September 27, 2019 09:15

fixes

68a8592

replace misleading data

5bbead7

cheme reviewed Sep 27, 2019


cheme approved these changes Sep 27, 2019


joepetrowski requested changes Sep 27, 2019


JoshOrndorff reviewed Sep 27, 2019


shawntabrizi and others added 3 commits September 28, 2019 02:09

Update docs/conceptual/core/storage.md

ec323ac

Co-Authored-By: cheme <emericchevalier.pro@gmail.com>

Update docs/conceptual/core/storage.md

5813b88

Co-Authored-By: joe petrowski <25483142+joepetrowski@users.noreply.github.com>

Update docs/conceptual/core/storage.md

d6d7292

Co-Authored-By: joe petrowski <25483142+joepetrowski@users.noreply.github.com>

shawntabrizi and others added 4 commits September 28, 2019 02:20

Update docs/conceptual/core/storage.md

1c5993e

Co-Authored-By: joe petrowski <25483142+joepetrowski@users.noreply.github.com>

update from feedback

ab8caad

Merge branch 'shawntabrizi-storage-doc' of https://github.com/shawnta…

e3ceff2

…brizi/substrate-developer-hub.github.io into shawntabrizi-storage-doc

Add a section on why to use child tries

f4570ee

shawntabrizi requested a review from joepetrowski September 28, 2019 00:36

shawntabrizi added A0-pleasereview and removed A3-inprogress labels Sep 28, 2019

joepetrowski reviewed Sep 28, 2019


shawntabrizi and others added 12 commits September 28, 2019 23:27

Update docs/conceptual/core/storage.md

9db4f54

Co-Authored-By: joe petrowski <25483142+joepetrowski@users.noreply.github.com>

Update docs/conceptual/core/storage.md

23f725b

Co-Authored-By: joe petrowski <25483142+joepetrowski@users.noreply.github.com>

Update docs/conceptual/core/storage.md

59f529a

Co-Authored-By: joe petrowski <25483142+joepetrowski@users.noreply.github.com>

Update docs/conceptual/core/storage.md

07026a3

Co-Authored-By: joe petrowski <25483142+joepetrowski@users.noreply.github.com>

Update docs/conceptual/core/storage.md

9240c48

Co-Authored-By: joe petrowski <25483142+joepetrowski@users.noreply.github.com>

Update docs/conceptual/core/storage.md

2cc0216

Co-Authored-By: joe petrowski <25483142+joepetrowski@users.noreply.github.com>

Update docs/conceptual/core/storage.md

9cb7da5

Co-Authored-By: joe petrowski <25483142+joepetrowski@users.noreply.github.com>

Update docs/conceptual/core/storage.md

df67541

Co-Authored-By: joe petrowski <25483142+joepetrowski@users.noreply.github.com>

Update docs/conceptual/core/storage.md

96fee72

Co-Authored-By: joe petrowski <25483142+joepetrowski@users.noreply.github.com>

Update docs/conceptual/core/storage.md

330d8a1

Co-Authored-By: joe petrowski <25483142+joepetrowski@users.noreply.github.com>

Final fixes

72550f5

add todo

2018486

shawntabrizi requested a review from joepetrowski September 28, 2019 23:29

joepetrowski approved these changes Sep 29, 2019


Merge branch 'source' into shawntabrizi-storage-doc

0e33445

shawntabrizi merged commit 726befd into substrate-developer-hub:source Sep 29, 2019

shawntabrizi deleted the shawntabrizi-storage-doc branch September 29, 2019 11:23

4meta5 mentioned this pull request Sep 29, 2019

Generalized Child Tries (storage) JoshOrndorff/recipes#35

Closed
