Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

docs(adr): update ADR-40 by adding low level access to SC #9451

Closed
wants to merge 14 commits into from

Conversation

robert-zaremba
Copy link
Collaborator

@robert-zaremba robert-zaremba commented Jun 3, 2021

Description

When reviewing ADR-40 we were discussion about a use-case for modules to commit to an external value without engaging a state store. This proposal extends the ADR-40 describing such access.


Before we can merge this PR, please make sure that all the following items have been
checked off. If any of the checklist items are not applicable, please leave them but
write a little note why.

  • Targeted PR against correct branch (see CONTRIBUTING.md)
  • Linked to Github issue with discussion and accepted design OR link to spec that describes this work.
  • Code follows the module structure standards.
  • Wrote unit and integration tests
  • Updated relevant documentation (docs/) or specification (x/<module>/spec/)
  • Added relevant godoc comments.
  • Added a relevant changelog entry to the Unreleased section in CHANGELOG.md
  • Re-reviewed Files changed in the Github PR explorer
  • Review Codecov Report in the comment section below once CI passes

@robert-zaremba robert-zaremba added the T: ADR An issue or PR relating to an architectural decision record label Jun 3, 2021
@github-actions github-actions bot added the T:Docs Changes and features related to documentation. label Jun 3, 2021
@roysc
Copy link
Contributor

roysc commented Jun 6, 2021

State sync snapshots will still need to send the contents of the external state and derive the commitment structure from it, right? We may want some way to register a namespace as a "sub-store", and let the module specify how it is serialized.

@robert-zaremba
Copy link
Collaborator Author

State sync snapshots will still need to send the contents of the external state and derive the commitment structure from it, right? We may want some way to register a namespace as a "sub-store", and let the module specify how it is serialized.

@roysc - I think it's better that stat sync works on DB level: it sends DB data (not structures) over the wire. So we have 2 DBs to sync: SS and SC. This can be implemented as streaming all key - value pairs. All DBs we evaluated support that.

@roysc
Copy link
Contributor

roysc commented Jun 8, 2021

This can be implemented as streaming all key - value pairs. All DBs we evaluated support that.

That should work fine, but the bootstrapping node will need to rebuild the state commitment store once the DB contents are synced. When it does this, it needs to exclude the records in the module's namespace; then the module needs to rebuild its custom SC store from those records.

So what I mean to say is, the module needs to specify how its data is deserialized, and the root store needs to be aware of what namespaces to exclude from its SC data.

@robert-zaremba
Copy link
Collaborator Author

but the bootstrapping node will need to rebuild the state commitment store once the DB contents are synced.

The only thing we need to rebuild SC is the root key. SC is also stored in DB, so we only need to reconstruct the runtime root object. Recall, that we have 2 DBs: one for SC, and one for SS (or the same DB but under 2 different namespaces). We will sync both databases, not use one to reconstruct the other one.

@roysc
Copy link
Contributor

roysc commented Jun 10, 2021

I was going off this comment: #8430 (comment). I believe either way should work, no? the SC data could be derived from SS, or sent in a snapshot - it's just a tradeoff of computing time vs. bandwidth, I don't have any context to know which we'd rather optimize.

My main question here is, if the module is storing an "external" object, how does it get sent during state sync, if at all?

@robert-zaremba
Copy link
Collaborator Author

You are right @roysc , initially we didn't plan to expose writing to SC only. So we could rebuild SC from SS.
I think it's easier and more reliable to do ABCI sync for both SC and SS. Other solution would be to not expose SC at all -- so disregard this PR and use the "normal" way. That means: if module X will like to commit some data it will create a new record and the store will save it in both SC and SS.
If we will have only few such records then the impact is zero and the less troubles on the development. What do you think?
Let's verify this with @aaronc .
cc: @ValarDragon - could you chime in? You also had comments about it.

@robert-zaremba robert-zaremba changed the title adr: update ADR-40 by adding low level access to SC chore(adr): update ADR-40 by adding low level access to SC Jun 10, 2021
@robert-zaremba robert-zaremba changed the title chore(adr): update ADR-40 by adding low level access to SC docs(adr): update ADR-40 by adding low level access to SC Jun 10, 2021
@roysc
Copy link
Contributor

roysc commented Jun 14, 2021

@robert-zaremba I would overall prefer not exposing SC unless there's a good use case, to keep the design from becoming too complicated. It depends on how important the external-storage/custom-SC feature is.

If it's just "nice to have", there should not be anything design-wise that would block this from being added in the future. It would mainly be a matter of supporting the proof format and defining a way to make the root store "aware" of the external data.

@robert-zaremba
Copy link
Collaborator Author

We have already use-cases where we will just like to hook to the SC.
The workaround for this is to save the record "normally" (so it will be stored in both SC and SS). I agree that this complicates slightly API.
If there won't be much data stored, then I agree - there is no point to make the low level API. If we will like to store more things, then that low level API will be useful.

@robert-zaremba
Copy link
Collaborator Author

After thinking more about it I'm more leaning towards not messing up with the internals. If the low level access will be used only to save few kb of data, then we can easily duplicate it and save normally (so both in SC and SS).

@aaronc
Copy link
Member

aaronc commented Jun 30, 2021

This is not just or even mostly about performance.

It is about exposing a proof format that is logical for client using a higher level API like the orm.

Also it is for allowing other state systems to hook into the proof system. In this case the overhead of storing twice may be small if it's just a hash, I'm not sure.

But for the orm use case I think the ability for storing different data in SC vs SS could greatly improve the client developer UX.

@robert-zaremba
Copy link
Collaborator Author

The key question is what we will save in SC which won't be saved in SS? If it's only few records per module then it's not worth.

If a module will use it's own storage, then the module will save the commitment to that store, unless that store will not have a succinct and efficient commitment. Example that we have been chatting about is a module with it's own SQL DB and will like to commit to all values individually. However I still wonder if it would make more sense to create another, module level SC on site and only commit the SC root globally and keep the API simple.

@aaronc
Copy link
Member

aaronc commented Jul 2, 2021

The key question is what we will save in SC which won't be saved in SS? If it's only few records per module then it's not worth.

For the ORM approach I think it will be every record which would be inefficient.

If a module will use it's own storage, then the module will save the commitment to that store, unless that store will not have a succinct and efficient commitment. Example that we have been chatting about is a module with it's own SQL DB and will like to commit to all values individually. However I still wonder if it would make more sense to create another, module level SC on site and only commit the SC root globally and keep the API simple.

I'm imagining that in the future an ORM/Object Store approach will be the default. So I think we should consider this a first class use case. Storing just a root would make proofs harder at the client level, no? Anyway, maybe let's sync up on a call.

@aaronc
Copy link
Member

aaronc commented Jul 2, 2021

@robert-zaremba I described the ORM use case in more detail here: #9156 (comment). Maybe SC doesn't need to be exposed to all modules, but it would be nice to allow this ORM/object store to use it under the hood.

@robert-zaremba
Copy link
Collaborator Author

Do we need to define here how the low level access will be exposed? I was thinking about the accessor method.

@robert-zaremba robert-zaremba mentioned this pull request Aug 6, 2021
38 tasks
@roysc
Copy link
Contributor

roysc commented Aug 11, 2021

As discussed on a call, having SC accessible as a separate store introduces the possibility of it being accessed out of sync. More generally, it adds complexity to the store's behavior that users need to be aware of. It may be better to somehow internally allow access for the ORM/object store feature as @aaronc suggests.

@github-actions
Copy link
Contributor

This pull request has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@github-actions github-actions bot added the stale label Sep 26, 2021
Copy link
Contributor

@roysc roysc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Needs more discussion and a design consensus. Having considered it, I agree with @aaronc that store keys would be the best access design for this. Let's evaluate if that would satisfy the need use cases.

@tac0turtle
Copy link
Member

@robert-zaremba can you address roy's comments. Lets please please get this merged.

@robert-zaremba
Copy link
Collaborator Author

The current proposal basically totally circumvents store keys and gives all modules direct access to all other modules stores

@aaronc The current proposal doesn't give direct access to all modules. The module will still receive a store key. But instead of receiving 3 store keys, it will get one and will use it with KVStore, SSStore or SCStore in order to access the right namespace.

@aaronc
Copy link
Member

aaronc commented Mar 11, 2022

The current proposal basically totally circumvents store keys and gives all modules direct access to all other modules stores

@aaronc The current proposal doesn't give direct access to all modules. The module will still receive a store key. But instead of receiving 3 store keys, it will get one and will use it with KVStore, SSStore or SCStore in order to access the right namespace.

I'm fine with some sort of multistore interface that allows access to different store types (including memory and transient and other types which may arise) with a single key. That's actually beneficial for the ORM. But it doesn't belong in sdk.Context, it belongs on the multistore.

I think we need a larger conversation about the specific ideal store key design for the ORM's use case. And ultimately I think we need this multistore design to support both IAVL and SMT. Let's find a time to unpack this more soon.

@tac0turtle
Copy link
Member

closing this as its been open for a year+

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
T: ADR An issue or PR relating to an architectural decision record T:Docs Changes and features related to documentation.
Projects
No open projects
Development

Successfully merging this pull request may close these issues.

None yet

8 participants