[spec] data schema #126

Merged (2 commits) on Apr 14, 2021
Member

@tzemanovic commented Apr 12, 2021

Contributor

@cwgoes left a comment

Generally looks well thought-out; a few questions.


At a high level, all the data in the [accounts' dynamic sub-spaces](accounts.md#dynamic-storage-sub-space) is just keys associated with arbitrary bytes. To help the processes that read and write this data (transactions, validity predicates, intents) interpret it and implement interesting functionality on top of it, the ledger could provide a way to describe the schema of the stored data.

For storage data encoding, we currently use the borsh library, which can derive a schema that describes the structure of the data generically, so that it can easily be consumed in different data-exchange formats such as JSON. In Rust code, the data can be composed of Rust's native ADTs (`struct` and `enum`) and basic collections (fixed- and dynamically-sized arrays, hash maps, hash sets). Borsh already has decent coverage in other languages, e.g. JS and TypeScript, JVM-based languages and Go, which we'll hopefully be able to support in wasm in the near future too.
Contributor

Should we implement borsh in Juvix? cc @mariari

}
```

When the transaction is applied, the data is stored together with a reference to the derived data schema, e.g.:
Contributor

Where is the schema stored?

Contributor

In particular, is this efficient? (we don't want to store the schema along with each copy of the data)

Member

I'm not sure a transaction can tell that the data is of type MultiSig when it reads an account.
I think the schema could be stored in the DB under a key like `<height>/schema/<address>/.../<var_name>`.
When a transaction reads a variable of the account (under the key `<address>/.../<var_name>`), it can read the schema entry with the same key.

In that case, I suppose the schema has to be stored along with each copy because, for example, other accounts may hold a MultiSig with different members.
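
The per-copy layout suggested in this comment could be sketched roughly as follows. This is only an illustration under assumptions: `Db`, `schema_key`, `write_with_schema` and `read_schema` are hypothetical names, the DB is modeled as an in-memory map, and the schema is treated as opaque bytes.

```rust
use std::collections::BTreeMap;

/// In-memory stand-in for the ledger DB (hypothetical, for illustration only).
pub type Db = BTreeMap<String, Vec<u8>>;

/// Derives the schema key `<height>/schema/<data_key>` for a data key of the
/// form `<address>/.../<var_name>`.
pub fn schema_key(height: u64, data_key: &str) -> String {
    format!("{}/schema/{}", height, data_key)
}

/// Writes a value together with its own copy of the schema.
pub fn write_with_schema(db: &mut Db, height: u64, data_key: &str, data: &[u8], schema: &[u8]) {
    db.insert(data_key.to_string(), data.to_vec());
    db.insert(schema_key(height, data_key), schema.to_vec());
}

/// Looks up the schema stored alongside `data_key`, if any.
pub fn read_schema<'a>(db: &'a Db, height: u64, data_key: &str) -> Option<&'a [u8]> {
    db.get(&schema_key(height, data_key)).map(|v| v.as_slice())
}
```

With this layout every data entry carries its own schema entry, so two accounts that store a MultiSig end up with two schema entries even when the schemas are identical.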

Member Author

Yeah, I think it should be possible to reuse schema definitions, e.g. by storing the schemas in storage outside of the accounts' sub-spaces under unique identifiers (such as a hash of the schema). The data in an account's sub-space could then store the identifier of its schema (if one is specified).

With the MultiSig example, we would store the schema at `<height>/schema/<schema_id>`, and many accounts could use this schema with e.g.:

  • MultiSig encoded data at `<height>/subspace/<address>/<sub_key...>`
  • `<height>/subspace/<address>/<sub_key...>/schema = <height>/schema/<schema_id>`
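
A minimal sketch of this shared-schema layout, under stated assumptions: `Storage`, `put_schema`, `put_data` and `schema_for` are hypothetical names, the storage is an in-memory map, schemas are opaque bytes, and `DefaultHasher` stands in for whatever hash would actually produce `<schema_id>` (a real ledger would presumably use a cryptographic hash).

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::BTreeMap;
use std::hash::{Hash, Hasher};

/// Flat key-value storage standing in for the ledger DB (hypothetical).
#[derive(Default)]
pub struct Storage {
    entries: BTreeMap<String, Vec<u8>>,
}

impl Storage {
    /// Stores a schema once under `<height>/schema/<schema_id>` and returns that key.
    /// NOTE: `DefaultHasher` is a placeholder for a proper content hash.
    pub fn put_schema(&mut self, height: u64, schema: &[u8]) -> String {
        let mut hasher = DefaultHasher::new();
        schema.hash(&mut hasher);
        let key = format!("{}/schema/{:016x}", height, hasher.finish());
        // Identical schemas map to the same key, so they are stored only once.
        self.entries.entry(key.clone()).or_insert_with(|| schema.to_vec());
        key
    }

    /// Writes data at `<height>/subspace/<address>/<sub_key>` and a sibling
    /// `.../schema` entry that holds the key of the shared schema.
    pub fn put_data(&mut self, height: u64, address: &str, sub_key: &str, data: &[u8], schema_key: &str) {
        let data_key = format!("{}/subspace/{}/{}", height, address, sub_key);
        self.entries.insert(format!("{}/schema", data_key), schema_key.as_bytes().to_vec());
        self.entries.insert(data_key, data.to_vec());
    }

    /// Follows the `.../schema` reference and returns the shared schema bytes.
    pub fn schema_for(&self, height: u64, address: &str, sub_key: &str) -> Option<&[u8]> {
        let ref_key = format!("{}/subspace/{}/{}/schema", height, address, sub_key);
        let schema_key = std::str::from_utf8(self.entries.get(&ref_key)?).ok()?;
        self.entries.get(schema_key).map(|v| v.as_slice())
    }
}
```

In this sketch each account stores only a short reference next to its data, while the schema bytes themselves exist once per distinct schema.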

A possible slight variation would be to have a dedicated special account for schemas, so we could add a validity predicate to it instead of having its logic in the ledger.

I think it would be nice if we could somehow split the storage fee for schemas among their users, so that the most commonly used schemas would be very cheap.

Member

@yito88 left a comment

LGTM 👍

@tzemanovic merged commit 7151a36 into master Apr 14, 2021
@tzemanovic deleted the tomas/data-schema-design branch April 14, 2021 07:40

3 participants