Description
Verida schemas
The Verida protocol supports defining a schema for data stored in Verida databases. These schema'd databases are called datastores. See Verida schema documentation.
Anyone can define a schema and reference it by URL. The Verida protocol defines a collection of base schemas in the schema repo.
All Verida schemas should match the JSON schema specification.
We need a Verida meta-schema
Verida schemas need to provide additional metadata, beyond just "validation rules".
This includes:
- Information about the database storing this schema (database name, indexes etc.)
- Display information used when displaying this schema in the Verida Vault (or other applications)
You can see an example of this additional metadata in the draft Verida social/contact schema.
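To make the two kinds of metadata concrete, here is a minimal sketch of how they might sit alongside ordinary JSON Schema keywords. Every property name (`database`, `display`, `listFields`), the `undefinedFields` helper, and both URLs are assumptions made for illustration; they are not taken from the draft social/contact schema or from any published Verida metaschema.

```typescript
// Hypothetical shape for the extra Verida metadata (names are assumptions).
interface VeridaSchemaMeta {
  database: {
    name: string;
    indexes: Record<string, string[]>; // index name -> indexed field names
  };
  display: {
    title: string;
    titlePlural: string;
    listFields: string[]; // fields shown when listing records
  };
}

// A JSON Schema document extended with the metadata above.
interface VeridaSchema extends VeridaSchemaMeta {
  $schema: string;
  $id: string;
  type: "object";
  properties: Record<string, { type: string; title?: string }>;
  required: string[];
}

const contactSchema: VeridaSchema = {
  $schema: "https://example.com/verida-metaschema/v1.json", // placeholder URL
  $id: "https://example.com/contact/v1.json", // placeholder URL
  type: "object",
  database: { name: "social_contact", indexes: { email: ["email"] } },
  display: {
    title: "Contact",
    titlePlural: "Contacts",
    listFields: ["firstName", "email"],
  },
  properties: {
    firstName: { type: "string", title: "First name" },
    email: { type: "string", title: "Email" },
  },
  required: ["firstName", "email"],
};

// Sanity check a metaschema could enforce: every field named in the
// metadata must be declared in `properties`.
function undefinedFields(schema: VeridaSchema): string[] {
  const referenced: string[] = [];
  for (const indexName of Object.keys(schema.database.indexes)) {
    referenced.push(...schema.database.indexes[indexName]);
  }
  referenced.push(...schema.display.listFields);
  return referenced.filter(
    (field, i) => referenced.indexOf(field) === i && !(field in schema.properties)
  );
}

console.log(undefinedFields(contactSchema));
```

A published metaschema would express the same constraints declaratively in JSON Schema; the TypeScript types here only illustrate the intended shape.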
Tasks:
- Define a JSON structure for storing the above metadata
- Define and publish a Verida JSON metaschema
- Update our existing schemas to reference the new Verida JSON metaschema
We need a plan for schema versioning
Schemas will change. We need to have a clearly defined strategy and process to help developers update schemas.
I'm not sure we need to solve this problem now, but we at least need a plan.
Things to consider:
- What prior art / existing versioning strategies exist? (ask community?)
- How to migrate data from one schema to an upgraded version?
- How to ensure all client applications are using the latest schema?
- How strict should client implementations be to ensure the latest schema is being used?
- What role can the Verida Vault play to ensure valid schemas are being used?
- Need to support versioning of the meta schema
- When data is signed, it includes signing the schema URI. This means "updating" the schema to a new URI will invalidate the signature unless the data is re-signed by the data originator (unlikely).
My initial thoughts:
In phase 1, we support versioning by convention, building the version into the URL, e.g. .../social/contact/schema/v1.json. Since multiple schemas can use the same database, it's possible to have records stored in the same database but using different schema versions.
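As a rough illustration of the phase 1 convention, the sketch below extracts the version from a schema URL and counts how many records in one database use each version. The helper names (`parseSchemaUrl`, `countByVersion`) and the `schema` field on each record are assumptions for this example, not part of the Verida client.

```typescript
interface SchemaRef {
  base: string;    // URL without the trailing version segment
  version: number; // integer parsed from "v<N>.json"
}

// Returns null when the URL does not follow the ".../v<N>.json" convention.
function parseSchemaUrl(url: string): SchemaRef | null {
  const match = /^(.*)\/v(\d+)\.json$/.exec(url);
  if (match === null) return null;
  return { base: match[1], version: parseInt(match[2], 10) };
}

// Count how many records use each schema version.
function countByVersion(
  records: Array<{ schema: string }>
): Record<number, number> {
  const counts: Record<number, number> = {};
  for (const record of records) {
    const ref = parseSchemaUrl(record.schema);
    if (ref === null) continue; // unversioned URLs are skipped in this sketch
    counts[ref.version] = (counts[ref.version] || 0) + 1;
  }
  return counts;
}

const sampleRecords = [
  { schema: ".../social/contact/schema/v1.json" },
  { schema: ".../social/contact/schema/v1.json" },
  { schema: ".../social/contact/schema/v2.json" },
];
const versionCounts = countByVersion(sampleRecords);
console.log(versionCounts);
```

A client could use a count like this to decide whether a database still holds records on an older schema version.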
In phase 2, we support a "data migration" process, whereby a new schema version can define a data migration schema. The Verida client can support applying this data transform based on the data migration schema to convert data from an older schema to the new schema.
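One way the phase 2 idea could work is a chain of per-version transforms that the client applies in order. The sketch below is an assumption about the shape of such a mechanism: the `Migration` type, the `migrate` function, and the example `name` to `firstName`/`lastName` change are all hypothetical and do not describe an existing Verida API or schema.

```typescript
type DataRecord = Record<string, unknown>;

// Hypothetical migration step from one schema version to the next.
interface Migration {
  from: number;
  to: number;
  transform: (data: DataRecord) => DataRecord;
}

// Apply migrations one step at a time until the target version is reached.
function migrate(
  data: DataRecord,
  fromVersion: number,
  toVersion: number,
  migrations: Migration[]
): DataRecord {
  let current = data;
  let version = fromVersion;
  while (version < toVersion) {
    const step = migrations.filter((m) => m.from === version)[0];
    if (step === undefined) {
      throw new Error(`No migration defined from v${version}`);
    }
    if (step.to <= version) {
      throw new Error(`Migration from v${version} does not advance the version`);
    }
    current = step.transform(current);
    version = step.to;
  }
  return current;
}

// Example step: v1 stores a single `name`, v2 splits it into two fields.
const splitNameMigration: Migration = {
  from: 1,
  to: 2,
  transform: (data) => {
    const fullName = typeof data["name"] === "string" ? data["name"] : "";
    const [firstName, ...others] = fullName.split(" ");
    const next: DataRecord = { ...data, firstName, lastName: others.join(" ") };
    delete next["name"];
    return next;
  },
};

const upgraded = migrate({ name: "Ada Lovelace" }, 1, 2, [splitNameMigration]);
console.log(upgraded);
```

Note that a transform like this changes the record, so any signature over the original data and schema URI would no longer apply to the migrated copy.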
Tasks:
- Research current versioning strategies
- Document a set of staged recommendations to build into the Verida protocol
- Sign off
- Implement first phase of recommendations
Schema security
There's a security risk where a schema is specified by URL and the schema is later modified (or the hosting provider is hacked), so the same URL serves a different schema. For example, someone could modify the schema to remove the list of required fields, allowing invalid data to be saved across the network.
I don't think we need to solve this right now, but we need to consider the implications and have a strategy to improve this in the future.
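The required-fields example can be shown with a deliberately tiny check. This sketch only looks at the JSON Schema `required` keyword and is not a full validator; `MiniSchema` and `missingRequired` are names made up for the illustration.

```typescript
// Only the part of a JSON Schema needed for this illustration.
interface MiniSchema {
  required?: string[];
}

// List the required fields that are absent from the data.
function missingRequired(
  schema: MiniSchema,
  data: Record<string, unknown>
): string[] {
  return (schema.required || []).filter((field) => !(field in data));
}

const originalSchema: MiniSchema = { required: ["firstName", "email"] };
const tamperedSchema: MiniSchema = {}; // same URL, but "required" was removed

const incompleteContact = { firstName: "Ada" };

// Rejected under the original schema: "email" is missing.
console.log(missingRequired(originalSchema, incompleteContact));
// Accepted under the tampered schema: nothing is required any more.
console.log(missingRequired(tamperedSchema, incompleteContact));
```

The data and the schema URL are identical in both calls; only the content served at that URL differs.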
It's possible to use IPFS to store a schema and then reference the content addressable URI within data saved using the Verida protocol.
In a future phase, we could support on-chain "schema hashes" via the Trust Framework. This allows schemas to be referenced by an on-chain hash instead of an HTTPS URL. Ceramic network also provides similar capabilities.
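The common idea behind content addressing and schema hashes is to pin a digest of the schema and compare it against whatever the URL returns. Below is a minimal sketch using Node's built-in `crypto` module and SHA-256. The `PinnedSchemaRef` type, the function names, and the placeholder URL are assumptions; this is neither an IPFS CID computation nor the Trust Framework's actual format.

```typescript
import { createHash } from "crypto";

// Hex-encoded SHA-256 digest of the schema exactly as served.
function schemaDigest(schemaText: string): string {
  return createHash("sha256").update(schemaText, "utf8").digest("hex");
}

// Hypothetical reference that pairs a URL with the expected digest.
interface PinnedSchemaRef {
  url: string;
  sha256: string;
}

// True only if the fetched schema text matches the pinned digest.
function verifySchema(schemaText: string, ref: PinnedSchemaRef): boolean {
  return schemaDigest(schemaText) === ref.sha256;
}

const publishedSchema = '{"required":["firstName","email"]}';
const pinnedRef: PinnedSchemaRef = {
  url: "https://example.com/contact/v1.json", // placeholder URL
  sha256: schemaDigest(publishedSchema),
};

const tamperedSchemaText = '{"required":[]}';

console.log(verifySchema(publishedSchema, pinnedRef));
console.log(verifySchema(tamperedSchemaText, pinnedRef));
```

This hashes the raw text, so a harmless whitespace change would also fail the check; a real design would need to decide whether to canonicalize the JSON before hashing.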
Tasks:
- Document an initial assessment of the security risks
- Research appropriate mitigation strategies and document recommendations
Community resources
The following community resources exist and seem active:
- JSON schema slack -- I've had success here in the past
- Google Groups
- Github Discussions