Schemas validation #31

leplatrem · 2015-04-30T21:31:53Z

Let's start a discussion about schema validation!

I believe we want something similar to Daybed, except that we won't
reinvent the wheel with a custom schema formalism.

It looks like there is no way around JSON schema. We won't rewrite thousands
of lines of validation code using Colander when existing validators are
available (see https://github.com/Julian/jsonschema).
We will use those validators, and frontends will build forms from schemas.

By default, collections could be schemaless.

PUT|PATCH|DELETE /collections/person/schema
{
    "title": "Example Schema",
    "type": "object",
    "properties": {
        "firstName": {
            "type": "string"
        },
        "lastName": {
            "type": "string"
        },
        "age": {
            "description": "Age in years",
            "type": "integer",
            "minimum": 0
        }
    },
    "required": ["firstName", "lastName"]
}

It is acceptable to validate the schema definition itself using Cornice/Colander,
even though JSON schema provides meta-schemas for that :)
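
For comparison, the meta-schema route could look roughly like the sketch below, using the jsonschema library linked above. The helper name `schema_definition_error` is made up for this illustration, and Draft 4 is an assumption, since nothing here settles on a draft.

```python
from jsonschema import Draft4Validator
from jsonschema.exceptions import SchemaError


def schema_definition_error(schema):
    """Return None if `schema` is a valid Draft 4 schema, else an error message.

    Hypothetical helper: the definition is checked against the Draft 4
    meta-schema instead of a hand-written Colander schema.
    """
    try:
        Draft4Validator.check_schema(schema)
    except SchemaError as error:
        return error.message
    return None
```

A schema endpoint could call this before storing a definition and answer 400 Bad Request when a message comes back.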

If a schema exists for a collection, records that don't match the schema are
refused with 400 Bad Request (on PUT and after merge of PATCH).
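
A minimal sketch of that rule, assuming the jsonschema library and Draft 4. The function names and the `(status, errors)` return shape are invented for the example and are not an existing Cliquet API. For PATCH, the changes are merged into the stored record first and the merged result is validated.

```python
from jsonschema import Draft4Validator

# The example schema from above.
PERSON_SCHEMA = {
    "title": "Example Schema",
    "type": "object",
    "properties": {
        "firstName": {"type": "string"},
        "lastName": {"type": "string"},
        "age": {"description": "Age in years", "type": "integer", "minimum": 0},
    },
    "required": ["firstName", "lastName"],
}


def validate_record(record, schema):
    """Return (200, []) for a matching record, else (400, [messages])."""
    validator = Draft4Validator(schema)
    errors = [error.message for error in validator.iter_errors(record)]
    return (400, errors) if errors else (200, [])


def validate_patch(existing, changes, schema):
    """Merge `changes` into `existing` (shallow), then validate the result."""
    merged = dict(existing)
    merged.update(changes)
    return validate_record(merged, schema)
```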

Unlike Daybed, I propose that each user owns the schema of the collection.
Especially because the schema endpoint will probably be a resource :)
Schemas could be cached to avoid the overhead of reading them from storage for each incoming record.
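
One possible shape for such a cache, as a sketch only: `SchemaCache`, its time-to-live, and the `load_schema(user_id, collection)` callable are all assumptions made for illustration, standing in for whatever the storage backend provides.

```python
import time


class SchemaCache:
    """Tiny in-memory cache of schemas keyed by (user_id, collection)."""

    def __init__(self, load_schema, ttl_seconds=60.0, clock=time.monotonic):
        self._load = load_schema      # hypothetical storage lookup
        self._ttl = ttl_seconds
        self._clock = clock           # injectable so expiry can be tested
        self._entries = {}            # key -> (expires_at, schema)

    def get(self, user_id, collection):
        key = (user_id, collection)
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]
        # Missing or expired: read from storage and remember the result.
        schema = self._load(user_id, collection)
        self._entries[key] = (now + self._ttl, schema)
        return schema

    def invalidate(self, user_id, collection):
        """Drop the cached entry, e.g. when the schema is replaced."""
        self._entries.pop((user_id, collection), None)
```

Keying on the user as well as the collection name keeps per-user schemas from leaking into each other.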

This implies that users can store heterogeneous data with the same collection name.

Hence, on first start, the JS application will have to check that the user did
not set any schema (e.g. at /collections/moz:readinglist:articles/schema).
If she did, the application asks for confirmation before replacing it. If she
hacks it afterwards and stores invalid records, the JS application may crash,
but that's okay.
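
The first-start check could follow the logic sketched below. It is written in Python for consistency with the server code, although the real client would be JavaScript. `fetch`, `store` and `confirm` are hypothetical callables standing in for the HTTP calls and the confirmation dialog.

```python
def ensure_schema(fetch, store, confirm, wanted):
    """Install `wanted` unless a different schema exists and the user declines.

    fetch()          -> current schema, or None if the collection is schemaless
    store(schema)    -> saves the schema (e.g. a PUT on the schema endpoint)
    confirm(current) -> True if the user agrees to replace `current`
    Returns True when `wanted` is in place afterwards.
    """
    current = fetch()
    if current == wanted:
        return True
    if current is None or confirm(current):
        store(wanted)
        return True
    return False
```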

Open questions

Does the schema use the resource code of Cliquet? If so, how do we avoid overlap with stored collections? We could use an underscore prefix, and prevent public collections from having a name that starts with an underscore :)
What happens to existing data when the schema is changed? Probably ignore it and wait for the next
PUT or PATCH?
What happens when records are shared between users? Do we let other users'
records crash our application when fetching shared records? We could probably
run client-side validation using JSON schema on shared records.
Do we want to provide a collection of custom formats or even types?
I'm thinking of recurrent needs for uuid4, geohash, GeoJSON objects, phone numbers, postal codes...
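
On the last question, the jsonschema library already lets applications register extra `format` checks through `FormatChecker`. The sketch below shows one way a `uuid4` format might be added. The format name and the example schema are assumptions, not part of any JSON schema standard.

```python
import uuid

from jsonschema import Draft4Validator, FormatChecker

checker = FormatChecker()


@checker.checks("uuid4", raises=ValueError)
def is_uuid4(value):
    """Accept only strings that parse as a version 4 UUID."""
    if not isinstance(value, str):
        return True  # formats only constrain strings; "type" handles the rest
    # uuid.UUID raises ValueError on malformed input, which the checker
    # reports as a format failure thanks to `raises=ValueError`.
    return uuid.UUID(value).version == 4


# Hypothetical schema using the custom format.
SCHEMA = {
    "type": "object",
    "properties": {"id": {"type": "string", "format": "uuid4"}},
}

validator = Draft4Validator(SCHEMA, format_checker=checker)
```

Custom types would need more than this (an extended validator class), so formats look like the cheaper starting point.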


leplatrem · 2015-05-06T20:39:39Z

For example, Angular forms can be generated from a JSON schema: http://schemaform.io/

almet · 2015-05-08T16:07:52Z

> Unlike Daybed, I propose that each user owns the schema of the collection.
> Especially because the schema endpoint will probably be a resource :)

I don't quite get this. How is that different from Daybed? I would propose that anyone who can create a collection can also create a schema.

almet · 2015-05-08T16:13:27Z

> Does the schema use the resource code of Cliquet? If so, how do we avoid overlap with stored collections? We could use an underscore prefix, and prevent public collections from having a name that starts with an underscore :)

I think this is handled by the "bucket" concept.

> What happens to existing data when the schema is changed? Probably ignore it and wait for the next PUT or PATCH?

That's a good question. I believe in this case it should be possible to iterate on all the records and apply a function to them, maybe?
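
Such a pass over existing records might look like the sketch below. `revalidate_all`, `is_valid` and `migrate` are hypothetical names: `is_valid` stands in for a check against the new schema, and `migrate` is the function applied to each record.

```python
def revalidate_all(records, is_valid, migrate=None):
    """Apply `migrate` to a copy of each record, then split by validity.

    Returns (valid, invalid) lists; the input records are left untouched
    as long as `migrate` only changes top-level keys (the copy is shallow).
    """
    valid, invalid = [], []
    for record in records:
        candidate = migrate(dict(record)) if migrate is not None else record
        if is_valid(candidate):
            valid.append(candidate)
        else:
            invalid.append(candidate)
    return valid, invalid
```

The invalid list could then be reported back to the schema owner instead of being silently dropped.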

> What happens when records are shared between users? Do we let other users' records crash our application when fetching shared records? We could probably run client-side validation using JSON schema on shared records.

In case we download data from somewhere, we assume it's already validated by the server, so I don't get where the problem lies here?

> Do we want to provide a collection of custom formats or even types? I'm thinking of recurrent needs for uuid4, geohash, GeoJSON objects, phone numbers, postal codes...

I think we should do that, but we would need to explore the JSON schema spec further to understand how.

Also, I think JSON schema has one big problem: its complexity. It doesn't seem simple to use. As such, we could probably provide an easier way to create schemas, which would then map to the standard?

leplatrem · 2015-05-11T07:47:37Z

> > Unlike Daybed, I propose that each user owns the schema of the collection.
> > Especially because the schema endpoint will probably be a resource :)
>
> I don't quite get this. How is that different from Daybed? I would propose that anyone who can create a collection can also create a schema.

Unlike Daybed, the collections are not global. It means that as a user, I can associate a schema with my todo collection, even if someone else has already set a different schema for her own todo collection.

> > Does the schema use the resource code of Cliquet? If so, how do we avoid overlap with stored
> > collections? We could use an underscore prefix, and prevent public collections from having a name that
> > starts with an underscore :)
>
> I think this is handled by the "bucket" concept.

Nope, what I meant with this was mozilla-services/cliquet#243,
and that the schema endpoint is built using the cliquet.resource.BaseResource class (CRUD).

> > What happens when records are shared between users? Do we let other users' records crash our
> > application when fetching shared records? We could probably run client-side validation using JSON
> > schema on shared records.
>
> In case we download data from somewhere, we assume it's already validated by the server, so I
> don't get where the problem lies here?

I was wondering what happens if two users have a different schema for the same collection name.

> Also, I think JSON schema has one big problem: its complexity. It doesn't seem simple to use.
> As such, we could probably provide an easier way to create schemas, which would then
> map to the standard?

I wouldn't go that way. If it's too complex, maybe we can imagine a WYSIWYG JSON schema builder?

almet · 2015-05-11T10:33:33Z

> Unlike Daybed, the collections are not global. It means that as a user, I can associate a schema with my todo collection, even if someone else has already set a different schema for her own todo collection.

Then we agree :-) However, the notion of "own" differs a little: with buckets, a resource can have multiple owners.

Gotcha about how we should store the schemas. This should be handled by mozilla-services/cliquet#243 then.

> I was wondering what happens if two users have a different schema for the same collection name.

We need some kind of namespacing here (and I believe this is achieved through buckets). Like on GitHub: leplatrem/cliquet differs from ametaireau/cliquet.

> I wouldn't go that way. If it's too complex, maybe we can imagine a WYSIWYG JSON schema builder?

I don't know the JSON schema spec well enough to make a call, but it seems that building such a tool would be harder than allowing a simpler format.

leplatrem added a commit that referenced this issue May 26, 2015

First draft of schema endpoint (ref #31)

4ea7e30

leplatrem added a commit that referenced this issue May 26, 2015

Validate collection records with schema (ref #31)

c03dc77

leplatrem added a commit that referenced this issue May 26, 2015

Validate JSONSchema for collections (ref #31)

627df48

leplatrem added a commit that referenced this issue May 26, 2015

Test validation with PUT and PATCH (ref #31)

64d3f33

leplatrem added a commit that referenced this issue May 26, 2015

Test validation with PUT and PATCH (ref #31)

51dc355

leplatrem added a commit that referenced this issue Jun 8, 2015

First draft of schema endpoint (ref #31)

761adb5

leplatrem added a commit that referenced this issue Jun 8, 2015

Validate collection records with schema (ref #31)

d3be6da

leplatrem added a commit that referenced this issue Jun 11, 2015

First draft of schema endpoint (ref #31)

40896b8

leplatrem added a commit that referenced this issue Jun 11, 2015

Validate collection records with schema (ref #31)

69203c0

leplatrem added a commit that referenced this issue Jun 16, 2015

First draft of schema endpoint (ref #31)

b9a3397

leplatrem added a commit that referenced this issue Jun 16, 2015

Validate collection records with schema (ref #31)

016a852

leplatrem added the enhancement label Jul 3, 2015

leplatrem modified the milestone: 1.4.0 Aug 14, 2015

leplatrem added a commit that referenced this issue Aug 21, 2015

Validate collection records with schema (ref #31)

79a030a

leplatrem added a commit that referenced this issue Aug 21, 2015

Set schema attribute on records (ref #31)

8995f94

leplatrem added a commit that referenced this issue Aug 21, 2015

Validate collection records with schema (ref #31)

1071fd8

leplatrem closed this as completed in 9691dbe Aug 28, 2015

Natim added a commit that referenced this issue Aug 28, 2015

Merge pull request #39 from Kinto/31-schema-validation

4b9cf05
