Schema API #466

Merged · 54 commits · Mar 15, 2017

Conversation

@lkraider (Contributor) commented Dec 9, 2016

Implements #446

This is the foundational work to restore compatibility with SQLAlchemy. To do that, the concept of Serializable setters is reintroduced, along with a lower-level abstraction of the Model named Schema.

The functional API can now rely on the Schema declaration directly, which simplifies integration by exposing a smaller surface. More work will need to be done to introduce a CompoundType with pure Schema.

Additionally, the concept of data validation stages is introduced (see contrib/machine.py) and initial work has been done in the Model to apply that, but it's not fully realized yet. Discussion for a more granular approach will be presented in another PR.
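
To make the serializable-setter idea concrete, here is a rough sketch of what it enables at the Model level. This is illustrative only: the `Person` model and the `full_name.setter` usage are modeled on the 0.9-era API that this PR reintroduces, not taken from the diff itself.

```python
from schematics.models import Model
from schematics.types import StringType
from schematics.types.serializable import serializable


class Person(Model):
    first_name = StringType(required=True)
    last_name = StringType(required=True)

    @serializable
    def full_name(self):
        # Computed from the stored fields on serialization.
        return '%s %s' % (self.first_name, self.last_name)

    # Setter sketch: writing to the serializable feeds data back into the
    # instance, which is what SQLAlchemy-backed models need.
    @full_name.setter
    def full_name(self, value):
        self.first_name, self.last_name = value.split(' ', 1)
```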

Ideally _data will be an immutable dict and all writes happen in
_raw_data. Data access will return _raw_data first, then _data if set.
Serialization will always attempt to validate and output only if valid.

Another idea is to move the data state into a container, so that each
field can be individually queried, which may reduce the amount of `context`
that needs to be passed around during the transforms.
A Model's data needs to follow the specified data state flow; conversion
will always be implicit. Calling the classmethod defeats the expected use,
and the functional API is preferred in this case.
The idea is to move to a common type for fields and avoid mixing
descriptors with base types, but a common abstraction hasn't been worked
out yet. The slots will help reduce memory usage for this intermediary
wrapper for now.
This is the old-style functional test, which relies on a model instance,
but it should not rely on model attribute access to test state.

Also removed the use of `_initial`, since it's not a source of valid data,
and changed the assumption to be in line with the data state flow.
This keeps the interface closer to the previous Model, while extracting
the data state handling into a single concern class.

This structure can be simplified if the data state is stored with
the data itself, but that will need more changes, especially to the types.
It is something we should investigate further.
It seems the loops could be optimized by moving the granularity of data
state into each field.
The data mutation step is done implicitly before validation.
Still missing is the aggregation of mutation errors to halt validation.
Required for compatibility with SQLAlchemy; this might be wasteful if convert
is not a no-op when the data is already converted.
@lkraider (Contributor, Author)

This is ready for merging. Fixes SQLAlchemy integration.

I am leaving the deprecated API compatibility layer enabled by default; work will need to be done to refactor it out later.

Unless there are objections, I'll merge it this Friday.

Validation will now always clear invalid data from the Model, leaving
accessible only what each field considers valid.

A versioned Model could be devised to keep the different states
available for inspection, or at least the invalid data could be returned
by the exception.
Update parameters/docs to reference schema.
To avoid losing validation information, previous input data will be
restored after serialization into the Model.
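
Seen from the caller's side, the validation-clearing behaviour described above looks roughly like the sketch below. It assumes the 2.x Model API and `schematics.exceptions.DataError`; the comment about the cleared value simply restates the commit note rather than asserting anything beyond it.

```python
from schematics.models import Model
from schematics.types import StringType
from schematics.exceptions import DataError


class Ticket(Model):
    code = StringType(required=True, max_length=4)


ticket = Ticket({'code': 'TOO-LONG'})  # conversion succeeds; not validated yet

try:
    ticket.validate()
except DataError as exc:
    # Per the commit note above, the invalid value is cleared from the Model
    # during validation, so only data each field considers valid remains
    # accessible afterwards.
    print(exc)
```
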
@bintoro (Member) commented Dec 23, 2016

@lkraider Could you maybe illustrate what's going on here in an easy-to-understand fashion, if possible? :)

@lkraider (Contributor, Author) commented Dec 23, 2016

@bintoro The motivation for this work was to bring serializable setters (from 0.9) back into mainline. This feature is required for SQLAlchemy integration (supported by https://github.com/schematics/schemalchemy).

To make this possible, the transform functions had to be changed back to accepting and operating on instance data, since setters in the model have to modify instance state (this also enables instance validators again).

My approach was to avoid making the Model instance a requirement and instead have the transforms accept a "mutable" mapping, so that non-model data can be validated too. To make this explicit, I opted to separate the Model from the actual "Schema" specification that is really needed during the transforms. I took inspiration from the SQLAlchemy architecture, which splits the Table description from the declarative Base model mapping. This split helps because it separates the concerns of the structure itself from its workflow (the Schema encodes the structure, the Model encodes the workflow).

As for the setters: reintroducing them means they can generate "raw data" during a transform (separated into its own _mutate step), but for the Model this means valid data needs to be kept apart from raw data, so the Model dict now has that segregation built in. There is another way to achieve this that I want to explore further, namely attaching metadata to the values to indicate their state, but that requires more discussion.

For this PR, I kept Schema and Model interchangeable by providing some "deprecated" API mixins. The intent is to drop them completely eventually.

Below is an overview of the encoded states now:
[image: schematics-state-machine diagram]

As you can see, each step will only operate with data that is in the correct state from the previous step (at least that is the idea), and mutation resets the data state.
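
For readers who want the idea without the diff, here is a purely illustrative toy version of the split (plain Python, not the actual classes or signatures from this PR): the Schema only describes structure, while the Model owns the data and walks it through the raw → converted → valid states.

```python
# Toy stand-ins for the Schema/Model split and the data states described above.
RAW, CONVERTED, VALID = 'raw', 'converted', 'valid'


class ToySchema(object):
    """Structure only: field names mapped to converter callables."""
    def __init__(self, **fields):
        self.fields = fields            # e.g. {'age': int}


class ToyModel(object):
    """Workflow only: holds the data and tracks its state."""
    def __init__(self, schema, data):
        self.schema = schema
        self._raw_data = dict(data)     # setters / mutation write here
        self._data = {}                 # only converted data lands here
        self.state = RAW

    def convert(self):
        self._data = {name: conv(self._raw_data[name])
                      for name, conv in self.schema.fields.items()}
        self.state = CONVERTED

    def validate(self):
        if self.state != CONVERTED:     # each step needs the previous state
            self.convert()
        self.state = VALID
        return self._data
```

Mutating `_raw_data` in this toy version would drop the state back to raw, which corresponds to the "mutation resets the data state" transition in the diagram.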

About 2.5 times faster using explicit dict construction instead of the namedtuple `_asdict` method.
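
For context, the comparison behind that commit can be reproduced with a quick measurement like the one below (numbers vary by interpreter; this is only a measurement sketch, not the code path changed in the PR):

```python
from collections import namedtuple
from timeit import timeit

Pair = namedtuple('Pair', 'value state')
p = Pair('foo', 'raw')

# namedtuple's generic _asdict vs. spelling the dict out explicitly
t_asdict = timeit(lambda: p._asdict(), number=1000000)
t_explicit = timeit(lambda: {'value': p.value, 'state': p.state}, number=1000000)
print(t_asdict / t_explicit)  # the ratio the commit message refers to
```
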
@bintoro (Member) commented Dec 23, 2016

At this point I barely understand any of the diffs. This is going to take a long time to review. Would someone else like to give it a shot?

@lkraider (Contributor, Author)

Sorry for the long delay; I am ready to merge it this week.

@bintoro (Member) commented Jan 29, 2017

First question: why not have Model inherit from Schema?

@lkraider (Contributor, Author) commented Jan 31, 2017

I preferred composition in this case, to make the split clearer. Also, because the Model is already complicated enough with metaclass behaviour and properties, I opted not to overload it with more attributes (which could conflict with user-defined properties). It also makes the interaction with the SQLAlchemy Base easier to understand.

The Model's responsibility now is to encapsulate API usage, the schema, and the data state; a rough sketch of the composition follows the list below.

Conceptually:

  • Model 'has a' Schema
  • Schema is inferred from the Model description (by means of ModelMeta)
  • Model encodes the data transitions in the state machine (schematics API + data state)
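
The sketch below focuses on the "Schema is inferred from the Model description" point. `ToyModelMeta` and `ToySchema` are hypothetical stand-ins for illustration only, not the real ModelMeta; the only real schematics pieces used are the field types.

```python
from schematics.types import BaseType, StringType


class ToySchema(object):
    """Structure only: the collected field descriptors."""
    def __init__(self, name, fields):
        self.name = name
        self.fields = fields


class ToyModelMeta(type):
    """Collects the declared field types into a schema, roughly like ModelMeta."""
    def __new__(mcs, name, bases, attrs):
        fields = {k: v for k, v in attrs.items() if isinstance(v, BaseType)}
        cls = super(ToyModelMeta, mcs).__new__(mcs, name, bases, attrs)
        cls._schema = ToySchema(name, fields)   # Model 'has a' Schema
        return cls


class ToyModel(metaclass=ToyModelMeta):
    pass


class Person(ToyModel):
    name = StringType(required=True)


print(Person._schema.fields)   # e.g. {'name': <StringType object ...>}
```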

@lkraider (Contributor, Author) commented Mar 1, 2017

I will move forward and merge this patch into the develop branch this weekend; there are fixes to other issues that depend on this, and it should not break high-level compatibility with the current 2.0.0a code. People can propose changes in other PRs.
