Propagate environment during recursive import/export/validation #359

bintoro · 2015-11-16T10:54:58Z

A rewrite of #266.

This is a huge PR, but it's split into meaningful commits, each of which represents a working state with all tests passing.

env is now always passed to all to_primitive, to_native, and export_loop methods on types. Consequently, the standard field_converter interface also has env as a mandatory component: field_converter(field, value, env).

The env object will be set up by either import_loop or export_loop during the first iteration if the application hasn't supplied it by then.

`context`

The overloading of "context" with a number of uses has been resolved as follows:

context in import_loop() and validate() renamed to trusted_data
context as an argument to to_native(), to_primitive(), validate(), and mock() superseded by env
context as an argument to expand in transforms.py renamed to expanded_data
context when accessing the private context replaced by env.context.

Basically, env.context is just a recommendation for establishing a private namespace. Apparently the only thing in the library that accesses env.context is MultiLingualString that looks for a locale there.

`export_loop` inspection gone

[DELETED]

Miscellaneous changes

ModelType.strict option removed for now. To support something like this, we would need to differentiate between at least two flavors of settings:
- static settings (specified as function defaults, model options, or ModelType definitions), where later (as in deeper level of recursion) overrides earlier
- explicit settings (specified by application at runtime) that would override everything else.
I suspect ModelType.strict may have been a band-aid to provide some degree of control where the runtime strict setting hasn't been doing anything.
BaseType.validate_required() removed. import_loop enforces required=True for fields directly on a model, so this was only needed for nested fields. They now get this validator in MultiType.__init__() as needed.

Validators

env is now passed to all validators, including model-level validators. Although the object is absolutely needed only in compound type validators, it's probably best to make the change everywhere at once, since appending env to the argument list is all that is needed to adapt existing code.

edc · 2015-11-16T17:13:05Z

schematics/transforms.py

+        env = ConfigObject()
+    elif not isinstance(env, ConfigObject):
+        env = ConfigObject(env)
+    env._setdefault('partial', partial)


I think it makes more sense to have values passed as arguments overwrite those in env.

Having two ways to specify options is confusing. But I understand you don't want to break backward compatibility.

Hmm. There may be some misunderstanding here. For the most part, options are supposed to be specified just like before:

foo.import_data({...}, strict=True, partial=True)

This PR does not intend to change that.

An application may also pass an env object, but there are only two reasons to do so:

you want to put some proprietary data in it

your custom type or Model class wants to change the parameters in the middle of the recursive import/export process and propagate the new settings from there on.

What this code does is it sets up the env container on the first invocation, and then the options will be frozen thereafter. Letting arguments overwrite the prior state on every call is precisely the problem this PR aims to fix.

Here's an example of how a call stack might look like before:

Model.__init__({...}, strict=True)

Model.convert({...}, strict=True)

convert({...}, strict=True)

import_loop({...}, strict=True)

ModelType.to_native(...)

Model.import_data(...)

Model.convert(...)

convert(...)

import_loop(...)

Okay, back at import_loop. What's the value of strict? It's False, set by convert() at step 7. But obviously it should be True, since that's what was originally specified.

There were two ways to fix this:

make sure every function throughout the chain propagates every possible parameter as an argument, or

encapsulate the settings in a single object in the beginning and pass that around instead.

I opted for the latter approach because it's way cleaner and you can now add new options without touching everything.

If there's something that was easy to do before but becomes awkward due to this change, I'm sure we can address it somehow.

bintoro · 2015-11-17T13:17:16Z

Okay, I think I get it now about to_native(). You might want to export a structure that can be fed to another library that understands standard types like datetime.datetime but does not understand models.

Fair enough, but in that case "native" is not a well-defined concept. Alternatively, Model.to_native() is a misnomer, since the end result is a mixed native/primitive representation.

Resolving what "native" means is out of scope for this PR, so I'm going to shelve the fifth commit.

lkraider · 2015-11-17T13:38:45Z

I like the cleanup idea proposed and the patch as is seems good.

Do you have an idea how this could affect execution performance? What's the tradeoff in overhead/ease of use of passing ConfigObjects instead of dicts or maybe namedtuples? (this is just a question, don't block this patch acceptance based on it - I think performance should be a next goal for schematics as a whole).

bintoro · 2015-11-17T17:14:50Z

Great idea about namedtuple. When I wrote #266, I assumed the container should be mutable, but I don't think that's actually the case at all, particularly now that we're designating env.context for arbitrary data.

Aside from being faster, namedtuple would also prevent bugs arising from accidental modification of the settings.

(Whenever env is modified locally, such as in import_loop where mapping is replaced by its subtree, the object must be copied — otherwise the changes would apply in parent scopes too. A namedtuple would enforce this.)

However, using namedtuple means that passing a partial env as an argument won't be so practical. So, ideally, context should get its own argument after all, and that brings us back to question of whether it's too outrageous to change the meaning of context in import_loop or if it should be renamed.

In general I totally agree about the performance concerns. It would be great if someone put together a testcase for trying out some optimization ideas.

lkraider · 2015-11-17T20:25:07Z

I see no problem in re-purposing the context keyword if we are breaking compatibility, it was de-facto an internal API anyway (the new trusted_data). But calling it user_data might be clearer.

bintoro · 2015-11-19T05:04:35Z

Now using namedtuple with a modified constructor that allows omitting values to make testing easier:

>>> from schematics.transforms import ExportContext
>>> ExportContext(role='public')
ExportContext(role='public', raise_error_on_role=None, print_none=None, app_data=None)

Added _branch method to simplify copying env during the recursion. It's like _replace but ignores None values and does not create a copy unless needed.

In a hypothetical ModelType class that has support for strict and partial settings,

params = {}
if self.strict is not None:
    params['strict'] = self.strict
if self.partial is not None:
    params['partial'] = self.partial

branched_env = env._replace(**params) if params else env

return model.import_data(value, env=branched_env)

can now be expressed as

params = {
  'strict': self.strict,
  'partial': self.partial
}

return model.import_data(value, env=env._branch(**params))

bintoro · 2015-11-23T17:52:14Z

Although the diff is all over the place, this is a rather straightforward PR, so I'm preparing to merge.

Last call regarding nomenclature:

(1) After @lkraider alleviated my concerns about reusing the retired context parameter, I'm now considering just using that instead of env because:

env can sound like an application-wide or library-wide thing, while context has all the right connotations
it would reduce the need for mechanical adaptation of old code, as the old context in most places is simply about to be replaced with the new env
context is no longer used in any conflicting meaning.

(2) The old context data bag is now called app_data, which I felt was more neutral and unambiguous than either private_data or user_data, but I'm open to being overruled on this too.

Thoughts?

If we make the switch from env to context, migrating custom types would go like this:

Add context as a third parameter to validator methods.
No need to touch method signatures that already include context.
In actual method code, replace context with context.app_data if you're using the data for something.

lkraider · 2015-11-23T21:18:40Z

Can you list the flow of options/context now? For example, how will something like validate(partial=True, context={partial=False}) behave? (I suppose the context will be overridden by the parameter?)

bintoro · 2015-11-24T06:00:35Z

For example, how will something like validate(partial=True, context={partial=False}) behave? (I suppose the context will be overridden by the parameter?)

Taken literally, that form (a user-provided dict as context) no longer exists, since with the addition of the app_data parameter everything can now be specified as regular arguments. In fact, the context parameter should be removed altogether from the likes of Model.validate() that make up the main API.

(EDIT: Actually no, it can't be removed because the same methods are also called from ModelType which needs to pass on the context object.)

Other than that, context always overrides any individual arguments, since they could just be random defaults.

This now makes things very simple: up until the first invocation of import_loop or export_loop there is no context object — everything's in the arguments. Once we get to the loop function, it creates the context object if there was none, and from there on only context matters.

The only case where a context object needs to be created from scratch is when writing tests for a method that needs one:

  mls = MultilingualStringType(default_locale='en_US')

  assert mls.to_primitive(
      {'en_US': 'snake', 'fr_FR': 'serpent'},
      context=ExportContext(app_data={'locale': 'fr_FR'})) == 'serpent'
              ^^^^^^^^^^^^^

lkraider · 2015-11-24T13:47:43Z

Thanks for clearing it up, sounds great for me.

This patch introduces a metadata object that is propagated through every level of a recursive import/validation process. It is used to carry options and other data that need to be available throughout the entire process, not just during the outermost loop.

* Add BaseType.parent_field * Remove BaseType.validate_required() (see MultiType.__init__()) * Remove ModelType.strict * Move ModelType.validate_model()

To simplify the creation of dummy env objects, use a modified namedtuple whose constructor sets missing fields to None.

bintoro · 2015-11-24T21:21:49Z

I believe this PR is now mature enough to be merged.

It resolves #316 and facilitates the addition of new conversion options in the future without causing disruption to existing methods.

Propagate environment during recursive import/export/validation

chekunkov · 2015-12-18T12:11:53Z

@bintoro hi, when is this change going to be released on PyPI?

bintoro · 2016-01-06T15:27:15Z

@chekunkov I'm not sure. The patch is technically backward-incompatible, since all to_* methods and custom validators are now required to accept the context parameter.

If we want to be really strict about semantic versioning, this would have to wait until v2.0-alpha. If we accept mildly breaking changes in a minor release, this could be shipped pretty quickly in v1.2.0. I haven't decided.

bintoro · 2016-01-22T18:10:00Z

Just FYI, I've decided to include this patch in v1.2.0 that will be released shortly.

I'm going to alleviate upgrading pains by making the context parameter optional on validator functions. If a validator doesn't declare the parameter, the function will be placed in a wrapper that drops the extra argument at runtime.

jacobstr · 2016-11-04T00:49:03Z

I'm curious what the state of the 1.2 release is?

ryanhiebert · 2016-11-04T02:34:53Z

I know that I'm using a 2.0 pre-release personally. You might wish to try it yourself, to see if this feature is in that, and if it works for your cases.

bintoro mentioned this pull request Nov 16, 2015

Create shared environment across nested import/validation loops via a metadata object #266

Closed

edc reviewed Nov 16, 2015
View reviewed changes

bintoro force-pushed the env branch from 8bcffa8 to 5369a56 Compare November 17, 2015 13:29

bintoro mentioned this pull request Nov 17, 2015

export_loop() vs. to_primitive() #337

Closed

bintoro mentioned this pull request Nov 18, 2015

How to apply fields' default values #360

Closed

bintoro force-pushed the env branch from 91fdfad to 5bf13c3 Compare November 19, 2015 01:33

bintoro force-pushed the env branch 3 times, most recently from c647fb7 to 98ae7e9 Compare November 20, 2015 03:50

bintoro force-pushed the env branch 3 times, most recently from 297d6e7 to 5427dde Compare November 24, 2015 05:29

bintoro added 6 commits November 24, 2015 21:44

Edit some field properties and validators

0dc5819

* Add BaseType.parent_field * Remove BaseType.validate_required() (see MultiType.__init__()) * Remove ModelType.strict * Move ModelType.validate_model()

Use 'env', 'trusted_data', etc. instead of 'context'

9023af1

Add 'env' to export_loop, field converters, validators

7f28c9f

Switch env to a namedtuple & add tests

b4109dd

To simplify the creation of dummy env objects, use a modified namedtuple whose constructor sets missing fields to None.

Use parameter 'context' instead of 'env'

92b70f0

bintoro force-pushed the env branch from 5427dde to 92b70f0 Compare November 24, 2015 21:06

bintoro added a commit that referenced this pull request Nov 24, 2015

Merge pull request #359 from bintoro/env

2482d50

Propagate environment during recursive import/export/validation

bintoro merged commit 2482d50 into schematics:development Nov 24, 2015

This was referenced Nov 24, 2015

strict=False doesn't propagate as expected #316

Closed

inconsistent context use for import vs export #274

Closed

Propagate strict overrides through nested models #318

Open

bintoro deleted the env branch April 30, 2016 17:19

himikof mentioned this pull request May 30, 2017

Handling sub-models with unknown fields #492

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Propagate environment during recursive import/export/validation #359

Propagate environment during recursive import/export/validation #359

bintoro commented Nov 16, 2015

edc Nov 16, 2015

bintoro Nov 16, 2015

bintoro commented Nov 17, 2015

lkraider commented Nov 17, 2015

bintoro commented Nov 17, 2015

lkraider commented Nov 17, 2015

bintoro commented Nov 19, 2015

bintoro commented Nov 23, 2015

lkraider commented Nov 23, 2015

bintoro commented Nov 24, 2015

lkraider commented Nov 24, 2015

bintoro commented Nov 24, 2015

chekunkov commented Dec 18, 2015

bintoro commented Jan 6, 2016

bintoro commented Jan 22, 2016

jacobstr commented Nov 4, 2016

ryanhiebert commented Nov 4, 2016

Propagate environment during recursive import/export/validation #359

Propagate environment during recursive import/export/validation #359

Conversation

bintoro commented Nov 16, 2015

context

export_loop inspection gone

Miscellaneous changes

Validators

edc Nov 16, 2015

Choose a reason for hiding this comment

bintoro Nov 16, 2015

Choose a reason for hiding this comment

bintoro commented Nov 17, 2015

lkraider commented Nov 17, 2015

bintoro commented Nov 17, 2015

lkraider commented Nov 17, 2015

bintoro commented Nov 19, 2015

bintoro commented Nov 23, 2015

lkraider commented Nov 23, 2015

bintoro commented Nov 24, 2015

lkraider commented Nov 24, 2015

bintoro commented Nov 24, 2015

chekunkov commented Dec 18, 2015

bintoro commented Jan 6, 2016

bintoro commented Jan 22, 2016

jacobstr commented Nov 4, 2016

ryanhiebert commented Nov 4, 2016

`context`

`export_loop` inspection gone