
Create shared environment across nested import/validation loops via a metadata object #266

Closed
wants to merge 5 commits

Conversation

@bintoro (Member) commented Dec 18, 2014

Summary

This patch introduces a metadata object that is propagated through every level of a recursive import/validation process. It is used to carry options and other data that need to be available throughout the entire process, not just during the outermost loop.

Background

One thing currently blocking the Undefined feature is the fact that import/validation parameters (importantly, partial=True) only affect the first level of processing. If you have nested models, your settings no longer apply.

Apparently someone has noticed this before and created this construct in import_loop to pass the deserialization mapping down to nested models:

try:
    ...
    # Try passing the nested deserialization mapping down to the converter.
    raw_value = field_converter(field, raw_value, mapping=model_mapping)
except Exception:
    # The converter doesn't accept a mapping argument; call it without one.
    raw_value = field_converter(field, raw_value)

And there was a similar try/except inside the field_converter in question.

The patch

EDIT: Changed param name from meta to env.

Instead of adding yet another parameter here, I have future-proofed this by turning the third parameter into a multipurpose object representing the current environment. Import converters and validators can now propagate options and whatever data they please in the env parameter. One practical application might be bookkeeping during validation; currently it's possible to have a ModelType structure that gets into an infinite loop.

The above code now looks like this:

    sub_env = env.copy() # Make a copy
    sub_env.mapping = model_mapping  # Put the relevant mapping in there
    raw_value = field_converter(field, raw_value, env=sub_env) # Pass it on

The way this works is that any function (including the initial caller) can decide to set up the Environment object. Once it enters the processing chain, its contents override any overlapping parameters passed as keyword args (EDIT: not by necessity, of course, but because import_loop chooses to treat it that way).

Right now, the setup is the responsibility of the outermost import_loop. It takes the strict and partial arguments it receives and locks them in for the rest of the import recursion.
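To illustrate the mechanism, here is a minimal sketch of how the outermost call could lock the options in, assuming a simple attribute-bag Environment class and a simplified import_loop signature (the names and defaults below are illustrative, not the exact code in this patch):

class Environment(object):
    """Attribute bag carried through the whole import/validation recursion."""

    def __init__(self, **options):
        self.__dict__.update(options)

    def copy(self):
        # Shallow copy so a nested call can override, say, the mapping
        # without affecting its siblings.
        duplicate = Environment()
        duplicate.__dict__.update(self.__dict__)
        return duplicate


def import_loop(cls, instance_or_dict, field_converter, env=None,
                strict=False, partial=True):
    if env is None:
        # Outermost call: collate the keyword options and lock them in
        # for the rest of the recursion.
        env = Environment(strict=strict, partial=partial, mapping=None)
    # ... recurse over the fields, handing env to every field_converter call ...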

As an aside, if all options were collated into an env object from the start, the parameter signature of import_loop, convert, and friends could look like this:

def ...(cls, instance_or_dict, env)

Although incorporating many options into a single object may be considered a bit opaque, it's clearly preferable to always passing a dozen values to every method and their aunts:

model.validate(...)
=> validate(...)
=> import_loop(...)
=> field_converter(...)
=> field.validate(...)
=> submodel.validate(...)

That's five calls for each level of recursion.

BREAKING CHANGE WARNING

I've gotten rid of the try/except constructs that otherwise would have been all over the place.

Considering that the env container is primarily needed in a recursive process, the method signature dilemma has been solved by requiring compound types (MultiType descendants) to accept the env argument in their to_native and validate_ methods. Simple types are called without env.

Since custom compound types need a (trivial) change as a result, this would be a great opportunity also to get rid of the context parameter that appears in a ton of places. It would be better placed inside env as well.
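To make the required change concrete, a custom compound type might end up looking roughly like this after the patch; the class, its wrapped field, and the error message are purely illustrative and not taken from the Schematics code base:

from schematics.exceptions import ValidationError
from schematics.types.compound import MultiType


class PairType(MultiType):
    """Illustrative compound type: wraps another field and holds exactly two values."""

    def __init__(self, field, **kwargs):
        self.field = field
        super(PairType, self).__init__(**kwargs)

    def to_native(self, value, env=None):
        # Compound types now receive the shared env and hand it on when
        # delegating to nested fields or models.
        return [self.field.to_native(item) for item in value]

    def validate_length(self, value, env=None):
        # validate_* methods on compound types likewise accept env.
        if len(value) != 2:
            raise ValidationError('Expected exactly two items.')

Simple types, such as the wrapped field here, keep their existing signatures and are still called without env.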

Other changes incorporated here

  • Model.__init__ now accepts the partial parameter too (default=True because it was the de facto default already, specified in the main convert func).
  • Fixed a bug where ListType.validate_items() was added to the list of validators twice: once explicitly and once via collation by TypeMeta.

@lkraider (Contributor) commented Apr 9, 2015

I like the idea here: having the environment accessible during the loop helps keep any options inspectable during the recursion. Do we need the pass-by-copy, though? Couldn't we devise a structure similar to namedtuple that would be immutable during the loop?
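(For illustration, such an immutable environment could be built on namedtuple; the field names below are assumptions, not anything from the patch:)

from collections import namedtuple

# An immutable environment: nested calls derive new instances instead of mutating.
Environment = namedtuple('Environment', ['strict', 'partial', 'mapping', 'context'])

env = Environment(strict=True, partial=True, mapping=None, context=None)
model_mapping = {'n': 'count'}  # example mapping for a nested model
sub_env = env._replace(mapping=model_mapping)  # env itself is left untouched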

Also, the context usage should probably be cleaned up to pass only opaque user data, as requested in #274, while adding another dict to store the already validated data (this is what import_loop uses the context for now).

@kstrauser (Contributor)

I like this, but:

Simple types will still need to receive env if we're going to move context into it.

I agree with @lkraider about context being a user-generated data bag. The intent was that a user could pass in information telling types how to export (or import) data within the context of the current situation. The canonical example is MultiLingualString, which can export to many different languages but won't know which one at instantiation time. The same could be true for imports. Say you don't know what locale you'll be parsing data from at model creation time, but you don't want a hypothetical StrictlyLocalizedFloatType to accept "123,45" in an en_US context, because that's not a valid representation in America, and "123.45" isn't valid in de_AT. We should support context in the import loop symmetrically to how it's supported in the export loop.
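To make that symmetry argument concrete, here is a rough sketch of the hypothetical type, assuming context arrives as a plain dict with a 'locale' key (none of these names exist in the library, and locale handling is reduced to a decimal-separator check):

from schematics.types import BaseType
from schematics.exceptions import ConversionError


class StrictlyLocalizedFloatType(BaseType):
    """Hypothetical type: only accepts the decimal separator of the given locale."""

    SEPARATORS = {'en_US': '.', 'de_AT': ','}

    def to_native(self, value, context=None):
        locale = (context or {}).get('locale', 'en_US')
        separator = self.SEPARATORS[locale]
        if separator == ',':
            if '.' in value:
                raise ConversionError('Not a valid %s number: %s' % (locale, value))
            value = value.replace(',', '.')
        elif ',' in value:
            raise ConversionError('Not a valid %s number: %s' % (locale, value))
        return float(value)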

Since this is a (mildly) breaking change, we'd need to bump the minor version at least.

@jmsdnns (Member) commented Oct 20, 2015

Sorry, it is not clear to me what problem this solves.

The reason the arguments from the outermost loop are locked in is to keep them consistent throughout the entire process. If you want a different result, export with different arguments.

Could you describe the problem this change solves?

@jmsdnns (Member) commented Oct 20, 2015

One thing currently blocking the Undefined feature is the fact that import/validation parameters (importantly, partial=True) only affect the first level of processing. If you have nested models, your settings no longer apply.

OK. I am thinking on this here.

Are you saying the value for partial is not passed throughout the rest of the export loop?

@bintoro (Member, Author) commented Oct 20, 2015

Are you saying the value for partial is not passed throughout the rest of the export loop?

Yes, and neither is strict; see #316. Part of the problem is that changing the argument lists is a breaking change. That's why I propose we set up this container, where parameters can be added in the future without breaking things.

@jmsdnns (Member) commented Oct 20, 2015

ohhhhh well THAT is a bug that should be fixed.

I don't think we need to pass around an environment variable as much as we need subsequent calls to use the same keyword arguments.

I am not sure what the cause is yet, but it's supposed to copy the initial arguments the whole way down.

@bintoro (Member, Author) commented Oct 20, 2015

The reason I prefer a single environment object is that, once it has been implemented, it's zero maintenance going forward. The same cannot be said about repeating this huge argument list in 100 methods around the codebase, particularly since authors of custom types also need to keep their code in sync with the full range of available arguments.

Reliance on individual keyword args is probably what has led to the kind of conditional code shown in the original post.

@jmsdnns (Member) commented Oct 20, 2015

Hmm. OK. I better understand what you're trying to do now. I see the point about every type having to support the same keyword arguments now too.

What we'd get with this approach is a dict that has all the values we care about, but rogue fields would be ignored by types that don't know how to use particular values. This is the primary strength as I see it.

I think I am on board with this but wanna think on it a little longer.

Thanks for explaining that as I worked my way through what you're thinking about.

@kstrauser (Contributor)

Agreeing with @bintoro. I wrote #318 and it's an ugly mess that touches everything. I'd rather rip the band-aid off and be done with it so we can add future features without breaking all the things.

@jmsdnns (Member) commented Oct 20, 2015

srsly blowing my mind that the arguments weren't preserved... what's even the point of a strict flag if it's ignored by compound types?!

we'll get this worked out properly.


@kstrauser (Contributor)

Hey, we were all surprised by that one. 😄 This is off-topic for this particular PR, but I'll mention it anyway: even if we merged this today we'd still need to reimplement #318 on top of it to get strict working. I'd be so happy to throw away all the special-case code and start over with a clean foundation, though.

@jmsdnns (Member) commented Oct 20, 2015

I've been away from schematics for a while, so my memory on things is rusty, but let it be known that my goal is to always play like a champion.

if we know what the ideal solution to something is, let's always do that. if not, we can talk until it seems obvious what to do. :)

we'll probably do this one, and then i guess that will require a new implementation of #318.

i'll think on this one and hopefully merge it in really soon.


@kstrauser (Contributor)

I haven't had the opportunity to be as active lately as in the past, but I haven't forgotten about it. Also, #324 is still open. One day it may be appropriate to have a new major version with a new API. Or not. But we can discuss it to see what makes the most sense!

@bintoro (Member, Author) commented Oct 21, 2015

I've also been inactive for most of this year due to vision problems that kept me away from coding. In the past few weeks I've been looking at Schematics again, and it's an interesting coincidence that @jmsdnns is now back as well.

Another reason this PR was left pending is that it's a kind of breaking change. And if we're going to break things, then it would make the most sense to try to identify as many legacy pain points as possible and target a new (major?) release that would bundle all the backward-incompatible changes together. Then, the API could again remain constant for the foreseeable future.

Also, something to figure out is what to do with the arguments (such as context) that already exist throughout the call chain. In #316, I floated the idea that context should go inside env too, which got tentative support from @kstrauser.

The way I envisioned this, individual arguments (existing and future) should remain available in methods that are typically directly called by application code using Schematics. For example, Model.__init__ would still take strict and partial arguments. But in the majority of the other ("internal") methods, particularly those inside type classes, these would be absent and env would propagate the values.

@bintoro (Member, Author) commented Nov 15, 2015

I think it's time to move on this, mostly because the Undefined feature needs this, but also because it's been nearly a year and there's no point in waiting any longer.

I'm currently rewriting the PR, and I'd like some input on what to do with context.

Based on prior discussion, the context argument in import_loop means "trusted data" and should be renamed to reflect that. That much is clear.

The other usage ("arbitrary user data") could otherwise continue to be called "context", but I think we'd like all conversion functions to accept such arbitrary data, including import_loop. Well, we can't just suddenly change the meaning of context in the import_loop signature from trusted data to arbitrary user data.

That said, once env is available, the ability to pass options as individual arguments is just a convenience:

instance.import_data({'name': 'bintoro'}, partial=True, strict=True)
instance.import_data({'name': 'bintoro'}, env=ConfigObject(partial=True, strict=True))

These two statements are equivalent. In the former, import_loop just takes care of constructing the env object.
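Sketching that boundary, assuming a ConfigObject that simply stores options as attributes (a stand-in, not the final class), the public methods would only need a thin shim along these lines:

class ConfigObject(object):
    """Stand-in for the env container: stores options as attributes."""

    def __init__(self, **options):
        self.__dict__.update(options)


def make_env(env=None, **options):
    # Use the caller's env if one was supplied; otherwise fold the individual
    # keyword options (partial, strict, ...) into a fresh container.
    return env if env is not None else ConfigObject(**options)

import_data and friends would call something like make_env(env, partial=partial, strict=strict) at the public boundary and then pass only env down into import_loop.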

If we forgo this convenience for context, then I think it's OK to continue to use that name, since it won't appear in the import_loop (or any other) signature any longer.

The other option is to eliminate "context" throughout the library and replace it with something like "private", "app", "user", ...

(Basically, env already is a context container where applications could put anything. But I think it's a good idea to have a dedicated object to keep the user namespace separate from reserved parameters.)

My current thinking is that we could just drop context from function signatures because

  • it's not the most commonly-used feature
  • when you need it, creating the env object is very easy
  • "context" is a pretty good name for user data.

EDIT:

And just to be clear, env can appear together with individual arguments as well. Let's say I just want to add context to the first statement above:

instance.import_data({...}, partial=True, strict=True, env=ConfigObject(context={...}))

Also, the code that wants to use the context data is going to access it via env.context anyway, so it kind of makes sense to set it through the same route.

EDIT 2:

I'll probably add support for supplying env as a dictionary (to avoid having to import ConfigObject), which will simplify the above to:

instance.import_data({...}, partial=True, strict=True, env={'context': {...}})

@bintoro (Member, Author) commented Nov 16, 2015

Closing in favor of #359.

@bintoro closed this Nov 16, 2015
@bintoro deleted the meta branch November 16, 2015 19:57