Order of pre-processing and reading data into dict #179

Closed
taion opened this Issue Mar 17, 2015 · 4 comments

@taion
Contributor

taion commented Mar 17, 2015

Currently, preprocessors are called after the output dictionary is constructed. As far as I can tell, this makes it impossible to, e.g., unwrap the namespacing operation shown in the docs examples purely within a deserializer.

It's also surprising that the preprocessor doesn't run before the output dictionary is constructed, since that would be the natural parallel to postprocessors on Schema.dump.
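
To make this concrete, here is roughly the shape of data involved; the "user" envelope key is assumed from the docs example, and the exact key doesn't matter:

```python
# What the enveloping data handler in the docs example produces on dump
# (the "user" key is assumed for illustration):
wrapped = {"user": {"name": "Monty", "email": "monty@python.org"}}

# What the schema's fields expect to receive on load:
flat = {"name": "Monty", "email": "monty@python.org"}

# Per the above, this unwrapping can't be done purely within a field
# deserializer; it would have to happen before the individual fields are
# deserialized, and preprocessors currently run after that step.
```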

@sloria

Member

sloria commented Mar 18, 2015

Thanks @taion for reporting.

> Currently, preprocessors are called after the output dictionary is constructed.

Preprocessors are executed prior to validation and before the data are passed to make_object.

> It's also surprising that the preprocessor doesn't run before the output dictionary is constructed, since that would be the natural parallel to postprocessors on Schema.dump.

@davidism reported a similar issue in #153. Let's continue the discussion about this on that issue.

@taion

Contributor

taion commented Mar 18, 2015

I looked at that issue. I think I'm concerned about something different.

My concern is that there's no pluggable step in deserialization that happens before the deserialization of individual fields. The current pipeline for deserialization looks like:

  1. Deserialize fields into dictionary
  2. Run preprocessors
  3. Run validators
  4. Make object

By contrast, the pipeline for serialization looks like:

  1. Serialize fields into dictionary
  2. Run postprocessors (data handlers)

I believe @davidism is talking about adding something before (1) above in serialization. My concern is different: there's nothing before (1) in deserialization that can reverse the effect of a data handler.

For example, I can't use a preprocessor to reverse the effects of the data handler in this example: http://marshmallow.readthedocs.org/en/latest/extending.html#transforming-data

Additionally, because preprocessing is done in fields.Unmarshaller rather than in a method on the Schema object (like Schema._postprocess), I can't even conveniently override the relevant methods, as I do to work around #177.
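
One possible stopgap is to override Schema.load itself and unwrap the envelope there before delegating, sidestepping the preprocessing machinery entirely. A rough sketch, with the schema, field names, and "user" envelope key all illustrative:

```python
from marshmallow import Schema, fields


class UserSchema(Schema):
    name = fields.String()
    email = fields.Email()


class NamespacedUserSchema(UserSchema):
    """Workaround sketch: strip the assumed envelope before normal deserialization."""

    def load(self, data, *args, **kwargs):
        # Remove the "user" envelope so the fields see a flat dict; from here
        # on, deserialization follows the usual pipeline (1)-(4) above.
        if isinstance(data, dict) and "user" in data:
            data = data["user"]
        return super(NamespacedUserSchema, self).load(data, *args, **kwargs)
```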

@sloria

Member

sloria commented Mar 20, 2015

Thanks for the clarification @taion. You raise a valid point. I think adding the pre- and post-load hooks suggested in #153 would be useful for the use case you're describing. The pipeline would then become:

  1. pre_load hook
  2. Deserialize fields into dict
  3. Validate
  4. post_load hook
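
For illustration, the proposed hooks might end up looking something like the sketch below. The decorator names pre_load / post_load and their signatures are assumptions based on the proposal in #153, not an existing API, and the "user" envelope key and User type are made up:

```python
from collections import namedtuple

# Assumed import location and decorator names for the proposed hooks.
from marshmallow import Schema, fields, post_load, pre_load

User = namedtuple("User", ["name", "email"])


class UserSchema(Schema):
    name = fields.String()
    email = fields.Email()

    @pre_load
    def unwrap_envelope(self, data, **kwargs):
        # Step 1: runs before any fields are deserialized, so the
        # "user" envelope can be stripped here.
        return data["user"]

    @post_load
    def make_user(self, data, **kwargs):
        # Step 4: runs after field deserialization and validation.
        return User(**data)


# Loading {"user": {"name": ..., "email": ...}} would then yield a User
# instance, mirroring how an enveloping data handler works on dump.
```

Hooks like these would give deserialization the same symmetry with data handlers on Schema.dump that the discussion above asks for.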

@taion

Contributor

taion commented Mar 20, 2015

Works for me. Closing this issue then.
