Skip to content

Extra documentation for new context implementation? #2824

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
RosanneZe opened this issue Apr 24, 2025 · 14 comments
Open

Extra documentation for new context implementation? #2824

RosanneZe opened this issue Apr 24, 2025 · 14 comments

Comments

@RosanneZe
Copy link

I was a bit surprised that in the documentation there is only an example for how to use the experimental Context class, and not an example of how to do it the none experimental way?

This seems a lot more complex than the old way of doing it, is there anyway to just say the context is a dict and leave it at that?

@lafrech
Copy link
Member

lafrech commented Apr 24, 2025

Nope. The old not experimental way is gone.

The rationale for this is explained in #1826 and #2707.

@RosanneZe
Copy link
Author

Ah, I think that I just misread the upgrade notes then. I took 'Use contextvars.ContextVar for passing context to fields, pre-/post-processing methods, and validators instead.' to mean that there was also a non-experimental way of doing it, that I just didn't understood because I never did anything with contextvars.

@lafrech
Copy link
Member

lafrech commented Apr 28, 2025 via email

@juledwar
Copy link

One way that I use the context is to pass information from a parent schema into a nested one for conditional validation. This change has made things quite a bit more complex, was this use case considered?

@sloria
Copy link
Member

sloria commented May 12, 2025

@juledwar can you post an example of the use case with the legacy API that you're trying to achieve with the new API? maybe we can assist with the migration

@juledwar
Copy link

I would appreciate that, thanks.

I'll try and pare it down here, I am not permitted to post the code as-is, so this is not something I've run but it should give you the rough idea:

class Foo(Schema):
    name = String(required=True)
    reason = String(required=True)

    @pre_load
    def set_required(self, data, **kwargs):
        if self.context.get('reason_not_required') is False:
            self.fields['reason'].required = False
        return data

class Bar(Schema):
    id = Integer()
    details = Nested(Foo)

    @pre_load
    def needs_reason(self, data, **kwargs):
        if data.get('some_thing'):  # This could be any property of the data
            self.context['reason_not_required'] = False
        return data

Previously I could encapsulate all of this decision making in the schema here, but now I need to set context vars outside of the schema. There's no way to enforce their presence as well, as far as I can tell, other than to just blow up at runtime. It would be nice to pass something via the field constructor but this is not something that can be controlled in the schema definition. The hacky way would be to do something like
self.fields.details.orig_data = data
and examine it in Foo's hooks.

This change also has some ramifications for OneOfSchema and I filed another issue on that project:
marshmallow-code/marshmallow-oneofschema#214

@lafrech
Copy link
Member

lafrech commented May 12, 2025

Sounds like a polymorphism issue, indeed. Indeed not really the case I had in mind for the context feature.

@juledwar
Copy link

juledwar commented May 12, 2025

This is absolutely not using polymorphism here, it's just one example of setting properties to affect changes in validation on the nested schema, which is one of the use cases described in the documentation.

The OneOfSchema thing is possibly a polymorphism issue but I've written more there.

@lafrech
Copy link
Member

lafrech commented May 13, 2025

We could think of it as Bar being polymorphic, some_thing being the key, and Bar1 nesting Foo1 in which reason is required and Bar2 nesting Foo2 in which reason is not required.

But maybe that's not the most natural way to see it. And it would only work on this possibly contrived example where the check on some_thing doesn't contain any complicated business logic. And it may require more code.

I don't see what part of the doc you're pointing to but such a use case may have been used to illustrate the old context feature. What I'm saying is that I didn't have this in mind when proposing the changes in v4.

I don't have time to try this now but doesn't it work if you just replace self.context with Context.get()?

@juledwar
Copy link

I think you're confusing OneOfSchema with this one? The doc I'm referring to is here https://marshmallow.readthedocs.io/en/latest/custom_fields.html#using-context where it talks about using external data to affect the serialization.

My use case is to encapsulate this into the schema itself because forcing callers to provide the context when it's already all available in the parent schema seems quite awkward.

I'll see if Context.get() works - I was initially concerned about the value persisting without a reset()

@lafrech
Copy link
Member

lafrech commented May 14, 2025

I'm not confusing with OneOfSchema. I somehow missed the page you're pointing to and only saw this one.

Anyway, you're right, it won't work if the context is not opened by the with Context call. This is how it is meant to work. It is what I thought initially but for some reason, after reading the doc, I was in doubt.

A different approach to this would be to do the validation in the parent: in needs_reason, raise if the field is missing from Foo payload. I understand you like the logic to be contained in Foo schema but in this example, reason_not_required is quite explicit already, so the abstraction is weak. Could be because the example is too contrived.

Also, I understand it sounds better to use schema/fields API rather than duplicate validation code, but modifying schema such as in self.fields['reason'].required is not something I'm happy supporting. There's been discussion about this in #1281, although it was more specifically triggered by concerns about other attributes.

I hope I'm being helpful. Sorry about the wrong hint in previous comment.

@juledwar
Copy link

You're right in that it was a super-contrived example for the sake of brevity. But it's a general pattern I feel should be easier to do as I have quite a few nested schemas that need to change behaviour based on some external data that is a property of the original parent schema's data.

I would rather just inject the data from the call to load() than duplicate the validation in every parent. I could also use inheritance of a common validator in the parent but that feels conceptually wrong too.

Perhaps if I rename the context variable like this it's less of a weak-looking abstraction:

class Foo(Schema):
    name = String(required=True)
    reason = String(required=True)

    @pre_load
    def set_required(self, data, **kwargs):
        if self.context.get('totem') is False:
            self.fields['reason'].required = False  # or could do anything else
        return data

class Bar(Schema):
    id = Integer()
    details = Nested(Foo)

    @pre_load
    def set_totem(self, data, **kwargs):
        if data.get('some_thing'):  # This could be any property of the data
            self.context['totem'] = False
        return data

This is why self.context was so useful: it was a short-lived piece of data that was scoped to the data being loaded and perfectly thread safe as it is on the stack.

Regarding changing self.fields['reason'].required I think if you don't want to support that, the variable should be "private" with a leading _ otherwise API users see it as fair game (as noted in the Issue you linked), but it's such a useful way of dynamically setting the flag it would be a shame to see it go. How else could you make a field conditionally required this easy? A custom validator, sure, but that's many more lines of code.

I hope I'm being helpful. Sorry about the wrong hint in previous comment.

You have been nothing but very helpful every time I have a question on this project!

@sloria
Copy link
Member

sloria commented May 15, 2025

just brainstorming: would adding a setter to Context meet your use case? so something like

from marshmallow.experimental.context import Context

class Foo(Schema):
    name = String(required=True)
    reason = String(required=True)

    @pre_load
    def set_required(self, data, **kwargs):
        if Context.get()["totem"] is False:
            self.fields['reason'].required = False  # or could do anything else
        return data

class Bar(Schema):
    id = Integer()
    details = Nested(Foo)

    @pre_load
    def set_totem(self, data, **kwargs):
        if data.get('some_thing'):  # This could be any property of the data
            Context.set({"totem": False})  # <-- doesn't yet exist
        return data

with Context({"totem": True}):
    Bar().load({...})

we held off on adding a setter to avoid pathological cases where the context isn't reset. we could avoid that by raising an error if the setter is called without entering the context manager.

@juledwar
Copy link

I'm not sure it would help as you've still got the context data being set outside the load(). Ideally this needs to be encapsulated in the schema itself.

I think a decent way to manage this might be to have a context hook on the base schema class, a bit like pre_load etc, where you can set up context based on the incoming data. The base schema class can then unset it later after the data was loaded.

class Bar(Schema):
    id = Integer()
    details = Nested(Foo)

    @load_context
    def set_totem(self, data, **kwargs):
        if data.get('some_thing'):  # This could be any property of the data
            return {'totem': False}
        return None

It's not returning data like the other hooks, it purely returns something to set in a contextvar, which is all managed for you by the base class. Similarly, a @dump_context could work in the same way.

This would effectively remove the need for users to know anything about contextvars, but would still use them "under the hood" by using a setter and a reset when needed.

Thoughts?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants