Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Regression on list of nullable nested fields #1497

Closed
Meallia opened this issue Jan 30, 2020 · 7 comments · Fixed by #1503
Closed

Regression on list of nullable nested fields #1497

Meallia opened this issue Jan 30, 2020 · 7 comments · Fixed by #1503

Comments

@Meallia
Copy link

Meallia commented Jan 30, 2020

Hello,

While porting some code from 2.20 to 3.3, I stumbled upon this change in behavior regarding list of nullable fields.

in 2.20 the following code works:

from marshmallow import fields, Schema

class S(Schema):
    f = fields.List(fields.Nested(Schema, allow_none=True))

assert S().dump(dict(f=[None])).data == dict(f=[None])

whereas the same code in 3.3 fails with
AssertionError: assert {'f': [{}]} == {'f': [None]}

from marshmallow import fields, Schema

class S(Schema):
    f = fields.List(fields.Nested(Schema, allow_none=True))

assert S().dump(dict(f=[None])) == dict(f=[None])

I could not find anything related to this in the upgrade instructions or the documentation.
I'm fully aware that this is an edge case and I don't mind looking into it if it's actually considered as a bug.

here are two unit tests that can be used to reproduce this issue:

def test_list_of_nested_nullable_serialize_none_to_none(self):
    res = fields.List(fields.Nested(Schema, allow_none=True)).serialize("foo", {"foo": [None]})
    assert res == [None]

def test_nested_multiple_nullable_serialize_none_to_none(self):
    res = fields.Nested(Schema, allow_none=True, many=True).serialize("foo", {"foo": [None]})
    assert res == [None]
@Meallia Meallia changed the title Regression Regression for list of nulla Jan 30, 2020
@Meallia Meallia changed the title Regression for list of nulla Regression for list of nullable nested fields Jan 30, 2020
@Meallia Meallia changed the title Regression for list of nullable nested fields Regression on list of nullable nested fields Jan 30, 2020
@deckar01
Copy link
Member

I was able to track this change down to #1306. This behavior was inherited from Nested(many) when optimizing List(Nested) to use the same fast Schema.dump(many) call. This was not an expected side effect, but the behavior was already present in a closely related field.

Using 2.x you can reproduce this behavior using Nested(many):

class S(Schema):
    f = fields.Nested(Schema, allow_none=True, many=True)

S().dump(dict(f=[None]))
# MarshalResult(data={u'f': [{}]}, errors={})

allow_none doesn't actually have any effect during dump, and neither of these examples behave differently when it is omitted or set to False. None will generally be accepted during serialization, because we don't validate during serialization (anymore, we used to a little).

When the value of a field is None, all of the field dump methods, including Nested, explicitly check for None and return None. #1306 lets Schema.dump handle everything and it does not have any special None handling. Schema().dump(None) is {}.

@sloria @lafrech Should Schema().dump(None) return None instead of {}?

@deckar01
Copy link
Member

As far as I can tell there is no documentation that says None has to serialize to None. In fact, in the first example of the "Creating a field class", _serialize checks for None and returns an empty string.

https://marshmallow.readthedocs.io/en/stable/custom_fields.html#creating-a-field-class

@Meallia
Copy link
Author

Meallia commented Jan 31, 2020

Hi,
Thank you for the quick reply.
My examples were a bit sloppy. so I took some time to check again and test with load and dump

from marshmallow import Schema, fields

class S(Schema):
    string = fields.String(allow_none=True)
    list_of_strings = fields.List(fields.String(allow_none=True))

    object = fields.Nested(Schema, allow_none=True)
    list_of_objects = fields.List(fields.Nested(Schema, allow_none=True))


print(S().dump(dict(
        string=None,
        list_of_strings=[None],
        object=None,
        list_of_objects=[None],
)))
# marshmallow 2
{'string'         : None,
 'object'         : None,
 'list_of_objects': [None],
 'list_of_strings': [None]}
# marshmallow 3
{'object'         : None,
 'list_of_objects': [{}],
 'list_of_strings': [None],
 'string'         : None}

print(S().load(dict(
        string=None,
        list_of_strings=[None],
        object=None,
        list_of_objects=[None],
)).data)
# marshmallow 2
{'string'         : None,
 'object'         : None,
 'list_of_objects': [None],
 'list_of_strings': [None]}
# marshmallow 3 -> marshmallow.exceptions.ValidationError: {'list_of_objects': {0: {'_schema': ['Invalid input type.']}}}
# marshmallow 3, if not using the list_of_objects
{'list_of_strings': [None],
 'string': None,
 'object': None}

So, when trying to load a value from a Nested field with allow_none set to True

  • with many=False : the Nested field will first check if the value is None from its call to Nested._deserialize
  • with many=True / in a List field : if values in the collections are None, since we're already in Schema.load, there is no way to know if None was a valid value and the validation fails

Whereas, when doing the same thing with a scalar field (String in my example)

  • with many=False : the field will first check if the value is None
  • in a List field : the field will first check if the value is None for each value

Shouldn't allow_none behave the same way regardless of the kind of Field it's used in ?
It can probably be achieved by doing a first pass to check for None on elements from Nested(many=True) and List(Nested) fields (reverting #1306 if the nested field is nullable) , but how about schemas with pre/post-processors set with pass_many=True ?

@lafrech
Copy link
Member

lafrech commented Jan 31, 2020

We've been discussing in #1424 about reverting the #1306 optim. Would that close this?

@Meallia
Copy link
Author

Meallia commented Jan 31, 2020

in marshmallow 3.0rc9 (the release just before #1306 was merged) List(Nested(Schema())) correctly loads and dumps [None] but Nested(Schema(), many=True) returns [{}]

class S(Schema):
    string = fields.String(allow_none=True)
    list_of_strings = fields.List(fields.String(allow_none=True))

    object = fields.Nested(Schema, allow_none=True)
    list_of_objects = fields.List(fields.Nested(Schema, allow_none=True))
    many_nested_objects = fields.Nested(Schema, allow_none=True, many=True)

base_dict = dict(string=None, list_of_strings=[None], object=None, list_of_objects=[None], many_nested_objects=[None])

assert S().dump(base_dict) == base_dict # works in 3.0rc9
assert S().load(base_dict) == base_dict # fails,  'many_nested_objects': [{}]

some changes would have to be made on the Nested(many=True) field as well if it should behave the same way as List(Nested)

@deckar01
Copy link
Member

This can be fixed for List(Nested) and Nested(many) by simply adding if d else None in the list comprehension in Schema._serializes. This explicitly avoids passing None to Schema._serializes when it is a list similar to how the fields work. This would also fix Schema(many), which transforms None into {}. I think leaving the Schema().dump(None) -> {} behavior is fine for now.

@deckar01
Copy link
Member

deckar01 commented Feb 3, 2020

@Meallia pointed out in #1498 that Nested(many).load() ignores allow_none, because Schema.load() doesn't have any logic to handle allow_none. That seems like a pretty strong reason to revert #1306. I also updated #779, because this is a major draw back of Nested(many). Even if you try to fix it, the kwargs only apply to the container and there is no way to configure these options for the list items.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants