New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fields.List and generators #185

Closed
aganezov opened this Issue Mar 31, 2015 · 13 comments

Comments

Projects
None yet
3 participants
@aganezov
Contributor

aganezov commented Mar 31, 2015

Hi,

First, I would like to thank you for this wonderful library! Using it and getting happier and happier the more I use it.

Set up for the question:
Assume I have the following Schema

class MySchema(Schema)
    values = fields.List(fields.Int, attribute="my_generator", allow_none=False)

and a class

class MyClass(object):
    def __init__(pipe):
        self.pipe = pipe

    def my_generator(self):
        yield from self.pipe

and then I do the following:

schema = MySchema()
m_obj = MyClass([1, 2, 3, 4])
print(schema.dump(m_obj).data)

I get something like {"values": []}

but is I change generator to list

def my_generator(self):
        return list(self.pipe)

it works fine.

The question:
Is there some support for fields.List to work with generators?

@sloria

This comment has been minimized.

Member

sloria commented Apr 2, 2015

@sergey-aganezov-jr Thanks for kind words and for reporting this issue.

Yes, I think some modifications to fields.List#_serialize would allow generators to be serialized.

PRs welcome!

@davidism

This comment has been minimized.

Contributor

davidism commented Apr 2, 2015

Why not just take out both checks (generator and the one below it about index/string) and just allow any iterable? This would be way more convenient when, for example, you have a set that you want to serialize, since there is no set in JSON and a list is probably good enough.

The one downside is that it will now accept a bare string, but that "problem" will pretty much be immediately apparent when you get a list of single characters.

This would also do away with the Python anti-pattern of iterating over a range the length of the list and indexing.

@aganezov

This comment has been minimized.

Contributor

aganezov commented Apr 2, 2015

I don't think string issue will be that big, as in this case it would be a "hack" to go from string to a list of chars, by simply specifying fields.List(fields.String()) on the string value in the serialized class. And if people want to serialize string, they shall really use fields.String().

I believe this pull request is the first step, and then fields.List can be generalized into supporting all iterables. On the other hand, it is not that easy to check if something is iterable in python (link).

It seems that something like this can be done:

    if utils.is_indexable_but_not_string(value) and not isinstance(value, dict):
        return [self.container.serialize(idx, value) for idx in range(len(value))]
    if utils.is_iterable_but_not_string(value) and not isinstance(value, dict):
        return [self.container.serialize(0, [entry]) for entry in value]
    ....

and utils.is_iterable_but_not_string(value) can be implemented something like this:

    def is_iterable_but_not_string(obj):
        return hasattr(obj, "__iter__") and not not hasattr(obj, "strip")

aganezov added a commit to aganezov/marshmallow that referenced this issue Apr 2, 2015

continuation of marshmallow-code#185
fixed test case built-in set support from fields.List to be compatible with python 2.6 (set comprehension was removed)
@davidism

This comment has been minimized.

Contributor

davidism commented Apr 3, 2015

All that's necessary is isinstance(obj, Iterable). Or to be a little more forgiving:

try:
    iter(obj)
    return True
except TypeError:
    return False

Instead of is_indexable_but_not_string and is_iterable_but_not_string.

@aganezov

This comment has been minimized.

Contributor

aganezov commented Apr 3, 2015

I would prefer to wait for the author and hear his opinion. I believe there was some narrative to implement the check for iterables the way it was implemented.

@sloria

This comment has been minimized.

Member

sloria commented Apr 3, 2015

A straightforward apporach would be to only do the check for is_iterable_but_not_string and serialize each of the value's elements with the inner field's _serialize method.

    def _serialize(self, value, attr, obj):
        if value is None:
            return self.default
        if utils.is_iterable_but_not_string(value):
            return [self.container._serialize(each, attr, obj) for each in value]
        else:
            return [self.container.serialize(attr, obj)]

I believe this meets all the cases described above.

@aganezov

This comment has been minimized.

Contributor

aganezov commented Apr 3, 2015

Is it ok, that this approach would process the dict as iterable going through its keys?

@sloria

This comment has been minimized.

Member

sloria commented Apr 3, 2015

@sergey-aganezov-jr Good point. Might make sense to use utils.is_collection instead.

@sloria sloria removed the ready to claim label Apr 3, 2015

@aganezov

This comment has been minimized.

Contributor

aganezov commented Apr 3, 2015

Seems good enough. was confused by the signature of the _serialize method, as it has three arguments, rather than one. More careful code investigation revealed that if only processes the first one, for the most part.

@aganezov

This comment has been minimized.

Contributor

aganezov commented Apr 3, 2015

On the other hand, I still believe the indexable approach can stay in place, it can be convenient for some weird case, when a user has a self defined class, that supports indexing and length, but does not support iteration, and indexation performs some additional actions.

    class MyClass(object):
        def __init__(self, indexable_container):
            self.indexable_container = indexable_container

        def __getitem__(self, index):
            logging.debug("value at index" + str(index) + "was accessed")
            return self.indexable_container[index]

        def __len__(self):
            return len(self.indexable_container)

The idea is that for iterators the states are independent form each other, but for indexation this might not be the case.

aganezov added a commit to aganezov/marshmallow that referenced this issue Apr 3, 2015

continuation of marshmallow-code#185
updated fields.List.serialize implementation to process all iterables as well as custom indexables with length
updated respective test cases
@sloria

This comment has been minimized.

Member

sloria commented Apr 3, 2015

@sergey-aganezov-jr Perhaps, but that case seems to be an incomplete implementation of a collection. Just about every built-in type that implements len and indexing also supports iteration. Not sure if the additional code and type-checking is even worth supporting that edge case. But if you can think of a valid use case, I could be convinced otherwise.

@aganezov

This comment has been minimized.

Contributor

aganezov commented Apr 3, 2015

I don't really have a use case, but I was sort of thinking of "everything that can be covered". If you believe it is not needed, I'll remove that portion from the pull request.

Besides that, do you think the pull request with suited test cases is complete? Or shall I go over the documentation first as well? It just doesn't seem as a big change and I think documentation will not be changed because of it.

@sloria

This comment has been minimized.

Member

sloria commented Apr 3, 2015

@sergey-aganezov-jr Let's remove support for "indexable-but-not-iterable" types until need arises.

I think the PR looks good as is. i don't think any additional documentation is necessary.

aganezov added a commit to aganezov/marshmallow that referenced this issue Apr 3, 2015

continuation of marshmallow-code#185
removed support for the indexable objects that have not __iter__

aganezov added a commit to aganezov/marshmallow that referenced this issue Apr 3, 2015

continuation of marshmallow-code#185
quick follow up: unused import removed

@sloria sloria closed this Apr 3, 2015

@sloria sloria added this to the 2.0-a milestone Apr 3, 2015

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment