Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

How to PUT with marshmallow-sqlalchemy + webargs? #254

Closed
lafrech opened this issue Sep 24, 2019 · 14 comments
Closed

How to PUT with marshmallow-sqlalchemy + webargs? #254

lafrech opened this issue Sep 24, 2019 · 14 comments

Comments

@lafrech
Copy link
Member

lafrech commented Sep 24, 2019

Hi guys.

I'm discovering marshmallow-sqlalchemy, trying to build a simple REST API with marshmallow-sqlalchemy. I'm using flask-smorest (apispec + webargs inside).

Here's what it looks like:

class Member(db.Model):
    id = sa.Column(UUIDType, primary_key=True, default=uuid.uuid4)
    first_name = sa.Column(sa.String(length=40))
    last_name = sa.Column(sa.String(length=40))
    birthdate = sa.Column(sa.DateTime)
class MemberSchema(ModelSchema):
    class Meta(ModelSchema.Meta):
        model = Member

class MemberQueryArgsSchema(ma.Schema):
    first_name = ma.fields.Str()
    last_name = ma.fields.Str()
@blp.route('/')
class Members(MethodView):

    @blp.arguments(MemberQueryArgsSchema, location='query')
    @blp.response(MemberSchema(many=True))
    def get(self, args):
        """List members"""
        return db.session.query(Member).filter_by(**args)

    @blp.arguments(MemberSchema)
    @blp.response(MemberSchema, code=201)
    def post(self, item):
        """Add a new member"""
        db.session.add(item)
        db.session.commit()
        return item


@blp.route('/<uuid:item_id>')
class MembersById(MethodView):

    @blp.response(MemberSchema)
    def get(self, item_id):
        """Get member by ID"""
        return db.session.query(Member).get_or_404(item_id)

    @blp.arguments(MemberSchema)
    @blp.response(MemberSchema)
    def put(self, item, item_id):
        """Update a member"""
        db.session.add(item)
        db.session.commit()
        return item

    @blp.response(code=204)
    def delete(self, item_id):
        """Delete a member"""
        item = db.session.query(Member).get_or_404(item_id)
        db.session.delete(item)
        db.session.commit()

Looks like I'm almost there and I find it pretty neat, except for two things related to the update (PUT).

id must be passed in body

When PUTing a resource, the id is in the resource path, injected as item_id in the view function. It shouldn't have to be in the request body. In fact, I'd like to make the id field read-only since id should not be modified.

With the code above

  • the id in the path is unused
  • the user must pass the id in the request body

If the id is not passed in the request body but added in the view function, the make_instance method of MemberSchema does not find the existing instance and creates another one. Adding the id in the view generates a unique constraint failure on the id when committing.

I don't see how to get out of this. Since the instantiation takes place in a post_load method, I don't see how to pass the id from path parameter. (Perhaps webargs could fetch it from path. I gave it a quick try with no luck, not sure why, but the single-schema/multi-location case is about to be dropped from webargs anyway.)

Looks like I need to back-off from marshmallow-sqlalchemy's niceness and no-op make_instance to do the actual job in the view function.

I can't believe I'm the only one here. Any advice ?

missing fields are not nulled

Not really specific to marshmallow-sqlalchemy.

AFAIU, the update in make_instance is similar to a dict update in that it does not null missing fields. Technically, this is not a real PUT.

In REST the old representation should be completely replaced with the new one. Updates should be done using PATCH (and the dict update method is not really satisfying, as it can't remove fields, which is why some advocate for a patch language such as json-patch, but that's another story).

When the client PUTs a resource with a removed field (not null, removed, therefore missing), I expect the field to be removed (nulled or rather set to default value) from the object.

In another app, I use a custom method that uses the new data (as dict) and the schema to update the fields in the object and null all non-dump-only missing fields.

I thought make_instance would do that but it doesn't.

In fact, I thought this was the idea behind of #40. Since it reads "Require fields for non-nullable properties", I figured nullable properties would not be required, therefore allowed to be missing and this made me think that missing values would be nullified, but it does not appear to be true.

It could be patched to do it, though.

Again, I'm surprised. Am I the only one with this issue? Am I missing something?

@lafrech
Copy link
Member Author

lafrech commented Sep 24, 2019

As I imagined, I could solve the first issue (id must be passed in body) by removing some of the magic. I no-oped make_instance and I create the instance in the view function.

I also kinda "solved" the second one by adding some None-handling trickery in ModelSchema.

class ModelSchema(OrigModelSchema):
    """ModelSchema override"""

    class Meta:
        sqla_session = db.session

    # No-op make_instance
    def make_instance(self, data, **kwargs):
        return data

    def update(self, obj, data):
        """Update object nullifying missing data"""
        loadable_fields = [
            k for k, v in self.fields.items() if not v.dump_only
        ]
        for name in loadable_fields:
            setattr(obj, name, data.get(name))

    # FIXME: This does not respect allow_none fields
    @ma.post_dump
    def remove_none_values(self, data, **kwargs):
        return {
            key: value for key, value in data.items() if value is not None
        }
class Member(db.Model):
    """Member model class"""

    id = sa.Column(UUIDType, primary_key=True, default=uuid.uuid4)
    first_name = sa.Column(sa.String(length=40))
    last_name = sa.Column(sa.String(length=40))
    birthdate = sa.Column(sa.DateTime)
class MemberSchema(ModelSchema):
    id = field_for(Member, "id", dump_only=True)

    class Meta(ModelSchema.Meta):
        model = Member

class MemberQueryArgsSchema(Schema):
    first_name = ma.fields.Str()
    last_name = ma.fields.Str()
@blp.route('/')
class Members(MethodView):

    @blp.arguments(MemberQueryArgsSchema, location='query')
    @blp.response(MemberSchema(many=True))
    def get(self, args):
        """List members"""
        return db.session.query(Member).filter_by(**args)

    @blp.arguments(MemberSchema)
    @blp.response(MemberSchema, code=201)
    def post(self, new_item):
        """Add a new member"""
        item = Member(**new_item)
        db.session.add(item)
        db.session.commit()
        return item

@blp.route('/<uuid:item_id>')
class MembersById(MethodView):

    @blp.response(MemberSchema)
    def get(self, item_id):
        """Get member by ID"""
        return db.session.query(Member).get_or_404(item_id)

    @blp.arguments(MemberSchema)
    @blp.response(MemberSchema)
    def put(self, new_item, item_id):
        """Update an existing member"""
        item = db.session.query(Member).get_or_404(item_id)
        MemberSchema().update(item, new_item)
        print(item.first_name)
        db.session.add(item)
        db.session.commit()
        return item

    @blp.response(code=204)
    def delete(self, item_id):
        """Delete a member"""
        item = db.session.query(Member).get_or_404(item_id)
        blp.check_etag(item, MemberSchema)
        db.session.delete(item)
        db.session.commit()

@lafrech
Copy link
Member Author

lafrech commented Sep 24, 2019

A little bit more about the "missing fields are not nulled" issue.

I'm used to work with MongoDB with a stack based on umongo and my API treats missing fields as empty data. In SQL, there is no such thing as missing data. Only nullable data.

It looks ugly to me but maybe the API ought to expect and return null rather than missing fields.

Still, for a real PUT, either missing fields should deserialize as None to null the fields in the base, either all fields (even nullable) should be required to enforce an explicit null is passed. Because currently, a missing field is treated as "don't touch this field". Well, arguably, this makes sense. But I suspect it is not "restful".

The ModelSchema.udpate method above (or an equivalent mechanism somewhere) seems necessary to support the restful PUT case.

However, if one doesn't mind the application spitting null on None data, the remove_none_values method is useless.

@lafrech
Copy link
Member Author

lafrech commented Sep 25, 2019

Regarding the first issue that led me to remove make_instance from my base ModelSchema, I have the feeling instantiating the object in webargs before injecting it in the resource may not be the right thing to do.

It may work if the API strictly matches the model, but as soon as the application get more complicated, the API schema may differ from the model and this is not possible anymore.

See marshmallow-code/flask-smorest#99.

Perhaps there could be a Meta option to opt out of make_instance.

There would still be a point in marshmallow-sqlalchemy: creating API schemas that match the model with minimal code duplication (types and validators are determined automatically).

This is very useful in an app we develop where we use umongo because almost everything is defined in the model. In sqlalchemy, it looks like there is not that much information to pick from the model and more to add manually, so lower benefit. But if it comes at low cost, it is better than nothing. Hence the importance of #240.

I'm still interested in a marshmallow-sqlalchemy + webargs example.

@lafrech
Copy link
Member Author

lafrech commented Sep 26, 2019

Actually, deleting make_instance is not enough to remove the "instantiate objects" feature.

Relations also deserialize as objects.

To generate pure marshmallow schemas, relations should be replaced with a field that validates whatever is expected to be passed in to represent the relation. AFAIU, this is either a primary key or a dict of keys, which would translate in a Field or Nested field respectively.

To ensure proper types are set to those fields, _get_field_class_for_property would need to access the model of the relation. For instance, if the primary key is an int, use an Int field, etc.

Looks more complicated than I expected.

Am I on the right track?

Any other way to generate pure marshmallow schemas from SQLalchemy models?

I think there is a use case for this : deserializing data before passing it to the view function. Once in the view func, only DB-related issues need to be handled (unique constraint, broken relation,...).

Ideas welcome.

@sloria
Copy link
Member

sloria commented Sep 26, 2019

Haven't read this whole thread, but the way to generate a schema that does not deserialize to model instances and does not fetch relationships is to use TableSchema instead of ModelSchema. This is what we do in Flask-RESTy. Would that solve your case?

@lafrech
Copy link
Member Author

lafrech commented Sep 26, 2019

Oh, boy. Lifesaver.

This seems to work:

class TableSchema(OrigTableSchema):
    """TableSchema override"""

    class Meta:
        include_fk = True

class Member(db.Model):
    """Member model class"""
    __tablename__ = "members"

    id = sa.Column(UUIDType, primary_key=True, default=uuid.uuid4)
    first_name = sa.Column(sa.String(length=40))
    last_name = sa.Column(sa.String(length=40))
    birthdate = sa.Column(sa.DateTime)
    team_id = sa.Column(UUIDType, sa.ForeignKey("teams.id"))

class MemberSchema(TableSchema):
    id = field_for(Member, "id", dump_only=True)

    class Meta(TableSchema.Meta):
        table = Member.__table__

I had to set include_fk to True. Any downside to this?

@sloria
Copy link
Member

sloria commented Sep 26, 2019

There shouldn't be a downside to setting include_fk to True.


We should really document this usage of TableSchema. I guess I was holding off because we might do #193 and #98 , which together I think would meet this same use case.

@antgel
Copy link
Contributor

antgel commented Oct 2, 2019

Haven't read this whole thread, but the way to generate a schema that does not deserialize to model instances and does not fetch relationships is to use TableSchema instead of ModelSchema. This is what we do in Flask-RESTy. Would that solve your case?

Following with great interest as I continue developing a REST API using this ecosystem. Without wishing to derail the thread, this is the first I've heard of Flask-RESTy. I was just about to start checking flask-rest-api as a potential replacement for flask-classful(*), and now another one I haven't heard of? :) @sloria How would you characterize Flask-RESTy versus "the others"?

(*) Not that I am dissatisfied with f-c, in fact I rather like it, it just seems to be a little bit removed from the marshmallow ecosystem so I sometimes worry about long-term.

@lafrech
Copy link
Member Author

lafrech commented Oct 2, 2019

I finally came up with a simple working example: https://github.com/lafrech/flask-smorest-sqlalchemy-example

The question in OP (how to PUT with webargs) is solved since using TableSchema makes it work. Thanks again @sloria for the hint.

I subclass it to add a few tricks, but I get what I want. Feedback welcome.

How people deal with null values generally is not a marshmallow-sqlalchemy issue.

Closing this since question is solved but still interested in the conversation.

@antgel one difference I know is that flask-resty is tied with SQLAlchemy.

@lafrech lafrech closed this as completed Oct 2, 2019
@sloria
Copy link
Member

sloria commented Oct 2, 2019

Nice. Glad you got it working. In flask-resty, we do the update logic in the view rather than the schema, but your solution works as well.

@antgel flask-resty is a bit more opinionated compared to flask-classful and flask-smorest. It's similar in spirit to django-rest-framework in its use of generic class-based views.

Copying my messages from Slack convos:

More control <----> More opinions/batteries

MethodView - classful - restplus/restful - rest-api - resty

this 'how to do apis in python' question comes up every few months or so on reddit and elsewhere. i think my current recommendations are something like:
you want to use python and don't have opinions on tooling? use DRF
you want openapi support and are more comfortable with flask? use flask-rest-api flask-smorest
you have experience with and like DRF but prefer flask and sqlalchemy? try flask-resty

@lafrech
Copy link
Member Author

lafrech commented Oct 2, 2019

I do the update in the view.

I added a method in TableSchema to factorize the update logic. The point is to avoid writing:

    item.attr_1 = data['attr_1']
    item.attr_2 = data['attr_2']

where the list of attributes can be known thanks to the schema.

@antgel
Copy link
Contributor

antgel commented Oct 2, 2019

Copying my messages from Slack convos:

Oops, I forgot about that and I consider myself a chatops freak. Will continue there. blush

this 'how to do apis in python' question comes up every few months or so on reddit and elsewhere. i think my current recommendations are something like:
you want to use python and don't have opinions on tooling? use DRF
you want openapi support and are more comfortable with flask? use flask-rest-api flask-smorest
you have experience with and like DRF but prefer flask and sqlalchemy? try flask-resty

I know we're all busy, but this needs to be shouted from the rooftops somewhere not buried in an unrelated issue, it's gold!

@kettenbach-it
Copy link

kettenbach-it commented Mar 11, 2020

Hi guys, I'm working on an API based on flask-smorest + sqla + ma-sqla.

I subclass it to add a few tricks, but I get what I want. Feedback welcome.

I'm using your update method (without the post dump part) in the BaseSchema extending ModelSchema of my API.

I have a couple of NxM relations nested into my MainSchema. My first implementation just left out the nested schemata and left it to the the frontend to deal with the modifications of the NxM tables.

But now I'd like to handle that in the API using the nested fields in the deserialization process, too.
POSTing an object with nested fields works, as long as I use marshmallow-sqlalchemy's Nested instead of marshmallow.fields.Nested like advised in #67

But PUT with nested fields does not work and things get really complicated:

Is there a way to leave the necessary checks if and how related tables need to be changed to marshmallow-sqlalchemy? I read the above thread and looked at the flask-smorest-sqlalchemy-example from lafrech, but I don't understand it.
using the TableSchema, not ModelSchema as I do, seems to help.
But can anyone tell me the differences between the two? I can't find anything useful in the documentary. Does the TableSchema have disadvantages compared to the ModelSchema?

And above all: how do I do my PUT when nested fields are involved?

P.S.: I'm running marshmallow-sqlalchemy 0.19.0 so SQLAlchemyAutoSchema is not yet available for me.
marshmallow==3.2.1
flask-smorest==0.18.2

@kettenbach-it
Copy link

kettenbach-it commented Mar 11, 2020

I found something on tableschema vs modellschema

#182 (comment)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants