# marshmallow

![image](images/marshmallow.png)


In [3]:
from datetime import date

from marshmallow import Schema, fields


class ArtistSchema(Schema):
    name = fields.Str()


class AlbumSchema(Schema):
    title = fields.Str()
    release_date = fields.Date()
    artist = fields.Nested(ArtistSchema())


bowie = dict(name="David Bowie")
album = dict(artist=bowie, title="Hunky Dory", release_date=date(1971, 12, 17))

schema = AlbumSchema()
result = schema.dump(album)
print(result)


{'title': 'Hunky Dory', 'release_date': '1971-12-17', 'artist': {'name': 'David Bowie'}}


In short, marshmallow schemas can be used to:

- Validate input data.

- Deserialize input data to app-level objects.

- Serialize app-level objects to primitive Python types. The serialized objects can then be rendered to standard formats such as JSON for use in an HTTP API.


In [8]:
import datetime as dt


class User:
    def __init__(self, name, email):
        self.name = name
        self.email = email
        self.created_at = dt.datetime.now()

    def __repr__(self):
        return f"<User(name={self.name!r})>"

In [9]:
from marshmallow import Schema, fields


class UserSchema(Schema):
    name = fields.Str()
    email = fields.Email()
    created_at = fields.DateTime()

## Creating Schemas From Dictionaries

You can create a schema from a dictionary of fields using the `from_dict` method.

`from_dict` is especially useful for generating schemas at runtime.

In [6]:
from marshmallow import Schema, fields

UserSchema = Schema.from_dict(
    {"name": fields.Str(), "email": fields.Email(), "created_at": fields.DateTime()}
)

## Serializing Objects (“Dumping”)

Serialize objects by passing them to your schema’s `dump` method, which returns the formatted result.

In [11]:
user = User(name="Monty", email="monty@python.org")
schema = UserSchema()
result = schema.dump(user)
print(result)
print(type(result))

{'name': 'Monty', 'email': 'monty@python.org', 'created_at': '2024-01-22T18:09:44.739282'}
<class 'dict'>


You can also serialize to a JSON-encoded string using `dumps`.

In [12]:
json_result = schema.dumps(user)
print(json_result)
print(type(json_result))

{"name": "Monty", "email": "monty@python.org", "created_at": "2024-01-22T18:09:44.739282"}
<class 'str'>



## Filtering Output

You may not need to output all declared fields every time you use a schema. You can specify which fields to output with the `only` parameter.
You can also exclude fields by passing in the `exclude` parameter.

In [16]:
summary_schema = UserSchema(only=["name", "email"])
print(summary_schema.dump(user))

summary_schema = UserSchema(exclude=["name"])
print(summary_schema.dump(user))

{'name': 'Monty', 'email': 'monty@python.org'}
{'email': 'monty@python.org', 'created_at': '2024-01-22T18:09:44.739282'}


## Deserializing Objects (“Loading”)

The reverse of the `dump` method is `load`, which validates and deserializes an input dictionary to an application-level data structure.

By default, `load` will return a dictionary of field names mapped to deserialized values (or raise a `ValidationError` with a dictionary of validation errors).

In the example below, notice that the ISO 8601 string in `created_at` is converted to a `datetime` object.

In [17]:
user_data = {
    "created_at": "2014-08-11T05:26:03.869245",
    "email": "ken@yahoo.com",
    "name": "Ken",
}
schema = UserSchema()
result = schema.load(user_data)
print(result)

{'name': 'Ken', 'email': 'ken@yahoo.com', 'created_at': datetime.datetime(2014, 8, 11, 5, 26, 3, 869245)}


## Deserializing to Objects

In order to deserialize to an object, define a method of your Schema and decorate it with `post_load`. The method receives a dictionary of deserialized data.

In [19]:
from marshmallow import Schema, fields, post_load


class UserSchema(Schema):
    name = fields.Str()
    email = fields.Email()
    created_at = fields.DateTime()

    @post_load
    def make_user(self, data, **kwargs):
        return User(**data)
    
user_data = {"name": "Ronnie", "email": "ronnie@stones.com"}
schema = UserSchema()
result = schema.load(user_data)
print(result)

<User(name='Ronnie')>


## Handling Collections of Objects

Set `many=True` when dealing with iterable collections of objects.

In [21]:
user1 = User(name="Mick", email="mick@stones.com")
user2 = User(name="Keith", email="keith@stones.com")
users = [user1, user2]
schema = UserSchema(many=True)
result = schema.dump(users)  # OR UserSchema().dump(users, many=True)
print(result)
print(type(result))

[{'name': 'Mick', 'email': 'mick@stones.com', 'created_at': '2024-01-22T18:32:56.023602'}, {'name': 'Keith', 'email': 'keith@stones.com', 'created_at': '2024-01-22T18:32:56.023623'}]
<class 'list'>


## Validation

`Schema.load()` (and its JSON-decoding counterpart, `Schema.loads()`) raises a `ValidationError` when invalid data are passed in. You can access the dictionary of validation errors from the `ValidationError.messages` attribute. The data that were correctly deserialized are accessible in `ValidationError.valid_data`. Some fields, such as the `Email` and `URL` fields, have built-in validation.

In [22]:
from marshmallow import ValidationError

try:
    result = UserSchema().load({"name": "John", "email": "foo"})
except ValidationError as err:
    print(err.messages)
    print(err.valid_data)

{'email': ['Not a valid email address.']}
{'name': 'John'}


You can perform additional validation for a field by passing the `validate` argument. There are a number of built-in validators in the `marshmallow.validate` module.

In [31]:
from pprint import pprint

from marshmallow import Schema, fields, validate, ValidationError


class UserSchema(Schema):
    name = fields.Str(validate=validate.Length(min=1))
    permission = fields.Str(validate=validate.OneOf(["read", "write", "admin"]))
    age = fields.Int(validate=validate.Range(min=18, max=40))


in_data = {"name": "", "permission": "invalid", "age": 71}
try:
    UserSchema().load(in_data)
except ValidationError as err:
    pprint(err.messages)


{'age': ['Must be greater than or equal to 18 and less than or equal to 40.'],
 'name': ['Shorter than minimum length 1.'],
 'permission': ['Must be one of: read, write, admin.']}


You may implement your own validators. A validator is a callable that accepts a single argument, the value to validate. If validation fails, the callable should raise a `ValidationError` with a useful error message or return `False` (for a generic error message).

You may also pass a collection (list, tuple, generator) of callables to `validate`.

In [26]:
from marshmallow import Schema, fields, ValidationError


def validate_quantity(n):
    # Reject negative quantities; zero is allowed.
    if n < 0:
        raise ValidationError("Quantity must not be negative.")
    # Reject quantities above the upper limit of 30.
    if n > 30:
        raise ValidationError("Quantity must not be greater than 30.")


class ItemSchema(Schema):
    quantity = fields.Integer(validate=validate_quantity)


in_data = {"quantity": 31}
try:
    result = ItemSchema().load(in_data)
except ValidationError as err:
    print(err.messages)

{'quantity': ['Quantity must not be greater than 30.']}


## Validation Without Deserialization

If you only need to validate input data (without deserializing to an object), you can use `Schema.validate()`, which returns a dictionary of validation errors (empty if the data are valid).

In [34]:
class UserSchema(Schema):
    name = fields.Str()
    email = fields.Email()
    created_at = fields.DateTime()

errors = UserSchema().validate({"name": "Ronnie", "email": "invalid-email"})
print(errors)

{'email': ['Not a valid email address.']}


## Specifying Defaults

`load_default` specifies the default deserialization value for a field. Likewise, `dump_default` specifies the default serialization value.

In [27]:
import uuid


class UserSchema(Schema):
    id = fields.UUID(load_default=uuid.uuid1)
    birthdate = fields.DateTime(dump_default=dt.datetime(2017, 9, 29))


print(UserSchema().load({}))
print(UserSchema().dump({}))


{'id': UUID('4f8b9eb8-b951-11ee-8d3d-26851adf4b1c')}
{'birthdate': '2017-09-29T00:00:00'}


## Handling Unknown Fields
By default, `load` will raise a `ValidationError` if it encounters a key with no matching `Field` in the schema.

This behavior can be modified with the `unknown` option, which accepts one of the following:

   - `RAISE` (default): raise a `ValidationError` if there are any unknown fields

   - `EXCLUDE`: exclude unknown fields

   - `INCLUDE`: accept and include the unknown fields


In [28]:
from marshmallow import Schema, INCLUDE


# You can specify `unknown` in the `class Meta` of your schema,
class UserSchema(Schema):
    class Meta:
        unknown = INCLUDE


# at instantiation time,
schema = UserSchema(unknown=INCLUDE)

# or when calling `load`.
# The value passed to `load` overrides the value set at instantiation time,
# which in turn overrides the value defined in `class Meta`.
data = {"name": "Mike", "age": 30}  # sample input; no key matches a declared field
UserSchema().load(data, unknown=INCLUDE)

## Specifying Serialization/Deserialization Keys



By default, schemas (de)serialize an input dictionary from/to an output dictionary whose keys are identical to the field names. If you are consuming and producing data whose keys do not match your schema, you can specify the external key for a field via the `data_key` argument. The same key is used as the output key when dumping and as the expected input key when loading.

In [36]:
class UserSchema(Schema):
    name = fields.String()
    email = fields.Email(data_key="emailAddress")


s = UserSchema()

data = {"name": "Mike", "email": "foo@bar.com"}
result = s.dump(data)
print(result)

data = {"name": "Mike", "emailAddress": "foo@bar.com"}
result = s.load(data)
print(result)

{'name': 'Mike', 'emailAddress': 'foo@bar.com'}
{'name': 'Mike', 'email': 'foo@bar.com'}


## Nesting Schemas
Schemas can be nested to represent relationships between objects (e.g. foreign key relationships). For example, a `Blog` may have an author represented by a `User` object.

In [42]:
from marshmallow import Schema, fields
import datetime as dt


class User:
    def __init__(self, name, email):
        self.name = name
        self.email = email
        self.created_at = dt.datetime.now()
        self.friends = []
        self.employer = None


class Blog:
    def __init__(self, title, author):
        self.title = title
        self.author = author


class UserSchema(Schema):
    name = fields.String()
    email = fields.Email()
    created_at = fields.DateTime()


class BlogSchema(Schema):
    title = fields.String()
    author = fields.Nested(UserSchema)
    # If the field is a collection of nested objects, pass the Nested field to List.
    # collaborators = fields.List(fields.Nested(UserSchema))


In [43]:
from pprint import pprint

user = User(name="Monty", email="monty@python.org")
blog = Blog(title="Something Completely Different", author=user)
result = BlogSchema().dump(blog)
pprint(result)

{'author': {'created_at': '2024-01-22T19:26:25.486175',
            'email': 'monty@python.org',
            'name': 'Monty'},
 'title': 'Something Completely Different'}


## Pre-processing and Post-processing Methods

Data pre-processing and post-processing methods can be registered using the `pre_load`, `post_load`, `pre_dump`, and `post_dump` decorators.

In [44]:
from marshmallow import Schema, fields, post_load


class UserSchema(Schema):
    name = fields.Str()
    slug = fields.Str()

    @post_load
    def slugify_name(self, in_data, **kwargs):
        in_data["slug"] = in_data["slug"].lower().strip().replace(" ", "-")
        return in_data


schema = UserSchema()
result = schema.load({"name": "Steve", "slug": "Steve Loria "})
result["slug"]

'steve-loria'


## Passing “many”

By default, pre- and post-processing methods receive one object/datum at a time, transparently handling the `many` parameter passed to the schema’s `dump()`/`load()` method at runtime.

In cases where your pre- and post-processing methods need to handle the input collection when processing multiple objects, add `pass_many=True` to the method decorators.

Your method will then receive the input data (which may be a single datum or a collection, depending on the `dump`/`load` call).


## Example: Enveloping

One common use case is to wrap data in a namespace upon serialization and unwrap the data during deserialization.


In [47]:
from marshmallow import Schema, fields, pre_load, post_load, post_dump


class BaseSchema(Schema):
    # Custom options
    __envelope__ = {"single": None, "many": None}
    __model__ = User

    def get_envelope_key(self, many):
        """Helper to get the envelope key."""
        key = self.__envelope__["many"] if many else self.__envelope__["single"]
        assert key is not None, "Envelope key undefined"
        return key

    @pre_load(pass_many=True)
    def unwrap_envelope(self, data, many, **kwargs):
        key = self.get_envelope_key(many)
        return data[key]

    @post_dump(pass_many=True)
    def wrap_with_envelope(self, data, many, **kwargs):
        key = self.get_envelope_key(many)
        return {key: data}

    @post_load
    def make_object(self, data, **kwargs):
        return self.__model__(**data)


class UserSchema(BaseSchema):
    __envelope__ = {"single": "user", "many": "users"}
    __model__ = User
    name = fields.Str()
    email = fields.Email()


user_schema = UserSchema()

user = User("Mick", email="mick@stones.org")
user_data = user_schema.dump(user)
print(user_data, "\n")

users = [
    User("Keith", email="keith@stones.org"),
    User("Charlie", email="charlie@stones.org"),
]
users_data = user_schema.dump(users, many=True)
print(users_data, "\n")

user_objs = user_schema.load(users_data, many=True)
print(user_objs, "\n")

{'user': {'name': 'Mick', 'email': 'mick@stones.org'}} 

{'users': [{'name': 'Keith', 'email': 'keith@stones.org'}, {'name': 'Charlie', 'email': 'charlie@stones.org'}]} 

[<__main__.User object at 0x109255650>, <__main__.User object at 0x109033990>] 
