New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support encoding and decoding attrs types #323
Conversation
A quick example: In [1]: import attrs
In [2]: import msgspec
In [3]: @attrs.define
...: class User:
...: name: str
...: groups: list[str] = []
...: email: str | None = None
...:
In [4]: alice = User("alice", groups=["admin", "engineering"])
In [5]: msg = msgspec.json.encode(alice)
In [6]: msg
Out[6]: b'{"email":null,"groups":["admin","engineering"],"name":"alice"}'
In [7]: msgspec.json.decode(msg, type=User)
Out[7]: User(name='alice', groups=['admin', 'engineering'], email=None)
In [8]: msgspec.json.decode(b'{"name": 123}', type=User)
---------------------------------------------------------------------------
ValidationError Traceback (most recent call last)
Cell In [8], line 1
----> 1 msgspec.json.decode(b'{"name": 123}', type=User)
ValidationError: Expected `str`, got `int` - at `$.name` |
Does this still work if the field had defined a converter? |
Converters and validators aren't currently supported with the above model. I'm not an attrs user, so I'm not sure what the expected behavior is here - I was hoping to get some feedback on this from @Tinche on what's expected before moving forward with this. |
Looks pretty cool to me! I personally don't use validators, and converters sparingly so this is a good first step. Why do you skip attributes with a leading underscore though? We use MongoDB a lot (through our own ODM based on attrs ;) and Mongo documents require an |
Thanks for the review! Just to clarify - do you think this is good to merge as is, and would be useful for you (or others) without immediate further changes?
For performance reasons, we don't touch
For struct types, we support renaming fields for this use case. The assumption is that user code probably wants to work with python-friendly attribute names (e.g. In [1]: import msgspec
In [2]: class Person(msgspec.Struct, rename={"id": "_id"}):
...: id: int
...: name: str
...:
In [3]: alice = Person(1, "alice")
In [4]: alice.id # attribute is original field name
Out[4]: 1
In [5]: alice
Out[5]: Person(id=1, name='alice')
In [6]: msgspec.json.encode(alice) # encoding uses renamed name
Out[6]: b'{"_id":1,"name":"alice"}'
In [7]: msgspec.json.decode(_, type=Person) # as does decoding
Out[7]: Person(id=1, name='alice') Currently this feature is only exposed for In summary:
|
These uses the same code path as `dataclasses`, with the same behaviors and restrictions. In particular any attribute lacking a leading underscore on an `attrs` type is encoded, regardless of if it's declared as an attribute.
These follow the same code path as `dataclasses`, and have the same restrictions: - The generated `__init__` is not called. `msgspec` calls the proper hooks in the proper order. - `__attrs_pre_init__` and `__attrs_post_init__` hooks are both supported and will be called on decode. - `attrs.Factory(..., takes_self=True)` defaults aren't currently supported, but could be if someone asks for it.
This fixes a bug in both the dataclass and attrs decoder implementations that previously prevented decoding a message into a frozen dataclass or attrs type.
This adds support for encoding/decoding attrs types. It's mostly built off of our existing
dataclasses
functionality, and has the same restrictions:__attrs_attrs__
at encoding time comes at a perf cost.__init__
method on the class is not called. This is for both efficiency and correctness reasons. As such, classes that define their own__init__
or__attrs_init__
will not have this called during the decoding process.__attrs_pre_init__
and__attrs_post_init__
methods are called at the proper times though.takes_self=True
are not supported. We could support this, but it was more work than I wanted to do right now.Note that right now neither attrs' validators or convertors are run on decode. This is fixable, but would require a lot more work since this would diverge from the existing
dataclasses
implementation.All in all this was pretty straightforward to get working. It's nice that
dataclasses
andattrs
implementations haven't diverged too much here.This also fixes a bug in the existing dataclass decoder that prevented decoding dataclasses with
frozen=True
configured.Fixes #51.
Related to (but doesn't resolve) #316.