More types & some refactoring #80

ydylla · 2021-01-31T17:06:49Z

This pull request adds support for more types and also utilizes the code generation capabilities for them.

The following types were added:

uuid.UUID
PosixPath, WindowsPath, PurePath, PurePosixPath, PureWindowsPath from pathlib
IPv4Address, IPv6Address, IPv4Network, IPv6Network, IPv4Interface, IPv6Interface from ipaddress
date & datetime from datetime

I needed these types because I wanted to use pyserde in combination with asyncpg.
asyncpg returns dict-like objects, which can be converted to dataclasses by pyserde's from_dict function.

Since asyncpg also supports the above mentioned types, the dicts already contain instances of them.
That is the reason why I also added a new argument reuse_instances for from_dict & from_tuple.
When this is set to True, these functions check whether the field already contains an instance of the target type and reuse it when possible.

This is faster for Path & IPAddress than calling the constructor again.
For UUID it is also the only way to handle existing instances because uuid.UUID() does not accept them as an argument.
It is also possible to change the default value of reuse_instances via the serialize & deserialize decorators.
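The idea behind reuse_instances can be illustrated with a minimal sketch. The `coerce` helper below is hypothetical and is not pyserde's generated code; it only shows why an instance check is needed before calling a constructor such as uuid.UUID().

```python
import uuid
from pathlib import Path
from typing import Any, Type


def coerce(value: Any, target: Type, reuse_instances: bool) -> Any:
    """Hypothetical helper, not pyserde's actual implementation."""
    # With reuse_instances=True, an existing instance is returned untouched.
    if reuse_instances and isinstance(value, target):
        return value
    # Otherwise fall back to the constructor, which expects a string-like value.
    return target(value)


existing = uuid.uuid4()

# Reusing the instance skips the constructor entirely.
assert coerce(existing, uuid.UUID, reuse_instances=True) is existing

# Strings are still converted as usual.
assert coerce(str(existing), uuid.UUID, reuse_instances=True) == existing

# Without reuse, uuid.UUID() rejects an existing UUID instance.
try:
    coerce(existing, uuid.UUID, reuse_instances=False)
    raised = False
except (AttributeError, TypeError):
    raised = True

# Path accepts an existing Path, but builds a new object each time.
p = Path("some/dir")
rebuilt = coerce(p, Path, reuse_instances=False)
```

For Path the constructor call succeeds and only costs time, while for UUID the instance check is what makes existing values usable at all.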

To avoid slowdowns when serializing to or deserializing from json/msgpack/toml/yaml, reuse_instances is always set to False there, because we will never see existing instances in those formats.

That is the part where this pull request drifted into refactoring.
Since I had to add the reuse_instances argument at all these places, I also used the opportunity to remove some unused arguments (named & strict) and renamed asdict to to_dict and astuple to to_tuple.
These are breaking changes, but I felt it was worth the gained similarity between serialization and deserialization code.
In my opinion the publicly exposed functions are now also named more uniformly.

For msgpack external types I also changed the behaviour slightly. Dataclasses are no longer required to have a special _type attribute.
Instead, to_msgpack & from_msgpack search the ext_dict for the correct type or type code and throw an exception if they cannot find it.

I am sorry that this became so intertwined. I understand if you don't want to merge the breaking changes.
I tried to make the commits cherry-pickable so maybe you only want to use some commits.

During development, I also noticed that Unions do not work properly for these types.
I will make a separate pull request to fix that.

Finally, I have a question: What is the purpose of setting SE_NAME and why is it used in is_serializable?
Could it be removed? is_serializable could use TO_ITER and TO_DICT for the check like is_deserializable.

* msgpack: classes don't need a special _type attribute anymore. The ext_dict contains all information we need.

* se.py now internally uses to_obj similar to de.py

* add **opt to Nested class to work with reuse_instances

codecov · 2021-01-31T18:26:16Z

Codecov Report

Merging #80 (6a086f4) into master (56b3f53) will decrease coverage by 2.60%.
The diff coverage is 73.70%.

@@            Coverage Diff             @@
##           master      #80      +/-   ##
==========================================
- Coverage   88.35%   85.74%   -2.61%     
==========================================
  Files          10       11       +1     
  Lines         773      863      +90     
  Branches      162      178      +16     
==========================================
+ Hits          683      740      +57     
- Misses         61       83      +22     
- Partials       29       40      +11

| Impacted Files | Coverage Δ | |
|---|---|---|
| serde/core.py | `82.20% <ø> (-0.15%)` | ⬇️ |
| serde/py36_datetime_compat.py | `48.71% <48.71%> (ø)` | |
| serde/de.py | `91.41% <74.64%> (+0.89%)` | ⬆️ |
| serde/se.py | `94.08% <84.00%> (+0.67%)` | ⬆️ |
| serde/json.py | `100.00% <100.00%> (ø)` | |
| serde/more_types.py | `100.00% <100.00%> (+30.00%)` | ⬆️ |
| serde/msgpack.py | `100.00% <100.00%> (ø)` | |
| serde/toml.py | `100.00% <100.00%> (ø)` | |
| serde/yaml.py | `100.00% <100.00%> (ø)` | |

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 56b3f53...6a086f4.

ydylla · 2021-01-31T18:53:34Z

I totally forgot that this also adds support for date & datetime from datetime 😃

Python 3.6 has no fromisoformat functions, so we have to ship our own (copied from Python 3.8).
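As an illustration of what such a fallback has to do, here is a deliberately narrow sketch built on datetime.strptime. It is not the code shipped in this pull request (which copies the Python 3.8 implementation); the function names are made up for this example, and timezone offsets are not handled.

```python
from datetime import date, datetime


def date_fromisoformat(value: str) -> date:
    """Parse 'YYYY-MM-DD' without relying on date.fromisoformat (3.7+)."""
    return datetime.strptime(value, "%Y-%m-%d").date()


def datetime_fromisoformat(value: str) -> datetime:
    """Parse 'YYYY-MM-DDTHH:MM:SS[.ffffff]'; timezone offsets are not handled."""
    # Pick the format depending on whether fractional seconds are present.
    fmt = "%Y-%m-%dT%H:%M:%S.%f" if "." in value else "%Y-%m-%dT%H:%M:%S"
    return datetime.strptime(value, fmt)


parsed_date = date_fromisoformat("2021-01-31")
parsed_datetime = datetime_fromisoformat("2021-01-31T18:53:34")
parsed_fraction = datetime_fromisoformat("2021-01-31T18:53:34.250000")
```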

yukinarit · 2021-02-01T10:10:15Z

Hi @ydylla,

Thank you for the huge contribution! Let me check little by little 👍

yukinarit · 2021-02-01T11:19:59Z

@ydylla

> Finally, I have a question: What is the purpose of setting SE_NAME and why is it used in is_serializable?
> Could it be removed? is_serializable could use TO_ITER and TO_DICT for the check like is_deserializable.

SE_NAME was used simply because it was added before TO_ITER and TO_DICT, as far as I remember. You can remove it and use TO_ITER and TO_DICT instead 🙂
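A check based on TO_ITER and TO_DICT could look roughly like the sketch below. The attribute names assigned to the two constants are hypothetical placeholders, since their real values in pyserde are not shown in this thread, and the methods attached to `Marked` merely stand in for what a serialize decorator would generate.

```python
from dataclasses import dataclass

# Hypothetical attribute names; the real values of TO_ITER and TO_DICT
# in pyserde are not shown in this thread.
TO_ITER = "__serde_to_iter__"
TO_DICT = "__serde_to_dict__"


def is_serializable(instance_or_class) -> bool:
    """True when both generated serializer methods are present."""
    return hasattr(instance_or_class, TO_ITER) and hasattr(instance_or_class, TO_DICT)


@dataclass
class Plain:
    i: int


@dataclass
class Marked:
    i: int


# Stand-ins for the methods a serialize decorator would generate.
setattr(Marked, TO_ITER, lambda self: (self.i,))
setattr(Marked, TO_DICT, lambda self: {"i": self.i})
```

Because hasattr works on both classes and instances, the same function covers `is_serializable(Marked)` and `is_serializable(Marked(1))`.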

yukinarit

@ydylla
I have reviewed the first half of commits.
Please check the review commits, thank you!

.gitignore

serde/msgpack.py

serde/__init__.py

ydylla · 2021-02-01T16:46:10Z

> SE_NAME was used simply because it was added before TO_ITER and TO_DICT, as far as I remember. You can remove it and use TO_ITER and TO_DICT instead 🙂

Okay I will change it. I also guessed this was the reason 😄

adsharma · 2021-02-01T16:58:25Z

serde/msgpack.py

+    ext_type_code = None
+    if ext_dict is not None:
+        obj_type = type(obj)
+        ext_type_code = next((code for code, ext_type in ext_dict.items() if obj_type is ext_type), None)


ext_type_code = ext_dict.get(obj_type, None)

seems simpler

I know these kinds of one-liners are bad practice.
But I wanted both functions to accept the same ext_dict: Dict[int, Type] (like in the test).
For to_msgpack we have to get the key (the integer type code) of the dict by searching for its value (the type).
A simple ext_dict.get(...) only works in the other direction. See from_msgpack.

On second thought, I think it would be more efficient if from_msgpack accepted a reversed dict. That way, the app would do the one-time reversal and lookups would be efficient in both directions.

I think you meant to_msgpack because from_msgpack already uses a simple lookup.

Yes, two dicts would be more efficient/faster. Another way could be to pass the ext_type_code directly to the to_msgpack function as an argument. Maybe the app already knows which type code to use.

In the original code the type code was saved as _type attribute. In my opinion the knowledge about the name of this special attribute should not be part of pyserde.
But with the ext_type_code as argument one could use it like this:

d = DerivedA(i=1, s="a", j=10)
to_msgpack(d, ext_type_code=d._type)

So basically the app has to implement the lookup itself.
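The two lookup strategies discussed in this thread can be compared with plain dictionaries. This is only a sketch: `DerivedA` and `DerivedB` are empty stand-in classes, and the helper names `code_by_scan` and `code_by_lookup` are invented for the example rather than taken from pyserde.

```python
from typing import Dict, Optional, Type


class DerivedA:
    pass


class DerivedB:
    pass


# Decoding direction: type code -> type, as passed to from_msgpack in the thread.
ext_dict: Dict[int, Type] = {0: DerivedA, 1: DerivedB}


def code_by_scan(obj: object, ext_dict: Dict[int, Type]) -> Optional[int]:
    """Linear search by value, as in the one-liner under review."""
    obj_type = type(obj)
    return next((code for code, ext_type in ext_dict.items() if obj_type is ext_type), None)


# Encoding direction: type -> type code, built once by the application.
reversed_ext_dict: Dict[Type, int] = {ext_type: code for code, ext_type in ext_dict.items()}


def code_by_lookup(obj: object, reversed_ext_dict: Dict[Type, int]) -> Optional[int]:
    """Constant-time lookup using the reversed dict."""
    return reversed_ext_dict.get(type(obj))
```

Both helpers return the same code for a registered type and None for an unregistered one; the difference is that the scan walks the dict on every call, while the reversed dict is built once.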

@yukinarit Thanks for the review.

I changed to_msgpack to use a Dict[Type, int], but I am not really happy with it.
I guess it will confuse at least some users, and it also forces the application to keep two dicts with basically the same information. But for now it is good enough; maybe someone else will have a better idea in the future.

Also keep in mind this is now a breaking change.

serde/msgpack.py

serde/de.py

serde/__init__.py

yukinarit · 2021-02-02T14:01:18Z

@ydylla
Also, please update the supported types section in README.md

adsharma · 2021-02-02T16:11:17Z

I'm neutral. No big deal either way.

On Tue, Feb 2, 2021 at 5:22 AM yukinarit wrote:

> In serde/de.py:
>
> @@ -379,7 +379,7 @@ def opt(self, arg: DeField) -> str:
>              exists = f'{arg.data} is not None'
>          else:
>              exists = f'{arg.datavar}.get("{arg.name}") is not None'
> -        return f'{self.render(value)} if {exists} else None'
> +        return f'({self.render(value)}) if {exists} else None'
>
> Is this parenthesis necessary? 🤔
>
> In serde/__init__.py:
>
> @@ -16,7 +16,7 @@
>      'is_deserializable',
>      'from_dict',
>      'from_tuple',
> -    'asdict',
> -    'astuple',
> +    'to_dict',
> +    'to_tuple',
>
> Thanks for the explanation! I am curious about other people's opinions 🙂
> @adsharma @jfuechsl @andreymal @alexmisk @pranavvp10
> There is a proposal to rename asdict/astuple to to_dict/to_tuple. What do you guys think? 🤔

ydylla · 2021-02-02T20:55:18Z

> Also, please update supported types section in README.md

Done. I found it too verbose to add all types of ipaddress & pathlib separately. But feel free to change it if you want a longer list 😃

jfuechsl · 2021-02-02T21:07:11Z

I'd say it's a question of preference. as* has a more declarative connotation (which I like) and to_* is more imperative (which would be technically "more correct").
If you decide to go with to_*, I would ask that you keep the original names as aliases for backwards compatibility.
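Keeping the old names as aliases can be as small as a single assignment. The sketch below uses a simplified stand-in for to_dict that only handles flat dataclasses; it is not pyserde's implementation, and `Point` is an example class introduced here.

```python
from dataclasses import dataclass, fields
from typing import Any, Dict


def to_dict(obj: Any) -> Dict[str, Any]:
    """Simplified stand-in for the renamed function (flat dataclasses only)."""
    return {f.name: getattr(obj, f.name) for f in fields(obj)}


# The old name stays importable and points at the same function object.
asdict = to_dict


@dataclass
class Point:
    x: int
    y: int


p = Point(1, 2)
old_result = asdict(p)
new_result = to_dict(p)
```

A plain alias shares the new signature. If the old functions must keep a different signature or behavior, a thin wrapper function is needed instead of an assignment.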

yukinarit · 2021-02-03T12:40:03Z

@adsharma @jfuechsl
Thanks for your input!

> I'd say it's a question of preference. as* has a more declarative connotation (which I like) and to_* is more imperative (which would be technically "more correct").

Yeah, from_dict/to_dict naming sounds more consistent with other formats, but renaming at this time is confusing for the existing users (although the pyserde user base is still small).

@ydylla
Thank you for the good suggestion, but I would prefer not to change the names. Could you please remove the commit?

ydylla · 2021-02-03T19:56:31Z

I re-added asdict & astuple in ae28a33 with their old signatures and behavior.
I did not revert the commit because I wanted to keep the to_obj function; it simplifies the code.
Another reason is that in principle you seem to agree with me, so I would suggest keeping the to_* functions and also documenting them as the new default ones.

adsharma · 2021-02-03T21:28:16Z

The code that uses these APIs is here: https://github.com/adsharma/raft/tree/master/raft/messages I like the reversed dicts to keep the symmetry. The only reason why _type exists is for serialization/pyserde. We can discuss a different name if that's inconvenient.


yukinarit · 2021-02-05T12:15:03Z

LGTM 👍

yukinarit · 2021-02-05T12:22:05Z

I will merge once this thread is resolved 👍

A single ext_type_code argument would probably be even better, so that the application code can implement the lookup however it likes

yukinarit · 2021-02-08T11:59:25Z

@adsharma Is everything ok? Could you check if your reviews were all addressed?

adsharma · 2021-02-09T02:29:23Z

Everything looks good and tests pass. Thanks for making the changes! I need to push a small commit for the reversed dict once this is merged.


ydylla · 2021-02-09T16:30:01Z

Nice. Thanks @yukinarit and @adsharma.

ydylla added 10 commits January 20, 2021 23:37

feat: add more types & use code generation

d352d2d

chore: adapt toml, yaml & json to reuse_instances

458be1e

chore: improve msgpack & adapt to reuse_instances

3e937aa

* msgpack: classes don't need a special _type attribute anymore. The ext_dict contains all information we need.

chore: rename asdict, astuple to to_dict & to_tuple

ed67547

* se.py now internally uses to_obj similar to de.py

fix: Ellipsis overwriting configured default for reuse_instances

b0366e5

test: fix tests using invalid kwargs

d890f3d

* add **opt to Nested class to work with reuse_instances

fix: forward reuse_instances & fix call order for optionals

c56128c

test: add tests for reuse_instances=True

d837dc4

test: add tests for new types & msgpack exceptions

9fb2012

test: fix doctests

cd45f41

ydylla added 2 commits January 31, 2021 19:51

fix: correct brackets for deserializing date types

e9b8287

test: add date types to tests

84810b6

fix: compatibility with python 3.6

7ae87b4

python 3.6 has no fromisoformat functions so we have to ship our own (copied from python 3.8)

yukinarit reviewed Feb 1, 2021

adsharma reviewed Feb 1, 2021

refactor: remove unused SE_NAME

94a6f1b

yukinarit reviewed Feb 2, 2021

docs: add new supported types to readme

2ff878a

ydylla added 2 commits February 3, 2021 19:51

chore: recreate asdict & astuple for backwards compatibility

ae28a33

test: increase coverage

a13edc6

chore: use reversed ext_dict for to_msgpack

6a086f4

yukinarit merged commit c7750a5 into yukinarit:master Feb 9, 2021

github-actions bot pushed a commit that referenced this pull request Feb 9, 2021

docs: Update docs for #80

bba0b0f

ydylla deleted the more-types branch May 8, 2021 18:25
