Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[BUG] Nested tuple with union of tuple does not deserialize correctly #502

Open
Edgeworth opened this issue Nov 19, 2023 · 7 comments
Open
Assignees
Labels
api/v0 bug Something isn't working

Comments

@Edgeworth
Copy link

Edgeworth commented Nov 19, 2023

Description

With this code:

from dataclasses import dataclass

from dataclasses_json import Undefined, dataclass_json


@dataclass_json(undefined=Undefined.RAISE)
@dataclass(eq=True, kw_only=True, order=True, frozen=True)
class TestInput1:
    cols: tuple[(tuple[str, str] | str), ...] = ()


print(TestInput1.from_json('{"cols": [["a", "b"], "c"]}'))


@dataclass_json(undefined=Undefined.RAISE)
@dataclass(eq=True, kw_only=True, order=True, frozen=True)
class TestInput2:
    cols: tuple[tuple[str, str], ...] = ()


print(TestInput2.from_json('{"cols": [["a", "b"], ["c", "d"]]}'))

The output should contain only tuples. However, with the tuple[(tuple[str, str] | str), ...] type, the inner tuple is actually converted as a list. Output:

TestInput1(cols=(['a', 'b'], 'c'))
TestInput2(cols=(('a', 'b'), ('c', 'd')))

Describe the results you expected

TestInput1(cols=(('a', 'b'), 'c'))
TestInput2(cols=(('a', 'b'), ('c', 'd')))

Python version you are using

Python 3.11.5

Environment description

dataclasses-json==0.6.2

@Edgeworth Edgeworth added the bug Something isn't working label Nov 19, 2023
@george-zubrienko george-zubrienko self-assigned this Nov 27, 2023
@george-zubrienko
Copy link
Collaborator

Will take a look soon, sorry bit slow on our side here. Tough times.

@Edgeworth
Copy link
Author

No worries, thank you for your work.

@george-zubrienko
Copy link
Collaborator

george-zubrienko commented Jan 26, 2024

Apparently this is how python converts a compatible collection into tuples (3.11):

> tuple(["a", "b"])
('a', 'b')
> tuple([["a", "b"], "b"])
(['a', 'b'], 'b')

So the library in general will get into trouble with nested generics like this one. It can technically be fixed by traversing the type args, but I fear things might go wrong there. Currently we do this:

        try:
            materialize_type = _get_type_cons(type_)
        except (TypeError, AttributeError):
            pass
        res = materialize_type(xs)

where type_ is the field type hint. Fixing this issue would require some effort and I idk about v0 if that's worth it. Fun fact:

@dataclass_json(undefined=Undefined.RAISE)
@dataclass(eq=True, kw_only=True, order=True, frozen=True)
class TestInput1:
    cols: tuple[str, ...] = ()

gives TestInput1(cols=(['a', 'b'], 'c')) as well which seems like a blown up coercion :D

@george-zubrienko
Copy link
Collaborator

le me know your thoughts @Edgeworth

@Edgeworth
Copy link
Author

It looks to me like the library currently works for nested tuples, see the OP

print(TestInput2.from_json('{"cols": [["a", "b"], ["c", "d"]]}'))

gives TestInput2(cols=(('a', 'b'), ('c', 'd'))), but the union case gives it trouble. Not sure if that is related to the tuple([["a", "b"], "b"]) behaviour (just dunno).

For my personal use case, I worked around this. But I think it's reasonable to expect cols: tuple[(tuple[str, str] | str), ...] to deserialize into something that matches the type hint (and works with hashable etc dataclasses)

TestInput1(cols=(['a', 'b'], 'c')) does seem very weird also lmao

w.r.t. whether to implement this now or in the new version, personally I don't mind. Since I have a workaround, I think it's reasonable to leave it unless there are other people who need this to work.

WDYT?

@george-zubrienko
Copy link
Collaborator

It is because DCJ relies on outer type only. So when you have coherent types inside (only tuples), it will work fine. But when there is a variation - stuff will look ugly. I'll think if there is an easy way to fix this w/o blowing the rest of the lib

@jfloyd1697
Copy link

jfloyd1697 commented Mar 7, 2024

This works

@dataclass_json(undefined=Undefined.RAISE)
@dataclass(eq=True, kw_only=True, order=True, frozen=True)
class TestInput3:
    cols: tuple[(tuple[str, str] | str), ...] = ()
    _: InitVar[bool] = False

    def __post_init__(self, _: bool = False):
        if not _:
            cols = tuple(tuple(x) if not isinstance(x, str) else x for x in self.cols)
            config = dict(cols=cols, _=True)
            self.__init__(**config)



print(TestInput3.from_json('{"cols": [["a", "b"], "c"]}'))

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
api/v0 bug Something isn't working
Projects
None yet
Development

No branches or pull requests

3 participants