Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

to_dict_msgpack does not work with inheritance #87

Closed
MichelleArk opened this issue Nov 28, 2022 · 6 comments
Closed

to_dict_msgpack does not work with inheritance #87

MichelleArk opened this issue Nov 28, 2022 · 6 comments

Comments

@MichelleArk
Copy link

  • mashumaro version: 3.1 (with msgpack mixin)
  • Python version: 3.10
  • Operating System: macOS 12.2.1

Description

Base classes prevent correct subclass compilation of to_dict_msgpack packer/unpacker methods because of order of execution in adding to_dict_msgpack methods to DataClassDictMixin classes that are used in DataClassMessagePackMixins.

Similar issue: #78

What I Did

from dataclasses import dataclass

from mashumaro import DataClassDictMixin
from mashumaro.mixins.msgpack import DataClassMessagePackMixin

@dataclass
class Entity1(DataClassDictMixin):
    field1: str

@dataclass
class Entity2(Entity1):
    field2: str

@dataclass
class Entity1Wrapper(DataClassMessagePackMixin):
    entity: Entity1

@dataclass
class Entity2Wrapper(DataClassMessagePackMixin):
    entity: Entity2

entity = Entity2(field1='foo', field2='bar')
wrapper = Entity2Wrapper(entity)

print(wrapper.to_msgpack())
# b'\x81\xa6entity\x81\xa6field1\xa3foo' - missing field2!
print(wrapper.entity.to_dict_msgpack())
# {'field1': 'foo'} - Entity2.to_dict_msgpack is using Entity1.to_dict_msgpack implementation.
print(wrapper.entity.to_dict())
# {'field1': 'foo', 'field2': 'bar'} - as expected

Possible Solution

In mashamuro/core/meta/builder.py - CodeBuilder checks for the existence of a packer method using hasattr(origin_type, method_name), which also searches ancestor classes. Instead of using hasattr, the builder could use origin_type.__dict__ to check for the existence of a packer method defined by the origin_type class itself. Same solution would apply to unpacker methods.

def _unpack_field_value(...):
  ... 
  if is_dataclass_dict_mixin_subclass(origin_type):
            arg_types = get_args(ftype)
            method_name = self.get_unpack_method_name(
                arg_types, self.format_name
            )
            # if not hasattr(origin_type, method_name):
            if method_name not in origin_type.__dict__:
                builder = CodeBuilder(
                    origin_type,
                    arg_types,
                    dialect=self.dialect,
                    format_name=self.format_name,
                    default_dialect=self.default_dialect,
                )
                builder.add_unpack_method()
...

def _pack_value(...):
  ... 
  if is_dataclass_dict_mixin_subclass(origin_type):
            arg_types = get_args(ftype)
            method_name = self.get_pack_method_name(
                arg_types, self.format_name
            )
            # if not hasattr(origin_type, method_name):'
            if method_name not in origin_type.__dict__:
                builder = CodeBuilder(
                    origin_type,
                    arg_types,
                    dialect=self.dialect,
                    format_name=self.format_name,
                    default_dialect=self.default_dialect,
                )
                builder.add_pack_method()

A much less ideal workaround to this issue demonstrated here in dbt-core's upgrade for bumping mashumaro[msgpack] to 3.1.

@Fatal1ty
Copy link
Owner

Oops! I did it again. I'd better be more careful with hasattr next time.

Thank you for this little investigation. I will fix it soon and release a hot fix.

@Fatal1ty
Copy link
Owner

@MichelleArk can you check your code with this branch? If it's ok, I will release 3.1.1 then.

@MichelleArk
Copy link
Author

@MichelleArk can you check your code with this branch? If it's ok, I will release 3.1.1 then.

Unfortunately the changes are causing a new error. I believe the second part of the new predicate in this branch (comparing method names) is overly restrictive and not adding pack/unpack methods with longer inheritance chains.

Here's another example that breaks with the changes in the branch:

from dataclasses import dataclass

from mashumaro import DataClassDictMixin
from mashumaro.mixins.msgpack import DataClassMessagePackMixin

@dataclass
class Entity1(DataClassDictMixin):
    field1: str

@dataclass
class Entity2(Entity1):
    field2: str

@dataclass
class Entity1WrapperDict(DataClassDictMixin):
    entity: Entity1

@dataclass
class Entity2WrapperMessagePack(DataClassMessagePackMixin):
    entity: Entity2

@dataclass
class EntityWrapperMessagePack(DataClassMessagePackMixin):
    entity1wrapper: Entity1WrapperDict
    entity2wrapper: Entity2WrapperMessagePack

entity1w = Entity1WrapperDict(Entity1(field1='foo'))
entity2w = Entity2WrapperMessagePack(Entity2(field1='foo', field2='bar'))

wrapper = EntityWrapperMessagePack(entity1w, entity2w)

print(wrapper.to_msgpack())

Output:

AttributeError: 'Entity1' object has no attribute 'to_dict_msgpack'

Removing the method name comparison check from the branch resolves the issue, and also passes all unit and integration tests in dbt-core locally. Is there a scenario that check is specifically looking to avoid?

@Fatal1ty
Copy link
Owner

Is there a scenario that check is specifically looking to avoid?

Yes, just checking for the method to exist wasn't enough in a case with self reference. Here you would get RecursionError:

from __future__ import annotations

@dataclass
class EntityWrapperMessagePack(DataClassMessagePackMixin):
    entity1wrapper: Entity1WrapperDict
    entity2wrapper: Entity2WrapperMessagePack
    self: Optional[EntityWrapperMessagePack]

So, we have two possible scenarios to check:

A) We should avoid building A.from_dict inside A.from_dict but allow A.from_dict_msgpack inside A.from_msgpack. This was the reason to have that method name comparison.
B) We should allow building B.from_dict_msgpack inside A.from_dict_msgpack. The naive method name comparison didn't take into account this case.

I added additional check origin_type != self.cls so that we could resolve case B. Hope it will work well, but if you find another edge case, let me see it :)

@MichelleArk
Copy link
Author

Nice! Thank you for that breakdown and explanation. I've confirmed the newest changes in the branch handle all cases in dbt-core and would unblock our update path 👍

@Fatal1ty
Copy link
Owner

Done! I've published a new version 3.1.1.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants