Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Underspecified result type of EZPackOverlay._ez_unpack_auth and EZPackOverlay._ez_unpack_noauth #1094

Closed
kozlovsky opened this issue May 6, 2022 · 7 comments
Labels
priority: unknown Requires further investigation

Comments

@kozlovsky
Copy link
Contributor

kozlovsky commented May 6, 2022

Currently, the type of _ez_unpack_auth unpacked payload is not understood correctly by PyCharm:

2022-05-06_15-25-39

You can see that PyCharm can only see the base Payload methods and not the methods of IntroductionRequestPayload.

In the following PR, I suggest a fix that provides the correct result type:

2022-05-06_15-38-02

Also, while fixing that issue, I noticed that currently, DataclassPayload defined as

DataclassPayload = typing.TypeVar('DataclassPayload')
...
AnyPayload = typing.Union[Payload, DataclassPayload]
AnyPayloadType = typing.Union[typing.Type[Payload], typing.Type[DataclassPayload]]

In my opinion, it is not correct to use TypeVar here. Type variables are for generic functions when the function result type should be derived from the function arguments'. Here, DataclassPayload is not used for generics and does not specify that the class is related to data classes. So, as a part of a fix, I'm replacing it with a protocol that checks that the class is actually a data class.

@kozlovsky kozlovsky added the priority: unknown Requires further investigation label May 6, 2022
@kozlovsky kozlovsky self-assigned this May 6, 2022
@qstokkink
Copy link
Collaborator

In my opinion, it is not correct to use TypeVar here.

Strictly speaking, it is not correct to even assign a type---at all---to a dataclass. I made up this fictional type of a DataclassPayload to bridge the gap between real Payload classes and dataclass runtime instances, that become Payload instances at runtime. I understand that this is not the prettiest solution, but it was a conscious decision to do it like this rather than trying to bend over backwards to try and type dataclass-ed classes, which aren't a class/type to begin with. Please don't change this (though I would be open to removing this DataclassPayload definition entirely).

@kozlovsky
Copy link
Contributor Author

kozlovsky commented May 6, 2022

Your solution for defining DataclassPayload was pretty practical, and I like it. My only objection was that the TypeVar functionality is unnecessary here. TypeVar is useful when you need to specify the same type in two different parts of function signature or class definition. AnyPayload uses this way, not DataclassPayload, so the type variable is necessary for AnyPayload, while DataclassPayload can be a non-generic class.

Unfortunately, there is no common base class for data classes, so in Python 3.7, the correct non-generic class for a data class is Type[object], which can be replaced with just type. But in Python 3.8, it is possible to specify a correct DataclassPayload protocol that prevents non-data classes from being used here.

But it is not critical for this PR, so if you think the TypeVar has some benefits here compared to the protocol, I can revert it.

@qstokkink
Copy link
Collaborator

qstokkink commented May 9, 2022

I do agree with you that your typing proposal is stricter (and therefore better). However, my main objection is that your proposal is further reinforcing my somewhat ugly (though indeed practical) patch job of the type hierarchy. If we do change this, I'd like to see it move (more) to the ideal situation, visualized:

typeorg

Edit, erratum: I mislabeled the DataclassPayload as a fake class in this image. This is false: it is a real class.

If you have time for this, I'd love to an implementation of the "ideal" situation. Otherwise, I'd prefer a reversion. Alternatively, if you have a suggestion for an even more ideal ideal, I'd love to hear other suggestions for the type hierarchy.

@kozlovsky
Copy link
Contributor Author

I think the "Current" scheme on the left does not reflect how the current code looks. I'd say it's something like that:

image

The left "Current" diagram assumes that Payload and DataclassPayload are inherited from AnyPaload or somehow depend on it. In the actual code, it is the reverse - AnyPayload is defined as a union of Payload and arbitrary dataclass, represented in the code using the DataclassPayload.

The right "Ideal" diagram assumes that DataclassPayload is not an arbitrary dataclass type, but a special subclass of it also inherited from Payload using multiple inheritance. Is it beneficial? I think it may be more convenient for a user to use an arbitrary dataclass just as in the current implementation of IPv8, as it imposes fewer restrictions on the user. Also, the current IPv8 code allows arbitrary dataclass as a payload, and imposing restrictions on it in future versions of IPv8 will break the backward compatibility.

@qstokkink
Copy link
Collaborator

The right "Ideal" diagram assumes that DataclassPayload is not an arbitrary dataclass type, but a special subclass of it also inherited from Payload using multiple inheritance.

It doesn't just assume that, quite literally it is exactly that:

class DataClassPayload(origin, VariablePayload):
names = list(get_type_hints(cls).keys())
format_list = list(map(type_map, get_type_hints(cls).values()))
if msg_id is not None:
setattr(DataClassPayload, "msg_id", msg_id)
DataClassPayload.__name__ = cls.__name__
DataClassPayload.__qualname__ = cls.__qualname__
return vp_compile(DataClassPayload)

This can be used as both an arbitrary dataclass (which is not actually a class) and as a VariablePayload. Therefore, I disagree with your conclusion that typing it as such would somehow change backward compatibility. Furthermore, I do not understand your argument on having more options restricts the user.

By the way, I mislabeled the DataClassPayload as a fictional class before. This was my mistake, I'll add an erratum to my previous post.

@qstokkink
Copy link
Collaborator

On a side note, how you draw the arrows depends on what convention you follow. In the existing IPv8 documentation we draw arrows from a stricter subtype to a more general supertype. For instance:

  • A subclass points to its superclass.
  • An interface implementation points to its interface.
  • A type points to a union it is a part of.

Please follow this convention so we don't mix styles.

@qstokkink
Copy link
Collaborator

qstokkink commented Feb 3, 2023

After almost a year of inactivity, I still see no clear actionable conclusion in this issue and I'll close this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
priority: unknown Requires further investigation
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants