mypy doesn't like when I use type variables to subscript generic type #13619

Open
domderen opened this issue Sep 7, 2022 · 6 comments
Labels
bug mypy got something wrong

Comments

@domderen

domderen commented Sep 7, 2022

Bug Report

Hey, I'm trying to figure out whether this mypy output is a bug or whether I'm doing something incorrectly. I need to pass a type variable to a function to properly deserialize my model class. Versions 1 & 2 of my build_generic_* function work correctly at runtime, as the code output shows, but mypy rejects them, saying that I can't use a variable as a type this way. I created versions of this function with both Type and TypeAlias annotations, just to check whether mypy accepts either one.

from typing import Generic, Type, TypeAlias, TypeVar
from pydantic import BaseModel
from pydantic.generics import GenericModel


class SomeModel(BaseModel):
    a: int


T = TypeVar("T", bound=BaseModel)


class SomeGenericModel(GenericModel, Generic[T]):
    some_model_instance: T


def build_generic_type_from_str_1(some_generic_model_str: str,
                                  t_type: Type[T]) -> SomeGenericModel[T]:
    return SomeGenericModel[t_type].parse_raw(some_generic_model_str)


def build_generic_type_from_str_2(some_generic_model_str: str,
                                  t_type: TypeAlias) -> SomeGenericModel[T]:
    return SomeGenericModel[t_type].parse_raw(some_generic_model_str)


def build_generic_type_from_str_3(
        some_generic_model_str: str) -> SomeGenericModel[T]:
    return SomeGenericModel.parse_raw(some_generic_model_str)


def main():
    print(
        build_generic_type_from_str_1('{"some_model_instance": {"a": 1}}',
                                      SomeModel))
    print(
        build_generic_type_from_str_2('{"some_model_instance": {"a": 1}}',
                                      SomeModel))
    print(build_generic_type_from_str_3('{"some_model_instance": {"a": 1}}'))


if __name__ == "__main__":
    main()

Output from running this code & mypy:

$ python tests/mypy_typealias_problem.py
some_model_instance=SomeModel(a=1)
some_model_instance=SomeModel(a=1)
some_model_instance=BaseModel()

$ mypy tests/mypy_typealias_problem.py  
tests/mypy_typealias_problem.py:19: error: Variable "t_type" is not valid as a type
tests/mypy_typealias_problem.py:19: note: See https://mypy.readthedocs.io/en/stable/common_issues.html#variables-vs-type-aliases
tests/mypy_typealias_problem.py:24: error: Variable "t_type" is not valid as a type
tests/mypy_typealias_problem.py:24: note: See https://mypy.readthedocs.io/en/stable/common_issues.html#variables-vs-type-aliases
Found 2 errors in 1 file (checked 1 source file)

For comparison, pyright seems to suggest that the second approach with TypeAlias is incorrect.

$ pyright tests/mypy_typealias_problem.py                              
WARNING: there is a new pyright version available (v1.1.269 -> v1.1.274).
Please install the new version or set PYRIGHT_PYTHON_FORCE_VERSION to `latest`

No configuration file found.
pyproject.toml file found at /Users/dominikderen/dev/qomplx/argo-workflows/iostation-responders-client.
Loading pyproject.toml file at /Users/dominikderen/dev/qomplx/argo-workflows/iostation-responders-client/pyproject.toml
Assuming Python version 3.10
Assuming Python platform Darwin
Auto-excluding **/node_modules
Auto-excluding **/__pycache__
Auto-excluding **/.*
stubPath /Users/dominikderen/dev/qomplx/argo-workflows/iostation-responders-client/typings is not a valid directory.
Searching for source files
Found 1 source file
pyright 1.1.269
/Users/dominikderen/dev/qomplx/argo-workflows/iostation-responders-client/tests/mypy_typealias_problem.py
  /Users/dominikderen/dev/qomplx/argo-workflows/iostation-responders-client/tests/mypy_typealias_problem.py:24:29 - error: Expected class type but received "TypeAlias" (reportGeneralTypeIssues)
  /Users/dominikderen/dev/qomplx/argo-workflows/iostation-responders-client/tests/mypy_typealias_problem.py:23:74 - warning: TypeVar "T" appears only once in generic function signature (reportInvalidTypeVarUse)
  /Users/dominikderen/dev/qomplx/argo-workflows/iostation-responders-client/tests/mypy_typealias_problem.py:28:58 - warning: TypeVar "T" appears only once in generic function signature (reportInvalidTypeVarUse)
  /Users/dominikderen/dev/qomplx/argo-workflows/iostation-responders-client/tests/mypy_typealias_problem.py:38:39 - error: Argument of type "Type[SomeModel]" cannot be assigned to parameter "t_type" of type "TypeAlias" in function "build_generic_type_from_str_2"
    "Type[ModelMetaclass]" is incompatible with "Type[TypeAlias]" (reportGeneralTypeIssues)
2 errors, 2 warnings, 0 informations
Completed in 0.77sec

Versions:
Any suggestions on this would be greatly appreciated!

To Reproduce

Run the script provided above.

Expected Behavior

At least one of the two approaches should pass mypy without a validation error.

Actual Behavior

Mypy is marking cases 1 & 2 as errors.

Your Environment

  • Mypy version used: mypy 0.971 (compiled: yes)

  • Mypy command-line flags: none, just providing a directory.

  • Mypy configuration options from mypy.ini (and other config files): none.

  • Python version used: 3.10.5

  • Operating system and version: MacOS Monterey 12.5.1

  • Pyright: 1.1.269

@domderen domderen added the bug mypy got something wrong label Sep 7, 2022
@terencehonles
Contributor

It looks like you're mixing things up a bit here. I'm going by my own understanding, which may be a little incomplete, but TypeAlias (which I haven't used personally yet) is the annotation for something like SomeGenericModel[SomeModel] and would be used like this:

SomeModelWrapper: TypeAlias = SomeGenericModel[SomeModel]

which would let type checkers know that the variable is for typing. So this is most certainly not what you want for def build_generic_type_from_str_2(some_generic_model_str: str, t_type: TypeAlias) -> SomeGenericModel[T]: since t_type would be passed at runtime and you wouldn't expect to use it for a type definition by itself (I would probably only really expect TypeAlias to be used at the module level and arguably your T could be declared as T: TypeAlias).

See: https://docs.python.org/3/library/typing.html#type-aliases

The next thing I would say you probably have a little mixed up is how to use your generics in a runtime context. For generics in both of your first two examples you don't subscript in the runtime context (since this is not a constructor), but you would want to in the typing context.

Specifically:

return SomeGenericModel[t_type].parse_raw(some_generic_model_str)

Doesn't really make a lot of sense because the subscripting will be thrown away at runtime and it's not doing what you expect.

Since your examples are pretty trivial it's hard to see how you'd use the typing, but a more concrete example might be the following:

class SomeGenericModel(...):
    @classmethod
    def parse_raw(cls, data: str) -> "SomeGenericModel[T]":
        model: SomeGenericModel[T]
        decoded = json.loads(data)  # assumes `import json` at module level
        if 'a' in decoded['some_model_instance']:
            model = SomeGenericModel[SomeModel](
                some_model_instance=SomeModel.parse_obj(
                    decoded['some_model_instance']))
            # do things with the model: SomeGenericModel[SomeModel]
        if 'b' in decoded['some_model_instance']:
            # do some other things with a different model subtype
            ...

        return model

Looking at your code more, I am inclined to think that you're looking for some sort of generic function. As far as I'm aware, Python typing does not support what exists in Java / TypeScript, which would look something like def build_generic_type_from_str_3[T](data: str) -> SomeGenericModel[T]: and could be called as model = build_generic_type_from_str_3[SomeModel](...), leaving model defined as model: SomeGenericModel[SomeModel]. Because the type system is only for checking and is not enforced at runtime, you can get very close to that by declaring model's type manually: mypy will then check that build_generic_type_from_str_3 returns a type compatible with what you declared; otherwise it will continue checking with the generic return type that you defined on the function.

Hopefully that helps, but that's all the time I have to write...

@domderen
Author

Hey @terencehonles thanks for answering! I'm not sure I completely follow your answer... I understand that typing in Python doesn't really matter at runtime, but that's not the case in the example I posted.

Take a look at the output from my example. The first two lines present a correct result, where my model was correctly parsed from the string and the inner type was correctly identified and deserialised.

But in the third line of output, the one that was executed without type subscripting, the output is incorrect. The inner type is recognised as "BaseModel" and not as my "SomeModel".

I was also surprised to see that this type subscripting actually changes something at runtime, but this is what I'm seeing in the executed code. And it makes some sense, because how else would the parser know which object it should deserialise my string into?

@terencehonles
Contributor

Sorry, I didn't realize you were using pydantic (didn't look very closely at your imports). Normally type hints don't affect the runtime, but some libraries use them via reflection.
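
As a rough illustration of that kind of reflection, the sketch below uses only the standard library. Wrapper and Config are hypothetical classes, and this is not how pydantic is implemented; it only shows that annotations and generic subscripts remain available as ordinary objects at runtime.

```python
from typing import Generic, TypeVar, get_args, get_origin, get_type_hints

T = TypeVar("T")


class Wrapper(Generic[T]):
    """Hypothetical generic container; not pydantic's GenericModel."""


class Config:
    port: int
    name: str


# Annotations are stored on the class and can be read back at runtime.
hints = get_type_hints(Config)

# Subscripting a generic class yields an alias object that remembers
# its parameters; the subscript is not discarded.
alias = Wrapper[int]
origin = get_origin(alias)
args = get_args(alias)
```

A library can inspect hints, origin, and args to decide how to validate or deserialize data, which is why a subscript can change runtime behaviour.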

I haven't used pydantic, but looking at the documentation https://pydantic-docs.helpmanual.io/usage/models/#generic-models it works because you are abusing the type system, but still managing to do something pydantic can handle.

All the examples I see (and maybe I didn't look long enough) are things that mypy would not complain about. If you instead wrote:

def build_generic_type_from_str_4(some_generic_model_str: str) -> SomeGenericModel[SomeModel]:
    return SomeGenericModel[SomeModel].parse_raw(some_generic_model_str)

you'll see that mypy will not complain anymore. At runtime this is the same thing that you wrote, because it's passing the same class that the variable would be passing in and pydantic is taking that and running the deserialization based on that. However when type checking you'll note that mypy would know exactly what type that is and it's not a variable (it would be OK if it were a type alias too) and therefore you'll now please mypy.

I'm not sure how you'd please mypy otherwise since pydantic and mypy are using the variables in different ways and mypy is being more strict. You want to declare your second parameter as Type[T] as you have in the first example, but that implies it can be only used in the runtime context and therefore mypy can't use it and complains that it shouldn't be used when subscripting.

Without understanding what has already been discussed in the different typing lists I would think it would be OK to only allow Type[T] as a subscript anywhere and if mypy sees it it would be the same as declaring MyClass[T] where T is a type variable that may have a bound. You didn't provide line numbers on your code so it's really hard to look at your output and know what it matches up to, but it looks like pyright may be doing this.

@domderen
Author

domderen commented Oct 8, 2022

Hey @terencehonles sorry for the late response, I was out on holidays. I updated my initial message to show line numbers in code, and matched that with outputs of code execution, mypy & pyright.

I understand that mypy would be fine with a snippet that you provided, but that would require me to re-implement my function for every model I would like to use it with. But generics allow me to re-use a lot of code. That's why I'm trying to use them.

I understand that Pydantic is abusing the type system using reflection, but I honestly think it is a valid usecase and will be coming up more often. Once we have types in a language, it seems quite natural to want to use those types also at runtime to influence program's logic.

That's why I think mypy should accept at least one of those solutions, and I agree with you that it should probably be option #1.

What do you think?

@erictraut

I would expect option 1 to work fine with mypy. The error mypy emits (Variable "t_type" is not valid as a type) would be appropriate if SomeGenericModel[t_type] appeared in a type annotation, but no error should be emitted for a runtime value expression like in this sample. (FWIW, pyright accepts this code without error. And for full disclosure, I'm the main author of pyright.)
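
The distinction between an annotation and a runtime value expression can be sketched with the standard library alone. Box and make_alias are hypothetical names, and the return annotation is left off on purpose to keep the focus on the subscript expression.

```python
from typing import Generic, Type, TypeVar

T = TypeVar("T")


class Box(Generic[T]):
    """Hypothetical stand-in for SomeGenericModel."""


def make_alias(t_type: Type[T]):
    # An annotation such as `x: Box[t_type]` would be invalid for any
    # checker, because t_type is a variable and not a type.
    #
    # The expression below is different: it is an ordinary runtime call to
    # Box.__class_getitem__(t_type), which Python evaluates normally.
    return Box[t_type]


alias = make_alias(int)
```

At runtime, alias behaves like Box[int]; the open question in this issue is only whether a checker should flag the expression.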

Here is the code for option 1 in text form so others don't need to retype it from the above screen shot.

from typing import Generic, TypeVar
from pydantic import BaseModel
from pydantic.generics import GenericModel

class SomeModel(BaseModel):
    a: int

T = TypeVar("T", bound=BaseModel)

class SomeGenericModel(GenericModel, Generic[T]):
    some_model_instance: T

def bgt1(s: str, t_type: type[T]) -> SomeGenericModel[T]:
    return SomeGenericModel[t_type].parse_raw(s)

Option 2 won't work because it uses TypeAlias in an incorrect manner.

Option 3 won't work because it uses type variable T in an incorrect manner. In particular, T is a function-scoped type parameter that appears only once in the function's signature, so there's no way for a constraint solver to solve for T.
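
A small standard-library sketch of the difference, with hypothetical function names (first, build, unsolvable) that do not appear in the original report:

```python
from typing import List, Type, TypeVar

T = TypeVar("T")


def first(items: List[T]) -> T:
    # T appears in a parameter and in the return type, so a checker can
    # solve T from the argument: first([1, 2]) is inferred as int.
    return items[0]


def build(kind: Type[T]) -> T:
    # Passing the class explicitly also gives the solver something to
    # bind T to, which is what option 1 does with t_type.
    return kind()


def unsolvable() -> List[T]:
    # T appears only once, in the return type: no argument can bind it.
    return []
```

All three functions run, but only the first two give a type checker enough information to infer a concrete type for T at the call site.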

@jepperaskdk

I just opened this SO thread, I think it concerns the same problem, except I'm extracting the type variable from self.__orig_class__.__args__: https://stackoverflow.com/questions/74144564/python-generics-reuse-class-generic-type

from typing import Generic, TypeVar


T = TypeVar('T')


class A(Generic[T]):
    def get_t(self) -> T:
        t = self.__orig_class__.__args__[0]
        obj = t.__new__(t)
        obj.__init__()
        return obj


class B(Generic[T]):
    def get_t(self) -> T:
        # This works, but mypy gives "Variable "t" is not valid as a type  [valid-type]mypy(error)"
        t = self.__orig_class__.__args__[0]
        a = A[t]()

        # This fails (object.__new__(X): X is not a type object (TypeVar))
        # I'm guessing it fails since T is just the variable above.
        a = A[T]()
        return a.get_t()


if __name__ == '__main__':
    a = A[str]()
    obj: str = a.get_t()

    b = B[str]()
    obj: str = b.get_t()
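
One caveat with this approach, sketched below under the assumption of CPython's current typing implementation: __orig_class__ is an undocumented attribute that is only set when the instance is created through a subscripted alias such as A[str](), so a defensive lookup may be safer. The method name type_arg is hypothetical.

```python
from typing import Generic, TypeVar, get_args

T = TypeVar("T")


class A(Generic[T]):
    def type_arg(self):
        # __orig_class__ is an undocumented CPython detail; it is only set
        # when the instance was created through a subscripted alias.
        orig = getattr(self, "__orig_class__", None)
        return get_args(orig)[0] if orig is not None else None


with_arg = A[str]().type_arg()
without_arg = A().type_arg()
```

Here with_arg is the str class, while without_arg is None because A() was created without a subscript.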
