Inconsistent support of type hints with mypy #1389

Closed
2 of 3 tasks
miguel-mi-silva opened this issue Oct 19, 2023 · 3 comments
Labels
bug Something isn't working

Comments

@miguel-mi-silva

Describe the bug
I have read through the documentation and a few issues here (example), but could not determine whether this is a bug or simply not yet supported. From the discussion in the linked issue, it seems that it should be supported.

The bugs are:

  • Type hints on function arguments and return values are not handled correctly by mypy.
  • The schema argument of check_input and check_output does not accept a SchemaModel as a valid type.

  • I have checked that this issue has not already been reported.
  • I have confirmed this bug exists on the latest version of pandera.
  • (optional) I have confirmed this bug exists on the master branch of pandera.

Code Sample, a copy-pastable example

from pandera import SchemaModel, check_input, check_output
from pandas import DataFrame
from pandera.typing import DataFrame as TypeDataFrame


class InputSchema(SchemaModel):
    a: int
    b: bool

class OutputSchema(SchemaModel):
    a: int
    b: bool
    c: str

df = DataFrame({"a": [1], "b": [True]})

@check_input(schema=InputSchema)
@check_output(schema=OutputSchema)
def process(df: TypeDataFrame[InputSchema]) -> TypeDataFrame[OutputSchema]:
    return df.assign(c="c")

Expected behavior

Running mypy on this code should not raise any errors (this is how I understand it, based on this reply to a previous issue).
However, the following errors are raised:

  • Argument "schema" to "check_input" has incompatible type "type[InputSchema]"; expected "DataFrameSchema | SeriesSchema" [arg-type]
  • Argument "schema" to "check_output" has incompatible type "type[OutputSchema]"; expected "DataFrameSchema | SeriesSchema" [arg-type]
  • Incompatible return value type (got "pandas.core.frame.DataFrame", expected "pandera.typing.pandas.DataFrame[OutputSchema]") [return-value]
  • Argument 1 to "process" has incompatible type "pandas.core.frame.DataFrame"; expected "pandera.typing.pandas.DataFrame[InputSchema]" [arg-type]

Additional context

Pandera version: 0.17.2
Mypy version: 1.6.0
Pandas version: 2.1.1

miguel-mi-silva added the bug label on Oct 19, 2023
@cosmicBboy
Collaborator

Hi @miguel-mi-silva, if you're using DataFrameModels (SchemaModel is the older name for the same class), you should use pa.check_types:

from pandera import check_types

@check_types
def process(df: TypeDataFrame[InputSchema]) -> TypeDataFrame[OutputSchema]:
    return df.assign(c="c")

check_input and check_output are for DataFrameSchema objects, not for model classes.

@miguel-mi-silva
Author

miguel-mi-silva commented Oct 19, 2023

Thanks for that. However, even with those changes, mypy still reports the return-type error: Incompatible return value type (got "pandas.core.frame.DataFrame", expected "pandera.typing.pandas.DataFrame[OutputSchema]") [return-value]. Is that a known issue, or am I doing something wrong?

Here's the updated code for reference:

from pandera import SchemaModel, check_types
from pandas import DataFrame
from pandera.typing import DataFrame as TypeDataFrame


class InputSchema(SchemaModel):
    a: int
    b: bool

class OutputSchema(SchemaModel):
    a: int
    b: bool
    c: str

df = DataFrame({"a": [1], "b": [True]})

@check_types
def process(df: TypeDataFrame[InputSchema]) -> TypeDataFrame[OutputSchema]:
    return df.assign(c="c")

@cosmicBboy
Collaborator

cosmicBboy commented Oct 19, 2023

What does your mypy config look like?

See the docs for a discussion on mypy support. If you care about type linting, there's some guidance there on how to make your mypy checks pass, e.g.

@check_types
def process(df: TypeDataFrame[InputSchema]) -> TypeDataFrame[OutputSchema]:
    return df.assign(c="c").pipe(TypeDataFrame[OutputSchema])

or

from typing import cast

@check_types
def process(df: TypeDataFrame[InputSchema]) -> TypeDataFrame[OutputSchema]:
    return cast(TypeDataFrame[OutputSchema], df.assign(c="c"))
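
To illustrate why the cast silences the error without changing behavior, here is a small stdlib-only sketch. `Frame` and `TypedFrame` are hypothetical stand-ins for `pandas.DataFrame` and `pandera.typing.DataFrame`, not the real classes; the assumption is that `assign` is annotated to return the base class, which matches the error message quoted earlier in this thread.

```python
from typing import Generic, TypeVar, cast

SchemaT = TypeVar("SchemaT")


class Frame:
    """Hypothetical stand-in for pandas.DataFrame."""

    def __init__(self, columns: dict) -> None:
        self.columns = dict(columns)

    def assign(self, **new_columns: object) -> "Frame":
        # Annotated to return the base class, so the static type is
        # Frame even when called on a subclass instance.
        return Frame({**self.columns, **new_columns})


class TypedFrame(Frame, Generic[SchemaT]):
    """Hypothetical stand-in for pandera.typing.DataFrame[Schema]."""


class OutputSchema:
    """Placeholder schema marker used only as a type parameter."""


def process(df: Frame) -> TypedFrame[OutputSchema]:
    widened = df.assign(c="c")  # statically a Frame, not a TypedFrame
    # Returning `widened` directly is what a type checker would flag.
    # cast() only changes the static type; it returns the same object.
    return cast(TypedFrame[OutputSchema], widened)


result = process(Frame({"a": [1], "b": [True]}))
```

Because `cast` does nothing at runtime, `result` is still a plain `Frame` here. In the pandera case the actual validation is performed by `@check_types`, and the cast only satisfies the type checker.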
