Error on merge function of pandas data frames #1049

Closed
lrob opened this issue Sep 21, 2020 · 2 comments
Labels
as designed Not a bug, working as intended


lrob commented Sep 21, 2020

Hi,
I'm using pyright together with the stubs provided by data-science-types. I run into a problem when checking the following code:

import pandas as pd

d = {'a': [1, 2]}
df = pd.DataFrame(data=d)

df_merged: pd.DataFrame = df.merge(right=df)

By running pyright against this code I get the following errors:

 3:19 - error: Argument of type "Dict[_str, Series[TypeVar('_DType')] | Index[TypeVar('_T')] | ndarray[TypeVar('_DType')] | Sequence[int]]" cannot be assigned to parameter "data" of type "Series[TypeVar('_DType')] | Index[TypeVar('_T')] | ndarray[TypeVar('_DType')] | Sequence[TypeVar('_T_co')] | DataFrame | Dict[_str, Series[TypeVar('_DType')] | Index[TypeVar('_T')] | ndarray[TypeVar('_DType')] | Sequence[TypeVar('_T_co')]] | None" in function "__init__"
  Type "Dict[_str, Series[TypeVar('_DType')] | Index[TypeVar('_T')] | ndarray[TypeVar('_DType')] | Sequence[int]]" cannot be assigned to type "Series[TypeVar('_DType')] | Index[TypeVar('_T')] | ndarray[TypeVar('_DType')] | Sequence[TypeVar('_T_co')] | DataFrame | Dict[_str, Series[TypeVar('_DType')] | Index[TypeVar('_T')] | ndarray[TypeVar('_DType')] | Sequence[TypeVar('_T_co')]] | None"
    "Dict[_str, Series[TypeVar('_DType')] | Index[TypeVar('_T')] | ndarray[TypeVar('_DType')] | Sequence[int]]" is incompatible with "Series[TypeVar('_DType')]"
    "Dict[_str, Series[TypeVar('_DType')] | Index[TypeVar('_T')] | ndarray[TypeVar('_DType')] | Sequence[int]]" is incompatible with "Index[TypeVar('_T')]"
    "Dict[_str, Series[TypeVar('_DType')] | Index[TypeVar('_T')] | ndarray[TypeVar('_DType')] | Sequence[int]]" is incompatible with "ndarray[TypeVar('_DType')]"
    "Dict[_str, Series[TypeVar('_DType')] | Index[TypeVar('_T')] | ndarray[TypeVar('_DType')] | Sequence[int]]" is incompatible with "Sequence[TypeVar('_T_co')]"
    "Dict[_str, Series[TypeVar('_DType')] | Index[TypeVar('_T')] | ndarray[TypeVar('_DType')] | Sequence[int]]" is incompatible with "DataFrame"
    Cannot assign to "None"
      TypeVar "_VT" is invariant
... (reportGeneralTypeIssues)
  4:13 - error: "df.merge(right=df)" has type "Series[Unknown]" and is not callable (reportGeneralTypeIssues)
  4:1 - error: Type of "df_merged" is unknown (reportUnknownVariableType)

Setting aside error 3:19 (which is also unclear to me), I would like to focus on error 4:13. Why is the merge function said to have type Series[Unknown]?

To Reproduce
Install the data-science-types stubs:
pip install data-science-types

Install pyright:
sudo npm install -g pyright

Run pyright on the file containing the code above:
pyright test.py

Note that I have enabled the reportUnknownVariableType option.

@lrob lrob changed the title Error on merge function of pandas dataframes Error on merge function of pandas data frames Sep 21, 2020
@erictraut
Collaborator

This is the correct behavior from the perspective of a type checker.

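Pyright reports the type Series[Unknown] for df.merge(right=df) because, in these stubs, the DataFrame class defines no merge method but does define a __getattr__ method that returns Series. The lookup of df.merge therefore falls back to __getattr__ and evaluates to a Series, which is not callable.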
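
To make the __getattr__ fallback concrete, here is a minimal sketch using stand-in classes. The Series and DataFrame below are hypothetical toy classes, not the real pandas classes and not the actual data-science-types stubs; they only assume what is stated above, namely that DataFrame declares no merge method and that its __getattr__ returns a Series. Real pandas does provide merge at runtime; the sketch mirrors only what the type checker sees in the stubs.

```python
class Series:
    """Hypothetical stand-in for the stubbed Series; it defines no __call__."""

    def __init__(self, name: str) -> None:
        self.name = name


class DataFrame:
    """Hypothetical stand-in for the stubbed DataFrame: no merge, only __getattr__."""

    def __getattr__(self, name: str) -> Series:
        # Python calls __getattr__ only when normal attribute lookup fails,
        # so every attribute the class does not declare resolves to a Series.
        return Series(name)


df = DataFrame()

# With no declared merge method, the lookup falls through to __getattr__.
# A type checker infers Series for this expression, not a bound method.
column = df.merge
print(type(column).__name__)

try:
    # This is a call on a Series instance, which is not callable.
    df.merge(right=df)
except TypeError as exc:
    print(f"TypeError: {exc}")
```

Under the same assumptions, declaring merge explicitly on the class would take precedence over __getattr__, which is presumably the kind of change the stubs would need.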

I don't know enough about pandas to offer an informed workaround or recommended bug fix, but you may want to report the issue within the data-science-types repo. I can assure you that Pyright is doing the right thing based on the information it's provided in these stubs.

@erictraut erictraut added the as designed Not a bug, working as intended label Sep 21, 2020
@lrob
Author

lrob commented Sep 22, 2020

I'll report it to data-science-types as you suggest.

Thanks for your support.
