Should Series[Any] be used internally instead of Series? #1133

Closed
@MarcoGorelli

Description

@MarcoGorelli
Member

Currently, Series is used in several places where the inner type of the Series isn't known, e.g.:

@overload
def compare(
    self,
    other: Series,
    align_axis: AxisColumn = ...,
    keep_shape: bool = ...,
    keep_equal: bool = ...,
) -> DataFrame: ...

There are a couple of issues I'm running into with this:

First, the pyright-strict job marks this as partially unknown:

/home/runner/work/pandas-stubs/pandas-stubs/tests/test_series.py:1039:5 - error: Type of "compare" is partially unknown
Type of "compare" is "Overload[(other: Series[Unknown], align_axis: Literal['index', 0], keep_shape: bool = ..., keep_equal: bool = ...) -> Series[Unknown], (other: Series[Unknown], align_axis: Literal['columns', 1] = ..., keep_shape: bool = ..., keep_equal: bool = ...) -> DataFrame]" (reportUnknownMemberType)

Second, when using pyright with --verifytypes to look for uncovered parts of the public API, this is flagged as "unknown type":

            {
                "category": "function",
                "name": "pandas.testing.assert_series_equal",
                "referenceCount": 1,
                "isExported": true,
                "isTypeKnown": false,
                "isTypeAmbiguous": false,
                "diagnostics": [
                    {
                        "file": "/home/marcogorelli/type_coverage_py/.pyright_env_pandas/lib/python3.12/site-packages/pandas/_testing/__init__.pyi",
                        "severity": "error",
                        "message": "Type of parameter \"left\" is partially unknown\n  Parameter type is \"Series[Unknown]\"\n    Type argument 1 for class \"Series\" has unknown type",
                        "range": {
                            "start": {
                                "line": 4,
                                "character": 27
                            },
                            "end": {
                                "line": 4,
                                "character": 46
                            }
                        }
                    },

Would it be OK to use Series[Any] instead of just Series in such cases? Or, as some libraries do, to introduce a type alias Incomplete: TypeAlias = Any, meaning "we should be able to narrow down the type but for now we're not doing so", and use that in some cases?

The latter use-case (--verifytypes) can, I think, really help to prioritise which stubs to add.
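As a rough sketch of how such a report could drive prioritisation, the snippet below lists exported symbols whose type is not fully known, most-referenced first. The per-symbol fields are the ones shown in the excerpt above; the top-level `typeCompleteness` and `symbols` keys are an assumption about the JSON layout, and `untyped_symbols`, `sample_report`, and the `example.*` names are hypothetical.

```python
from typing import Any


def untyped_symbols(report: dict[str, Any]) -> list[str]:
    """Return names of exported symbols with unknown types, most-referenced first."""
    # Assumed layout: {"typeCompleteness": {"symbols": [...]}}
    symbols = report.get("typeCompleteness", {}).get("symbols", [])
    flagged = [
        s for s in symbols
        if s.get("isExported") and not s.get("isTypeKnown")
    ]
    # Symbols referenced more often are likely the more valuable ones to fix.
    flagged.sort(key=lambda s: s.get("referenceCount", 0), reverse=True)
    return [s["name"] for s in flagged]


# Hand-written sample data; only the first entry comes from the excerpt above.
sample_report = {
    "typeCompleteness": {
        "symbols": [
            {
                "name": "pandas.testing.assert_series_equal",
                "referenceCount": 1,
                "isExported": True,
                "isTypeKnown": False,
            },
            {
                "name": "example.fully_typed",
                "referenceCount": 3,
                "isExported": True,
                "isTypeKnown": True,
            },
            {
                "name": "example.heavily_used",
                "referenceCount": 5,
                "isExported": True,
                "isTypeKnown": False,
            },
        ]
    }
}

print(untyped_symbols(sample_report))
# ['example.heavily_used', 'pandas.testing.assert_series_equal']
```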

Activity

Dr-Irv commented on Feb 28, 2025

@Dr-Irv
Collaborator

We've had a discussion about this in another PR. See #1093 (comment).

The idea is to create an UnknownSeries type to use wherever we don't know the inner type of the Series. I think this may solve the problem you raise above.

Jeitan commented on Mar 18, 2025

@Jeitan

Is there something similar happening for DataFrames? I'm also running afoul of things being partially unknown using strict with basedpyright, specifically in the return of read_excel for a particular overload (a single string passed into the io parameter).

Dr-Irv commented on Mar 19, 2025

@Dr-Irv
Collaborator

> Is there something similar happening for DataFrames? I'm also running afoul of things being partially unknown using strict with basedpyright, specifically in the return of read_excel for a particular overload (a single string passed into the io parameter).

I don't think the return types of read_excel() should be seen as partially unknown. Can you provide a simple example that illustrates what you are seeing?

What might be happening is the following. Let's say you do df = pd.read_excel("your string"). The result of that is either a dict of DataFrame objects or a single DataFrame. Let's say it is the latter. If you then do s = df["a"], then s will be partially unknown since it will have type Series[Any]. There are some contributions to change all references in pandas-stubs from Series to UnknownSeries. This is an ongoing effort.

Jeitan commented on Mar 20, 2025

@Jeitan

@Dr-Irv Okay gotcha. I probably need to do some better things on my end also, so I'll see if I can address it that way first. Thanks!

        Should `Series[Any]` be used internally instead of `Series`? · Issue #1133 · pandas-dev/pandas-stubs