Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

TYP: core/dtypes/cast.py #37024

Merged
merged 13 commits into from
Oct 14, 2020
Merged

Conversation

arw2019
Copy link
Member

@arw2019 arw2019 commented Oct 10, 2020

  • closes #xxxx
  • tests added / passed
  • passes black pandas
  • passes git diff upstream/master -u -- "*.py" | flake8 --diff
  • whatsnew entry

@@ -181,7 +181,7 @@ def maybe_downcast_to_dtype(result, dtype):
return result


def maybe_downcast_numeric(result, dtype, do_round: bool = False):
def maybe_downcast_numeric(result, dtype: Dtype, do_round: bool = False):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

DtypeObj?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Switched to that

def maybe_cast_to_extension_array(cls: Type["ExtensionArray"], obj, dtype=None):
def maybe_cast_to_extension_array(
cls: Type["ExtensionArray"], obj, dtype: Dtype = None
):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

docstring says ExtensionDtype, optional, so Optional[ExtensionDtype]?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

def infer_dtype_from_array(arr, pandas_dtype: bool = False) -> Tuple[DtypeObj, Any]:
def infer_dtype_from_array(
arr, pandas_dtype: bool = False
) -> Tuple[DtypeObj, AnyArrayLike]:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i think ArrayLike, not AnyArrayLike

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

def maybe_convert_objects(values: np.ndarray, convert_numeric: bool = True):
def maybe_convert_objects(
values: np.ndarray, convert_numeric: bool = True
) -> Union[np.ndarray, ABCDatetimeIndex]:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

use "DatetimeIndex" instead of ABCDatetimeIndex

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@@ -1183,7 +1205,7 @@ def convert_dtypes(
return inferred_dtype


def maybe_castable(arr) -> bool:
def maybe_castable(arr: np.ndarray) -> bool:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we never get EAs here?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Atm the only use case is here:

def _try_cast(arr, dtype: Optional[DtypeObj], copy: bool, raise_cast_failure: bool):
"""
Convert input to numpy ndarray and optionally cast to a given dtype.
Parameters
----------
arr : ndarray, scalar, list, tuple, iterator (catchall)
Excludes: ExtensionArray, Series, Index.
dtype : np.dtype, ExtensionDtype or None
copy : bool
If False, don't copy the data if not needed.
raise_cast_failure : bool
If True, and if a dtype is specified, raise errors during casting.
Otherwise an object array is returned.
"""
# perf shortcut as this is the most common case
if isinstance(arr, np.ndarray):
if maybe_castable(arr) and not copy and dtype is None:
return arr

which according to docstring excludes ExtensionArray

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you add an assert in the body itself to lock this down. (whether we want it or not is different, but guarantee is good for now)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added

@jreback jreback added Dtype Conversions Unexpected or buggy dtype conversions Typing type annotations, mypy/pyright type checking labels Oct 10, 2020
pandas/core/dtypes/cast.py Outdated Show resolved Hide resolved
@arw2019 arw2019 force-pushed the TYP-core_dtypes_cast branch 3 times, most recently from 136d7db to 5ae2c02 Compare October 11, 2020 03:57
@arw2019 arw2019 marked this pull request as ready for review October 11, 2020 04:01
pandas/core/dtypes/cast.py Show resolved Hide resolved
@@ -904,7 +936,7 @@ def maybe_upcast(values, fill_value=np.nan, dtype=None, copy: bool = False):
return values, fill_value


def invalidate_string_dtypes(dtype_set):
def invalidate_string_dtypes(dtype_set: Set):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Set[DtypeObj] I think?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@@ -1183,7 +1205,7 @@ def convert_dtypes(
return inferred_dtype


def maybe_castable(arr) -> bool:
def maybe_castable(arr: np.ndarray) -> bool:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you add an assert in the body itself to lock this down. (whether we want it or not is different, but guarantee is good for now)

@jreback
Copy link
Contributor

jreback commented Oct 11, 2020

cc @simonjayhawkins

@jreback jreback added this to the 1.2 milestone Oct 12, 2020
@jreback
Copy link
Contributor

jreback commented Oct 12, 2020

@simonjayhawkins i fyou can have a quick look and merge if good

def maybe_cast_to_extension_array(cls: Type["ExtensionArray"], obj, dtype=None):
def maybe_cast_to_extension_array(
cls: Type["ExtensionArray"], obj: ArrayLike, dtype: Optional[ExtensionDtype] = None
) -> ArrayLike:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ArrayLike is a TypeVar. so the return type is typed to be the same as obj.

I think mypy is accepting this since np.ndarray resolves to Any

We probably need an ArrayLikeUnion, but this is perhaps out of scope here as the discussion has been started elsewhere. xref #36100

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shall I remove the annotation and add a TODO?

As an aside FWIW I like the ArrayLike/ArrayLikeT idea in https://github.com/pandas-dev/pandas/pull/36100/files#r483753222

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you can leave as is for now. we have #36092 to track the issues once np.ndarray no longer resolves to Any.

@@ -468,7 +487,7 @@ def maybe_casted_values(index, codes=None):

# if we have the codes, extract the values with a mask
if codes is not None:
mask = codes == -1
mask: np.ndarray = codes == -1
Copy link
Member

@simonjayhawkins simonjayhawkins Oct 12, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why was this added? mypy is not reporting any errors for me with this removed.

generally, try to avoid adding variable type annotations. should be inferred from expression.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

mistake, will revert

I had tried typing index and codes but there were a lot of errors - this was part of an attempted fix.

@@ -1184,7 +1220,7 @@ def soft_convert_objects(


def convert_dtypes(
input_array,
input_array: AnyArrayLike,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you update the docstring to match?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@@ -1566,7 +1606,9 @@ def find_common_type(types: List[DtypeObj]) -> DtypeObj:
return np.find_common_type(types, [])


def cast_scalar_to_array(shape, value, dtype: Optional[DtypeObj] = None) -> np.ndarray:
def cast_scalar_to_array(
shape: Tuple, value: Scalar, dtype: Optional[DtypeObj] = None
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we just pass shape onto np.empty, https://numpy.org/doc/1.19/reference/generated/numpy.empty.html

so could be int or tuple of int

I recall a discussion about adding an alias to pandas._typing or

Suggested change
shape: Tuple, value: Scalar, dtype: Optional[DtypeObj] = None
shape: Union[int, Tuple[int, ...]], value: Scalar, dtype: Optional[DtypeObj] = None

and the docstring would need to be updated too.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pushed the suggestion.

Adding an alias to _typing sounds like a good idea. Do we already have a PR on that or shall I submit one?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we already have a PR on that or shall I submit one?

I don't think so, go ahead.

@@ -1698,7 +1742,7 @@ def construct_1d_ndarray_preserving_na(
return subarr


def maybe_cast_to_integer_array(arr, dtype, copy: bool = False):
def maybe_cast_to_integer_array(arr, dtype: Union[str, np.dtype], copy: bool = False):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this limited to numpy dtypes? if not should use Dtype from typing.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's not. Switched to Dtype here

@simonjayhawkins
Copy link
Member

Thanks @arw2019 lgtm one more comment #37024 (comment) ping when green

@arw2019 arw2019 force-pushed the TYP-core_dtypes_cast branch 2 times, most recently from 2e85988 to a3ae1be Compare October 12, 2020 23:00
@jreback jreback merged commit 85260bf into pandas-dev:master Oct 14, 2020
@jreback
Copy link
Contributor

jreback commented Oct 14, 2020

thanks @arw2019

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Dtype Conversions Unexpected or buggy dtype conversions Typing type annotations, mypy/pyright type checking
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants