Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

API: NumericIndex([1 2, 3]).dtype should be int64 on 32-bit systems #49815

Merged
merged 3 commits into from Nov 22, 2022

Conversation

topper-123
Copy link
Contributor

"String dtype not supported, you may need "
"to explicitly cast to a numeric type"
)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was only used in NumericIndex.

if dtype is None and data.dtype.kind == "f":
if cls is UInt64Index and (data >= 0).all():
# https://github.com/numpy/numpy/issues/19146
data = np.asarray(orig, dtype=np.uint64)

if issubclass(data.dtype.type, str):
cls._string_data_error(data)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removing this simplifies error handling by not special casing string data.

@@ -198,7 +201,8 @@ def _ensure_dtype(cls, dtype: Dtype | None) -> np.dtype | None:
return cls._default_dtype

dtype = pandas_dtype(dtype)
assert isinstance(dtype, np.dtype)
if not isinstance(dtype, np.dtype):
raise TypeError(f"{dtype} not a numpy type")
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I strengthen this check here, though it will not be important in the end when everything is moved into Index

@mroeschke mroeschke added Dtype Conversions Unexpected or buggy dtype conversions 32bit 32-bit systems Index Related to the Index class or subclasses labels Nov 21, 2022
self._index_cls(data)

# shouldn't
data = ["0", "1", "2"]
with pytest.raises(TypeError, match=msg):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

probably requires a whatsnew note

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

or could just keep the string_error_whatever to keep this tightly focused and avoid the API change

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This error can in pandas <2.0 only be hit by instantiating Int64Index et al. directly, because Index will give a object dtype to this. Int64Index will be removed in pandas 2.0, so IMO a whatsnew notes should not be added?

Doesn't hit this either it we combine string data with numeric dtype:

>>> import pandas as pd
>>> pd.Index(["1", "2"], dtype="int64")
ValueError: Trying to coerce float values to integers  # different code path

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so IMO a whatsnew notes should not be added

OK, but on the off-chance that the Int64Index removal doesn't happen in time, pls be sure to add a note.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

@jbrockmendel jbrockmendel merged commit 5d8d323 into pandas-dev:main Nov 22, 2022
@jbrockmendel
Copy link
Member

thanks @topper-123

@topper-123 topper-123 deleted the NumericIndex_with_list_data branch November 22, 2022 22:34
mliu08 pushed a commit to mliu08/pandas that referenced this pull request Nov 27, 2022
…andas-dev#49815)

* API: NumericIndex([1 2, 3]).dtype should be int64 on 32-bit systems

* cleanups

* cleanups II
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
32bit 32-bit systems Dtype Conversions Unexpected or buggy dtype conversions Index Related to the Index class or subclasses
Projects
None yet
Development

Successfully merging this pull request may close these issues.

API: Should NumericIndex([1 2, 3]).dtype be int64 or int32 on 32-bit systems?
3 participants