BUG: np.find_common_type is deprecated in numpy main branch #53236

Closed
ngoldbaum opened this issue May 15, 2023 · 1 comment · Fixed by #53343
Labels
Compat (pandas objects compatibility with Numpy or Python functions)

Comments

ngoldbaum (Contributor) commented on May 15, 2023

Pandas version checks

  • I have checked that this issue has not already been reported.

  • I have confirmed this bug exists on the latest version of pandas.

  • I have confirmed this bug exists on the main branch of pandas.

Reproducible Example

==================================================== FAILURES =====================================================
_______________________ TestGetitem.test_getitem_integer_array[True-pyarrow-integer-array] ________________________

self = <pandas.tests.extension.test_string.TestGetitem object at 0x7f3aa0ab8e10>
data = <ArrowStringArray>
['S', 'L', 'h', 'D', 'W', 'c', 'H', 'P', 'L', 't', 'T', 'i', 'r', 'r', 'd',
 'Z', 'F', 'i', 'W', 'V..., 'X', 'O', 'P', 'Q', 'd', 'S', 'h', 'N',
 'Q', 'm', 'U', 'Z', 'Z', 'h', 'D', 'P', 'x', 'p']
Length: 100, dtype: string
idx = <IntegerArray>
[0, 1, 2]
Length: 3, dtype: Int64

    @pytest.mark.parametrize(
        "idx",
        [[0, 1, 2], pd.array([0, 1, 2], dtype="Int64"), np.array([0, 1, 2])],
        ids=["list", "integer-array", "numpy-array"],
    )
    def test_getitem_integer_array(self, data, idx):
        result = data[idx]
        assert len(result) == 3
        assert isinstance(result, type(data))
        expected = data.take([0, 1, 2])
        self.assert_extension_array_equal(result, expected)
    
        expected = pd.Series(expected)
>       result = pd.Series(data)[idx]

pandas/tests/extension/base/getitem.py:249: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
pandas/core/series.py:1003: in __getitem__
    return self._get_with(key)
pandas/core/series.py:1030: in _get_with
    return self.loc[key]
pandas/core/indexing.py:1090: in __getitem__
    return self._getitem_axis(maybe_callable, axis=axis)
pandas/core/indexing.py:1319: in _getitem_axis
    return self._getitem_iterable(key, axis=axis)
pandas/core/indexing.py:1259: in _getitem_iterable
    keyarr, indexer = self._get_listlike_indexer(key, axis)
pandas/core/indexing.py:1449: in _get_listlike_indexer
    keyarr, indexer = ax._get_indexer_strict(key, axis_name)
pandas/core/indexes/base.py:5891: in _get_indexer_strict
    indexer = self.get_indexer_for(keyarr)
pandas/core/indexes/base.py:5878: in get_indexer_for
    return self.get_indexer(target)
pandas/core/indexes/base.py:3820: in get_indexer
    dtype = self._find_common_type_compat(target)
pandas/core/indexes/base.py:6096: in _find_common_type_compat
    dtype = find_result_type(self._values, target)
pandas/core/dtypes/cast.py:1277: in find_result_type
    new_dtype = find_common_type([left.dtype, dtype])
pandas/core/dtypes/cast.py:1386: in find_common_type
    return np.find_common_type(types, [])
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

array_types = [dtype('int64'), dtype('O')], scalar_types = []

    @set_module('numpy')
    def find_common_type(array_types, scalar_types):
        """
        Determine common type following standard coercion rules.
    
        .. deprecated:: NumPy 1.25
    
            This function is deprecated, use `numpy.promote_types` or
            `numpy.result_type` instead.  To achieve semantics for the
            `scalar_types` argument, use `numpy.result_type` and pass the Python
            values `0`, `0.0`, or `0j`.
            This will give the same results in almost all cases.
            More information and rare exception can be found in the
            `NumPy 1.25 release notes <https://numpy.org/devdocs/release/1.25.0-notes.html>`_.
    
        Parameters
        ----------
        array_types : sequence
            A list of dtypes or dtype convertible objects representing arrays.
        scalar_types : sequence
            A list of dtypes or dtype convertible objects representing scalars.
    
        Returns
        -------
        datatype : dtype
            The common data type, which is the maximum of `array_types` ignoring
            `scalar_types`, unless the maximum of `scalar_types` is of a
            different kind (`dtype.kind`). If the kind is not understood, then
            None is returned.
    
        See Also
        --------
        dtype, common_type, can_cast, mintypecode
    
        Examples
        --------
        >>> np.find_common_type([], [np.int64, np.float32, complex])
        dtype('complex128')
        >>> np.find_common_type([np.int64, np.float32], [])
        dtype('float64')
    
        The standard casting rules ensure that a scalar cannot up-cast an
        array unless the scalar is of a fundamentally different kind of data
        (i.e. under a different hierarchy in the data type hierarchy) then
        the array:
    
        >>> np.find_common_type([np.float32], [np.int64, np.float64])
        dtype('float32')
    
        Complex is of a different type, so it up-casts the float in the
        `array_types` argument:
    
        >>> np.find_common_type([np.float32], [complex])
        dtype('complex128')
    
        Type specifier strings are convertible to dtypes and can therefore
        be used instead of dtypes:
    
        >>> np.find_common_type(['f4', 'f4', 'i4'], ['c8'])
        dtype('complex128')
    
        """
        # Deprecated 2022-11-07, NumPy 1.25
>       warnings.warn(
                "np.find_common_type is deprecated.  Please use `np.result_type` "
                "or `np.promote_types`.\n"
                "See https://numpy.org/devdocs/release/1.25.0-notes.html and the "
                "docs for more information.  (Deprecated NumPy 1.25)",
                DeprecationWarning, stacklevel=2)
E       DeprecationWarning: np.find_common_type is deprecated.  Please use `np.result_type` or `np.promote_types`.
E       See https://numpy.org/devdocs/release/1.25.0-notes.html and the docs for more information.  (Deprecated NumPy 1.25)

../numpy/numpy/core/numerictypes.py:661: DeprecationWarning
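
For reference, a small sketch (not part of the original report) of the replacement the warning points to. With scalar_types=[], as pandas uses it, np.result_type over the array dtypes should give the same result, and the scalar_types semantics can be recovered by passing Python scalar values (0, 0.0, 0j), per the release notes quoted above:

import numpy as np

# scalar_types=[] case, matching how pandas calls the function:
np.find_common_type([np.dtype("int64"), np.dtype("float32")], [])  # float64 (deprecated in 1.25)
np.result_type(np.dtype("int64"), np.dtype("float32"))             # float64

# scalar_types semantics recovered with Python scalar values:
np.find_common_type([np.float32], [np.int64, np.float64])  # float32 (deprecated in 1.25)
np.result_type(np.float32, 0, 0.0)                          # float32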

It looks like np.find_common_type is used in five places inside pandas:

pandas/core/algorithms.py
521:        common = np.find_common_type([values.dtype, comps_array.dtype], [])

pandas/tests/dtypes/test_inference.py
1000:        common_kind = np.find_common_type(

pandas/core/arrays/sparse/dtype.py
456:        return SparseDtype(np.find_common_type(np_dtypes, []), fill_value=fill_value)

pandas/core/dtypes/cast.py
1386:    return np.find_common_type(types, [])

pandas/core/internals/array_manager.py
1416:        target_dtype = np.find_common_type(list(dtypes), [])

Issue Description

See https://github.com/numpy/numpy/blob/main/doc/release/upcoming_changes/22539.deprecation.rst for the corresponding release note. It looks like all of the usages in pandas leave the scalar_types argument to find_common_type as [], so we can follow the upgrade path outlined at the beginning of the release note (although it's possible pandas is relying on how find_common_type falls back to object dtype in error cases, so a try/except might be needed as well).
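
As a rough sketch of that upgrade path (not an actual patch; the helper name and the object-dtype fallback are illustrative assumptions based on the caveat above), something along these lines could replace the np.find_common_type(types, []) calls:

import numpy as np

def np_find_common_type_compat(*dtypes):
    # Hypothetical helper standing in for np.find_common_type(dtypes, []),
    # built on np.result_type as the NumPy 1.25 release notes suggest.
    try:
        return np.result_type(*dtypes)
    except TypeError:
        # Assumption: mimic find_common_type falling back to object dtype
        # when no common kind exists, in case pandas relies on that.
        return np.dtype("object")

For example, np_find_common_type_compat(np.dtype("int64"), np.dtype("O")) gives dtype('O'), matching the array_types shown in the traceback above.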

Expected Behavior

N/A

Installed Versions

INSTALLED VERSIONS

commit : 206d2f0
python : 3.10.9.final.0
python-bits : 64
OS : Linux
OS-release : 5.19.0-41-generic
Version : #42~22.04.1-Ubuntu SMP PREEMPT_DYNAMIC Tue Apr 18 17:40:00 UTC 2
machine : x86_64
processor : x86_64
byteorder : little
LC_ALL : None
LANG : en_US.UTF-8
LOCALE : en_US.UTF-8

pandas : 2.1.0.dev0+769.g206d2f061d
numpy : 1.25.0.dev0+1445.g44bd27501
pytz : 2022.7.1
dateutil : 2.8.2
setuptools : 65.5.0
pip : 23.1.2
Cython : 0.29.33
pytest : 7.3.0
hypothesis : 6.68.2
sphinx : 6.1.3
blosc : 1.11.1
feather : None
xlsxwriter : 3.0.9
lxml.etree : 4.9.2
html5lib : 1.1
pymysql : 1.0.3
psycopg2 : 2.9.6
jinja2 : 3.1.2
IPython : 8.11.0
pandas_datareader: None
bs4 : 4.12.2
bottleneck : 1.3.7
brotli :
fastparquet : 2023.2.0
fsspec : 2023.4.0
gcsfs : 2023.4.0
matplotlib : 3.6.3
numba : None
numexpr : 2.8.4
odfpy : None
openpyxl : 3.1.0
pandas_gbq : None
pyarrow : 11.0.0
pyreadstat : 1.2.1
pyxlsb : 1.0.10
s3fs : 2023.4.0
scipy : 1.10.1
snappy :
sqlalchemy : 2.0.7
tables : 3.8.0
tabulate : 0.9.0
xarray : 2023.4.2
xlrd : 2.0.1
zstandard : 0.21.0
tzdata : 2022.7
qtpy : None
pyqt5 : None

ngoldbaum added the Bug and Needs Triage (Issue that has not been reviewed by a pandas team member) labels on May 15, 2023
mroeschke (Member) commented:

Tried doing this a while ago in #49569 but hit some breaking changes; maybe things have changed since then.
