Commit

Backport PR #53175: Add np.intc to _factorizers in pd.merge (#53276)

Co-authored-by: Simon Høxbro Hansen <simon.hansen@me.com>
mroeschke and hoxbro committed May 17, 2023
1 parent a23c15c commit 8983c5d
Showing 3 changed files with 8 additions and 2 deletions.
2 changes: 1 addition & 1 deletion doc/source/whatsnew/v2.0.2.rst
@@ -14,11 +14,11 @@ including other versions of pandas.
 Fixed regressions
 ~~~~~~~~~~~~~~~~~
 - Fixed performance regression in :meth:`GroupBy.apply` (:issue:`53195`)
+- Fixed regression in :func:`merge` on Windows when dtype is ``np.intc`` (:issue:`52451`)
 - Fixed regression in :func:`read_sql` dropping columns with duplicated column names (:issue:`53117`)
 - Fixed regression in :meth:`DataFrame.loc` losing :class:`MultiIndex` name when enlarging object (:issue:`53053`)
 - Fixed regression in :meth:`DataFrame.to_string` printing a backslash at the end of the first row of data, instead of headers, when the DataFrame doesn't fit the line width (:issue:`53054`)
 - Fixed regression in :meth:`MultiIndex.join` returning levels in wrong order (:issue:`53093`)
--

 .. ---------------------------------------------------------------------------
 .. _whatsnew_202.bug_fixes:
4 changes: 4 additions & 0 deletions pandas/core/reshape/merge.py
@@ -123,6 +123,10 @@
     np.object_: libhashtable.ObjectFactorizer,
 }

+# See https://github.com/pandas-dev/pandas/issues/52451
+if np.intc is not np.int32:
+    _factorizers[np.intc] = libhashtable.Int64Factorizer
+

 @Substitution("\nleft : DataFrame or named Series")
 @Appender(_merge_doc, indents=0)
4 changes: 3 additions & 1 deletion pandas/tests/reshape/merge/test_merge.py
@@ -1463,7 +1463,9 @@ def test_different(self, right_vals):
         result = merge(left, right, on="A")
         assert is_object_dtype(result.A.dtype)

-    @pytest.mark.parametrize("d1", [np.int64, np.int32, np.int16, np.int8, np.uint8])
+    @pytest.mark.parametrize(
+        "d1", [np.int64, np.int32, np.intc, np.int16, np.int8, np.uint8]
+    )
     @pytest.mark.parametrize("d2", [np.int64, np.float64, np.float32, np.float16])
     def test_join_multi_dtypes(self, d1, d2):
         dtype1 = np.dtype(d1)
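
Note: the following is a minimal sketch of the scenario the new `np.intc` test parameter exercises, not code from this commit. It assumes a pandas release that includes the fix (2.0.2 or later); the frame names, column names, and values are illustrative. Whether `np.intc` and `np.int32` are the same object depends on the platform and NumPy build, so that check is printed rather than asserted.

```python
import numpy as np
import pandas as pd

# Platform-dependent: on some builds np.intc and np.int32 are the same
# scalar type, on others they are distinct types of the same width.
intc_is_int32 = np.intc is np.int32
print("np.intc is np.int32:", intc_is_int32)

# Both key columns use np.intc, so the merge has to look up a factorizer
# for that exact scalar type (illustrative data, not from the test suite).
left = pd.DataFrame(
    {
        "key": np.array([0, 1, 2], dtype=np.intc),
        "left_val": ["a", "b", "c"],
    }
)
right = pd.DataFrame(
    {
        "key": np.array([1, 2, 3], dtype=np.intc),
        "right_val": [1.5, 2.5, 3.5],
    }
)

# Default inner join on the shared key column.
merged = pd.merge(left, right, on="key")
print(merged)
```

With the fix in place, the merge should behave the same whether or not the two scalar types are identical on the running platform.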