Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

API: fix str-accessor on CategoricalIndex #24005

Merged
merged 6 commits into from Dec 2, 2018

Conversation

h-vetinari
Copy link
Contributor

@h-vetinari h-vetinari commented Nov 29, 2018

closes #23555
closes #23556

split off from #23167

  • tests added / passed
  • passes git diff upstream/master -u -- "*.py" | flake8 --diff
  • whatsnew entry

@pep8speaks
Copy link

Hello @h-vetinari! Thanks for submitting the PR.

@gfyoung gfyoung added Bug Indexing Related to indexing on series/frames, not to indexes themselves Categorical Categorical Data Type labels Nov 30, 2018
@@ -1254,6 +1254,7 @@ Categorical
- Bug in :meth:`Categorical.take` with a user-provided ``fill_value`` not encoding the ``fill_value``, which could result in a ``ValueError``, incorrect results, or a segmentation fault (:issue:`23296`).
- In meth:`Series.unstack`, specifying a ``fill_value`` not present in the categories now raises a ``TypeError`` rather than ignoring the ``fill_value`` (:issue:`23284`)
- Bug when resampling :meth:`Dataframe.resample()` and aggregating on categorical data, the categorical dtype was getting lost. (:issue:`23227`)
- Bug in the ``__name__`` attribute of several methods of :class:`Series.str`, which were set incorrectly (:issue:`23551`)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm confused...how does this entry relate to your other changes?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Deleted the wrong line when splitting up #23167...

@@ -264,6 +261,7 @@ def test_api_per_method(self, box, dtype,
+ ['mixed', 'mixed-integer'] * mixed_allowed)

if inferred_dtype in allowed_types:
# i.a. GH 23555, GH 23556
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i.a. = ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

inter alia :)

@codecov
Copy link

codecov bot commented Nov 30, 2018

Codecov Report

Merging #24005 into master will not change coverage.
The diff coverage is 50%.

Impacted file tree graph

@@           Coverage Diff           @@
##           master   #24005   +/-   ##
=======================================
  Coverage   42.45%   42.45%           
=======================================
  Files         161      161           
  Lines       51561    51561           
=======================================
  Hits        21888    21888           
  Misses      29673    29673
Flag Coverage Δ
#single 42.45% <50%> (ø) ⬆️
Impacted Files Coverage Δ
pandas/core/strings.py 33% <50%> (ø) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 92d25f0...ecd08b9. Read the comment docs.

Copy link
Contributor

@jreback jreback left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks good, small changes.

@@ -1254,6 +1254,7 @@ Categorical
- Bug in :meth:`Categorical.take` with a user-provided ``fill_value`` not encoding the ``fill_value``, which could result in a ``ValueError``, incorrect results, or a segmentation fault (:issue:`23296`).
- In meth:`Series.unstack`, specifying a ``fill_value`` not present in the categories now raises a ``TypeError`` rather than ignoring the ``fill_value`` (:issue:`23284`)
- Bug when resampling :meth:`Dataframe.resample()` and aggregating on categorical data, the categorical dtype was getting lost. (:issue:`23227`)
- Bug in many methods of the ``.str``-accessor, which always failed on `CategoricalIndex` (:issue:`23555`, :issue:`23556`)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

use double backticks


# for category, we do the stuff on the categories, so blow it up
# to the full series again
# But for some operations, we have to do the stuff on the full values,
# so make it possible to skip this step as the method already did this
# before the transformation...
if use_codes and self._is_categorical:
result = take_1d(result, self._orig.cat.codes,
# if self._orig is a CategoricalIndex, there is no .cat-accessor
result = take_1d(result, Series(self._orig).cat.codes,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you can use copy=False here in the Series construction

@@ -264,6 +261,7 @@ def test_api_per_method(self, box, dtype,
+ ['mixed', 'mixed-integer'] * mixed_allowed)

if inferred_dtype in allowed_types:
# inter alia GH 23555, GH 23556
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just say xref

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this one

@jreback jreback added this to the 0.24.0 milestone Dec 2, 2018
@@ -1256,6 +1256,7 @@ Categorical
- Bug in :meth:`Categorical.take` with a user-provided ``fill_value`` not encoding the ``fill_value``, which could result in a ``ValueError``, incorrect results, or a segmentation fault (:issue:`23296`).
- In meth:`Series.unstack`, specifying a ``fill_value`` not present in the categories now raises a ``TypeError`` rather than ignoring the ``fill_value`` (:issue:`23284`)
- Bug when resampling :meth:`Dataframe.resample()` and aggregating on categorical data, the categorical dtype was getting lost. (:issue:`23227`)
- Bug in many methods of the ``.str``-accessor, which always failed on ``CategoricalIndex`` (:issue:`23555`, :issue:`23556`)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

re-word this to say CategoricalIndex.str

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jreback
Hope this is more like what you were after.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

close enough

@h-vetinari
Copy link
Contributor Author

@jreback
Ping on green

@jreback jreback merged commit 9d0f113 into pandas-dev:master Dec 2, 2018
@jreback
Copy link
Contributor

jreback commented Dec 2, 2018

thanks @h-vetinari

@h-vetinari h-vetinari deleted the cat_index_str branch December 2, 2018 23:14
Pingviinituutti pushed a commit to Pingviinituutti/pandas that referenced this pull request Feb 28, 2019
Pingviinituutti pushed a commit to Pingviinituutti/pandas that referenced this pull request Feb 28, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Bug Categorical Categorical Data Type Indexing Related to indexing on series/frames, not to indexes themselves
Projects
None yet
Development

Successfully merging this pull request may close these issues.

BUG: many methods on CategoricalIndex.str are broken BUG: CategoricalIndex.str.extractall fails
4 participants