ENH: Categorical.from_union #13410

Closed
2 of 3 tasks
jreback opened this issue Jun 9, 2016 · 12 comments
Labels
Categorical Categorical Data Type Closing Candidate May be closeable, needs more eyeballs Enhancement

Comments

@jreback
Contributor

jreback commented Jun 9, 2016

xref #13361

@jreback jreback added Reshaping Concat, Merge/Join, Stack/Unstack, Explode API Design Categorical Categorical Data Type labels Jun 9, 2016
@jreback jreback added this to the 0.18.2 milestone Jun 9, 2016
@jreback
Contributor Author

jreback commented Jun 9, 2016

I think the location is fine. This is mostly part of a developer/extender API, e.g. used internally by other parts of pandas and by other packages (e.g. dask), rather than being in and of itself useful to a regular user.

@jankatins
Contributor

+1 for adding a Categorical.from_union(*cats, ignore_order=False) instead of pd.xxx() -> IMO it shouldn't be exposed as top level API and from_union() is a nice equivalent to from_codes().

jreback pushed a commit that referenced this issue Jul 29, 2016
xref #13410, #13524

Author: sinhrks <sinhrks@gmail.com>

Closes #13763 from sinhrks/union_categoricals_ordered and squashes the following commits:

9cadc4e [sinhrks] ENH: union_categorical supports identical categories with ordered
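
As a small illustration of what the commit title above describes, the sketch below unions two ordered categoricals that share identical categories. The variable names and sample values are made up for this example, and the import path `pandas.api.types` is the one used by current pandas releases, which may differ from the location at the time of the commit.

```python
import pandas as pd
from pandas.api.types import union_categoricals

# Two ordered categoricals with the same categories in the same order
# (sample data invented for illustration).
a = pd.Categorical(["x", "y"], categories=["x", "y", "z"], ordered=True)
b = pd.Categorical(["z", "x"], categories=["x", "y", "z"], ordered=True)

# Because the categories and the ordering match, the union is allowed
# and the result stays ordered.
combined = union_categoricals([a, b])
print(combined)
```

Ordered categoricals whose categories differ are a separate case and are not covered by this sketch.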
@jreback
Contributor Author

jreback commented Sep 28, 2016

@chris-b1 this was partially closed by #14191 ?

@jreback jreback modified the milestones: 0.19.0, Next Major Release Sep 28, 2016
@chris-b1
Contributor

It was #14199, but yes - I edited the top comment.

@js3711

js3711 commented Jan 19, 2017

@jreback @JanSchulz
I am interested in starting to contribute to pandas and see this as a good first PR opportunity. Do you guys agree?

  • If so, what do you see as the desired behavior for "add ignore_order to ignore the raising on an ordered Categorical (and just have it work)"?
  • I do like the idea of Categorical.from_union(...). Should pandas.types.concat.union_categoricals still be supported (with the implementation living in from_union)?

@chris-b1
Contributor

Setup

In [13]: import pandas as pd

In [14]: from pandas.api.types import union_categoricals

In [15]: c1 = pd.Categorical(['a', 'a', 'b'], categories=['b', 'a', 'c'], ordered=True)

In [16]: c2 = pd.Categorical(['b', 'b', 'a'])

In [17]: union_categoricals([c1, c2])
TypeError: Categorical.ordered must be the same

For your first question - the idea would be to allow this

In [18]: union_categoricals([c1, c2], ignore_order=True)
[a, a, b, b, b, a]
Categories (3, object): [b, a, c]

On your second question - not sure if there's complete agreement on the API, but assuming there is a Categorical.from_union, I would suggest leaving the implementation where it is and calling the union_categoricals function inside Categorical.from_union.
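
A minimal sketch of that delegation idea, assuming the signature proposed earlier in the thread (`from_union(*cats, ignore_order=False)`). Note that `from_union` here is a hypothetical free function written for illustration; `Categorical.from_union` was never added to pandas. Only `union_categoricals` and its `ignore_order` parameter are real pandas API.

```python
import pandas as pd
from pandas.api.types import union_categoricals


def from_union(*cats, ignore_order=False):
    """Hypothetical shortcut: delegate to the existing union_categoricals."""
    return union_categoricals(list(cats), ignore_order=ignore_order)


c1 = pd.Categorical(["a", "a", "b"], categories=["b", "a", "c"], ordered=True)
c2 = pd.Categorical(["b", "b", "a"])

# Without ignore_order, mixing ordered and unordered input raises TypeError.
strict_error = None
try:
    from_union(c1, c2)
except TypeError as exc:
    strict_error = exc

# With ignore_order=True the union succeeds and the result is unordered.
result = from_union(c1, c2, ignore_order=True)
print(result)
```

Keeping the logic in `union_categoricals` means the documented function and any shortcut would share one implementation.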

@jorisvandenbossche
Member

jorisvandenbossche commented Jan 20, 2017

The union_categoricals function is itself mentioned in the docs (http://pandas.pydata.org/pandas-docs/stable/categorical.html#unioning), so to start I think it is good to just improve this function (with e.g. what @chris-b1 showed above).

@js3711

js3711 commented Jan 25, 2017

Thank you all for the comments. I have made an attempt at a pull request to support the ignore_order argument. #15219

I will hold off on from_union until there is agreement on the API change.

jreback pushed a commit that referenced this issue Feb 22, 2017
xref #13410 (ignore_order portion)

Author: Justin Solinsky <justinsolinsky@Justins-MacBook-Pro.local>

Closes #15219 from js3711/GH13410-ENHunion_categoricals and squashes the following commits:

e9d00de [Justin Solinsky] GH15219 Documentation fixes based on feedback
d278d62 [Justin Solinsky] ENH union_categoricals supports ignore_order GH13410
9b827ef [Justin Solinsky] ENH union_categoricals supports ignore_order GH13410
@jreback
Contributor Author

jreback commented Feb 22, 2017

So to close this issue, I think we need to add Categorical.from_union as a shortcut (the last item on the list).

AnkurDedania pushed a commit to AnkurDedania/pandas that referenced this issue Mar 21, 2017
xref pandas-dev#13410 (ignore_order portion)

Author: Justin Solinsky <justinsolinsky@Justins-MacBook-Pro.local>

Closes pandas-dev#15219 from js3711/GH13410-ENHunion_categoricals and squashes the following commits:

e9d00de [Justin Solinsky] GH15219 Documentation fixes based on feedback
d278d62 [Justin Solinsky] ENH union_categoricals supports ignore_order GH13410
9b827ef [Justin Solinsky] ENH union_categoricals supports ignore_order GH13410
@mroeschke mroeschke added Enhancement and removed Reshaping Concat, Merge/Join, Stack/Unstack, Explode labels Jun 28, 2020
@mroeschke mroeschke changed the title ENH: union_categorical enhancements ENH: Categorical.from_union May 1, 2021
@mroeschke mroeschke removed this from the Contributions Welcome milestone Oct 13, 2022
@jbrockmendel
Member

I haven't seen a huge demand for this in the 6 years since the last comment, so I lean against adding this to the API.

@jbrockmendel jbrockmendel added the Closing Candidate May be closeable, needs more eyeballs label Feb 12, 2023
@mroeschke
Member

Agreed, closing, but we can reopen if there is interest again.
