astype(unicode) does not work as expected #7758
Comments
you can do: what are you doing with this? pandas keeps all string-likes as
jreback added the Unicode label Jul 15, 2014
I have a method that detects whether a column should be treated as a category based on its type and cardinality. Columns that are treated as categories are cast to unicode objects. I know how to work around this issue, but I thought I should report what looked like a bug. Let me know if you need more information.
OK, this could be more informative, but it's fundamentally an issue. This would return a numpy array (and NOT a Series, and that would simply recast, and lose the cast to unicode). I think that is a bit odd though. What do you think should happen?
Ideally, I would have either wanted the cast to work like Python's unicode() function.
Does that make sense in pandas?
@fulmicoton Why do you need to convert to unicode? Do you have things that are convertible to unicode but aren't already converted? Can you give a more detailed example that illustrates why you need to do this? I think I'm just missing something.
This could all be done I think (may need to allow an here's a picture of the internal structure:
jreback added the Dtypes, Enhancement, and Bug labels Jul 15, 2014
jreback added this to the 0.15.0 milestone Jul 15, 2014
@fulmicoton interested in doing a pull request for this?
@cpcloud I just have a piece of code that tries to coerce a bunch of columns marked as categorical into unicode strings. Some of them are already unicode; others were detected as int but have such low cardinality that I want to handle them as categories.
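The use case described above (pick columns by type and cardinality, then coerce them to text) can be sketched roughly as follows. This is a minimal illustration in plain Python 3 with no pandas dependency, where `str` plays the role that `unicode` has on Python 2. The helper names `looks_categorical` and `coerce_categories`, the `max_cardinality` threshold, and the sample data are all hypothetical and are not taken from the code discussed in this thread.

```python
def looks_categorical(values, max_cardinality=10):
    """Hypothetical heuristic: few distinct values, all of them text or int."""
    distinct = set(values)
    type_ok = all(isinstance(v, (str, int)) for v in distinct)
    return type_ok and len(distinct) <= max_cardinality


def coerce_categories(columns, max_cardinality=10):
    """Return a copy of `columns` with categorical-looking columns cast to text."""
    result = {}
    for name, values in columns.items():
        if looks_categorical(values, max_cardinality):
            # Cast element by element, the way Python's own str() would.
            result[name] = [str(v) for v in values]
        else:
            result[name] = list(values)
    return result


# Made-up sample data (assumption, for illustration only).
columns = {
    "city": ["Paris", "Zürich", "Paris"],  # already text
    "rating": [1, 2, 2, 1, 3],             # int with low cardinality
    "price": [9.99, 12.5, 7.25],           # float, left untouched
}
coerced = coerce_categories(columns)
```

Here the integer `rating` column ends up as text while the float `price` column is left alone, which is the element-wise behaviour the report expected from the cast.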
@jreback I'll take a look at that tonight.
@fulmicoton you might want to explore this as well (just merged in): http://pandas-docs.github.io/pandas-docs-travis/categorical.html. Probably not a lot of tests for unicode (but it should work).
fulmicoton added a commit to fulmicoton/pandas that referenced this issue Jul 15, 2014: a92e593
fulmicoton referenced this issue Jul 15, 2014: Closes #7758 - astype(unicode) returning unicode. #7765 (Closed)
Here is the pull request. I didn't have to use infer_dtype, so I hope I didn't do anything wrong.
fulmicoton added a commit to fulmicoton/pandas that referenced this issue Jul 15, 2014: 01d6897
fulmicoton commented Jul 15, 2014
astype(unicode) seems to call str, so the following code raises:
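The snippet and traceback from this comment are not preserved here. As a rough illustration of why calling str on unicode data can fail: on Python 2, `str()` applied to a `unicode` object implicitly encodes it with the ASCII codec and raises `UnicodeEncodeError` on non-ASCII characters. The sketch below imitates that behaviour in Python 3 with an explicit encode. `py2_style_str` is a hypothetical helper, and the assumption that this is the failure mode the comment refers to is mine, not the thread's.

```python
def py2_style_str(text):
    """Mimic Python 2's str(unicode_obj): an implicit ASCII encode (assumption)."""
    return text.encode("ascii")


# Pure-ASCII text survives the round trip.
ascii_ok = py2_style_str("abc")

# Non-ASCII text cannot be encoded with the ASCII codec.
try:
    py2_style_str("é")
    failed = False
except UnicodeEncodeError:
    failed = True
```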