R data frame #2443
Great! Can't look at it right now, but pinging @jonathantaylor and @wesm who may actually care quite a bit about this functionality and might have some feedback.
Convert the dtype of the resulting array so that 'object' dtypes get converted into strings. This counters the fact that strings get converted to 'object' when constructing the DataFrame.
Yes - another way is to change the way that pandas itself deals with strings. This fix is needed because strings get the 'object' data type in the DF representation. I tried asking on the pydata mailing list why strings get cast into 'objects' in DataFrame class instances, but haven't gotten an answer yet. I should try asking again.
OK - here's the latest from the pydata mailing list: https://groups.google.com/forum/?fromgroups=#!topic/pydata/WU8Pq_e881k It seems that changing pandas behavior is a non-starter, but it also sounds like this hack will not be necessary once things get sorted out (presumably at the rpy2 level? About 95% of what Wes said there went over my head). For the time being, I am using this PR for my own use-cases, but I am not sure this should be merged into ipython itself.
Hmm, I'm not really sure how to handle this issue. On the one hand this seems like an odd bit of code to add to IPython, but on the other hand it seems necessary for some work with strings. I'm also concerned that non-string objects might also be in the pandas DataFrame, causing the NumPy cast to go awry. The good news is that it's also relatively simple to replace the pyconverter in the RMagics class at runtime. A simple custom extension could, for example, also provide this functionality:

    def converter(x):
        ...
        # as before
        ...

    def load_extension(ip):
        """Load rmagic extension and inject a custom Python -> R converter."""
        ip.extension_manager.load_extension('rmagic')
        ip.magics_manager.registry['RMagics'].pyconverter = converter
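The concern about non-string objects could be checked before any cast is attempted. The following is only a minimal sketch of such a check; `non_string_object_columns` is a hypothetical helper name, not part of IPython, rmagic, or pandas.

```python
import pandas as pd


def non_string_object_columns(df):
    """Return the names of columns that hold Python objects other than str.

    Hypothetical helper for illustration only. A column is flagged when its
    NumPy representation has 'object' dtype and at least one value is not a
    str, so a blanket object -> string cast would silently stringify it.
    Note that missing values (e.g. float NaN) in a string column are
    flagged too.
    """
    flagged = []
    for name in df.columns:
        values = df[name].to_numpy()
        if values.dtype != object:
            # Numeric, boolean, datetime, ... columns are not affected.
            continue
        if any(not isinstance(v, str) for v in values):
            flagged.append(name)
    return flagged


df = pd.DataFrame({
    "s": ["a", "b"],            # pure strings: safe to cast
    "mixed": ["a", {"k": 1}],   # contains a dict: cast would stringify it
    "n": [1.0, 2.0],            # numeric: untouched
})
print(non_string_object_columns(df))
```

A converter could call such a check first and either raise or leave the flagged columns alone, rather than assuming every 'object' column is meant to be a string.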
This PR has been inactive for > 2 months. Can we close it and open an issue to track the broader work on this feature? https://github.com/ipython/ipython/wiki/Policy:-Closing-pull-requests
I think that's OK. I am still using this branch myself for my own uses, but
@arokem Please feel free to open an issue to track any thoughts you have on this matter. Thanks!
We are closing this as further discussion/design is needed before moving forward. Here is an issue to track it: #2787
This PR implements a custom Python => R converter. The behavior is exactly as before (a call to np.asarray), except when the Python input is a pandas DataFrame object. In that case, we preprocess the dtype of the resulting struct-array to account for the fact that pandas casts strings into the 'object' dtype. Here we assume that if you got something with 'object' as its dtype, you want it to be a string.
See example here:
http://nbviewer.ipython.org/urls/raw.github.com/arokem/ipython/r_data_frame/docs/examples/notebooks/rmagic_extension.ipynb
This still seems somewhat limited in scope (what if you want the 'object'?), so I would be happy to get feedback on this.
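For readers without access to the linked notebook, the conversion described above might look roughly like the sketch below. This is not the code from the PR; it is an illustration under the stated assumption that every 'object' column should become a string, and `dataframe_to_struct_array` is a hypothetical helper name.

```python
import numpy as np
import pandas as pd


def dataframe_to_struct_array(df):
    """Build a NumPy structured array from a DataFrame (hypothetical helper).

    Columns whose NumPy representation is 'object' are converted to
    fixed-width unicode strings; all other columns keep their dtype.
    """
    columns = []
    for name in df.columns:
        values = df[name].to_numpy()
        if values.dtype == object:
            # Assumption taken from the PR description: 'object' means
            # "string". Non-string values are stringified with str().
            values = np.array([str(v) for v in values], dtype=str)
        columns.append((str(name), values))

    dtype = [(name, values.dtype) for name, values in columns]
    out = np.empty(len(df), dtype=dtype)
    for name, values in columns:
        out[name] = values
    return out


def converter(x):
    """Python -> array converter: special-case DataFrames, else np.asarray."""
    if isinstance(x, pd.DataFrame):
        return dataframe_to_struct_array(x)
    return np.asarray(x)


df = pd.DataFrame({"name": ["a", "bcd"], "n": [1, 2]})
arr = converter(df)
print(arr.dtype)
```

The limitation raised above still applies to this sketch: a column that is legitimately meant to hold arbitrary Python objects is stringified as well, so a real implementation would probably need an opt-out.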