rmagic and pandas DataFrame #2787

Closed
arokem opened this issue Jan 15, 2013 · 5 comments

@arokem

arokem commented Jan 15, 2013

One common use of rmagic is to pass data from pandas DataFrames on the Python side to R data frames on the R side. However, pandas stores string columns with the 'object' dtype, so the data frames that arrive on the R side are unusable. There is a way to 'preprocess' the DataFrame into a NumPy structured array before passing it to R:

https://github.com/arokem/ipython/tree/r_data_frame

but there is a concern that other, non-string objects will also get swept up in that conversion. We haven't found a robust solution to this yet. The code in that branch is usable for some use cases, but it probably shouldn't be integrated into IPython for general use as it stands.
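
To make the preprocessing idea concrete, here is a minimal sketch of that kind of conversion. It is not the code from the branch linked above; the function name `dataframe_to_struct_array` is hypothetical, and the choice to raise an error on non-string objects (rather than stringify them) is an assumption about how one might avoid sweeping up other objects.

```python
import numpy as np
import pandas as pd


def dataframe_to_struct_array(df):
    """Hypothetical helper: build a NumPy structured array from a DataFrame.

    Columns that come out as 'object' arrays are cast to fixed-width unicode,
    but only when every value is a str. Assumes unique column names.
    """
    columns = []
    fields = []
    for name in df.columns:
        values = df[name].to_numpy()
        if values.dtype == object:
            # Refuse to stringify arbitrary Python objects; only real
            # string columns are converted.
            if not all(isinstance(v, str) for v in values):
                raise TypeError(f"column {name!r} holds non-string objects")
            values = values.astype(str)
        columns.append(values)
        fields.append((str(name), values.dtype))

    # Allocate the structured array and fill it one field at a time.
    out = np.empty(len(df), dtype=fields)
    for (field_name, _), values in zip(fields, columns):
        out[field_name] = values
    return out


df = pd.DataFrame({"x": [1, 2, 3], "label": ["a", "bb", "c"]})
arr = dataframe_to_struct_array(df)
print(arr.dtype)
```

The strict check trades convenience for safety: a column mixing strings with other objects fails loudly instead of being converted silently.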

Also, see:

https://groups.google.com/d/topic/pydata/WU8Pq_e881k/discussion

for some discussion of this on the pydata mailing list.

@bfroehle
Contributor

Comparison view.

@takluyver
Member

@arokem: Could you have a look at issue #2797, which is also about converting DataFrames? Is that another symptom of the problems you've encountered?

@arokem
Author

arokem commented Jan 17, 2013

I don't think this is exactly the same thing. In particular, applying the preprocessing steps from my r_data_frame branch to the data-frame in this case does not solve this issue.

@takluyver
Member

Thanks. So they're two separate issues relating to DataFrame conversion. I hope we can find a good way to solve both.

@takluyver
Member

Our conversion machinery now uses rpy2's pandas2ri module, or pandas' convert_to_r_dataframe function if pandas2ri isn't available. I've had a look at the code, and I think both of those handle string columns. It also appears to work in my brief tests.

If any further improvements are needed, they should probably be chased in rpy2 or pandas, so I'm closing this, but feel free to reopen it if there's something else that you think IPython should do here.
