Series Changes breaks rpy2? #5698

Closed
janschulz opened this Issue Dec 13, 2013 · 63 comments

Contributor

janschulz commented Dec 13, 2013

It seems that the changes to Series break the data conversion to R: running this notebook no longer works with a dev version from last week:
https://gist.github.com/kevindavenport/7771325/raw/87ab5603f406729c6a3866f95af9a1ebfedcf619/Mahalanobis_Outliers.ipynb

The resulting error is this:

#xydata=pandas.DataFrame(...)
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-38-74fcaa767ca0> in <module>()
----> 1 get_ipython().run_cell_magic(u'R', u'-i xydata,xycols # list object to be transferred to python here', u'install.packages("ggplot2") # Had to add this for some reason, shouldn\'t be necessary\nlibrary(ggplot2)\ndf = data.frame(xydata)\nnames(df) <- c(xycols)\nprint(head(df))\nplot = ggplot(df, aes(x = X, y = Y)) + \ngeom_point(alpha = .8, color = \'dodgerblue\',size = 5) +\ngeom_point(data=subset(df, Y >= 6.7 | X >= 4), color = \'red\',size = 6) +\ntheme(axis.text.x = element_text(size= rel(1.5),angle=90, hjust=1)) +\nggtitle(\'Distance Pairs with outliers highlighted in red\')\nprint(plot)')

C:\portabel\Python27\lib\site-packages\IPython\core\interactiveshell.pyc in run_cell_magic(self, magic_name, line, cell)
   2141             magic_arg_s = self.var_expand(line, stack_depth)
   2142             with self.builtin_trap:
-> 2143                 result = fn(magic_arg_s, cell)
   2144             return result
   2145 

C:\portabel\Python27\lib\site-packages\IPython\extensions\rmagic.py in R(self, line, cell, local_ns)

C:\portabel\Python27\lib\site-packages\IPython\core\magic.pyc in <lambda>(f, *a, **k)
    191     # but it's overkill for just that one bit of state.
    192     def magic_deco(arg):
--> 193         call = lambda f, *a, **k: f(*a, **k)
    194 
    195         if callable(arg):

C:\portabel\Python27\lib\site-packages\IPython\extensions\rmagic.py in R(self, line, cell, local_ns)
    585                     except KeyError:
    586                         raise NameError("name '%s' is not defined" % input)
--> 587                 self.r.assign(input, self.pyconverter(val))
    588 
    589         if getattr(args, 'units') is not None:

C:\portabel\Python27\lib\site-packages\rpy2\robjects\functions.pyc in __call__(self, *args, **kwargs)
     84                 v = kwargs.pop(k)
     85                 kwargs[r_k] = v
---> 86         return super(SignatureTranslatedFunction, self).__call__(*args, **kwargs)

C:\portabel\Python27\lib\site-packages\rpy2\robjects\functions.pyc in __call__(self, *args, **kwargs)
     29 
     30     def __call__(self, *args, **kwargs):
---> 31         new_args = [conversion.py2ri(a) for a in args]
     32         new_kwargs = {}
     33         for k, v in kwargs.iteritems():

C:\portabel\Python27\lib\site-packages\rpy2\robjects\pandas2ri.pyc in pandas2ri(obj)
     26                 od[name] = StrVector(values)
     27             else:
---> 28                 od[name] = ro.conversion.py2ri(values)
     29         return DataFrame(od)
     30     elif isinstance(obj, PandasIndex):

C:\portabel\Python27\lib\site-packages\rpy2\robjects\pandas2ri.pyc in pandas2ri(obj)
     49         else:
     50             # converted as a numpy array
---> 51             res = original_conversion(obj)
     52         # "index" is equivalent to "names" in R
     53         if obj.ndim == 1:

C:\portabel\Python27\lib\site-packages\rpy2\robjects\numpy2ri.pyc in numpy2ri(o)
     56             raise(ValueError("Unknown numpy array type."))
     57     else:
---> 58         res = ro.default_py2ri(o)
     59     return res
     60 

C:\portabel\Python27\lib\site-packages\rpy2\robjects\__init__.pyc in default_py2ri(o)
    146         res = rinterface.SexpVector([o, ], rinterface.CPLXSXP)
    147     else:
--> 148         raise(ValueError("Nothing can be done for the type %s at the moment." %(type(o))))
    149     return res
    150 

ValueError: Nothing can be done for the type <class 'pandas.core.series.Series'> at the moment.

I'm not sure if this is something pandas cares about, but even if not, it would be nice to mention it in the release notes.

Contributor

jreback commented Dec 13, 2013

I think they might need to have a slightly different conversion function, can you post to their dev site?

Contributor

jreback commented Dec 16, 2013

ok... we'll make this an API issue here to 'track' it.

Contributor

y-p commented Dec 19, 2013

@jreback, this could be a big deal once 0.13.0 final is released. No word from rpy2?

Moved to 0.13 to make final decision right before release is due. Would be good to preempt
a backlash upon release in one way or another.

This also breaks the docs.

Contributor

jreback commented Dec 19, 2013

@janschulz can u ping rpy2?
see what their schedule is?

Contributor

y-p commented Dec 19, 2013

cc @lgautier, we're anxious to avoid releasing a point release with no mitigation for this.

Contributor

jtratner commented Dec 19, 2013

could be as simple as checking for __array__() method if necessary.

Contributor

y-p commented Dec 19, 2013

Hope it is, but it needs to happen on rpy2's side, no?

Contributor

jtratner commented Dec 20, 2013

@jreback why doesn't Series support the __array_interface__? Was that an explicit choice?

I think we might need to do that to be able to actually work with rpy2.

Contributor

jreback commented Dec 20, 2013

not necessary as __array__ is called

but if u need it explicitly then go for it
didn't seem that hard

I don't have rpy so really can't even try it

Contributor

jreback commented Dec 20, 2013

here is an example

dahlia/wand#65

Contributor

jtratner commented Dec 20, 2013

@janschulz could you post a really simple example? I can't seem to get your notebook to work. I think the fix for rpy2 is actually really simple.

Contributor

jtratner commented Dec 20, 2013

@jreback could we just do:

def __array_interface__(self):
    return self.__array__().__array_interface__()

?

Contributor

jreback commented Dec 20, 2013

it's a property (don't need function call)
but might work

Contributor

janschulz commented Dec 20, 2013

@jtratner That notebook was also my first time working with rpy2. :-) The notebook is from @kevindavenport
@kevindavenport: do you have some more experience with rpy?

Here is a small notebook with three cells:

--- setup ---
%load_ext rmagic
import pandas as pd
df = pd.DataFrame({"x":[1,2,3,4,5], "y":[1,2,3,2,1]})
vals = df.values
cols = df.columns
---
--- using df.values: works ---
%%R -i vals,cols # list object to be transferred to python here
install.packages("ggplot2") # Had to add this for some reason, shouldn't be necessary
library(ggplot2)
df = data.frame(vals)
names(df) <- c(cols)
plot = ggplot(df, aes(x = x, y = y)) + 
geom_point(alpha = .8, color = 'dodgerblue',size = 5)
print(plot)
---
--- df directly: does not work ---
%%R -i df # list object to be transferred to python here
install.packages("ggplot2") # Had to add this for some reason, shouldn't be necessary
library(ggplot2)
df = data.frame(df)
plot = ggplot(df, aes(x = x, y = y)) + 
geom_point(alpha = .8, color = 'dodgerblue',size = 5)
print(plot)
---
Contributor

janschulz commented Dec 20, 2013

BTW: I don't think that adding a property is enough: currently the error happens because numpy2ri checks if the input is an instance of numpy.ndarray:

def numpy2ri(o):
    """ Augmented conversion function, converting numpy arrays into
    rpy2.rinterface-level R structures. """
    if isinstance(o, numpy.ndarray):
        ...  # numpy handling (elided)
    else:
        res = ro.default_py2ri(o)

As the pandas part (pandas2ri) basically just iterates over each column, treats a datetime column differently, and delegates everything else to the numpy code, I think nothing can be done on the pandas side :-( and rpy2 needs to be adjusted :-(

I did a small change:

def numpy2ri(o):
    """ Augmented conversion function, converting numpy arrays into
    rpy2.rinterface-level R structures. """
    if isinstance(o, (numpy.ndarray, pd.Series)):
        ...  # numpy handling (elided)
    else:
        res = ro.default_py2ri(o)

and I got the following error:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-3-3c6069b19c27> in <module>()
----> 1 get_ipython().run_cell_magic(u'R', u'-i df # list object to be transferred to python here', u'install.packages("ggplot2") # Had to add this for some reason, shouldn\'t be necessary\nlibrary(ggplot2)\ndf = data.frame(df)\nplot = ggplot(df, aes(x = x, y = y)) + \ngeom_point(alpha = .8, color = \'dodgerblue\',size = 5)\nprint(plot)')

C:\portabel\Python27\lib\site-packages\IPython\core\interactiveshell.pyc in run_cell_magic(self, magic_name, line, cell)
   2141             magic_arg_s = self.var_expand(line, stack_depth)
   2142             with self.builtin_trap:
-> 2143                 result = fn(magic_arg_s, cell)
   2144             return result
   2145 

C:\portabel\Python27\lib\site-packages\IPython\extensions\rmagic.pyc in R(self, line, cell, local_ns)

C:\portabel\Python27\lib\site-packages\IPython\core\magic.pyc in <lambda>(f, *a, **k)
    191     # but it's overkill for just that one bit of state.
    192     def magic_deco(arg):
--> 193         call = lambda f, *a, **k: f(*a, **k)
    194 
    195         if callable(arg):

C:\portabel\Python27\lib\site-packages\IPython\extensions\rmagic.pyc in R(self, line, cell, local_ns)
    585                     except KeyError:
    586                         raise NameError("name '%s' is not defined" % input)
--> 587                 self.r.assign(input, self.pyconverter(val))
    588 
    589         if getattr(args, 'units') is not None:

C:\portabel\Python27\lib\site-packages\rpy2\robjects\functions.pyc in __call__(self, *args, **kwargs)
     84                 v = kwargs.pop(k)
     85                 kwargs[r_k] = v
---> 86         return super(SignatureTranslatedFunction, self).__call__(*args, **kwargs)

C:\portabel\Python27\lib\site-packages\rpy2\robjects\functions.pyc in __call__(self, *args, **kwargs)
     29 
     30     def __call__(self, *args, **kwargs):
---> 31         new_args = [conversion.py2ri(a) for a in args]
     32         new_kwargs = {}
     33         for k, v in kwargs.iteritems():

C:\portabel\Python27\lib\site-packages\rpy2\robjects\pandas2ri.pyc in pandas2ri(obj)
     26                 od[name] = StrVector(values)
     27             else:
---> 28                 od[name] = ro.conversion.py2ri(values)
     29         return DataFrame(od)
     30     elif isinstance(obj, PandasIndex):

C:\portabel\Python27\lib\site-packages\rpy2\robjects\pandas2ri.pyc in pandas2ri(obj)
     49         else:
     50             # converted as a numpy array
---> 51             res = original_conversion(obj)
     52         # "index" is equivalent to "names" in R
     53         if obj.ndim == 1:

C:\portabel\Python27\lib\site-packages\rpy2\robjects\numpy2ri.py in numpy2ri(o)
     35         if o.dtype.kind in _kinds:
     36             # "F" means "use column-major order"
---> 37             vec = SexpVector(o.ravel("F"), _kinds[o.dtype.kind])
     38             dim = SexpVector(o.shape, INTSXP)
     39             res = ro.r.array(vec, dim=dim)

TypeError: ravel() takes exactly 1 argument (2 given)

As o is a pandas.Series in this case, it seems that its ravel() implementation differs from the numpy.ndarray one :-(

Contributor

janschulz commented Dec 20, 2013

Changing pandas.Series.ravel() to this:

    def ravel(self, order=None):
        return self.values.ravel(order)

will, together with the fix for isinstance above, produce a plot.

So if pandas added the order argument to pandas.Series.ravel(), this would be "bearable" ("Just add this small fix in ...") until rpy2 fixes the code and produces a new release.
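
To illustrate the signature mismatch, here is a minimal sketch using stand-in classes rather than pandas or numpy themselves. `FakeValues`, `SeriesBefore`, and `SeriesAfter` are hypothetical names introduced only for this example. It shows why a `ravel()` without an `order` parameter rejects a call like `o.ravel("F")`, and how forwarding the argument to the wrapped values avoids that.

```python
class FakeValues:
    """Stand-in for the underlying array (hypothetical, not numpy)."""

    def __init__(self, rows):
        self.rows = rows  # list of equal-length rows

    def ravel(self, order="C"):
        if order == "F":
            # column-major: walk down each column in turn
            return [row[j] for j in range(len(self.rows[0])) for row in self.rows]
        # row-major (default)
        return [x for row in self.rows for x in row]


class SeriesBefore:
    """Hypothetical wrapper whose ravel() takes no order argument."""

    def __init__(self, values):
        self.values = values

    def ravel(self):
        return self.values.ravel()


class SeriesAfter(SeriesBefore):
    """Same wrapper, but ravel() forwards the order to the wrapped values."""

    def ravel(self, order="C"):
        return self.values.ravel(order)


values = FakeValues([[1, 2], [3, 4]])

try:
    SeriesBefore(values).ravel("F")
    failed = False
except TypeError:
    # the extra positional argument is rejected
    failed = True

flat = SeriesAfter(values).ravel("F")
```

The design point is only that a wrapper meant to stand in for an array should accept the same optional arguments as the method it delegates to.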

Contributor

jreback commented Dec 20, 2013

@janschulz easy enough....on ravel; you can do a run-time path for numpy2ri?

Contributor

janschulz commented Dec 20, 2013

@jtratner What is a run-time path?

Contributor

jreback commented Dec 20, 2013

run-time patch.....e.g. I presume this is defined somewhere in rpy2, you can overwrite the function on pandas import (not really nice to do, but should work)
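
As a rough illustration of what such a run-time patch means mechanically, the sketch below replaces a function on a stand-in namespace. `fake_module`, `strict_convert`, and `apply_runtime_patch` are hypothetical names; nothing here touches rpy2 or reflects its actual layout.

```python
import types

# Stand-in for a third-party module; hypothetical, not the real rpy2 module.
fake_module = types.SimpleNamespace()


def strict_convert(o):
    """Original behaviour: only plain lists are accepted."""
    if isinstance(o, list):
        return ("vector", o)
    raise ValueError("Nothing can be done for the type %s at the moment." % type(o))


fake_module.convert = strict_convert


def apply_runtime_patch(module):
    """Replace module.convert with a wrapper that also accepts tuples."""
    original = module.convert

    def patched_convert(o):
        if isinstance(o, tuple):
            o = list(o)  # coerce to the type the original understands
        return original(o)

    module.convert = patched_convert
    return original  # returned so the patch can be undone


original = apply_runtime_patch(fake_module)
result = fake_module.convert((1, 2, 3))
```

Every caller of `fake_module.convert` sees the new behaviour after the patch, including callers that never asked for it, which is the main reason this technique is contentious.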

Contributor

jreback commented Dec 20, 2013

@janschulz ok...ravel fix is in....give another try

I have experience with rpy, how can I be of service?

Contributor

y-p commented Dec 20, 2013

@jreback , you're not suggesting fixing by monkey patching rpy2 during numpy import are you?
This isn't ruby you know. :)

Contributor

jreback commented Dec 20, 2013

@y-p where exactly is numpy2ri defined in rpy2 right? (that IS what I am suggesting)

Contributor

jreback commented Dec 20, 2013

@kevindavenport

in 0.13 (releasing imminently), Series is no longer a sub-class of ndarray, but rather of the NDFrame (same as DataFrame for example).

so apparently there is some isinstance detection going on with rpy which fails.
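
The effect of that class-hierarchy change on an isinstance check can be sketched with stand-in classes. `NDArray`, `SeriesOld`, and `SeriesNew` are hypothetical models for illustration, not the real numpy or pandas types.

```python
class NDArray:
    """Stand-in for numpy.ndarray (hypothetical)."""


class SeriesOld(NDArray):
    """Models a Series that subclasses the array type."""


class SeriesNew:
    """Models a Series that wraps an array instead of subclassing it."""

    def __init__(self):
        self.values = NDArray()


def is_supported(o):
    # Same style of check as an isinstance test against the array type.
    return isinstance(o, NDArray)


old_ok = is_supported(SeriesOld())
new_ok = is_supported(SeriesNew())
values_ok = is_supported(SeriesNew().values)
```

In this model the wrapped `values` still pass the check while the wrapper itself does not, which matches the observation earlier in the thread that passing `df.values` works but passing `df` does not.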

Contributor

y-p commented Dec 20, 2013

We need either a way to modify pandas to keep it compatible or a reasonable PR submitted
against rpy2 which we can point users at, if we can't get something merged by launch.

As much as I'm in a position to urge against it: monkey-patching another library is not an
acceptable solution. Even shipping broken is preferable.

Contributor

jtratner commented Dec 20, 2013

I'm pretty sure I have a fix, I just didn't have an example to work with. Should be a pretty trivial fix too.

Contributor

janschulz commented Dec 20, 2013

The patch is easy:

--- rpy2\robjects\numpy2ri.py   Fri Dec 20 17:35:43 2013
+++ rpy2\robjects\numpy2ri.py.new   Fri Dec 20 17:42:42 2013
@@ -3,6 +3,7 @@
 import rpy2.rinterface as rinterface
 from rpy2.rinterface import SexpVector, INTSXP
 import numpy
+import pandas

 from rpy2.robjects.vectors import DataFrame, Vector, ListVector

@@ -26,7 +27,7 @@
 def numpy2ri(o):
     """ Augmented conversion function, converting numpy arrays into
     rpy2.rinterface-level R structures. """
-    if isinstance(o, numpy.ndarray):
+    if isinstance(o, (numpy.ndarray, pandas.Series)):
         if not o.dtype.isnative:
             raise(ValueError("Cannot pass numpy arrays with non-native byte orders at the moment."))

Source is here: https://bitbucket.org/lgautier/rpy2/src/e2d25d5bd6254c5e381d87c46c90cac30f18b5b2/rpy/robjects/numpy2ri.py?at=version_2.4.x

I don't have hg, so I'm not currently able to do a PR.

Contributor

jtratner commented Dec 20, 2013

I'd change it to:

obj = getattr(obj, 'array', obj)

That way there's no pandas dep and works with anything that supports the
numpy array interface.

Contributor

jtratner commented Dec 20, 2013

Edit: __array__() is a function.

Contributor

janschulz commented Dec 20, 2013

The below patch "works", but rpy2 is a beast I don't really want to touch anymore (no git, fails to compile...)

I would suggest that pandas adds a note to the release file with the suggested patch:

--- rpy2\robjects\numpy2ri.py   Fri Dec 20 17:35:43 2013
+++ rpy2\robjects\numpy2ri.py.new   Fri Dec 20 17:42:42 2013
 def numpy2ri(o):
     """ Augmented conversion function, converting numpy arrays into
     rpy2.rinterface-level R structures. """
-    if isinstance(o, numpy.ndarray):
+    if hasattr(o, "__array__"):
         if not o.dtype.isnative:
             raise(ValueError("Cannot pass numpy arrays with non-native byte orders at the moment."))
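
The idea behind checking for `__array__` instead of a concrete class can be sketched as follows. The classes and the `convert` function are hypothetical stand-ins written for this example; they are not rpy2's code and they do not import numpy or pandas.

```python
class NDArray:
    """Stand-in for numpy.ndarray (hypothetical)."""

    def __init__(self, data):
        self.data = list(data)

    def __array__(self):
        return self


class SeriesLike:
    """Stand-in container that wraps an NDArray instead of subclassing it."""

    def __init__(self, data):
        self._values = NDArray(data)

    def __array__(self):
        return self._values


def convert(o):
    """Dispatch on the presence of __array__ rather than on the class."""
    if hasattr(o, "__array__"):
        arr = o.__array__()  # __array__ is a method, so it has to be called
        return ("array", arr.data)
    return ("scalar", o)


from_array = convert(NDArray([1, 2, 3]))
from_series = convert(SeriesLike([1, 2, 3]))
from_other = convert(42)
```

With this kind of dispatch the converter needs no import of the wrapping library, and any object that can hand back an array is accepted.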
Contributor

y-p commented Dec 26, 2013

Updated the release notes to reference this issue. Bumping the issue to 0.14 for tracking.

Contributor

lgautier commented Dec 27, 2013

@janschulz : rpy2 has a build bot for continuous integration, and according to it, it is both building and passing all unit tests. https://drone.io/bitbucket.org/lgautier/rpy2/15 .

Regarding the usage of a specific version control system, be very wary of using Python. I heard that the primary repository is not Git. ;-)

Contributor

lgautier commented Dec 27, 2013

@y-p I share your general reluctance to monkey-patch. There might be justifiable usage of it, if documented.

Ideally there are no backward-incompatible changes within a release series for rpy2 (current being 2.3.x). This is a problem for pandas, but not unexpected when pandas is itself making changes (one cannot expect third-party software to be forward-compatible with then-unannounced changes).

I am fine with having the next series of rpy2 (2.4.x) compatible with the latest release of pandas, but if you really wish to have pandas work with rpy2-2.3.x monkey patching is going to be the only way. You could test the rpy2 version, and if older than 2.4.0, monkey patch with something like:

import warnings

import rpy2

# Note: this is a plain string comparison of version numbers.
if rpy2.__version__ < '2.4':
    msg = """
You are using rpy2 version %s, and >= 2.4.0 is recommended.
We are applying a patch to get you going, but upgrading rpy2 is strongly recommended.

Note: at the time pandas is released, rpy2-2.4.0 is not yet released.
Installing the latest automated "good" build from the development version can be done with:

pip install https://drone.io/bitbucket.org/lgautier/rpy2/files/dist/rpy2-2.4.0.tar.gz --upgrade

"""
    warnings.warn(msg % rpy2.__version__)
    # monkey patch here
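
One caveat with a version test like this: comparing version strings is lexicographic, so a hypothetical future "2.10.0" would sort before "2.4". A minimal sketch of a numeric comparison follows; `version_tuple` and `needs_patch` are hypothetical helpers written for this example and only handle the leading numeric components.

```python
import re


def version_tuple(version):
    """Turn '2.3.8' into (2, 3, 8); stops at the first non-numeric part."""
    parts = []
    for piece in version.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    return tuple(parts)


def needs_patch(version, minimum=(2, 4)):
    """True when the version is numerically older than the minimum."""
    return version_tuple(version) < minimum


string_compare = "2.10.0" < "2.4"      # lexicographic comparison
tuple_compare = needs_patch("2.10.0")  # numeric comparison
```

For the 2.3.x versus 2.4.x case discussed here both approaches agree; the difference only shows up once a component reaches two digits.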
Contributor

y-p commented Dec 27, 2013

@lgautier glad to see the patch merged. When is the next release out?

Note your CI service seems to be testing against pypi pandas, rather than git master.
Your point about breaking changes breaking stuff is well taken, though where possible
making the transition easier is desirable.
You might choose to release a new 2.3.x release that works across pandas versions,
but naturally that's your call re your maintenance policy for rpy2.

Documenting the monkey patch as a workaround for users to apply knowingly is totally fine.
I still believe pandas should not patch other libraries on import and though everything has
exceptions (libraries that employ monkey-patching, especially) I don't think this is one of them.

Contributor

lgautier commented Dec 27, 2013

@y-p I hoped to have a beta around now, but I have been tremendously busy with other things. The current rpy2-2.4.x does not yet have everything I hoped to have, but is rather robust. Just missing stuff.

Testing against released pandas is to keep the parameters under control. The development of pandas and rpy2 is not specifically coordinated, and I am busy enough with keeping rpy2's own codebase passing the tests without the need for a fast moving pandas target. As soon as you release pandas on pypi, the rpy2 CI will pick it up (and force the rpy2 devs to look at any problem arising from that). In the meantime, we might have to rely on communications such as the one we are having (an alternative would be to create a pandas-dev automated build, but drone does not seem to allow multiple build bots per repository).

Patching older rpy2 releases happens as time permits, but pull requests (preferably with additional unit tests where relevant) are welcome.

Contributor

jtratner commented Dec 27, 2013

@lgautier if @janschulz keeps having trouble I can submit a pull request on bitbucket, because I have everything installed (just moving right now so I don't have a lot of free time).

@y-p @jreback and other pandas peeps - Maybe we should consider making up a script that tests packages that we know depend on pandas so we can either make changes to pandas or notify those devs so they can work on a solution pre-release.
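
A script along those lines could be as small as the sketch below. It is only an outline under stated assumptions: the command table is a placeholder (the two entries simply exit with status 0 and 1), and a real setup would substitute each downstream project's own test invocation. `run_downstream_checks` is a hypothetical helper name.

```python
import subprocess
import sys


def run_downstream_checks(commands):
    """Run each named command and record whether it exited with status 0."""
    results = {}
    for name, argv in commands.items():
        completed = subprocess.run(
            argv, stdout=subprocess.PIPE, stderr=subprocess.PIPE
        )
        results[name] = completed.returncode == 0
    return results


# Placeholder commands; a real setup would invoke each project's test suite.
demo_commands = {
    "passing-package": [sys.executable, "-c", "import sys; sys.exit(0)"],
    "failing-package": [sys.executable, "-c", "import sys; sys.exit(1)"],
}

if __name__ == "__main__":
    for name, ok in run_downstream_checks(demo_commands).items():
        print(name, "ok" if ok else "FAILED")
```

Running something like this against a pre-release build would surface breakage in dependent packages before the release rather than after it.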

Contributor

y-p commented Dec 27, 2013

@lgautier, I agree on most points (especially your time being your own) but we would like
to ensure rpy2 is green prior to the release and we'll gladly help take care of that.

@jtratner, would you mind handling a backport of the patch + PR for the 2.3.x branch of rpy2 + testing
rpy2 suite against pandas master? If you're too busy, I'll volunteer myself.

Contributor

jtratner commented Dec 27, 2013

@y-p okay, I'll try to get that done tonight.

Contributor

y-p commented Dec 27, 2013

Thanks! let me know if you need to hand it off after all.

Contributor

jtratner commented Dec 28, 2013

Update: fix is relatively simple now that we pushed the ravel PR. I'm just checking to make sure that it ends up with the correct results and writing up some additional test cases. I'm not totally clear on the correct behavior, so I'm switching back and forth to make sure nothing broke.

Contributor

jtratner commented Dec 28, 2013

(and I'm on the same page with @janschulz 's one liner fix, with a small modification to make it work without pandas) - so is the goal to have a monkey patch in a gist that we can point to for 0.13? An actual function within pandas?

Just not clear how we handle incompatibilities like this and making it less painful for people with legacy setups.

Contributor

y-p commented Dec 28, 2013

2.4.x is already patched. @lgautier asked for a PR to release a new minor release of 2.3.x, so users
can upgrade with little disruption.

The release notes already point to this issue, and users can find the fix and discussion here
if they look up rpy2 there. I think that's enough, taken all together.

Contributor

jtratner commented Dec 28, 2013

Great

Contributor

y-p commented Dec 28, 2013

... I'm assuming that PR will come from you? :)

Contributor

jtratner commented Dec 28, 2013

Yes

Contributor

y-p commented Jan 24, 2014

0.13.0 is out, the dev version of rpy2 has been fixed. I hope @jtratner submitted that PR.

closing.

y-p closed this Jan 24, 2014

Contributor

lgautier commented Jan 25, 2014

He did. I just merged it (rpy2 branch version_2.3.x, and grafted onto version_2.4.x).

Contributor

y-p commented Jan 25, 2014

excellent. Thank you both.

Contributor

y-p commented Jan 25, 2014

Is it on pypi yet?

Contributor

lgautier commented Jan 25, 2014

Not yet. Probably some time over the week-end.
(Drone is currently looking at the candidate rpy2-2.3.9: https://drone.io/bitbucket.org/lgautier/rpy2/19 ).

Contributor

lgautier commented Jan 25, 2014

Looking fine with Python 2.7, but causing segfault with Python 3.3 and numpy 1.7.1 (https://drone.io/bitbucket.org/lgautier/rpy2/20). I cannot reproduce the segfault locally though.

Any chance someone else could try out ?

Contributor

y-p commented Jan 25, 2014

Rerun the build, if it consistently segfaults, I'll take a look.

Contributor

jtratner commented Jan 25, 2014

Thanks for putting that fix in @lgautier!

Contributor

lgautier commented Jan 25, 2014

@y-p The build on drone.io is consistently ending with a segfault on Python 3.3, while the same code does not locally. I'll hold the release of rpy2-2.3.9 until at least one of the 2 things happens: others report to have it working fine, or the problem on drone.io is identified (and fixed).

Contributor

y-p commented Jan 25, 2014

That makes good sense. I'll try to repro on my box.

Contributor

y-p commented Jan 25, 2014

I can reproduce the segfault, and I can reproduce it prior to @jtratner's commit,
on 3.3 with numpy 1.7.1. namely with rpy2-419ca01.

~/src/rpy2/ λ nosetests                        
ENotImplementedError: Device activation not implemented.
Closing device.
NotImplementedError: Device closing not implemented.
--> skipping PyMem_Free(((PyGrDevObject *)self)->grdev) 
Closing device.
--> skipping PyMem_Free(((PyGrDevObject *)self)->grdev) 
.Closing device.
--> skipping PyMem_Free(((PyGrDevObject *)self)->grdev) 
.Closing device.
--> skipping PyMem_Free(((PyGrDevObject *)self)->grdev) 
.Closing device.
--> skipping PyMem_Free(((PyGrDevObject *)self)->grdev) 
.Closing device.
--> skipping PyMem_Free(((PyGrDevObject *)self)->grdev) 
.Closing device.
--> skipping PyMem_Free(((PyGrDevObject *)self)->grdev) 
.NotImplementedError: Device activation not implemented.
Closing device.
NotImplementedError: Device closing not implemented.
--> skipping PyMem_Free(((PyGrDevObject *)self)->grdev) 
Closing device.
--> skipping PyMem_Free(((PyGrDevObject *)self)->grdev) 
.NotImplementedError: Device activation not implemented.
Closing device.
NotImplementedError: Device closing not implemented.
--> skipping PyMem_Free(((PyGrDevObject *)self)->grdev) 
Closing device.
--> skipping PyMem_Free(((PyGrDevObject *)self)->grdev) 
.NotImplementedError: Device activation not implemented.
Closing device.
NotImplementedError: Device closing not implemented.
--> skipping PyMem_Free(((PyGrDevObject *)self)->grdev) 
Closing device.
--> skipping PyMem_Free(((PyGrDevObject *)self)->grdev) 
.Closing device.
--> skipping PyMem_Free(((PyGrDevObject *)self)->grdev) 
.F.NotImplementedError: Device mode not implemented.
NotImplementedError: Device mode not implemented.
.E[1]    29630 segmentation fault (core dumped)  nosetests

You should make sure the CI output gives you more information.

Contributor

lgautier commented Jan 25, 2014

It might well be the case: there was no build on drone for the branch version_2.3.x prior to the merge of the pull request.

Now this is quite odd:

  • version 2.3.8 has been around for quite some time, and the only change is the pull request.
  • the message you report corresponds to (broken) code that should not be in version_2.3.x. Did you try switching to the right branch?
hg clone -b version_2.3.x https://bitbucket.org/lgautier/rpy2;
cd rpy2

or

hg clone https://bitbucket.org/lgautier/rpy2;
cd rpy2;
hg update version_2.3.x
  • what version of R are you building and trying this with ?
Contributor

y-p commented Jan 25, 2014

I probably used the wrong version of nose, I compiled for 3.3 but invoked the py2 nose.
That did produce a segfault, but honestly I can't say what did it.

Trying again:

~/src/rpy2/ λ python3 -m rpy2.tests
rpy2 version: 2.3.9   
built against R version: 3-0.2--63987
.................................................................................................................................................................................................................................................................................................E...........................................................
======================================================================
ERROR: testPandas2ri (rpy2.robjects.tests.testPandasConversions.PandasConversionsTestCase)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/usr/lib64/python3.3/site-packages/rpy2/robjects/tests/testPandasConversions.py", line 74, in testPandas2ri
    pandas_df = robjects.conversion.ri2py(rdataf)
  File "/usr/lib64/python3.3/site-packages/rpy2/robjects/pandas2ri.py", line 63, in ri2pandas
    raise NotImplementedError("Conversion from rpy2 DataFrame to pandas' DataFrame")
NotImplementedError: Conversion from rpy2 DataFrame to pandas' DataFrame

----------------------------------------------------------------------
Ran 349 tests in 4.889s

FAILED (errors=1)

... but no segfault. This is with db6c132, the current tip of the version_2.3.x branch.
on 64bit fedora 20.

That's all I have.

Contributor

jtratner commented Jan 25, 2014

For what it's worth, I get a bunch of segfaults on the released version
with pandas 0.12 installed.

Contributor

lgautier commented Jan 25, 2014

This must be version-specific somewhere, making it OK on some machine (my computer, @y-p 's box) but not others (drone's VM, your machine).
I cannot seem to get a segfault with pandas 0.12.0 here.

C compiler ? R version ? something else ?

(The issue tracker for rpy2 on bitbucket might be a better place to follow up on this)

Contributor

y-p commented Jan 25, 2014

I've seen issues raised by differences between debian and ubuntu libc. I run fedora, what do you and
drone use?

Contributor

lgautier commented Jan 26, 2014

Ubuntu (me 13.10, not sure about the version used by drone)
