when R process is embedded in a Python process #98
Comments
It's in principle possible but I'm not sure of all the mechanics. There are two issues to consider:
|
It turns out it is much easier than I thought. The only changes are https://github.com/rstudio/reticulate/pull/104/files?w=1 |
I am trying to force reticulate to use a specific python by
But it seems that it would still choose a different Python if the requirements are not met. Is there any way to force the use of a specific Python, and to throw an error instead of automatically choosing another version of Python when the requirements are not met? |
The automatic binding to versions of Python is all based on the python package requirements implied by the R packages loaded. For example: library(keras) Implies not only that I want to use Python but that I also want to use a version of Python that has the keras Python package. Without this behavior the user will get an error when attempting to call functions that in turn call into Keras. If we blindly take the |
Several elements might be conflated together here:
Regarding the former, Python has a C-level function (note: this is obviously hinting toward having the complementary environment variable |
It's unfortunate that there isn't a way provided by R itself to determine if R is already initialized. There are a bunch of environment variables set by R, e.g. In terms of reticulate, no you don't have a dependency on it but reticulate does need to bind to a version of Python and if it's not advised of which one to bind to via something like an environment variable then it may bind to the wrong one. |
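I think that answering the first point is going to be necessary if reticulate and rpy2 are expected to work together. Without this, rpy2 will not know that R is not the embedded language but the embedding language, and will initialize R a second time (leading to serious problems and ultimately a likely segfault).
You are indeed right that setting environment variables will not work too well if child processes are created (rpy2 would fail to initialize R whenever Python's multiprocessing is used - a significant problem). However, having the PID in which R was initialized in the additional information about R's initialization status I was mentioning would solve the problem.
The alternative solution would be to create a common C library implementing a R_IsInitialized() function. The logistics for this seem much more complicated.
Note: Regarding the dynamic loading of R's shared library without initialization, I have an example of this happening right below: import rpy2.rinterface |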
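To make the PID idea concrete, here is a minimal sketch of how an "already initialized" marker could be filtered by process ID. The variable name `EXAMPLE_R_SESSION_INITIALIZED` and the `PID=<pid>` value format are assumptions made for illustration; they are not the protocol that rpy2 or reticulate agreed on.

```python
import os

# Hypothetical variable name and value format, for illustration only.
MARKER = "EXAMPLE_R_SESSION_INITIALIZED"


def mark_initialized(environ=os.environ, pid=None):
    """Record which process initialized the embedded runtime."""
    pid = os.getpid() if pid is None else pid
    environ[MARKER] = f"PID={pid}"


def initialized_in_this_process(environ=os.environ, pid=None):
    """Return True only if the marker was written by this very process.

    A child process inherits the environment variable, but its own PID
    differs from the recorded one, so it knows it still has to initialize.
    """
    pid = os.getpid() if pid is None else pid
    value = environ.get(MARKER, "")
    if not value.startswith("PID="):
        return False
    try:
        return int(value[len("PID="):]) == pid
    except ValueError:
        # Malformed marker: treat the runtime as not initialized.
        return False
```

A child created through multiprocessing inherits the marker but fails the PID comparison, so with a check like this it would not wrongly conclude that the runtime is already initialized in its own process.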
Okay, so it seems like we need to agree on the protocol for providing
embedding information then. Let me know your thoughts on the specifics and
I'll try to implement within reticulate (I agree that environment variables
with a PID filter is the best way)
…On Sun, Oct 1, 2017 at 12:43 PM, Laurent Gautier ***@***.***> wrote:
I think that answering the first point is going to be necessary if
reticulate and rpy2 are expected to work together. Without this, rpy2
will not know that R is not the embedded language but the embedding
language, and will initialize R a second time (leading to serious problems
and ultimately a likely segfault).
You are indeed right that setting environment variables will not work too
well if child processes are created (rpy2 would fail to initialize R
whenever Python's multiprocessing is used - a significant problem).
However, having the PID in which R was initialized in the additional
information about R's initialization status I was mentioning would solve
the problem.
The alternative solution would be to create a common C library
implementing a R_IsInitialized() function. The logistics for this seem
much more complicated.
Note: Regarding the dynamic loading of R's shared library without
initialization, I have an example of this happening right below:
import rpy2.rinterface
|
I can speak best about what rpy2 (as a language embedding R) would need. After thinking a bit, For Python, this could then be essentially the same in a variable called |
On the reticulate side I also need to know the location of the Python binary which hosts the session. This is because reticulate dynamically binds to the Python shared library and needs the location of Python to discover the location of the shared library (this is the |
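As a rough illustration of the information involved, the hosting Python process can report where its binary lives and what its build configuration says about the shared library. This is only a sketch using the standard `sys` and `sysconfig` modules; the helper name `describe_host_python` is hypothetical, this is not how reticulate performs its discovery, and `LIBDIR` or `LDLIBRARY` may be `None` on some platforms.

```python
import sys
import sysconfig


def describe_host_python():
    """Collect what this interpreter can report about its own location."""
    return {
        # Path of the Python binary hosting the session.
        "executable": sys.executable,
        # Build-time hints about the shared library; may be None.
        "libdir": sysconfig.get_config_var("LIBDIR"),
        "ldlibrary": sysconfig.get_config_var("LDLIBRARY"),
    }


if __name__ == "__main__":
    for key, value in describe_host_python().items():
        print(f"{key}: {value}")
```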
Hopefully we can make something generic enough to work for all. I would be more than happy to have rpy2 define a couple of variables that make the interoperability with other languages "just work" rather than leave it to the user, but I would like to see with you if something general for Python-R bridges using C can be worked out (today it is reticulate, later PyCall / RCall (Julia's bridges) whenever they realize that they have the same issue, etc...). I know that |
There is a candidate implementation in rpy2 (branch As soon as reticulate defines
import os
import sys
# This won't be needed when rpy2 defines PYTHON_SESSION_INITIALIZED
# (and reticulate uses it).
os.environ['RETICULATE_PYTHON'] = sys.executable
from rpy2.robjects import r
r("""
library(reticulate)
robjects <- import('rpy2.robjects')
robjects$r('R.version')
""")
|
Just added the reticulate side here: b24eb08 Once you define PYTHON_SESSION_INITIALIZED let me know the format and I'll incorporate that into reticulate as well. |
I finally found a bit of time to do it. The environment variable This is in the branch |
Okay, I've implemented support for detecting and using the Python defined in @randy3k I think you should switch to defining |
As soon as I get a confirmation that this is all good I will backport the change to the branch 2.9.x in the rpy2 repository in order to include it in the upcoming release 2.9.1. |
Excellent!! |
FWIW I just tried with rpy2's HEAD in branch
rpy2.robjects.r("""
reticulate::import(sys)
""")
The backtrace is hinting at this happening during reticulate's own initialization:
|
@lgautier with randy3k/rpy2@440322c were you able to get things working? |
I have not tried again since my last comment. |
I am still getting a segfault:
>>> import rpy2.robjects
>>> rpy2.robjects.r('library("reticulate")')
>>> rpy2.robjects.r('reticulate::import("sys")')
>>> rpy2.robjects.r('reticulate::import("sys")')
*** Error in `python': free(): invalid pointer: 0x00007f484af29078 ***
======= Backtrace: =========
/lib/x86_64-linux-gnu/libc.so.6(+0x7908b)[0x7f4853b1108b]
/lib/x86_64-linux-gnu/libc.so.6(+0x82c3a)[0x7f4853b1ac3a]
/lib/x86_64-linux-gnu/libc.so.6(cfree+0x4c)[0x7f4853b1ed2c]
/usr/lib/python3.5/config-3.5m-x86_64-linux-gnu/libpython3.5.so(PyUnicode_InternFromString+0x2a)[0x7f4845a6419a]
/usr/lib/python3.5/config-3.5m-x86_64-linux-gnu/libpython3.5.so(PyType_Ready+0x18c5)[0x7f48458ac4c5]
/usr/lib/python3.5/config-3.5m-x86_64-linux-gnu/libpython3.5.so(PyStructSequence_InitType2+0x181)[0x7f48458b13e1]
/usr/lib/python3.5/config-3.5m-x86_64-linux-gnu/libpython3.5.so(_PyLong_Init+0xb7)[0x7f48458c0867]
/usr/lib/python3.5/config-3.5m-x86_64-linux-gnu/libpython3.5.so(_Py_InitializeEx_Private+0xb4)[0x7f4845a1e5d4]
/usr/local/packages/R/3.4/lib/R/library/reticulate/libs/reticulate.so(_Z13py_initializeRKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEES6_S6_S6_bbS6_+0x400)[0x7f484a70bed0]
/usr/local/packages/R/3.4/lib/R/library/reticulate/libs/reticulate.so(_reticulate_py_initialize+0x1d1)[0x7f484a6fb131]
/usr/local/packages/R/3.4/lib/R/lib/libR.so(+0xd782a)[0x7f4851cd482a]
(...) |
@lgautier Is that with the companion commit in RPy2 you mentioned here: randy3k/rpy2@440322c#commitcomment-25703908 ? |
Yes. |
@lgautier , @jjallaire : please advise if this integration is supposed to work with latest versions of Can see the issue is closed but last report from Laurent was about segfault. With this
I get
Found "RuntimeError: Concurrent access to R is not allowed." here https://bitbucket.org/rpy2/rpy2/issues/182/process_revents-keeps-on-raising but cannot tell what causes this. The error suggests that the same R instance is used which is good, right? Error still happens if I comment calls to Am I not doing the right thing making the joint initialization sequence work? |
context = above My end goal is to load a large time series in python / Ideally there would not be any data conversion going on (I know Sorry for posting here but this might be material for a blog post documenting joint use of I will write that post if you are kind to help out! |
Just a note to point out to #208 (comment) |
reticulate is not working if the R process is embedded in a Python process. In particular, reticulate doesn't run on rice, which is a Python program embedding R. I would also imagine that it will not work with rpy2.
I think the issue is that reticulate fails to initialize a Python instance when another Python instance is running. However, I am not sure how easy or difficult the fix would be.
cf: randy3k/radian#7