New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Rmagic extension #1780
Rmagic extension #1780
Conversation
Great! I'll review this tomorrow and will give you some concrete feedback. A few points will be similar to issues we've seen with the Cython PR but are very easy to address; mainly we need to protect both the test suite and the main file with a couple of checks so that this code is simply skipped in environments without the rpy2 libraries instead of erroring out. |
def Rpull(self, line): | ||
''' | ||
A line-level magic for R that pushes | ||
variables from python to rpy2. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
bad copy-past from Rpush
. I guess you wanted to write the other way around.
Is there an accepted way to write docstrings for magics which include tests? |
@jonathan-taylor, we don't do too much doctesting in IPython, leaving the docstrings mostly to contain illustrative examples rather than contort them into also being tests. so you can feel free to copy/paste from a session into an example block as the numpy docstring standard suggests. In order to avoid it being doctested, simply add the |
Notes | ||
----- | ||
|
||
Beware that R names can have '.' so this is not fool proof. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
rpy2 in places will convert .
to _
to handle this; is that worth doing here? It could print out a message like foo.bar, baz.thing -> foo_bar, baz_thing
. But it's added complexity, so it's fine to leave it if it's a corner case.
Use the command |
Here is a bit more info on how the testing machinery is set up, let us know if any of it isn't clear and we'll be happy to help out. |
Test results for commit 59766df merged into master
Not available for testing: |
Those two failures are just from lacking isolation in the test suite. I'll make a small PR against your branch with the fix, so you don't have to go spelunking in our test machinery. |
names of objects in the python name space to be | ||
assigned to objects of the same name in the | ||
R name space. | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be good to add a little example of usage here. You can copy-paste from an interactive session, and simply add the @skip_doctest
decorator above the @line_magic
one so the docstring doesn't get doctested automatically.
You can get that decorator with: from IPython.testing.skipdoctest import skip_doctest
.
This is looking great, thanks @jonathan-taylor for the work, and everyone else for the review! Jonathan, I think once these few issues are dealt with (and see my out-of-band fix for the test isolation issue), we'll be able to merge it. The cython one should also land soon, so very shortly we'll have built-in cython and R from IPython! |
Test results for commit 59766df merged into master
Not available for testing: |
Added the checks after installing rpy2 onto my 2.7 box. Once you merge the isolation fix I'll run again and we should see the python3 failure go away. |
Test results for commit ea80e2b (can't merge cleanly)
Not available for testing: |
@jonathan-taylor, just so you know: this is producing a merge conflict now, because a test for pygments arrived in the meantime. But the problem is trivial to fix, I'll send you a patch out of band so you don't have to deal with the git conflict. |
Actually, I'm not sure how to create a patch that corresponds to a rebase, but the rebase itself is in this case trivial to do, and it will be useful knowledge in any case :) Just type
It will then warn you of a conflict in
Change that to read:
that is, simply remove the Then type
and you should be done. You'll have to force-push your changes since this rewrites history:
and the PR will update with the new code and will become mergeable again. Sorry for the confusion with the conflict, it was actually caused by my fix :) |
BTW Jonathan, regarding the docstrings, I see we have some extra |
self.output.append(output) | ||
|
||
def flush(self): | ||
value = ''.join([s.decode('utf-8') for s in self.output]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In Python 3, rpy2 writes str (i.e. unicode) to our function, so trying to decode it again will fail. I think IPython.utils.py3compat.str_to_unicode(s, 'utf-8')
should do what you want.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've tried to fix this in a more recent commit but there seems to be some problems with this exception raising in that the notebook raises a unicode type exception here while the console raises the exception I expect:
Notebook
UnicodeDecodeError Traceback (most recent call last)
/Users/jonathantaylor/ipython/IPython/core/interactiveshell.pyc in run_code(self, code_obj)
2722 self.CustomTB(etype,value,tb)
2723 except:
-> 2724 self.showtraceback()
2725 else:
2726 outflag = 0
/Users/jonathantaylor/ipython/IPython/core/interactiveshell.pyc in showtraceback(self, exc_tuple, filename, tb_offset, exception_only)
1713 value, tb, tb_offset=tb_offset)
1714
-> 1715 self._showtraceback(etype, value, stb)
1716 if self.call_pdb:
1717 # drop into debugger
/Users/jonathantaylor/ipython/IPython/zmq/zmqshell.pyc in showtraceback(self, etype, evalue, stb)
505 u'traceback' : stb,
506 u'ename' : unicode(etype.name_),
--> 507 u'evalue' : unicode(evalue)
508 }
509
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe2 in position 132: ordinal not in range(128)'
Console
RMagicError Traceback (most recent call last)
in ()
----> 1 get_ipython().magic(u'R library(nosuchlibrary)')
/Users/jonathantaylor/ipython/IPython/core/interactiveshell.pyc in magic(self, arg_s)
2129 magic_name, _, magic_arg_s = arg_s.partition(' ')
2130 magic_name = magic_name.lstrip(prefilter.ESC_MAGIC)
-> 2131 return self.run_line_magic(magic_name, magic_arg_s)
2132
2133 #-------------------------------------------------------------------------
/Users/jonathantaylor/ipython/IPython/core/interactiveshell.pyc in run_line_magic(self, magic_name, line)
2055 args.append(sys._getframe(stack_depth).f_locals)
2056 with self.builtin_trap:
-> 2057 result = fn(*args)
2058 return result
2059
/Users/jonathantaylor/ipython/IPython/extensions/rmagic.pyc in R(self, line, cell)
/Users/jonathantaylor/ipython/IPython/core/magic.pyc in (f, *a, **k)
188 # but it's overkill for just that one bit of state.
189 def magic_deco(arg):
--> 190 call = lambda f, *a, **k: f(*a, **k)
191
192 if callable(arg):
/Users/jonathantaylor/ipython/IPython/extensions/rmagic.pyc in R(self, line, cell)
330 if line_mode:
331 for line in code.split(';'):
--> 332 text_result, result = self.eval(line)
333 text_output += text_result
334 if text_result:
/Users/jonathantaylor/ipython/IPython/extensions/rmagic.pyc in eval(self, line)
107 except (ri.RRuntimeError, ValueError) as exception:
108 raise RMagicError(unicode_to_str('parsing and evaluating line "%s". R traceback: "%s"\n' %
--> 109 (line, str_to_unicode(exception.message, 'utf-8'))))
110 text_output = self.flush()
111 ri.set_writeconsole(old_writeconsole)
RMagicError: parsing and evaluating line "library(nosuchlibrary)". R traceback: "Error in library(nosuchlibrary) :
there is no package called ‘nosuchlibrary’
"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's the 'smart' quotes on ‘nosuchlibrary’
in the error message that is evidently messing it up.
Ultimately, this is a bug in rpy2: it should be safe to call unicode()
on an exception, as we are doing. I've filed an issue about it. But at the same time, IPython should protect against that failure, so I've opened an issue for us as well. I'll be sure to get it fixed before the next release, so don't worry about the extra exception.
Thanks for the hack. It seems to work now, at least for this example -- I just added an addition call to np.asarray (which is what self.Rconverter defaults to presently). In [7]: datapy= np.array([(1, 2.9, 'a'), (2, 3.5, 'b'), (3, 2.1, 'c')], In [8]: %R -i datapy In [9]: %%R -o datar In [10]: datar |
Since these seem to be a bit subtle, could you perhaps add some of those cases to the test suite? That will ensure that the problem doesn't reappear later on a refactoring or other changes, for one thing... |
After playing with the structured array conversion a little, I realized that this might try to convert almost every return value to a structured array which doesn't seem ideal as structured arrays are harder to work with than ordinary ndarrays (even when all fields have the same dtype). So, I've added the option to try to convert things to structured arrays, with a -d flag to R and a -d flag to Rpull. I also add a new magic, Rget which is like Rpull except it returns the value rather than pushing it to shell.user_ns. |
Tiny error:
you're missing a comma in the previous line. |
In trying to run the code in the notebook, I'm getting an error b/c I don't have the 'lars' R package. Would it be possible to have the illustrative notebook run only with the functionality that's built into R? Since it's meant only to teach the basics of using the magic, I would prefer if it runs on a vanilla R install (esp. since I couldn't find lars even in the fairly extensive ubuntu R package list). |
Other than that, and the unicode issue with exceptions which is not really your problem here, I think this is looking very good. At this point I think we're starting to hit diminishing returns for keeping it in review. Unless you have anything major in mind, I'd be happy to merge it once you can fix the example notebook to load without requiring additional packages and fix the missing comma. How does that sound? |
Sorry for the redundant comment, I was a little bit slower than Fernando .. In [61]: %R not_existing
---------------------------------------------------------------------------
RMagicError Traceback (most recent call last)
...
RMagicError: parsing and evaluating line "not_existing". R traceback: "Error in eval(expr, envir, enclos) : object 'not_existing' not found
"
In [62]: %R print("This does work")
Error in eval(expr, envir, enclos) : object 'not_existing' not found
[1] "This does work" So this seems double, however, if you get more messages, eg an error and a warning, only the first is printed when the line is executed, and all messages are printed the next time you use %R (and in this example it includes usefull information, namely which file wasn't found): In [65]: %R source("/data/PhD/IDEAsvn/trunk/R/ona_processing.R")
...
RMagicError: parsing and evaluating line "source("/data/PhD/IDEAsvn/trunk/R/ona_processing.R")". R traceback: "Error in file(filename, "r", encoding = encoding) :
cannot open the connection
"
In [66]: %R print("This does work")
Error in file(filename, "r", encoding = encoding) :
cannot open the connection
In addition: Warning message:
In file(filename, "r", encoding = encoding) :
cannot open file '/data/PhD/IDEAsvn/trunk/R/ona_processing.R': No such file or directory
[1] "This does work" |
Ah, you also saw the error thing. So the first part of my comment is resolved, but the second part not yet. Now you don't get the warning at all. |
For now, I've included the flushed error message in stdout. We now see the stderr twice, once in the exception and once in the flushed stdout. Maybe capturing stderr separately would solve this, but at least the message is intercepted. In [2]: %R source("nothing.R")RMagicError Traceback (most recent call last) /Users/jonathantaylor/ipython/IPython/core/interactiveshell.pyc in magic(self, arg_s) /Users/jonathantaylor/ipython/IPython/core/interactiveshell.pyc in run_line_magic(self, magic_name, line) /Users/jonathantaylor/ipython/IPython/extensions/rmagic.pyc in R(self, line, cell) /Users/jonathantaylor/ipython/IPython/core/magic.pyc in (f, _a, *_k) /Users/jonathantaylor/ipython/IPython/extensions/rmagic.pyc in R(self, line, cell) /Users/jonathantaylor/ipython/IPython/extensions/rmagic.pyc in eval(self, line) RMagicError: parsing and evaluating line "source("nothing.R")". In [3]: %R print('this works') |
Test results for commit 3ffc166 merged into master
Not available for testing: |
@jonathan-taylor, please disregard the previous failures, they were due to a stale cache on my hard drive. Rerunning the tests now. |
Test results for commit 3ffc166 merged into master
Not available for testing: |
Awesome, I just ran the tests (see report above) and the notebook and everything looks solid. @jonathan-taylor, thanks so much for all your patience with this! This has 31 commits and over 50 comments, so it was one of those 'thick' PRs. But I think it's all for the better, and we'll be giving people a really solid tool from the very start. Merging now. |
Rmagic extension to use R (the statistical package) seamlessly from IPython. The rmagic extension allows R inline code as well as cell level magics. An example notebook is provided in docs/examples/notebooks/rmagic_extension.ipynb to demonstrate its usage. Main points: 1) Allows capture of plots to R via inline png plots (like --pylab inline) 2) Allows capture of R's stdout() connection to the notebook 3) Allows simple push/pull for array data to/from R (via rpy2) with copy only on push to R -- this seems necessary.
Rmagic extension to use R (the statistical package) seamlessly from IPython. The rmagic extension allows R inline code as well as cell level magics. An example notebook is provided in docs/examples/notebooks/rmagic_extension.ipynb to demonstrate its usage. Main points: 1) Allows capture of plots to R via inline png plots (like --pylab inline) 2) Allows capture of R's stdout() connection to the notebook 3) Allows simple push/pull for array data to/from R (via rpy2) with copy only on push to R -- this seems necessary.
I've added an rmagic extension that allows R inline code as well as cell level magics. I've also included in docs/examples/notebooks/rmagic_extension.ipynb to demonstrate its usage.
Main points:
Allows capture of plots to R via inline png plots (like --pylab inline)
Allows capture of R's stdout() connection to the notebook
Allows simple push/pull for array data to/from R (via rpy2) with copy only on push to R -- this seems necessary.