`Edit cells > Transform > Language` support for R #1226

Open
Henri-Lo opened this Issue Aug 3, 2017 · 12 comments

Henri-Lo commented Aug 3, 2017

This is a feature-request to add R support in Edit cells > Transform > Language.

Member

ettorerizza commented Aug 3, 2017

Would be a great extension of course, but a lot of work. Maybe with JRI?

Member

thadguidry commented Apr 22, 2018

Better I think for us long term would be Renjin http://docs.renjin.org/en/latest/introduction.html
I also really like the fact that it uses javax.scripting interfaces http://docs.renjin.org/en/latest/library/evaluating.html
I think just having R within OpenRefine would REALLY expand our user base.
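To make the javax.script point concrete, here is a minimal sketch of how an R expression might be evaluated against a single cell value. It assumes a Renjin jar on the classpath registers a script engine under the name "Renjin", as the Renjin docs linked above describe. The class name `RTransformSketch`, the helper names, and exposing the cell as a variable called `value` are illustrative assumptions, not OpenRefine's actual extension API.

```java
import javax.script.ScriptEngine;
import javax.script.ScriptEngineManager;
import javax.script.ScriptException;
import java.util.Optional;

// Hypothetical sketch: not part of OpenRefine or Renjin.
class RTransformSketch {

    /** Looks up a script engine by name; empty if none is registered on the classpath. */
    public static Optional<ScriptEngine> findEngine(String name) {
        return Optional.ofNullable(new ScriptEngineManager().getEngineByName(name));
    }

    /**
     * Evaluates an expression with the cell bound to the script variable "value".
     * Returns empty when the engine is missing or the script yields null.
     */
    public static Optional<Object> transformCell(String engineName, String expression, Object cellValue) {
        Optional<ScriptEngine> engine = findEngine(engineName);
        if (!engine.isPresent()) {
            return Optional.empty();
        }
        try {
            engine.get().put("value", cellValue);
            return Optional.ofNullable(engine.get().eval(expression));
        } catch (ScriptException e) {
            throw new IllegalStateException("Expression failed: " + expression, e);
        }
    }

    public static void main(String[] args) {
        // "Renjin" is the engine name assumed here; without the jar the lookup returns empty.
        Optional<Object> result = transformCell("Renjin", "toupper(value)", "openrefine");
        System.out.println(result.map(Object::toString)
                .orElse("Renjin engine not found on the classpath"));
    }
}
```

Because the lookup goes through the standard `ScriptEngineManager`, the same code path would work for any other JSR-223 engine that happens to be installed; only the engine name and the expression syntax change.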

akbertram commented Apr 23, 2018

I see there are a few language implementations now. Can anyone point me to how these are defined? Would the jython extension be a good model?

Member

wetneb commented Apr 23, 2018

@akbertram yes, the jython extension is a good model, as it isolates very well the support for that language.

Member

ettorerizza commented Apr 23, 2018

@thadguidry +1000

R has a very active community, but a bit apart from other programmers. It's a booming language, as evidenced by each StackOverflow survey. R also has a steep learning curve, and being able to use it in a visual interface like that of OR would make it easier to learn.

Most importantly, full support for R would open up a lot of possibilities for OpenRefine. The Jython extension was a great idea. The problem is that Jython only supports some Python modules (those that are not written in C), so we cannot use a whole range of great packages like numpy, pandas, NLTK and so on. With full support for R (including packages written in C++), there would not be much you cannot do in OpenRefine.

akbertram commented Apr 23, 2018

@ettorerizza N.B. Renjin tries to provide the best of both worlds - platform independence and full support for modules with C, C++ and Fortran code. We have a tool chain that compiles these languages to JVM bytecode so they can be used as normal JVM libraries without having to help your users navigate the setup of a Fortran compiler. Builds of the CRAN packages are published to http://packages.renjin.org

Member

thadguidry commented Apr 23, 2018

UPDATE: So I posted a nice email to the R Lang users group to let them know we'd love to collaborate and explore some use cases, and not just this one. Who knows, maybe later after our UI Refresh, they could help build some cool extensions for data exploration. So we'll see some folks coming into this issue and also asking about things on our mailing list. Expect it to get busy around here! Like Alex @akbertram coming into the picture to take a look at things and see where he can help out with Renjin! Thanks Alex!!!

psychemedia commented May 11, 2018

This makes me wonder: if OpenRefine were a Jupyter client, and supported a Jupyter kernel connection on the back of Edit cells > Transform > Language, you would be opening up support for all manner of things, especially if there was Apache Arrow (#1469) data transfer in place (which would work with R or Python/pandas dataframes?).

I wonder if there are packages on the back of https://github.com/minrk/thebelab that might help with this?

Member

thadguidry commented May 12, 2018

@psychemedia Do you know R Lang well enough? Would you like to have a Hangout this weekend to show me ideas?

psychemedia commented May 14, 2018

@thadguidry apols for not picking this up sooner, I try to go offline over the w/e.

My R Lang knowledge is sketchy. Thinking a bit more about the Jupyter route as a way of providing different sorts of language support for transformations, this perhaps places an undesirable requirement of having a Jupyter server running. In which case, 'native' support for R, as with Python/Jython, is perhaps more appropriate.

That said, for people working in a Jupyter environment, the ability to connect to a Jupyter kernel (of which there are many) would provide a more general way of supporting transformations using arbitrary languages.

One way of doing this might be to provide a way of opening a connection to a Jupyter kernel and somehow (?!) passing state between it and OpenRefine. I'm pitching ideas as a user here, but will try to read up on the docs to see if I can better articulate what I have in mind. But my feeling is that supporting a Jupyter client as part of OpenRefine would be a big part of that.

Related: Jupyter client docs, what looks like it might be an old proof of concept(?) Java Jupyter client; UI/UX: the ThebeLab Javascript Jupyter client, or the SageCell server.

I'm basically just riffing on the idea that Jupyter lets you run code in a "cell" against an arbitrary language kernel and then trying to imagine how OpenRefine could leverage it.

Member

thadguidry commented May 14, 2018

@psychemedia Thanks Tony. We are now documenting ideas and more technical details in a Wiki page so that it's more readable. But feel free to continue the discussion in this issue or on our mailing list.

Member

magdmartin commented May 18, 2018

Linking to the discussion on OpenRefine user mailing list.
