[Feature/Documentation Request]: Workflow -> Python Script option? #1341

Open
Pelonza opened this Issue Jun 15, 2016 · 19 comments

Pelonza commented Jun 15, 2016

Orange version

3.3

Expected behavior

The ability to convert a widget/gui workflow directly to an equivalent Python Script file. Even if it's ugly.

Actual behavior

As it seems right now, Orange supports either the GUI workflows OR directly using/writing python scripts that access the orange data-mining suite.

OR... if this is possible... it doesn't seem to be documented clearly anywhere that I could find.

Steps to reproduce the behavior

N/A

Additional info (worksheets, data, screenshots, ...)

I'm an instructor at a university, and teach some of our data mining and introduction to data science courses. I've used Weka before, but rather dislike its interface and mechanisms. I also usually teach R as part of the data-mining, but would really like something with a much lower learning curve as an introductory software piece, possibly even to avoid some of the initial issue of actually teaching PROGRAMMING instead of the bigger data-science picture.

Orange almost perfectly fits that bill with the GUI and being able to actually write python scripts directly to do the data-mining. HOWEVER.... there's a huge downside of needing to write it a 2nd time once you've figured out the work-flow AND of correctly using the python back-end.

Ideally, I'd love the option to turn on a 2nd "window" (or at least a widget or save option) that shows the equivalent python script calling the Orange mining procedures. I think this might also be very useful for actual USERS of Orange, as it would let them design a workflow at a high-level of abstraction and editing, then output to a python script. This would allow minor tweaking directly in the code or work to merge/enhance things outside of options available in a given widget.

Member

kernc commented Jun 15, 2016

That's a great idea, thanks for bringing it up. We've had it blueprinted for this year's GSoC, but in the end it didn't make it on the short list. It is definitely something we consider.

@kernc kernc added the enhancement label Jun 15, 2016

Pelonza commented Jun 15, 2016

So... if this is something you've got penciled in/outlined...

I've got a summer research student of my own who's familiar with Python. I definitely don't want to set him an impossible task, but if it could have fit under a GSoC project, it's possibly something I could ask him to consider also. If you are willing to share the specifications/blueprint I can talk with him about it.

As I said in the issue... that's partly selfish interest as I'd love to use it for a teaching tool.. :)

Member

kernc commented Jun 18, 2016

We seem to have an interest in common. 😃

I sent you an email with the outline, but any implementation details, if the project is decided upon, should probably be discussed here for others to scrutinize as well.

Pelonza commented Jun 22, 2016

I got permission from my department chair to go ahead and have my summer student work on this project. So it's a go for the rest of the summer/fall depending on speed/progress. I'm meeting with him this afternoon (soon) to talk over the outline you sent me.

Contributor

Ameobea commented Jul 6, 2016

Hi, I'm the research student assigned to work on this task. I just wanted to show my current progress and make myself open to input and suggestions. However, after some initial review by @kernc, it seems that what I have done so far isn't quite in line with the overall vision for this project so it's likely that most or all of the code generation will have to be re-written to meet the new model.

Current progress: https://github.com/Ameobea/orange3/commits/script-export-gui
Code generation example: https://ameo.link/u/bin/2j9 (Generated from testing workflow)
owfile code generator: https://ameo.link/u/bin/2jb
@kernc's vision: https://paste.debian.net/779226/

The first thing I did was create a topological sort function for the workflow DAG, which produces a sorted list of nodes so that all dependency nodes are processed before their children. The nodes are then converted into widgets, and each widget's init_code_gen function is invoked in that order to generate output, which is organized and inserted into the final output script file.
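
For readers following along, the ordering step described above can be sketched roughly as follows. This is a minimal illustration using Kahn's algorithm, not the code from the linked branch; the function name `topological_order`, the node names, and the `(source, sink)` link format are all assumptions made for the example.

```python
from collections import deque


def topological_order(nodes, links):
    """Return nodes ordered so each node comes after all of its inputs.

    nodes: iterable of hashable node ids (hypothetical workflow nodes)
    links: iterable of (source, sink) pairs describing the workflow DAG
    """
    nodes = list(nodes)
    children = {n: [] for n in nodes}
    indegree = {n: 0 for n in nodes}
    for source, sink in links:
        children[source].append(sink)
        indegree[sink] += 1

    # Start with nodes that have no incoming links (e.g. data sources).
    ready = deque(n for n in nodes if indegree[n] == 0)
    ordered = []
    while ready:
        node = ready.popleft()
        ordered.append(node)
        for child in children[node]:
            indegree[child] -= 1
            if indegree[child] == 0:
                ready.append(child)

    # Any node left unprocessed means the links form a cycle.
    if len(ordered) != len(nodes):
        raise ValueError("workflow contains a cycle")
    return ordered


# Illustrative workflow; the names are placeholders, not read from an .ows file.
workflow_nodes = ["File", "Data Sampler", "Tree", "Test & Score"]
workflow_links = [
    ("File", "Data Sampler"),
    ("Data Sampler", "Test & Score"),
    ("Tree", "Test & Score"),
]
order = topological_order(workflow_nodes, workflow_links)
```

Each node in `order` could then be handed to its code generator in turn, knowing that the variables it depends on have already been emitted.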

The code generator consists of multiple parts including generating import statements for required modules, generating declarations that go inside __init__, as well as other subgenerators for external functions, internal function definitions, and text-level line deletion and modification.
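
As a rough picture of how such parts might be combined, here is a simplified sketch that merges per-widget fragments into one script, with import statements deduplicated and placed first. The `WidgetCode` structure and `assemble_script` function are hypothetical and cover only imports and body lines; the subgenerators for function definitions and line-level edits mentioned above are left out.

```python
from dataclasses import dataclass, field


@dataclass
class WidgetCode:
    """Code fragments contributed by one widget (hypothetical structure)."""
    imports: list = field(default_factory=list)
    body: list = field(default_factory=list)


def assemble_script(fragments):
    """Merge fragments into one script: unique imports first, then bodies."""
    seen = set()
    import_lines = []
    for frag in fragments:
        for line in frag.imports:
            if line not in seen:  # keep the first occurrence only
                seen.add(line)
                import_lines.append(line)
    body_lines = [line for frag in fragments for line in frag.body]
    return "\n".join(import_lines + [""] + body_lines) + "\n"


# Fragments are plain strings here; nothing from Orange is executed.
fragments = [
    WidgetCode(imports=["import Orange"],
               body=['data = Orange.data.Table("iris")']),
    WidgetCode(imports=["import Orange"],
               body=["learner = Orange.classification.TreeLearner()"]),
]
script = assemble_script(fragments)
```

The fragments must be supplied in dependency order for the resulting script to run top to bottom.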

The goal of the generator is to insert all necessary code from the widget into the output so that it performs the same function as the original widget, without modifying or re-writing existing widget code. I went out of my way to avoid modifying any existing widget code, or so much as copying and pasting a line. However, doing that would certainly be much more efficient in terms of the size of the output code and the simplicity of the generation process.

Pelonza commented Jul 6, 2016

Note: This is in response to a separate email where Kernc provided some sample "ideal" code.

So, looking at the two files that Kernc produced and you (Casey) produced, I partially agree with Kernc, but perhaps can point to what might be the actual issue...

Kernc is using the orange data mining library in his script as if it was actually a python script written with the mining library initially.
(hence the loading of the file in two lines).

What you almost need is a 2nd "wrapper" around what you've generated that actually makes the final python lines or code.

Basically, your (as generated now) code would create a single "output string" from _describe --> that actual output string gets entered into the final python script/code either as a displayed line or comment.

Then, based on the full parsing of the 'init' , '_get_reader' and 'get_output' functions, generate 1-2 lines similar to Kernc's lines for file-loading that is correctly calling the actual mining library's read/load files.

You might also just be over-thinking what sorts of information you need from the actual widgets --> library use.

The orange documentation though doesn't do a great job of discussing the ability to load from a URL vs. a file-path...

Remember that while the widget makes the gui pretty and easy to use, theoretically at least as much functionality (including error catches) should be built into the library itself.

Pelonza commented Jul 6, 2016

Looking deeper:
If you dive into the actual "table.py" in the full Orange library, the Table class has two methods:
Orange.data.Table.from_file
Orange.data.Table.from_url

Basically, your "code generator" from the canvas needs to get the attributes from the widget with the file or url path, and then call the appropriate table function with the path. So you can hide the "decision" in your code generator, and then generate the simple 1-2 line code for loading the table.
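
A minimal sketch of that idea, assuming the generator only has the path string from the widget: the file-versus-URL decision is made inside the generator, and the emitted code is a single line. The function name `generate_load_code`, the list of URL schemes, and the example URL are assumptions for illustration; the function only builds a string and does not import Orange.

```python
from urllib.parse import urlparse


def generate_load_code(source, variable="data"):
    """Return one line of Python that loads `source` into an Orange table."""
    # Treat common network schemes as URLs; everything else is a file path.
    scheme = urlparse(source).scheme
    loader = "from_url" if scheme in ("http", "https", "ftp") else "from_file"
    return f"{variable} = Orange.data.Table.{loader}({source!r})"


file_line = generate_load_code("iris.tab")
url_line = generate_load_code("https://example.com/iris.tab")
```

The generated script would still need `import Orange` at the top for the emitted line to run.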

Orange.data.Table already contains all the needed imports and checking of the filenames etc. I don't know if it produces the helpful output of how many of what attributes the data has, but those ought to be otherwise-callable if needed.

I think part of your challenge here may be that, unlike what I initially thought, it looks like the widgets (or at least this file widget) don't actually call the mining-library functions via wrappers.

Member

kernc commented Jul 7, 2016

Table's constructor accepts a string and then calls its from_file() or from_url() (etc.) as appropriate.

Pelonza commented Jul 7, 2016

Even easier then. Perhaps most of the actual "work" is figuring out what has direct, easily used correspondences in orange's main library.

Member

kernc commented Jul 7, 2016

Indeed. And widgets, save for the GUI handling/painting/manipulating/... code, mostly do or should do just that.

Member

kernc commented Jul 7, 2016

@astaric, @janezd, @lanzagar, @ales-erjavec, @s-alexey For anyone interested, there's some technical discussion also in Ameobea#7.

braunschweig commented Feb 17, 2017

Hi, may I ask what the status of this issue is?
Has this feature been committed to Orange, and will it be available any time soon?
Thanks

Member

kernc commented Mar 8, 2017

Nobody is working on it. It's free to take if you're interested.

MrMauricioLeite commented Aug 17, 2017

I must say that having a way to export any workflow to Python code sounds amazing. This would take the tool to a whole new level and enable it to kickstart code that can later be improved on directly in code.

Is it in the roadmap?

@astaric astaric referenced this issue Sep 4, 2017

Merged

FAQ page #109

lubianat commented Aug 20, 2018

Hello, I'm quite interested in this export to Python script too. Unfortunately, I have neither the skills nor the availability to take on such a task now.
Has there been any progress on this matter recently? I was not able to find anything on this.

Thanks

JoeB-UT commented Aug 20, 2018

I think @Ameobea worked on the request for a while, but it is not as easy as one might hope.
More Details:
Ameobea#7
https://ameo.link/u/bin/2jb
https://paste.debian.net/779226/

Having deployment functionality like this would make Orange a top-tier development tool.

Pelonza commented Aug 21, 2018

hemangjoshi37a commented Aug 27, 2018

Actually, I was developing my own machine learning GUI software (Link), but then I found Orange on the internet, which is a piece of art, I would say. I can help make the Python script converter. Please let me know if you have a starting point from where I should begin.

aatarifi commented Nov 13, 2018

I just found this discussion. This would be a very useful feature once implemented; I was looking for exactly this ability in the Orange data mining tool.
