"UX" improvements to the code -- accessing and exporting the results #34
First of all I think we should have a trivially intuitive way for the users to access the attractors and the succession diagram once it is generated. In the "run.py" code there is a nice code where the attractors are turned into a pandas dataframe.
We could turn this into a function within the succession class. What do you think? |
While I agree that we need to improve the way attractors and so on are organized, I am not sold on the pandas approach -- isn't there a way to get the job done without an extra dependency? What about a json approach? This is what PyBoolNet uses, so it would make sense that we try to follow a similar format. We could take a closer look at how they output attractors in general and try to mirror that to the extent possible. |
The json (dictionary) approach is a good idea; however, pandas is very common (in my sample of people using Python). We can have both as options. I'll propose a function for each and we can decide later what to keep. |
I don't dispute that pandas is commonly used. But pandas can already import from json, so I don't see why both approaches would be needed. It would just be one more thing to maintain & version check going forward. What are the benefits to using a pandas DataFrame for this output? Wouldn't a more lightweight structure be sufficient? If we want to save the data, we can export to json or the user can pickle the SuccessionDiagram object. If we want to use the data, that should be possible from the SuccessionDiagram object directly (and if it's not obvious how to use that class, we need to fix that first). |
I do see your point about using JSON to be consistent with PyBoolNet, and about the maintenance cost of an extra dependency. But, personally, I think having the output in the simplest human-readable (and Excel-readable) format outweighs this. I think it is in the best interest of most users, who would just want to put in Boolean rules and get as output a file with attractors + stable motifs + reprogramming interventions in a readable format. This is what the Java StableMotifs program did, and this was the reasoning behind it. I would also say this is the case even for me when I want to do a first pass on a model. |
I agree that we need some better tools for human-readable and csv outputs and that it should be easy to get an attractors DataFrame. Where I disagree is that I don't think we need pandas as a dependency to do this. I envision something like:
or,
We could put stuff like this in a tutorial or example so people know they can easily use pandas if they want to, but pandas wouldn't be a dependency. That being said, you might be able to persuade me if you have good answers to these questions:
As an aside, before writing the export functions for any format I think we should revise the way attractors are counted and stored internally (see #31). The code from run.py can miscount in certain situations. |
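For reference, a dependency-free export along the lines discussed above could look roughly like the sketch below. It is only an illustration: the `attractors` list, the `"X"` marker for oscillating nodes, and both function names are hypothetical and are not part of the library or of run.py.

```python
import csv
import io
import json

# Hypothetical stand-in for attractor data: each attractor maps node names
# to a state (0, 1, or "X" for an oscillating or undetermined node).
attractors = [
    {"A": 1, "B": 0, "C": 1},
    {"A": 0, "B": "X", "C": "X"},
]


def attractors_to_json(attractors):
    """Serialize a list of attractor dicts to a JSON string."""
    return json.dumps(attractors, indent=2)


def attractors_to_csv(attractors):
    """Return CSV text with one row per attractor and one column per node."""
    # Collect every node name so that all rows share the same columns.
    nodes = sorted({node for att in attractors for node in att})
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=nodes, lineterminator="\n")
    writer.writeheader()
    writer.writerows(attractors)
    return buffer.getvalue()
```

A user who has pandas installed could then build a frame themselves, for example with `pandas.DataFrame(json.loads(text))` or `pandas.read_csv(io.StringIO(csv_text))`, without pandas being a dependency of the library.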
I think I understand now exactly what you mean. I agree that something like what you suggest for the data frame output is a good way to avoid having to include pandas. I am not committed to using pandas, and I do not think we necessarily need it to accomplish what you suggest. But I am not familiar enough with the other libraries we have to know whether they can do it, or how. This is why one of the first things I did when I started running benchmarks was write the code Dávid mentioned above. For me, a dataframe format is a natural-enough way to store the attractors that has the advantage of being directly human readable. And if we are using dataframes, I think pandas will make our (and others') lives easier, particularly because it plays nicely with Jupyter notebooks. But I can definitely see why a JSON/dictionary approach might be preferable because of its compactness and because of the way stable motifs, fixed nodes, etc. are currently being stored. |
I do get the idea that we want to keep this library neat and with as few dependencies as possible. However, it's a compromise to make things also somewhat "user friendly". We could have the exact same argument about using NetworkX to export the succession diagram, as it brings in an extra dependency, and storing network structure in json or a simple edge-list is also a clean way. I will attempt to answer Jordan's questions:
We shouldn't write anything into a csv and read it back again just to make the information human readable.
As Jorge pointed out, most people will use Jupyter notebooks, and having a pandas df available without the hassle of making conversions is just convenient. When it comes to exporting stuff, that's where I think pandas has a significant advantage, especially if the model they analyze has funky variable names. I believe pandas is also advantageous for making exports for our own purposes. Once again, if the main priority is keeping things neat and "pythonic", I too can be convinced. |
I would like to raise another UX issue (we can create a separate thread for this) regarding the succession diagrams. Right now there are a number of functions in the Succession class that do some network-related transformation, but for me, as a "user", it's unclear which does what.
The graph in the second line is empty unless we call the first function. I suggest the following modifications: I don't see a reason to store |
I have several points to make, so I'll try to stay organized.
I would prefer pandas construction to use a set-in-stone standard format like json or csv so there's less chance of things breaking in 50 years. By the way, my first job involved translating and/or replacing deprecated 50-year-old Fortran code that was supposed to "last forever". The people I was working for still relied on that code, and it had already been translated from punchcards once! So I've been burned by excessive dependencies before :) |
From comments in other threads, it seems like a solution roughly similar to the one I suggested above (in 5.) and having pandas as an optional dependency is an acceptable compromise. Is everyone OK with that? We don't have to decide on all the details until after the new attractor repertoire classes are implemented (#37). |
This seems very reasonable to me.
--
Reka Albert
Distinguished Professor of Physics and Biology
Pennsylvania State University
104 Davey Laboratory, PMB 261
University Park, PA 16802
Phone: (814) 865-1141
Web: *https://www.ralbert.me <https://www.ralbert.me>*
|
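The optional-dependency compromise agreed on above can be sketched as follows. This is a generic lazy-import pattern under stated assumptions, not the library's code: `attractor_records` and `attractor_dataframe_sketch` are hypothetical names, and the input is assumed to be a list of dicts mapping node names to states.

```python
def attractor_records(attractors):
    """Return plain-dict rows, one per attractor; uses no third-party imports."""
    # Add a hypothetical "attractor_id" column so that rows stay identifiable.
    return [dict(att, attractor_id=i) for i, att in enumerate(attractors)]


def attractor_dataframe_sketch(attractors):
    """Build a DataFrame only when the caller asks for one.

    pandas is imported inside the function rather than at module level,
    so everything else in the module works without pandas installed.
    """
    try:
        import pandas as pd
    except ImportError as err:
        raise ImportError(
            "pandas is required for DataFrame output; "
            "install it or use attractor_records() instead"
        ) from err
    return pd.DataFrame(attractor_records(attractors)).set_index("attractor_id")
```

With this layout, pandas only needs to be listed as an optional extra, and users without it still get the plain-dict rows.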
Hi all. I added |
This thread is out of date, as we have added the AttractorRepertoire class. I will rephrase the issue with our updated organization: In the Export.py module, we want a function that takes an AttractorRepertoire object as input and returns a pandas data frame that contains (at a minimum) the contents of the AttractorRepertoire.summary() output. |
I don't know who added this or when, but we have sm.Export.attractor_dataframe that does exactly this, so I'll close the issue. |
This should be a thread where we discuss the user experience issues and possible improvements.