Add UpSet plot function to figure_factory #4204
…or subset inclusion.
|
It seems some tests are failing on older versions of Python because of a (presumably) older version of Also, if anyone has some insight in getting the notebook test to pass, that'd be great. It seems to be currently failing because it doesn't like the |
|
Thanks for the PR @rickymagner ! re: Pandas: looks like our Python 3.6 and 3.7 "optional" jobs still run against Pandas 0.24:
We do want to be flexible in the pandas versions we support, though 0.24 is pretty ancient. Looks like 1.1.5 is the last version that keeps Python 3.6 support and that's 2.5 years old so I'd be comfortable bumping the version in the above two files to 1.1.5 at this point. Is that new enough to support |
|
@rickymagner sorry for the contradictory notes but... thinking about this a bit more, I'd like us not to add more figure factories to plotly.py. As mentioned in #3833 (comment) further extensions like this would be better in a separate package - either one package to collect all sorts of new figure factories, or a package just for upset plots. The challenge for us of adding figure factories here is it confuses people about plotly.py vs other ways to make Plotly charts, such as direct usage of plotly.js. |
|
Thanks for getting back on this. If anything changes in the future and you'd like to discuss merging this into the FF package, let me know! |
This PR is heavily inspired by this one, building on this forum post, which gave a minimal implementation of UpSet plots using the `figure_factory`. UpSet plots are a more natural way to generalize data represented via Venn diagrams, as they are more scalable and it's easier to see differences in bar sizes rather than circles. This PR builds on code introduced in the other PR, but vastly extends functionality, adds the characteristic "marginal" side plot, and includes "full" documentation.

I know in the previous PR, it was stated resources are limited for `figure_factory` PRs, so I hope by trying to make this as "complete" as possible, it'll be much easier to get merged. I'd be happy to discuss the code with any reviewers, and try to provide more details on the code below. As this includes both new features and corresponding documentation, I have both checklists here. I greatly appreciate any feedback on trying to get this PR compliant and improving the code.

Documentation PR
- … `doc/README.md` file
- … `doc-prod` branch OR it targets the `master` branch
- … `px` example if at all possible
- … `plotly.graph_objects as go` / `plotly.express as px` / `plotly.io as pio`
- … `df`
- `fig = <something>` call is high up in each new/modified example (either `px.<something>` or `make_subplots` or `go.Figure`)
- … `fig.add_*` and `fig.update_*` rather than `go.Figure(data=..., layout=...)` in every new/modified example
- … `fig.add_shape` and `fig.update_xaxes` are used instead of big `fig.update_layout` calls in every new/modified example
- `fig.show()` is at the end of each new/modified example
- `plotly.plot()` and `plotly.iplot()` are not used in any new/modified example

Code PR

- … `plotly.graph_objects`, my modifications concern the `codegen` files and not generated files.
- … modified existing tests.
- … new tutorial notebook (please see the doc checklist as well).
Notes on the Code
To make it easier to review, I'll provide a brief description of the code layout. The code is somewhat similar to the `create_quiver` method in the `figure_factory`. The main `create_upset` method creates an instance of the `_Upset` class using the user inputs. Aside from a few utilities for doing some preprocessing, most of the plot-generating methods are contained in this class. This structure was used to make it a little easier for the conceptual major steps in generating the plot to freely share data using the class attributes.

The `_Upset` class performs the following steps:

- … `sort_by` is either `Counts` or `Intersections`).
- … `color` and `x`.
- … `x` values).

Any feedback is greatly appreciated!
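For reviewers who want a feel for the layout before opening the diff, here is a minimal, stdlib-only sketch of the "entry function builds a private class that shares state across steps" pattern. It is not the PR's code: `_UpsetSketch`, `create_upset_sketch`, the 0/1 membership-row input, and the exact meaning given to the two `sort_by` values are all assumptions made for illustration, and no figure is drawn.

```python
from collections import Counter


class _UpsetSketch:
    """Hypothetical stand-in for the PR's `_Upset` class (not its real code)."""

    def __init__(self, rows, set_names, sort_by="Counts"):
        # Shared state lives on the instance so each step can reuse it.
        self.rows = rows
        self.set_names = list(set_names)
        self.sort_by = sort_by
        self.counts = None

    def count_intersections(self):
        # Each row is a tuple of 0/1 membership flags, one flag per set.
        # Rows with identical flags fall into the same exclusive intersection.
        self.counts = Counter(
            tuple(int(bool(flag)) for flag in row) for row in self.rows
        )
        return self

    def sorted_intersections(self):
        items = list(self.counts.items())
        if self.sort_by == "Counts":
            # Largest intersections first; ties broken by membership pattern.
            items.sort(key=lambda kv: (-kv[1], kv[0]))
        elif self.sort_by == "Intersections":
            # Assumed meaning: order by number of sets involved, then pattern.
            items.sort(key=lambda kv: (sum(kv[0]), kv[0]))
        else:
            raise ValueError("sort_by must be 'Counts' or 'Intersections'")
        return items

    def labels(self, pattern):
        # Translate a 0/1 pattern back into the names of the sets it covers.
        return [name for name, flag in zip(self.set_names, pattern) if flag]


def create_upset_sketch(rows, set_names, sort_by="Counts"):
    # Mirrors the described layout: the public function only wires up the class.
    plot = _UpsetSketch(rows, set_names, sort_by).count_intersections()
    return [(plot.labels(p), n) for p, n in plot.sorted_intersections()]


rows = [(1, 0, 0), (1, 1, 0), (1, 1, 0), (0, 0, 1), (1, 1, 1), (1, 1, 0)]
result = create_upset_sketch(rows, ["A", "B", "C"])
```

In an actual figure, each `(labels, count)` pair would become one bar with its dot-matrix column underneath; keeping the counts on the instance is what lets later drawing steps reuse them without recomputation.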
A Preview
As motivation, here's a nice example plot generated in one line with a well-formatted DataFrame: