Color by categorical in Comparison Plot and fix filters in Report Plot #469
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
PR Checklist
Description of changes
Previously, Report Plots would show all data even when filters were applied (Issue), had to change how
get_report_plot
was extracting the filters in order to get the expected behavior where the plots only show samples that pass the applied filters.In the Comparison Plots, only continuous variables could be used for the color. Added the ability to detect if the color is set to a variable that is categorical (i.e. contains strings) and automatically generate a color scheme based on the unique values.
In the Comparison Plots, if, for example, a user specified an x and y variable and some samples were missing the y variable but had the x variable, the plot would error out. Fixed this by changing
None
tonp.nan
.Encountered error in Comparison Plots where the point size was not being set as an integer, so casted this variable as an integer.
Technical details
Used the pencils package to aid in the automatic generation of color schemes for categoricals. This package contains a large list of unique colors, so it should be sufficient for situations where even hundreds of unique values are present in the categorical variable.
Additional context