Additional function parameters / changed functionality / changed defaults?
New analysis tool: A simple analysis tool you have been using and are missing in sc.tools?
New plotting function: A kind of plot you would like to see in sc.pl?
External tools: Do you know an existing package that should go into sc.external.*?
Other?
Especially when we visualize large datasets with multiple categorical variables (e.g. patient, disease, cell type) using sc.pl.dotplot and pass a sequence to the groupby argument (e.g. sc.pl.dotplot(ad, 'genex', groupby=['individual', 'disease_status', 'cell type'])), some rows end up with too few cells, so summary statistics like the fraction of nonzero expressors or the mean expression are not very robust for them.
To avoid that, I think it'd be cool to have a minimum observation cutoff in the function, where e.g. min_cells=5 would show only the groupby combinations with at least 5 cells. Without this option, this sort of filtering becomes an annoying pandas exercise (which some might enjoy but possibly not everyone).
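For reference, the pandas exercise in question can be sketched roughly as follows. This is only an illustration of the current workaround, not a proposed implementation: the helper name `min_cells_mask` is made up here, and a plain DataFrame stands in for `ad.obs`.

```python
import pandas as pd


def min_cells_mask(obs, groupby, min_cells=5):
    """Boolean mask over the rows of `obs` (hypothetical helper).

    A row is kept when its combination of `groupby` values occurs
    at least `min_cells` times in `obs`.
    """
    # Number of rows (cells) in each observed groupby combination.
    counts = (
        obs.groupby(groupby, observed=True)
        .size()
        .rename("n_cells")
        .reset_index()
    )
    # Map the per-group count back onto every row, preserving row order.
    merged = obs[groupby].merge(counts, on=groupby, how="left")
    return (merged["n_cells"] >= min_cells).to_numpy()


# Toy stand-in for ad.obs: 6 cells in one combination, 2 in another.
obs = pd.DataFrame(
    {
        "individual": ["p1"] * 6 + ["p2"] * 2,
        "cell type": ["T cell"] * 8,
    }
)
mask = min_cells_mask(obs, ["individual", "cell type"], min_cells=5)

# With a real AnnData object, the mask could then be used to subset
# before plotting, e.g.:
# sc.pl.dotplot(ad[mask], 'genex', groupby=['individual', 'cell type'])
```

Computing the mask on `ad.obs` and subsetting before the call works today, but it has to be repeated for every plot, which is what a built-in cutoff would avoid.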
I wonder if min_cells is too specific, and if there is a more generalizable way to handle this.
Do we give any indication of how many cells are in a group right now? I feel like this would be important for the user to even know the stats could be unreliable.
Misc other thoughts:
I think this could make sense to address in the plotting classes
Could be cool to be able to pass a grouped anndata to sc.pl.dotplot, something like: