added highest expressed genes QC #169

Merged
merged 4 commits into from Jun 8, 2018

Collaborator

fidelram commented Jun 8, 2018

This PR adds the option to plot the highest expressed genes, with a call like the following:

sc.pl.highest_expr_genes(adata, n_top=40)

This plot is similar to the one produced by the scater function plotQC and is useful for identifying highly expressed genes in a sample.

To keep the code tidy, I added the new plot in scanpy/plotting/qc.py. I imagine that other QC plots can be added there in the future. A possible future improvement is to plot multiple panels by splitting the data, for example by batch.

Additionally, this PR:

  • Changes the grey dot color of _scatter_obs to light grey, which gives better color contrast. E.g.:
sc.pl.umap(bdata, color='batch', groups=['PBMC'])

  • Adds an option to select the number of panels per row for rank_genes_groups. Without this option, the image becomes too wide when there are many louvain groups. With the new parameter, it is easy to choose how many panels are plotted per row.
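As a rough illustration of the panels-per-row idea, the grid dimensions can be derived from the number of groups. This is only a sketch: the helper `grid_shape` and the argument name `n_panels_per_row` are hypothetical and are not taken from the PR's code.

```python
import math


def grid_shape(n_panels, n_panels_per_row):
    """Return (n_rows, n_cols) for a grid holding n_panels panels.

    Hypothetical helper for illustration; not part of scanpy.
    """
    if n_panels < 1 or n_panels_per_row < 1:
        raise ValueError("both arguments must be positive")
    # Never use more columns than there are panels to draw.
    n_cols = min(n_panels, n_panels_per_row)
    # Round up so that a partially filled last row still gets its own row.
    n_rows = math.ceil(n_panels / n_panels_per_row)
    return n_rows, n_cols


# Ten groups at four panels per row need three rows of up to four panels.
rows, cols = grid_shape(10, 4)
```

The resulting pair could then be handed to a plotting layout such as matplotlib's `plt.subplots(rows, cols)`, leaving the unused cells of the last row empty.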
Member

falexwolf commented Jun 8, 2018

Awesome! 😄

@falexwolf falexwolf merged commit b14e1f2 into theislab:master Jun 8, 2018

1 check was pending

continuous-integration/travis-ci/pr The Travis CI build is in progress
"""
# compute the percentage of each gene per cell
dat = normalize_per_cell(adata, counts_per_cell_after=100, copy=True)


fidelram Jun 8, 2018

Author Collaborator

@falexwolf I am using normalize_per_cell to get the percentage of counts per gene in each cell. On second thought, this may be problematic if normalize_per_cell is updated in the future. Maybe it is better to compute the values here directly. What do you think?
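For reference, computing the values directly could look roughly like the sketch below. It uses plain Python lists so that it has no dependencies. The helper names `percent_per_cell` and `top_genes` are hypothetical, and ranking genes by their mean percentage across cells is an assumption about how the top genes are chosen, not something confirmed in this thread.

```python
def percent_per_cell(counts):
    """Convert a cells x genes count table into per-cell percentages."""
    result = []
    for row in counts:
        total = sum(row)
        if total == 0:
            # A cell without counts would divide by zero; keep it all zeros.
            result.append([0.0 for _ in row])
        else:
            result.append([100.0 * value / total for value in row])
    return result


def top_genes(counts, gene_names, n_top):
    """Return n_top gene names ranked by mean percentage across cells.

    Assumption: genes are ranked by mean percentage (illustrative only).
    """
    if not counts:
        return []
    percents = percent_per_cell(counts)
    n_cells = len(percents)
    means = [
        sum(row[j] for row in percents) / n_cells
        for j in range(len(gene_names))
    ]
    # Sort gene indices from highest to lowest mean percentage.
    order = sorted(range(len(gene_names)), key=lambda j: means[j], reverse=True)
    return [gene_names[j] for j in order[:n_top]]


counts = [
    [5, 3, 2],  # cell 0, 10 counts in total
    [1, 1, 8],  # cell 1, 10 counts in total
]
percents = percent_per_cell(counts)
best = top_genes(counts, ["A", "B", "C"], n_top=2)
```

Each row of `percents` sums to 100 (or to 0 for an empty cell), which should match what normalize_per_cell with counts_per_cell_after=100 yields for cells that have counts.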


falexwolf Jun 8, 2018

Member

Yes, I noted that. I don't see a problem: there is a test in place for that, and I'll add a remark to the test.


falexwolf Jun 8, 2018

Member

Here is the test and the remark:

# note that sc.pp.normalize_per_cell is also used in
# pl.highest_expr_genes with parameter counts_per_cell_after=100


falexwolf Jun 8, 2018

Member

PS: I added you to the author list 🙂


fidelram Jun 8, 2018

Author Collaborator

Thanks for adding me to the author list! Such an honor!

@fidelram fidelram deleted the fidelram:plotting_options branch Jun 8, 2018

Member

flying-sheep commented Jun 8, 2018

Another possible extension: allow selecting different distribution plot types (for all of our distribution plots).