Factor loadings dominated by mitochondrial and ribosomal genes #35

Closed
mdmanurung opened this issue Aug 17, 2022 · 1 comment

@mdmanurung

Thank you for writing the blazing-fast package!

I have more of a practical question about interpreting the NMF results. My factor loadings are mostly dominated by mitochondrial and ribosomal genes. I have made sure that the cells are of good quality, so I am not sure how to interpret this. It seems like the factors are picking up the most highly expressed genes. Would it make sense to remove those genes prior to NMF?

Apologies in advance if I am asking in the wrong venue.

Regards,
Mikhael

@zdebruine
Owner

Yes, you want to ensure that the interesting signal accounts for the majority of the NNLS objective; otherwise features with very large counts dominate the fit. You can do this in two ways:

  1. Perform some sort of row-wise normalization (just as you perform column-wise normalization). There are some good ideas about how to do this in the Bioconductor DESeq2 package.
  2. Remove features that you just aren't interested in (e.g. mtRNA and rRNA), and any features that have overwhelmingly high counts.

Note that it is NOT a good idea to select only variable features -- you will lose a lot of interesting information and statistical power.
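
The two options above can be sketched roughly as follows. This is a minimal illustration in Python with NumPy, not code from the package, and it rests on several assumptions: the input is a dense genes x cells count matrix, mitochondrial and ribosomal genes are recognised by the human gene-symbol prefixes `MT-`, `RPS`, and `RPL`, and "overwhelmingly high counts" is taken to mean a gene holding more than a chosen fraction of all counts. The function names, the prefixes, and the `max_fraction` threshold are all hypothetical, and dividing each gene by its mean is only a simple stand-in for row-wise normalization, not the DESeq2 method.

```python
import numpy as np

def drop_unwanted_genes(counts, gene_names,
                        prefixes=("MT-", "RPS", "RPL"),  # assumed human gene-symbol prefixes
                        max_fraction=0.05):              # assumed cutoff, tune for your data
    """Remove genes matched by name prefix or holding too large a share of all counts.

    counts: non-negative array of shape (genes, cells).
    """
    counts = np.asarray(counts)
    gene_names = np.asarray(gene_names)
    # Flag genes whose symbol starts with one of the unwanted prefixes.
    by_name = np.array([str(g).upper().startswith(prefixes) for g in gene_names])
    # Flag genes that capture more than max_fraction of the total counts.
    gene_totals = counts.sum(axis=1)
    dominant = gene_totals / gene_totals.sum() > max_fraction
    keep = ~(by_name | dominant)
    return counts[keep], gene_names[keep]

def scale_rows(counts):
    """Divide each gene (row) by its mean across cells; all-zero rows stay zero."""
    counts = np.asarray(counts, dtype=float)
    row_means = counts.mean(axis=1, keepdims=True)
    return np.divide(counts, row_means,
                     out=np.zeros_like(counts), where=row_means > 0)
```

Both steps keep the matrix non-negative, so the result can still be passed to NMF. All remaining genes are retained rather than only the highly variable ones, in line with the note above.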
