Order of pre-processing steps #48

parashardhapola · 2017-11-15T09:44:22Z

Hi,

In the example notebook, seurat.ipynb, the function sc.pp.normalize_per_cell() is run before sc.pp.regress_out(). Would it not be better to regress out the effect of n_counts before normalization? I do not completely understand this, and it would be great if the authors could explain this order of pre-processing. Also, are there any orderings of steps that should always be avoided?

Thank you.

Best,
Parashar


falexwolf · 2017-11-27T22:48:13Z

I'm very sorry for having forgotten about this issue...

Note that sc.pp.normalize_per_cell() stores the total counts per cell prior to normalization as n_counts. See the examples here: https://scanpy.readthedocs.io/en/latest/api/scanpy.api.pp.normalize_per_cell.html

Performing the normalization removes the effect of having different total counts per cell by scaling each gene's count in a cell by that cell's total counts.

But one might want more: if there is still some correlation of a gene with n_counts after normalization, one concludes that the simple scaling done in normalization has not fully removed the effect of n_counts on that particular gene. Hence, using sc.pp.regress_out, one performs an additional gene-wise correction.
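To make the two steps concrete, here is a minimal sketch in plain Python. It is not Scanpy's implementation: the helper names, the toy count matrix, and the `target_sum` parameter are hypothetical, and an ordinary least-squares fit with an intercept is assumed for the regression step. It only illustrates that a gene can still co-vary with n_counts after per-cell scaling, and that taking residuals of a gene-wise fit removes that linear dependence.

```python
# Illustrative sketch only; helper names and data are made up for this example.

def normalize_per_cell(counts, target_sum):
    """Scale every cell (row) to the same total; also return the raw totals."""
    # Assumes every cell has a non-zero total count.
    n_counts = [sum(row) for row in counts]
    normalized = [
        [value * target_sum / total for value in row]
        for row, total in zip(counts, n_counts)
    ]
    return normalized, n_counts


def covariance(xs, ys):
    """Population covariance of two equally long sequences."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / len(xs)


def regress_out(values, covariate):
    """Return residuals of a least-squares fit values ~ intercept + slope * covariate."""
    # Assumes the covariate is not constant across cells.
    slope = covariance(covariate, values) / covariance(covariate, covariate)
    intercept = sum(values) / len(values) - slope * sum(covariate) / len(covariate)
    return [y - (intercept + slope * x) for x, y in zip(covariate, values)]


# Toy data: 4 cells (rows) x 2 genes (columns).
counts = [
    [1, 9],
    [4, 16],
    [12, 18],
    [16, 24],
]

# Step 1: per-cell normalization; the raw totals are kept as n_counts.
normalized, n_counts = normalize_per_cell(counts, target_sum=10.0)

# After scaling, every cell has the same total, yet the first gene
# still increases with n_counts.
gene0 = [row[0] for row in normalized]
cov_before = covariance(gene0, n_counts)

# Step 2: gene-wise correction; the residuals no longer co-vary with n_counts.
residuals = regress_out(gene0, n_counts)
cov_after = covariance(residuals, n_counts)
```

In this toy case `cov_before` is positive while `cov_after` is zero up to floating-point error, which is the "additional gene-wise correction" described above. Running the regression first would not make sense in this sketch, since n_counts only becomes a meaningful covariate once the per-cell totals have been recorded and scaled away.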

I have to admit that I have not investigated how necessary this is. As you know, this is adapted from the Seurat tutorial - I guess the authors of Seurat found it useful in some cases to fully remove the effect of n_counts on each single gene.

falexwolf closed this as completed Jan 1, 2018
