Increase in library size differences with CSS normalization #61

Open
CarlyMuletzWolz opened this issue Oct 9, 2018 · 1 comment

@CarlyMuletzWolz

For some of my projects, I find that CSS normalization decreases the variance in library size compared to raw sequencing depth. I am assuming this is one of the objectives of CSS normalization.

In other microbiome projects, I find that CSS normalization increases the variance in library size. In other words, the ratio of the summed OTU counts for the highest-coverage sample to the summed OTU counts for the lowest-coverage sample can be greater after CSS normalization than for the raw library sizes.
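A minimal sketch of the ratio described above, in base R on toy data. The helper names `fold_range` and `css_sketch` are hypothetical, the counts are invented for illustration, and `css_sketch` is a simplified stand-in for CSS (each sample is divided by the sum of its counts up to the p-th quantile of its nonzero counts, scaled by 1000), not metagenomeSeq's own implementation.

```r
# Fold difference in library size: largest column sum / smallest column sum.
# `counts` is an OTU-by-sample matrix (rows = OTUs, columns = samples).
fold_range <- function(counts) {
  totals <- colSums(counts)
  max(totals) / min(totals)
}

# Simplified CSS-style scaling (an assumption for illustration only):
# per sample, sum the counts that fall at or below the p-th quantile of the
# nonzero counts, then divide the sample by that sum / scale.
css_sketch <- function(counts, p = 0.5, scale = 1000) {
  factors <- apply(counts, 2, function(x) {
    q <- quantile(x[x > 0], probs = p)
    sum(x[x <= q])
  })
  sweep(counts, 2, factors / scale, "/")
}

# Toy data: S1 is shallow and dominated by one OTU, S2 is deep and even.
raw <- matrix(
  c(  1,   1,   2,  96,
    200, 250, 250, 300),
  nrow = 4,
  dimnames = list(paste0("OTU", 1:4), c("S1", "S2"))
)

css <- css_sketch(raw)

fold_range(raw)  # 10
fold_range(css)  # 35
```

In this toy case the shallow sample gets a very small scaling factor because most of its reads sit in a single OTU above the quantile, so its normalized total is inflated and the fold difference grows from 10X to 35X. With an MRexperiment such as `MGS`, the analogous comparison after `MGS <- cumNorm(MGS, p = p)` would presumably be `fold_range(MRcounts(MGS, norm = FALSE))` against `fold_range(MRcounts(MGS, norm = TRUE))`.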

According to Weiss et al. 2017 (Normalization and microbial differential abundance strategies depend upon data characteristics, in Microbiome), when library sizes differ greatly (>10X), rarefying lowers the false discovery rate.

For instance, in one microbiome study the library size difference was 17X for raw reads and increased to 53X after CSS normalization. Would you expect this to occur, and is it OK to proceed with the CSS normalization? Or does it indicate a problem?

In all of my microbiome datasets (n = 4), p <- cumNormStatFast(MGS) has always returned 'using the default value'.

Any comment would be greatly appreciated.

@CarlyMuletzWolz

Hi @hcorrada @jnpaulson, do you have any insight on the question above? Thank you.
