eolesin edited this page Dec 3, 2021 · 3 revisions

Carbon oxidation state investigations of whole metagenomes

Starting from the human-decontaminated reads, we deduplicate them. Duplicate reads carry information that matters for read mapping and abundance estimates, but for the carbon oxidation state calculations we apparently want only non-duplicated reads.

# dereplicate Illumina paired-end reads (-i/-j: R1/R2 input, -o/-op: R1/R2 output, -M: memory limit in MB)
conda activate cd-hit
cd-hit-est -i 02_HUMAN_Decontam/GS19-ROV16-BS04-cleanR1.fq -j 02_HUMAN_Decontam/GS19-ROV16-BS04-cleanR2.fq -o 13_CDHIT/BS04_cdhitout_R1 -op 13_CDHIT/BS04_cdhitout_R2 -M 1000000
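As a sanity check on the dereplication step, it can help to compare read counts before and after. The following is a minimal sketch, not part of the original workflow: `count_reads` is a hypothetical helper, and it assumes the cd-hit-est output is either FASTQ (four lines per record) or FASTA (one `>` header per record), since the output format was not verified here.

```shell
# Hypothetical helper: count records in a FASTQ or FASTA file.
# Assumes unwrapped FASTQ (exactly 4 lines per record) or FASTA with '>' headers.
count_reads() {
    first=$(head -c 1 "$1")
    if [ "$first" = "@" ]; then
        # FASTQ: every record occupies 4 lines
        awk 'END { print NR / 4 }' "$1"
    else
        # FASTA: one header line per record
        grep -c '^>' "$1"
    fi
}

# Paths taken from the cd-hit-est command above
in_r1=02_HUMAN_Decontam/GS19-ROV16-BS04-cleanR1.fq
out_r1=13_CDHIT/BS04_cdhitout_R1

# Only report when both files are present
if [ -f "$in_r1" ] && [ -f "$out_r1" ]; then
    echo "R1 reads before: $(count_reads "$in_r1")"
    echo "R1 reads after:  $(count_reads "$out_r1")"
fi
```

The count after dereplication should not exceed the count before; the same check can be repeated for the R2 files.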
