PHYLOFLASH

eolesin edited this page Feb 20, 2021 · 13 revisions

Examine the diversity of rRNA genes in the quality-filtered reads.

# Define path to reads
PATH_READS2="01_QC"

# Define path to the database
PATH_DB="/export/dahlefs/work/Databases/Phyloflash_db/138/"

# Run phyloFlash on the reads
# I also wanted to change the names of the files at this step, hence the name parsing below.
for file in "${PATH_READS2}"/*R1.fastq; do
    # Library name: drop the last "-" field (QUALITY_PASSED_R1.fastq), then the first "-" field
    nam=$(basename "${file}" | rev | cut -f2- -d"-" | rev | cut -f2- -d"-")
    # First "-" field of the file name
    firstbit=$(basename "${file}" | cut -f1 -d"-")

    echo "analyzing....${nam}"

    phyloFlash.pl -lib "${nam}" \
        -read1 "${PATH_READS2}/${firstbit}-${nam}-QUALITY_PASSED_R1.fastq" \
        -read2 "${PATH_READS2}/${firstbit}-${nam}-QUALITY_PASSED_R2.fastq" \
        -almosteverything -CPUs 10 -readlength 150 -dbhome "${PATH_DB}"
done
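
Before a long run it can help to check the name parsing on a single file name. The following is a minimal sketch, assuming file names of the form <prefix>-<library>-QUALITY_PASSED_R1.fastq; the file name used here is hypothetical and only serves as an illustration. It also shows an equivalent parsing with bash parameter expansion, which is not what the loop above uses.

```shell
# Hypothetical file name, for illustration only
file="01_QC/A1-GS19-ROV17-BS06-QUALITY_PASSED_R1.fastq"

# Same parsing as in the loop above
nam=$(basename "${file}" | rev | cut -f2- -d"-" | rev | cut -f2- -d"-")
firstbit=$(basename "${file}" | cut -f1 -d"-")

# Alternative with bash parameter expansion (no rev/cut needed)
base=$(basename "${file}")
stem=${base%-*}    # remove the last "-" field
alt_nam=${stem#*-} # remove the first "-" field

echo "library name: ${nam}"
echo "prefix:       ${firstbit}"
echo "alternative:  ${alt_nam}"
```

For this file name, the library name is GS19-ROV17-BS06 and the prefix is A1, and both parsing approaches agree.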


# Compare different samples to one another, either in a heatmap
# or a barplot format.
phyloFlash_compare.pl --allzip --task heatmap,barplot

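# Long-format tables of the phyloFlash data can also be created for various comparisons.
# Here I compared groups of samples that I picked out using my pick_input3.py program.
# Assumed layout of for_pfcompare_groups: one group per line, with the group name first,
# then a space, then the file list that is passed to --csv.
while read -r groups; do
    set=$(echo "${groups}" | cut -f2 -d' ')
    nam=$(echo "${groups}" | cut -f1 -d' ')
    # Braces are needed here: $nam_SpecLvl would be read as an unset variable named nam_SpecLvl
    phyloFlash_compare.pl --csv "${set}" --task ntu_table --out "${nam}_SpecLvl" --level 7
done < for_pfcompare_groups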
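
A dry run makes it easy to see which commands the loop would issue before anything is actually run. This is only a sketch: the group names and CSV file names below are hypothetical, and the two-column layout of the groups file (group name, a space, then a comma-separated file list) is an assumption based on how the loop splits each line.

```shell
# Hypothetical groups file: group name, a space, then a comma-separated file list
groups_file=$(mktemp)
printf '%s\n' \
    'ventA sample1.csv,sample2.csv' \
    'ventB sample3.csv,sample4.csv' > "${groups_file}"

# Dry run: build each command as text instead of running phyloFlash_compare.pl
cmds=$(while read -r nam set; do
    echo "phyloFlash_compare.pl --csv ${set} --task ntu_table --out ${nam}_SpecLvl --level 7"
done < "${groups_file}")

echo "${cmds}"
rm -f "${groups_file}"
```

Here `read -r nam set` splits each line at the first space, which gives the same two fields as the two cut calls.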

Note: I repeated the phyloFlash analysis after human decontamination.

Fishing for a certain phylogenetic group in your phyloFlash data? Say no more!


# Example with Zetaproteobacteria:

# Taxa match & Abundance
# First I got the info about Zetaproteobacteria that were in some samples. 
grep "Zeta" {GS19-ROV16*extracted*,GS19-ROV14*extracted*,GS19-ROV17*extracted*,GS19-ROV18*extracted*} > Zetas_good.txt

Each line in Zetas_good.txt looks like this (when grep searches several files it also puts the source file name and a colon at the start of each line; that prefix is left out here):
GS19-ROV17-BS06.PFspades_27,453,5.795771,JQLW01000012.87223.88743,Bacteria;Proteobacteria;Zetaproteobacteria ....
In the line above, the number 453 after the OTU name (PFspades_27) is the abundance of this OTU in the sample.
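
The OTU name and its abundance can be pulled out of such a line with cut and awk. A minimal sketch, using the example line above with a file-name prefix added in front; the file name in that prefix is an assumption made for illustration, and the line is shortened to the fields shown in the text.

```shell
# Example line with an assumed "<file name>:" prefix, as grep prints for several input files
line='GS19-ROV17-BS06.phyloFlash.extractedSSUclassifications.csv:GS19-ROV17-BS06.PFspades_27,453,5.795771,JQLW01000012.87223.88743,Bacteria;Proteobacteria;Zetaproteobacteria'

# Drop the file-name prefix, then print the OTU name and abundance (comma fields 1 and 2)
otu_abund=$(echo "${line}" | cut -d":" -f2 | awk -F',' '{print $1 "\t" $2}')

echo "${otu_abund}"
```

Running the same pipeline on the whole file (replace the echo with `cat Zetas_good.txt`) gives a two-column table of OTU names and abundances.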

# OTU seqs
### Take each "OTU" name in Zetas_good.txt as the pattern to grep for; the name is the text between two delimiters (: and ,).
### Grep for that name and save the result to a file.
### In grep: -A 1 also prints the one line after each match (here, the line that holds the sequence).
### In grep: -h stops grep from printing the name of the file each match came from.
### The "_" added after the OTU name keeps e.g. PFspades_27 from also matching PFspades_270.
while read -r line; do
    OTU=$(echo "${line}" | cut -d":" -f2 | cut -d"," -f1)
    grep -A 1 -h "${OTU}_" {GS19-ROV16*rRNAs.final*,GS19-ROV14*rRNAs.final*,GS19-ROV17*rRNAs.final*,GS19-ROV18*rRNAs.final*}
done < Zetas_good.txt > Zeta_OTUs.fa
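
The effect of the grep flags can be tried on a toy input. The following sketch uses two small made-up FASTA files; the header format and the sequences are hypothetical and only illustrate how -A 1, -h, and the trailing "_" behave.

```shell
# Two hypothetical FASTA files with made-up headers and sequences
tmpdir=$(mktemp -d)
printf '%s\n' '>libA.PFspades_27_453.0' 'ACGTACGT' \
              '>libA.PFspades_270_12.0' 'TTTTGGGG' > "${tmpdir}/a.fasta"
printf '%s\n' '>libB.PFspades_1_9.0' 'GGCCGGCC' > "${tmpdir}/b.fasta"

OTU="libA.PFspades_27"

# -A 1: also print the line after the match (the sequence)
# -h:   do not prefix the output with the file name
# "_":  keeps PFspades_27 from matching PFspades_270
hits=$(grep -A 1 -h "${OTU}_" "${tmpdir}/a.fasta" "${tmpdir}/b.fasta")

echo "${hits}"
rm -rf "${tmpdir}"
```

Only the header for PFspades_27 and its sequence line are returned; without the trailing "_", the PFspades_270 record would be returned as well.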
