# Snail hemolymph microbiome


## Aim

This notebook contains the steps to install the environment, download the data, and perform the data analysis.

Several folders are present or will be created during the analysis. Here is the list of the folders and their content:
* **data**: Raw data files used for the analysis.
* **.env**: Files needed to create appropriate environment.
* **graphs**: Graphical representation of the data. If it does not exist, this folder will be created during the analysis.
* **results**: Files that are generated through data processing. If it does not exist, this folder will be created during the analysis.
* **scripts**: Scripts used for the analysis.


## Environment and data

### Creating environment

In [None]:
# Check if conda available
[[ ! $(which conda 2> /dev/null) ]] && echo "conda not available in \$PATH. Exiting..." && exit 1

# Creating conda environment
conda env create -f .env/env.yml
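
The availability check above relies on `which`. A possible variant uses the `command -v` builtin and returns a non-zero status instead of calling `exit`, which may be more comfortable in an interactive session. This is only a sketch: the helper name `need_cmd` is hypothetical and not part of the notebook.

```shell
# Hypothetical helper: report whether a command is available in $PATH
need_cmd() {
    if command -v "$1" > /dev/null 2>&1
    then
        return 0
    fi
    echo "$1 not available in \$PATH." >&2
    return 1
}

# Possible usage: only create the environment when conda is found
# need_cmd conda && conda env create -f .env/env.yml
```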

The following cell must be run each time a new Jupyter session is started.

In [9]:
# Activate the environment
source $(sed "s,/bin/conda,," <<<$CONDA_EXE)/etc/profile.d/conda.sh
conda activate ubiome_hml

In [13]:
# Installing needed R packages
Rscript ".env/R package dependencies.R"

Bioconductor version 3.8 (BiocManager 1.30.9), ?BiocManager::install for help

Attaching package: ‘BiocManager’

The following object is masked from ‘package:devtools’:

    install

Skipping install of 'microbiome' from a bioc_git2r remote, the SHA1 (11ae6af6) has not changed since last install.
  Use `force = TRUE` to force installation
Bioconductor version 3.8 (BiocManager 1.30.9), R 3.5.1 (2018-07-02)
Installing package(s) 'phyloseq'
trying URL 'https://bioconductor.org/packages/3.8/bioc/src/contrib/phyloseq_1.26.1.tar.gz'
Content type 'application/x-gzip' length 5494354 bytes (5.2 MB)
downloaded 5.2 MB

* installing *source* package ‘phyloseq’ ...
** R
** data
** inst
** byte-compile and prepare package for lazy loading
** help
*** installing help indices
** building package indices
** installing vignettes
** testing if installed package can be loaded
* DONE (phyloseq)

The downloaded source packages are in
	‘/tmp/RtmpFx6F5J/downloaded_packages’
Updating HTML index of

### Downloading sequencing data

This step downloads the fastq files of the different samples.

In [None]:
# Data directory
ldir="data/libraries"
[[ ! -d "$ldir" ]] && mkdir -p "$ldir"

# Bioproject
bioproject=PRJNA613098

# Download the run information related to the BioProject
wget -q -O runinfo "http://trace.ncbi.nlm.nih.gov/Traces/sra/sra.cgi?save=efetch&rettype=runinfo&db=sra&term=${bioproject}"

# Field of interest (library name and weblink)
fdn=$(head -n 1 runinfo | tr "," "\n" | grep -w -n "LibraryName" | cut -d ":" -f 1)
fdr=$(head -n 1 runinfo | tr "," "\n" | grep -w -n "Run" | cut -d ":" -f 1)

# Download fastq files
while read line
do
    # Filename and download link
    fln=$(cut -d "," -f $fdn <<<$line)
    run=$(cut -d "," -f $fdr <<<$line)
    
    # Download
    echo "$fln"
    #wget -P "$ldir" -O "$fln" "http://trace.ncbi.nlm.nih.gov/Traces/sra/sra.cgi?cmd=dload&run_list=${run}&format=fastq"
    fastq-dump -O "$ldir" --split-files "$run"
    
    mv "$ldir/${run}_1.fastq" "$ldir/${fln}_R1.fastq"
    mv "$ldir/${run}_2.fastq" "$ldir/${fln}_R2.fastq"
        
done < <(tail -n +2 runinfo)

# Compress files
pigz "$ldir/"*

rm runinfo
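
The column lookup used in the cell above (finding the position of `Run` and `LibraryName` in the header, then cutting those fields from each row) can be illustrated on a toy table. This is a minimal sketch: the helper name `col_index`, the three-column header, and the accession `SRR0000001` are made up for illustration and do not reproduce the real run table. A plain comma split like this also assumes that no field contains a quoted comma.

```shell
# Hypothetical helper: print the 1-based position of a named column
# in the comma-separated header (first line) of a file
col_index() {
    head -n 1 "$1" | tr "," "\n" | grep -n -x -F "$2" | cut -d ":" -f 1
}

# Toy run table (made-up values, not the real SRA output)
tmp=$(mktemp)
printf 'Run,LibraryName,LibraryLayout\n' > "$tmp"
printf 'SRR0000001,Ba.1.1,PAIRED\n' >> "$tmp"

fdr=$(col_index "$tmp" "Run")
fdn=$(col_index "$tmp" "LibraryName")

# Extract the run accession and library name from each data row
pairs=$(tail -n +2 "$tmp" | while IFS= read -r line
do
    run=$(printf '%s\n' "$line" | cut -d "," -f "$fdr")
    fln=$(printf '%s\n' "$line" | cut -d "," -f "$fdn")
    echo "$run -> $fln"
done)

echo "$pairs"   # prints: SRR0000001 -> Ba.1.1
rm "$tmp"
```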

### Downloading database

The Silva database is used to assign taxonomy to the ASVs generated from the sequencing data.

In [15]:
# Database directory
dbdir="data/Silva db"
[[ ! -d "$dbdir" ]] && mkdir -p "$dbdir"


# Download and extract the relevant Silva file
wget -P "$dbdir" 'https://www.arb-silva.de/fileadmin/silva_databases/qiime/Silva_132_release.zip'
unzip "$dbdir/Silva_132_release.zip" -d "$dbdir" && rm "$dbdir/Silva_132_release.zip"

# Import the sequence database in Qiime format
qiime tools import \
  --input-path "$dbdir/SILVA_132_QIIME_release/rep_set/rep_set_16S_only/99/silva_132_99_16S.fna" \
  --output-path "$dbdir/silva_132_99_16S.qza" \
  --type 'FeatureData[Sequence]'

# Import the taxonomy database in Qiime format
qiime tools import \
  --input-path "$dbdir/SILVA_132_QIIME_release/taxonomy/16S_only/99/taxonomy_all_levels.txt" \
  --output-path "$dbdir/silva_132_99_16S_taxa.qza" \
  --type 'FeatureData[Taxonomy]' \
  --input-format HeaderlessTSVTaxonomyFormat

--2020-04-10 12:41:44--  https://www.arb-silva.de/fileadmin/silva_databases/qiime/Silva_132_release.zip
Resolving www.arb-silva.de... 194.95.6.29
Connecting to www.arb-silva.de|194.95.6.29|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 2466176198 (2.3G) [application/zip]
Saving to: ‘database/Silva_132_release.zip’



2020-04-10 15:10:56 (269 KB/s) - ‘database/Silva_132_release.zip’ saved [2466176198/2466176198]

Archive:  database/Silva_132_release.zip
   creating: database/SILVA_132_QIIME_release/
   creating: database/SILVA_132_QIIME_release/core_alignment/
  inflating: database/SILVA_132_QIIME_release/core_alignment/80_core_alignment.fna  
   creating: database/SILVA_132_QIIME_release/raw_data/
  inflating: database/SILVA_132_QIIME_release/raw_data/.DS_Store  
   creating: database/__MACOSX/
   creating: database/__MACOSX/SILVA_132_QIIME_release/
   creating: database/__MACOSX/SILVA_132_QIIME_release/raw_data/
  inflating: database/__MACOSX/SILVA_132_QIIME_release/raw_data/._.DS_Store  
  inflating: database/SILVA_132_QIIME_release/raw_data/raw_taxonomy_split_by_level.txt.zip  
  inflating: database/SILVA_132_QIIME_release/raw_data/raw_taxonomy.txt.zip  
  inflating: database/SILVA_132_QIIME_release/raw_data/initial_reads_SILVA132.fna.zip  
  inflating: database/SILVA_132_QIIME_rel

  inflating: database/SILVA_132_QIIME_release/taxonomy/16S_only/97/taxonomy_7_levels.txt  
  inflating: database/SILVA_132_QIIME_release/taxonomy/16S_only/97/taxonomy_all_levels.txt  
  inflating: database/SILVA_132_QIIME_release/taxonomy/16S_only/97/majority_taxonomy_all_levels.txt  
  inflating: database/SILVA_132_QIIME_release/taxonomy/16S_only/97/majority_taxonomy_7_levels.txt  
  inflating: database/SILVA_132_QIIME_release/taxonomy/16S_only/97/consensus_taxonomy_all_levels.txt  
  inflating: database/SILVA_132_QIIME_release/taxonomy/16S_only/97/consensus_taxonomy_7_levels.txt  
  inflating: database/__MACOSX/SILVA_132_QIIME_release/taxonomy/._16S_only  
   creating: database/SILVA_132_QIIME_release/taxonomy/18S_only/
   creating: database/SILVA_132_QIIME_release/taxonomy/18S_only/90/
  inflating: database/SILVA_132_QIIME_release/taxonomy/18S_only/90/raw_taxonomy.txt  
  inflating: database/SILVA_132_QIIME_release/taxonomy/18S_only/90/taxonomy_7_levels.txt  
  inflating: database/S

   creating: database/SILVA_132_QIIME_release/trees/99/
  inflating: database/SILVA_132_QIIME_release/trees/99/99_otus.tre  
  inflating: database/SILVA_132_QIIME_release/Silva_132_notes.txt  
  inflating: database/__MACOSX/SILVA_132_QIIME_release/._Silva_132_notes.txt  
Imported database/SILVA_132_QIIME_release/rep_set/rep_set_16S_only/99/silva_132_99_16S.fna as DNASequencesDirectoryFormat to database/silva_132_99_16S.qza
Imported database/SILVA_132_QIIME_release/taxonomy/16S_only/99/taxonomy_all_levels.txt as HeaderlessTSVTaxonomyFormat to database/silva_132_99_16S_taxa.qza
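
The taxonomy file is imported with `HeaderlessTSVTaxonomyFormat`, that is, as a tab-separated table without a header line. Before importing, the layout could be checked with a few lines of `awk`. The sketch below rests on assumptions: the helper name `check_headerless_taxonomy` is hypothetical, the rule "at least two tab-separated fields per line" is a simplification of the format, and the identifier and lineage in the toy file are made up.

```shell
# Hypothetical check: every line has at least two tab-separated fields
# and the first line is not a "Feature ID" header
check_headerless_taxonomy() {
    awk -F "\t" '
        NF < 2                        { bad = 1 }
        NR == 1 && $1 == "Feature ID" { bad = 1 }
        END                           { exit (bad ? 1 : 0) }
    ' "$1"
}

# Toy example (made-up identifier and lineage)
tax=$(mktemp)
printf 'ID0001\tD_0__Bacteria;D_1__Firmicutes\n' > "$tax"

if check_headerless_taxonomy "$tax"
then
    tax_status="ok"
else
    tax_status="unexpected layout"
fi

echo "$tax_status"
rm "$tax"
```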

## Qiime pipeline

This section processes the data to generate ASVs and assign taxonomy.

In [18]:
# Qiime output directory
qdir="results/1-qiime"
[[ ! -d "$qdir" ]] && mkdir -p "$qdir"

# Metadata file
metadata="data/sample-metadata.tsv"

### Sequencing data

If sequencing data are present, they will be imported as a Qiime artifact.

In [25]:
# Check for sequencing data
[[ ! $(find "$ldir" -type f -name "*fastq.gz") ]] && echo  "No sequencing data. Exiting..." && exit 1

# Create the manifest for importing data in artefact
## source: https://docs.qiime2.org/2019.4/tutorials/importing/#fastq-manifest-formats
for i in $(ls "$ldir"/* | sed "s,_R[12].fastq.*,,g" | uniq)
do
    nm=$(sed "s,$ldir/,," <<<$i)
    fl=$(ls -1 "$PWD/$i"* | tr "\n" "\t")

    echo -e "$nm\t$fl"
done > "$qdir/manifest"

# Add header
sed -i "1s/^/sample-id\tforward-absolute-filepath\treverse-absolute-filepath\n/" "$qdir/manifest"

# Import data
## source: https://docs.qiime2.org/2019.4/tutorials/importing/
qiime tools import \
  --type 'SampleData[PairedEndSequencesWithQuality]' \
  --input-path "$qdir/manifest" \
  --input-format PairedEndFastqManifestPhred33V2 \
  --output-path "$qdir/demux-paired-end.qza"

Imported results/1-qiime/manifest as PairedEndFastqManifestPhred33V2 to results/1-qiime/demux-paired-end.qza

**Note**:
* No need to remove adapters and barcodes. This has been done during `bcl2fastq`. This can be checked using `grep`.
* Importing with `--input-format PairedEndFastqManifestPhred33V2` instead of `--input-format CasavaOneEightSingleLanePerSampleDirFmt` for [custom sample names](https://docs.qiime2.org/2019.4/tutorials/importing/#fastq-manifest-formats).
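
The manifest built above is a three-column, tab-separated table (sample name, forward read path, reverse read path). As an alternative to parsing `ls` output, it could be generated with a glob loop that also verifies that each forward file has its reverse mate. This is only a sketch: the function name `make_manifest` is hypothetical, and it assumes the `_R1.fastq.gz` / `_R2.fastq.gz` naming produced by the download step.

```shell
# Hypothetical helper: print a paired-end manifest for all *_R1.fastq.gz
# files of a directory, using absolute paths
make_manifest() {
    dir=$(cd "$1" && pwd) || return 1
    printf 'sample-id\tforward-absolute-filepath\treverse-absolute-filepath\n'
    for fwd in "$dir"/*_R1.fastq.gz
    do
        # Skip the unexpanded pattern when no file matches
        [ -f "$fwd" ] || continue
        rev="${fwd%_R1.fastq.gz}_R2.fastq.gz"
        if [ ! -f "$rev" ]
        then
            echo "Missing reverse file for $fwd" >&2
            return 1
        fi
        nm=$(basename "$fwd" _R1.fastq.gz)
        printf '%s\t%s\t%s\n' "$nm" "$fwd" "$rev"
    done
}

# Possible usage with the notebook variables:
# make_manifest "$ldir" > "$qdir/manifest"
```

Writing the header inside the function removes the need for the later `sed -i` step.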

### Data quality

To assess data quality, we generate a summary visualization of the reads. The visualization can be viewed on the [Qiime2 website](https://view.qiime2.org/).

In [26]:
# Make a summary to check read quality
qiime demux summarize \
  --i-data "$qdir/demux-paired-end.qza" \
  --o-visualization "$qdir/demux-paired-end.qzv"

Saved Visualization to: results/1-qiime/demux-paired-end.qzv

**Note**: Read quality drops toward the end of the reads but remains above 10, so no trimming was done.
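
To put the quality threshold in perspective, a Phred score Q corresponds to a base-calling error probability of 10^(-Q/10). The short sketch below (the helper name `phred_to_prob` is hypothetical) converts a score to that probability.

```shell
# Error probability for a Phred quality score Q: P = 10^(-Q/10)
phred_to_prob() {
    awk -v q="$1" 'BEGIN { printf "%.4f\n", 10 ^ (-q / 10) }'
}

phred_to_prob 10   # prints 0.1000
phred_to_prob 30   # prints 0.0010
```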

### Sequence clustering and denoising

This step generates ASVs from the sequencing data. It is performed by the `dada2` module.

In [27]:
[[ -f "$qdir/table.qza" ]] && echo "ASV file (table.qza) exists already. Exiting..." && exit 1

# Run dada2
qiime dada2 denoise-paired \
  --i-demultiplexed-seqs "$qdir/demux-paired-end.qza" \
  --p-trunc-len-f 250 \
  --p-trunc-len-r 250 \
  --p-max-ee 5 \
  --p-n-threads 0 \
  --o-table "$qdir/table.qza" \
  --o-representative-sequences "$qdir/rep-seqs.qza" \
  --o-denoising-stats "$qdir/denoising-stats.qza"

# Add metadata information to the denoising stats
qiime metadata tabulate \
    --m-input-file "$metadata" \
    --m-input-file "$qdir/denoising-stats.qza" \
    --o-visualization "$qdir/denoising-stats.qzv"

Saved FeatureTable[Frequency] to: results/1-qiime/table.qza
Saved FeatureData[Sequence] to: results/1-qiime/rep-seqs.qza
Saved SampleData[DADA2Stats] to: results/1-qiime/denoising-stats.qza
Saved Visualization to: results/1-qiime/denoising-stats.qzv

**Note:** Visual inspection shows that the optimal sampling depth is 25,079 features per sample, which retains 2,633,295 (46.33%) features in 105 (99.06%) samples. This leads to the exclusion of one sample: `Ba.Water.1`.
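
The figures in the note above can be cross-checked with simple arithmetic. The sketch below interprets 25079 as a per-sample sampling depth and derives the total of 106 samples from the statement that exactly one sample is excluded; both are readings of the note, not values taken from the Qiime output.

```shell
depth=25079            # value from the note, read as a per-sample depth
kept=105               # samples retained
total=$((kept + 1))    # assumption: one excluded sample, so 106 in total

retained=$((depth * kept))
pct=$(awk -v k="$kept" -v t="$total" 'BEGIN { printf "%.2f", 100 * k / t }')

echo "Retained features: $retained"   # prints: Retained features: 2633295
echo "Retained samples: $pct%"        # prints: Retained samples: 99.06%
```

Both results match the numbers reported in the note.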

### Taxonomy identification

This step assigns taxonomy to the ASVs generated.

In [28]:
qiime feature-classifier classify-consensus-vsearch \
  --i-query "$qdir/rep-seqs.qza" \
  --i-reference-reads "$dbdir/silva_132_99_16S.qza" \
  --i-reference-taxonomy "$dbdir/silva_132_99_16S_taxa.qza" \
  --p-perc-identity 0.97 \
  --p-threads $(nproc) \
  --o-classification "$qdir/rep-seqs_taxa.qza"

Saved FeatureData[Taxonomy] to: results/1-qiime/rep-seqs_taxa.qza

### Phylogeny

This step generates a phylogeny from the ASVs ([source](https://chmi-sops.github.io/mydoc_qiime2.html)).

In [29]:
[[ -f "$qdir/rooted-tree.qza" ]] && echo "A tree file (rooted-tree.qza) exists already. Exiting..." && exit 1

# Multiple sequence alignment using Mafft
qiime alignment mafft \
    --i-sequences "$qdir/rep-seqs.qza" \
    --o-alignment "$qdir/aligned-rep-seqs.qza"

# Masking (or filtering) the alignment to remove positions that are highly variable. These positions are generally considered to add noise to a resulting phylogenetic tree.
qiime alignment mask \
    --i-alignment "$qdir/aligned-rep-seqs.qza" \
    --o-masked-alignment "$qdir/masked-aligned-rep-seqs.qza"

# Creating tree using the Fasttree program
qiime phylogeny fasttree \
    --i-alignment "$qdir/masked-aligned-rep-seqs.qza" \
    --o-tree "$qdir/unrooted-tree.qza"

# Root the tree at the midpoint of the longest tip-to-tip distance
qiime phylogeny midpoint-root \
    --i-tree "$qdir/unrooted-tree.qza" \
    --o-rooted-tree "$qdir/rooted-tree.qza"

Saved FeatureData[AlignedSequence] to: results/1-qiime/aligned-rep-seqs.qza
Saved FeatureData[AlignedSequence] to: results/1-qiime/masked-aligned-rep-seqs.qza
Saved Phylogeny[Unrooted] to: results/1-qiime/unrooted-tree.qza
Saved Phylogeny[Rooted] to: results/1-qiime/rooted-tree.qza

## Library analysis

This step analyzes the Qiime visualizations and generates a table summarizing the number of reads after each step of the pipeline.

In [52]:
scripts/library_stats.R

Output:  ../results/1-qiime/lib-seq-stat (supp. table 1).tsv 
Output:  ../graphs/seq.ct_1.pdf 
Output:  ../graphs/seq.final_1.pdf 
Output:  ../graphs/seq.ct_2.pdf 
Output:  ../graphs/seq.final_2.pdf 
Output:  ../results/1-qiime/lib-stat.tsv 

## Microbiome diversity

This step analyzes ASV data using different methods (rarefaction curve, $\alpha$ and $\beta$ diversity). Technical replicates are also evaluated to show the robustness of the library generation method. Details about the methods used are in the R script. The analysis of the results is in the manuscript.

In [63]:
Rscript scripts/microbiome_diversity.R

Loading packages...
Loading functions...
Setting variables...
1: Expected 7 pieces. Additional pieces discarded in 1045 rows [3, 4, 5, 7, 9, 12, 16, 23, 27, 28, 33, 34, 35, 42, 43, 48, 49, 50, 51, 55, ...]. 
2: Expected 7 pieces. Missing pieces filled with `NA` in 1643 rows [1, 2, 6, 8, 10, 11, 13, 14, 15, 17, 18, 19, 20, 21, 22, 24, 25, 26, 29, 30, ...]. 
rarefying sample Ba.1.1
rarefying sample Ba.1.2
rarefying sample Ba.2.1
rarefying sample Ba.2.2
rarefying sample Ba.3.1
rarefying sample Ba.3.2
rarefying sample Ba.4.1
rarefying sample Ba.4.2
rarefying sample Ba.5.1
rarefying sample Ba.5.2
rarefying sample Ba.Water.1
rarefying sample Ba.Water.2
rarefying sample Bg121.1.1
rarefying sample Bg121.1.2
rarefying sample Bg121.2.1
rarefying sample Bg121.2.2
rarefying sample Bg121.3.1
rarefying sample Bg121.3.2
rarefying sample Bg121.4.1
rarefying sample Bg121.4.2
rarefying sample Bg121.5.1
rarefying sample Bg121.5.2
rarefying sample Bg121.Water.1
rarefying sample Bg121.Water.2
rarefying sam

## Functional inference

The analysis follows this [tutorial](https://github.com/picrust/picrust2/wiki/PICRUSt2-Tutorial-(v2.1.4-beta)#pathway-level-inference).

In [64]:
# PICRUSt2 output directory
pdir="results/2-picrust2"
[[ ! -d "$pdir" ]] && mkdir -p "$pdir"

In [65]:
# Filter table to retain sample from each replicate
for i in {1..2}
do
    awk -v i=$i 'NR==1; $5 == i {print}' "$metadata" > "$qdir/.metadata"
    
    qiime feature-table filter-samples \
        --i-table "$qdir/table.qza"  \
        --m-metadata-file "$qdir/.metadata" \
        --o-filtered-table "$qdir/table_rep$i.qza"
done

# Clean
rm "$qdir/.metadata"

Saved FeatureTable[Frequency] to: results/1-qiime/table_rep1.qza
Saved FeatureTable[Frequency] to: results/1-qiime/table_rep2.qza
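
The replicate filter above keeps the header line plus the rows whose fifth column equals the replicate number. Its behavior can be checked on a toy table. In this sketch the column names and sample rows are made up (only the position of the replicate column, the fifth, comes from the notebook), and `-F "\t"` is added so that fields containing spaces would not shift the column count.

```shell
# Toy metadata table (made-up column names and values)
meta=$(mktemp)
printf 'sample-id\tSpecies\tPopulation\tSnail\tReplicate\n' > "$meta"
printf 'Ba.1.1\tBa\tBa\t1\t1\n' >> "$meta"
printf 'Ba.1.2\tBa\tBa\t1\t2\n' >> "$meta"
printf 'Ba.2.1\tBa\tBa\t2\t1\n' >> "$meta"

# Keep the header and the rows of replicate 1
rep1=$(awk -F "\t" -v i=1 'NR == 1 || $5 == i' "$meta")

echo "$rep1"
rm "$meta"
```

The output contains the header followed by `Ba.1.1` and `Ba.2.1`.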

In [66]:
qiime tools export \
    --input-path "$qdir/rep-seqs.qza" \
    --output-path "$pdir/"

for i in {1..2}
do
    qiime tools export \
        --input-path "$qdir/table_rep$i.qza" \
        --output-path "$pdir/rep$i"
done

Exported results/1-qiime/rep-seqs.qza as DNASequencesDirectoryFormat to directory results/2-picrust2/
Exported results/1-qiime/table_rep1.qza as BIOMV210DirFmt to directory results/2-picrust2/rep1
Exported results/1-qiime/table_rep2.qza as BIOMV210DirFmt to directory results/2-picrust2/rep2

In [67]:
# Place reads into reference tree
place_seqs.py -s "$pdir/dna-sequences.fasta" -o "$pdir/out.tre" -p $(nproc) \
    --intermediate "$pdir/intermediate/place_seqs"

# Hidden-state prediction of gene families
hsp.py -i 16S -t "$pdir/out.tre" -o "$pdir/marker_predicted_and_nsti.tsv.gz" -p $(nproc) -n
hsp.py -i EC -t "$pdir/out.tre" -o "$pdir/EC_predicted.tsv.gz" -p $(nproc) -n

# Number of outliers
zcat "$pdir/marker_predicted_and_nsti.tsv.gz"  | tail -n +2 | awk '$3 >= 2' | wc -l

136
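
The outlier count above selects rows whose third column (the NSTI value) is at or above the cut-off of 2. The same filter can be tried on a small uncompressed table. This is a sketch under assumptions: the column names are assumed to mirror the `hsp.py` output, and the three rows are made up.

```shell
# Toy prediction table (assumed column names, made-up values)
nsti=$(mktemp)
printf 'sequence\t16S_rRNA_Count\tmetadata_NSTI\n' > "$nsti"
printf 'asv1\t1\t0.05\n' >> "$nsti"
printf 'asv2\t2\t2.31\n' >> "$nsti"
printf 'asv3\t1\t1.99\n' >> "$nsti"

# Count the data rows at or above the NSTI cut-off
max_nsti=2
n_out=$(awk -F "\t" -v max="$max_nsti" \
    'NR > 1 && $3 >= max { n++ } END { print n + 0 }' "$nsti")

echo "$n_out ASV(s) at or above NSTI $max_nsti"   # prints: 1 ASV(s) at or above NSTI 2
rm "$nsti"
```

Counting inside `awk` with `n + 0` returns `0` rather than an empty string when no row passes the filter.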

In [68]:
# Run for each replicate
for i in {1..2}
do
    # Generate metagenome predictions
    metagenome_pipeline.py \
        -i "$pdir/rep$i/feature-table.biom" \
        -m "$pdir/marker_predicted_and_nsti.tsv.gz" \
        -f "$pdir/EC_predicted.tsv.gz" \
        --max_nsti 2.0 \
        -o "$pdir/rep$i/EC_metagenome_out" \
        --strat_out --metagenome_contrib

    # Pathway-level inference
    pathway_pipeline.py \
        -i "$pdir/rep$i/EC_metagenome_out/pred_metagenome_strat.tsv.gz" \
        -o "$pdir/rep$i/pathways_out" -p $(nproc)

    # Add functional descriptions
    add_descriptions.py -i "$pdir/rep$i/EC_metagenome_out/pred_metagenome_unstrat.tsv.gz" -m EC \
                        -o "$pdir/rep$i/EC_metagenome_out/pred_metagenome_unstrat_descrip.tsv.gz"

    add_descriptions.py -i "$pdir/rep$i/pathways_out/path_abun_unstrat.tsv.gz" -m METACYC \
                        -o "$pdir/rep$i/pathways_out/path_abun_unstrat_descrip.tsv.gz"
done

136 of 2688 ASVs were above the max NSTI cut-off of 2.0 and were removed.
136 of 2688 ASVs were above the max NSTI cut-off of 2.0 and were removed.
136 of 2688 ASVs were above the max NSTI cut-off of 2.0 and were removed.
136 of 2688 ASVs were above the max NSTI cut-off of 2.0 and were removed.

In [14]:
scripts/pathway_analysis.R

Loading required package: zCompositions
Loading required package: MASS
Loading required package: NADA
Loading required package: survival

Attaching package: ‘NADA’

The following object is masked from ‘package:stats’:

    cor

Loading required package: truncnorm
Loading required package: car
Loading required package: carData

Attaching package: ‘igraph’

The following objects are masked from ‘package:stats’:

    decompose, spectrum

The following object is masked from ‘package:base’:

    union

aldex.clr: generating Monte-Carlo instances and clr values
operating in serial mode
removed rows with sums equal to zero
computing zero removal
data format is OK
dirichlet samples complete
transformation complete
aldex.ttest: doing t-test
running tests for each MC instance:
|------------(25%)----------(50%)----------(75%)----------|
aldex.effect: calculating effect sizes
operating in serial mode
sanity check complete
rab.all  complete
rab.win  complete
rab of samples complete
within sample di

computing center with all features
data format is OK
dirichlet samples complete
transformation complete
aldex.ttest: doing t-test
running tests for each MC instance:
|------------(25%)----------(50%)----------(75%)----------|
aldex.effect: calculating effect sizes
operating in serial mode
sanity check complete
rab.all  complete
rab.win  complete
rab of samples complete
within sample difference calculated
between group difference calculated
group summaries calculated
effect size calculated
summarizing output
aldex.clr: generating Monte-Carlo instances and clr values
operating in serial mode
removed rows with sums equal to zero
computing center with all features
data format is OK
dirichlet samples complete
transformation complete
aldex.ttest: doing t-test
running tests for each MC instance:
|------------(25%)----------(50%)----------(75%)----------|
aldex.effect: calculating effect sizes
operating in serial mode
sanity check complete
rab.all  complete
rab.win  complete
rab of samples com

removed rows with sums equal to zero
computing center with all features
data format is OK
dirichlet samples complete
transformation complete
aldex.ttest: doing t-test
running tests for each MC instance:
|------------(25%)----------(50%)----------(75%)----------|
aldex.effect: calculating effect sizes
operating in serial mode
sanity check complete
rab.all  complete
rab.win  complete
rab of samples complete
within sample difference calculated
between group difference calculated
group summaries calculated
effect size calculated
summarizing output
aldex.clr: generating Monte-Carlo instances and clr values
operating in serial mode
removed rows with sums equal to zero
computing center with all features
data format is OK
dirichlet samples complete
transformation complete
aldex.ttest: doing t-test
running tests for each MC instance:
|------------(25%)----------(50%)----------(75%)----------|
aldex.effect: calculating effect sizes
operating in serial mode
sanity check complete
rab.all  complete

operating in serial mode
removed rows with sums equal to zero
computing center with all features
data format is OK
dirichlet samples complete
transformation complete
aldex.ttest: doing t-test
running tests for each MC instance:
|------------(25%)----------(50%)----------(75%)----------|
aldex.effect: calculating effect sizes
operating in serial mode
sanity check complete
rab.all  complete
rab.win  complete
rab of samples complete
within sample difference calculated
between group difference calculated
group summaries calculated
effect size calculated
summarizing output
aldex.clr: generating Monte-Carlo instances and clr values
operating in serial mode
removed rows with sums equal to zero
computing center with all features
data format is OK
dirichlet samples complete
transformation complete
aldex.ttest: doing t-test
running tests for each MC instance:
|------------(25%)----------(50%)----------(75%)----------|
aldex.effect: calculating effect sizes
operating in serial mode
sanity check c

## Microbiome density

Could the differences observed between populations be explained by microbial density?

In [69]:
# qPCR output directory
ddir="data/qPCR"
[[ ! -d "$ddir" ]] && echo "$ddir and qPCR data are missing. Exiting..." && exit 1

# Analyze data
scripts/microbiome_density.R

Raw 16S copy number

16S copy number:
	- mean:  4490.034 
	- range (min max):  185.7344 59486.62 

Data per population is not following normal distribution

	Kruskal-Wallis rank sum test

data:  qpcr.d[, 2] and qpcr.d[, 3]
Kruskal-Wallis chi-squared = 26.494, df = 7, p-value = 0.0004108

Pairwise comparisons of 16S copy number between population (mean of the replicates):  ab a ab a b ac c b 


Estimateed bacteria

Bacteria number:
	- mean:  2405.149 
	- range (min max):  165.6501 16452.82 

Data per population is not following normal distribution

	Kruskal-Wallis rank sum test

data:  asv.d[, 2] and asv.d[, 3]
Kruskal-Wallis chi-squared = 18.294, df = 7, p-value = 0.01071

Pairwise comparisons of estimated bacteria between population (mean of the replicates):  ab ab ab ab a ab b ab 

Mean bacteria number in snails:
	- mean:  2620.105 
	- standard error:  320.6233 

1: Removed 6 rows containing non-finite values (stat_boxp
