This analysis is an additional step in the ebioKit tutorials. It visualizes the results of 16S metabarcoding using unsupervised learning techniques such as t-distributed stochastic neighbor embedding (t-SNE) and principal component analysis (PCA). You can even use a bar plot or a frequency table to explore the same results.
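As a rough illustration of the frequency-table and bar-plot option mentioned above, here is a minimal sketch using only the Python standard library. The genus labels in `otu_genera` are hypothetical and are not taken from the tutorial dataset, and the helper names `frequency_table` and `text_barplot` are made up for this example; the notebook itself may do this differently.

```python
from collections import Counter

# Hypothetical genus-level labels, one per OTU (NOT from the tutorial dataset).
otu_genera = [
    "Lactobacillus", "Lactobacillus", "Lactobacillus", "Lactobacillus",
    "Bacteroides", "Bacteroides", "Bacteroides",
    "Alistipes", "Alistipes",
    "Clostridium",
]


def frequency_table(labels):
    """Return (label, count, fraction) rows, most frequent label first."""
    counts = Counter(labels)
    total = sum(counts.values())
    return [(label, n, n / total) for label, n in counts.most_common()]


def text_barplot(table, width=30):
    """Render the frequency table as a plain-text horizontal bar plot."""
    if not table:
        return ""
    top = table[0][1]  # the largest count sets the longest bar
    lines = []
    for label, n, frac in table:
        bar = "#" * max(1, round(width * n / top))
        lines.append(f"{label:<16}{bar} {n} ({frac:.0%})")
    return "\n".join(lines)


table = frequency_table(otu_genera)
print(text_barplot(table))
```

A table like this is often enough to see which taxa dominate a sample before moving on to PCA or t-SNE.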

Read more about the MiSeq SOP that the Schloss Lab uses to process 16S rRNA gene sequences generated on Illumina's MiSeq platform with paired-end reads here.


Setting up your environment

  • Download the Python 3 version of Anaconda for your operating system

  • Create a conda environment like mine:

    conda env create -f environment.yml

    This creates an environment called py35. Activate it with this command in your terminal (on conda 4.4 or later, conda activate py35 is the recommended equivalent):

    source activate py35

  • In your terminal, from the directory where you cloned this repository, run this command:

    jupyter notebook otu_data_viz.ipynb


A codebook is provided for the .csv file. I encourage you to go through the exercise here to generate the 0.16.cons.taxonomy.csv dataset, which was created by processing .fastq files obtained from Mus musculus with Mothur. Find the notebook here.
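To give a feel for working with a taxonomy table like this one, the sketch below parses a small in-memory CSV and sums read counts per phylum. It assumes the columns follow mothur's cons.taxonomy layout (`OTU`, `Size`, `Taxonomy`, with semicolon-separated ranks and confidence scores in parentheses); the real column names are described in the codebook and may differ. The sample rows and the helpers `split_taxonomy` and `reads_per_phylum` are hypothetical.

```python
import csv
import io
import re

# Hypothetical rows in an assumed OTU/Size/Taxonomy layout.
# Check the codebook for the actual columns of 0.16.cons.taxonomy.csv.
SAMPLE_CSV = """OTU,Size,Taxonomy
Otu001,120,Bacteria(100);Bacteroidetes(100);Bacteroidia(100);
Otu002,80,Bacteria(100);Firmicutes(100);Clostridia(98);
Otu003,15,Bacteria(100);Firmicutes(95);Bacilli(90);
"""


def split_taxonomy(taxonomy):
    """Split 'Bacteria(100);Firmicutes(98);' into ['Bacteria', 'Firmicutes']."""
    ranks = []
    for part in taxonomy.split(";"):
        part = part.strip().strip('"')
        if not part:
            continue  # skip the empty piece after a trailing semicolon
        # Drop the trailing confidence score, e.g. '(100)', and stray quotes.
        name = re.sub(r"\(\d+\)$", "", part).strip('"')
        ranks.append(name)
    return ranks


def reads_per_phylum(rows):
    """Sum the Size column per phylum (assumed to be the second rank)."""
    totals = {}
    for row in rows:
        ranks = split_taxonomy(row["Taxonomy"])
        phylum = ranks[1] if len(ranks) > 1 else "unclassified"
        totals[phylum] = totals.get(phylum, 0) + int(row["Size"])
    return totals


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
phylum_reads = reads_per_phylum(rows)
print(phylum_reads)
```

To try this on the real file, replace `io.StringIO(SAMPLE_CSV)` with an opened file handle and adjust the column names to match the codebook.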