Home
Liz Suter edited this page Aug 28, 2020 · 10 revisions
Here is a breakdown of the research steps for this project. As you move along, you can check off what is in progress and what is completed by adding items to the Project board (in the "Projects" tab above).
- Part 1: Introduction
- Part 2: Amplicon Pipeline Practice
- Part 3: Getting Set Up
- Part 4: Amplicon Pipeline Analysis
- Part 5: Post-Analysis of Amplicon Pipeline Output [Ecological Analysis]
Discussion
- Intro to Research & marine microbiology
- What is BVCN?
- Discussion of amplicons, microbial ecology of the oceans, disease & climate change
- What is R? RStudio? RStudio Cloud?
- What is Cyverse?
- What is DADA2?
- What is Markdown?
- Watch videos on common field and lab approaches (links in doc in Google Drive)
To Do
- Clone the MarineAnimalDisease repo
- Join BVCN
- Complete BVCN UNIX tutorials 1.1 and 1.2 (in Binder or locally)
- Complete BVCN R lessons 1-2
- Make a GitHub account and make your first commit (tutorial here)
- Complete BVCN Amplicons lessons 1 and 2. Take notes and ask questions in the Slack group
- Make a Cyverse account
Discussion
- What is amplicon science?
- Potential datasets: what are we looking for?
- Tips for working in Discovery Environment
- Tips for documenting (Markdown)
- Intro to NCBI's SRA
To Do
- Complete BVCN R lessons 3-5
- Complete BVCN Amplicons lesson 3b, a tutorial using the analysis from Happy Belly Bioinformatics in the Cyverse app, “rstudio-dada2-decipher”
- Follow along with `Amplicons_Lesson_03b_cyverse.Rmd` to do the tutorial in Cyverse
- See the Cyverse prereq video from lesson 3a for some tips on setting up in the Discovery Environment
- Share analysis and data folder with me (Cyverse user name: esuter). Ask questions!
- Read about some details of the analysis at the Happy Belly website as you move along
To Do
- Start looking for papers with amplicon datasets and discuss them with the group in Slack (see some guidelines here)
- Finalize dataset (with approval from me)
- Start personalizing your MarineAnimalDisease repo by putting links to your dataset’s Bioproject page (or similar) in the readme file
- Download and install Cyberduck (instructions)
- Begin to download your dataset (fastq files) from SRA and import into Cyverse’s Data Store using Cyberduck. Share data folder with me in Cyverse. There are 3 options for this step:
  - If you have Conda, you can do this by cloning and following this example Jupyter notebook in your local Terminal. (NOTE: for QIIME2, you should follow the whole notebook. For DADA2, you do not need to do the final steps of making a manifest file).
  - For a small number of fastq files, you can use the app `fastq-dump` in the Discovery Environment.
  - If you do not have Conda and you have a large number of files, provide me with a list of accession numbers and I will set up the data folder and share it with you in Cyverse.
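If you go with the third option, it helps to tidy the accession list before sending it. The following is only a minimal Python sketch, not part of the BVCN lessons. It assumes your runs use the usual SRR/ERR/DRR run-accession style and that paired-end downloads are named `<run>_1.fastq` and `<run>_2.fastq`; the accessions shown are made up for illustration, and the function names are hypothetical.

```python
import re

# Assumption: run accessions look like SRR/ERR/DRR followed by digits.
RUN_ACCESSION = re.compile(r"^[SED]RR\d+$")


def clean_accessions(raw_lines):
    """Return unique, valid run accessions in first-seen order, plus rejects."""
    valid, rejected, seen = [], [], set()
    for line in raw_lines:
        acc = line.strip()
        if not acc:
            continue  # skip blank lines
        if not RUN_ACCESSION.match(acc):
            rejected.append(acc)  # e.g. a BioProject ID pasted by mistake
        elif acc not in seen:
            seen.add(acc)
            valid.append(acc)
    return valid, rejected


def expected_fastq_names(accession):
    """Paired-end file names, assuming the <run>_1/_2.fastq naming convention."""
    return [f"{accession}_1.fastq", f"{accession}_2.fastq"]


if __name__ == "__main__":
    # Hypothetical accessions, for illustration only.
    raw = ["SRR0000001", "", "SRR0000002", "SRR0000001", "PRJNA000000"]
    valid, rejected = clean_accessions(raw)
    print("valid:", valid)
    print("rejected:", rejected)
    for acc in valid:
        print(acc, expected_fastq_names(acc))
```

Anything that lands in `rejected` is worth a second look: a project-level ID (like a BioProject number) is not a run accession, so it cannot be used directly to fetch fastq files.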
To Do
- Start a full amplicon analysis of your dataset, bringing the fastq files to count tables, taxonomy tables, etc. by following the DADA2 pipeline in R using the Cyverse app. Make a copy of the `Amplicons_Lesson_03b_cyverse.Rmd` notebook from lesson 3b and make appropriate modifications for your dataset.
  - At various points in this you will likely need input. Every dataset is different, and the pipeline depends a lot on your primers, the sequencing platform, and the overall quality of the sequences! Please discuss each step carefully with me and ask the group for help on Slack.
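One check worth doing before you start modifying the notebook is confirming that every sample has both a forward and a reverse fastq file. Below is a rough Python sketch of that check, separate from the R notebook itself. The `_1.fastq` / `_2.fastq` suffixes are an assumption; change them to match how your own files are named. The function name and the demo files are hypothetical.

```python
from pathlib import Path
import tempfile


def pair_reads(folder, fwd_suffix="_1.fastq", rev_suffix="_2.fastq"):
    """Group fastq files by sample name and list samples missing a mate."""
    folder = Path(folder)
    # Sample name = file name with the read-direction suffix removed.
    fwd = {p.name[: -len(fwd_suffix)]: p for p in folder.glob(f"*{fwd_suffix}")}
    rev = {p.name[: -len(rev_suffix)]: p for p in folder.glob(f"*{rev_suffix}")}
    paired = {s: (fwd[s], rev[s]) for s in sorted(fwd.keys() & rev.keys())}
    unpaired = sorted(fwd.keys() ^ rev.keys())  # present in only one direction
    return paired, unpaired


if __name__ == "__main__":
    # Demo on empty placeholder files in a temporary folder.
    with tempfile.TemporaryDirectory() as tmp:
        for name in ["siteA_1.fastq", "siteA_2.fastq", "siteB_1.fastq"]:
            (Path(tmp) / name).touch()
        paired, unpaired = pair_reads(tmp)
        print("paired samples:", list(paired))
        print("missing a mate:", unpaired)
```

If `unpaired` is not empty, sort that out (an incomplete download, or a single-end dataset) before running the pipeline, since the paired-end steps expect matching forward and reverse files.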
To Do
- Complete BVCN R lessons 6 and 8a (skipping 7)
- Start applying some of the R analyses to your own dataset (cleaning up count tables, making abundance plots, using ecological statistics to determine patterns)
- Write up a research summary and make sure all data and documents you produced are available in a shared Cyverse folder. Final results files and scripts should be in a GitHub repo.
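To make the ideas behind abundance plots and ecological statistics more concrete, here is a small Python sketch of two common calculations on a count table: relative abundance and the Shannon diversity index. You will do the real analysis in R following the BVCN lessons; this only illustrates the arithmetic. The taxon names and counts are made up, and the function names are hypothetical.

```python
import math


def relative_abundance(counts):
    """Convert raw counts for one sample into proportions that sum to 1."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sample has no reads")
    return {taxon: n / total for taxon, n in counts.items()}


def shannon_index(counts):
    """Shannon diversity H = -sum(p * ln p), skipping taxa with zero counts."""
    props = relative_abundance(counts).values()
    return -sum(p * math.log(p) for p in props if p > 0)


if __name__ == "__main__":
    # Hypothetical count table: one dict of taxon -> read count per sample.
    table = {
        "sample_1": {"Vibrio": 120, "Pseudoalteromonas": 60, "Other": 20},
        "sample_2": {"Vibrio": 5, "Pseudoalteromonas": 190, "Other": 5},
    }
    for sample, counts in table.items():
        props = relative_abundance(counts)
        print(sample, {t: round(p, 3) for t, p in props.items()})
        print(sample, "Shannon H:", round(shannon_index(counts), 3))
```

Working with proportions rather than raw counts matters because samples usually differ in sequencing depth, so raw read counts are not directly comparable between samples.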