Microbial Ecology Group (MEG) - AMR++ bioinformatics workshop

Course syllabus

Start Date: January 5, 2020

Email: meglab.metagenomics@gmail.com

Slack group: https://meg-research.slack.com

  • Join us here for general discussion and help with course content
  • Slack invite link
    • This link expires every 30 days, so let us know if it doesn't work for you.

Dropbox link

  • This Dropbox folder contains all of the videos from our Zoom course sessions, plus recordings from a previous MEG bioinformatics workshop.

Course content

Summary

These lessons are designed to introduce researchers to the R programming language for statistical analysis of metagenomic sequencing data. While we are primarily developing these training resources for the Microbial Ecology Group (MEG), we would love to get your input on improvements to any component so that we can one day provide this as a useful public resource. As the lessons are meant to be an informal collection of resources and tutorials, we have liberally used parts and pieces of other online lessons and tailored them for our purposes. We attempt to give credit when possible by linking the original source, and we are happy to hear recommendations for other resources to include.

We wholeheartedly encourage students to independently troubleshoot the majority of problems they might encounter by:

  • googling it (or using another search engine)
  • getting help from other students in our Slack group channel #2021-AMR++workshop
  • searching bioinformatic forums such as stackoverflow.com, biostars.org, and seqanswers.com

Learning objectives:

Upon completion of these lessons, students will:

  • have their computer set up with the R and RStudio software
  • know how to read in count matrices from bioinformatic analysis of sequence data
  • be able to explore and summarize bioinformatic results using
    • diversity indices and box plots
    • ordination with non-metric multidimensional scaling (NMDS)
    • heatmaps
  • be familiar with common statistical techniques such as:
    • Wilcoxon test
    • Generalized linear models
    • Analysis of similarities (ANOSIM)
    • Differential abundance testing using a zero-inflated Gaussian (ZIG) model
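
As a rough preview of two of the techniques listed above, the sketch below computes a Shannon diversity index per sample and compares it between two groups with a Wilcoxon rank-sum test. It uses base R only. The count matrix, the feature and sample names, and the `control`/`treated` grouping are all made up for illustration and do not come from the workshop dataset.

```r
# Toy count matrix: rows are features (e.g., AMR gene groups), columns are samples.
# All values and names here are hypothetical.
counts <- matrix(
  c(10,  0,  5, 20,  2,  8,
     3,  7,  0, 15,  9,  1,
     0, 12,  4,  6,  0, 30,
    25,  1,  9,  0, 14,  2),
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("gene_", 1:4), paste0("sample_", 1:6))
)

# Hypothetical grouping: the first three samples are controls.
group <- factor(rep(c("control", "treated"), each = 3))

# Shannon diversity index: H = -sum(p * log(p)), skipping zero counts.
shannon <- function(x) {
  p <- x[x > 0] / sum(x)
  -sum(p * log(p))
}
diversity <- apply(counts, 2, shannon)

# Wilcoxon rank-sum test comparing diversity between the two groups.
result <- wilcox.test(diversity ~ group)

print(round(diversity, 3))
print(result$p.value)
```

With only three samples per group the test has very little power, so treat the p-value here as a demonstration of the function call rather than a meaningful result.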

Bioinformatic overview

The metagenomic sequencing approach you use determines the type of analysis you can perform:

  • Shotgun metagenomic sequencing
    • can analyze both the microbiome and resistome, in addition to other sequences such as plasmid-associated sequences or virulence factors
  • Target-enriched resistome sequencing (MEGARes baits)
    • can only analyze the resistome
  • 16S rRNA amplicon sequencing
    • can only analyze the microbiome

In this repository, we'll show you examples of running variants of the AMR++ pipeline to achieve your bioinformatic analysis goals. We'll be using code found in this repository of bioinformatic pipelines:

  • AMR++ pipeline
  • Qiime2 pipeline
    • We use the Qiime2 pipeline to analyze 16S rRNA reads and export the results to a file format that we can analyze with R.

Statistics overview

Remember, the analysis must always be based on your study design and performed with the goal of testing your a priori hypotheses. The scripts in this repository are merely meant to provide an outline for you to begin your analysis and branch off as needed.

Using RStudio, download everything in this repository and change your working directory to the newly downloaded AMRplusplus_bioinformatic_workshop directory. Start by opening the script on the main page, Stats_overview_script.R, and follow along for a brief explanation of how each of the scripts below fits into your analysis.
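
A minimal sketch of the working-directory step from the R console is shown below. The relative path is an assumption: it only works if your current directory is the one that contains the downloaded folder, so adjust it to wherever you saved the repository.

```r
# Folder name taken from the paragraph above; the relative path is an
# assumption about where you downloaded the repository.
workshop_dir <- "AMRplusplus_bioinformatic_workshop"

if (dir.exists(workshop_dir)) {
  setwd(workshop_dir)  # move into the downloaded repository
} else {
  message("Directory not found from: ", getwd())
}

# Check that the overview script is visible from the working directory.
print(file.exists("Stats_overview_script.R"))
```

In RStudio you can do the same thing from the menu with Session > Set Working Directory.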

If you don't have RStudio installed, click on the link below to explore our test dataset using Binder and RStudio:

Binder

Otherwise, follow the instructions in this tutorial for installing R and RStudio on your personal computer.

The data exploration and statistical analysis we will cover are divided into four main steps, each with an associated script:

  1. Loading count matrix results from bioinformatic analyses into R
  2. Calculating summary statistics
  3. Normalizing counts and creating exploratory figures
  4. Running some common statistical tests
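
The four steps above can be sketched end to end in base R as follows. This is an illustration under stated assumptions, not the workshop scripts themselves: the inlined CSV, the sample names, the `group` metadata, and the choice of total-sum scaling for normalization are all hypothetical, and the workshop scripts may use different packages and methods.

```r
# Step 1: load a count matrix. The CSV is inlined so the sketch is
# self-contained; with a real file you would pass its path to read.csv().
csv_text <- "feature,S1,S2,S3,S4
gene_A,10,0,5,20
gene_B,3,7,0,15
gene_C,0,12,4,6"
counts <- as.matrix(read.csv(text = csv_text, row.names = 1))

# Hypothetical sample metadata, in the same order as the columns of counts.
metadata <- data.frame(sample = colnames(counts),
                       group = factor(c("A", "A", "B", "B")))

# Step 2: summary statistics per sample.
library_size <- colSums(counts)   # total counts per sample
richness <- colSums(counts > 0)   # number of features observed per sample

# Step 3: normalize. Total-sum scaling (relative abundance) is used here
# only because it needs no extra packages.
rel_abund <- sweep(counts, 2, library_size, "/")

# Step 4: a simple statistical test, here a Poisson generalized linear
# model of richness by group.
fit <- glm(richness ~ group, family = poisson, data = metadata)
print(summary(fit)$coefficients)
```

Keeping features in rows and samples in columns means per-sample summaries are column operations (`colSums`, `sweep(..., 2, ...)`), and the metadata rows must stay in the same order as the matrix columns.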

Resources:

MEG resources

R programming

  • RStudio cheatsheets
    • This website has tons of helpful cheatsheets for various R packages and analysis methods. It also includes cheatsheets translated into other languages.
  • YaRrr! The Pirate’s Guide to R
    • This is a free online book that goes over many useful topics in a quirky, but fun way! Follow along with our simplified R scripts in Lesson 1 and reference this book if you have any other questions.
  • R programming Coursera course
    • This free Coursera course goes in depth on the functionality of R. It combines videos with example R scripts for you to follow along with. We recommend this course after you have played around with R a bit and want to learn more about the details of how R works.
  • Introduction to R workshop
    • We haven't personally tried this workshop, but they have a combination of videos, slides, and R code for various topics.
  • ggpubr
    • Nice package for "publication-ready" figures.
  • Harvard's Data Science: R Basics

Data visualization

Command-line

  • Explain shell
    • A cool website that explains bash commands piece by piece.

Statistics resources

Funding Information:

The development of this tutorial was supported in part by USDA NIFA Grant No. 2018-51300-28563, University of Minnesota College of Veterinary Medicine, The VERO Program at Texas A&M University and West Texas A&M University, and the State of Minnesota Agricultural Research, Education, Extension and Technology Transfer program.
