SARS-COV-2_B4B

This is a repository for the SARS-CoV-2 Bioinformatics for Beginners Course

*May 2023 update note: Access to file download links may change in the first two weeks of May 2023, which would affect the input data for the example commands. Please expect errors during this period.*

SARS-CoV-2 variant lineage identification is key to pandemic tracking and to enabling a public health response. This course introduces bioinformatics by applying skills used in SARS-CoV-2 genomic data analysis. This will be a distributed-classroom-style course run across Africa; Latin America and the Caribbean; and Asia. This model was developed by H3ABioNet; see this publication for more information.

Bioinformatics skills are fundamental to the management and assessment of viral sequences. This course will introduce you to processing data programmatically, the data formats used in viral sequencing, how to determine the variant lineage (Delta, Omicron etc.), and how to share data so that others around the world can benefit. These skills are the building blocks for scaling up analysis to pandemic response levels.

Course website
Glossary

Course structure

This course uses Google Colab (https://colab.research.google.com/), a free-to-use service.

Colab is accessed through a Google Account, which can be created for free.

Time commitment

Contact sessions will run twice a week, with each session lasting 4 hours. The course will run from 31 October to 2 December 2022. Sessions will be held in two time-zone blocks: one for Oceania and Asia, and one for Latin America and Africa. Within each block, sessions will run at the same time, adjusted for regional time differences.

Target audience

The course is aimed at postgraduate scientists, postdoctoral scientists, junior faculty members, and clinicians or healthcare professionals based in Africa, Asia, and Latin America and the Caribbean. No prior bioinformatics skills are required.

Programme

The programme will cover the following core topics:

  • Intro to Python Notebooks
  • Intro to Unix/Linux & running commands
  • Introduction to NGS Technologies employed in SARS-CoV-2 sequencing
  • Data quality control
  • Workflows for sequencing analysis
  • Pangolin for lineage identification
  • Exploring genomics data in a global context
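
As a flavour of the data-format and quality-control topics, here is a minimal Python sketch that is not taken from the course notebooks. It assumes sequences are stored in FASTA format (a header line starting with `>` followed by one or more sequence lines) and uses the fraction of ambiguous `N` bases as one simple quality measure. The function names `read_fasta` and `n_fraction` and the example records are hypothetical.

```python
import io

def read_fasta(handle):
    """Yield (header, sequence) pairs from an open FASTA file handle."""
    header, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)

def n_fraction(sequence):
    """Return the fraction of bases that are the ambiguous base N."""
    if not sequence:
        return 0.0
    return sequence.upper().count("N") / len(sequence)

# Made-up example records, for illustration only.
example = io.StringIO(">sample_1\nACGTNNAC\nGTAC\n>sample_2\nACGT\n")
for name, seq in read_fasta(example):
    print(name, len(seq), round(n_fraction(seq), 3))
```

In a Colab notebook the same functions could be applied to a real file by passing `open("your_file.fasta")` instead of the in-memory example.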

Learning Outcomes

  • Apply command line tools for sequence data quality control
  • List file formats commonly used in SARS-CoV-2 sequencing
  • Use Pangolin to assign viral lineages to sets of existing data
  • List key metadata that must be included when uploading sequences to online repositories
  • Describe broad principles in translation of analysis outputs to outbreak/epi/pandemic response
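
To illustrate the lineage-identification outcome, the sketch below summarises a Pangolin report in Python. It assumes the report is a CSV file with `taxon` and `lineage` columns, as in Pangolin's lineage report; check the column names against the Pangolin version you use. The function name `count_lineages` and the example rows are hypothetical and do not come from the course materials.

```python
import csv
import io
from collections import Counter

def count_lineages(handle):
    """Count how many sequences were assigned to each lineage in a report."""
    reader = csv.DictReader(handle)
    return Counter(row["lineage"] for row in reader)

# Made-up report rows, for illustration only.
example_report = io.StringIO(
    "taxon,lineage\n"
    "sample_1,BA.2\n"
    "sample_2,B.1.617.2\n"
    "sample_3,BA.2\n"
)
# Print lineages from most to least frequent.
for lineage, count in count_lineages(example_report).most_common():
    print(lineage, count)
```

A tally like this is one simple way to move from per-sample lineage calls to the kind of summary used in outbreak reporting.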

Instructor Team

  • Ariel Amadio, Universidad Nacional de Rafaela, IDICAL, INTA-CONICET, Argentina
  • Blanca Taboada, Instituto de Biotecnología, Universidad Nacional Autónoma de México (UNAM); Consorcio Mexicano de Vigilancia Genómica (CoViGen-Mex)
  • Carlo Lapid, Philippine Genome Centre, Philippines
  • Elizabeth Batty, MORU Mahidol Oxford Tropical Medicine Research Unit, Thailand
  • Idowu Olawoye, African Centre of Excellence for Genomics of Infectious Diseases (ACEGID), Nigeria
  • Johan F Bernal, AGROSAVIA, Colombia
  • John Juma, International Livestock Research Institute (ILRI), Kenya; SANBI, University of the Western Cape, South Africa
  • Jorge Batista da Rocha, COG-Train, Wellcome Connecting Science, United Kingdom; Sydney Brenner Institute for Molecular Bioscience (SBIMB), University of the Witwatersrand, South Africa
  • Leigh Jackson, COG-Train, University of Exeter, United Kingdom
  • Marcela Suarez Esquivel, Universidad Nacional, Costa Rica
  • Paul Oluniyi, Chan Zuckerberg Biohub, University of California, San Francisco, USA
  • Progress Dube, National Biotechnology Authority, Zimbabwe
  • Una Ren, Special Pathogens Unit, Environmental Science and Research, New Zealand
  • Varun Shammana, Central Research Laboratory, Kempegowda Institute of Medical Sciences
  • Zahra Waheed, European Bioinformatics Institute, Wellcome Genome Campus, United Kingdom

Course manual

Introduction Week

Introduction Notebook - Begin here

Video Playlist - Introduction Week

Introduction Day 2 Dayplan

Module 1: Introduction to Notebooks & Unix command line
Module 1 Video Playlist (Parts 1 and 2)

Module 1 Part 1 Day Plan

Module 1 Part 2 Day Plan

Module 1 Part 1 and Part 2 Notebook Instructions

Bonus Videos for NGS technologies

Module 2: Data QC and Consensus sequences

Module 2 Video Playlist (Parts 1 and 2)

Module 2 Part 1 Day Plan

Module 2 Part 2 Day Plan

Module 2 Data QC and Consensus Notebook Instructions Parts 1,2,3

Module 3: Variant Lineage Identification
Module 3 Video - Variant Lineage Identification

Module 3 Part 1 Day Plan

Module 3 Variant Lineage Identification Notebook Instructions

Module 3 Part 2 Day Plan - Exercise

Module 4: Data sharing and interpretation

Module 4 Video Playlist (Please watch Sections 1-2 for Day 1, and Sections 3-7 for Day 2)

Module 4 Part 1 Day Plan

Module 4 Data Sharing and Interpretation Notebook Instructions

Module 4 Part 2 Day Plan (Exercises for Day 2 are in the videos for Sections 5-7)

Module 4 Slide deck pdf

Additional information

WCS LMS
COG-Train Online courses
Your digital mentor podcast
WCS courses and conferences

Any reuse of the course materials, data or code is encouraged with due acknowledgement.


License

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0) licence.
