Overview

A collection of scripts for SARS-CoV-2 genomics analysis automation

Our automation solution comes in the form of a collection of independent small scripts powered by the BioBlend library for interacting with the Galaxy API and by the workflow execution functionality of the planemo command-line utilities.
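As a rough illustration of the workflow-execution side, the sketch below assembles a `planemo run` command line aimed at an existing Galaxy server. It is not taken from the scripts themselves: the helper name `build_planemo_command`, the file names, the server URL, and the history ID are all hypothetical. The options shown (`--engine external_galaxy`, `--galaxy_url`, `--galaxy_user_key`, `--history_id`) are assumed to match your installed planemo version, so check `planemo run --help` before relying on them.

```python
import shlex


def build_planemo_command(workflow_path, job_path, galaxy_url, api_key, history_id):
    """Return the argument list for running a workflow on an external Galaxy server.

    Hypothetical helper for illustration; the actual scripts may call planemo differently.
    """
    return [
        "planemo", "run", workflow_path, job_path,
        "--engine", "external_galaxy",   # use an existing server instead of a local one
        "--galaxy_url", galaxy_url,
        "--galaxy_user_key", api_key,
        "--history_id", history_id,      # assumed option: run inside an existing history
    ]


# Placeholder values; none of these come from the original scripts.
cmd = build_planemo_command(
    "variation.ga", "job.yml", "https://galaxy.example.org", "YOUR_API_KEY", "abc123"
)
print(shlex.join(cmd))
```

Building the command as a list rather than a single string keeps it safe to pass to `subprocess.run` without shell quoting problems.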

Tag-based orchestration of scripts

The actions of all scripts in the collection are controlled and coordinated via a system of Galaxy dataset and history tags that is used to communicate input data availability and the state of the overall analysis progress.
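The idea of tag-based coordination can be sketched in a few lines of plain Python. The example below works on dictionaries shaped like the history and dataset records that the Galaxy API returns (each with a `tags` list), which is an assumption about the data layout. The tag names `bot-go` and `bot-processed` and both function names are hypothetical and are not the tags the scripts actually use.

```python
def histories_with_tag(histories, tag):
    """Return the histories that carry the given tag."""
    return [h for h in histories if tag in h.get("tags", [])]


def datasets_without_tag(datasets, done_tag):
    """Return the datasets that have not yet been marked with done_tag."""
    return [d for d in datasets if done_tag not in d.get("tags", [])]


# Mock records standing in for API results (hypothetical IDs and tag names).
histories = [
    {"id": "h1", "name": "run A", "tags": ["bot-go"]},
    {"id": "h2", "name": "scratch", "tags": []},
]
datasets = [
    {"id": "d1", "name": "batch1.txt", "tags": ["bot-processed"]},
    {"id": "d2", "name": "batch2.txt", "tags": []},
]

todo_histories = histories_with_tag(histories, "bot-go")
todo_datasets = datasets_without_tag(datasets, "bot-processed")
print([h["id"] for h in todo_histories], [d["id"] for d in todo_datasets])
```

With this pattern, one script only needs to know which tag signals "input ready", and it adds a different tag when it is done so that the next script (or its own next run) can pick up from there.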

When run together, the scripts support a fully automated SARS-CoV-2 sequence data analysis pipeline that includes:

  • raw sequencing data upload into Galaxy and organization of the data into dataset collections
  • variant calling using one of our highly sensitive published workflows for either:
  • generation of reports of all identified variants
  • reliable consensus sequence building including masking of questionable sites
  • export of key analysis results (BAM files of aligned reads, VCF files of called variants, FASTA consensus sequences) to a user-specific FTP folder for simplified downloading with standard FTP clients

The full pipeline with all script actions looks like this:

  1. You upload simple text files with download links for your sequencing data into a Galaxy history on a Galaxy server of your choice (yes, all scripts work on any Galaxy server you have a user account on).

    All links in one dataset are treated as one batch of data and analyzed together. Add as many datasets as you want to one or more tagged histories, and repeated runs of the scripts will process the batches one at a time.

  2. You add a history tag recognized by the variation script, which marks that history as one holding datasets with download links that should be processed.

  3. You arrange alternating runs of the scripts and watch the automated batch-wise analysis of your data live in the Galaxy UI!
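The batch-wise behaviour described in the steps above can be sketched as follows: each text dataset of download links is one batch, and every run picks up exactly one batch that has not been handled yet. This is a minimal simulation under stated assumptions, not the scripts' actual logic. The function names, the rule that blank lines and `#` comment lines are skipped, and the example URLs are all hypothetical.

```python
def parse_links(text):
    """Extract download links from a text dataset, one link per line.

    Skipping blank lines and '#' comments is an assumption made for this sketch.
    """
    links = []
    for line in text.splitlines():
        line = line.strip()
        if line and not line.startswith("#"):
            links.append(line)
    return links


def next_batch(link_datasets, processed_ids):
    """Return the first dataset not yet processed, or None if all are done."""
    for dataset in link_datasets:
        if dataset["id"] not in processed_ids:
            return dataset
    return None


# Two mock link datasets in one tagged history (hypothetical content).
link_datasets = [
    {"id": "d1", "text": "https://example.org/s1_R1.fastq.gz\nhttps://example.org/s1_R2.fastq.gz\n"},
    {"id": "d2", "text": "# second batch\nhttps://example.org/s2_R1.fastq.gz\n\n"},
]

processed_ids = set()
batch_sizes = []
# Each loop iteration stands in for one scheduled run of the scripts.
while (batch := next_batch(link_datasets, processed_ids)) is not None:
    links = parse_links(batch["text"])
    batch_sizes.append((batch["id"], len(links)))
    processed_ids.add(batch["id"])  # in Galaxy this state would be recorded as a tag

print(batch_sizes)
```

Processing one batch per run keeps each run short and means a failed run affects only the batch it was working on.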

Learn more about:

Contributions welcome

The current scripts support our COG-UK tracking efforts on usegalaxy.* instances quite well, but we hope to expand the collection based on feedback and contributions from independent users like you!

Bug reports, ideas, patches, additional scripts - whatever you can provide is very welcome!