blab/ncov-wh

Viral genome sequencing places White House COVID-19 outbreak into phylogenetic context

Trevor Bedford1,2,3, Jennifer K. Logue4, Peter D. Han2,3, Caitlin R. Wolf4, Chris D. Frazar3, Benjamin Pelle3, Erica Ryke3, Jover Lee1, Mark J. Rieder2,3, Deborah A. Nickerson2,3, Christina M. Lockwood5, Lea M. Starita2,3, Helen Y. Chu2,4, Jay Shendure2,3,6

1 Vaccine and Infectious Disease Division, Fred Hutchinson Cancer Research Center, Seattle, WA, USA
2 Brotman Baty Institute for Precision Medicine, Seattle, WA, USA
3 Department of Genome Sciences, University of Washington, Seattle, WA, USA
4 Department of Medicine, Division of Allergy and Infectious Diseases, University of Washington, Seattle, WA, USA
5 Department of Laboratory Medicine and Pathology, University of Washington, Seattle, WA, USA
6 Howard Hughes Medical Institute, Seattle, WA, USA

Abstract

In October 2020, an outbreak of at least 50 COVID-19 cases was reported surrounding individuals employed at or visiting the White House. Here, we applied genomic epidemiology to investigate the origins of this outbreak. We enrolled two individuals with exposures linked to the White House COVID-19 outbreak into an IRB-approved research study and sequenced their SARS-CoV-2 infections. We find these viral sequences are highly genetically similar to each other, but are distinct from over 160,000 publicly available SARS-CoV-2 genomes, possessing 5 nucleotide mutations that differentiate this lineage from all other circulating lineages sequenced to date. We estimate this lineage has a common ancestor in the USA in April or May 2020, but its whereabouts for the previous 5 to 6 months are not clear. Looking forwards, sequencing of additional community SARS-CoV-2 infections collected in the USA prior to October 2020 may reveal linked infections and shed light on its geographic ancestry. In sequencing of SARS-CoV-2 infections collected after October 2020, the relative rarity of this constellation of mutations may make it possible to identify infections that likely descend from the White House COVID-19 outbreak.

Repo organization

Nextstrain build

Most of the repository is a fork of github.com/nextstrain/ncov that has been tailored to run Nextstrain builds focused on sample WH1 (USA/DC-BBI1/2020). To reproduce this build, manually combine the GISAID data (data/sequences.fasta and data/metadata.tsv) with the repo-specific data (data/wh_sequences.fasta and data/wh_metadata.tsv), and save the concatenated files as data/sequences.fasta and data/metadata.tsv.
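The concatenation step can be done by hand or with a short script. The following is a minimal sketch, not part of this repository: the function names and the input file names in the commented usage (data/gisaid_sequences.fasta, data/gisaid_metadata.tsv) are hypothetical, chosen so that the GISAID downloads are not overwritten while they are being read. It assumes the metadata files are tab-separated with a header row, and merges them by column name.

```python
import csv


def concat_fasta(inputs, output):
    """Write the FASTA files in `inputs` end to end into `output`.

    A newline is added after any file that does not end with one, so the
    first header of the next file always starts on its own line.
    """
    with open(output, "w") as out:
        for path in inputs:
            last = "\n"
            with open(path) as handle:
                for line in handle:
                    out.write(line)
                    last = line
            if not last.endswith("\n"):
                out.write("\n")


def concat_metadata(inputs, output):
    """Merge tab-separated metadata files by column name.

    Columns absent from one file are left empty for that file's rows.
    """
    rows, columns = [], []
    for path in inputs:
        with open(path, newline="") as handle:
            reader = csv.DictReader(handle, delimiter="\t")
            for name in reader.fieldnames or []:
                if name not in columns:
                    columns.append(name)
            rows.extend(reader)
    with open(output, "w", newline="") as out:
        writer = csv.DictWriter(
            out, fieldnames=columns, delimiter="\t",
            restval="", extrasaction="ignore", lineterminator="\n",
        )
        writer.writeheader()
        writer.writerows(rows)


# Hypothetical usage (input names for the GISAID downloads are assumptions):
# concat_fasta(["data/gisaid_sequences.fasta", "data/wh_sequences.fasta"],
#              "data/sequences.fasta")
# concat_metadata(["data/gisaid_metadata.tsv", "data/wh_metadata.tsv"],
#                 "data/metadata.tsv")
```

The output path should differ from every input path, since the output file is opened for writing before the inputs are read.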

The endpoints for this build are auspice/ncov_wh_background.json, auspice/ncov_wh_context.json and auspice/ncov_wh_lineage.json. All three endpoints can be generated by running snakemake -p --profile my_profiles/wh.

This generates Figures 1 and 2 of the manuscript.
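After the Snakemake run finishes, it can be useful to confirm that all three endpoints were written. The sketch below is an illustrative helper, not something shipped with this repository: the function name check_endpoints is hypothetical, and the only checks performed are that each file exists and parses as JSON.

```python
import json
from pathlib import Path

# Endpoint file names as listed in this README.
ENDPOINTS = (
    "ncov_wh_background.json",
    "ncov_wh_context.json",
    "ncov_wh_lineage.json",
)


def check_endpoints(auspice_dir="auspice"):
    """Return a dict mapping each endpoint file name to a status string."""
    status = {}
    for name in ENDPOINTS:
        path = Path(auspice_dir) / name
        if not path.is_file():
            status[name] = "missing"
            continue
        try:
            with open(path) as handle:
                json.load(handle)
            status[name] = "ok"
        except json.JSONDecodeError:
            status[name] = "invalid JSON"
    return status


# Hypothetical usage, run from the repository root after the build:
# for name, state in check_endpoints().items():
#     print(name, state)
```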

The build relies on collecting "adjacent" strains from the global alignment by running:

python scripts/identify-matching-haplotypes.py --alignment ../ncov/results/aligned.fasta

and then adding the identified strains to default/include.txt so that they are included in the build.
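The script's internals are not described here, so the following is only an illustration of the general idea of finding "adjacent" strains, not a reproduction of scripts/identify-matching-haplotypes.py. It assumes an aligned FASTA in which every sequence has the same length, treats N, - and ? as uncalled sites, and reports strains within a chosen number of differences from a focal strain. The function names and the max_distance threshold are assumptions made for this sketch.

```python
UNCALLED = set("N-?")


def read_fasta(path):
    """Yield (name, sequence) pairs from a FASTA file, one record at a time."""
    name, chunks = None, []
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if line.startswith(">"):
                if name is not None:
                    yield name, "".join(chunks)
                name, chunks = line[1:].strip(), []
            elif line:
                chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def distance(a, b):
    """Count sites where both sequences have a called base and differ."""
    return sum(
        1
        for x, y in zip(a.upper(), b.upper())
        if x != y and x not in UNCALLED and y not in UNCALLED
    )


def adjacent_strains(alignment_path, focal, max_distance=2):
    """Return sorted names of strains within max_distance of the focal strain."""
    # First pass: locate the focal sequence without holding the alignment in memory.
    focal_seq = next(
        (seq for name, seq in read_fasta(alignment_path) if name == focal), None
    )
    if focal_seq is None:
        raise KeyError(f"{focal} not found in {alignment_path}")
    # Second pass: stream every other record and compare it to the focal sequence.
    return sorted(
        name
        for name, seq in read_fasta(alignment_path)
        if name != focal and distance(focal_seq, seq) <= max_distance
    )


# Hypothetical usage:
# adjacent_strains("../ncov/results/aligned.fasta", "USA/DC-BBI1/2020")
```

Ignoring uncalled sites means that heavily masked genomes can appear close to the focal strain; a stricter filter on genome coverage may be appropriate in practice.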

Branch length distribution

Overall statistics for branch length distributions are calculated in branch-length-distribution.
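The analysis code in branch-length-distribution is not reproduced here. As a rough illustration of what a branch length distribution is, the sketch below assumes a tree in the Auspice v2 JSON layout, where each node carries a cumulative divergence in node_attrs.div and its descendants in children, and takes the length of a branch as the child's divergence minus its parent's. The function names are hypothetical and the summary statistics chosen are an assumption, not the manuscript's method.

```python
import statistics


def branch_lengths(root):
    """Return the list of branch lengths (child div minus parent div) in a tree."""
    lengths = []
    stack = [root]
    while stack:
        parent = stack.pop()
        parent_div = parent["node_attrs"]["div"]
        for child in parent.get("children", []):
            lengths.append(child["node_attrs"]["div"] - parent_div)
            stack.append(child)
    return lengths


def summarize(lengths):
    """Return simple summary statistics for a non-empty list of branch lengths."""
    return {
        "n": len(lengths),
        "mean": statistics.mean(lengths),
        "median": statistics.median(lengths),
        "max": max(lengths),
    }


# Hypothetical usage with an Auspice JSON produced by the build:
# import json
# with open("auspice/ncov_wh_lineage.json") as handle:
#     tree = json.load(handle)["tree"]
# print(summarize(branch_lengths(tree)))
```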

This generates Figure 3 of the manuscript.
