update READMEs and sample metadata

lmoncla committed Nov 14, 2019
1 parent 02fb8e2 commit d8f7386b88c496f712b458741758fa4f8e378705
Showing with 23 additions and 12 deletions.
  1. +2 −2 README.md
  2. +13 −2 data/README.md
  3. +8 −8 data/sample-metadata.tsv
@@ -6,7 +6,7 @@

## Abstract

Avian influenza viruses (AIVs) periodically cross species barriers and infect humans. The likelihood that an AIV will evolve mammalian transmissibility depends on acquiring and selecting mutations during spillover. We analyze deep sequencing data from infected humans and ducks in Cambodia to examine H5N1 evolution during spillover. Viral populations in both species are predominated by low-frequency (<10%) variation shaped by purifying selection and genetic drift. Viruses from humans contain some human-adapting mutations (PB2 E627K, HA A150V, and HA Q238L), but these mutations remain low-frequency. Within-host variants are not enriched along phylogenetic branches leading to human infections. Our data show that H5N1 viruses generate putative human-adapting mutations during natural spillover infection. However, short infections, randomness, and purifying selection limit the evolutionary capacity of H5N1 viruses within-host. Applying evolutionary methods to sequence data, we reveal a detailed view of H5N1 adaptive potential, and develop a foundation for studying host-adaptation in other zoonotic viruses.
Avian influenza viruses (AIVs) periodically cross species barriers and infect humans. The likelihood that an AIV will evolve mammalian transmissibility depends on acquiring and selecting mutations during spillover, but data from natural infection are limited. We analyze deep sequencing data from infected humans and domestic ducks in Cambodia to examine how H5N1 viruses evolve during spillover. Overall, viral populations in both species are dominated by low-frequency (<10%) variation shaped by purifying selection and genetic drift, and half of the variants detected within-host are never detected on the H5N1 virus phylogeny. However, we do detect a subset of mutations linked to human receptor binding and replication (PB2 E627K, HA A150V, and HA Q238L) that arose independently in multiple humans. PB2 E627K and HA A150V were also enriched along phylogenetic branches leading to human infections, suggesting that they are likely human-adaptive. Our data show that H5N1 viruses generate putative human-adapting mutations during natural spillover infection, many of which are detected at >5% frequency within-host. However, short infection times, genetic drift, and purifying selection likely restrict their ability to evolve extensively during a single infection. Applying evolutionary methods to sequence data, we reveal a detailed view of H5N1 virus adaptive potential, and develop a foundation for studying host-adaptation in other zoonotic viruses.

## Install

@@ -15,7 +15,7 @@ Avian influenza viruses (AIVs) periodically cross species barriers and infect hu
## Project structure

* [`auspice/`](auspice/): contains JSON trees viewable via Nextstrain
* [`data/`](data/): contains consensus genomes, within-host SNV calls and phylogenies
* [`data/`](data/): contains sample metadata, consensus genomes, within-host SNV calls, coverage data in pileup format, coding region annotations in gtf format, and phylogenies
* [`figures/`](figures/): contains Jupyter notebooks to generate manuscript figures
* [`scripts/`](scripts/): contains processing scripts

@@ -1,11 +1,22 @@
# Data

## Trees
Tree files shown in Figure 1 are available in json format [here](https://github.com/blab/h5n1-cambodia/tree/master/data/tree-jsons). These jsons were generated using the [Nextstrain avian-flu](https://github.com/nextstrain/avian-flu) pipeline with no geographic or regional subsampling.
## Sample metadata
Sample metadata are provided in `sample-metadata.tsv`. For each sample, the file lists the strain name, host species, month and year of sample collection, type of sample, sample collection method, vRNA copies/µl as assessed by RT-qPCR, days post symptom onset (human samples only), and viral clade.

## Consensus genomes
All consensus sequences are available [here](https://github.com/blab/h5n1-cambodia/tree/master/data/consensus-genomes). The fasta header contains the following information: strain name | sample collection date | country of sampling | host species.
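
The header layout described above can be split into named fields with a few lines of Python. This is a minimal sketch, not part of the repository: the function name `parse_header`, the `ConsensusHeader` type, and the example header string (including its date format) are assumptions made for illustration, and the real headers may use different spacing around the `|` separator.

```python
from typing import NamedTuple


class ConsensusHeader(NamedTuple):
    """Fields of a consensus-genome FASTA header, in the documented order."""
    strain: str
    collection_date: str
    country: str
    host: str


def parse_header(line: str) -> ConsensusHeader:
    """Split a '|'-delimited FASTA header line into its four fields."""
    # Drop the leading '>' and tolerate optional spaces around each '|'.
    fields = [f.strip() for f in line.strip().lstrip(">").split("|")]
    if len(fields) != 4:
        raise ValueError(f"expected 4 fields, found {len(fields)}: {line!r}")
    return ConsensusHeader(*fields)


# Hypothetical header; the date format is an assumption.
example = parse_header(">A/Cambodia/V0401301/2011 | 2011-04 | Cambodia | human")
print(example.strain, example.host)
```

Raising on an unexpected field count makes a malformed header fail loudly instead of shifting values into the wrong fields.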

## GTFs
These files contain annotations of the coding regions for each sample genome, in GTF format.

## Pileup files
These files contain coverage and quality information for each base covered by sequence data for each sample in this dataset. These files were used to calculate and plot coverage information. Pileup format is described [here](http://samtools.sourceforge.net/pileup.shtml).
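
As a rough illustration of how coverage can be summarized from these files, the sketch below computes mean depth per reference sequence. It assumes the standard six-column, tab-separated samtools pileup layout (reference name, 1-based position, reference base, depth, read bases, base qualities). The function name `mean_depth_by_segment` and the example lines are hypothetical, and this is not necessarily how the coverage plots in the manuscript were produced.

```python
from collections import defaultdict
from typing import Dict, Iterable


def mean_depth_by_segment(lines: Iterable[str]) -> Dict[str, float]:
    """Average the depth column (4th field) for each reference sequence."""
    totals: Dict[str, int] = defaultdict(int)
    counts: Dict[str, int] = defaultdict(int)
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        cols = line.rstrip("\n").split("\t")
        segment, depth = cols[0], int(cols[3])
        totals[segment] += depth
        counts[segment] += 1
    # Positions absent from the pileup do not contribute to the mean.
    return {seg: totals[seg] / counts[seg] for seg in totals}


# Hypothetical pileup lines for two gene segments.
example_lines = [
    "HA\t1\tA\t10\t..........\tIIIIIIIIII",
    "HA\t2\tT\t20\t....................\tIIIIIIIIIIIIIIIIIIII",
    "NA\t1\tG\t4\t....\tIIII",
]
print(mean_depth_by_segment(example_lines))
```

To run this on a file, pass an open file handle, for example `mean_depth_by_segment(open(path))`, where `path` points to one of the pileup files.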

## Nucleotide diversity data
Nonsynonymous and synonymous diversity were calculated for each coding region (PB2, PB1, PA, HA, NP, NA, M1, M2, NS1, and NEP) for each sample in this dataset using [SNPGenie](https://github.com/chasewnelson/SNPGenie). Combined results from all genes and samples are available in `pi-values.tsv`.

## Trees
Tree files shown in Figure 1 are available in json format [here](https://github.com/blab/h5n1-cambodia/tree/master/data/tree-jsons). These jsons were generated using the [Nextstrain avian-flu](https://github.com/nextstrain/avian-flu) pipeline with no geographic or regional subsampling.

## Within-host data
Human reads were removed from all raw fastq files by mapping to the human reference genome GRCh38 with bowtie2. Only unmapped reads were further processed and used for data analysis. The raw fastq files with human reads filtered out are all publicly available in the Sequence Read Archive under the accession number [PRJNA547644](https://www.ncbi.nlm.nih.gov/sra/?term=PRJNA547644), accession numbers SRX5984186-SRX5984198. All within-host variants reported in the manuscript and analyzed are available [here](https://github.com/blab/h5n1-cambodia/blob/master/data/within-host-variants-1%25.tsv). This data file includes all variants present at a frequency of at least 1% in all human and duck samples. FASTQ files were processed and variants called using [this pipeline](https://github.com/lmoncla/illumina_pipeline), briefly outlined below:
@@ -4,11 +4,11 @@ A/duck/Cambodia/083D1/2011 Domestic duck Pooled organs Poultry outbreak investig
A/duck/Cambodia/381W11M4/2013 Domestic duck Pooled throat and cloacal swab Live bird market surveillance March 2013 NA 7.37 x 10^5 1.1.2/2.3.2.1a reassortant
A/duck/Cambodia/Y0224301/2014 Domestic duck Pooled organs Poultry outbreak investigation February 2014 NA 2.0 x 10^5 1.1.2/2.3.2.1a reassortant
A/duck/Cambodia/Y0224304/2014 Domestic duck Pooled organs Poultry outbreak investigation February 2014 NA 5.0 x 10^6 1.1.2/2.3.2.1a reassortant
A/Cambodia/V0401301/2011 Human (10F, died) Throat swab Event-based surveillance April 2011 9 5.02 x 10^3 1.1.2
A/Cambodia/V0417301/2011 Human (5F, died) Throat swab Event-based surveillance April 2011 5 8.98 x 10^4 1.1.2
A/Cambodia/W0112303/2012 Human (2M, died) Throat swab Event-based surveillance January 2012 7 2.05 x 10^3 1.1.2
A/Cambodia/X0125302/2013 Human (1F, died) Throat swab Event-based surveillance January 2013 12 6.84 x 10^4 1.1.2/2.3.2.1a reassortant
A/Cambodia/X0128304/2013 Human (9F, died) Throat swab Event-based surveillance January 2013 8 5.09 x 10^3 1.1.2/2.3.2.1a reassortant
A/Cambodia/X0207301/2013 Human (5F, died) Throat swab Event-based surveillance February 2013 12 1.73 x 10^5 1.1.2/2.3.2.1a reassortant
A/Cambodia/X0219301/2013 Human (2M, died) Throat swab Event-based surveillance February 2013 12 1.66 x 10^3 1.1.2/2.3.2.1a reassortant
A/Cambodia/X1030304/2013 Human (2F, died) Throat swab Event-based surveillance October 2013 8 1.08 x 10^4 1.1.2/2.3.2.1a reassortant
A/Cambodia/V0401301/2011 Human Throat swab Event-based surveillance April 2011 9 5.02 x 10^3 1.1.2
A/Cambodia/V0417301/2011 Human Throat swab Event-based surveillance April 2011 5 8.98 x 10^4 1.1.2
A/Cambodia/W0112303/2012 Human Throat swab Event-based surveillance January 2012 7 2.05 x 10^3 1.1.2
A/Cambodia/X0125302/2013 Human Throat swab Event-based surveillance January 2013 12 6.84 x 10^4 1.1.2/2.3.2.1a reassortant
A/Cambodia/X0128304/2013 Human Throat swab Event-based surveillance January 2013 8 5.09 x 10^3 1.1.2/2.3.2.1a reassortant
A/Cambodia/X0207301/2013 Human Throat swab Event-based surveillance February 2013 12 1.73 x 10^5 1.1.2/2.3.2.1a reassortant
A/Cambodia/X0219301/2013 Human Throat swab Event-based surveillance February 2013 12 1.66 x 10^3 1.1.2/2.3.2.1a reassortant
A/Cambodia/X1030304/2013 Human Throat swab Event-based surveillance October 2013 8 1.08 x 10^4 1.1.2/2.3.2.1a reassortant
