Update figure supplement references

sidneymbell committed May 6, 2019
1 parent 81ae620 commit 69f3c96ecb6382725dd30e6536824783a92ea4b0
Showing with 5 additions and 5 deletions.
  1. +5 −5 manuscript/dengue-antigenic-dynamics.tex
@@ -151,15 +151,15 @@ \subsection*{Dengue antigenic evolution corresponds to genetic divergence}
We take the training data and fit $d_m$ for each mutation that is observed two or more times, subject to regularization as follows (also detailed in Methods, Eq.~\ref{eq_cost_fn}).
Parsimoniously, we expect that antigenic change is more likely to be incurred by a few key mutations than by many mutations; correspondingly, our prior expectation of values of $d_m$ is exponentially distributed such that most values of $d_m = 0$.
This is directly analogous to lasso regression to identify a few parameters with positive weights and set other parameters to 0 \citep{tibshirani1996regression}.
-Additionally, some viruses have greater binding avidity, and some sera are more potent than others (Figure~\ref{titer_asymmetry}); these `row' and `column' effects, respectively, are normally distributed and are taken into account when training the model.
+Additionally, some viruses have greater binding avidity, and some sera are more potent than others (Figure~\ref{titer_tree_heatmap}---Figure Supplement~\ref{titer_asymmetry}); these `row' and `column' effects, respectively, are normally distributed and are taken into account when training the model.
The model uses convex optimization to learn the values of $d_m$ that minimize the sum of squared errors (SSE) between observed and predicted titers in the training data.
We thus learn model parameters from the training data, and then use those parameters to predict test data values.
We assess model performance by comparing the predicted test titer values to the actual values, aggregated across 100-fold Monte Carlo cross validation.

This model formulation is an effective tool for estimating antigenic relationships between viruses based on their genetic sequences.
On average across cross-validation replicates, this model yields a root mean squared error (RMSE) of 0.75 when predicting titers relative to their true value (95\% CI 0.74--0.77, RMSE), and explains 78\% of the observed variation in neutralization titers overall (95\% CI 0.77--0.79, Pearson $R^2$).
This is comparable to the model error from a cartography-based characterization of the same dataset (RMSE 0.65--0.8 log$_2$ titer units) \citep{katzelnick2015dengue}.
-Prediction error was comparable between human and non-human primate sera, indicating that these genetic determinants of antigenic phenotypes are not host species-specific (Figure~\ref{species_titers}).
+Prediction error was comparable between human and non-human primate sera, indicating that these genetic determinants of antigenic phenotypes are not host species-specific (Figure~\ref{mutation_positions}---Figure Supplement~\ref{species_titers}).

\begin{figure}
\begin{fullwidth}
@@ -348,7 +348,7 @@ \subsection*{Antigenic novelty predicts serotype success}

To test this hypothesis, we examine the composition of the dengue virus population in Southeast Asia from 1970 to 2015.
We estimate the relative population frequency of each DENV serotype at three month intervals, $x_i(t)$ (Figure~\ref{serotype_fitness_model}A), based on their observed relative abundance in the `slice' of the phylogeny corresponding to each timepoint (N=8,644 viruses; see Methods, Eq.~\ref{eq_estimate_frequency}).
-While there is insufficient data to directly compare these estimated frequencies to regional case counts, we see good qualitative concordance between frequencies similarly estimated for Thailand and previously reported case counts from Bangkok (Figure~\ref{thai_frequencies_comparison}).
+While there is insufficient data to directly compare these estimated frequencies to regional case counts, we see good qualitative concordance between frequencies similarly estimated for Thailand and previously reported case counts from Bangkok (Figure~\ref{serotype_fitness_model}---Figure Supplement~\ref{thai_frequencies_comparison}).

Fitter virus clades increase in frequency over time, such that $x_i(t+dt) > x_i(t)$.
It follows that these clades have a growth rate---defined as the fold-change in frequency over time---greater than one: $\frac{x_i(t+dt)}{x_i(t)} > 1$.
@@ -472,7 +472,7 @@ \subsection*{Within-serotype antigenic heterogeneity}

Consistent with the relatively long timescale of dengue evolution, we observe many sites in the dengue phylogeny to have mutated multiple times.
These represent instances of parallelism, reversion and homoplasy.
-For example, we observe that site 390 is consistently S in DENV1, N in DENV3 and H in DENV4, while DENV2 genotypes show a mixture of D, N and S (\ref{phylogeny_homoplasy}).
+For example, we observe that site 390 is consistently S in DENV1, N in DENV3 and H in DENV4, while DENV2 genotypes show a mixture of D, N and S (Figure~\ref{mutation_positions}---Figure Supplement~\ref{phylogeny_homoplasy}).
We estimate an antigenic impact of 0.18 log$_2$ titers of the N390S mutation.
Our model predicts that the parallel N390S mutations in DENV1 and DENV2 Cosmopolitan make these viruses slightly more antigenically similar rather than more antigenically distinct.
Along these lines, we compared the `substitution' model to a similar model formulation (termed the `tree' model) which assigns $d_m$ values to individual branches in the phylogeny, rather than to individual mutations, so that each branch with a positive $d_m$ value increases antigenic distance between strains \citep{neher2016prediction}.
@@ -734,7 +734,7 @@ \subsubsection*{Model performance assessment and parameter fitting}
\end{table}

\subsubsection*{Simulations}
-To ensure the model machinery functions correctly, we seeded a forward simulation of clade dynamics with two years of empirical frequencies and simulated predicted dynamics over the remainder of the time course (Figure~\ref{simulation_parameter_recovery}).
+To ensure the model machinery functions correctly, we seeded a forward simulation of clade dynamics with two years of empirical frequencies and simulated predicted dynamics over the remainder of the time course (Figure~\ref{serotype_fitness_model}---Figure Supplement~\ref{simulation_parameter_recovery}).
We then fit model parameters as described above, and obtained parameter values that well recover input values (Table~\ref{simulation_parameters}).

%%% simulation_parameters %%%
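
The five reference updates in this commit share one pattern: a reference to a supplement label is rewritten as `Figure~\ref{parent}---Figure Supplement~\ref{supplement}`. The following is a minimal Python sketch of that rewrite, not the method actually used for the commit. The parent/supplement pairs are taken from the changed lines above; the function name `update_supplement_refs` and the `SUPPLEMENT_PARENT` dictionary are hypothetical and do not exist in the repository.

```python
import re

# Supplement label -> parent figure label, as paired in this commit's changed lines.
# (Hypothetical helper table; not part of the repository.)
SUPPLEMENT_PARENT = {
    "titer_asymmetry": "titer_tree_heatmap",
    "species_titers": "mutation_positions",
    "phylogeny_homoplasy": "mutation_positions",
    "thai_frequencies_comparison": "serotype_fitness_model",
    "simulation_parameter_recovery": "serotype_fitness_model",
}

# A \ref{label}, optionally preceded by "Figure Supplement~" or "Figure~".
REF = re.compile(
    r"(?P<prefix>Figure Supplement~|Figure~)?\\ref\{(?P<label>[^}]+)\}"
)


def update_supplement_refs(tex: str) -> str:
    """Rewrite references to supplement labels into the parent/supplement form."""

    def rewrite(match: re.Match) -> str:
        label = match.group("label")
        parent = SUPPLEMENT_PARENT.get(label)
        # Leave unrelated labels and already-updated references untouched.
        if parent is None or match.group("prefix") == "Figure Supplement~":
            return match.group(0)
        return f"Figure~\\ref{{{parent}}}---Figure Supplement~\\ref{{{label}}}"

    return REF.sub(rewrite, tex)
```

Because references that already carry the `Figure Supplement~` prefix are skipped, running the function a second time on its own output leaves the text unchanged. A bare `\ref{phylogeny_homoplasy}` with no `Figure~` prefix is also handled.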
