Estimating the incubation time of the novel coronavirus (nCoV-2019) based on traveler data using coarse data tools

Real-time estimation of the novel coronavirus incubation time

Updated: Mon Mar 2 11:29:41 2020

Read the medRxiv preprint

Our lab has been collecting data (freely available at data/nCoV-IDD-traveler-data.csv) on the exposure and symptom onset for novel coronavirus (COVID-19) cases that have been confirmed outside of the Hubei province. These cases have been confirmed either in other countries or in regions of China with no known local transmission. We search for news articles and reports in both English and Chinese and abstract the data necessary to estimate the incubation period of COVID-19. Two team members independently review the full text of each case report to ensure that data is correctly input. Discrepancies are resolved by discussion and consensus.

Data summary

There are 181 cases from 49 countries and provinces outside of Hubei, China. Of those, 69 are known to be female (38%) and 108 to be male (60%). The median age is about 44.5 years (IQR: 34-55.5). 81 cases are from Mainland China (45%), while 100 are from the rest of the world (55%). 99 cases presented with a fever (55%).

This figure displays the exposure and symptom onset windows for each case in our dataset, relative to the right-bound of the exposure window (ER). The blue bars indicate the exposure windows and the red bars indicate the symptom onset windows for each case. Purple areas are where those two bars overlap.

The bars where the exposure and symptom onset windows completely overlap are frequently travelers from Wuhan who were symptomatic on arrival in another country that did not release further details. These cases could have been exposed or symptomatic at any point prior to their trip.

Exposure and symptom onset windows

The necessary components for estimating the incubation period are left and right bounds for the exposure (EL and ER) and symptom onset times (SL and SR) for each case. We use explicit dates and times when they are reported in the source documents; when they are not available, we make the following assumptions:

  • For cases without a reported right-bound on symptom onset time (SR), we use the time that the case is first presented to a hospital or, lacking that, the time that the source document was published
  • For cases without an EL, we use 2019 December 1, which was the onset date for the first reported COVID-19 case; though we will test this assumption later
  • For cases without an ER, we use the SR
  • For cases without an SL, we use the EL

Under these assumptions, the median exposure interval was 49 days (range: 1-81.8) and the median symptom onset interval was 1 day (range: 0-81.8).
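The fallback rules listed above can be sketched in a few lines of Python. This is only an illustration of the stated assumptions, not the code used in our analysis; the function name `impute_windows`, the `Windows` tuple, and the argument names are hypothetical and do not correspond to columns in the data file.

```python
from datetime import datetime
from typing import NamedTuple, Optional

# Assumed default left bound of exposure: onset date of the first reported case.
DEFAULT_EL = datetime(2019, 12, 1)


class Windows(NamedTuple):
    EL: datetime  # left bound of exposure
    ER: datetime  # right bound of exposure
    SL: datetime  # left bound of symptom onset
    SR: datetime  # right bound of symptom onset


def impute_windows(
    EL: Optional[datetime],
    ER: Optional[datetime],
    SL: Optional[datetime],
    SR: Optional[datetime],
    first_hospital_visit: Optional[datetime] = None,
    report_published: Optional[datetime] = None,
) -> Windows:
    """Fill in missing bounds using the fallback rules from the list above."""
    # SR is resolved first because a missing ER falls back to it.
    if SR is None:
        SR = first_hospital_visit if first_hospital_visit is not None else report_published
    if SR is None:
        raise ValueError("need SR, a hospital presentation time, or a publication time")
    # EL is resolved before SL because a missing SL falls back to it.
    if EL is None:
        EL = DEFAULT_EL
    if ER is None:
        ER = SR
    if SL is None:
        SL = EL
    return Windows(EL, ER, SL, SR)


# A hypothetical traveler for whom only the report's publication date is known.
example = impute_windows(None, None, None, None, report_published=datetime(2020, 1, 25))
```

For a case like `example`, the exposure and symptom onset windows coincide completely, which is the pattern described for travelers who were symptomatic on arrival.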

Incubation period estimates

We estimate the incubation period using the coarseDataTools package based on the paper by Reich et al, 2009. We assume a log-normal incubation period and use a bootstrap method for calculating confidence intervals.
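To give a sense of what fitting to coarse data involves, the sketch below computes a likelihood contribution for one case whose exposure and onset times are only known as windows. It is a simplified illustration in Python, not the coarseDataTools implementation: it assumes exposure is uniformly distributed over the exposure window, measures times in days from an arbitrary origin, and uses a plain midpoint rule. The function names and the three example cases are hypothetical.

```python
import math
from statistics import NormalDist


def lognormal_cdf(t, meanlog, sdlog):
    """P(incubation time <= t) for a log-normal incubation period."""
    if t <= 0:
        return 0.0  # symptoms cannot precede exposure
    return NormalDist(meanlog, sdlog).cdf(math.log(t))


def case_likelihood(EL, ER, SL, SR, meanlog, sdlog, steps=400):
    """Probability that onset falls in [SL, SR], averaged over exposure
    times spread uniformly across [EL, ER] (midpoint rule)."""
    width = (ER - EL) / steps
    total = 0.0
    for i in range(steps):
        e = EL + (i + 0.5) * width  # candidate exposure time
        total += lognormal_cdf(SR - e, meanlog, sdlog) - lognormal_cdf(SL - e, meanlog, sdlog)
    return total / steps


def log_likelihood(cases, meanlog, sdlog):
    """Sum of log-likelihood contributions; tiny floor avoids log(0)."""
    return sum(
        math.log(max(case_likelihood(*case, meanlog, sdlog), 1e-300))
        for case in cases
    )


# Hypothetical cases as (EL, ER, SL, SR), in days from an arbitrary origin.
cases = [(0, 1, 5, 6), (0, 10, 4, 12), (0, 3, 3, 9)]

ll_short = log_likelihood(cases, meanlog=1.621, sdlog=0.418)  # median about 5 days
ll_long = log_likelihood(cases, meanlog=3.0, sdlog=0.418)     # median about 20 days
```

A fitting routine would search for the `meanlog` and `sdlog` that maximize this log-likelihood; here the two candidate parameter sets are simply compared, and the shorter incubation period fits these made-up cases better.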

The first model is fit to all of the data. The output includes the median, 2.5th, and 97.5th quantiles (and their confidence intervals):

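|         | est    | CIlow | CIhigh |
|---------|--------|-------|--------|
| meanlog | 1.621  | 1.504 | 1.755  |
| sdlog   | 0.418  | 0.271 | 0.542  |
| p2.5    | 2.228  | 1.750 | 2.942  |
| p5      | 2.542  | 2.052 | 3.234  |
| p25     | 3.814  | 3.339 | 4.378  |
| p50     | 5.057  | 4.500 | 5.785  |
| p75     | 6.705  | 5.664 | 7.895  |
| p95     | 10.061 | 7.545 | 13.227 |
| p97.5   | 11.478 | 8.230 | 15.638 |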

The median incubation period is 5.057 days (CI: 4.5-5.785). About 2.5% of incubation periods are shorter than 2.228 days (CI: 1.75-2.942), while 97.5% of the population would experience symptoms within 11.478 days (CI: 8.23-15.638) of their exposure. The ‘meanlog’ and ‘sdlog’ estimates are the mean and standard deviation of the incubation time on the log scale, so the median is exp(meanlog); i.e. we recommend using a LogNormal(1.621, 0.418) distribution to appropriately represent the incubation time distribution.
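As a quick check on how the parameters relate to the quantiles, the following Python sketch recovers quantiles of a LogNormal(1.621, 0.418) distribution from the point estimates alone. The function name `lognormal_quantile` is our own for illustration. Because the inputs are rounded, the results should agree with the table only to about two decimal places, and the sketch says nothing about the confidence intervals.

```python
import math
from statistics import NormalDist

MEANLOG, SDLOG = 1.621, 0.418  # point estimates reported above


def lognormal_quantile(p, meanlog=MEANLOG, sdlog=SDLOG):
    """Quantile of a log-normal: exponentiate the matching normal quantile."""
    z = NormalDist().inv_cdf(p)  # standard normal quantile
    return math.exp(meanlog + sdlog * z)


quantiles = {p: lognormal_quantile(p) for p in (0.025, 0.5, 0.975)}
for p, days in quantiles.items():
    print(f"{p:>5.1%} of incubation periods are shorter than {days:.2f} days")
```

The values come out at roughly 2.2, 5.1, and 11.5 days, in line with the p2.5, p50, and p97.5 estimates.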

Alternate estimates and sensitivity analyses

Alternate parameterizations

We fit other commonly-used parameterizations of the incubation period as comparisons to the log-normal distribution: gamma, Weibull, and Erlang.

The median estimates are very similar across parameterizations, while the Weibull distribution has a slightly smaller value at the 2.5th percentile and the log-normal distribution has a slightly larger value at the 97.5th percentile. The log-likelihoods were very similar between distributions; the log-normal distribution having the largest log-likelihood (55.16) and the Weibull distribution having the smallest log-likelihood (51.89).

The gamma distribution has an estimated shape parameter of 5.81 (95% CI: 3.58-13.87) and a scale parameter of 0.95 (95% CI: 0.37-1.7). The Weibull distribution has an estimated shape parameter of 2.45 (95% CI: 1.92-4.17) and a scale parameter of 6.26 (95% CI: 5.36-7.26). The Erlang distribution has an estimated shape parameter of 6 (95% CI: 3-10) and a scale parameter of 0.87 (95% CI: 0.56-1.95).
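The tail differences between parameterizations can be illustrated from the point estimates alone. The Python sketch below compares quantiles of the fitted Weibull and log-normal distributions using their closed-form quantile functions; it ignores the confidence intervals, and the helper names are ours. The gamma and Erlang quantiles have no closed form and are left out of this sketch.

```python
import math
from statistics import NormalDist


def weibull_quantile(p, shape, scale):
    """Closed-form Weibull quantile: scale * (-ln(1 - p))^(1 / shape)."""
    return scale * (-math.log(1.0 - p)) ** (1.0 / shape)


def lognormal_quantile(p, meanlog, sdlog):
    """Log-normal quantile via the standard normal quantile."""
    return math.exp(meanlog + sdlog * NormalDist().inv_cdf(p))


# Point estimates reported in this README.
weibull = {p: weibull_quantile(p, shape=2.45, scale=6.26) for p in (0.025, 0.5, 0.975)}
lognormal = {p: lognormal_quantile(p, meanlog=1.621, sdlog=0.418) for p in (0.025, 0.5, 0.975)}

for p in (0.025, 0.5, 0.975):
    print(f"{p:>5.1%}: Weibull {weibull[p]:.2f} days, log-normal {lognormal[p]:.2f} days")
```

With these inputs the Weibull gives the smaller 2.5th percentile and the log-normal the larger 97.5th percentile, while the medians differ by well under a day.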

Sensitivity analyses

To make sure that our overall incubation estimates are sound, we ran a few analyses on subsets to see if the results held up. Since the winter often brings cold air and other pathogens that can cause sore throats and coughs, we ran an analysis using only cases that reported a fever. Since a plurality of our cases came from Mainland China, where assumptions about local transmission may be less firm, we ran an analysis without those cases. Finally, we challenged our assumption that unknown ELs can be assumed to be 2019 December 1 (Nextstrain estimates that the first case could have occurred as early as September), by setting unknown ELs to 2018 December 1.

Using only fevers, the estimates are 0.409 to 1.053 days longer than the estimates on the full data. 12 of the cases with a fever reported having other symptoms beforehand. While it may take a little longer for an exposure to cause a fever, the estimates are similar to those of the overall results. The confidence intervals are wider here at every quantile due to having less data.

Using only cases from outside of Mainland China, the estimates range from 0.156 days shorter to 3.264 days longer than the estimates on the full data. There is a bit of a gap on the long end of the tail, but the confidence intervals overlap for the most part.

When we set the unknown ELs to 2018 December 1 instead of 2019 December 1, the estimates are 0.128 to 0.202 days longer than the estimates on the full data. Somewhat surprisingly, this changes the estimates less than either of the other alternate estimates.

Comparison to other estimates

Backer, Klinkenberg, & Wallinga estimated the incubation period based on 88 early nCoV cases that traveled from Wuhan to other regions in China. Li et al estimated the incubation period based on the 10 laboratory-confirmed cases in Wuhan. A comparison of the incubation period estimates is shown below:

The median estimates from all models lie between 4.14 and 6.38 days. The lower and upper tails for our distributions are all closer to the median than those from the other studies; whether this is due to differences in data or in estimation methodologies is open for investigation.

Parameter estimates

For the convenience of researchers who need parameter estimates for making infectious disease models, we include a table of the parameter estimates from our analysis and inferred from the other analyses. The parameters are different for each distribution; par1 and par2 are log-mean and log-sd of the log-normal distribution, while they are the shape and scale parameters for the gamma, Weibull, and Erlang distributions.

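| study       | type       | obs | par1 | par2 |
|-------------|------------|-----|------|------|
| JHU-IDD     | log-normal | 181 | 1.62 | 0.42 |
| JHU-IDD     | gamma      | 181 | 5.81 | 0.95 |
| JHU-IDD     | Weibull    | 181 | 2.45 | 6.26 |
| JHU-IDD     | Erlang     | 181 | 6.00 | 0.87 |
| Backer 2020 | Weibull    | 88  | 3.04 | 7.20 |
| Backer 2020 | gamma      | 88  | 6.10 | 1.06 |
| Backer 2020 | log-normal | 88  | 1.80 | 0.48 |
| Li 2020     | log-normal | 10  | 1.42 | 0.67 |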
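For modelers who want to draw incubation times from these parameters, the following is a minimal Python sketch using the JHU-IDD log-normal row (par1 = 1.62, par2 = 0.42). The seed and sample size are arbitrary choices for illustration, and the draws reflect the point estimates only, not the uncertainty in them.

```python
import random
import statistics

LOG_MEAN, LOG_SD = 1.62, 0.42  # par1 and par2 from the JHU-IDD log-normal row

rng = random.Random(2020)  # fixed seed so the sketch is repeatable
draws = [rng.lognormvariate(LOG_MEAN, LOG_SD) for _ in range(100_000)]

sample_median = statistics.median(draws)
share_over_14_days = sum(d > 14 for d in draws) / len(draws)

print(f"simulated median incubation time: {sample_median:.2f} days")
print(f"share of draws longer than 14 days: {share_over_14_days:.2%}")
```

The gamma and Weibull rows can be sampled the same way with `random.gammavariate(shape, scale)` and `random.weibullvariate(scale, shape)`; note that the Weibull function takes the scale first.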

Active monitoring analysis

Given these estimates of the incubation period, we predicted the number of symptomatic infections we would expect to miss over the course of an active monitoring program. We looked at active monitoring durations from 1 to 28 days for groups of ‘low risk’ (1/10,000 chance of symptomatic infection), ‘medium risk’ (1/1,000), ‘high risk’ (1/100), and ‘infected’ (1/1), similar to the analysis in Reich et al (2018).

Mean estimated symptomatic infections missed per 10,000 monitored (99th percentile), by duration of monitoring and level of risk
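| Monitoring duration | Low (1 in 10,000) | Medium (1 in 1,000) | High (1 in 100) | Infected (1 in 1) |
|---------------------|-------------------|---------------------|-----------------|-------------------|
| 7 days              | 0.2 (0.4)         | 2.1 (3.6)           | 21.2 (36.5)     | 2120.6 (3648.5)   |
| 14 days             | 0.0 (0.0)         | 0.1 (0.5)           | 1.0 (4.8)       | 100.9 (481.7)     |
| 21 days             | 0.0 (0.0)         | 0.0 (0.1)           | 0.1 (0.8)       | 9.5 (82.5)        |
| 28 days             | 0.0 (0.0)         | 0.0 (0.0)           | 0.0 (0.2)       | 1.4 (17.8)        |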
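The logic behind this table can be approximated with a back-of-the-envelope calculation: the expected number of missed symptomatic infections is the risk of infection multiplied by the probability that symptoms appear after monitoring ends. The Python sketch below is a deliberately simplified version that assumes monitoring starts at the moment of exposure and uses only the log-normal point estimates; the function names are ours. The figures in the table come from the full analysis, so this sketch is not expected to reproduce them exactly.

```python
import math
from statistics import NormalDist

MEANLOG, SDLOG = 1.621, 0.418  # log-normal point estimates from this README


def prob_onset_after(days, meanlog=MEANLOG, sdlog=SDLOG):
    """P(incubation time > days) under the log-normal point estimate."""
    return 1.0 - NormalDist(meanlog, sdlog).cdf(math.log(days))


def expected_missed_per_10000(days, risk):
    """Expected symptomatic infections missed per 10,000 people monitored,
    assuming (simplification) that monitoring begins at exposure."""
    return 10_000 * risk * prob_onset_after(days)


risks = {"low": 1 / 10_000, "medium": 1 / 1_000, "high": 1 / 100, "infected": 1.0}
for days in (7, 14, 21, 28):
    row = ", ".join(f"{name}: {expected_missed_per_10000(days, r):.1f}" for name, r in risks.items())
    print(f"{days:>2} days -> {row}")
```

Under these simplifications, roughly a fifth of infected individuals would develop symptoms after a 7-day monitoring period, and the expected number missed falls steeply as the duration is extended.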

Time to hospitalization

We can use the same procedure for estimating the incubation period to estimate the time from symptom onset to hospitalization.

This figure displays the symptom onset and hospitalization windows for each case in our dataset, relative to the right-bound of the symptom onset window (SR). The blue bars indicate the symptom onset windows and the red bars indicate the hospitalization windows for each case. Purple areas are where those two bars overlap.

Of the 169 individuals who developed symptoms in the community (as opposed to in isolation), 56 (33%) were hospitalized within a day.

We modeled the time to hospitalization as a gamma distribution:

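|       | est    | CIlow | CIhigh |
|-------|--------|-------|--------|
| shape | 0.401  | 0.296 | 0.534  |
| scale | 4.620  | 3.363 | 6.109  |
| p2.5  | 0.000  | 0.000 | 0.003  |
| p25   | 0.110  | 0.037 | 0.239  |
| p50   | 0.674  | 0.386 | 1.026  |
| p75   | 2.339  | 1.748 | 2.986  |
| p97.5 | 10.292 | 8.075 | 12.640 |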

The model estimates that time to hospitalization is 1.9 days, on average. The majority of cases report quickly, though there is a long tail.
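The average follows from the gamma parameters, since the mean of a gamma distribution is shape times scale (0.401 × 4.620 ≈ 1.85 days). The Python sketch below checks this by simulation and shows how far the median sits below the mean, which is what the long tail implies. The seed and sample size are arbitrary, and only the point estimates are used.

```python
import random
import statistics

SHAPE, SCALE = 0.401, 4.620  # gamma point estimates reported above

analytic_mean = SHAPE * SCALE  # mean of a gamma distribution

rng = random.Random(1)  # fixed seed so the sketch is repeatable
draws = [rng.gammavariate(SHAPE, SCALE) for _ in range(200_000)]

sim_mean = statistics.fmean(draws)
sim_median = statistics.median(draws)
share_within_one_day = sum(d <= 1 for d in draws) / len(draws)

print(f"analytic mean: {analytic_mean:.2f} days, simulated mean: {sim_mean:.2f} days")
print(f"simulated median: {sim_median:.2f} days")
print(f"share hospitalized within one day of onset: {share_within_one_day:.1%}")
```

Because the shape parameter is below 1, most of the probability mass sits near zero, so the median is well under a day even though the mean is close to two days.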