Uploading your own data

Jake Stockwin edited this page Jul 3, 2017 · 2 revisions

Introduction

This page gives examples of how your data should look. In general, you can find example datasets here.

In all cases, take great care to choose the correct header option when uploading your file. Headers are optional, and the header option is unchecked by default, meaning the app will interpret the first line of your file as data. This will cause problems if that line is actually a row of headers!
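To make the effect of the header option concrete, here is a minimal sketch in Python. It is not the app's actual file reader: `read_rows` is a hypothetical helper that only mimics the behaviour described above, where an unticked header option means the first line is treated as data.

```python
import csv
import io


def read_rows(text, header):
    """Parse CSV text. When header is True, the first line is taken as column names.

    Hypothetical helper for illustration only; the app's own reader may differ.
    """
    rows = list(csv.reader(io.StringIO(text)))
    if header and rows:
        return rows[0], rows[1:]
    return None, rows


raw = "EL,ER,SL,SR\n0,1,1,2\n"

# Header option ticked: the names are separated from the data.
names, data = read_rows(raw, header=True)

# Header option left unticked: the header row ends up inside the data.
no_names, bad_data = read_rows(raw, header=False)
```

In the second call, the first "data" row is the list of column names, which is exactly the situation the warning above is about.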

Incidence Data

Your incidence data should be a single column, with each row giving the number of infected individuals at one time step. The first entry should be at least 1.

1
0
2
4
5

If you also have data on imported cases, upload the number of cases in each row of the incidence data that are known to be imported as a separate data file. Both files must have the same format and the same length. You MUST have at least one imported case in the first row. The number of imported cases in any row cannot exceed the total number of cases in the corresponding row of your incidence data.
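The rules above can be checked before uploading. The following is a rough sketch, assuming each file has already been read into a list of integers; `check_incidence` is a hypothetical helper name, not part of the app.

```python
def check_incidence(incidence, imported=None):
    """Return a list of problems; an empty list means all checks passed.

    Hypothetical pre-upload check based only on the rules stated on this page.
    """
    if not incidence:
        return ["incidence data is empty"]

    problems = []
    if incidence[0] < 1:
        problems.append("first incidence entry must be at least 1")

    if imported is not None:
        if len(imported) != len(incidence):
            problems.append("imported and incidence data must have the same length")
        else:
            if imported[0] < 1:
                problems.append("first row must contain at least one imported case")
            for row, (imp, total) in enumerate(zip(imported, incidence), start=1):
                if imp > total:
                    problems.append(
                        f"row {row}: {imp} imported cases exceed {total} total cases"
                    )
    return problems
```

For example, `check_incidence([1, 0, 2, 4, 5], [1, 0, 0, 1, 0])` returns an empty list, while an imported-cases file with more cases in a row than the incidence file would be reported row by row.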

Serial Interval Data (RAW)

Your serial interval data should have either 4 or 5 columns:

  • EL: The lower bound of the symptom onset date of the infector (given as an integer)
  • ER: The upper bound of the symptom onset date of the infector (given as an integer). Should be such that ER>=EL.
  • SL: The lower bound of the symptom onset date of the infected individual (given as an integer)
  • SR: The upper bound of the symptom onset date of the infected individual (given as an integer). Should be such that SR>=SL.
  • type (optional): can have entries 0, 1, or 2, corresponding to doubly interval-censored, singly interval-censored or exact observations, respectively; see Reich et al. Statist. Med. 2009. If not specified, this will be automatically computed from the dates.

So, the file should look like:

EL,ER,SL,SR
0,1,1,2
0,1,1,2
0,1,1,2
0,1,1,2
0,1,1,2
0,1,1,2
0,1,1,2
0,1,1,2
0,1,1,2
0,1,1,2

Again, headers are optional, but if your file has them you need to remember to tick the header option when uploading. We could also have included an additional type column, but this is computed automatically from the dates, so it is never necessary.
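As an illustration of how the type can follow from the dates, here is a small sketch. It assumes that a bound pair with EL equal to ER (or SL equal to SR) counts as an exact date, and that the type is 2 minus the number of pairs that are genuine intervals. This matches the 0/1/2 coding described above, but it is an assumption about the computation, not the app's confirmed code; `infer_type` is a hypothetical name.

```python
def infer_type(el, er, sl, sr):
    """Infer the censoring type of one row of RAW serial interval data.

    Returns 0 (doubly interval-censored), 1 (singly interval-censored)
    or 2 (exact). Assumed rule: 2 minus the number of non-degenerate intervals.
    """
    if er < el:
        raise ValueError("ER must be >= EL")
    if sr < sl:
        raise ValueError("SR must be >= SL")
    infector_is_interval = er != el
    infected_is_interval = sr != sl
    return 2 - int(infector_is_interval) - int(infected_is_interval)
```

Under this assumption, every row of the example file (`0,1,1,2`) is doubly interval-censored, so `infer_type(0, 1, 1, 2)` gives 0, while a row such as `0,0,3,3` would be an exact observation.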

Serial Interval Data (SAMPLE)

A matrix where each column gives one distribution of the serial interval to be explored. This generally comes from the output of running the RAW data through MCMC and then through coarseDataTools. We refer the reader to the files here for examples, since they are typically quite large.

Non-Parametric Distribution File

A row vector of probabilities giving the discrete distribution of the serial interval, starting with the probability that the serial interval is zero, which should be zero, then the probability that the serial interval is one, and so on. For example,

0,0.2,0.2,0.3,0.3
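A file like this can be sanity-checked with a few lines of Python. The sketch below assumes the row has been read as a single comma-separated line; `check_si_distribution` is a hypothetical helper. The requirement that the entries sum to 1 is not stated on this page and is an assumption that follows from the row being a probability distribution.

```python
import math


def check_si_distribution(line, tol=1e-8):
    """Return a list of problems with a non-parametric distribution row.

    Hypothetical check; the sum-to-one rule is an assumption, not stated by the app.
    """
    probs = [float(x) for x in line.split(",")]
    problems = []
    if probs[0] != 0:
        problems.append("first entry (probability of a zero serial interval) should be 0")
    if any(p < 0 or p > 1 for p in probs):
        problems.append("all entries should lie between 0 and 1")
    if not math.isclose(sum(probs), 1.0, abs_tol=tol):
        problems.append("entries should sum to 1")
    return problems
```

Calling `check_si_distribution("0,0.2,0.2,0.3,0.3")` on the example row returns an empty list.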