This repository provides the data and source code for the following paper: Qulu Zheng, Francisco J Luquero, Iza Ciglenecki, Joseph F. Wamala, Abdinasir Abubakar, Placide Welo, Mukemil Hussen, Mesfin Wossen, Sebastian Yennan, Alama Keita, Justin Lessler, Andrew S. Azman, Elizabeth C. Lee. "Cholera outbreaks in sub-Saharan Africa during 2010-2019: A Descriptive Analysis." International Journal of Infectious Diseases. (DOI: 10.1016/j.ijid.2022.05.039)
1. Summarized characteristics of cholera outbreaks from 2010 through 2019: reference_data/outbreak_data.csv
- location: the name of the location where outbreak cases were reported; the name is made up of the WHO region, country, and administrative units, separated by "::".
- outbreak_number: the unique identifier (UID) of an outbreak that occurred in a specific region.
- start_weekday: the start weekday of the reported cases.
- duration: the duration of the outbreak in weeks.
- threshold: the outbreak threshold per week.
- total_suspected_cases: the total number of suspected cholera cases in an outbreak.
- attack_rate: the attack rate of an outbreak per 1,000 people.
- total_deaths: the total number of cholera-associated deaths in an outbreak. NA means death data were not reported.
- cfr: the case fatality rate of an outbreak.
- temporal_scale: the temporal scale of the time series data (weekly data).
- country: the country where the outbreak was reported.
- population: the total population in the region where the outbreak was reported and in the year when the outbreak was detected.
- area: the area of the region where the outbreak was reported.
- mean_R0: the mean of the instantaneous reproductive number estimates over the first week of an outbreak.
- population_density: the number of people per km2 in the region where the outbreak was reported.
- rural_urban: urban regions are those with a population density of at least 1,000 inhabitants per km2; rural regions are those with a population density below 1,000 inhabitants per km2.
- pop_cat: regions are grouped into four categories based on population size: "population > 1,000,000", "population 100,000-1,000,000", "population 10,000-100,000", and "population<10,000".
- who_region: the WHO region where the outbreak was reported.
- start_date: the start date of the outbreak.
- end_date: the end date of the outbreak.
- total_confirmed_cases: the total number of confirmed cholera cases reported in an outbreak. NA means confirmed case data were not reported.
- spatial_scale: the spatial scale of the outbreak, including the first administrative unit ("admin1"), the second administrative unit ("admin2"), and the third administrative unit ("admin3").
- location_period_id: the location period ID used to link to the shapefiles in the global cholera incidence database.
- time_to_peak..weeks.: the number of weeks between the start of an outbreak and the week with the most reported cases.
2. Summarized characteristics of cholera outbreaks from 2010 through 2019 in the sensitivity analysis: reference_data/outbreak_data_sensitivity_analysis.csv
- column names are the same as above.
3. Processed monthly time series of population data from 2010 through 2019: reference_data/heatmap_data.csv
- country: the ISO 3166 country code.
- year_month: each row represents one month of a specific year between 2010 and 2019.
- has_data: whether time series data are available in the global cholera incidence database.
- outbreak_sCh: whether there are outbreak-related suspected cholera case data in that month of a specific year.
- outbreak_deaths: whether there are outbreak-related cholera death data in that month of a specific year.
- outbreak_cCh: whether there are outbreak-related confirmed cholera case data in that month of a specific year.
- outbreak_duration: the duration of an outbreak.
- total_pop: the total population in that region within that time period.
- outbreak_pop: the population in that region within that time period.
- outbreak_sl: the spatial scale of the outbreak.
4. Processed location and population data: reference_data/country_location_period_pop.csv
- country: the name of the country.
- year: the year when the outbreak was reported.
- location_period_id: the location period ID used to link to the shapefiles in the global cholera incidence database.
- population: the total population in the region where the outbreak was reported and in the year when the outbreak was detected.
- area: the area of the region where the outbreak was reported.
- population_density: the number of people per km2 in the region where the outbreak was reported.
- pop_raster_filename: the name of the raster file.
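The population-derived columns described above (population_density, rural_urban, pop_cat, and attack_rate) can be recomputed from population, area, and case counts. The sketch below is illustrative only and is not the code used in the paper. It assumes that area is in km2, that the category boundaries are handled as shown, and that the labels match those in the CSV files; the function names and the "urban"/"rural" strings are hypothetical.

```python
# Minimal sketch of the population-derived columns (not the authors' code).
# Assumptions: area is in km2; boundary handling and label strings are guesses.

URBAN_DENSITY_THRESHOLD = 1000.0  # inhabitants per km2, from the rural_urban definition


def population_density(population: float, area_km2: float) -> float:
    """People per km2 in the reporting region."""
    if area_km2 <= 0:
        raise ValueError("area must be positive")
    return population / area_km2


def rural_urban(density: float) -> str:
    """Urban if density is at least 1,000 inhabitants per km2, else rural."""
    return "urban" if density >= URBAN_DENSITY_THRESHOLD else "rural"


def pop_cat(population: float) -> str:
    """Assign one of the four population-size categories.

    Which category a boundary value (e.g. exactly 10,000) falls into is an
    assumption; the data description does not specify it.
    """
    if population < 10_000:
        return "population<10,000"
    if population < 100_000:
        return "population 10,000-100,000"
    if population <= 1_000_000:
        return "population 100,000-1,000,000"
    return "population > 1,000,000"


def attack_rate(total_suspected_cases: float, population: float) -> float:
    """Suspected cases per 1,000 people."""
    if population <= 0:
        raise ValueError("population must be positive")
    return 1000.0 * total_suspected_cases / population
```

For example, a hypothetical region with 50,000 people, an area of 25 km2, and 150 suspected cases would have a density of 2,000 people per km2 (urban) and an attack rate of 3 per 1,000 people.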
5. Processed data for supplementary figures 75_77: reference_data/Supplement_figure_75_77.csv
- TL: the start date of the observation.
- TR: the end date of the observation.
- sCh: number of suspected cholera cases reported in that observation.
- location: the name of the location where suspected cholera cases were reported; the name is made up of the WHO region, country, and administrative units, separated by "::".
- spatial_scale: the spatial scale of the outbreak, including the first administrative unit ("admin1"), the second administrative unit ("admin2"), and the third administrative unit ("admin3").
- date_range: the length of the time period of the observation.
- temporal_scale: the temporal scale of the observation.
- dup_id: the UID of outbreaks that overlap across different spatial scales.
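Because the location column packs several fields into one "::"-separated string, it can help to split it before filtering or grouping. The following is a minimal sketch, not part of the repository's scripts: the `Location` type, both function names, and the example string are hypothetical, and inferring the spatial scale from the number of administrative units is an assumption that should be checked against the spatial_scale column.

```python
# Minimal sketch for splitting a "::"-separated location string.
# The names and the example value are illustrative, not taken from the data.
from typing import NamedTuple, Tuple


class Location(NamedTuple):
    who_region: str
    country: str
    admin_units: Tuple[str, ...]  # admin1, admin2, ... in order


def parse_location(location: str) -> Location:
    """Split a location string into WHO region, country, and admin units."""
    parts = location.split("::")
    if len(parts) < 2:
        raise ValueError(f"expected at least region and country: {location!r}")
    return Location(parts[0], parts[1], tuple(parts[2:]))


def infer_spatial_scale(loc: Location) -> str:
    """Guess the spatial scale from the number of admin units (assumption)."""
    n = len(loc.admin_units)
    return f"admin{n}" if n > 0 else "country"


# Hypothetical example string following the described pattern.
loc = parse_location("AFR::COUNTRY::Province A::District B")
print(loc.country, loc.admin_units, infer_spatial_scale(loc))
```

Keeping the administrative units as an ordered tuple preserves their hierarchy, so the last element is always the finest-resolution unit.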
This folder contains scripts that generate the figures in the main text and supplementary material.