Skip to content

petermacp/mlwdata

Repository files navigation

mlwdata: a repository of datasets for MLW population health in Malawi

mlwdata contains a set of annotated datasets that have been collected as part of research studies conducted at the Malawi-Liverpool-Wellcome Trust Clinical Research Programme in Blantyre, Malawi.

The main aims are to:

  • Ensure consistent use of data between and within studies over time
  • Facilitate sharing of data to reduce duplication of efforts

Installation

You can install the from GitHub with:

# install.packages("devtools")
devtools::install_github("petermacp/mlwdata")

Datasets

Currently, the following datasets are available:

scale_72_clusters
A sf MULTIPOLYGON object, containing polygon boundaries and cluster IDs for the SCALE Study, with variables:

  • cluster: unique identifier for each of the 72 study clusters
  • geometery: a list column, containing polygons for each cluster boundary.

blantyre_clinics
A sf POINT object, containing coordinates for Blantyre clinics/hospitals, and clinic IDs:

  • clinic: unique name of each of the 18 clinics/hospitals
  • geometry: a list column, containing points for each clinic/hospital.

blantyre_census_2008_2018
A tibble, containing age, sex and district-stratfied (Blantyre City, Blantyre Rural) population estimates from the 2008 and 2018 Malawi National Census. These data were provided by the Malwai National Statistics Office in October 2019.

  • District: either Blantyre City, or Blantyre Rural (as classified in the Census)
  • Year: census year (2008 or 2018)
  • Age: age group of population estimates
  • Total: total population
  • Male: Male population
  • Female: Female population

blantyre_census_by_q
A tibble, containing age, sex and district-stratfied (Blantyre City, Blantyre Rural) population, with linear interpolation by quarter, from the 2008 and 2018 Malawi National Census. These data were provided by the Malwai National Statistics Office in October 2019.

  • district: either Blantyre City, or Blantyre Rural (as classified in the Census)
  • year: census year (2008 or 2018)
  • age: age group of population estimates
  • quarter: quarter of the year
  • year_q: concatenated year and quarter
  • sex: male or female
  • population: population estimate

hiv_pops_blantyre_city
A tibble, containing estimates of HIV prevalence for Blantyre City, stratified by age, sex and quarter between 2008 and 2018. Population estimates are from Malawi National Census 2008 and 2018 estimates. HIV prevalence estimates are from an HIV-prevalence survey conducted in 2014-15 in North West Blantyre. Age and Sex specific HIV-prevalence estimates were multiplied by population demoninators to obtain numbers of HIV-positive people per age-sex-quarter strata. (Note, estimates for adults [16+] and Blantyre City only, and not Blantyre Rural are provided).

  • year_q: concatenated year and quarter
  • year: census year (2008 or 2018)
  • quarter: quarter of the year
  • sex: male or female
  • age: age group of population estimates
  • hiv_prev: HIV prevalence in age-sex strata
  • population: number of HIV-positive people in age-sex-quarter strata

blantyre_tb_cases_2009_2018
A tibble, containing numbers of TB cases notified in Blantyre TB registration centres between Q1 2009 and Q4 2018 by quarter, and stratified by active case finding area of the city (ACF vs. non-ACF), and microbiological status of cases

  • year_q: Annual quarter
  • acf: Area of Blantyre City (ACF = received active case finding intervention; non-ACF = didn’t receive active case finding intervention)
  • tbcases: Classification of TB diagnosis (Smr/Xpert-positive cases = cases that were either smear or xpert positive in testing by the routine clinic programme, or smear-positive by the research TB lab; All cases = all cases started on treatment, regardless of microbiological status)
  • n: number of cases in category

blantyre_tb_cases_2009_2018
A tibble, containing anonymised individual-level TB cases notified in Blantyre TB registration centres between Q1 2011 and Q4 2018

  • unique_id: Anonymised unique case ID
  • fac_code: TB registration centre
  • reg_date: Date on which TB case was registered for treatment
  • year: Year of registration for TB treatment
  • quarter: Quarter of registration for TB treatment
  • year_q: Year and quarter of TB treatment registration
  • period: Active case finding intervention period (pre-ACF = before ACF implemented; ACF = during ACF intervention; post-ACF = after ACF intervention implemented)
  • sex: Sex of TB case (male or female)
  • age: Age of TB case on day of treatment registration
  • agegp: Age group of TB case (1= 0-4 years, 2= 5-14 years, 3= 15+ years)
  • acf: Whether TB case’s household was located in the ACF intervention area of Blantyre City (ACF), or the non-ACF area of Blantyre City Non-ACF
  • hiv: HIV status of the TB case at TB treatment registration
  • art: Was patient taking antiretroviral therapy for the treatment of HIV at the start of TB treatment?
  • smr_clinic: Sputum smear status of TB case at TB registration, from sample collected and tested by the routine health system
  • smr_lab: Sputum smear status of TB case at TB registration, from sample collected at treatment registration, and tested in the research TB lab at the College of Medicine, University of Malawi
  • xpert_clinic: Sputum Xpert status of TB case at TB registration, from sample collected and tested by the routine health system. Note that Xpert was only reliably introduced into the programme from Q2 2015 onwards
  • tbtype: Classified as either Pulmonary TB or Extrapulmonary TB
  • cgh_dur: Duration of cough (in weeks) prior to TB treatment registration
  • tb_cat: Category of TB patient (New, Relapse, Retreatment after default, Retreatment after failure, Other)
  • smr_any: Whether a positive sputum smear result was obtained from either the routine health system or the study research lab sample
  • smr_xpert_any: Whether a positive sputum smear result was obtained from either the routine health system or the study research lab sample, or a positive Xpert result was obtained from the routine health system.

acf_cnrs_overall
A tibble, containing TB case notification rates (all cases) per 100,000 population between Q1 2009 and q4 2018, stratifed by active case finding intervention area

  • year_q: Annual quarter
  • acf: Whether TB case’s household was located in the ACF intervention area of Blantyre City (ACF), or the non-ACF area of Blantyre City Non-ACF
  • cases: Number of registered TB cases per category
  • tbcases: Classification of TB cases (microbiologically-confirmed or all cases)
  • population: Total population per strata
  • q_population: Total population per strata/4 (for CNR estimates)
  • cnr: TB case notification rate, per 100,000 per strata
  • conf.low: Lower bound of CNR 95% confidence interval
  • conf.high: Upper bound of CNR 95% confidence interval
  • period: Active case finding intervention period (pre-ACF = before ACF implemented; ACF = during ACF intervention; post-ACF = after ACF intervention implemented)

acf_smrpos_cnrs_overall
A tibble, containing TB case notification rates (smear or xpert-positive cases) per 100,000 population between Q1 2009 and q4 2018, stratifed by active case finding intervention area. Note cases are classified as Smear/Xpert positive if a positive sputum smear result was obtained from either the routine health system or the study research lab sample, or a positive Xpert result was obtained from the routine health system.

  • year_q: Annual quarter
  • acf: Whether TB case’s household was located in the ACF intervention area of Blantyre City (ACF), or the non-ACF area of Blantyre City Non-ACF
  • cases: Number of registered TB cases per category
  • tbcases: Classification of TB cases (microbiologically-confirmed or all cases)
  • population: Total population per strata
  • q_population: Total population per strata/4 (for CNR estimates)
  • cnr: TB case notification rate, per 100,000 per strata
  • conf.low: Lower bound of CNR 95% confidence interval
  • conf.high: Upper bound of CNR 95% confidence interval
  • period: Active case finding intervention period (pre-ACF = before ACF implemented; ACF = during ACF intervention; post-ACF = after ACF intervention implemented)

Examples

The SCALE Study defined 72 geographical cluster boundaries in urban Blantyre using GPS waypaths. These clusters can be loaded and plotted:

library(mlwdata)
library(tidyverse)

glimpse(scale_72_clusters)
#> Observations: 72
#> Variables: 2
#> $ cluster  <chr> "c1", "c2", "c3", "c4", "c5", "c6", "c7", "c8", "c9", "c10",…
#> $ geometry <list> [35.05040, 35.05040, 35.05040, 35.05040, 35.05040, 35.05040…


ggplot(scale_72_clusters) +
  geom_sf() +
  geom_sf_text(aes(label=cluster), colour="blue") +
  theme_minimal() +
  labs(title="SCALE Study",
       subtitle="72 clusters",
       x="",
       y="",
       caption = "Corbett, MacPherson et al.")

We could add on the location of the clinics in Blantyre.

library(sf)
library(ggrepel)

qech <- blantyre_clinics %>%
  #filter to show QECH only for interest
  filter(clinic=="Queen Elizabeth hospital") %>%
  #split out the x and y coordinates to allow use of ggrepel
  st_coordinates() %>%
  as_tibble() %>%
  mutate(clinic="QECH")

ggplot() +
  geom_sf(data = scale_72_clusters) +
  geom_sf_text(data = scale_72_clusters, aes(label=cluster), colour="blue") +
  geom_sf(data=blantyre_clinics, shape=17, colour="#22211d") +
  geom_label_repel(data = qech, 
                       aes(x=X, y=Y, label=clinic), 
        fontface = "bold",
        min.segment.length = 0) +
  theme_minimal() +
  labs(title="SCALE Study",
       subtitle="72 clusters. Clinics/hospitals labelled with triangles",
       x="",
       y="",
       caption = "Corbett, MacPherson et al.") +
  coord_sf()

About

Datasets for MLW Malawi

Resources

License

Unknown, MIT licenses found

Licenses found

Unknown
LICENSE
MIT
LICENSE.md

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published