# Data Element reporting rate: based on reporting of one or more indicators
Partially follows WHO methods and the approach described in Diallo (2025).

To measure data completeness, we calculate the **monthly** reporting rate per **ADM2** as the **proportion** of **facilities** (HF or `OU_ID`) that, in a given month, submitted data for a single indicator or for _any_ of the chosen indicators (e.g., `CONF`, `SUSP`, `TEST`).
In short, the "Data Element" reporting rate is the number of facilities reporting on one or more of the given indicators, divided by the total number of facilities expected to report.<br>
For this method the user can **choose** how to calculate both the **numerator** and the **denominator**.<br>

Specifically:  

* **Numerator**: Number of facilities that _actually reported_ data. A facility (`OU_ID`) counts as reporting if it submitted data for **_any_** of the **selected indicators**.  
    Note: we **recommend** always including `CONF` because it is a core indicator consistently tracked across the dataset. This choice ensures alignment with the structure of the incidence calculation, which is also mainly based on confirmed cases.
    <br>
    <br>
* **Denominator**: Number of facilities _expected_ to report. This number can be obtained in two different ways:    
    * `"ROUTINE_ACTIVE_FACILITIES"`: uses the col `EXPECTED_REPORTS` from the df `active_facilities`.<br>
      This is calculated as the number of "**active**" facilities (OU_ID), defined as those that submitted _any_ data **at least once in a given year**, across **all** indicators extracted in `dhis2_routine` (namely: all aggregated indicators as defined in the SNT_config.json file, see: `config_json$DHIS2_DATA_DEFINITIONS$DHIS2_INDICATOR_DEFINITIONS`)
    * `"PYRAMID_OPEN_FACILITIES"`: uses the opening and closing dates stored in the DHIS2 organisation units to determine whether a facility was open, and thus expected to report, during each reporting period.
    <br>
    <br>
* **Output**: Reporting rate table aggregated at administrative level 2, saved as csv and parquet files to the dataset **SNT_DHIS2_REPORTING_RATE**:
    * cols: YEAR, MONTH, ADM2_ID, REPORTING_RATE
    * Filename: `XXX_reporting_rate_dataelement.<extension>`
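
The definition above can be illustrated with a minimal base R sketch on toy data. The data frame `toy`, its values, and the assumption that every listed facility is expected to report are illustrative only and are not part of the pipeline.

```r
# Toy data: one row per facility (OU_ID) for a single month.
# NA means nothing was submitted for that indicator.
toy <- data.frame(
  ADM2_ID = c("D1", "D1", "D1", "D2", "D2"),
  OU_ID   = c("hf1", "hf2", "hf3", "hf4", "hf5"),
  CONF    = c(12, NA, NA, 0, NA),
  SUSP    = c(30, 5, NA, NA, NA)
)
selected_indicators <- c("CONF", "SUSP")

# Numerator flag: a non-missing value for ANY selected indicator
toy$REPORTED <- rowSums(!is.na(toy[, selected_indicators, drop = FALSE])) > 0

# Reporting rate = reporting facilities / expected facilities, per ADM2
# (assumption: every facility listed here is expected to report)
rr <- aggregate(REPORTED ~ ADM2_ID, data = toy, FUN = mean)
names(rr)[2] <- "REPORTING_RATE"
rr
```

In this toy case, D1 has two of three facilities reporting and D2 has one of two, so a reported zero (`hf4`) counts as a report while a row of `NA`s does not.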

## 1. Setup

In [None]:
# Project paths
SNT_ROOT_PATH <- "/home/hexa/workspace" 
CODE_PATH <- file.path(SNT_ROOT_PATH, 'code') 
CONFIG_PATH <- file.path(SNT_ROOT_PATH, 'configuration') 
DATA_PATH <- file.path(SNT_ROOT_PATH, 'data', 'dhis2')  

# Load utils
source(file.path(CODE_PATH, "snt_utils.r"))

# Load libraries 
required_packages <- c("arrow", "tidyverse", "stringi", "jsonlite", "httr", "reticulate", "glue")
install_and_load(required_packages)

# Environment variables
Sys.setenv(PROJ_LIB = "/opt/conda/share/proj")
Sys.setenv(GDAL_DATA = "/opt/conda/share/gdal")
Sys.setenv(RETICULATE_PYTHON = "/opt/conda/bin/python")

# Load OpenHEXA sdk
openhexa <- import("openhexa.sdk")

### 1.1. Fallback parameter values
These parameters are injected by papermill when the notebook runs in OH via the pipeline run interface. <br>
The code cell below provides the fallback parameter values needed when running this notebook locally.

In [None]:
# Current options: 
# "COUNTRY_CODE_routine.parquet" (RAW data)
# "COUNTRY_CODE_routine_outliers-mean_removed.parquet" 
# "COUNTRY_CODE_routine_outliers-mean_imputed.parquet"
# "COUNTRY_CODE_routine_outliers-median_removed.parquet"
# "COUNTRY_CODE_routine_outliers-median_imputed.parquet"            
# "COUNTRY_CODE_routine_outliers-iqr_removed.parquet"
# "COUNTRY_CODE_routine_outliers-iqr_imputed.parquet"
# "COUNTRY_CODE_routine_outliers-trend_removed.parquet"
# "COUNTRY_CODE_routine_outliers-trend_imputed.parquet" 
if (!exists("ROUTINE_FILE")) {ROUTINE_FILE <- "NER_routine_outliers-mean_imputed.parquet"}

# Options: "ROUTINE_ACTIVE_FACILITIES", "PYRAMID_OPEN_FACILITIES"
if (!exists("DATAELEMENT_METHOD_DENOMINATOR")) {DATAELEMENT_METHOD_DENOMINATOR <- "ROUTINE_ACTIVE_FACILITIES"}
if (!exists("ACTIVITY_INDICATORS")) {ACTIVITY_INDICATORS <- c("CONF", "PRES", "SUSP")} 
if (!exists("VOLUME_ACTIVITY_INDICATORS")) {VOLUME_ACTIVITY_INDICATORS <-  c("CONF", "PRES")}
if (!exists("USE_WEIGHTED_REPORTING_RATES")) {USE_WEIGHTED_REPORTING_RATES <- FALSE}

### 1.2. Load and check `snt config` file

In [None]:
# Load SNT config
config_json <- tryCatch({ jsonlite::fromJSON(file.path(CONFIG_PATH, "SNT_config.json")) },
    error = function(e) {
        msg <- paste0("[ERROR] Error while loading configuration: ", conditionMessage(e))  
        cat(msg)   
        stop(msg) 
    })

log_msg(paste0("SNT configuration loaded from : ", file.path(CONFIG_PATH, "SNT_config.json")))

In [None]:
# Configuration settings
COUNTRY_CODE <- config_json$SNT_CONFIG$COUNTRY_CODE
ADMIN_1 <- toupper(config_json$SNT_CONFIG$DHIS2_ADMINISTRATION_1)
ADMIN_2 <- toupper(config_json$SNT_CONFIG$DHIS2_ADMINISTRATION_2)

# How to treat 0 values (in this case: "SET_0_TO_NA" converts 0 to NAs)
# 🚨 NOTE (2025-01-09): The configuration field `NA_TREATMENT` has been removed from SNT_config.json files.
# It was legacy code from Ousmane and was only used for Reporting Rate calculations (not anymore).
# It has been replaced by `0_VALUES_PRESERVED` (boolean: true/false) which specifies whether zero values
# are stored in the DHIS2 instance (true) or converted to NULL to save space (false).
# See: https://bluesquare.atlassian.net/browse/SNT25-158
# The variable `NA_TREATMENT` is kept here for backward compatibility but is no longer loaded from config.
NA_TREATMENT <- config_json$SNT_CONFIG$NA_TREATMENT
# DHIS2_INDICATORS <- names(config_json$DHIS2_DATA_DEFINITIONS$DHIS2_INDICATOR_DEFINITIONS)  
DHIS2_INDICATORS <- c("CONF", "PRES", "SUSP", "TEST") # GP 20260205

ACTIVITY_INDICATORS <- unlist(ACTIVITY_INDICATORS)
VOLUME_ACTIVITY_INDICATORS <- unlist(VOLUME_ACTIVITY_INDICATORS)
fixed_cols <- c('PERIOD', 'YEAR', 'MONTH', 'ADM1_ID', 'ADM2_ID', 'OU_ID')
fixed_cols_rr <- c('YEAR', 'MONTH', 'ADM2_ID', 'REPORTING_RATE') # Fixed cols for exporting RR tables

### 1.3. 🔍 Check: at least 1 indicator must be selected
The user can toggle each indicator on or off, so we need to make sure at least one is ON. <br>
Indicator `CONF` is mandatory, but displaying all indicators in the Run pipeline view is more intuitive.

In [None]:
if (length(ACTIVITY_INDICATORS) == 0) {
    msg <- "[ERROR] Error: no indicator selected, cannot perform calculation of reporting rate method. Select at least one (e.g., `CONF`)."
    cat(msg)   
    stop(msg)
}

## 2. Load Data

### 2.1. Routine data (DHIS2) 
**Note on pipeline behaviour**: <br>
The value of `ROUTINE_FILE` is resolved within the pipeline.py code and injected into the notebook as parameter.

In [None]:
# select dataset
if (ROUTINE_FILE == glue("{COUNTRY_CODE}_routine.parquet")) {
    routine_dataset_name <- config_json$SNT_DATASET_IDENTIFIERS$DHIS2_DATASET_FORMATTED
} else {
    routine_dataset_name <- config_json$SNT_DATASET_IDENTIFIERS$DHIS2_OUTLIERS_IMPUTATION
}
 
# Load file from dataset
dhis2_routine <- tryCatch({ get_latest_dataset_file_in_memory(routine_dataset_name, ROUTINE_FILE) }, 
                  error = function(e) {
                      msg <- paste("[ERROR] Error while loading DHIS2 routine data file for: " , COUNTRY_CODE, conditionMessage(e))  # log error message
                      cat(msg)
                      stop(msg)
})
dhis2_routine <- dhis2_routine %>% mutate(across(c(PERIOD, YEAR, MONTH), as.numeric)) # Ensure correct data type for numerical columns 

# log
log_msg(glue("DHIS2 routine file {ROUTINE_FILE} loaded from dataset: {routine_dataset_name}. Dataframe dimensions: {paste(dim(dhis2_routine), collapse=', ')}"))
dim(dhis2_routine)
head(dhis2_routine, 2)

### 2.2. Organisation units (DHIS2 pyramid)

In [None]:
# Load file from dataset
dataset_name <- config_json$SNT_DATASET_IDENTIFIERS$DHIS2_DATASET_FORMATTED

dhis2_pyramid_formatted <- tryCatch({ get_latest_dataset_file_in_memory(dataset_name, paste0(COUNTRY_CODE, "_pyramid.parquet")) }, 
                error = function(e) {
                    msg <- paste("Error while loading DHIS2 pyramid FORMATTED data file for: " , COUNTRY_CODE, conditionMessage(e))  # log error message
                    cat(msg)
                    stop(msg)
})
    
msg <- paste0("DHIS2 pyramid FORMATTED data loaded from dataset: `", dataset_name, "`. Dataframe dimensions: ", paste(dim(dhis2_pyramid_formatted), collapse=", "))
log_msg(msg)
dim(dhis2_pyramid_formatted)
head(dhis2_pyramid_formatted,2)

### 2.3. Check whether selected indicators are present in routine data
Extra precaution to avoid failures downstream.<br>

Note: This logic should be moved to pipeline.py 🐍

In [None]:
if (!all(ACTIVITY_INDICATORS %in% names(dhis2_routine))) {
        log_msg(glue("🚨 Warning: one or more of the following columns are missing from `dhis2_routine`: {paste(ACTIVITY_INDICATORS, collapse = ', ')}"), "warning")
}

if (!all(VOLUME_ACTIVITY_INDICATORS %in% names(dhis2_routine))) {
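    # Collapse the vector first so glue() returns a single message instead of one per indicator
    msg <- glue("[ERROR] One or more volume activity indicators ({paste(VOLUME_ACTIVITY_INDICATORS, collapse = ', ')}) not present in the routine data. Process cannot continue.")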
    cat(msg)
    stop(msg)
}

## 3. Reporting rates computations

#### 3.0. Define start and end period based on routine data 

In [None]:
PERIOD_START <- dhis2_routine$PERIOD %>% min()
PERIOD_END <- dhis2_routine$PERIOD %>% max()

period_vector <- format(seq(ym(PERIOD_START), ym(PERIOD_END), by = "month"), "%Y%m")
cat(glue("Start period: {PERIOD_START} \nEnd period: {PERIOD_END} \nPeriods count: {length(period_vector)}"))

#### 3.1. Build master table (all PERIOD x OU)
The master table contains all combinations of period x organisation unit 
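
A likely reason for building a master table is that a month without any report is simply absent from the routine data, so it has to be made explicit before it can count as non-reporting. The base R sketch below illustrates the idea on toy data; the data frames `facilities` and `routine` and their values are hypothetical, and `merge()` stands in for the `tidyr::crossing()` and `left_join()` calls used in the notebook.

```r
facilities <- data.frame(OU_ID = c("hf1", "hf2"), ADM2_ID = c("D1", "D1"))
periods <- c(202401, 202402, 202403)

# Cartesian product: by = NULL makes merge() return every facility x period pair
master <- merge(facilities, data.frame(PERIOD = periods), by = NULL)

# Toy routine data in which hf2 reported in one month only
routine <- data.frame(
  OU_ID  = c("hf1", "hf1", "hf1", "hf2"),
  PERIOD = c(202401, 202402, 202403, 202402),
  CONF   = c(4, 7, 2, 1)
)

# Left join keeps all master rows; months without a report get CONF = NA
joined <- merge(master, routine, by = c("OU_ID", "PERIOD"), all.x = TRUE)

nrow(master)             # number of facility x period combinations
sum(is.na(joined$CONF))  # number of facility-months without a report
```

Here the master table has 2 facilities x 3 periods = 6 rows, and the two months in which `hf2` did not report show up as `NA` after the join.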

In [None]:
log_msg(glue("Building master table with periods from {PERIOD_START} to {PERIOD_END}. Periods count: {length(period_vector)}"))

facility_master <- dhis2_pyramid_formatted %>%
    rename(
        OU_ID = glue::glue("LEVEL_{config_json$SNT_CONFIG$ANALYTICS_ORG_UNITS_LEVEL}_ID"),
        OU_NAME = glue::glue("LEVEL_{config_json$SNT_CONFIG$ANALYTICS_ORG_UNITS_LEVEL}_NAME"),
        ADM2_ID = str_replace(ADMIN_2, "NAME", "ID"),
        ADM2_NAME = all_of(ADMIN_2),
        ADM1_ID = str_replace(ADMIN_1, "NAME", "ID"),
        ADM1_NAME = all_of(ADMIN_1)
    ) %>%
    select(ADM1_ID, ADM1_NAME, ADM2_ID, ADM2_NAME, OU_ID, OU_NAME, OPENING_DATE, CLOSED_DATE) %>%
    distinct() %>%
    tidyr::crossing(PERIOD = period_vector) %>%
    mutate(PERIOD=as.numeric(PERIOD))
    

#### 3.2. Identify "Active" facilities

Facilities **reporting** zero or positive values on any of the selected indicators (**"Activity indicators"**) are considered **active**. Note that this method only counts **non-null** values (not `NA`), so that empty submissions are not counted as valid reporting.
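
The rule can be sketched on a toy data frame as follows. The object `x` and its values are illustrative only; the condition mirrors the `!is.na(...) & ... >= 0` test used in the code cell of this section.

```r
# Toy indicator values for five facility-months
x <- data.frame(
  CONF = c(0, NA, 5, NA, -1),
  SUSP = c(NA, NA, 2, 0, NA)
)

# A value is valid when it is not NA and is zero or positive
is_valid <- !is.na(x) & x >= 0

# Active if at least one indicator holds a valid value
ACTIVE_THIS_PERIOD <- as.integer(rowSums(is_valid) > 0)
ACTIVE_THIS_PERIOD
```

Rows 1, 3 and 4 are flagged active (a reported zero is enough), while the all-`NA` row and the row whose only value is negative are not.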


In [None]:
log_msg(glue("Assessing facility reporting activity based on the following indicators: {paste(ACTIVITY_INDICATORS, collapse=', ')}"))

facility_master_routine <- left_join(
    facility_master,
    # dhis2_routine %>% select(OU_ID, PERIOD, all_of(DHIS2_INDICATORS)), # GP 2026-02-04
    dhis2_routine %>% select(OU_ID, PERIOD, any_of(DHIS2_INDICATORS)), 
    by = c("OU_ID", "PERIOD")
    ) %>%
    mutate(
        YEAR = as.numeric(substr(PERIOD, 1, 4)),
        ACTIVE_THIS_PERIOD = ifelse(
            rowSums(!is.na(across(all_of(ACTIVITY_INDICATORS))) & across(all_of(ACTIVITY_INDICATORS)) >= 0) > 0, 1, 0),        
        COUNT = 1 # Counting every facility
    )

#### 3.3. Identify `OPEN` facilities (denominator)
The "OPEN" variable indicates whether a facility is considered structurally open for a given reporting period.

A facility is flagged as open (OPEN = 1) for a period if both of the following conditions are met:
1. No explicit closure in the facility name. The facility name does not contain closure keywords such as "CLOTUR", "FERMÉ", "FERMEE", or similar.

2. The period falls within the facility's opening and closing dates. The opening date is known and is not after the reporting period, and the closing date (if any) is not before or equal to the reporting period.

If either of these conditions is not met, the facility is considered not open (OPEN = 0) for that period. In particular, a facility with a missing opening date is treated as not open.
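
The two conditions can be checked on a small toy pyramid, as in the base R sketch below. The facility names and dates are invented for illustration, and `grepl()` stands in for `str_detect()`.

```r
facilities <- data.frame(
  OU_NAME      = c("CSI Alpha", "CSI Beta (FERME)", "CSI Gamma", "CSI Delta", "CSI Epsilon"),
  OPENING_DATE = as.Date(c("2015-01-01", "2015-01-01", "2024-06-01", "2015-01-01", NA)),
  CLOSED_DATE  = as.Date(c(NA, NA, NA, "2023-12-31", NA))
)
period_date <- as.Date("2024-03-01")  # reporting period 202403

# Condition 1: closure keyword in the (upper-cased) facility name
name_closed <- grepl("CLOTUR|FERM(E|EE)?", toupper(facilities$OU_NAME))

# Condition 2: opening date known and not after the period,
# and no closing date on or before the period
open_by_date <- !(is.na(facilities$OPENING_DATE) |
                    facilities$OPENING_DATE > period_date |
                    (!is.na(facilities$CLOSED_DATE) & facilities$CLOSED_DATE <= period_date))

facilities$OPEN <- as.integer(!name_closed & open_by_date)
facilities[, c("OU_NAME", "OPEN")]
```

Only "CSI Alpha" is open in this example: the others are excluded by name, by a future opening date, by a past closing date, and by a missing opening date, respectively.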

In [None]:
facility_master_routine <- facility_master_routine %>%
  mutate(
    period_date = as.Date(ym(PERIOD)),
      
    # Flag facilities explicitly marked as closed in their name
    NAME_CLOSED = str_detect(
      toupper(OU_NAME),
      "CLOTUR|FERM(E|EE)?"
    ),

    # Check whether the facility is open during the period using open/close dates
    OPEN_BY_DATE = 
      !(is.na(OPENING_DATE) | as.Date(OPENING_DATE) > period_date |
      (!is.na(CLOSED_DATE) & as.Date(CLOSED_DATE) <= period_date)
    ),
      
    # Final definition of an open facility for the period:
    # not explicitly closed by name and within opening/closing dates
    OPEN = ifelse(
      !NAME_CLOSED & OPEN_BY_DATE,
      1, 0
    )
  )

#### 3.4. Identify "Active" facilities for each YEAR (denominator)

<div class="alert alert-block alert-info">
  <b>Important: this step could have a huge influence on reporting rates!</b><br>
  Activity can be evaluated over <b>1 year</b> or <b>across all years</b>, based on grouping: <code>group_by(OU_ID, YEAR)</code>:<br>
  <ul>
    <li>With <code>YEAR</code> → "active that year"</li>
    <li>Without <code>YEAR</code> → "ever active over the entire extracted period"</li>
  </ul>
</div>
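
The effect of the grouping choice can be seen on a single toy facility, as in the base R sketch below. The column `EVER_ACTIVE` is a hypothetical name used only here, and `ave()` stands in for `group_by()` plus `mutate()`.

```r
# One facility that reported once in 2023 and never in 2024
d <- data.frame(
  OU_ID = rep("hf1", 4),
  YEAR = c(2023, 2023, 2024, 2024),
  ACTIVE_THIS_PERIOD = c(1, 0, 0, 0)
)

# Grouping by OU_ID and YEAR: "active that year"
d$ACTIVE_THIS_YEAR <- ave(d$ACTIVE_THIS_PERIOD, d$OU_ID, d$YEAR, FUN = max)

# Grouping by OU_ID only: "ever active" (hypothetical column, not used in the pipeline)
d$EVER_ACTIVE <- ave(d$ACTIVE_THIS_PERIOD, d$OU_ID, FUN = max)
d
```

With yearly grouping the facility drops out of the 2024 denominator; without it, the facility stays in the denominator for every period and lowers the 2024 reporting rate.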

In [None]:
# Flag facilities with at least one report in the year
facility_master_routine_01 <- facility_master_routine %>%
    group_by(OU_ID, YEAR) %>%
    mutate(ACTIVE_THIS_YEAR = max(ACTIVE_THIS_PERIOD, na.rm = TRUE)) %>%  # use max() to flag if ACTIVE_THIS_PERIOD is 1 at least once
    ungroup()

#### 3.5. Compute Weighting factor based on "volume of activity"
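
The weight computed in the code cell of this section is each facility's mean reported cases divided by the ADM2 sum of those means, multiplied by the number of facilities in the ADM2, so that weights average to 1 within each ADM2. The base R sketch below works through this on toy numbers; the values and the assumption that all three facilities are expected to report are illustrative only.

```r
hf <- data.frame(
  ADM2_ID = c("D1", "D1", "D1"),
  OU_ID   = c("hf1", "hf2", "hf3"),
  MEAN_REPORTED_CASES_BY_HF = c(10, 30, 20)
)

# ADM2-level sum of facility means and number of facilities
adm2_sum <- ave(hf$MEAN_REPORTED_CASES_BY_HF, hf$ADM2_ID, FUN = sum)
adm2_n   <- ave(hf$MEAN_REPORTED_CASES_BY_HF, hf$ADM2_ID, FUN = length)

# WEIGHT = facility mean / ADM2 sum of means * number of facilities
hf$WEIGHT <- hf$MEAN_REPORTED_CASES_BY_HF / adm2_sum * adm2_n

# A month in which only the largest facility (hf2) reported
active <- c(0, 1, 0)
rr_unweighted <- sum(active) / length(active)
rr_weighted   <- sum(active * hf$WEIGHT) / sum(hf$WEIGHT)
c(unweighted = rr_unweighted, weighted = rr_weighted)
```

The weights come out as 0.5, 1.5 and 1.0, so the month counts as one third reported when unweighted but as one half when weighted, because the reporting facility carries a larger share of the caseload.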

In [None]:
log_msg(glue("Computing volume of activity using indicator: {paste(VOLUME_ACTIVITY_INDICATORS, collapse=', ')}"))

# Compute MEAN_REPORTED_CASES_BY_HF as total cases over months with activity
mean_monthly_cases <- dhis2_routine %>% 
    mutate(total_cases_by_hf_month = rowSums(across(all_of(VOLUME_ACTIVITY_INDICATORS)), na.rm = TRUE)) %>%
    group_by(ADM2_ID, OU_ID) %>% 
    summarise(
        total_cases_by_hf_year = sum(total_cases_by_hf_month, na.rm = TRUE),
        number_of_reporting_months = sum(total_cases_by_hf_month > 0, na.rm = TRUE),
        .groups = "drop"
    ) %>% 
    mutate(MEAN_REPORTED_CASES_BY_HF = total_cases_by_hf_year / number_of_reporting_months) %>%
    select(ADM2_ID, OU_ID, MEAN_REPORTED_CASES_BY_HF)

mean_monthly_cases_adm2 <- mean_monthly_cases %>% 
    select(ADM2_ID, MEAN_REPORTED_CASES_BY_HF) %>% 
    group_by(ADM2_ID) %>% 
    summarise(SUMMED_MEAN_REPORTED_CASES_BY_ADM2 = sum(MEAN_REPORTED_CASES_BY_HF, na.rm=TRUE), 
              NR_OF_HF = n())

# Compute weights
hf_weights <- mean_monthly_cases %>% 
    left_join(mean_monthly_cases_adm2, by = "ADM2_ID") %>%
    mutate(WEIGHT = MEAN_REPORTED_CASES_BY_HF / SUMMED_MEAN_REPORTED_CASES_BY_ADM2 * NR_OF_HF)

# Join with rest of data
facility_master_routine_02 <- facility_master_routine_01 %>%
    left_join(hf_weights %>% select(OU_ID, WEIGHT), by = c("OU_ID"))

#### 3.6. Compute Weighted variables

In [None]:
log_msg(glue("Computing weighted variables for reporting rate calculation."))

facility_master_routine_02$ACTIVE_THIS_PERIOD_W <- facility_master_routine_02$ACTIVE_THIS_PERIOD * facility_master_routine_02$WEIGHT
facility_master_routine_02$COUNT_W <- facility_master_routine_02$COUNT * facility_master_routine_02$WEIGHT   
facility_master_routine_02$OPEN_W <- facility_master_routine_02$OPEN * facility_master_routine_02$WEIGHT
facility_master_routine_02$ACTIVE_THIS_YEAR_W <- facility_master_routine_02$ACTIVE_THIS_YEAR * facility_master_routine_02$WEIGHT

dim(facility_master_routine_02)
head(facility_master_routine_02, 2)

#### 3.7. Aggregate data at ADM2 level

In [None]:
log_msg(glue("Aggregating data at admin level 2."))

reporting_rate_adm2 <- facility_master_routine_02 %>% 
    group_by(ADM1_ID, ADM1_NAME, ADM2_ID, ADM2_NAME, YEAR, PERIOD) %>%
    summarise(
              HF_ACTIVE_THIS_PERIOD_BY_ADM2 = sum(ACTIVE_THIS_PERIOD, na.rm = TRUE), # (numerator) sum of all facilities active per PERIOD
              NR_OF_HF_BY_ADM2 = sum(COUNT, na.rm = TRUE),
              NR_OF_OPEN_HF_BY_ADM2 = sum(OPEN, na.rm = TRUE),
              HF_ACTIVE_THIS_YEAR_BY_ADM2 = sum(ACTIVE_THIS_YEAR, na.rm = TRUE), # (denominator) sum of all facilities active at least once in the YEAR
              HF_ACTIVE_THIS_PERIOD_BY_ADM2_WEIGHTED = sum(ACTIVE_THIS_PERIOD_W, na.rm = TRUE),
              NR_OF_HF_BY_ADM2_WEIGHTED = sum(COUNT_W, na.rm = TRUE),
              NR_OF_OPEN_HF_BY_ADM2_WEIGHTED = sum(OPEN_W, na.rm = TRUE),
              HF_ACTIVE_THIS_YEAR_BY_ADM2_WEIGHTED = sum(ACTIVE_THIS_YEAR_W, na.rm = TRUE),             
              .groups = "drop")

dim(reporting_rate_adm2)
# head(reporting_rate_adm2, 5)

#### 3.8. Calculate Reporting Rates (all methods)

In [None]:
log_msg(glue("Calculating Reporting Rates at admin level 2. Using all methods, weighted and unweighted."))

reporting_rate_adm2 <- reporting_rate_adm2 %>% 
      mutate(
        RR_TOTAL_HF = HF_ACTIVE_THIS_PERIOD_BY_ADM2 / NR_OF_HF_BY_ADM2,
        RR_OPEN_HF = HF_ACTIVE_THIS_PERIOD_BY_ADM2 / NR_OF_OPEN_HF_BY_ADM2,
        RR_ACTIVE_HF = HF_ACTIVE_THIS_PERIOD_BY_ADM2 / HF_ACTIVE_THIS_YEAR_BY_ADM2,
        RR_TOTAL_HF_W = HF_ACTIVE_THIS_PERIOD_BY_ADM2_WEIGHTED / NR_OF_HF_BY_ADM2_WEIGHTED,
        RR_OPEN_HF_W = HF_ACTIVE_THIS_PERIOD_BY_ADM2_WEIGHTED / NR_OF_OPEN_HF_BY_ADM2_WEIGHTED,
        RR_ACTIVE_HF_W = HF_ACTIVE_THIS_PERIOD_BY_ADM2_WEIGHTED / HF_ACTIVE_THIS_YEAR_BY_ADM2_WEIGHTED
      )

dim(reporting_rate_adm2)
head(reporting_rate_adm2, 5)

## 4. Select correct col for `REPORTING_RATE` based on denominator method

### 4.1. Select results and format

In [None]:
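if (DATAELEMENT_METHOD_DENOMINATOR == "ROUTINE_ACTIVE_FACILITIES") { 
    rr_column_selection <- "RR_ACTIVE_HF" 
    if (USE_WEIGHTED_REPORTING_RATES) {
        rr_column_selection <- "RR_ACTIVE_HF_W"
    }
} else if (DATAELEMENT_METHOD_DENOMINATOR == "PYRAMID_OPEN_FACILITIES") {
    rr_column_selection <- "RR_OPEN_HF"
    if (USE_WEIGHTED_REPORTING_RATES) {
        rr_column_selection <- "RR_OPEN_HF_W"
    }
} else {
    # Fail early instead of leaving rr_column_selection undefined
    msg <- glue("[ERROR] Unknown DATAELEMENT_METHOD_DENOMINATOR: {DATAELEMENT_METHOD_DENOMINATOR}. Expected 'ROUTINE_ACTIVE_FACILITIES' or 'PYRAMID_OPEN_FACILITIES'.")
    cat(msg)
    stop(msg)
}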

In [None]:
log_msg(glue("Using reporting rate column: `{rr_column_selection}` 
based on DATAELEMENT_METHOD_DENOMINATOR == {DATAELEMENT_METHOD_DENOMINATOR} 
and USE_WEIGHTED_REPORTING_RATES == {USE_WEIGHTED_REPORTING_RATES}"))

In [None]:
log_msg(glue("Formatting table for '{DATAELEMENT_METHOD_DENOMINATOR}' selection."))

# Select column and format final table
reporting_rate_dataelement <- reporting_rate_adm2 %>%
    mutate(MONTH = PERIOD %% 100) %>%
    rename(REPORTING_RATE = !!sym(rr_column_selection)) %>%
    select(all_of(fixed_cols_rr))

print(dim(reporting_rate_dataelement))
head(reporting_rate_dataelement, 3)

## 5. Inspect reporting rate values

In [None]:
hist(reporting_rate_dataelement$REPORTING_RATE, breaks=50, 
main=paste0("Histogram of REPORTING_RATE\n(", DATAELEMENT_METHOD_DENOMINATOR, ",\n", ifelse(USE_WEIGHTED_REPORTING_RATES, "Weighted", "Unweighted"), ")"), 
xlab="REPORTING_RATE")

In [None]:
# Boxplot
ggplot(reporting_rate_dataelement,
       aes(x = factor(YEAR), y = REPORTING_RATE)) +
  geom_boxplot(outlier.alpha = 0.3) +
  labs(
    x = "Year",
    y = glue::glue("REPORTING_RATE ({DATAELEMENT_METHOD_DENOMINATOR})"),
    title = "Distribution of REPORTING_RATE per year",
    subtitle = ifelse(USE_WEIGHTED_REPORTING_RATES, "Weighted Reporting Rates", "Unweighted Reporting Rates")
  ) +
  theme_minimal()

In [None]:
ggplot(reporting_rate_dataelement,
       aes(x = factor(YEAR), y = REPORTING_RATE)) +
# Boxplot without outliers
  geom_boxplot(outlier.alpha = 0) +
  geom_point(alpha = 0.3, position = position_jitter(width = 0.35)) +
  labs(
    x = "Year",
    y = glue::glue("REPORTING_RATE based on {DATAELEMENT_METHOD_DENOMINATOR}"),
    title = "Distribution of REPORTING_RATE per year",
    subtitle = ifelse(USE_WEIGHTED_REPORTING_RATES, "Weighted Reporting Rates", "Unweighted Reporting Rates")
  ) +
  theme_minimal()

## 6. 📁 Export to `data/` folder

In [None]:
output_data_path <- file.path(DATA_PATH, "reporting_rate")

# parquet
file_path <- file.path(output_data_path, paste0(COUNTRY_CODE, "_reporting_rate_dataelement.parquet"))
write_parquet(reporting_rate_dataelement, file_path)
log_msg(glue("Exported : {file_path}"))

# csv
file_path <- file.path(output_data_path, paste0(COUNTRY_CODE, "_reporting_rate_dataelement.csv"))
write.csv(reporting_rate_dataelement, file_path, row.names = FALSE)
log_msg(glue("Exported : {file_path}"))