This notebook computes a quarterly report on core metrics.

In [7]:
import numpy as np
import pandas as pd

import src.utils as utils

In [8]:
metrics = utils.load_all_metric_files()

In [9]:
reporting_period = pd.Period("2023Q4", freq="Q-JUN")

"Q-JUN" denotes quarters of a fiscal year that ends in June. pandas labels a fiscal year by the calendar year in which it ends, so "2023Q4" means Q4 of the 2022-23 fiscal year (April through June 2023), _not_ Q4 of the 2023-24 fiscal year. This is consistent with the guidance on https://office.wikimedia.org/wiki/Quarters.
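As a quick check of that labeling convention, the following minimal sketch inspects the boundaries of the same period. It relies only on standard pandas `Period` attributes; the variable name `period` is chosen here for illustration.

```python
import pandas as pd

# Same period as the report: Q4 of the fiscal year that ends in June 2023.
period = pd.Period("2023Q4", freq="Q-JUN")

# The period spans April through June 2023.
print(period.start_time.date())  # first day covered by the period
print(period.end_time.date())    # last day covered by the period

# qyear is the fiscal-year label; it matches the calendar year in which
# the fiscal year ends.
print(period.qyear)
```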

In [10]:
# Pad both ends of the metrics index with any missing months so that it covers only full quarters.
# This way, when we resample to quarterly averages, a quarter with missing months gets a null
# value instead of an average based on partial data.
first_quarter = metrics.index[0].asfreq("Q-JUN")
last_quarter = metrics.index[-1].asfreq("Q-JUN")
new_index = pd.period_range(first_quarter.start_time, last_quarter.end_time, freq="M")

quarterly_averages = (
    metrics
    .reindex(new_index)
    .resample("Q-JUN")
    .aggregate(
        # We need the lambda function because a plain "mean" would get translated
        # into PeriodIndexResampler.mean, which has no option to retain NaNs. With
        # skipna=False, a quarter with any missing month is reported as NaN rather
        # than as a misleading quarterly value based on partial data.
        lambda x: x.mean(skipna=False)
    )
)

If the report table below is missing values, the underlying data is probably incomplete (for example, the data for the last month of the quarter may be missing). Check the data files in the "data" directory to investigate.
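To illustrate why a single missing month blanks out a whole quarter, here is a small sketch on made-up data. The series `toy` and its values are hypothetical, and it groups by quarter with `groupby` rather than `resample`, so it mirrors the idea of the cell above, not its exact code path.

```python
import numpy as np
import pandas as pd

# Hypothetical monthly metric: data runs January to May 2023, so the
# April-June quarter is missing its last month.
index = pd.period_range("2023-01", "2023-05", freq="M")
toy = pd.Series([10.0, 20.0, 30.0, 40.0, 50.0], index=index, name="toy_metric")

# Pad the index to full quarters; June 2023 becomes NaN.
padded_index = pd.period_range("2023-01", "2023-06", freq="M")
padded = toy.reindex(padded_index)

# Map each month to its fiscal quarter (fiscal year ending in June).
quarters = padded.index.asfreq("Q-JUN")

# The default mean skips NaNs, so the partial quarter still gets a value.
skipping = padded.groupby(quarters).mean()

# With skipna=False, any missing month makes the quarterly average NaN.
strict = padded.groupby(quarters).agg(lambda x: x.mean(skipna=False))

print(skipping)
print(strict)
```

In this sketch the complete quarter averages to the same value either way, while the partial quarter gets an average of its two available months under the default and NaN under `skipna=False`.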

In [11]:
core_metrics = [
    # % new quality biography articles about women and gender-diverse people
    # % new quality articles about regions that are underrepresented, compared to world population
    "unique_devices",
    "south_asia_unique_devices",
    "latin_america_caribbean_unique_devices",
    "north_america_unique_devices",
    "northern_western_europe_unique_devices"
]

(
    quarterly_averages
    .reindex(core_metrics, axis="columns")
    .apply(utils.calc_rpt, reporting_period=reporting_period)
    .transpose()
    .pipe(utils.format_report, metrics_type="core", reporting_period=reporting_period)
)

2023Q4 core metrics

| | value | year_over_year_change | naive_forecast |
|---|---|---|---|
| unique_devices | 1.59 B | -18.9% | 1.24 B |
| south_asia_unique_devices | 151.0 M | -29.4% | 107.0 M |
| latin_america_caribbean_unique_devices | 151.0 M | -22.2% | 116.0 M |
| north_america_unique_devices | 316.0 M | -6.4% | 274.0 M |
| northern_western_europe_unique_devices | 381.0 M | -8.1% | 324.0 M |
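
The internals of `utils.calc_rpt` are not shown in this notebook. As a rough sketch of how a year-over-year change can be computed with quarterly periods, assuming the conventional definition (current value divided by the value four quarters earlier, minus one), the example below uses made-up numbers. The series `toy_quarterly` and the variable names are hypothetical and are not taken from `src.utils`.

```python
import pandas as pd

reporting_period = pd.Period("2023Q4", freq="Q-JUN")

# Subtracting an integer from a Period steps back by that many periods,
# so this is the same fiscal quarter one year earlier.
year_ago = reporting_period - 4

# Hypothetical quarterly averages for the two quarters being compared.
toy_quarterly = pd.Series(
    [200.0, 150.0],
    index=pd.PeriodIndex([year_ago, reporting_period], freq="Q-JUN"),
)

# Assumed definition: relative change against the same quarter a year ago.
change = toy_quarterly.loc[reporting_period] / toy_quarterly.loc[year_ago] - 1

print(year_ago)
print(f"{change:.1%}")
```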
