# Validation

PolicyEngine-UK runs unit and integration tests on each new version (see the [test suite](https://github.com/PSLmodels/openfisca-uk/tree/master/tests)).
In addition, the tables below show the aggregates and caseloads produced by the model for the major taxes and benefits, and compare them with UKMOD (latest [country report](https://www.iser.essex.ac.uk/research/publications/working-papers/cempa/cempa2-22.pdf)) and official sources.[^1]
UKMOD and administrative sources refer to 2018, and PolicyEngine-UK is simulated on policy at the end of 2018.
Aggregates are in billions of pounds, caseloads are in millions, and relative errors are percentages.

[^1]: From the UKMOD country report. Unless otherwise specified: Department for Work and Pensions, <https://www.gov.uk/government/publications/benefit-expenditure-and-caseload-tables-2018>; Best Start Grant: <https://www2.gov.scot/Topics/Statistics/Browse/Social-Welfare/SocialSecurityforScotland/BSGJune2019>; Child tax credit and working tax credit: HMRC statistics, <https://www.gov.uk/government/statistics/child-and-working-tax-credits-statistics-finalised-annual-awards-2016-to-2017>; Scottish Child Payment: Scottish Fiscal Commission, <https://www.fiscalcommission.scot/forecast/supplementary-costing-scottish-child-payment>; Scottish Child Winter Heating Assistance: Scottish Fiscal Commission, <https://www.fiscalcommission.scot/forecast/supplementary-costing-child-winter-heating-assistance>; Income tax: HMRC statistics, <https://www.gov.uk/government/statistics/income-tax-liabilities-statistics-tax-year-2014-to-2015-to-tax-year-2017-to-2018>; National Insurance Contributions: ONS Blue Book Table 5.2.4s.

## Aggregate tables

PolicyEngine-UK uprates input FRS data: below are comparisons between the aggregates calculated by PolicyEngine-UK, UKMOD and external sources.
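
As a rough illustration of what an aggregate means in these tables: each surveyed household carries a weight that says how many real households it stands for, and a program's aggregate is the weighted sum of the simulated amounts. The sketch below uses plain Python with a hypothetical helper (`weighted_total`) and made-up numbers; it is not PolicyEngine-UK's implementation, only a toy version of the idea.

```python
def weighted_total(values, weights):
    """Sum per-household amounts, each scaled by its survey weight."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    return sum(value * weight for value, weight in zip(values, weights))


# Hypothetical toy data: three sampled households.
# Amounts are annual pounds; weights are the number of households represented.
child_benefit = [1_000.0, 0.0, 2_000.0]
household_weight = [1_200.0, 900.0, 1_500.0]

total = weighted_total(child_benefit, household_weight)
print(f"Aggregate: £{total / 1e9:.4f}bn")
```

Dividing by `1e9` mirrors the `divisor=1e9` argument used for the aggregate tables, which is why those figures read as billions of pounds.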

### Aggregates in full

In [1]:
import numpy as np
import pandas as pd
from policyengine_uk import (
    Microsimulation,
    REPO,
)
from policyengine_uk.data import CalibratedSPIEnhancedPooledFRS_2018_20
from policyengine_uk.system import (
    parameters as BASELINE_PARAMETERS,
    variables as BASELINE_VARIABLES,
)
from policyengine_core.parameters import ParameterNode
from pathlib import Path
import yaml
import plotly.express as px
import warnings

# Silence all warnings (this also hides divide-by-zero warnings in the error tables).
warnings.filterwarnings("ignore")

policyengine_uk = "PolicyEngine-UK"
UKMOD = "UKMOD"
EXTERNAL = "External"

parameters = BASELINE_PARAMETERS
default_variables = BASELINE_VARIABLES

VARIABLES = [
    "child_benefit",
    "income_support",
    "JSA_income",
    "housing_benefit",
    "working_tax_credit",
    "child_tax_credit",
    "universal_credit",
    "pension_credit",
    "income_tax",
    "total_NI",
    "employment_income",
    "self_employment_income",
    "pension_income",
    "property_income",
    "savings_interest_income",
    "dividend_income",
]

sim = Microsimulation(dataset=CalibratedSPIEnhancedPooledFRS_2018_20)

# https://stackoverflow.com/questions/34667108/ignore-dates-and-times-while-parsing-yaml

yaml.SafeLoader.yaml_implicit_resolvers = {
    k: [r for r in v if r[0] != "tag:yaml.org,2002:timestamp"]
    for k, v in yaml.SafeLoader.yaml_implicit_resolvers.items()
}

with open(
    REPO.parent
    / "docs"
    / "book"
    / "model"
    / "ukmod_country_report_statistics.yaml",
    mode="r",
) as f:
    ukmod_statistics = ParameterNode(
        "ukmod", data=yaml.load(f, Loader=yaml.SafeLoader)
    )


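def process_scalar(x, divisor, decimals):
    """Scale and round a value for display; keep NaN, blank out non-numeric results."""
    try:
        if np.isnan(x):
            return x
    except (TypeError, ValueError):
        # None and other non-numeric results have no value to show.
        return ""
    return round(x / divisor, decimals)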


def model_validation_table(
    model_year_variable_to_result_func,
    title=None,
    start_year=2022,
    end_year=2025,
    divisor=1,
    decimals=0,
    models=[EXTERNAL, policyengine_uk, UKMOD],
    variables=VARIABLES,
):
    dfs = []
    for model in models:
        df = pd.DataFrame(
            {
                year: {
                    BASELINE_VARIABLES[variable].label: process_scalar(
                        model_year_variable_to_result_func(
                            model, year, variable
                        ),
                        divisor,
                        decimals,
                    )
                    for variable in variables
                }
                for year in range(start_year, end_year + 1)
            }
        )
        dfs.append(df.T)
    return pd.concat(dfs, keys=models).replace(np.nan, "")


def budgetary_impact(model, year, variable):
    try:
        if model == policyengine_uk:
            return sim.calc(variable, map_to="household", period=year).sum()
        elif model == UKMOD:
            return getattr(ukmod_statistics.ukmod.budgetary_impact, variable)(
                f"{year}-01-01"
            )
        elif model == EXTERNAL:
            param = BASELINE_PARAMETERS.calibration.programs.children[
                variable
            ].budgetary_impact
            if variable == "income_tax":
                param = param.by_country
            if "UNITED_KINGDOM" in param.children:
                return param.UNITED_KINGDOM(f"{year}-01-01")
            elif "GREAT_BRITAIN" in param.children:
                return param.GREAT_BRITAIN(f"{year}-01-01")
    except Exception:
        # Missing parameters or variables show up as NaN instead of breaking the table.
        return np.nan


df = model_validation_table(budgetary_impact, divisor=1e9, decimals=1)

df[df.columns[:7]]

| Model | Year | Child Benefit | Income Support | JSA (income-based) | Housing Benefit | Working Tax Credit | Child Tax Credit | Universal Credit |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| External | 2022 | 11.2 | 0.7 | 0.2 | 15.9 | 1.7 | 6.1 | 43.7 |
| External | 2023 | 11.5 | 0.5 | 0.0 | 14.7 | 1.3 | 4.6 | 49.8 |
| External | 2024 | 11.6 | 0.5 | 0.0 | 12.5 | 0.8 | 3.0 | 57.8 |
| External | 2025 | 11.6 | 0.4 | 0.0 | 10.2 | 0.4 | 1.4 | 68.0 |
| PolicyEngine-UK | 2022 | 11.2 | 0.8 | 0.2 | 15.5 | 1.7 | 6.2 | 43.2 |
| PolicyEngine-UK | 2023 | 11.6 | 0.8 | 0.2 | 14.2 | 1.7 | 6.2 | 41.5 |
| PolicyEngine-UK | 2024 | 11.5 | 0.8 | 0.2 | 13.9 | 1.6 | 5.9 | 41.2 |
| PolicyEngine-UK | 2025 | 11.5 | 0.8 | 0.2 | 13.6 | 1.5 | 5.7 | 40.8 |
| UKMOD | 2022 | 12.0 | | | 10.3 | 1.6 | 5.4 | 34.1 |
| UKMOD | 2023 | 12.3 | | | 9.6 | 1.3 | 4.5 | 39.1 |


In [2]:
df[df.columns[7:14]]

| Model | Year | Pension Credit | Income Tax | National Insurance (total) | employment income | self-employment income | pension income | rental income |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| External | 2022 | 4.5 | 205.0 | 151.3 | 868.6 | 93.3 | | 25.0 |
| External | 2023 | 4.4 | 215.0 | 157.5 | 894.7 | 96.1 | | 25.7 |
| External | 2024 | 4.4 | 225.7 | 164.6 | 922.4 | 99.0 | | 26.5 |
| External | 2025 | 4.2 | 237.5 | 172.1 | 949.2 | 101.9 | | 27.3 |
| PolicyEngine-UK | 2022 | 4.7 | 190.3 | 150.8 | 879.5 | 103.2 | 114.5 | 24.3 |
| PolicyEngine-UK | 2023 | 4.7 | 201.4 | 148.7 | 906.0 | 106.3 | 117.9 | 25.1 |
| PolicyEngine-UK | 2024 | 4.3 | 212.7 | 154.9 | 934.0 | 109.6 | 121.6 | 25.8 |
| PolicyEngine-UK | 2025 | 4.0 | 218.7 | 160.8 | 961.1 | 112.8 | 125.1 | 26.6 |
| UKMOD | 2022 | 4.2 | 161.2 | 156.0 | | | | |
| UKMOD | 2023 | 4.4 | 170.6 | 145.1 | | | | |


In [3]:
df[df.columns[14:]]

| Model | Year | savings interest income | dividend income |
| --- | --- | --- | --- |
| External | 2022 | 5.3 | 69.6 |
| External | 2023 | 5.5 | 71.7 |
| External | 2024 | 5.6 | 73.9 |
| External | 2025 | 5.8 | 76.0 |
| PolicyEngine-UK | 2022 | 5.2 | 45.4 |
| PolicyEngine-UK | 2023 | 5.4 | 46.8 |
| PolicyEngine-UK | 2024 | 5.6 | 48.2 |
| PolicyEngine-UK | 2025 | 5.7 | 49.6 |
| UKMOD | 2022 | | |
| UKMOD | 2023 | | |


### Forecast comparison

In [4]:
def table_to_model_comparison(table):
    df = table.reset_index()
    df.columns = ["Model", "Year"] + list(table.columns)
    return (
        pd.melt(df, id_vars=["Year", "Model"])
        .pivot(index=["Year", "variable"], columns="Model", values="value")
        .reset_index()
        .rename(columns=dict(variable="Program"))
    )


def tables_to_model_comparisons(tables):
    dfs = []
    for key, table in tables.items():
        df = table_to_model_comparison(table)
        df = df.rename(
            columns={
                column: f"{column} {key}"
                if key != "" and column not in ("Year", "Model", "Program")
                else column
                for column in df.columns
            }
        )
        dfs.append(df)
    df = pd.concat(dfs, axis=1)
    df = df.loc[:, ~df.columns.duplicated()]
    return df


def model_forecast_chart(table, title=None, currency=True):
    hovertemplate = ""
    df = table_to_model_comparison(table)
    fig = (
        px.line(
            df,
            animation_frame="Program",
            x="Year",
            y=[EXTERNAL, policyengine_uk, UKMOD],
            color_discrete_map={
                policyengine_uk: "blue",
                UKMOD: "lightgrey",
            },
        )
        .update_layout(
            width=800,
            height=600,
            yaxis_tickprefix="£" if currency else "",
            title=title,
            template="plotly_white",
            legend_title="Model",
            yaxis_title="",
            yaxis_range=(0, 20e9) if currency else (0, 10e6),
            legend_traceorder="reversed",
            xaxis_tickvals=list(range(2022, 2026)),
        )
        .update_traces(hovertemplate=hovertemplate)
    )
    for frame in fig.frames:
        for data in frame.data:
            data.hovertemplate = hovertemplate
    return fig


model_forecast_chart(
    model_validation_table(budgetary_impact, divisor=1, decimals=1),
    title="Budgetary impact forecasts",
)

### Differences

#### Absolute

In [5]:
def budgetary_impact_error(model, year, variable):
    try:
        if model == policyengine_uk:
            return sim.calc(
                variable, map_to="household", period=year
            ).sum() - budgetary_impact(EXTERNAL, year, variable)
        elif model == UKMOD:
            return getattr(ukmod_statistics.ukmod.budgetary_impact, variable)(
                f"{year}-01-01"
            ) - budgetary_impact(EXTERNAL, year, variable)
    except Exception:
        return np.nan


df = model_validation_table(
    budgetary_impact_error,
    models=[policyengine_uk, UKMOD],
    divisor=1e9,
    decimals=1,
)

df[df.columns[:7]]

| Model | Year | Child Benefit | Income Support | JSA (income-based) | Housing Benefit | Working Tax Credit | Child Tax Credit | Universal Credit |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| PolicyEngine-UK | 2022 | -0.0 | 0.1 | 0.0 | -0.4 | 0.0 | 0.0 | -0.4 |
| PolicyEngine-UK | 2023 | 0.0 | 0.3 | 0.2 | -0.5 | 0.5 | 1.6 | -8.3 |
| PolicyEngine-UK | 2024 | -0.1 | 0.3 | 0.2 | 1.4 | 0.8 | 2.9 | -16.6 |
| PolicyEngine-UK | 2025 | -0.1 | 0.4 | 0.2 | 3.4 | 1.1 | 4.3 | -27.2 |
| UKMOD | 2022 | 0.8 | | | -5.6 | -0.1 | -0.8 | -9.6 |
| UKMOD | 2023 | 0.8 | | | -5.1 | 0.0 | -0.1 | -10.6 |
| UKMOD | 2024 | 0.9 | | | -4.0 | 0.1 | 0.3 | -18.6 |
| UKMOD | 2025 | 1.0 | | | -3.5 | 0.0 | 0.1 | -22.3 |


In [6]:
df[df.columns[7:14]]

| Model | Year | Pension Credit | Income Tax | National Insurance (total) | employment income | self-employment income | pension income | rental income |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| PolicyEngine-UK | 2022 | 0.2 | -14.7 | -0.5 | 10.9 | 9.9 | | -0.6 |
| PolicyEngine-UK | 2023 | 0.2 | -13.6 | -8.8 | 11.2 | 10.2 | | -0.7 |
| PolicyEngine-UK | 2024 | -0.1 | -13.0 | -9.7 | 11.6 | 10.5 | | -0.7 |
| PolicyEngine-UK | 2025 | -0.3 | -18.8 | -11.3 | 11.9 | 10.9 | | -0.7 |
| UKMOD | 2022 | -0.3 | -43.8 | 4.7 | | | | |
| UKMOD | 2023 | -0.0 | -44.4 | -12.4 | | | | |
| UKMOD | 2024 | 0.1 | -47.2 | -16.7 | | | | |
| UKMOD | 2025 | 0.2 | -49.6 | -20.2 | | | | |


In [7]:
df[df.columns[14:]]

| Model | Year | savings interest income | dividend income |
| --- | --- | --- | --- |
| PolicyEngine-UK | 2022 | -0.1 | -24.2 |
| PolicyEngine-UK | 2023 | -0.1 | -24.9 |
| PolicyEngine-UK | 2024 | -0.1 | -25.7 |
| PolicyEngine-UK | 2025 | -0.1 | -26.4 |
| UKMOD | 2022 | | |
| UKMOD | 2023 | | |
| UKMOD | 2024 | | |
| UKMOD | 2025 | | |
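
To make the absolute differences concrete, here is a small worked example in plain Python. It takes the 2022 Income Tax figures from the aggregate tables (£190.3bn for PolicyEngine-UK, £205.0bn external), subtracts the external value from the model value, then scales and rounds the result the way `process_scalar` does with `divisor=1e9` and `decimals=1`. The helper name `absolute_error` is hypothetical, and the inputs are the rounded published figures rather than the unrounded values the notebook works with.

```python
def absolute_error(model_value, external_value):
    """Model minus external: negative means the model is below the official figure."""
    return model_value - external_value


# 2022 Income Tax, in pounds (rounded values taken from the aggregate tables).
income_tax_model = 190.3e9
income_tax_external = 205.0e9

error = absolute_error(income_tax_model, income_tax_external)
error_bn = round(error / 1e9, 1)  # same scaling as divisor=1e9, decimals=1
print(error_bn)
```

The result matches the -14.7 shown for PolicyEngine-UK Income Tax in 2022.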


#### Relative

In [8]:
def relative_budgetary_impact_error(model, year, variable):
    try:
        if model == policyengine_uk:
            return (
                sim.calc(variable, map_to="household", period=year).sum()
                / budgetary_impact(EXTERNAL, year, variable)
                - 1
            )
        elif model == UKMOD:
            return (
                getattr(ukmod_statistics.ukmod.budgetary_impact, variable)(
                    f"{year}-01-01"
                )
                / budgetary_impact(EXTERNAL, year, variable)
                - 1
            )
    except Exception:
        return np.nan


df = model_validation_table(
    relative_budgetary_impact_error,
    models=[policyengine_uk, UKMOD],
    divisor=1e-2,
    decimals=1,
)

df[df.columns[:7]]

| Model | Year | Child Benefit | Income Support | JSA (income-based) | Housing Benefit | Working Tax Credit | Child Tax Credit | Universal Credit |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| PolicyEngine-UK | 2022 | -0.0 | 15.9 | 9.9 | -2.4 | 0.0 | 0.0 | -1.0 |
| PolicyEngine-UK | 2023 | 0.1 | 61.8 | 1054052.5 | -3.7 | 37.1 | 34.9 | -16.6 |
| PolicyEngine-UK | 2024 | -0.7 | 68.6 | inf | 11.1 | 93.2 | 96.4 | -28.8 |
| PolicyEngine-UK | 2025 | -0.7 | 83.7 | inf | 33.1 | 295.7 | 315.5 | -40.0 |
| UKMOD | 2022 | 6.7 | | | -35.0 | -7.2 | -12.5 | -22.0 |
| UKMOD | 2023 | 6.8 | | | -34.7 | 3.8 | -1.5 | -21.3 |
| UKMOD | 2024 | 7.9 | | | -32.2 | 7.4 | 8.5 | -32.3 |
| UKMOD | 2025 | 8.2 | | | -34.3 | 11.6 | 6.3 | -32.8 |


In [9]:
df[df.columns[7:14]]

| Model | Year | Pension Credit | Income Tax | National Insurance (total) | employment income | self-employment income | pension income | rental income |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| PolicyEngine-UK | 2022 | 5.4 | -7.2 | -0.3 | 1.3 | 10.6 | | -2.6 |
| PolicyEngine-UK | 2023 | 5.0 | -6.3 | -5.6 | 1.3 | 10.6 | | -2.6 |
| PolicyEngine-UK | 2024 | -1.9 | -5.7 | -5.9 | 1.3 | 10.6 | | -2.6 |
| PolicyEngine-UK | 2025 | -6.3 | -7.9 | -6.5 | 1.3 | 10.6 | | -2.6 |
| UKMOD | 2022 | -7.0 | -21.4 | 3.1 | | | | |
| UKMOD | 2023 | -0.8 | -20.6 | -7.9 | | | | |
| UKMOD | 2024 | 2.6 | -20.9 | -10.2 | | | | |
| UKMOD | 2025 | 5.8 | -20.9 | -11.8 | | | | |


In [10]:
df[df.columns[14:]]

| Model | Year | savings interest income | dividend income |
| --- | --- | --- | --- |
| PolicyEngine-UK | 2022 | -1.4 | -34.7 |
| PolicyEngine-UK | 2023 | -1.4 | -34.7 |
| PolicyEngine-UK | 2024 | -1.4 | -34.7 |
| PolicyEngine-UK | 2025 | -1.4 | -34.7 |
| UKMOD | 2022 | | |
| UKMOD | 2023 | | |
| UKMOD | 2024 | | |
| UKMOD | 2025 | | |
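
The relative errors are the model value divided by the external value, minus one, shown as a percentage. The `inf` entries for JSA (income-based) appear where the external figure is zero: NumPy floating-point division by zero yields infinity with a warning rather than raising, and the notebook silences warnings. The plain-Python sketch below mimics that behaviour explicitly. The helper names `relative_error` and `as_percent` are hypothetical, and the inputs are the rounded table values.

```python
import math


def relative_error(model_value, external_value):
    """Return model / external - 1, mimicking NumPy's result for a zero denominator."""
    if external_value == 0:
        if model_value == 0:
            return math.nan  # 0 / 0 is undefined
        return math.copysign(math.inf, model_value)
    return model_value / external_value - 1


def as_percent(error, decimals=1):
    """Scale a fraction to a percentage, as divisor=1e-2 does in the tables."""
    return round(error / 1e-2, decimals)


# 2022 Income Tax: £190.3bn modelled against £205.0bn external.
income_tax_2022 = as_percent(relative_error(190.3, 205.0))
# 2024 JSA (income-based): £0.2bn modelled against an external figure of zero.
jsa_2024 = relative_error(0.2, 0.0)

print(income_tax_2022, jsa_2024)
```

A very small but nonzero external figure, which appears to be the case for JSA in 2023, gives a huge finite percentage instead of `inf`. In both situations the absolute difference is the more informative measure.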


In [11]:
pd.set_option("display.max_colwidth", 0)
pd.set_option("display.max_rows", 500)


def error_chart(table, title=None):
    hovertemplate = "<b>%{customdata[4]} in %{customdata[3]}</b><br>Error: %{x}<br>Official: £%{customdata[2]}bn<br>PolicyEngine-UK: £%{customdata[0]}bn<br>UKMOD: £%{customdata[1]}bn"
    table = table.replace("", np.nan).dropna(axis=0)
    table[[policyengine_uk, UKMOD]] = (
        table[[policyengine_uk, UKMOD]].abs() / 1e2
    )
    fig = (
        px.bar(
            table.sort_values(["Year", policyengine_uk]),
            x=[policyengine_uk, UKMOD],
            y="Program",
            orientation="h",
            animation_frame="Year",
            barmode="group",
            color_discrete_map={
                policyengine_uk: "blue",
                UKMOD: "lightgrey",
            },
            custom_data=[
                "PolicyEngine-UK budgetary impact",
                "UKMOD budgetary impact",
                "External budgetary impact",
                "Year",
                "Program",
            ],
        )
        .update_layout(
            width=800,
            height=600,
            xaxis_tickprefix="£",
            xaxis_title="Budgetary impact error",
            title=title,
            template="plotly_white",
            legend_title="Model",
            yaxis_title="",
            legend_traceorder="reversed",
        )
        .update_traces(hovertemplate=hovertemplate)
    )
    for frame in fig.frames:
        for data in frame.data:
            data.hovertemplate = hovertemplate
    return fig


error_chart(
    tables_to_model_comparisons(
        {
            "": model_validation_table(
                budgetary_impact_error,
                models=[policyengine_uk, UKMOD],
                divisor=1e-2,
                decimals=1,
            ),
            "budgetary impact": model_validation_table(
                budgetary_impact, divisor=1e9, decimals=1
            ),
        }
    ),
    title="Budgetary impact error",
)

## Caseload tables

PolicyEngine-UK uprates input FRS data. Below are comparisons between the caseloads (the weighted number of units with a nonzero amount of each variable) calculated by PolicyEngine-UK, UKMOD and external sources.
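
As a simplified picture of a caseload, the sketch below counts the weighted number of units whose amount is greater than zero. It is plain Python with a hypothetical helper (`weighted_caseload`) and made-up data, and it skips the step of mapping person-level or benefit-unit-level results onto households before weighting.

```python
def weighted_caseload(amounts, weights):
    """Sum the weights of units that receive a positive amount."""
    if len(amounts) != len(weights):
        raise ValueError("amounts and weights must have the same length")
    return sum(weight for amount, weight in zip(amounts, weights) if amount > 0)


# Hypothetical toy data: four sampled units, two of which receive the benefit.
housing_benefit = [0.0, 500.0, 1_200.0, 0.0]
unit_weight = [1_000.0, 1_500.0, 800.0, 2_000.0]

recipients = weighted_caseload(housing_benefit, unit_weight)
print(f"Caseload: {recipients / 1e6:.4f}m")
```

Dividing by `1e6` corresponds to the `divisor=1e6` used for the caseload tables, so those figures read as millions of units.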

### Caseloads in full

In [12]:
from microdf import MicroSeries
from policyengine_core.parameters import Parameter


def get_nonzero(variable, year):
    entity = default_variables[variable].entity.key
    values = sim.calc(variable, period=year) > 0
    return MicroSeries(
        sim.map_result(values, entity, "household"),
        weights=sim.calc("household_weight", year),
    )


def caseload(model, year, variable):
    try:
        if model == policyengine_uk:
            return get_nonzero(variable, year).sum()
        elif model == UKMOD:
            return getattr(ukmod_statistics.ukmod.nonzero_units, variable)(
                f"{year}-01-01"
            )
        elif model == EXTERNAL:
            if variable == "income_tax":
                # Sum taxpayer counts across every country and band.
                band_node = (
                    parameters.calibration.programs.income_tax.participants.by_country_and_band
                )
                return sum(
                    subparam(f"{year}-01-01")
                    for subparam in band_node.get_descendants()
                    if isinstance(subparam, Parameter)
                )
            try:
                return parameters.calibration.programs.children[
                    variable
                ].participants.UNITED_KINGDOM(f"{year}-01-01")
            except Exception:
                # No UK-wide figure: fall back to the Great Britain total.
                return parameters.calibration.programs.children[
                    variable
                ].participants.GREAT_BRITAIN(f"{year}-01-01")
    except Exception:
        return np.nan


df = model_validation_table(
    caseload,
    models=[EXTERNAL, policyengine_uk, UKMOD],
    divisor=1e6,
    decimals=2,
)
df[df.columns[:7]]

| Model | Year | Child Benefit | Income Support | JSA (income-based) | Housing Benefit | Working Tax Credit | Child Tax Credit | Universal Credit |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| External | 2022 | 7.07 | 0.16 | 0.04 | 2.71 | 0.77 | 1.25 | 4.65 |
| External | 2023 | 7.0 | 0.12 | 0.01 | 2.46 | 0.58 | 0.94 | 5.05 |
| External | 2024 | 6.93 | 0.08 | 0.0 | 2.11 | 0.38 | 0.62 | 5.61 |
| External | 2025 | 6.87 | 0.03 | 0.0 | 1.73 | 0.17 | 0.28 | 6.29 |
| PolicyEngine-UK | 2022 | 7.07 | 0.17 | 0.05 | 2.74 | 0.79 | 1.25 | 4.33 |
| PolicyEngine-UK | 2023 | 7.07 | 0.17 | 0.05 | 2.58 | 0.79 | 1.24 | 4.52 |
| PolicyEngine-UK | 2024 | 7.06 | 0.17 | 0.05 | 2.57 | 0.76 | 1.19 | 4.56 |
| PolicyEngine-UK | 2025 | 7.06 | 0.17 | 0.05 | 2.56 | 0.71 | 1.15 | 4.57 |
| UKMOD | 2022 | 7.12 | | | 2.26 | 0.54 | 1.08 | 4.21 |
| UKMOD | 2023 | 7.07 | | | 2.05 | 0.44 | 0.89 | 4.66 |


In [13]:
df[df.columns[7:14]]

| Model | Year | Pension Credit | Income Tax | National Insurance (total) | employment income | self-employment income | pension income | rental income |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| External | 2022 | 1.41 | 31.88 | | | | | |
| External | 2023 | 1.34 | 31.88 | | | | | |
| External | 2024 | 1.28 | 31.88 | | | | | |
| External | 2025 | 1.23 | 31.88 | | | | | |
| PolicyEngine-UK | 2022 | 1.48 | 32.09 | 27.6 | 29.2 | 3.26 | 11.36 | 2.6 |
| PolicyEngine-UK | 2023 | 1.46 | 32.78 | 27.62 | 29.2 | 3.26 | 11.36 | 2.6 |
| PolicyEngine-UK | 2024 | 1.37 | 33.6 | 27.98 | 29.2 | 3.26 | 11.36 | 2.6 |
| PolicyEngine-UK | 2025 | 1.31 | 34.23 | 28.28 | 29.2 | 3.26 | 11.36 | 2.6 |
| UKMOD | 2022 | 1.41 | 30.01 | 26.08 | | | | |
| UKMOD | 2023 | 1.45 | 30.58 | 26.19 | | | | |


In [14]:
df[df.columns[14:]]

| Model | Year | savings interest income | dividend income |
| --- | --- | --- | --- |
| External | 2022 | | |
| External | 2023 | | |
| External | 2024 | | |
| External | 2025 | | |
| PolicyEngine-UK | 2022 | 15.49 | 5.64 |
| PolicyEngine-UK | 2023 | 15.49 | 5.64 |
| PolicyEngine-UK | 2024 | 15.49 | 5.64 |
| PolicyEngine-UK | 2025 | 15.49 | 5.64 |
| UKMOD | 2022 | | |
| UKMOD | 2023 | | |


### Caseload forecasts

In [15]:
model_forecast_chart(
    model_validation_table(
        caseload,
        models=[EXTERNAL, policyengine_uk, UKMOD],
        divisor=1,
        decimals=1,
    ),
    title="Caseload forecasts",
    currency=False,
)

### Differences

#### Absolute

In [16]:
def caseload_error(model, year, variable):
    try:
        if model == policyengine_uk:
            return get_nonzero(variable, year).sum() - caseload(
                EXTERNAL, year, variable
            )
        elif model == UKMOD:
            return getattr(ukmod_statistics.ukmod.nonzero_units, variable)(
                f"{year}-01-01"
            ) - caseload(EXTERNAL, year, variable)
    except Exception:
        return np.nan


df = model_validation_table(
    caseload_error,
    variables=VARIABLES[:-1],
    models=[policyengine_uk, UKMOD],
    divisor=1e6,
    decimals=1,
)
df[df.columns[:7]]

| Model | Year | Child Benefit | Income Support | JSA (income-based) | Housing Benefit | Working Tax Credit | Child Tax Credit | Universal Credit |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| PolicyEngine-UK | 2022 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | -0.0 | -0.3 |
| PolicyEngine-UK | 2023 | 0.1 | 0.1 | 0.0 | 0.1 | 0.2 | 0.3 | -0.5 |
| PolicyEngine-UK | 2024 | 0.1 | 0.1 | 0.0 | 0.5 | 0.4 | 0.6 | -1.0 |
| PolicyEngine-UK | 2025 | 0.2 | 0.1 | 0.0 | 0.8 | 0.5 | 0.9 | -1.7 |
| UKMOD | 2022 | 0.0 | | | -0.4 | -0.2 | -0.2 | -0.4 |
| UKMOD | 2023 | 0.1 | | | -0.4 | -0.1 | -0.0 | -0.4 |
| UKMOD | 2024 | 0.1 | | | -0.3 | -0.1 | 0.0 | -0.3 |
| UKMOD | 2025 | 0.1 | | | -0.3 | -0.0 | 0.0 | -0.1 |


In [17]:
df[df.columns[7:14]]

| Model | Year | Pension Credit | Income Tax | National Insurance (total) | employment income | self-employment income | pension income | rental income |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| PolicyEngine-UK | 2022 | 0.1 | 0.2 | | | | | |
| PolicyEngine-UK | 2023 | 0.1 | 0.9 | | | | | |
| PolicyEngine-UK | 2024 | 0.1 | 1.7 | | | | | |
| PolicyEngine-UK | 2025 | 0.1 | 2.4 | | | | | |
| UKMOD | 2022 | 0.0 | -1.9 | | | | | |
| UKMOD | 2023 | 0.1 | -1.3 | | | | | |
| UKMOD | 2024 | 0.2 | -0.9 | | | | | |
| UKMOD | 2025 | 0.2 | -0.4 | | | | | |


In [18]:
df[df.columns[14:]]

| Model | Year | savings interest income |
| --- | --- | --- |
| PolicyEngine-UK | 2022 | |
| PolicyEngine-UK | 2023 | |
| PolicyEngine-UK | 2024 | |
| PolicyEngine-UK | 2025 | |
| UKMOD | 2022 | |
| UKMOD | 2023 | |
| UKMOD | 2024 | |
| UKMOD | 2025 | |


#### Relative

In [19]:
def relative_caseload_error(model, year, variable):
    try:
        if model == policyengine_uk:
            return (
                get_nonzero(variable, year).sum()
                / caseload(EXTERNAL, year, variable)
                - 1
            )
        elif model == UKMOD:
            return (
                getattr(ukmod_statistics.ukmod.nonzero_units, variable)(
                    f"{year}-01-01"
                )
                / caseload(EXTERNAL, year, variable)
                - 1
            )
    except Exception:
        return np.nan


df = model_validation_table(
    relative_caseload_error,
    variables=VARIABLES[:-1],
    models=[policyengine_uk, UKMOD],
    divisor=1e-2,
    decimals=1,
)
df[df.columns[:7]]

| Model | Year | Child Benefit | Income Support | JSA (income-based) | Housing Benefit | Working Tax Credit | Child Tax Credit | Universal Credit |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| PolicyEngine-UK | 2022 | 0.0 | 10.4 | 11.9 | 1.3 | 2.4 | -0.1 | -6.8 |
| PolicyEngine-UK | 2023 | 0.9 | 49.5 | 884.4 | 4.8 | 36.5 | 32.7 | -10.5 |
| PolicyEngine-UK | 2024 | 1.9 | 128.2 | inf | 21.4 | 98.1 | 93.6 | -18.6 |
| PolicyEngine-UK | 2025 | 2.8 | 410.0 | inf | 48.0 | 309.5 | 311.3 | -27.3 |
| UKMOD | 2022 | 0.6 | | | -16.5 | -30.2 | -13.6 | -9.4 |
| UKMOD | 2023 | 1.0 | | | -16.6 | -23.7 | -5.1 | -7.8 |
| UKMOD | 2024 | 1.3 | | | -15.7 | -22.1 | 4.4 | -5.2 |
| UKMOD | 2025 | 1.8 | | | -17.0 | -25.4 | 4.5 | -2.2 |


In [20]:
df[df.columns[7:14]]

| Model | Year | Pension Credit | Income Tax | National Insurance (total) | employment income | self-employment income | pension income | rental income |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| PolicyEngine-UK | 2022 | 5.5 | 0.7 | | | | | |
| PolicyEngine-UK | 2023 | 8.7 | 2.8 | | | | | |
| PolicyEngine-UK | 2024 | 6.3 | 5.4 | | | | | |
| PolicyEngine-UK | 2025 | 6.4 | 7.4 | | | | | |
| UKMOD | 2022 | 0.5 | -5.9 | | | | | |
| UKMOD | 2023 | 7.8 | -4.1 | | | | | |
| UKMOD | 2024 | 12.3 | -2.9 | | | | | |
| UKMOD | 2025 | 14.9 | -1.3 | | | | | |


In [21]:
df[df.columns[14:]]

| Model | Year | savings interest income |
| --- | --- | --- |
| PolicyEngine-UK | 2022 | |
| PolicyEngine-UK | 2023 | |
| PolicyEngine-UK | 2024 | |
| PolicyEngine-UK | 2025 | |
| UKMOD | 2022 | |
| UKMOD | 2023 | |
| UKMOD | 2024 | |
| UKMOD | 2025 | |


In [22]:
def error_chart(table, title=None):
    hovertemplate = "<b>%{customdata[4]} in %{customdata[3]}</b><br>Error: %{x}<br>Official: %{customdata[2]}m<br>PolicyEngine-UK: %{customdata[0]}m<br>UKMOD: %{customdata[1]}m"
    table = table.replace("", np.nan).dropna(axis=0)
    table[[policyengine_uk, UKMOD]] = (
        table[[policyengine_uk, UKMOD]].abs() / 1e2
    )
    fig = (
        px.bar(
            table.sort_values(["Year", policyengine_uk]),
            x=[policyengine_uk, UKMOD],
            y="Program",
            orientation="h",
            animation_frame="Year",
            barmode="group",
            color_discrete_map={
                policyengine_uk: "blue",
                UKMOD: "lightgrey",
            },
            custom_data=[
                "PolicyEngine-UK caseload",
                "UKMOD caseload",
                "External caseload",
                "Year",
                "Program",
            ],
        )
        .update_layout(
            width=800,
            height=600,
            xaxis_title="Caseload error",
            title=title,
            template="plotly_white",
            legend_title="Model",
            yaxis_title="",
            legend_traceorder="reversed",
        )
        .update_traces(hovertemplate=hovertemplate)
    )
    for frame in fig.frames:
        for data in frame.data:
            data.hovertemplate = hovertemplate
    return fig


error_chart(
    tables_to_model_comparisons(
        {
            "": model_validation_table(
                caseload_error,
                models=[policyengine_uk, UKMOD],
                decimals=1,
                divisor=1e-2,
            ),
            "caseload": model_validation_table(
                caseload, divisor=1e6, decimals=1
            ),
        }
    ),
    title="Caseload errors",
)

## Automated tests

Below are test results from the most recent version.

In [23]:
from policyengine_uk.tests.microsimulation.test_statistics import tests

pd.set_option("display.max_colwidth", 0)
pd.set_option("display.max_rows", 500)
pd.DataFrame({"Name": tests, "Passed": [test.test()[0] for test in tests]})

| | Name | Passed |
| --- | --- | --- |
| 0 | PolicyEngine-UK Child Benefit caseload error is less than 10.0% in 2022 | True |
| 1 | PolicyEngine-UK Council Tax (less CTB) aggregate error is less than 11.0% in 2022 | True |
| 2 | PolicyEngine-UK Child Tax Credit aggregate error is less than 55.0% in 2022 | True |
| 3 | PolicyEngine-UK Child Tax Credit caseload error is less than 25.0% in 2022 | True |
| 4 | PolicyEngine-UK Working Tax Credit caseload error is less than 45.0% in 2022 | True |
| 5 | PolicyEngine-UK Housing Benefit caseload error is less than 15.0% in 2022 | True |
| 6 | PolicyEngine-UK JSA (income-based) aggregate error is less than 110.0% in 2022 | True |
| 7 | PolicyEngine-UK Income Support caseload error is less than 50.0% in 2022 | True |
| 8 | PolicyEngine-UK Universal Credit caseload error is less than 20.0% in 2022 | True |
| 9 | PolicyEngine-UK Income Tax caseload error is less than 17.0% in 2022 | True |
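
Each test name states a tolerance on the relative error for one program and year. The code in `test_statistics` is not shown on this page, so the following is only a sketch of what such a check could look like in plain Python. The function name `within_tolerance` is hypothetical. The first example uses the 2022 Universal Credit caseloads from the tables above (4.33m modelled, 4.65m external) with the 20% tolerance from the test name; the second uses the 2023 Universal Credit aggregates with an illustrative 10% tolerance that does not come from the test suite.

```python
def within_tolerance(model_value, official_value, tolerance):
    """True when the absolute relative error is strictly below the tolerance."""
    return abs(model_value / official_value - 1) < tolerance


# 2022 Universal Credit caseload, in millions, against a 20% tolerance.
uc_caseload_ok = within_tolerance(4.33, 4.65, 0.20)

# 2023 Universal Credit aggregate, in £bn, against an illustrative 10% tolerance.
uc_aggregate_ok = within_tolerance(41.5, 49.8, 0.10)

print(uc_caseload_ok, uc_aggregate_ok)
```

The first check passes because the modelled caseload is within about 7% of the external figure. The second fails because the modelled aggregate is roughly 17% below it.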
