# Loss

At the core of Survey-Enhance is the idea of measuring a survey's accuracy and usefulness with a single value that combines the many individual targets we care about: everything from tax to benefits to demographics. We call this value the loss. It measures how far the survey is from the truth, and because it is a single number, we can use it to compare different survey designs and to measure the impact of different survey enhancements.

We need to build this ourselves for any given survey, trying to be as neutral as possible. There's strength in numbers: I've incorporated as many different statistics as are readily available for the UK. Even so, the construction of the loss function is the part of the pipeline most vulnerable to arbitrary assumptions. I followed these principles:
* Put demographics into one bin, and financial statistics into another, then normalise them and weight them equally.
* Within those bins, weight by size (e.g. "Income Tax statistics" should be weighted 200:40 to "Universal Credit statistics", because Income Tax revenue is £200bn and Universal Credit spending is £40bn).
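
As a rough illustration of these two principles, the sketch below combines per-category losses into one number. It is not the Survey-Enhance implementation: the helper `weighted_mean`, the loss values, and the demographic entries are all made up for illustration, and reading "normalise" as "divide by the total weight in the bin" is an assumption. Only the 200:40 weighting comes from the text above.

```python
def weighted_mean(items):
    """Average (loss, weight) pairs, with weights normalised to sum to 1."""
    total_weight = sum(weight for _, weight in items)
    return sum(loss * weight for loss, weight in items) / total_weight


# (loss, weight) pairs. The financial weights use the £bn sizes quoted
# above; every loss value here is hypothetical.
financial = [
    (0.10, 200.0),  # Income Tax statistics, £200bn
    (0.30, 40.0),  # Universal Credit statistics, £40bn
]
demographic = [
    (0.05, 1.0),  # hypothetical demographic target
    (0.15, 1.0),  # hypothetical demographic target
]

# Weight by size within each bin...
financial_loss = weighted_mean(financial)
demographic_loss = weighted_mean(demographic)

# ...then weight the two bins equally.
combined_loss = 0.5 * financial_loss + 0.5 * demographic_loss

print(f"Financial bin: {financial_loss:.4f}")
print(f"Demographic bin: {demographic_loss:.4f}")
print(f"Combined: {combined_loss:.4f}")
```

With these made-up numbers, the Income Tax error dominates the financial bin because it carries five times the weight of Universal Credit, while neither bin can dominate the combined figure.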

So what does this look like? The following code runs the loss function on the 2019-20 FRS.

In [1]:
from loss.loss import Loss, calibration_parameters
from datasets.frs import FRS_2019_20
from datasets.output_dataset import OutputDataset
import torch

original_frs = OutputDataset.from_dataset(FRS_2019_20, 2019, 2022)()

loss = Loss(
    original_frs,
    calibration_parameters("2022-01-01"),
    static_dataset=False,
)

frs_loss = loss(
    torch.tensor(original_frs.household.household_weight.values), original_frs
)

print(f"Original FRS: {frs_loss}")

Original FRS: 1.0


... which isn't too exciting, because the loss is deliberately normalised to 1.0 for the original survey. The value of the loss is hard to interpret on its own, because we don't usually think of accuracy as a single number. But we can build some intuition by changing the survey and seeing how the loss responds. For example: what if everyone in the FRS lied and said they had no pension income? How much less accurate would the survey be? We have a rough subjective feeling for this, so we can see what the loss function says and calibrate our mental model of accuracy against it.

In [2]:
class FRS_2019_20_with_no_pension_income(FRS_2019_20):
    name = "FRS_2019_20_with_no_pension_income"
    file_path = (
        FRS_2019_20.file_path.parent / "frs_2019_20_with_no_pension_income.h5"
    )

    def generate(self):
        super().generate()
        pension_income = self.load("pension_income")
        self.save("pension_income", pension_income * 0)


frs_with_no_pension_income = OutputDataset.from_dataset(
    FRS_2019_20_with_no_pension_income, 2019, 2022
)()

frs_with_no_pension_income_loss = loss(
    torch.tensor(frs_with_no_pension_income.household.household_weight.values),
    frs_with_no_pension_income,
)

print(f"FRS with no pension income: {frs_with_no_pension_income_loss}")

FRS with no pension income: 6.552893370459233


So the loss jumped from 1.0 to about 6.55. Why is that? We can inspect the loss function's computation tree to see what's going on:

In [3]:
import yaml

weights = torch.tensor(
    frs_with_no_pension_income.household.household_weight.values
)

# Print the nested breakdown of the loss by category
print(yaml.dump(loss.computation_tree(weights, frs_with_no_pension_income)))

Loss.Programs:
  1_loss: 12.105786740918466
  2_weight: 1
  3_children:
    Loss.Programs.ChildTaxCredit:
      1_loss: 0.996940178801625
      2_weight: 13.875
      3_children:
        Loss.Programs.ChildTaxCredit.child_tax_credit_budgetary_impact:
          1_loss: 0.9938803576032499
          2_weight: 1
          3_children:
            child_tax_credit_budgetary_impact_UNITED_KINGDOM:
              1_loss: 0.03474512870643906
              2_loss_0: 0.03496522301027196
              3_y_pred: 11,288,691,857.03
              4_y_0_pred: 11,280,513,256.16
              5_y_true: 13,875,000,000.00
    Loss.Programs.HousingBenefit:
      1_loss: 0.9403234612227025
      2_weight: 15.894
      3_children:
        Loss.Programs.HousingBenefit.housing_benefit_budgetary_impact:
          1_loss: 0.9288759046814592
          2_weight: 1
          3_children:
            housing_benefit_budgetary_impact_ENGLAND:
              1_loss: 0.3096141427762169
              2_loss_0: 0.33078786550

This shows which parts of the loss function are most sensitive to the change we made. The biggest single change in loss came from the pension income category, which makes sense: zeroing out pension incomes makes it very difficult to hit total pension income statistics. But there were knock-on effects on other categories too. Total taxpayer count statistics were significantly off after the change, likely because many people pay tax solely because of their pension income, since the State Pension on its own is not enough to push someone into the tax system.
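
To read the leaf entries of the tree, it helps to recompute one by hand. The sketch below walks a nested dictionary shaped like the printed output, finds the leaves, and compares each reported `1_loss` with the squared relative error of `3_y_pred` against `5_y_true`. Two things here are assumptions rather than documented behaviour: that the tree is a nested dict whose `y` fields are comma-formatted strings, and that the leaf loss is a squared relative error. The second is inferred only from the numbers shown above. The helpers `iter_leaves`, `to_number` and `squared_relative_error` are hypothetical names.

```python
import math

# Subtree copied from the printed output above. Representing it as a nested
# dict with comma-formatted strings for the y_* fields is an assumption.
tree = {
    "Loss.Programs.ChildTaxCredit": {
        "1_loss": 0.996940178801625,
        "2_weight": 13.875,
        "3_children": {
            "Loss.Programs.ChildTaxCredit.child_tax_credit_budgetary_impact": {
                "1_loss": 0.9938803576032499,
                "2_weight": 1,
                "3_children": {
                    "child_tax_credit_budgetary_impact_UNITED_KINGDOM": {
                        "1_loss": 0.03474512870643906,
                        "2_loss_0": 0.03496522301027196,
                        "3_y_pred": "11,288,691,857.03",
                        "4_y_0_pred": "11,280,513,256.16",
                        "5_y_true": "13,875,000,000.00",
                    }
                },
            }
        },
    }
}


def iter_leaves(nodes):
    """Yield (name, node) for every node that has no children."""
    for name, node in nodes.items():
        children = node.get("3_children")
        if children:
            yield from iter_leaves(children)
        else:
            yield name, node


def to_number(value):
    """Parse a value such as '13,875,000,000.00' into a float."""
    return float(str(value).replace(",", ""))


def squared_relative_error(y_pred, y_true):
    """Assumed leaf metric: (y_pred / y_true - 1) squared."""
    return (y_pred / y_true - 1) ** 2


results = []
for name, leaf in iter_leaves(tree):
    recomputed = squared_relative_error(
        to_number(leaf["3_y_pred"]), to_number(leaf["5_y_true"])
    )
    results.append((name, leaf["1_loss"], recomputed))
    print(f"{name}: reported {leaf['1_loss']:.6f}, recomputed {recomputed:.6f}")
```

For the Child Tax Credit leaf, the recomputed value agrees with the reported one to roughly six decimal places. That is consistent with the squared-relative-error reading, though the small remaining gap means it should be treated as an approximation of whatever `Loss` does internally.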

As another sanity check, let's see what the loss is if we just reduce pension income by 10%:

In [4]:
class FRS_2019_20_with_too_little_pension_income(FRS_2019_20):
    name = "FRS_2019_20_with_too_little_pension_income"
    file_path = (
        FRS_2019_20.file_path.parent
        / "frs_2019_20_with_too_little_pension_income.h5"
    )

    def generate(self):
        super().generate()
        pension_income = self.load("pension_income")
        self.save("pension_income", pension_income * 0.9)


frs_with_too_little_pension_income = OutputDataset.from_dataset(
    FRS_2019_20_with_too_little_pension_income, 2019, 2022
)()

frs_with_too_little_pension_income_loss = loss(
    torch.tensor(
        frs_with_too_little_pension_income.household.household_weight.values
    ),
    frs_with_too_little_pension_income,
)

print(
    f"FRS with 10% less pension income: {frs_with_too_little_pension_income_loss}"
)

FRS with 10% less pension income: 1.0334495620378865


... which is a much smaller change. The loss function responds much more gently here, for lots of small reasons: for example, a 10% cut doesn't push many people across the boundaries between tax bands.
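
To put the two experiments side by side, the short calculation below compares how much each one raised the loss above the baseline of 1.0. It uses only the three loss values printed above. The variable names are illustrative.

```python
baseline = 1.0
no_pension_income = 6.552893370459233
ten_percent_less = 1.0334495620378865

# Increase in loss relative to the original FRS
increase_full_cut = no_pension_income - baseline
increase_partial_cut = ten_percent_less - baseline

# How many times larger the loss increase is for the 100% cut
ratio = increase_full_cut / increase_partial_cut

print(f"100% cut raises the loss by {increase_full_cut:.3f}")
print(f"10% cut raises the loss by {increase_partial_cut:.3f}")
print(f"Ratio of increases: {ratio:.0f}")
```

The cut is ten times larger in the first experiment, but the loss increase is far more than ten times larger, so the loss clearly does not scale linearly with the size of the error.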