In [1]:
import numpy as np
from gettsim import main, InputData, MainTarget, copy_environment

# Modifying Taxes and Transfers

GETTSIM's design allows you to go beyond the depiction of the current (or historical)
tax and transfer system. Analyzing counterfactual reform scenarios, ranging from small
changes of certain parameters of the tax and transfer system, to the introduction of
large-scale reforms, is a common use case.

This tutorial showcases how to modify the calculation of taxes and transfers when using
GETTSIM.

Here, we focus mainly on small reforms to the means-tested social welfare benefits for
the unemployed (German: Bürgergeld; until 2022: Arbeitslosengeld II). We pick this
example because Bürgergeld is a fairly complex system that uses (almost) the entire
range of objects TTSIM offers.

## Status Quo

Before modifying taxes and transfers, it's important to understand how GETTSIM
represents the current tax and transfer system. The core of GETTSIM's implementation is
the **policy environment** - a comprehensive data structure that contains everything
needed to compute taxes and transfers for a specific date.

### What is a Policy Environment?

A policy environment is a nested dictionary that holds all the parameters and functions
needed to calculate taxes and transfers for a given policy date. Think of it as a
complete snapshot of the tax and transfer system at a particular point in time.
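
Because the environment is nested, it helps to see which tree paths exist before changing anything. The following is a minimal sketch on a toy dictionary; `iter_tree_paths` and `toy_environment` are hypothetical names introduced here for illustration and are not part of GETTSIM, and the string leaves merely stand in for real objects.

```python
def iter_tree_paths(tree, prefix=()):
    """Yield (path, leaf) pairs for every leaf of a nested dictionary."""
    for key, value in tree.items():
        path = (*prefix, key)
        if isinstance(value, dict):
            # Descend into sub-dictionaries and extend the path.
            yield from iter_tree_paths(value, path)
        else:
            yield path, value


# Toy stand-in for a policy environment (assumption: leaves are not dicts).
toy_environment = {
    "arbeitslosengeld_2": {
        "kindersofortzuschlag": "a ScalarParam",
        "berechtigte_wohnfläche_miete": "a DictParam",
    },
    "sozialversicherung": {
        "rente": {"altersrente": {"betrag_m": "a column function"}},
    },
}

paths = [path for path, _ in iter_tree_paths(toy_environment)]
for path in paths:
    print(path)
```

The tuples printed here follow the same path notation that this tutorial uses for targets such as `('arbeitslosengeld_2', 'betrag_m_bg')`.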

### The Three Types of Objects

The policy environment contains three main categories of objects:

1. **Column Objects** (`ColumnObjects`): These work with data columns - either input
   data you provide or results computed by previous functions. They handle the actual
   calculations and data processing.

2. **Parameter Objects** (`ParamObjects`): These store the parameters and constants
   used in calculations, such as tax rates, benefit amounts, or thresholds.

3. **Parameter Functions** (`ParamFunctions`): These process and prepare parameters
   so they can be used by the column objects. They handle parameter transformations
   and validations.
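
To make the division of labour concrete, here is a toy sketch in plain Python. It does not use GETTSIM's classes; the names `raw_altersgrenze`, `build_lookup`, and `altersgrenze` as well as the numbers are made up for illustration. A raw parameter is prepared by a "parameter function", and a "column function" then applies the prepared parameter to a data column.

```python
# Toy "parameter object": raw values keyed by birth year (made-up numbers).
raw_altersgrenze = {1946: 65.0, 1947: 65.5, 1948: 66.0}


def build_lookup(raw):
    """Toy 'parameter function': turn the raw mapping into (base, values)."""
    base = min(raw)
    values = [raw[year] for year in range(base, max(raw) + 1)]
    return base, values


def altersgrenze(geburtsjahr, lookup):
    """Toy 'column function': apply the prepared parameter to a column."""
    base, values = lookup
    return [values[year - base] for year in geburtsjahr]


lookup = build_lookup(raw_altersgrenze)
result = altersgrenze([1946, 1948, 1947], lookup)
print(result)
```

In this picture, a reform can change the raw numbers, the way they are prepared, or the calculation applied to the data.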

### Getting Started

The first step in modifying taxes and transfers is to create the base policy
environment for the date you want to work with.

In [2]:
status_quo_environment = main(
    main_target=MainTarget.policy_environment,
    policy_date_str="2025-01-01",
    backend="numpy",
)

We also create some input data to verify how our modifications to the policy
environment affect the output. The following input data is required to compute the
amount of social welfare benefits (i.e. `('arbeitslosengeld_2', 'betrag_m_bg')`) when
parental leave benefits (i.e. `('elterngeld', 'betrag_m')`), pensions
(i.e. `('sozialversicherung', 'rente', 'altersrente', 'betrag_m')`,
`('sozialversicherung', 'rente', 'erwerbsminderung', 'betrag_m')`), and unemployment
benefits (i.e. `('sozialversicherung', 'arbeitslosen', 'betrag_m')`) are held fixed at
given values.

In [None]:
inp = {
    "p_id": np.array([0]),
    "wohngeld": {"betrag_m_wthh": np.array([0.0])},
    #"arbeitslosengeld_2": {"betrag_m_bg": np.array([0.0])},
    #"elterngeld": {"betrag_m": np.array([0.0])},
    #"sozialversicherung": {
    #    "arbeitslosen": {
    #        "betrag_m": np.array([0.0])
    #    },
    #    "rente": {
    #        "altersrente": {
    #            "betrag_m": np.array([0.0])
    #        },
    #        "erwerbsminderung": {
    #            "betrag_m": np.array([0.0])
    #        },
    #    },
    #},
}

main(
    main_target=MainTarget.templates.input_data_dtypes,
    policy_date_str="2025-01-01",
    input_data=InputData.tree(inp),
    include_warn_nodes=False,
)

MissingFunctionsError: The following targets have no corresponding function:

[
    "wohngeld__betrag_m_wthh",
]


In [18]:
INPUT_DATA_TREE = {
    "alter": np.array([40, 40, 5]),
    "alter_monate": np.array([480, 480, 60]),
    "arbeitslosengeld_2": {
        "bezug_im_vorjahr": np.array([True, True, True]),
        "eigenbedarf_gedeckt": np.array([False, False, False]),
        "p_id_einstandspartner": np.array([1, 0, -1]),
    },
    "arbeitsstunden_w": np.array([20, 0, 0]),
    "behinderungsgrad": np.array([0, 0, 0]),
    "einkommensteuer": {
        "abzüge": {
            "beitrag_private_rentenversicherung_m": np.array([0.0, 0.0, 0.0]),
            "kinderbetreuungskosten_m": np.array([0.0, 0.0, 0.0]),
            "p_id_kinderbetreuungskostenträger": np.array([-1, -1, 0]),
        },
        "einkünfte": {
            "aus_forst_und_landwirtschaft": {"betrag_m": np.array([0.0, 0.0, 0.0])},
            "aus_gewerbebetrieb": {"betrag_m": np.array([0.0, 0.0, 0.0])},
            "aus_kapitalvermögen": {"kapitalerträge_m": np.array([0.0, 0.0, 0.0])},
            "aus_nichtselbstständiger_arbeit": {
                "bruttolohn_m": np.array([1500.0, 0.0, 0.0])
            },
            "aus_selbstständiger_arbeit": {"betrag_m": np.array([0.0, 0.0, 0.0])},
            "aus_vermietung_und_verpachtung": {"betrag_m": np.array([0.0, 0.0, 0.0])},
            "ist_hauptberuflich_selbstständig": np.array([False, False, False]),
            "sonstige": {
                "alle_weiteren_m": np.array([0.0, 0.0, 0.0]),
                "rente": {
                    "betriebliche_altersvorsorge_m": np.array([0.0, 0.0, 0.0]),
                    "geförderte_private_vorsorge_m": np.array([0.0, 0.0, 0.0]),
                    "sonstige_private_vorsorge_m": np.array([0.0, 0.0, 0.0]),
                },
            },
        },
        "gemeinsam_veranlagt": np.array([True, True, False]),
    },
    "elterngeld": {"betrag_m": np.array([0.0, 0.0, 0.0])},
    "familie": {
        "alleinerziehend": np.array([False, False, False]),
        "p_id_ehepartner": np.array([1, 0, -1]),
        "p_id_elternteil_1": np.array([-1, -1, 0]),
        "p_id_elternteil_2": np.array([-1, -1, 1]),
    },
    "geburtsjahr": np.array([1985, 1985, 2020]),
    "hh_id": np.array([0, 0, 0]),
    "kindergeld": {
        "in_ausbildung": np.array([False, False, False]),
        "p_id_empfänger": np.array([-1, -1, 0]),
    },
    "p_id": np.array([0, 1, 2]),
    "sozialversicherung": {
        "arbeitslosen": {"betrag_m": np.array([0.0, 0.0, 0.0])},
        "kranken": {"beitrag": {"privat_versichert": np.array([False, False, False])}},
        "pflege": {"beitrag": {"hat_kinder": np.array([True, True, False])}},
        "rente": {
            "altersrente": {"betrag_m": np.array([0.0, 0.0, 0.0])},
            "bezieht_rente": np.array([False, False, False]),
            "erwerbsminderung": {"betrag_m": np.array([0.0, 0.0, 0.0])},
            "jahr_renteneintritt": np.array([2060, 2060, 2090]),
        },
    },
    "unterhalt": {"tatsächlich_erhaltener_betrag_m": np.array([0.0, 0.0, 0.0])},
    "vermögen": np.array([0.0, 0.0, 0.0]),
    "wohnen": {
        "bewohnt_eigentum_hh": np.array([False, False, False]),
        "bruttokaltmiete_m_hh": np.array([600.0, 600.0, 600.0]),
        "heizkosten_m_hh": np.array([60.0, 60.0, 60.0]),
        "wohnfläche_hh": np.array([50.0, 50.0, 50.0]),
    },
    "wohngeld": {"mietstufe_hh": np.array([4, 4, 4])},
}
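
Every leaf of such a tree is a column with one entry per person, so a quick consistency check can catch typos before calling `main`. The sketch below assumes only that each leaf supports `len()`; `leaf_lengths` and `toy_tree` are hypothetical names, and the toy tree uses plain lists with one deliberately short column.

```python
def leaf_lengths(tree, prefix=()):
    """Map every leaf path of a nested dict to the length of its column."""
    lengths = {}
    for key, value in tree.items():
        path = (*prefix, key)
        if isinstance(value, dict):
            lengths.update(leaf_lengths(value, path))
        else:
            lengths[path] = len(value)
    return lengths


toy_tree = {
    "p_id": [0, 1, 2],
    "hh_id": [0, 0, 0],
    "wohnen": {"bruttokaltmiete_m_hh": [600.0, 600.0, 600.0]},
    "familie": {"p_id_ehepartner": [1, 0]},  # deliberately one entry short
}

lengths = leaf_lengths(toy_tree)
expected = lengths[("p_id",)]
# Collect every column whose length differs from the number of persons.
mismatched = {path: n for path, n in lengths.items() if n != expected}
print(mismatched)
```

Since NumPy arrays also support `len()`, the same function can be applied to `INPUT_DATA_TREE`.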

The status quo is the following:

In [19]:
main(
    main_target=MainTarget.results.df_with_nested_columns,
    policy_date_str="2025-01-01",
    input_data=InputData.tree(INPUT_DATA_TREE),
    tt_targets={"tree": {"arbeitslosengeld_2": {"betrag_m_bg": None}}},
    include_warn_nodes=False,
)

| p_id | `('arbeitslosengeld_2', 'betrag_m_bg')` |
|------|------|
| 0 | 790.916898 |
| 1 | 790.916898 |
| 2 | 790.916898 |


## Modifying Parameters

GETTSIM's parameters are stored in different objects depending on their type. If you
modify the parameters in the `policy_environment`, you will encounter the following
objects:

1. **ScalarParam**: A scalar parameter, i.e. a parameter that is a single number.
2. **DictParam**: A parameter that is a flat dictionary with homogeneous keys and values
   (i.e. all keys and values are of the same type).
3. **ConsecutiveIntLookupTableParam**: A lookup table that stores values and assigns a
   consecutive integer index to each value.
4. **PiecewisePolynomialParam**: A piecewise polynomial parameter, i.e. a parameter that
   describes a piecewise polynomial function.
5. **RawParam**: A parameter that does not fit into the other categories. For these
   parameters, we need `ParamFunction`s to process them (see next section).

Each of these parameter classes has the following attributes:
- `leaf_name`: The leaf name of the parameter in GETTSIM's policy environment.
- `start_date`: The date from which the parameter is valid (if applicable).
- `end_date`: The date until which the parameter is valid (if applicable).
- `unit`: The unit of the parameter (if applicable).
- `reference_period`: The time period the value refers to, e.g. `'Month'` (if applicable).
- `name`: The name of the parameter.
- `description`: A more elaborate description of the parameter.
- `value`: The value of the parameter.
- `note`: Some notes (if applicable).
- `reference`: A legal reference.

When modifying parameters, you will mostly care about the `value` attribute.
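
As an illustration of "metadata plus a value", the sketch below models a parameter as a small frozen dataclass and derives a reform variant that differs only in `value`. `ToyParam` is a hypothetical class with a subset of the attributes listed above; whether GETTSIM's parameter classes can be copied this way is an assumption that this tutorial does not confirm.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class ToyParam:
    """Hypothetical stand-in for a parameter object (not GETTSIM's class)."""

    value: float
    leaf_name: Optional[str] = None
    unit: Optional[str] = None
    reference_period: Optional[str] = None


status_quo = ToyParam(
    value=25,
    leaf_name="kindersofortzuschlag",
    unit="Euros",
    reference_period="Month",
)

# Create a new object with a different value; all other attributes carry over.
reform = replace(status_quo, value=40)
print(reform)
```

The original object is left untouched, which mirrors the copy-then-replace workflow used in the rest of this tutorial.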

### Scalar Parameters

Scalar parameters are the simplest type of parameters. They are represented by the
`ScalarParam` class. They are stored as a single number in the `policy_environment`.

Let's take a look at the `kindersofortzuschlag` parameter. This parameter increases the
transfer to children by a fixed amount.

As the output below shows, the `kindersofortzuschlag` parameter is a `ScalarParam`
object, and its value is 25€ in the status quo.

In [20]:
status_quo_environment["arbeitslosengeld_2"]["kindersofortzuschlag"]

ScalarParam(leaf_name='kindersofortzuschlag', start_date=datetime.date(2025, 1, 1), end_date=datetime.date(2099, 12, 31), unit='Euros', reference_period='Month', name={'de': 'Kindersofortzuschlag für Arbeitslosengeld II', 'en': 'Instant surcharge for children for unemployment benefit'}, description={'de': '§ 72 SGB II Kinder, Jugendliche  und junge Erwachsene, die Anspruch auf Arbeitslosengeld II oder Sozialgeld haben (Regelbedarfsstufen 3, 4, 5, 6), erhalten einen Sofortzuschlag von 20€.', 'en': '§ 72 SGB II Children, adolescents and young adults who are entitled to unemployment benefits or social benefits (Regelbedarfsstufen 3, 4, 5, 6) receive an instant surcharge of 20 Euro.'}, value=25, note=None, reference=None)

In [21]:
status_quo_environment["arbeitslosengeld_2"]["kindersofortzuschlag"].value

25

Let's increase the parameter.

#### Step 1: Create a copy of the status quo policy environment

This is good practice to avoid in-place modifications of the original policy environment.

In [22]:
higher_kindersofortzuschlag_pe = copy_environment(
    status_quo_environment
)
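
Why a real copy matters can be seen on a plain nested dictionary. How `copy_environment` is implemented is not shown in this tutorial, so the sketch below uses the standard library's `copy.deepcopy` as a stand-in and a toy dictionary instead of a real environment.

```python
import copy

status_quo = {"arbeitslosengeld_2": {"kindersofortzuschlag": 25}}

# A deep copy is fully independent of the original.
deep = copy.deepcopy(status_quo)
deep["arbeitslosengeld_2"]["kindersofortzuschlag"] = 40
value_after_deep_edit = status_quo["arbeitslosengeld_2"]["kindersofortzuschlag"]

# A shallow copy only duplicates the outer dict; inner dicts are shared,
# so the edit leaks into the original.
shallow = dict(status_quo)
shallow["arbeitslosengeld_2"]["kindersofortzuschlag"] = 40
value_after_shallow_edit = status_quo["arbeitslosengeld_2"]["kindersofortzuschlag"]

print(value_after_deep_edit, value_after_shallow_edit)
```

With a shallow copy, the "status quo" would silently become the reform scenario, and later comparisons between the two would show no difference.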

#### Step 2: Create the new parameter

Create a new `ScalarParam` object. To do this, we first import the `ScalarParam` class
from GETTSIM and then instantiate it with the new value.

**Tip**: You don't have to specify all attributes of the `ScalarParam` class. Only the
value attribute is required.

In [23]:
from gettsim.tt import ScalarParam

new_kindersofortzuschlag = ScalarParam(value=40)

#### Step 3: Replace the old parameter with the new one in the new policy environment

In [24]:
higher_kindersofortzuschlag_pe["arbeitslosengeld_2"][
    "kindersofortzuschlag"
] = new_kindersofortzuschlag

Let's call GETTSIM with the modified policy environment.

In [25]:
main(
    main_target=MainTarget.results.df_with_nested_columns,
    policy_date_str="2025-01-01",
    input_data=InputData.tree(INPUT_DATA_TREE),
    tt_targets={"tree": {"arbeitslosengeld_2": {"betrag_m_bg": None}}},
    policy_environment=higher_kindersofortzuschlag_pe,
    include_warn_nodes=False,
)

| p_id | `('arbeitslosengeld_2', 'betrag_m_bg')` |
|------|------|
| 0 | 805.916898 |
| 1 | 805.916898 |
| 2 | 805.916898 |
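
A quick plausibility check on the printed numbers: the amount rises by exactly the change in the surcharge. This is consistent with the surcharge being paid once for the single child in the example household, although the tutorial does not spell out the underlying calculation. The figures below are copied from the outputs above.

```python
import math

status_quo_betrag = 790.916898  # output with a Kindersofortzuschlag of 25
reform_betrag = 805.916898      # output with a Kindersofortzuschlag of 40

difference = reform_betrag - status_quo_betrag
surcharge_change = 40 - 25
n_children = 1  # the input data contains one child (age 5)

# Compare with a tolerance because the amounts are floating-point numbers.
matches = math.isclose(difference, surcharge_change * n_children, abs_tol=1e-6)
print(round(difference, 6), matches)
```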


### Dict Parameters

Dict parameters are parameters that are a flat dictionary with homogeneous keys and
values. They are represented by the `DictParam` class. They are stored as a flat
dictionary in the `policy_environment`.

Let's take a look at the `berechtigte_wohnfläche_miete` parameter. This parameter
contains the admissible housing size in square meters for recipients of social welfare
benefits.

As the output below shows, the `berechtigte_wohnfläche_miete` parameter is a
`DictParam` object, and its value is a dictionary.

In [26]:
status_quo_environment["arbeitslosengeld_2"]["berechtigte_wohnfläche_miete"]

DictParam(leaf_name='berechtigte_wohnfläche_miete', start_date=datetime.date(2005, 1, 1), end_date=datetime.date(2099, 12, 31), unit='Square Meters', reference_period=None, name={'de': 'Berechtigte Mietwohnfläche für ALG2-Empfänger*innen', 'en': 'Living rental space eligible for ALG2-recipients'}, description={'de': 'Eine Mietwohnung darf für einen Single 45 Quadratmeter (+15 für jede weitere Person) groß sein. Dies ist nur eine Approximation. Die regionalen Parameter sind unbekannt, siehe Issue https://github.com/ttsim-dev/gettsim/issues/782.', 'en': 'A rental apartment may be 45 square meters for a single person (+15 for each additional person). This is only an approximation. The regional parameters are unknown, see Issue https://github.com/ttsim-dev/gettsim/issues/782.'}, value={'single': 45, 'je_weitere_person': 15}, note=None, reference=None)

In [27]:
status_quo_environment["arbeitslosengeld_2"]["berechtigte_wohnfläche_miete"].value

{'single': 45, 'je_weitere_person': 15}

Let's modify the parameter by decreasing the admissible housing size for a single
person. We follow the same steps as in the previous section.

In [30]:
from gettsim.tt import DictParam

new_berechtigte_wohnfläche_miete = DictParam(
    value={
        "single": 15,
        "je_weitere_person": 15,
    },
)

lower_berechtigte_wohnfläche_miete_pe = copy_environment(
    status_quo_environment
)

lower_berechtigte_wohnfläche_miete_pe["arbeitslosengeld_2"]["berechtigte_wohnfläche_miete"] = new_berechtigte_wohnfläche_miete

main(
    main_target=MainTarget.results.df_with_nested_columns,
    policy_date_str="2025-01-01",
    input_data=InputData.tree(INPUT_DATA_TREE),
    tt_targets={"tree": {"arbeitslosengeld_2": {"betrag_m_bg": None}}},
    policy_environment=lower_berechtigte_wohnfläche_miete_pe,
    include_warn_nodes=False,
)

| p_id | `('arbeitslosengeld_2', 'betrag_m_bg')` |
|------|------|
| 0 | 740.916898 |
| 1 | 740.916898 |
| 2 | 740.916898 |
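
One way to read this result is to compare the admissible housing size with the actual size of the flat (50 square meters in the input data). The sketch below follows the rule stated in the parameter description (a base size for a single person plus an increment for each additional person). `admissible_area` is a hypothetical helper, and the sketch does not attempt to reproduce how GETTSIM turns the limit into the change in benefits.

```python
def admissible_area(value, n_persons):
    """Base size for one person plus an increment per additional person."""
    return value["single"] + value["je_weitere_person"] * (n_persons - 1)


status_quo_value = {"single": 45, "je_weitere_person": 15}
reform_value = {"single": 15, "je_weitere_person": 15}

n_persons = 3       # two adults and one child in the example household
actual_area = 50.0  # `wohnfläche_hh` in the input data

status_quo_limit = admissible_area(status_quo_value, n_persons)
reform_limit = admissible_area(reform_value, n_persons)

# The limit only matters if the flat is larger than the admissible size.
status_quo_binds = actual_area > status_quo_limit
reform_binds = actual_area > reform_limit
print(status_quo_limit, reform_limit, status_quo_binds, reform_binds)
```

Under this reading, the limit is not binding in the status quo but becomes binding after the reform, which fits the lower benefit amount.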


### Consecutive Int Lookup Table Parameters

Consecutive Int Lookup Table Parameters are one-dimensional arrays. GETTSIM uses them
whenever a parameter is a function of a single integer variable (like age in months,
number of household members, etc.).

There are very few parameters of this type in GETTSIM's policy environment; most of them
are created via `param_function`s.

Let's step out of the social welfare benefits example for this parameter and look at
pension benefits. Here, usage of `ConsecutiveIntLookupTableParam` is more common. In
particular, we'll look at the normal retirement age parameter, which is a function of
the birth year.

In [77]:
status_quo_environment["sozialversicherung"]["rente"]["altersrente"]["regelaltersrente"]["altersgrenze_gestaffelt"]

ConsecutiveIntLookupTableParam(leaf_name='altersgrenze_gestaffelt', start_date=datetime.date(2007, 4, 20), end_date=datetime.date(2030, 12, 31), unit='Years', reference_period=None, name={'de': 'Gestaffeltes Eintrittsalter für Regelaltersrente nach Geburtsjahr', 'en': 'Staggered normal retirement age (NRA) for Regelaltersrente by birth year'}, description={'de': '§ 35 Satz 2 SGB VI Regelaltersgrenze ab der Renteneintritt möglich ist. Wenn früher oder später in Rente gegangen wird, wird der Zugangsfaktor und damit der Rentenanspruch höher oder niedriger, sofern keine Sonderregelungen gelten.', 'en': '§ 35 Satz 2 SGB VI Normal retirement age from which pension can be received. If retirement benefits are claimed earlier or later, the Zugangsfaktor and thus the pension entitlement is higher or lower unless special regulations apply.'}, value=<ttsim.tt.param_objects.ConsecutiveIntLookupTableParamValue object at 0x171a1ea80>, note=None, reference=None)

In [78]:
status_quo_environment["sozialversicherung"]["rente"]["altersrente"]["regelaltersrente"]["altersgrenze_gestaffelt"].value

<ttsim.tt.param_objects.ConsecutiveIntLookupTableParamValue at 0x171a1ea80>

The `ConsecutiveIntLookupTableParamValue` has the following attributes:
- `values_to_look_up`: an array of values
- `bases_to_subtract`: the base value to subtract when indexing into
  `values_to_look_up`. For example, when setting this to `10`, indexing the
  `ConsecutiveIntLookupTableParamValue` at `12` returns the value at index `12 - 10 =
  2`.
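
The indexing rule can be sketched in a few lines of plain Python. `ToyLookupTable` is a hypothetical class written for this illustration with made-up values; it mimics the two attributes described above but is not GETTSIM's implementation.

```python
class ToyLookupTable:
    """Hypothetical stand-in for a consecutive-integer lookup table."""

    def __init__(self, values_to_look_up, base_to_subtract):
        self.values_to_look_up = list(values_to_look_up)
        self.base_to_subtract = base_to_subtract

    def look_up(self, index):
        position = index - self.base_to_subtract
        # Guard against Python's negative indexing wrapping around silently.
        if not 0 <= position < len(self.values_to_look_up):
            raise IndexError(f"{index} is outside the table")
        return self.values_to_look_up[position]


table = ToyLookupTable([65.0, 65.5, 66.0, 66.5], base_to_subtract=10)
print(table.look_up(12))  # position 12 - 10 = 2
```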

In this example, the base is `1900`, so looking up the parameter value at `1900` via
the `look_up` method returns the first value of the array:

In [None]:
status_quo_environment["sozialversicherung"]["rente"]["altersrente"]["regelaltersrente"]["altersgrenze_gestaffelt"].value.bases_to_subtract

array([[1900]])

In [80]:
status_quo_environment["sozialversicherung"]["rente"]["altersrente"]["regelaltersrente"]["altersgrenze_gestaffelt"].value.values_to_look_up

array([65.        , 65.        , 65.        , 65.        , 65.        ,
       65.        , 65.        , 65.        , 65.        , 65.        ,
       65.        , 65.        , 65.        , 65.        , 65.        ,
       65.        , 65.        , 65.        , 65.        , 65.        ,
       65.        , 65.        , 65.        , 65.        , 65.        ,
       65.        , 65.        , 65.        , 65.        , 65.        ,
       65.        , 65.        , 65.        , 65.        , 65.        ,
       65.        , 65.        , 65.        , 65.        , 65.        ,
       65.        , 65.        , 65.        , 65.        , 65.        ,
       65.        , 65.        , 65.08333333, 65.16666667, 65.25      ,
       65.33333333, 65.41666667, 65.5       , 65.58333333, 65.66666667,
       65.75      , 65.83333333, 65.91666667, 66.        , 66.16666667,
       66.33333333, 66.5       , 66.66666667, 66.83333333, 67.        ,
       67.        , 67.        , 67.        , 67.        , 67.  

In [81]:
status_quo_environment["sozialversicherung"]["rente"]["altersrente"]["regelaltersrente"]["altersgrenze_gestaffelt"].value.look_up(1900)

array([65.])

Let's create a modified version with a steeper increase in the normal retirement age.

In [83]:
from gettsim.tt import ConsecutiveIntLookupTableParam, ConsecutiveIntLookupTableParamValue

increased_nra_by_birth_year = ConsecutiveIntLookupTableParam(
    value=ConsecutiveIntLookupTableParamValue(
        values_to_look_up=np.array([65.0] * 45 + [65.5, 66.0, 66.5, 67.0] + [67.0] * 51),
        bases_to_subtract=np.array([1900]),
        xnp=np,
    ),
)

increased_nra_by_birth_year_pe = copy_environment(status_quo_environment)
increased_nra_by_birth_year_pe["sozialversicherung"]["rente"]["altersrente"]["regelaltersrente"]["altersgrenze_gestaffelt"] = increased_nra_by_birth_year

main(
    main_target=MainTarget.results.df_with_nested_columns,
    policy_date_str="2025-01-01",
    input_data=InputData.tree({
        "geburtsjahr": np.array([1944, 1945, 1946, 1947, 1948]),
        "p_id": np.array([0, 1, 2, 3, 4])
    }),
    tt_targets={"tree": {"sozialversicherung": {"rente": {"altersrente": {"regelaltersrente": {"altersgrenze": None}}}}}},
    policy_environment=increased_nra_by_birth_year_pe,
    include_warn_nodes=False,
)

| p_id | `('sozialversicherung', 'rente', 'altersrente', 'regelaltersrente', 'altersgrenze')` |
|------|------|
| 0 | 65.0 |
| 1 | 65.5 |
| 2 | 66.0 |
| 3 | 66.5 |
| 4 | 67.0 |


### Piecewise Polynomial Parameters

Piecewise polynomial parameters specify a continuous polynomial (first to third degree)
on the real line. GETTSIM uses them whenever a parameter is a function of a continuous
variable (like income, age, etc.).
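
To illustrate the idea, the sketch below evaluates a continuous piecewise polynomial in plain Python. The thresholds and coefficients are made up, and the data layout (`lower_thresholds`, `coefficients`, `intercepts`) is an assumption chosen for this example; it is not the format of `PiecewisePolynomialParam`.

```python
from bisect import bisect_right

# Hypothetical schedule: lower threshold of each piece and the polynomial
# coefficients [linear, quadratic, ...] applied to the distance from it.
lower_thresholds = [0.0, 100.0, 1000.0]
coefficients = [[0.0], [0.2], [0.1]]


def polynomial(coeffs, dx):
    """Evaluate c1 * dx + c2 * dx**2 + ... (no constant term)."""
    return sum(c * dx ** (power + 1) for power, c in enumerate(coeffs))


# Chain the intercepts so that each piece starts where the previous one ends,
# which makes the resulting function continuous.
intercepts = [0.0]
for i in range(1, len(lower_thresholds)):
    width = lower_thresholds[i] - lower_thresholds[i - 1]
    intercepts.append(intercepts[-1] + polynomial(coefficients[i - 1], width))


def evaluate(x):
    """Find the piece that contains x and evaluate its polynomial."""
    i = max(bisect_right(lower_thresholds, x) - 1, 0)
    return intercepts[i] + polynomial(coefficients[i], x - lower_thresholds[i])


for income in (50.0, 600.0, 1500.0):
    print(income, evaluate(income))
```

In this toy schedule, the first 100 units are disregarded, the next 900 count at a rate of 0.2, and everything above 1000 counts at a rate of 0.1.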



## Modifying Parameter Functions

## Modifying Column Objects