# Visualizing the Taxes and Transfers System

## How to Create a Plot

To help you understand how GETTSIM works internally and how you can implement custom reforms, you can visualize the tax and transfer system. This tutorial explains how to create such a plot and what information you can read from it. Along the way, it also explains parts of GETTSIM's design.

In [1]:
import pandas as pd

from gettsim import set_up_policy_environment
from gettsim import plot_dag

For the visualization, we need to set up our policy environment.

In [2]:
policy_params, policy_functions = set_up_policy_environment(date=2020)

Functions inside GETTSIM are a little bit special. Take, for example, `kindergeld_m_hh`, which is documented [here](https://gettsim.readthedocs.io/en/latest/functions.html#gettsim.functions.kindergeld_m_hh). The signature of the function is

```python
def kindergeld_m_hh(kindergeld_m, hh_id):
    pass
```

This function has two arguments, and neither of them is a parameter. Most functions require some parameters, but they are not mandatory. The name of each argument corresponds either to a variable in the data provided by the user or to another function which, in turn, relies on its own arguments.

Here, ``hh_id`` is a variable which identifies households and must be present in the data. ``kindergeld_m``, on the other hand, is itself a function which is documented [here](https://gettsim.readthedocs.io/en/latest/functions.html#gettsim.functions.kindergeld_m_ab_1997) and [here](https://gettsim.readthedocs.io/en/latest/functions.html#gettsim.functions.kindergeld_m_bis_1996) (different versions for different time periods). By using ``kindergeld_m`` as an argument name, GETTSIM knows to pass the data computed by the function ``kindergeld_m`` to ``kindergeld_m_hh``.
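The name-based wiring described above can be sketched in a few lines of plain Python. This is a minimal illustration, not GETTSIM's implementation: the function bodies, the `kind` column, the flat amount of 100, and the helpers `dependencies` and `compute` are all assumptions made up for this example.

```python
import inspect


# Toy stand-ins for policy functions. Bodies and amounts are invented.
def kindergeld_m(kind):
    """Pay a flat, made-up amount of 100 for every child."""
    return [100.0 if is_child else 0.0 for is_child in kind]


def kindergeld_m_hh(kindergeld_m, hh_id):
    """Sum the individual amounts within each household."""
    totals = {}
    for amount, hh in zip(kindergeld_m, hh_id):
        totals[hh] = totals.get(hh, 0.0) + amount
    return [totals[hh] for hh in hh_id]


functions = {"kindergeld_m": kindergeld_m, "kindergeld_m_hh": kindergeld_m_hh}


def dependencies(functions):
    """Map each function name to the names of its arguments."""
    return {
        name: list(inspect.signature(func).parameters)
        for name, func in functions.items()
    }


def compute(target, functions, data, cache=None):
    """Take ``target`` from the data or compute it by resolving its arguments."""
    cache = {} if cache is None else cache
    if target in data:
        return data[target]
    if target not in cache:
        func = functions[target]
        kwargs = {
            arg: compute(arg, functions, data, cache)
            for arg in inspect.signature(func).parameters
        }
        cache[target] = func(**kwargs)
    return cache[target]


# Hypothetical input data: four persons in two households.
data = {"kind": [True, True, False, True], "hh_id": [1, 1, 1, 2]}

deps = dependencies(functions)
result = compute("kindergeld_m_hh", functions, data)
```

Here `deps` records that `kindergeld_m_hh` needs `kindergeld_m` and `hh_id`. Any argument name that is not itself a function name has to come from the data.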

This dependency relationship can be analyzed for all functions passed to GETTSIM and visualized as a directed acyclic graph (DAG). Below you can see a plot of all nodes which are directly connected to ``kindergeld_m``. The arrows point from dependencies to dependents. Each node is either a function or a variable. Clicking on a node redirects you to the GETTSIM documentation; if the node is computed by a function, that function's documentation is displayed.

In [3]:
p = plot_dag(
    functions=policy_functions,
    selectors=[{"node": "kindergeld_m", "type": "neighbors"}],
);

The general interface of the plotting function is similar to ``compute_taxes_and_transfers()``, but without the ``data`` and ``params`` arguments. Here is the complete signature.

In [4]:
plot_dag?

Signature:
plot_dag(
    functions,
    targets=None,
    columns_overriding_functions=None,
    check_minimal_specification='ignore',
    selectors=None,
    labels=True,
    tooltips=False,
    plot_kwargs=None,
    arrow_kwargs=None,
    edge_kwargs=None,
    label_kwargs=None,
    node_kwargs=None,
)

The following sections show several ways to select subsets of the graph or to style the plot.

## Labels

It is possible to hide the labels of the nodes by setting `plot_dag(..., labels=False)`.


## Selectors

Selectors allow you to visualize only a subset of the complete graph of the tax and transfer system. They are passed to the `selectors` argument of the `plot_dag()` function. There are several ways to define a selector, and selectors can be combined with one another. Let us discuss each selector on its own first.


### Basics

It is always possible to pass a string or a list of strings to `selectors`. In this case only the given nodes are displayed in the plot.

In [5]:
selectors = "kinderfreib_tu"

plot_dag(functions=policy_functions, selectors=selectors);

Using a list of variable names, we can select multiple nodes.

In [6]:
selectors = ["kinderfreib_tu", "_zu_verst_eink_kinderfreib_tu"]

plot_dag(functions=policy_functions, selectors=selectors);

Passing a string or a list of strings to `selectors` is actually a shortcut for the richer interface for selecting nodes. Selectors are usually represented as dictionaries. The corresponding dictionary for selecting a list of nodes is

In [7]:
selector = {
    "type": "nodes",
    "node": ["kinderfreib_tu", "_zu_verst_eink_kinderfreib_tu"],
    "select": True,  # optional
}

Let us go through the keys of the dictionary one by one.

1. `"type"` specifies the type of the selector. For a single node or a list of nodes the type is `"nodes"`.
2. `"node"` always refers to the node or nodes to which the selector is applied. In this case, it is the list of node names.
3. `"select"` specifies whether the nodes should be selected or de-selected. If you do not specify `"select"` it is assumed to be `True`.
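The relationship between the shortcuts and the dictionary form can be sketched as follows. `normalize_selectors` is a hypothetical helper written for this tutorial and is not part of GETTSIM's API. It only assumes what is stated above: strings stand for `"nodes"` selectors and `"select"` defaults to `True`.

```python
def normalize_selectors(selectors):
    """Expand string shortcuts into a list of selector dictionaries."""
    # A single string or dictionary is treated like a one-element list.
    if isinstance(selectors, (str, dict)):
        selectors = [selectors]

    names = [s for s in selectors if isinstance(s, str)]
    dicts = [s for s in selectors if isinstance(s, dict)]

    normalized = []
    if names:
        # Plain node names collapse into one "nodes" selector.
        normalized.append({"type": "nodes", "node": names, "select": True})
    for selector in dicts:
        # Fill in the default for "select" without overriding a given value.
        normalized.append({"select": True, **selector})
    return normalized


from_string = normalize_selectors("kinderfreib_tu")
from_list = normalize_selectors(["kinderfreib_tu", "_zu_verst_eink_kinderfreib_tu"])
```

Both `from_string` and `from_list` end up as a list with a single `"nodes"` dictionary, which is the form used in the rest of this tutorial.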

### De-selecting Nodes

It is also possible to specify selectors which de-select some nodes. Note that:

- De-selectors are applied after nodes have been selected.
- If no selectors are provided, de-selectors de-select nodes from the complete DAG.
- Selection and de-selection work for all selector types which follow.

For a simple and silly example, we want to reproduce the graph with the single node for `kinderfreib_tu`, but starting from the last plot which also showed `_zu_verst_eink_kinderfreib_tu`.

First, we define the selectors. The first selector (dictionary) in the list selects the two nodes. Note that the `"select"` key is `True` by default. The second selector de-selects `"_zu_verst_eink_kinderfreib_tu"`.

In [8]:
selectors = [
    {
        "type": "nodes",
        "node": ["kinderfreib_tu", "_zu_verst_eink_kinderfreib_tu"],
    },
    {
        "type": "nodes",
        "node": "_zu_verst_eink_kinderfreib_tu",
        "select": False,
    }
]

In [9]:
plot_dag(functions=policy_functions, selectors=selectors);
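The order of operations described in this section can be sketched with plain sets. This is an illustration under the rules listed above, not GETTSIM's code. `apply_node_selectors` is a hypothetical helper, and `some_other_node` is a made-up node that stands for the rest of the graph.

```python
def apply_node_selectors(all_nodes, selectors):
    """Return the nodes left after applying "nodes" selectors and de-selectors."""

    def as_set(node):
        # "node" may be a single name or a list of names.
        return {node} if isinstance(node, str) else set(node)

    selected, deselected = set(), set()
    for selector in selectors:
        if selector.get("select", True):
            selected |= as_set(selector["node"])
        else:
            deselected |= as_set(selector["node"])

    # Without any selecting selector, de-selection starts from the complete graph.
    has_selection = any(s.get("select", True) for s in selectors)
    base = selected if has_selection else set(all_nodes)

    # De-selectors are applied last.
    return base - deselected


all_nodes = {"kinderfreib_tu", "_zu_verst_eink_kinderfreib_tu", "some_other_node"}
selectors = [
    {"type": "nodes", "node": ["kinderfreib_tu", "_zu_verst_eink_kinderfreib_tu"]},
    {"type": "nodes", "node": "_zu_verst_eink_kinderfreib_tu", "select": False},
]
remaining = apply_node_selectors(all_nodes, selectors)
only_deselect = apply_node_selectors(all_nodes, selectors[1:])
```

With both selectors only `kinderfreib_tu` remains. With the de-selector alone, every node of the toy graph except `_zu_verst_eink_kinderfreib_tu` remains.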

### Ancestors and Descendants

Two other types of selectors allow you to pick one node and all nodes which appear before or after this node. We call these nodes ancestors or descendants, respectively. To select `"anz_kindergeld_kinder_tu"`, the number of children per tax unit for whom the tax unit receives child benefits, together with all its ancestors, do the following.

In [10]:
selector = {"type": "ancestors", "node": "anz_kindergeld_kinder_tu"}

In [11]:
plot_dag(functions=policy_functions, selectors=selector);

To see the variables which depend directly or indirectly on the information in `"_geringfügig_beschäftigt"`, use the type `"descendants"`.

In [12]:
selector = {"type": "descendants", "node": "_geringfügig_beschäftigt"}

In [13]:
plot_dag(functions=policy_functions, selectors=selector);
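What the two selector types collect can be sketched on a toy graph with standard-library Python only. The graph below is an assumption for illustration: the edges into `kindergeld_m_hh` follow the signature shown at the beginning of this tutorial, while the `kind` input and the helper names are made up. GETTSIM's own graph handling may differ.

```python
# Toy DAG: each key maps to its direct dependents (arrows point key -> value).
children = {
    "kind": ["kindergeld_m"],  # hypothetical data column
    "kindergeld_m": ["kindergeld_m_hh"],
    "hh_id": ["kindergeld_m_hh"],
    "kindergeld_m_hh": [],
}


def invert(graph):
    """Reverse all edges so that each node maps to its direct dependencies."""
    parents = {node: [] for node in graph}
    for node, targets in graph.items():
        for target in targets:
            parents[target].append(node)
    return parents


def reachable(graph, start):
    """Collect ``start`` and every node reachable from it along the edges."""
    seen = {start}
    stack = [start]
    while stack:
        node = stack.pop()
        for neighbor in graph[node]:
            if neighbor not in seen:
                seen.add(neighbor)
                stack.append(neighbor)
    return seen


def ancestors_of(node):
    """The node plus everything it depends on, directly or indirectly."""
    return reachable(invert(children), node)


def descendants_of(node):
    """The node plus everything that depends on it, directly or indirectly."""
    return reachable(children, node)


ancestors = ancestors_of("kindergeld_m_hh")
descendants = descendants_of("kind")
```

Ancestors are found by walking against the arrows, descendants by walking along them. In both cases the starting node itself is part of the selection.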

### Neighbors

Another common way to look at a graph is to visualize a node together with its neighbors, that is, its direct ancestors and descendants. Let us take a look at `"anz_kindergeld_kinder_tu"` again and visualize its direct neighbors.

In [14]:
selector = {"type": "neighbors", "node": "anz_kindergeld_kinder_tu"}

In [15]:
plot_dag(functions=policy_functions, selectors=selector);

It is also possible to look at more distant neighbors, that is, neighbors of order 2, 3, and so on. This is controlled by the `"order"` key, which is 1 by default.

In [16]:
selector = {"type": "neighbors", "node": "anz_kindergeld_kinder_tu", "order": 2}

In [17]:
plot_dag(functions=policy_functions, selectors=selector);
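The effect of `"order"` can be sketched on a toy graph. The sketch assumes that neighbors of order k are the ancestors and descendants at most k edges away from the node; whether GETTSIM also includes nodes reached by changing direction along the way is not covered here. The graph, its node names, and the helper functions are made up for illustration.

```python
# Toy DAG: a -> b -> c -> d -> e, plus a second input x -> d.
children = {
    "a": ["b"],
    "b": ["c"],
    "c": ["d"],
    "x": ["d"],
    "d": ["e"],
    "e": [],
}


def invert(graph):
    """Reverse all edges so that each node maps to its direct dependencies."""
    parents = {node: [] for node in graph}
    for node, targets in graph.items():
        for target in targets:
            parents[target].append(node)
    return parents


def within_distance(graph, start, order):
    """Collect ``start`` and all nodes at most ``order`` edges away from it."""
    seen = {start}
    frontier = {start}
    for _ in range(order):
        # Move one edge further and keep only nodes not visited before.
        frontier = {nxt for node in frontier for nxt in graph[node]} - seen
        seen |= frontier
    return seen


def neighbors(graph, node, order=1):
    """Ancestors and descendants of ``node`` up to ``order`` edges away."""
    upstream = within_distance(invert(graph), node, order)
    downstream = within_distance(graph, node, order)
    return upstream | downstream


first_order = neighbors(children, "c")
second_order = neighbors(children, "c", order=2)
```

Under this assumption, `x` is not an order-2 neighbor of `c`, because reaching it requires going down to `d` and then up again.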

## Tooltips (experimental)

Instead of clicking on the nodes to be redirected to the online documentation of GETTSIM, it is possible to display the source code of functions in tooltips while hovering over the nodes. To enable this experimental feature, pass ``tooltips=True`` to the plotting function.

Unfortunately, the tooltips are sometimes cut off or misplaced, which can be mitigated by increasing the figure size and the margins.

In [18]:
from bokeh.models import Range1d

In [19]:
plot_kwargs = {
    "plot_width": 1_000,
    "plot_height": 1_000,
    "x_range": Range1d(-1.5, 1.5),
    "y_range": Range1d(-1.5, 1.5)
}

In [20]:
plot_dag(functions=policy_functions, selectors=selector, tooltips=True, plot_kwargs=plot_kwargs);