# Parameter Indexing

Parameter indexing is outside the scope of ParamTools. However, indexed parameters are an important part of many modeling projects and ParamTools is up to the challenge. Let's take a look at how one might implement indexed parameters with ParamTools.

Setup
--------------------------

The TaxParams parameters will be used to demonstrate parameter indexing. These parameters are based directly on the Tax-Calculator policy parameters. Tax-Calculator does serious parameter indexing since many of its parameters depend on inflation and wage growth rates. This tutorial demonstrates how you can replicate that same level of parameter indexing with ParamTools.

The approach for this tutorial is to build an `IndexedParameters` class on top of `paramtools.Parameters`. This class can then be used by projects that require parameter indexing.

### Get the code and the data

1. Clone the ParamTools repo: https://github.com/PSLmodels/ParamTools
2. Install paramtools
    ```
    conda create -n paramtools-env numpy taxcalc pip -c pslmodels
    conda activate paramtools-env
    # do a local install (temporary) 
    cd ParamTools/
    pip install -e .
    ```
3. Change directories to `ParamTools/paramtools/examples/taxparams-demo`



In [1]:
# quick helper to print stuff out nicely.
def pprint(vals):
    for v in vals:
        print(v)

### TaxParams

Before we get started, let's make sure that the TaxParams can be loaded.

In [2]:
from marshmallow import Schema, fields

import paramtools

# first define the compatible data custom field.
class CompatibleDataSchema(Schema):
    """
    Schema for Compatible data object
    {
        "compatible_data": {"data1": bool, "data2": bool, ...}
    }
    """

    puf = fields.Boolean()
    cps = fields.Boolean()

class TaxParams(paramtools.Parameters):
    # You need to be in the paramtools/examples/taxparams directory!
    schema = "schema.json"
    defaults = "defaults.json"
    field_map = {"compatible_data": fields.Nested(CompatibleDataSchema())}

params = TaxParams()
print("EITC ceiling max year: ", max(map(lambda x: x["year"], params._EITC_c)), "\n")
print("EITC ceiling as dict: ")
pprint(params._EITC_c)

EITC ceiling max year:  2018 

EITC ceiling as dict: 
{'value': 487.0, 'EIC': '0kids', 'year': 2013}
{'value': 3250.0, 'EIC': '1kid', 'year': 2013}
{'value': 5372.0, 'EIC': '2kids', 'year': 2013}
{'value': 6044.0, 'EIC': '3+kids', 'year': 2013}
{'value': 496.0, 'EIC': '0kids', 'year': 2014}
{'value': 3305.0, 'EIC': '1kid', 'year': 2014}
{'value': 5460.0, 'EIC': '2kids', 'year': 2014}
{'value': 6143.0, 'EIC': '3+kids', 'year': 2014}
{'value': 503.0, 'EIC': '0kids', 'year': 2015}
{'value': 3359.0, 'EIC': '1kid', 'year': 2015}
{'value': 5548.0, 'EIC': '2kids', 'year': 2015}
{'value': 6242.0, 'EIC': '3+kids', 'year': 2015}
{'value': 506.0, 'EIC': '0kids', 'year': 2016}
{'value': 3373.0, 'EIC': '1kid', 'year': 2016}
{'value': 5572.0, 'EIC': '2kids', 'year': 2016}
{'value': 6269.0, 'EIC': '3+kids', 'year': 2016}
{'value': 510.0, 'EIC': '0kids', 'year': 2017}
{'value': 3400.0, 'EIC': '1kid', 'year': 2017}
{'value': 5616.0, 'EIC': '2kids', 'year': 2017}
{'value': 6318.0, 'EIC': '3+kids', 'year': 2017}
...

Extend Parameters
------------------------

To get started, ParamTools needs the ability to extend parameters through a given year. This is done by finding the maximum specified year for each parameter and duplicating each value object defined at that year for every remaining year. For example, the maximum defined year for `_EITC_c` is 2018, and four values are defined in 2018. These four values are copied forward to each remaining year, 2019 through 2028.

```json
[
    {"year": 2018, "value": 519.0, "EIC": "0kids"}, 
    {"year": 2018, "value": 3461.0, "EIC": "1kid"}, 
    {"year": 2018, "value": 5716.0, "EIC": "2kids"}, 
    {"year": 2018, "value": 6431.0, "EIC": "3+kids"}
]
```
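
Stripped of the ParamTools machinery, the duplication step can be sketched in plain Python. This is only an illustration: `extend_value_objects` is a hypothetical helper, not part of the ParamTools API, and it assumes every value object carries a `year` key.

```python
def extend_value_objects(value_objects, max_allowed_year):
    """Copy the value objects from the last specified year to each later year."""
    last_year = max(vo["year"] for vo in value_objects)
    last_year_vos = [vo for vo in value_objects if vo["year"] == last_year]
    extended = list(value_objects)
    for year in range(last_year + 1, max_allowed_year + 1):
        # dict(vo, year=year) copies the value object and overrides only "year".
        extended.extend(dict(vo, year=year) for vo in last_year_vos)
    return extended


eitc_2018 = [
    {"year": 2018, "value": 519.0, "EIC": "0kids"},
    {"year": 2018, "value": 3461.0, "EIC": "1kid"},
    {"year": 2018, "value": 5716.0, "EIC": "2kids"},
    {"year": 2018, "value": 6431.0, "EIC": "3+kids"},
]
extended = extend_value_objects(eitc_2018, 2028)
print(len(extended))  # 4 original value objects plus 4 for each of 2019 through 2028
print(extended[-1])
```

The class in the next cell does the same thing, but it reads the year range from the schema and applies the copies through `adjust`.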

In [3]:
from collections import defaultdict

import paramtools

class ExtendParameters(paramtools.Parameters):
    
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.extend()
    
    def extend(self):
        """
        Guarantee that all parameters are defined for each year
        from start year to end year.
        """
        max_allowed_year = max(self._stateless_dim_mesh["year"])
        adjustment = defaultdict(list)
        for param, data in self.specification(meta_data=True).items():
            max_year = max(map(lambda x: x["year"], data["value"]))
            if max_year == max_allowed_year:
                continue
            value_objects = self._get(param, True, year=max_year)
            while max_year < max_allowed_year:
                max_year += 1
                for value_object in value_objects:
                    adjustment[param].append(dict(value_object, **{"year": max_year}))
        self.adjust(adjustment)


class TaxParams(ExtendParameters):
    schema = "schema.json"
    defaults = "defaults.json"
    field_map = {"compatible_data": fields.Nested(CompatibleDataSchema())}

params = TaxParams()
print("EITC ceiling max year: ", max(map(lambda x: x["year"], params._EITC_c)), "\n")

EITC ceiling max year:  2028 



We can activate the `array_first` mode, since all parameters have been extended across the year axis. This means that all parameter values will be available as arrays instead of a list of dictionaries, or [value-objects][1]. 


[1]: https://paramtools.readthedocs.io/en/latest/spec.html#value-object
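
Why extension has to come first can be seen with a small stand-in for the conversion. The sketch below uses nested lists rather than NumPy and a hypothetical `to_grid` helper. It is not how ParamTools builds its arrays internally, only an illustration that every (year, `EIC`) cell needs a value before the grid is complete.

```python
def to_grid(value_objects, years, labels, dim):
    """Arrange value objects into rows (one per year) and columns (one per label)."""
    grid = [[None] * len(labels) for _ in years]
    for vo in value_objects:
        grid[years.index(vo["year"])][labels.index(vo[dim])] = vo["value"]
    return grid


labels = ["0kids", "1kid", "2kids", "3+kids"]
values_by_year = {
    2017: [510.0, 3400.0, 5616.0, 6318.0],
    2018: [519.0, 3461.0, 5716.0, 6431.0],
}
vos = [
    {"year": year, "EIC": label, "value": value}
    for year, values in values_by_year.items()
    for label, value in zip(labels, values)
]

complete = to_grid(vos, [2017, 2018], labels, "EIC")
gappy = to_grid(vos, [2017, 2018, 2019], labels, "EIC")
print(complete)
print(gappy[2])  # the 2019 row stays empty because nothing was specified for it
```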

In [4]:
# all parameters can now be converted to arrays.
params.array_first = True
params.set_state()
print("EITC ceiling as array: ")
print(params._EITC_c)


EITC ceiling as array: 
[[ 487. 3250. 5372. 6044.]
 [ 496. 3305. 5460. 6143.]
 [ 503. 3359. 5548. 6242.]
 [ 506. 3373. 5572. 6269.]
 [ 510. 3400. 5616. 6318.]
 [ 519. 3461. 5716. 6431.]
 [ 519. 3461. 5716. 6431.]
 [ 519. 3461. 5716. 6431.]
 [ 519. 3461. 5716. 6431.]
 [ 519. 3461. 5716. 6431.]
 [ 519. 3461. 5716. 6431.]
 [ 519. 3461. 5716. 6431.]
 [ 519. 3461. 5716. 6431.]
 [ 519. 3461. 5716. 6431.]
 [ 519. 3461. 5716. 6431.]
 [ 519. 3461. 5716. 6431.]]


Note that 519, 3461, 5716, and 6431 are repeated for every year from 2018 through 2028.

Indexed Parameters Intuition
-----------------------------------


The previous example shows how to extend parameter values along a given axis, like "year". However, the values are simply repeated until the maximum year is reached. Now it's time to index them at a specified rate.

To "grow" a parameter forward a year, we need to multiply it by one plus the rate at which it is expected to grow or its recorded growth rate. First, let's generate some data:

In [5]:
min_allowed_year = 2013
max_allowed_year = 2028
# max specified year
max_year = 2019

MARS = ["single", "joint", "separate", "headhousehold", "widow"]

vals = [{"year": 2013 + i, "MARS": status, "value": 1000 + i} 
          for i in range(max_year - min_allowed_year + 1) 
          for status in MARS]

pprint(vals)

{'year': 2013, 'MARS': 'single', 'value': 1000}
{'year': 2013, 'MARS': 'joint', 'value': 1000}
{'year': 2013, 'MARS': 'separate', 'value': 1000}
{'year': 2013, 'MARS': 'headhousehold', 'value': 1000}
{'year': 2013, 'MARS': 'widow', 'value': 1000}
{'year': 2014, 'MARS': 'single', 'value': 1001}
{'year': 2014, 'MARS': 'joint', 'value': 1001}
{'year': 2014, 'MARS': 'separate', 'value': 1001}
{'year': 2014, 'MARS': 'headhousehold', 'value': 1001}
{'year': 2014, 'MARS': 'widow', 'value': 1001}
{'year': 2015, 'MARS': 'single', 'value': 1002}
{'year': 2015, 'MARS': 'joint', 'value': 1002}
{'year': 2015, 'MARS': 'separate', 'value': 1002}
{'year': 2015, 'MARS': 'headhousehold', 'value': 1002}
{'year': 2015, 'MARS': 'widow', 'value': 1002}
{'year': 2016, 'MARS': 'single', 'value': 1003}
{'year': 2016, 'MARS': 'joint', 'value': 1003}
{'year': 2016, 'MARS': 'separate', 'value': 1003}
{'year': 2016, 'MARS': 'headhousehold', 'value': 1003}
{'year': 2016, 'MARS': 'widow', 'value': 1003}
{'year': 2017, 'MARS': 'single', 'value': 1004}
...

Second, we need to be able to get the previous year's value while matching on the current value object's MARS value.

In [6]:
def get_mars_lookup(value_objects):
    """Return dictionary where the MARS values are the keys."""
    return {vo["MARS"]: {"value": vo["value"]} for vo in value_objects}

test_vals = [
    {'year': 2013, 'MARS': 'single', 'value': 'single val'},
    {'year': 2013, 'MARS': 'joint', 'value': 1000},
    {'year': 2013, 'MARS': 'separate', 'value': 1000},
    {'year': 2013, 'MARS': 'headhousehold', 'value': 1000},
    {'year': 2013, 'MARS': 'widow', 'value': 1000}
]
mars_lookup_2013 = get_mars_lookup(test_vals)

print("Look up the value for MARS=single: ", mars_lookup_2013["single"]["value"])

Look up the value for MARS=single:  single val


Finally, we extend and index the data.

In [7]:
import taxcalc
import numpy as np

# use taxcalc inflation rates
rates = taxcalc.Policy().inflation_rates()

print("rates", rates)
print("vals")
pprint(vals[:-10])

for ix, year in enumerate(range(max_year + 1, max_allowed_year + 1)):
    prev_year = year - 1
    prev_year_vals = [val for val in vals if val["year"] == prev_year]
    mars_lookup = get_mars_lookup(prev_year_vals)
    for status in MARS:
        new_val = {
            "year": year,
            "MARS": status,
            # previous year's value grown by the rate stored at the previous year's index
            "value": np.round(mars_lookup[status]["value"] * 
                              (1 + rates[max_year - min_allowed_year + ix]), 2)
        }
        vals.append(new_val)
    
print("result for a subsection of the values: ")
pprint(vals[25:-25])
print("...")

rates [0.0148, 0.0159, 0.0012, 0.0127, 0.0189, 0.0229, 0.0199, 0.0224, 0.0227, 0.0223, 0.0218, 0.0215, 0.0211, 0.021, 0.021, 0.0211]
vals
{'year': 2013, 'MARS': 'single', 'value': 1000}
{'year': 2013, 'MARS': 'joint', 'value': 1000}
{'year': 2013, 'MARS': 'separate', 'value': 1000}
{'year': 2013, 'MARS': 'headhousehold', 'value': 1000}
{'year': 2013, 'MARS': 'widow', 'value': 1000}
{'year': 2014, 'MARS': 'single', 'value': 1001}
{'year': 2014, 'MARS': 'joint', 'value': 1001}
{'year': 2014, 'MARS': 'separate', 'value': 1001}
{'year': 2014, 'MARS': 'headhousehold', 'value': 1001}
{'year': 2014, 'MARS': 'widow', 'value': 1001}
{'year': 2015, 'MARS': 'single', 'value': 1002}
{'year': 2015, 'MARS': 'joint', 'value': 1002}
{'year': 2015, 'MARS': 'separate', 'value': 1002}
{'year': 2015, 'MARS': 'headhousehold', 'value': 1002}
{'year': 2015, 'MARS': 'widow', 'value': 1002}
{'year': 2016, 'MARS': 'single', 'value': 1003}
{'year': 2016, 'MARS': 'joint', 'value': 1003}
{'year': 2016, 'MARS': 'separate', 'value': 1003}
...

There you go. A simple indexing function. However, it's not going to work for `TaxParams` just yet. Tax-Calculator parameters could have other dimensions besides Marital Status, like Itemized Deduction type. Further, we need to check that the parameter needs to be indexed in the first place.
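
One way to drop the hard-coded `MARS` key is to build the lookup key from every dimension other than `year` and `value`. The sketch below assumes that approach. `get_lookup` and the `idedtype` dimension name are hypothetical and used only for illustration.

```python
def get_lookup(value_objects):
    """Key each value by a tuple of all dimensions except year and value."""
    # Sorting gives the key tuple a stable order regardless of dict ordering.
    dims = sorted(k for k in value_objects[0] if k not in ("year", "value"))
    lookup = {tuple(vo[d] for d in dims): vo["value"] for vo in value_objects}
    return dims, lookup


vos = [
    {"year": 2013, "MARS": "single", "idedtype": "medical", "value": 1000},
    {"year": 2013, "MARS": "joint", "idedtype": "medical", "value": 2000},
    {"year": 2013, "MARS": "single", "idedtype": "charity", "value": 3000},
]
dims, lookup = get_lookup(vos)
print(dims)                           # ['MARS', 'idedtype']
print(lookup[("single", "charity")])  # 3000
```

With a parameter that has only a `MARS` dimension, the key collapses to a one-element tuple, so the same helper covers the earlier example.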

`IndexedParameters` Class
-----------------------------

`IndexedParameters` builds in the notions discussed in the previous section in a more general way. Additionally, it supports adjusting parameters and then extending and indexing the new values. The `extend` method is pretty gnarly, but if you stare at it long enough, you'll notice that `get_vo_lookup` corresponds to `get_mars_lookup` and that the for-loop bounds are identical, among other similarities.

In [8]:
from collections import defaultdict

import numpy as np
from marshmallow import Schema, fields

import paramtools

import taxcalc


class IndexedParameters(paramtools.Parameters):
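    def __init__(self, *args, **kwargs):
        # forward any arguments so subclasses can call super().__init__(*args, **kwargs)
        super().__init__(*args, **kwargs)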
        self.extend()

    def extend(self, params_to_extend=None):
        min_allowed_year = min(self._stateless_dim_mesh["year"])
        max_allowed_year = max(self._stateless_dim_mesh["year"])
        adjustment = defaultdict(list)
        spec = self.specification(use_state=False, meta_data=True)
        if params_to_extend is None:
            param_data = spec
        else:
            param_data = {param: spec[param] for param in params_to_extend}

        def get_vo_lookup(vos, dims):
            qh = {}
            for vo in vos:
                qh[tuple(vo[d] for d in dims)] = vo["value"]
            return qh

        for param, data in param_data.items():
            max_year = max(map(lambda x: x["year"], data["value"]))
            # nothing to extend if the parameter already reaches the final year
            if max_year == max_allowed_year:
                continue
            value_objects = self._get(param, True, year=max_year)
            if data["cpi_inflated"]:
                # preserve order!
                dims_to_match = sorted(
                    [
                        dim_name
                        for dim_name in value_objects[0]
                        if dim_name not in ("year", "value")
                    ]
                )
                vo_lookup = get_vo_lookup(value_objects, dims_to_match)
                rates = self.indexing_rates(param)
                for ix, year in enumerate(
                    range(max_year + 1, max_allowed_year + 1)
                ):
                    for vo in value_objects:
                        dim_values = tuple(
                            vo[dim_name] for dim_name in dims_to_match
                        )
                        v = vo_lookup[dim_values] * (
                            1 + rates[max_year - min_allowed_year + ix]
                        )
                        v = np.round(v, 2) if v < 9e99 else 9e99
                        adjustment[param].append(
                            dict(vo, **{"year": year, "value": v})
                        )
                        vo_lookup[dim_values] = v
            else:
                for year in range(max_year, max_allowed_year + 1):
                    for vo in value_objects:
                        adjustment[param].append(dict(vo, **{"year": year}))
        
        self.array_first = True
        self.adjust(adjustment)

    def adjust_with_extend(self, params_or_path, raise_errors=False):
        params = self.read_params(params_or_path)
        curr_vals = self.specification()
        for param, param_adj in params.items():
            max_year = max(map(lambda x: x["year"], param_adj))
            for vo in curr_vals[param]:
                if vo["year"] > max_year:
                    params[param].append(dict(vo, **{"value": None}))
        self.array_first = False
        self.adjust(params)
        self.extend(params_to_extend=list(params.keys()))


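`IndexedParameters` isn't meant to stand on its own: its `extend` method calls `self.indexing_rates`, which it does not define. `TaxParams` must subclass it and supply the `indexing_rates` method before it can be used.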

In [9]:
class TaxParams(IndexedParameters):
    schema = "schema.json"
    defaults = "defaults.json"
    field_map = {"compatible_data": fields.Nested(CompatibleDataSchema())}

    def __init__(self, *args, **kwargs):
        # Prepare the taxcalc inflation rates.
        growfactors = taxcalc.GrowFactors()
        self._inflation_rates = growfactors.price_inflation_rates(2013, 2028)
        self._apply_clp_cpi_offset(2028 - 2013 + 1)
        self._wage_growth_rates = growfactors.wage_growth_rates(2013, 2028)
        super().__init__(*args, **kwargs)

    def indexing_rates(self, param=None):
        """
        See taxcalc.Parameters.indexing_rates.
        """
        if param == "_SS_Earnings_c":
            return self._wage_growth_rates
        else:
            return self._inflation_rates

    def _apply_clp_cpi_offset(self, num_years):
        """
        See taxcalc.Policy._apply_clp_cpi_offset

        If you are curious about what's going on here, the
        cpi_offset parameter is an approximation for the chained
        cpi.
        """
        cpi_offset = [0.0, 0.0, 0.0, 0.0, -0.0025]
        if len(cpi_offset) < num_years:  # extrapolate last known value
            cpi_offset = cpi_offset + cpi_offset[-1:] * (
                num_years - len(cpi_offset)
            )
        for idx in range(0, num_years):
            infrate = round(self._inflation_rates[idx] + cpi_offset[idx], 6)
            self._inflation_rates[idx] = infrate
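
The offset step is easier to follow in isolation. The sketch below mirrors the logic of `_apply_clp_cpi_offset` on a plain list. `apply_offset` is a hypothetical helper, and the input rates are made up for illustration rather than taken from Tax-Calculator.

```python
def apply_offset(rates, offsets):
    """Add a per-year offset to each rate, repeating the last known offset."""
    if len(offsets) < len(rates):
        # Extrapolate: reuse the final offset for every remaining year.
        offsets = offsets + offsets[-1:] * (len(rates) - len(offsets))
    return [round(rate + offset, 6) for rate, offset in zip(rates, offsets)]


raw_rates = [0.02] * 7  # made-up rates, not Tax-Calculator data
adjusted = apply_offset(raw_rates, [0.0, 0.0, 0.0, 0.0, -0.0025])
print(adjusted)  # [0.02, 0.02, 0.02, 0.02, 0.0175, 0.0175, 0.0175]
```

The first four years are untouched, and every year from the fifth onward is lowered by a quarter of a percentage point.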


In [10]:
params = TaxParams()
print("paramtools: ", params._EITC_c, "\n")

pol = taxcalc.Policy()
print("taxcalc: ", pol._EITC_c)



paramtools:  [[ 487.   3250.   5372.   6044.  ]
 [ 496.   3305.   5460.   6143.  ]
 [ 503.   3359.   5548.   6242.  ]
 [ 506.   3373.   5572.   6269.  ]
 [ 510.   3400.   5616.   6318.  ]
 [ 519.   3461.   5716.   6431.  ]
 [ 530.89 3540.26 5846.9  6578.27]
 [ 541.45 3610.71 5963.25 6709.18]
 [ 553.58 3691.59 6096.83 6859.47]
 [ 566.15 3775.39 6235.23 7015.18]
 [ 578.78 3859.58 6374.28 7171.62]
 [ 591.4  3943.72 6513.24 7327.96]
 [ 604.12 4028.51 6653.27 7485.51]
 [ 616.87 4113.51 6793.65 7643.45]
 [ 629.82 4199.89 6936.32 7803.96]
 [ 643.05 4288.09 7081.98 7967.84]] 

taxcalc:  [[ 487.   3250.   5372.   6044.  ]
 [ 496.   3305.   5460.   6143.  ]
 [ 503.   3359.   5548.   6242.  ]
 [ 506.   3373.   5572.   6269.  ]
 [ 510.   3400.   5616.   6318.  ]
 [ 519.   3461.   5716.   6431.  ]
 [ 530.89 3540.26 5846.9  6578.27]
 [ 541.45 3610.71 5963.25 6709.18]
 [ 553.58 3691.59 6096.83 6859.47]
 [ 566.15 3775.39 6235.23 7015.18]
 [ 578.78 3859.58 6374.28 7171.62]
 [ 591.4  3943.72 6513.24 7327.96]
 ...

Let's confirm that the results are the same.

In [11]:
for param in params.specification():
    np.testing.assert_allclose(getattr(params, param), getattr(pol, param))

print("No errors were raised!")

No errors were raised!
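
As a hand check, the `0kids` column can be reproduced without either library. The sketch below starts from the 2018 value of 519 and compounds it with the rates printed earlier in this tutorial, rounding to two decimals at each step as `extend` does. Plain `round` is used here in place of `np.round`, which is an assumption that holds for these particular values.

```python
# Rates as printed earlier in this tutorial; index 0 corresponds to 2013.
rates = [0.0148, 0.0159, 0.0012, 0.0127, 0.0189, 0.0229, 0.0199, 0.0224,
         0.0227, 0.0223, 0.0218, 0.0215, 0.0211, 0.021, 0.021, 0.0211]
min_allowed_year, max_allowed_year = 2013, 2028

value, year = 519.0, 2018
column = []
while year < max_allowed_year:
    # Grow by the rate stored at the index of the year being grown from.
    value = round(value * (1 + rates[year - min_allowed_year]), 2)
    year += 1
    column.append(value)

print(column)
# [530.89, 541.45, 553.58, 566.15, 578.78, 591.4, 604.12, 616.87, 629.82, 643.05]
```

These are the values in the first column of the 2019 through 2028 rows above.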


Adjust
----------

The `IndexedParameters.adjust_with_extend` method should be used to adjust the parameter values. This method finds the maximum specified year for a parameter in a reform. It then removes all values set in later years. The new value is extended and indexed using the `extend` method.\*

Here's how you adjust the Tax-Calculator parameters with `TaxParams` and with `taxcalc.Policy`. This adjustment is a component of the Brown-Khanna Grow American Incomes Now (GAIN) Act of 2017. The entire Tax-Calculator reform for this act is in [Tax-Calculator's repo][2].

\* This is a coarse approach. More care could be taken for adjustments that, for example, adjust a parameter's value in 2018 and in 2025. Right now the 2018 value is updated, but it is ignored by the `extend` method. If anyone is interested in this functionality, I will happily implement it or help someone else implement it.

[2]: https://github.com/hdoupe/Tax-Calculator/blob/master/taxcalc/reforms/BrownKhanna.json
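
The drop-then-reindex idea can be sketched on a single series. This is a simplified illustration, not the ParamTools implementation: `adjust_series` is a hypothetical helper, the series holds only the `0kids` values, and the rates are the first eight printed earlier in this tutorial.

```python
RATES = [0.0148, 0.0159, 0.0012, 0.0127, 0.0189, 0.0229, 0.0199, 0.0224]  # 2013 to 2020
MIN_YEAR = 2013


def adjust_series(series, reform, max_year):
    """Apply a reform, discard later years, then re-index forward from the reform."""
    last_reform_year = max(reform)
    # Values after the last reformed year are dropped; they are rebuilt below.
    updated = {y: v for y, v in series.items() if y <= last_reform_year}
    updated.update(reform)
    for year in range(last_reform_year + 1, max_year + 1):
        rate = RATES[year - 1 - MIN_YEAR]
        updated[year] = round(updated[year - 1] * (1 + rate), 2)
    return updated


series = {2013: 487.0, 2014: 496.0, 2015: 503.0, 2016: 506.0,
          2017: 510.0, 2018: 519.0, 2019: 530.89}
result = adjust_series(series, {2017: 3000.0}, 2021)
print([result[y] for y in range(2017, 2022)])
# [3000.0, 3056.7, 3126.7, 3188.92, 3260.35]
```

The old 2018 and 2019 values are gone, and every year after 2017 is derived from the reformed value.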

In [12]:
params.adjust_with_extend({
    "_EITC_c": [
        {"EIC": "0kids", "year": 2017, "value": 3000},
        {"EIC": "1kid", "year": 2017, "value": 6528},
        {"EIC": "2kids", "year": 2017, "value":10783},
        {"EIC": "3+kids", "year": 2017, "value":12131}
    ]
})

pol.implement_reform({
    2017: {
        "_EITC_c": [[3000, 6528, 10783, 12131]]
    }
})

In [13]:
print("paramtools: ", params._EITC_c, "\n")

print("taxcalc: ", pol._EITC_c)

np.testing.assert_allclose(params._EITC_c, pol._EITC_c)

print("Updated params are the same!")

paramtools:  [[  487.    3250.    5372.    6044.  ]
 [  496.    3305.    5460.    6143.  ]
 [  503.    3359.    5548.    6242.  ]
 [  506.    3373.    5572.    6269.  ]
 [ 3000.    6528.   10783.   12131.  ]
 [ 3056.7   6651.38 10986.8  12360.28]
 [ 3126.7   6803.7  11238.4  12643.33]
 [ 3188.92  6939.09 11462.04 12894.93]
 [ 3260.35  7094.53 11718.79 13183.78]
 [ 3334.36  7255.58 11984.81 13483.05]
 [ 3408.72  7417.38 12252.07 13783.72]
 [ 3483.03  7579.08 12519.17 14084.21]
 [ 3557.92  7742.03 12788.33 14387.02]
 [ 3632.99  7905.39 13058.16 14690.59]
 [ 3709.28  8071.4  13332.38 14999.09]
 [ 3787.17  8240.9  13612.36 15314.07]] 

taxcalc:  [[  487.    3250.    5372.    6044.  ]
 [  496.    3305.    5460.    6143.  ]
 [  503.    3359.    5548.    6242.  ]
 [  506.    3373.    5572.    6269.  ]
 [ 3000.    6528.   10783.   12131.  ]
 [ 3056.7   6651.38 10986.8  12360.28]
 [ 3126.7   6803.7  11238.4  12643.33]
 [ 3188.92  6939.09 11462.04 12894.93]
 [ 3260.35  7094.53 11718.79 13183.78]

I hope this tutorial helped you learn more about ParamTools and how to do parameter indexing with it. I'm eager for feedback on this tutorial. Feel free to open an issue or send me an email at henrymdoupe@gmail.com.