# Design template: Model specification

This notebook hosts a template for model specifications in OpenSourceEconomics. As an example, we consider a simple $AR(1)$ process.

\begin{align*}
x_{t + 1} = \rho x_t + \epsilon_t,\quad\text{for}\quad t = 1, \dots, T - 1
\end{align*}

with $x_1 = 0$ and $\epsilon_t \sim N(0, 1)$. There are two remaining parameters: $\rho$, the level of serial correlation, and $T$, the number of time periods. We can think of $\rho$ as a parameter to be estimated, and thus updated during an estimation, and of $T$ as fixed throughout.
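
Reading the initial condition as a zero starting value, $x_1 = 0$, the recursion can be unrolled into a closed form that is handy for checking a simulation. The short derivation below is an addition to the template and assumes that the shocks are independent across periods:

\begin{align*}
x_2 &= \epsilon_1, \\
x_3 &= \rho \epsilon_1 + \epsilon_2, \\
x_{t + 1} &= \sum_{s = 1}^{t} \rho^{t - s} \epsilon_s, \\
E[x_{t + 1}] &= 0, \qquad Var[x_{t + 1}] = \sum_{k = 0}^{t - 1} \rho^{2k} = \frac{1 - \rho^{2t}}{1 - \rho^2} \quad\text{for}\quad \rho^2 \neq 1
\end{align*}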

In [1]:
import pathlib
import yaml
import copy

import numpy as np

# The class-based syntax of typing.NamedTuple requires
# Python 3.6 or higher.
import typing

We will build on the [NamedTuple](https://docs.python.org/2/library/collections.html#collections.namedtuple) from the Python standard library. It is a simple extension of the standard tuple in which all elements can also be accessed by field name. This ensures that our model specification remains immutable after initialization.
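
A minimal sketch of this behavior, using a hypothetical `Point` record that is not part of the template:

```python
import typing


class Point(typing.NamedTuple):
    """Hypothetical two-field record used only for illustration."""
    x: int
    y: float


point = Point(x=1, y=2.0)

# Fields are accessible by name and by position.
assert point.x == 1
assert point[1] == 2.0

# Assigning to an existing field raises an AttributeError.
try:
    point.x = 5
    raised = False
except AttributeError:
    raised = True

# _replace returns a new instance and leaves the original untouched.
moved = point._replace(x=5)
```

The only way to obtain different values is to create a new instance, which is what the `replace` wrapper of the model specification relies on.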

In [2]:
class ModelSpec(typing.NamedTuple):
    """Model specification.
     
    Class that contains all the information required to simulate
    a specified AR(1) process. It is the one and central place
    that contains this information throughout. It is an extended
    version of a namedtuple and thus ensures that the model
    specification remains immutable.
     
    Attributes:
        rho: a float indicating the degree of serial correlation
        periods: an integer for the length of the time horizon
    """
    # We need to define all fields and their type right at the
    # beginning. These cannot be changed after initialization
    # and no other fields added dynamically without raising
    # an error.
    periods: int
    rho: float
        
    # We make some of the private methods of the base class
    # public.
    def as_dict(self):
        return self._asdict()
    
    def replace(self, *args, **kwargs):
        return self._replace(*args, **kwargs)
    
    # We write wrappers for common use cases.
    def copy(self):
        return copy.deepcopy(self)

    # We specify some methods that have no counterpart in the 
    # base class.
    def to_yaml(self, fname='test.yml'):
        with open(fname, 'w') as out_file:
            yaml.dump(self._asdict(), out_file)
    
    def validate(self):
        """Validation of model specification.
        
        All validation is done here and no further checks are 
        necessary later in the program for the immutable 
        parameters describing the model. The for-loop ensures that 
        all fields require explicit checks.
        """
        for field in self._fields:
            attr = getattr(self, field)
            if field == 'periods':
                assert isinstance(attr, int)
                assert attr > 0
            elif field == 'rho':
                assert isinstance(attr, float)
            else:
                raise NotImplementedError('validation of {:} missing'.format(field))
    
    # We overwrite some of the intrinsic __dunder__ methods to increase
    # usability of our class.
    def __repr__(self):
        """Provides a string representation of the model specification for
        quick visual inspection.
        """
        str_ = ''
        for field in self._fields:
            str_ += '{:}: {:}\n'.format(field, getattr(self, field))
        return str_
    
    def __eq__(self, other):
        """Check the equality of two model specifications.
        
        Returns true if two model specifications have the same fields defined
        and all have the same value.
        
        Args:
            other: A ModelSpec class instance.
        
        Returns:
            A boolean corresponding to equality of specifications.
        """
        assert isinstance(other, type(self))
        assert set(self._fields) == set(other._fields)
        for field in self._fields:
            if getattr(self, field) != getattr(other, field):
                return False
        return True
 
    def __ne__(self, other):
        """Check the inequality of two model specifications."""
        return not self.__eq__(other)

In [3]:
def get_random_model_specification(constr=None):
    """Create a random model specification.
    
    Creates a random specification of the model which is useful 
    for testing the robustness of implementation and testing
    in general.
    
    Args:
        constr: A dictionary that contains the requested constraints.
            Each key names a field, which is set to the corresponding
            value.
            
            {'periods': 4, 'rho': 0.4}
    """
    def process_constraints(constr):
        """Impose a constraint on initialization dictionary.
        
        This function processes all constraints passed in by the user 
        for the random model specification.
        
        Args:
            constr: A dictionary which contains the constraints. 
        """ 
        if constr is None:
            constr = dict()
            
        # Membership tests ensure that falsy values such as
        # rho = 0.0 are honored as well.
        if 'periods' in constr:
            init_dict['periods'] = constr['periods']
        if 'rho' in constr:
            init_dict['rho'] = constr['rho']
    
    init_dict = dict()
    init_dict['rho'] = np.random.uniform(0.01, 0.99)
    init_dict['periods'] = np.random.randint(1, 10)
        
    process_constraints(constr)
        
    return init_dict

In [4]:
def get_model_obj(source=None, constr=None):
    """Get model specification.
    
    This is a factory method to create a model specification from
    a variety of different input types.
    
    Args:
        source: str, pathlib.Path, dictionary, or None specifying
            the input for the model specification.
        constr: A dictionary with the constraints imposed
            on a random initialization file.
    
    Returns:
        An instance of the ModelSpec class with the model
        specification.
    """    
    # We want to enforce the use of Path objects going forward.
    if isinstance(source, str):
        source = pathlib.Path(source)
 
    if isinstance(source, dict):
        model_spec = ModelSpec(**source)
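    elif isinstance(source, pathlib.Path):
        # Path covers both PosixPath and WindowsPath, and the context
        # manager makes sure the file is closed again.
        with open(source, 'r') as in_file:
            model_spec = ModelSpec(**yaml.load(in_file, Loader=yaml.FullLoader))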
    elif source is None:
        model_spec = ModelSpec(**get_random_model_specification(constr))
    else:
        raise NotImplementedError
    
    # We validate our model specification once and for all. 
    # Unfortunately, there is no way to do so at class
    # initialization as we cannot override the __new__
    # method.
    model_spec.validate()

    return model_spec
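
If validation at construction time matters, one alternative outside the NamedTuple approach is a frozen dataclass, whose `__post_init__` hook runs right after initialization. The sketch below is an assumption about how that might look and not part of the template. It requires Python 3.7 or higher, and the class name `FrozenModelSpec` is hypothetical.

```python
import dataclasses


@dataclasses.dataclass(frozen=True)
class FrozenModelSpec:
    """Hypothetical alternative to ModelSpec that validates itself on creation."""
    periods: int
    rho: float

    def __post_init__(self):
        # Runs automatically after the generated __init__.
        if not isinstance(self.periods, int) or self.periods <= 0:
            raise ValueError('periods must be a positive integer')
        if not isinstance(self.rho, float):
            raise ValueError('rho must be a float')


spec = FrozenModelSpec(periods=5, rho=0.5)

# Invalid specifications are rejected at construction.
try:
    FrozenModelSpec(periods=0, rho=0.5)
    rejected = False
except ValueError:
    rejected = True

# Fields cannot be reassigned on a frozen dataclass.
try:
    spec.rho = 0.9
    frozen = False
except dataclasses.FrozenInstanceError:
    frozen = True

# dataclasses.replace plays the role of _replace and re-runs the validation.
updated = dataclasses.replace(spec, rho=0.9)
```

The trade-off is that a dataclass is not a tuple, so positional indexing and unpacking are not available.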

# Use cases 

We want to explore some use cases with the proposed setup to test its usability.

* We can specify a model programmatically using a dictionary or read it in from a specification file.

In [5]:
init_dict = {'rho': 0.5, 'periods': 10}
spec = get_model_obj(init_dict);

In [6]:
# %load model_spec.yml
# periods: 2
# rho: 0.5

spec = get_model_obj('model_spec.yml')

* We want to easily access all fields.

In [7]:
spec = get_model_obj()
print('periods', spec.periods)
print('rho', spec.rho)

periods 3
rho 0.5720901984041954


* We want to easily compare different model specifications.

In [10]:
spec_1 = get_model_obj()
spec_2 = spec_1.replace(rho=0.9)
assert spec_1 != spec_2

spec_1 = get_model_obj()
spec_2 = spec_1.copy()
assert spec_1 == spec_2

* We want to be able to update the parameters of the model specification during an optimization.

> This part will be influenced by the final design of the estimagic class.

In [11]:
spec_1 = get_model_obj()
spec_2 = spec_1.replace(rho=0.9, periods=3)

spec_3 = spec_1.replace(**{'rho': 0.9, 'periods': 3})
assert spec_2 == spec_3
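
One possible pattern, sketched here under the assumption that an optimizer hands back a flat parameter vector, is to keep a list of free parameter names and map the vector onto the specification with `_replace`. `Spec`, `FREE_PARAMS`, and `update_spec` are hypothetical names for illustration and do not reflect the estimagic interface.

```python
import typing


class Spec(typing.NamedTuple):
    """Minimal stand-in for ModelSpec, for illustration only."""
    periods: int
    rho: float


# Hypothetical: names of the fields an optimizer is allowed to change.
FREE_PARAMS = ['rho']


def update_spec(spec, x):
    """Return a new specification with the free parameters set to x."""
    if len(x) != len(FREE_PARAMS):
        raise ValueError('expected {:} values'.format(len(FREE_PARAMS)))
    return spec._replace(**dict(zip(FREE_PARAMS, x)))


start = Spec(periods=10, rho=0.5)
candidate = update_spec(start, [0.9])
```

Because `_replace` returns a new instance, the starting specification stays available for comparison after each update.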

* We want to be able to go back and forth between the different ways a model is stored.

In [12]:
for _ in range(100):
    spec_1 = get_model_obj()
    spec_1.to_yaml()

    spec_2 = get_model_obj('test.yml')
    assert spec_1 == spec_2

    spec_3 = get_model_obj(pathlib.Path('test.yml'))
    assert spec_1 == spec_3
    
    spec_4 = get_model_obj(spec_1.as_dict())
    assert spec_1 == spec_4

* We want to easily validate the integrity of our model specification.

In [13]:
for _ in range(100):
    spec = get_model_obj()
    spec.validate()

* We want to easily inspect the model specification.

In [14]:
spec = get_model_obj()
print(spec)

periods: 1
rho: 0.08178924397625818



* We do not want to change parts of our model specification by accident.

In [15]:
spec = get_model_obj()

# We cannot change a field already defined.
with np.testing.assert_raises(AttributeError):
    spec.periods = 1

# We cannot add a field dynamically.
with np.testing.assert_raises(AttributeError):
    spec.period = 1

## Integration

This model class can then be used to work with the specified model.

In [16]:
def simulate(model_spec):
    """Simulate AR(1) process.
    
    This function simulates a simple AR(1) process based
    on the model specification passed in.
    
    Args:
        model_spec: An instance of the ModelSpec class.
    """
    assert isinstance(model_spec, ModelSpec)
    
    sequence = np.tile(np.nan, model_spec.periods)
    sequence[0] = 0
    for i in range(1, model_spec.periods):
        sequence[i] = np.random.normal() + model_spec.rho * sequence[i - 1]
    return sequence

model_spec = get_model_obj('test.yml')
simulate(model_spec);

We can then combine the testing features.

In [17]:
for _ in range(100):
    rslt = simulate(get_model_obj())
    assert not np.isnan(rslt).any()
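
The check above only rules out missing values. A stricter test, sketched here with the standard library so that it runs without dependencies, feeds a fixed list of shocks through the recursion and compares the result with the closed form $x_{t + 1} = \sum_{s = 1}^{t} \rho^{t - s} \epsilon_s$ that follows from unrolling the recursion from a zero starting value. The helper names and the seed are assumptions made for this illustration.

```python
import math
import random


def simulate_with_shocks(rho, shocks):
    """Run the AR(1) recursion from a zero start for the given shocks."""
    sequence = [0.0]
    for eps in shocks:
        sequence.append(rho * sequence[-1] + eps)
    return sequence


def closed_form(rho, shocks, t):
    """Value after t shocks, written as a weighted sum of those shocks."""
    return sum(rho ** (t - s) * shocks[s - 1] for s in range(1, t + 1))


rng = random.Random(123)  # fixed seed so that the run is reproducible
rho = 0.5
shocks = [rng.gauss(0.0, 1.0) for _ in range(9)]

sequence = simulate_with_shocks(rho, shocks)
matches = all(
    math.isclose(sequence[t], closed_form(rho, shocks, t), abs_tol=1e-12)
    for t in range(len(sequence))
)
```

Passing the shocks in explicitly, rather than drawing them inside the function, is what makes the comparison possible.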

# Additional resources

* https://paramtools.readthedocs.io is an attempt at a similar effort. It appears more general in the sense that the constraints we define in the validate() method are specified in an external file, thus increasing portability across different model types. It is worthwhile to follow its development.

# Comments

Please submit comments as GitHub issues with the label **dp-model-specification**.

# Extensions

* Extend template to explicitly deal with nested model specifications. For example, as in `init_dict['BASICS']['num_agents']`.
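
One way this could look, offered only as a sketch, is to nest one NamedTuple inside another. The names `BasicsSpec`, `NestedModelSpec`, and `from_nested_dict` are hypothetical, and the lowercase field `basics` stands in for the `'BASICS'` key.

```python
import typing


class BasicsSpec(typing.NamedTuple):
    """Hypothetical sub-specification holding basic settings."""
    num_agents: int


class NestedModelSpec(typing.NamedTuple):
    """Hypothetical specification with one nested block."""
    basics: BasicsSpec
    rho: float


def from_nested_dict(init_dict):
    """Build the nested specification from a nested dictionary."""
    return NestedModelSpec(
        basics=BasicsSpec(**init_dict['BASICS']),
        rho=init_dict['rho'],
    )


init_dict = {'BASICS': {'num_agents': 100}, 'rho': 0.5}
spec = from_nested_dict(init_dict)

# Nested fields are reached with chained attribute access.
num_agents = spec.basics.num_agents

# Updating a nested field means replacing the inner tuple first.
bigger = spec._replace(basics=spec.basics._replace(num_agents=200))
```

Immutability carries over to every level, since both the outer and the inner tuple reject attribute assignment.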