# Attr-Parameters
This notebook shows how to use `DictDataClass` and `Parameters` to specify parametrisation. This approach has the following advantages over other parametrisation schemes (e.g. JSON files):

1.  No KeyErrors
2.  Typed
3.  Default values, so only changes need to be specified (compare this to maintaining many JSON files, one per configuration)
4.  All attrs features for free, e.g. validators, comparators etc.
5.  IntelliSense in the IDE
6.  Both dict-like and dot access
7.  Interconversion to/from dict as well as flattened dict
8.  Easy de/serialisation to/from JSON and YAML
9.  Direct instantiation of objects from parameters specifying constructor arguments
10. Flexible and automatic disambiguation
11. Easy hyper-parameter search
12. And last but not least, almost no overhead!

In [1]:
import attr
from pprint import pprint as print
from typing import Union, Optional
from copy import deepcopy

In [2]:
from param_impl import DictDataClass, default_value, Settings, Parameters, InstantiationMixin

The use of the various components of the framework can be summarized as follows:

1.  `DictDataClass`: This forms the core of the framework. All parameter classes should at least subclass this (and additionally add the `@attr.s(auto_attribs=True)` decorator). All parameters should be specified as a hierarchy of classes subclassing it.
2.  `default_value`: A function used to specify default values when they are mutable, e.g. lists, class instances etc. Refer to [this](https://docs.python-guide.org/writing/gotchas/#mutable-default-arguments) to see why this is important.
3.  `Settings`: A list-type class used to specify multiple values for a parameter for grid search. Supports all operations of a regular Python `list`.
4.  `Parameters`: A subclass of `DictDataClass` that is useful when grid search is also desired. It adds a `get_settings()` method that returns all combinations of parameter values specified using `Settings`.
5.  `InstantiationMixin`: A mixin that can be added to a hierarchy of parameter classes, each of which specifies the constructor arguments of a particular class; that class is in turn named by a `type` attribute.
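
The mutable-default pitfall that `default_value` guards against can be reproduced in plain Python. The sketch below uses only the standard library; the two `make_tags_*` functions are hypothetical names chosen for illustration and are not part of `param_impl`.

```python
def make_tags_shared(tags=[]):
    # The default list is created once, when the function is defined,
    # so every call that omits `tags` appends to the same object.
    tags.append("x")
    return tags


def make_tags_fresh(tags=None):
    # Creating the list inside the body gives each call its own object.
    if tags is None:
        tags = []
    tags.append("x")
    return tags


first_shared = make_tags_shared()
second_shared = make_tags_shared()
first_fresh = make_tags_fresh()
second_fresh = make_tags_fresh()

# Both "shared" results are the very same list; the "fresh" ones are independent.
shared_is_same = first_shared is second_shared
fresh_is_same = first_fresh is second_fresh
```

The same reasoning applies to class-level defaults: a default that is built once and reused is shared by every instance unless a fresh copy is made per instance.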

When using this parametrisation framework, follow these rules:

1.  Don't forget to add the `@attr.s(auto_attribs=True)` decorator and subclass `Parameters`.
2.  For each attribute, specify the type and a default value. If the type is a class, use the `default_value()` function to specify the default.
3.  Some attributes may allow multiple types. In this case do the following:
    1.  specify the type as `Union[type1, type2, ..., typek]`
    2.  override the `@classmethod` `get_disambiguators()` so that it returns a dictionary with every `Union` type in that dataclass as a
    key and a "disambiguator" function as the value. A disambiguator function takes two inputs, an object and the union type
    listing all types the object may have, and returns the actual type of that object. To avoid repetition, specify
    all disambiguators in a single `disambiguate()` function.

The following cell contains a sample parameters hierarchy.

In [3]:
import typing
def disambiguate(o, t): 
    lambdas = {
        Union[AdamOptimizerParams, SGDOptimizerParams]: lambda o, _: SGDOptimizerParams if 'momentum' in o else AdamOptimizerParams,
        Union[int, str]: lambda *_: None
    }
    if t in lambdas:
        return lambdas[t](o, t)
    # elif t == Union[t1, t2, t3]:  # Write disambiguator like this when a simple lambda is not possible
    #     pass
    else:
        raise TypeError("Unknown Type")

class SimpleTagger:
    def __init__(self, embedding_param=50, encoder=None):
        self.embedding_param = embedding_param
        self.encoder = encoder

@attr.s(auto_attribs=True)
class EncoderParams(InstantiationMixin, Parameters):    # Note that mixins should come before the actual superclass.
    type: str = 'torch.nn.LSTM'
    hidden_size: int = 100
    num_layers: int = 1
    bias: bool = True
    dropout: float = 0
    bidirectional: bool = True

@attr.s(auto_attribs=True)
class ModelParams(InstantiationMixin, Parameters):
    type: str = '__main__.SimpleTagger'
    embedding_param: Union[int, str] = 50
    encoder: Optional[EncoderParams] = None

    @classmethod
    def get_disambiguators(cls):
        return {Union[int, str]: disambiguate}

@attr.s(auto_attribs=True)
class AdamOptimizerParams(Parameters):
    type: str = 'torch.optim.Adam'
    lr: float = 0.001
    dict_attr: typing.Dict = default_value({1: '1', 2: '2'})

@attr.s(auto_attribs=True)
class SGDOptimizerParams(Parameters):
    type: str = 'torch.optim.SGD'
    lr: float = 0.001
    momentum: float = 0.1

@attr.s(auto_attribs=True)
class TrainingParams(Parameters):
    num_epochs: int = 20
    optimizer: Union[AdamOptimizerParams,
                     SGDOptimizerParams] = default_value(AdamOptimizerParams())

    @classmethod
    def get_disambiguators(cls):
        return {Union[AdamOptimizerParams, SGDOptimizerParams]: disambiguate}

@attr.s(auto_attribs=True)
class TaggingParams(Parameters):
    random_seed: int = 42
    gpu_idx: int = -1
    model: ModelParams = default_value(ModelParams())
    training: TrainingParams = default_value(TrainingParams())
    
    def __attrs_post_init__(self):
        # this function is called by attr after __init__()
        # useful to modify default values
        pass

In [4]:
params = TaggingParams()

## Attr Freebies
All the features of attrs, such as dunder methods, comparators, validators etc., are available at your service!

In [5]:
# __repr__ method
print(params)

# equality comparison
params1 = TaggingParams()
params2 = TaggingParams(model=ModelParams(encoder=EncoderParams()))
print(params == params1)
print(params == params2)

TaggingParams(random_seed=42, gpu_idx=-1, model=ModelParams(type='__main__.SimpleTagger', embedding_param=50, encoder=None), training=TrainingParams(num_epochs=20, optimizer=AdamOptimizerParams(type='torch.optim.Adam', lr=0.001, dict_attr={1: '1', 2: '2'})))
True
False


## Serialisation
### Dicts
Easy conversion to and from nested dicts as well as flattened dicts. The latter is useful because many packages (e.g. comet_ml) do not support nested configurations.

In [6]:
# easy conversion to and from dict
print(params.to_dict())
print(TaggingParams.from_dict(params.to_dict()))

# the deserialized params are equal to original params
assert TaggingParams.from_dict(params.to_dict()) == params

{'gpu_idx': -1,
 'model': {'embedding_param': 50,
           'encoder': None,
           'type': '__main__.SimpleTagger'},
 'random_seed': 42,
 'training': {'num_epochs': 20,
              'optimizer': {'dict_attr': {1: '1', 2: '2'},
                            'lr': 0.001,
                            'type': 'torch.optim.Adam'}}}
TaggingParams(random_seed=42, gpu_idx=-1, model=ModelParams(type='__main__.SimpleTagger', embedding_param=50, encoder=None), training=TrainingParams(num_epochs=20, optimizer=AdamOptimizerParams(type='torch.optim.Adam', lr=0.001, dict_attr={1: '1', 2: '2'})))


In [7]:
# easy serialisation to flattened dict
print(params.to_flattened_dict())

# can use a different separator
print(params.to_flattened_dict(sep='_'))

# easy deserialisation from flattened dict
print(TaggingParams.from_flattened_dict(params.to_flattened_dict()))

# the deserialized params are equal to original params
assert TaggingParams.from_flattened_dict(params.to_flattened_dict()) == params

{'gpu_idx': -1,
 'model.embedding_param': 50,
 'model.encoder': None,
 'model.type': '__main__.SimpleTagger',
 'random_seed': 42,
 'training.num_epochs': 20,
 'training.optimizer.dict_attr': {1: '1', 2: '2'},
 'training.optimizer.lr': 0.001,
 'training.optimizer.type': 'torch.optim.Adam'}
{'gpu_idx': -1,
 'model_embedding_param': 50,
 'model_encoder': None,
 'model_type': '__main__.SimpleTagger',
 'random_seed': 42,
 'training_num_epochs': 20,
 'training_optimizer_dict_attr': {1: '1', 2: '2'},
 'training_optimizer_lr': 0.001,
 'training_optimizer_type': 'torch.optim.Adam'}
TaggingParams(random_seed=42, gpu_idx=-1, model=ModelParams(type='__main__.SimpleTagger', embedding_param=50, encoder=None), training=TrainingParams(num_epochs=20, optimizer=AdamOptimizerParams(type='torch.optim.Adam', lr=0.001, dict_attr={1: '1', 2: '2'})))
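
The flattening shown above can be pictured with a small standard-library sketch. The `flatten` and `unflatten` helpers below are hypothetical and only illustrate the idea; they are not the implementation behind `to_flattened_dict()` and `from_flattened_dict()`. Unlike the output above, this sketch flattens every non-empty nested dict, including plain dict values such as `dict_attr`.

```python
def flatten(nested, sep=".", prefix=""):
    """Flatten a nested dict into a single-level dict with joined keys."""
    flat = {}
    for key, value in nested.items():
        full_key = f"{prefix}{sep}{key}" if prefix else str(key)
        if isinstance(value, dict) and value:
            # Recurse into non-empty dicts, carrying the joined key as prefix.
            flat.update(flatten(value, sep=sep, prefix=full_key))
        else:
            flat[full_key] = value
    return flat


def unflatten(flat, sep="."):
    """Rebuild the nested dict by splitting each key on the separator."""
    nested = {}
    for joined_key, value in flat.items():
        *parents, leaf = joined_key.split(sep)
        node = nested
        for part in parents:
            node = node.setdefault(part, {})
        node[leaf] = value
    return nested


nested = {
    "gpu_idx": -1,
    "model": {"embedding_param": 50, "encoder": None},
    "training": {"optimizer": {"lr": 0.001}},
}
flat = flatten(nested)
round_trip_ok = unflatten(flat) == nested
```

In this naive version, a separator that also occurs inside attribute names (such as `_` in `embedding_param`) would make `unflatten` split keys in the wrong places, so `.` is the safer choice for round trips.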


### JSON, YAML
Helper methods to serialise to and deserialise from json and yaml.

In [8]:
# Easy serialisation to and deserialisation from json
with open('params.json', 'w') as f:
    params.to_json(f)
with open('params.json') as f:
    _params = TaggingParams.from_json(f)

# Note, however, that params != _params: the keys in dict_attr have become strings
# after deserialisation. This is a limitation of JSON, whose object keys are always strings.
print(params)
print(_params)
print(params == _params)

TaggingParams(random_seed=42, gpu_idx=-1, model=ModelParams(type='__main__.SimpleTagger', embedding_param=50, encoder=None), training=TrainingParams(num_epochs=20, optimizer=AdamOptimizerParams(type='torch.optim.Adam', lr=0.001, dict_attr={1: '1', 2: '2'})))
TaggingParams(random_seed=42, gpu_idx=-1, model=ModelParams(type='__main__.SimpleTagger', embedding_param=50, encoder=None), training=TrainingParams(num_epochs=20, optimizer=AdamOptimizerParams(type='torch.optim.Adam', lr=0.001, dict_attr={'1': '1', '2': '2'})))
False
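
The key conversion seen above is a property of JSON itself and can be reproduced with the standard `json` module alone. The final conversion back to integer keys is only one possible workaround, shown as an assumption; it is not something the framework is known to do.

```python
import json

original = {1: "1", 2: "2"}

# JSON object keys are always strings, so integer keys are written as "1", "2".
restored = json.loads(json.dumps(original))
keys_are_strings = all(isinstance(k, str) for k in restored)

# Hypothetical workaround: convert the keys back by hand after loading.
restored_int_keys = {int(k): v for k, v in restored.items()}
```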


In [9]:
# Easy serialisation to and deserialisation from yaml
with open('params.yaml', 'w') as f:
    params.to_yaml(f)
with open('params.yaml') as f:
    _params = TaggingParams.from_yaml(f)
assert params == _params

## Flexible Attribute Access

In [10]:
# Both dict-like and dot access are supported:
print(params.model.to_dict())
print(params['model'].to_dict())
assert params.model == params['model']

{'embedding_param': 50, 'encoder': None, 'type': '__main__.SimpleTagger'}
{'embedding_param': 50, 'encoder': None, 'type': '__main__.SimpleTagger'}


In [11]:
# can modify using both dict and attribute access
_params = deepcopy(params)
_params.model.encoder = EncoderParams()
_params['model']['embedding_param'] = 100
print(_params.to_dict())
print(params.to_dict())

{'gpu_idx': -1,
 'model': {'embedding_param': 100,
           'encoder': {'bias': True,
                       'bidirectional': True,
                       'dropout': 0,
                       'hidden_size': 100,
                       'num_layers': 1,
                       'type': 'torch.nn.LSTM'},
           'type': '__main__.SimpleTagger'},
 'random_seed': 42,
 'training': {'num_epochs': 20,
              'optimizer': {'dict_attr': {1: '1', 2: '2'},
                            'lr': 0.001,
                            'type': 'torch.optim.Adam'}}}
{'gpu_idx': -1,
 'model': {'embedding_param': 50,
           'encoder': None,
           'type': '__main__.SimpleTagger'},
 'random_seed': 42,
 'training': {'num_epochs': 20,
              'optimizer': {'dict_attr': {1: '1', 2: '2'},
                            'lr': 0.001,
                            'type': 'torch.optim.Adam'}}}
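
One way to picture how dict-like and dot access can coexist is a mixin that forwards item access to attribute access. The `DictAccessMixin` below is a hypothetical standard-library sketch, not the code of `DictDataClass`; in particular, raising `KeyError` for unknown names is a choice made for this sketch only.

```python
from dataclasses import dataclass


class DictAccessMixin:
    """Hypothetical mixin: obj['name'] reads and writes obj.name."""

    def __getitem__(self, key):
        try:
            return getattr(self, key)
        except AttributeError:
            raise KeyError(key) from None

    def __setitem__(self, key, value):
        # Refuse to create attributes that the class does not declare.
        if not hasattr(self, key):
            raise KeyError(key)
        setattr(self, key, value)


@dataclass
class Model(DictAccessMixin):
    embedding_param: int = 50


m = Model()
m["embedding_param"] = 100       # dict-style write
value_via_dot = m.embedding_param  # dot-style read sees the same value
```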


## Direct Instantiation

In [12]:
mp = ModelParams(encoder=EncoderParams())
print(mp.to_dict())

# This would raise an error because EncoderParams does not specify input_size, which is a required argument for torch.nn.LSTM
# m = mp.instantiate()

# This will work
m = mp.instantiate(encoder={'input_size': 50})
print(m)
print(m.encoder)

# If EncoderParams itself had some attributes that themselves can be instantiated and 
# do not specify all parameters then those also need to be passed as a nested dictionary. 
# The arguments to `instantiate()` can also be used to override parameter values.

{'embedding_param': 50,
 'encoder': {'bias': True,
             'bidirectional': True,
             'dropout': 0,
             'hidden_size': 100,
             'num_layers': 1,
             'type': 'torch.nn.LSTM'},
 'type': '__main__.SimpleTagger'}
<__main__.SimpleTagger object at 0x7fae0899dfa0>
LSTM(50, 100, bidirectional=True)
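
The idea of building an object from a `type` string plus constructor arguments can be sketched with `importlib`. How `InstantiationMixin` resolves `type` internally is not shown in this notebook, so the `resolve` and `instantiate` helpers below are assumptions for illustration; they handle a flat dict only, whereas the call above also accepts nested overrides.

```python
import importlib


def resolve(dotted_path):
    """Import `package.module.Name` and return the `Name` object."""
    module_name, _, attr_name = dotted_path.rpartition(".")
    module = importlib.import_module(module_name)
    return getattr(module, attr_name)


def instantiate(spec, **overrides):
    """Build an object from a dict holding a `type` path and keyword arguments."""
    kwargs = {k: v for k, v in spec.items() if k != "type"}
    kwargs.update(overrides)  # call-time values win over stored ones
    return resolve(spec["type"])(**kwargs)


# A standard-library class stands in for torch.nn.LSTM here.
spec = {"type": "datetime.timedelta", "hours": 1}
delta = instantiate(spec, minutes=30)
total_seconds = delta.total_seconds()
```

Keeping the class path as a string means the whole specification stays serialisable, while the import only happens at instantiation time.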


## Hyper-parameter Search
### Directly using `Parameters`
`Parameters` can be used directly to specify the values to try for each parameter and then to obtain every setting in the grid, i.e. the Cartesian product of the values given for each parameter.

In [13]:
params = TaggingParams(model=ModelParams(encoder=EncoderParams()))
print(params.to_dict())

{'gpu_idx': -1,
 'model': {'embedding_param': 50,
           'encoder': {'bias': True,
                       'bidirectional': True,
                       'dropout': 0,
                       'hidden_size': 100,
                       'num_layers': 1,
                       'type': 'torch.nn.LSTM'},
           'type': '__main__.SimpleTagger'},
 'random_seed': 42,
 'training': {'num_epochs': 20,
              'optimizer': {'dict_attr': {1: '1', 2: '2'},
                            'lr': 0.001,
                            'type': 'torch.optim.Adam'}}}


Use `Settings` to specify different values for each parameter:


In [14]:
params.model.encoder.hidden_size = Settings([50, 100])
params.training.optimizer.lr = Settings([1e-2, 1e-1])

Now just use the `get_settings()` function to get all the different possible settings:

In [15]:
settings = params.get_settings()
print(len(settings))        # will be equal to the product of the number of values for each parameter
for setting in settings:
    print(setting.to_flattened_dict())

4
{'gpu_idx': -1,
 'model.embedding_param': 50,
 'model.encoder.bias': True,
 'model.encoder.bidirectional': True,
 'model.encoder.dropout': 0,
 'model.encoder.hidden_size': 50,
 'model.encoder.num_layers': 1,
 'model.encoder.type': 'torch.nn.LSTM',
 'model.type': '__main__.SimpleTagger',
 'random_seed': 42,
 'training.num_epochs': 20,
 'training.optimizer.dict_attr': {1: '1', 2: '2'},
 'training.optimizer.lr': 0.01,
 'training.optimizer.type': 'torch.optim.Adam'}
{'gpu_idx': -1,
 'model.embedding_param': 50,
 'model.encoder.bias': True,
 'model.encoder.bidirectional': True,
 'model.encoder.dropout': 0,
 'model.encoder.hidden_size': 50,
 'model.encoder.num_layers': 1,
 'model.encoder.type': 'torch.nn.LSTM',
 'model.type': '__main__.SimpleTagger',
 'random_seed': 42,
 'training.num_epochs': 20,
 'training.optimizer.dict_attr': {1: '1', 2: '2'},
 'training.optimizer.lr': 0.1,
 'training.optimizer.type': 'torch.optim.Adam'}
{'gpu_idx': -1,
 'model.embedding_param': 50,
 'model.encoder.bia

The same works for attributes whose type is a list or any other more complex type.

In [16]:
from typing import List
@attr.s(auto_attribs=True)
class TempParams(Parameters):
    list_param: Optional[List[int]] = None
p = TempParams()
print(p)
p.list_param = Settings([[1], [1,2]])
s = p.get_settings()
print(len(s))        # will be equal to the product of the number of values for each parameter
for _s in s:
    print(_s)

TempParams(list_param=None)
2
TempParams(list_param=[1])
TempParams(list_param=[1, 2])
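
The grid expansion described in this section amounts to a Cartesian product over the parameters that carry several candidate values. The sketch below works on a flattened dict and uses a hypothetical `Choices` marker class in place of `Settings`; it illustrates the idea and is not the implementation of `get_settings()`.

```python
from itertools import product


class Choices(list):
    """Hypothetical marker: a list of candidate values for one parameter."""


def expand(flat_params):
    """Return one dict per combination of the `Choices` values."""
    grid_keys = [k for k, v in flat_params.items() if isinstance(v, Choices)]
    combos = product(*(flat_params[k] for k in grid_keys))
    return [{**flat_params, **dict(zip(grid_keys, combo))} for combo in combos]


flat = {
    "model.encoder.hidden_size": Choices([50, 100]),
    "training.optimizer.lr": Choices([1e-2, 1e-1]),
    "random_seed": 42,
}
settings = expand(flat)

# A plain list is an ordinary value; only the marker class triggers expansion.
list_settings = expand({"list_param": Choices([[1], [1, 2]])})
```

Using a dedicated marker type rather than a bare `list` is what lets list-valued parameters, as in the cell above, take part in the search without being mistaken for a set of alternatives.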


### Using Ray Tune without a Search Algorithm

In [17]:
from ray import tune 



In [18]:
params = TaggingParams(model=ModelParams(encoder=EncoderParams()))
print(params.to_dict())
params.model.encoder.hidden_size = tune.grid_search([50, 100])
params.training.optimizer.lr = tune.loguniform(1e-3, 1e-1)
print(params.to_flattened_dict())
# Now just pass `params.to_flattened_dict()` as `config` parameter to `tune.run()`.

{'gpu_idx': -1,
 'model': {'embedding_param': 50,
           'encoder': {'bias': True,
                       'bidirectional': True,
                       'dropout': 0,
                       'hidden_size': 100,
                       'num_layers': 1,
                       'type': 'torch.nn.LSTM'},
           'type': '__main__.SimpleTagger'},
 'random_seed': 42,
 'training': {'num_epochs': 20,
              'optimizer': {'dict_attr': {1: '1', 2: '2'},
                            'lr': [0.01, 0.1],
                            'type': 'torch.optim.Adam'}}}
{'gpu_idx': -1,
 'model.embedding_param': 50,
 'model.encoder.bias': True,
 'model.encoder.bidirectional': True,
 'model.encoder.dropout': 0,
 'model.encoder.hidden_size': {'grid_search': [50, 100]},
 'model.encoder.num_layers': 1,
 'model.encoder.type': 'torch.nn.LSTM',
 'model.type': '__main__.SimpleTagger',
 'random_seed': 42,
 'training.num_epochs': 20,
 'training.optimizer.dict_attr': {1: '1', 2: '2'},
 'training.optimizer.lr': 