# Autocfg Basics

This notebook demonstrates the basic functionality of `autocfg`.

In [1]:
from autocfg import dataclass, field  # drop-in replacement for the dataclass decorator from the standard dataclasses module

### dataclass decorator

The `dataclass` decorator is used exactly like the native one from `dataclasses`, introduced in Python 3.7. On Python 3.6 the backported `dataclasses` package is used, so the minimum requirement of the `autocfg` package is Python 3.6.

Let's create some example configurations of the kind you would use in an experiment.

In [2]:
from typing import Union
# first is the common training config
@dataclass
class TrainConfig:
  batch_size : int = 32
  learning_rate : Union[float, int] = 1e-3
  weight_decay : float = 1e-5

Next comes a nested config:

In [3]:
# supports nested config
@dataclass
class MyExp:
  train : TrainConfig = TrainConfig() # or field(default_factory=TrainConfig)
  num_class : int = 1000
  depth : int = 50

Note that a mutable default such as a `TrainConfig` instance is normally not allowed in a dataclass, but `autocfg` makes it simpler to specify a default value for a nested config. The explicit form `field(default_factory=TrainConfig)` is still recommended.
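
For comparison, here is a minimal sketch of the `default_factory` form using only the standard `dataclasses` module (not `autocfg`). The standard library rejects defaults such as `list`, `dict` or `set` (and, in recent Python versions, any unhashable default) because a single default object would be shared by every instance. The class names `StdTrain` and `StdExp` are hypothetical and exist only for this illustration.

```python
from dataclasses import dataclass, field


@dataclass
class StdTrain:  # hypothetical stand-in for TrainConfig
    batch_size: int = 32


@dataclass
class StdExp:  # hypothetical stand-in for MyExp
    # default_factory builds a fresh StdTrain for every StdExp instance
    train: StdTrain = field(default_factory=StdTrain)
    depth: int = 50


a, b = StdExp(), StdExp()
a.train.batch_size = 128
print(a.train.batch_size, b.train.batch_size)  # 128 32
```

Because each instance gets its own nested object, changing `a.train` leaves `b.train` untouched.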

### Initialize, and direct access

In [4]:
# we can initialize the plain configs as-is
train = TrainConfig()
train1 = TrainConfig(batch_size=128)
print('train:', train)
print('train1:', train1)

train: TrainConfig(batch_size=32, learning_rate=0.001, weight_decay=1e-05)
train1: TrainConfig(batch_size=128, learning_rate=0.001, weight_decay=1e-05)


In [5]:
# the exp config, a nested class
exp = MyExp(depth=18)
print('exp with default train:', exp)

exp with default train: MyExp(train=TrainConfig(batch_size=32, learning_rate=0.001, weight_decay=1e-05), num_class=1000, depth=18)


In [6]:
# a config can be viewed as a plain dict
print('dict:', exp.asdict())

dict: {'train': {'batch_size': 32, 'learning_rate': 0.001, 'weight_decay': 1e-05}, 'num_class': 1000, 'depth': 18}


In [7]:
# To modify the values, attributes can be directly accessed
exp.num_class = 10
exp.train.learning_rate = 1.5
tt = TrainConfig()
tt.learning_rate = 1000
print('updated exp:', exp)

updated exp: MyExp(train=TrainConfig(batch_size=32, learning_rate=1.5, weight_decay=1e-05), num_class=10, depth=18)


### Type checking

Type validation is built in for every field. For example, an invalid `learning_rate` will trigger a `TypeError`.

In [8]:
try:
    invalid_train = TrainConfig(learning_rate='1.0')
except TypeError as e:
    print('raised error:', e)

raised error: `<class '__main__.TrainConfig'>.learning_rate` requires typing.Union[float, int], given <class 'str'>


More importantly, `autocfg` also safeguards `setattr`, so you cannot assign invalid values to fields.

In [9]:
invalid_train = TrainConfig(learning_rate=1.0)
try:
    invalid_train.batch_size = 0.1
except TypeError as e:
    print('raised error:', e)

raised error: `<class '__main__.TrainConfig'>.batch_size` requires <class 'int'>, given <class 'float'>
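
To make the idea concrete, the following is a rough sketch of how assignment-time validation could be layered on a standard dataclass. It is not `autocfg`'s actual implementation: the mixin `TypeCheckedMixin`, the helper `_matches` and the class `CheckedTrainConfig` are hypothetical names, and the check only understands plain classes and `Union`s of plain classes.

```python
from dataclasses import dataclass
from typing import Union, get_args, get_origin, get_type_hints


def _matches(value, expected) -> bool:
    """True if value satisfies a plain class or a Union of plain classes."""
    if get_origin(expected) is Union:
        return any(_matches(value, arg) for arg in get_args(expected))
    return isinstance(value, expected)


class TypeCheckedMixin:
    """Hypothetical mixin: validate each assignment against the annotation."""

    def __setattr__(self, name, value):
        expected = get_type_hints(type(self)).get(name)
        if expected is not None and not _matches(value, expected):
            raise TypeError(
                f"{type(self).__name__}.{name} requires {expected}, "
                f"given {type(value)}"
            )
        super().__setattr__(name, value)


@dataclass
class CheckedTrainConfig(TypeCheckedMixin):
    batch_size: int = 32
    learning_rate: Union[float, int] = 1e-3


cfg = CheckedTrainConfig()
cfg.learning_rate = 2  # accepted: int is part of the Union
try:
    cfg.batch_size = 0.1  # rejected: float is not an int
except TypeError as e:
    print('raised error:', e)
```

The generated `__init__` of a non-frozen dataclass assigns fields through `__setattr__`, so in this sketch the same check also covers construction.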


### Serialization

`autocfg` prefers `yaml` as the human-readable format for serialization, since it is easy to view and edit by hand.

In [10]:
# save to 'exp.yaml'
exp.save('exp.yaml')
!cat exp.yaml

# MyExp
depth: 18
num_class: 10
train:
  batch_size: 32
  learning_rate: 1.5
  weight_decay: 1.0e-05


In [11]:
# loading directly from a file is also straightforward
exp1 = MyExp.load('exp.yaml')
assert exp == exp1

In [12]:
# a python file-like object can be handy in case in-memory operation is preferred
import io
f = io.StringIO('depth: 1000')
exp2 = MyExp.load(f)
print(exp2)
assert exp2.depth == 1000

MyExp(train=TrainConfig(batch_size=32, learning_rate=0.001, weight_decay=1e-05), num_class=1000, depth=1000)
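
The round trip above can be imitated with the standard library alone. The sketch below swaps YAML for JSON to avoid a third-party dependency, and it is only an illustration of the idea, not how `autocfg` works internally. The helpers `from_dict`, `save` and `load`, and the classes `JsonTrain` and `JsonExp`, are hypothetical.

```python
import io
import json
from dataclasses import asdict, dataclass, field, fields, is_dataclass


@dataclass
class JsonTrain:
    batch_size: int = 32
    learning_rate: float = 1e-3


@dataclass
class JsonExp:
    train: JsonTrain = field(default_factory=JsonTrain)
    depth: int = 50


def from_dict(cfg_cls, data):
    """Rebuild a (possibly nested) dataclass; missing keys keep their defaults."""
    kwargs = {}
    for f in fields(cfg_cls):
        if f.name not in data:
            continue
        value = data[f.name]
        kwargs[f.name] = from_dict(f.type, value) if is_dataclass(f.type) else value
    return cfg_cls(**kwargs)


def save(cfg, stream):
    json.dump(asdict(cfg), stream)


def load(cfg_cls, stream):
    return from_dict(cfg_cls, json.load(stream))


buffer = io.StringIO()
original = JsonExp(depth=18)
save(original, buffer)
buffer.seek(0)
restored = load(JsonExp, buffer)
partial = load(JsonExp, io.StringIO('{"depth": 1000}'))
print(restored == original, partial.depth, partial.train.batch_size)  # True 1000 32
```

As in the `io.StringIO('depth: 1000')` cell, a partial document only overrides the fields it mentions.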


### Update config in-place
Although configs can be updated by direct attribute access and assignment, an `update` method, similar to updating a nested dict, is more convenient for changing many fields at once.

In [13]:
exp2.update(exp1)
print(exp2)
assert exp2 == exp1

MyExp(train=TrainConfig(batch_size=32, learning_rate=1.5, weight_decay=1e-05), num_class=10, depth=18)


In [14]:
# update also supports files and file-like objects into which configs have been dumped
exp2 = MyExp(num_class=200)
exp2.update('exp.yaml')
print(exp2)
assert exp2 == exp

MyExp(train=TrainConfig(batch_size=32, learning_rate=1.5, weight_decay=1e-05), num_class=10, depth=18)


In [15]:
# update with a dict
exp2.update({'num_class': 10, 'train': {'learning_rate': 1.0}})
print(exp2)
assert exp2.num_class == 10 and exp2.train.learning_rate == 1.0

MyExp(train=TrainConfig(batch_size=32, learning_rate=1.0, weight_decay=1e-05), num_class=10, depth=18)


In [16]:
# update a subsection only, specified by `key=xxx` , e.g., `train` only
exp2 = MyExp(num_class=200)
exp2.update(exp, key='train')
print('only update exp2.train', exp2)
assert exp2.num_class == 200 and exp2.train.learning_rate == exp.train.learning_rate
# supports multiple keys
exp2.update(exp, key=('train', 'num_class'))

only update exp2.train MyExp(train=TrainConfig(batch_size=32, learning_rate=1.5, weight_decay=1e-05), num_class=200, depth=50)


In [17]:
# update with kwargs with key=value pairs
exp2 = MyExp(num_class=200)
exp2.update(num_class=100, depth=100)
print(exp2)
# useful in a nested config
exp2.update(train={'batch_size': 4})
# or
exp2.train.update(batch_size=4, learning_rate=1.0)
print(exp2)

MyExp(train=TrainConfig(batch_size=32, learning_rate=0.001, weight_decay=1e-05), num_class=100, depth=100)
MyExp(train=TrainConfig(batch_size=4, learning_rate=1.0, weight_decay=1e-05), num_class=100, depth=100)
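
The nested-dict behaviour of `update` can be pictured as a short recursion. This is a sketch on standard dataclasses under the assumption that nested sections are themselves dataclasses; `update_nested`, `UpdTrain` and `UpdExp` are hypothetical names, and the real `update` accepts more input kinds (files, config objects, `key=` filters) than shown here.

```python
from dataclasses import dataclass, field, is_dataclass


@dataclass
class UpdTrain:
    batch_size: int = 32
    learning_rate: float = 1e-3


@dataclass
class UpdExp:
    train: UpdTrain = field(default_factory=UpdTrain)
    num_class: int = 1000


def update_nested(cfg, changes: dict):
    """Apply a (possibly nested) dict of changes to cfg in place."""
    for key, value in changes.items():
        if not hasattr(cfg, key):
            raise KeyError(f"unknown config field: {key}")
        current = getattr(cfg, key)
        if is_dataclass(current) and isinstance(value, dict):
            update_nested(current, value)  # descend into the sub-config
        else:
            setattr(cfg, key, value)
    return cfg


exp_sketch = UpdExp()
update_nested(exp_sketch, {'num_class': 10, 'train': {'learning_rate': 1.0}})
print(exp_sketch)
```

Only the fields named in the dict change; `train.batch_size` keeps its previous value.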


### Make copy or merge configs

Sometimes you want to make a copy of a config, or merge changes into a new object, without modifying the original configuration. In such cases, you can use the built-in `copy` module or the `merge` method.

In [18]:
import copy
orig_cfg = MyExp()
new_cfg = copy.deepcopy(orig_cfg)
new_cfg.train.learning_rate = 0.5
print(orig_cfg, '\nversus...\n', new_cfg)

MyExp(train=TrainConfig(batch_size=32, learning_rate=0.001, weight_decay=1e-05), num_class=1000, depth=50) 
versus...
 MyExp(train=TrainConfig(batch_size=32, learning_rate=0.5, weight_decay=1e-05), num_class=1000, depth=50)


In [19]:
# merge a yaml file directly
new_cfg = orig_cfg.merge('exp.yaml')
print(orig_cfg, '\nversus...\n', new_cfg)
# merge with a dict
new_cfg = orig_cfg.merge({'train': {'learning_rate': 1.5}})
print(new_cfg)
# merge from another config object
new_cfg = orig_cfg.merge(exp2)
print(new_cfg)

MyExp(train=TrainConfig(batch_size=32, learning_rate=0.001, weight_decay=1e-05), num_class=1000, depth=50) 
versus...
 MyExp(train=TrainConfig(batch_size=32, learning_rate=1.5, weight_decay=1e-05), num_class=10, depth=18)
MyExp(train=TrainConfig(batch_size=32, learning_rate=1.5, weight_decay=1e-05), num_class=1000, depth=50)
MyExp(train=TrainConfig(batch_size=4, learning_rate=1.0, weight_decay=1e-05), num_class=100, depth=100)


Either way, `copy` and `merge` keep the original config intact while quickly updating the desired fields. In particular, `merge` supports all the updating methods shown above for `update`.
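
A non-mutating merge can be sketched with the standard `dataclasses.replace`, which builds a new instance instead of assigning to the old one. The function `merged` and the classes `MergeTrain` and `MergeExp` are hypothetical, and this is an illustration of the idea rather than `autocfg`'s `merge`.

```python
from dataclasses import dataclass, field, is_dataclass, replace


@dataclass
class MergeTrain:
    batch_size: int = 32
    learning_rate: float = 1e-3


@dataclass
class MergeExp:
    train: MergeTrain = field(default_factory=MergeTrain)
    depth: int = 50


def merged(cfg, changes: dict):
    """Return a new config with changes applied; cfg itself is not modified."""
    updates = {}
    for key, value in changes.items():
        current = getattr(cfg, key)
        if is_dataclass(current) and isinstance(value, dict):
            updates[key] = merged(current, value)  # new nested instance
        else:
            updates[key] = value
    return replace(cfg, **updates)


orig = MergeExp()
new = merged(orig, {'train': {'learning_rate': 1.5}, 'depth': 18})
print(orig.train.learning_rate, new.train.learning_rate)  # 0.001 1.5
```

One caveat of this sketch: nested sections that are not mentioned in `changes` are shared between the old and new objects, so `copy.deepcopy` is the safer choice when full independence matters.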

### Argparse integration

Manually writing an argparse parser to handle command-line inputs is time-consuming. `autocfg` handles this out of the box, saving you a lot of effort.

Since Jupyter notebooks do not play well with `sys.argv`, we will use list inputs to simulate command-line inputs.

In [20]:
# auto-generated help message
try:
    new_exp = MyExp.parse_args(['-h'])
except SystemExit as e:
    pass

usage: MyExp's auto argument parser [-h] [--train.batch-size TRAIN.BATCH_SIZE]
                                    [--train.learning-rate TRAIN.LEARNING_RATE]
                                    [--train.weight-decay TRAIN.WEIGHT_DECAY]
                                    [--num-class NUM_CLASS] [--depth DEPTH]

optional arguments:
  -h, --help            show this help message and exit
  --train.batch-size TRAIN.BATCH_SIZE
                        batch_size (default: 32)
  --train.learning-rate TRAIN.LEARNING_RATE
                        learning_rate (default: 0.001)
  --train.weight-decay TRAIN.WEIGHT_DECAY
                        weight_decay (default: 1e-05)
  --num-class NUM_CLASS
                        num_class (default: 1000)
  --depth DEPTH         depth (default: 50)


The default values are also shown in the help output above.

In [21]:
# normal overriding
new_exp = MyExp.parse_args(['--depth', '100'])
print(new_exp)

MyExp(train=TrainConfig(batch_size=32, learning_rate=0.001, weight_decay=1e-05), num_class=1000, depth=100)


In [22]:
# nested overriding
new_exp = MyExp.parse_args(['--train.weight-decay', '100.0'])
print(new_exp)

MyExp(train=TrainConfig(batch_size=32, learning_rate=0.001, weight_decay=100.0), num_class=1000, depth=50)
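
For readers curious how such a parser can be derived from a config class, here is a minimal sketch built on the standard `argparse` and `dataclasses` modules. It assumes every leaf field has a simple callable type (`int`, `float`, `str`) and a plain default; the helpers `add_fields`, `build` and `parse_config`, and the classes `ArgTrain` and `ArgExp`, are hypothetical and do not reflect `autocfg`'s internals.

```python
import argparse
from dataclasses import dataclass, field, fields, is_dataclass


@dataclass
class ArgTrain:
    batch_size: int = 32
    weight_decay: float = 1e-5


@dataclass
class ArgExp:
    train: ArgTrain = field(default_factory=ArgTrain)
    depth: int = 50


def add_fields(parser, cfg_cls, prefix=''):
    """Register one --flag per leaf field, using dotted names for nesting."""
    for f in fields(cfg_cls):
        if is_dataclass(f.type):
            add_fields(parser, f.type, prefix=f'{prefix}{f.name}.')
        else:
            parser.add_argument(
                f"--{prefix}{f.name.replace('_', '-')}",
                dest=f'{prefix}{f.name}',
                type=f.type,
                default=f.default,
                help=f'{f.name} (default: %(default)s)',
            )


def build(cfg_cls, flat, prefix=''):
    """Rebuild the nested config from the flat, dotted argparse namespace."""
    kwargs = {}
    for f in fields(cfg_cls):
        if is_dataclass(f.type):
            kwargs[f.name] = build(f.type, flat, prefix=f'{prefix}{f.name}.')
        else:
            kwargs[f.name] = flat[f'{prefix}{f.name}']
    return cfg_cls(**kwargs)


def parse_config(cfg_cls, argv):
    parser = argparse.ArgumentParser()
    add_fields(parser, cfg_cls)
    return build(cfg_cls, vars(parser.parse_args(argv)))


parsed = parse_config(ArgExp, ['--train.batch-size', '4', '--depth', '18'])
print(parsed)
```

Flags that are not passed fall back to the dataclass defaults, which mirrors the overriding behaviour shown in the cells above.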


### Diff configurations

Knowing what has been modified is an important feature of a configuration system. `autocfg` provides a `diff` method to report the changes.

In [23]:
from pprint import pprint
pprint(new_exp.diff(exp2))

['root.depth           50 != 100',
 'root.num_class       1000 != 100',
 'root.train.batch_size 32 != 4',
 'root.train.weight_decay 100.0 != 1e-05',
 'root.train.learning_rate 0.001 != 1.0']
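
A diff of this kind boils down to walking both configs field by field. The sketch below does that for standard dataclasses, assuming both objects are instances of the same class; `diff_configs`, `DiffTrain` and `DiffExp` are hypothetical names, and the output format only loosely imitates the one shown above.

```python
from dataclasses import dataclass, field, fields, is_dataclass


@dataclass
class DiffTrain:
    batch_size: int = 32


@dataclass
class DiffExp:
    train: DiffTrain = field(default_factory=DiffTrain)
    depth: int = 50


def diff_configs(a, b, path='root'):
    """List 'path left != right' for every leaf field whose values differ."""
    lines = []
    for f in fields(a):
        left, right = getattr(a, f.name), getattr(b, f.name)
        name = f'{path}.{f.name}'
        if is_dataclass(left) and is_dataclass(right):
            lines.extend(diff_configs(left, right, name))  # recurse into section
        elif left != right:
            lines.append(f'{name} {left} != {right}')
    return lines


changes = diff_configs(DiffExp(), DiffExp(train=DiffTrain(batch_size=4), depth=100))
print(changes)  # ['root.train.batch_size 32 != 4', 'root.depth 50 != 100']
```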
