# Different training procedures

## What is it?

`train_steps` – configuration of ***different training procedures***. It lets you optimize the parameters of a selected scope with a chosen optimizer, loss, and learning rate decay.

`scope` – subset of weights to optimize during training. It can be either a string or a sequence of strings.
The value ```''``` is reserved for optimizing all trainable variables. A ```-``` sign before a name stands for the complement: optimize everything except the given scope. Scopes are selected by masks over the paths to the model's weight tensors.
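
The scope rules above can be illustrated with a small standalone sketch. It is not batchflow's implementation: the function `select_variables` and the variable paths are hypothetical, and both prefix matching on the path and taking the union of several scopes are assumptions made for illustration.

```python
def select_variables(names, scope):
    """Return the variable paths selected by `scope` (illustration only)."""
    # A single string is treated as a one-element sequence of scopes.
    scopes = [scope] if isinstance(scope, str) else list(scope)
    selected = []
    for name in names:
        for item in scopes:
            if item.startswith('-'):
                # '-name' is the complement: everything outside that scope.
                match = not name.startswith(item[1:])
            else:
                # '' selects everything, since every path starts with ''.
                match = name.startswith(item)
            if match:
                selected.append(name)
                break
    return selected


# Hypothetical variable paths for an encoder-decoder model.
names = ['initial_block/conv/kernel',
         'body/encoder/conv/kernel',
         'body/decoder/conv/kernel',
         'head/conv/kernel']

everything = select_variables(names, '')
all_but_head = select_variables(names, '-head')
encoder_part = select_variables(names, ['body/encoder', 'initial_block'])
```

Here `everything` holds all four paths, `all_but_head` omits the `head` variable, and `encoder_part` holds the `initial_block` and `body/encoder` variables.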


In [1]:
import os
import sys
import warnings

sys.path.append('../../..')
from batchflow import Pipeline, B, C, V, D
from batchflow.opensets import MNIST
from batchflow.models.tf import EncoderDecoder

Specify which GPU(s) to use. See the [CUDA documentation](https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#env-vars) for details.

In [2]:
%env CUDA_DEVICE_ORDER=PCI_BUS_ID
%env CUDA_VISIBLE_DEVICES=5

env: CUDA_DEVICE_ORDER=PCI_BUS_ID
env: CUDA_VISIBLE_DEVICES=5


## Create a dataset, define a pipeline config, define a default model config


In [3]:
BATCH_SIZE = 100

dataset = MNIST(bar=True)

config = dict(model=EncoderDecoder)

model_config = {'inputs': {'images/shape': B.image_shape,
                           'masks/shape': B.image_shape},
                'initial_block': {'inputs': 'images'},
                'body/encoder/num_stages' : 3,
                'body/encoder/blocks': {'layout': 'cna cna',
                                        'kernel_size': 3,
                                        'filters': [16, 32, 128]},
                'body/decoder/blocks': {'layout': 'cna cna',
                                        'kernel_size': 3,
                                        'filters': [64, 32, 16]},
                'head': dict(layout='cna', kernel_size=3, filters=16),
                'loss': 'mse'}

100%|██████████| 8/8 [00:02<00:00,  1.91it/s]


# Add different training procedures 

Name each *train step* as you wish (the name is a key of the `train_steps` dict). For each one, choose the optimizer, the scope of weights of your neural network, the loss,
and the learning rate decay config ([Available losses and decays](https://analysiscenter.github.io/batchflow/api/batchflow.models.tf.base.html)). For example:


```python
'train_steps': {'name_of_train_step': {'optimizer': 'Adam', 
                                       'scope': 'block/group/layer', 
                                       'decay': lr_decay_config, 
                                       'loss': 'mse'}}
```

The scope can also be given as a list of scopes:
```python
'scope': ['block/group-0/some_layer', 'block/some_group']
```
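
As a sketch of how the three forms of scope described earlier might sit side by side in one configuration, the dictionary below uses an empty string, a complement, and a list. The step names (`'all'`, `'all_but_head'`, `'encoder_and_head'`) are hypothetical and chosen only for illustration; the scope paths are the ones used by the model in this notebook.

```python
train_steps = {
    # '' is reserved for all trainable variables
    'all': {'optimizer': 'Adam', 'scope': '', 'loss': 'mse'},
    # a leading '-' selects everything except the named scope
    'all_but_head': {'optimizer': 'Adam', 'scope': '-head', 'loss': 'mse'},
    # a list selects several scopes at once
    'encoder_and_head': {'optimizer': 'Adam',
                         'scope': ['body/encoder', 'head'],
                         'loss': 'mse'},
}
```

Such a dictionary would then be passed under the `'train_steps'` key of the model config.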

In [4]:
model_config.update({'train_steps':
                     {
                      'encoder': {'optimizer': 'Adam', 'scope': ['body/encoder', 'initial_block'], 'loss': 'mse'},
                      'decoder': {'optimizer': 'Adam', 'scope': ['body/decoder', 'head'], 'loss': 'mse'}
                     }})

An optimizer together with its decay may be reused by another *train step*. To do that, use the key `'use'` with the name of the *train step* to reuse.

For example:
```python
'all': {'optimizer': 'Adam', 'loss': 'mse', 'decay': lr_decay_config}
    
'head': {'use': 'all', 'scope': 'head', 'loss': 'ce'}
 ```

If the scope isn't set (as in the case of `'all'`), it contains all trainable variables.
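
One way to picture the effect of `'use'` and of the default scope is the plain-dictionary sketch below. It is only an illustration under stated assumptions: `resolve_train_steps` is a hypothetical helper, `lr_decay_config` is a placeholder value, and how batchflow actually shares the optimizer between train steps internally is not shown here.

```python
def resolve_train_steps(train_steps):
    """Expand 'use' references and fill in the default scope (illustration only)."""
    resolved = {}
    for name, step in train_steps.items():
        step = dict(step)                      # do not modify the input config
        base_name = step.pop('use', None)
        if base_name is not None:
            base = train_steps[base_name]
            # Only the optimizer and its decay are taken from the reused step.
            for key in ('optimizer', 'decay'):
                if key in base:
                    step.setdefault(key, base[key])
        # A missing scope means "all trainable variables".
        step.setdefault('scope', '')
        resolved[name] = step
    return resolved


lr_decay_config = {'placeholder': True}        # hypothetical stand-in for a real decay config

steps = {
    'all': {'optimizer': 'Adam', 'loss': 'mse', 'decay': lr_decay_config},
    'head': {'use': 'all', 'scope': 'head', 'loss': 'ce'},
}
resolved = resolve_train_steps(steps)
```

In this sketch, `resolved['head']` keeps its own scope and loss while picking up `'Adam'` and the decay from `'all'`, and `resolved['all']` gets the scope `''`.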

# Example of usage

## Train the model

The `train_mode` parameter of `train_model` selects the *train step*.

For example, use:
```python 
train_mode='name_of_train_step'
```
to select the train step ```name_of_train_step``` along with its config.

To fetch the loss of the selected *train step*, use:
```python 
fetches='loss_name_of_train_step'
``` 

Now use `train_mode='encoder'` to select the train step `'encoder'`, and `fetches='loss_encoder'` to fetch the corresponding loss.
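
Since the loss name follows the pattern `'loss_'` plus the train step name, the two arguments can be built from the name alone. The helper `train_step_kwargs` below is hypothetical and not part of batchflow; it is a minimal sketch of the naming convention described above.

```python
def train_step_kwargs(name):
    """Build the `train_model` arguments that depend on the train step name."""
    return {'train_mode': name, 'fetches': 'loss_' + name}


encoder_kwargs = train_step_kwargs('encoder')
decoder_kwargs = train_step_kwargs('decoder')
```

The resulting dictionaries could be unpacked into a `train_model` call, for example `train_model('conv_nn', **encoder_kwargs, ...)`.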

In [5]:
train_encoder = (Pipeline(config=config)
                  .to_array()
                  .train_model('conv_nn', fetches='loss_encoder', 
                               train_mode='encoder',
                               images=B.images, masks=B.images,
                               save_to=V('loss_history', mode='a'),
                               use_lock=True)) << dataset.train

(train_encoder.before
 .init_variable('loss_history', default=[])
 .init_model('dynamic', C('model'), 'conv_nn',
             config=model_config))

<batchflow.once_pipeline.OncePipeline at 0x7fd5c882a0f0>

In [6]:
train_encoder.run(BATCH_SIZE, shuffle=True, n_epochs=2, bar=True, drop_last=True)

100%|██████████| 1200/1200 [06:20<00:00,  3.21it/s]


<batchflow.pipeline.Pipeline at 0x7fd5d82c3400>

Now we have a neural network in which the weights from the scope of the `'encoder'` train step are trained.

Next, we train the weights from the scope of the `'decoder'` train step.

In [7]:
train_decoder = (Pipeline(config=config)
                  .to_array()
                  .train_model('conv_nn', fetches='loss_decoder', 
                               train_mode='decoder',
                               images=B.images, masks=B.images,
                               save_to=V('loss_history', mode='a'),
                               use_lock=True)) << dataset.train

(train_decoder.before
 .init_variable('loss_history', default=[])
 .import_model('conv_nn', train_encoder)
)

<batchflow.once_pipeline.OncePipeline at 0x7fd3bc49efd0>

In [8]:
train_decoder.run(BATCH_SIZE, shuffle=True, n_epochs=2, bar=True, drop_last=True)

100%|██████████| 1200/1200 [05:30<00:00,  3.53it/s]


<batchflow.pipeline.Pipeline at 0x7fd3bc49eef0>

Now we can train models with `train_steps`, tune the training process, and achieve better results.

This gives complete control over the model training process and makes it convenient to customize.