# Complex Execution Strategies with Research Module

## Roadmap

1. Introduction to Research Module
    * Basic example
        * 1 pipeline with fixed parameters
             * creating research
             * running several repetitions of an experiment
             * defining results recording frequency aka execute='%n'
             * viewing research results
             * saving and loading research
    * Running experiments with different parameters aka grid
        * 1 pipeline with variable parameters
             * creating and viewing grids
             * viewing filtered research results
             
             
2. Complex Execution Strategies with Research Module (**You are here**)
    * Reducing extra dataset loads
        * 1 pipeline with root and branch + grid
    * More complex execution strategies
        * 2 pipelines, train & test + function + root&branch + grid
            * adding test pipeline
            * adding functions
3. 
    * Cross-validation

    * Performance
        * execution tasks managing
    * Combining it all together
        * Super complex Research

## Reducing extra dataset loads

In [1]:
import warnings
warnings.filterwarnings('ignore')

from tensorflow import logging
logging.set_verbosity(logging.ERROR)

import os
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '3'

In [2]:
import sys
sys.path.append('../../..')

import matplotlib
%matplotlib inline

from batchflow import Pipeline, B, C, V
from batchflow.opensets import MNIST
from batchflow.models.tf import VGG7, VGG16
from batchflow.research import Research, Option

In [3]:
BATCH_SIZE = 64
ITERATIONS = 10

In [4]:
mnist = MNIST()
train_root = mnist.train.p.run(BATCH_SIZE, shuffle=True, n_epochs=None, lazy=True)

In [5]:
# 2 layouts x 2 bias settings = 4 configurations in the grid
grid = Option('layout', ['cna', 'can']) * Option('bias', [True, False])

# C('layout') and C('bias') are substituted from the grid for each configuration
model_config = {
    'inputs': dict(images={'shape': (28, 28, 1)},
                   labels={'classes': 10, 'transform': 'ohe', 'name': 'targets'}),
    'initial_block/inputs': 'images',
    'body/block/layout': C('layout'),
    'common/conv/use_bias': C('bias'),
}

train_template = (Pipeline()
            .init_variable('loss', init_on_each_run=list)
            .init_model('dynamic', VGG7, 'conv', config=model_config)
            .to_array()
            .train_model('conv', fetches='loss',
                         feed_dict={'images': B('images'),
                                    'labels': B('labels')},
                         save_to=V('loss', mode='w'))
)

In [6]:
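# Root and template are joined into a single pipeline,
# so data loading is part of every job's own pipeline
research = (Research()
            .add_pipeline(train_root + train_template, variables='loss', name='train')
            .add_grid(grid))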

research.run(n_reps=4, n_iters=ITERATIONS, name='research', bar=True)


Research research is starting...


  0%|          | 0/160 [00:00<?, ?it/s]

Distributor has 16 jobs with 10 iterations. Totally: 160


100%|██████████| 160/160 [04:47<00:00,  1.80s/it]


<batchflow.research.research.Research at 0x7f052ddb9c18>
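
The job count reported by the distributor follows from the grid size and the number of repetitions. The sketch below reproduces that arithmetic in plain Python with `itertools.product`. It does not use the `Option` class; the `options` dictionary is only an illustrative stand-in for the grid defined above.

```python
from itertools import product

# Illustrative stand-in for Option('layout', ...) * Option('bias', ...)
options = {
    'layout': ['cna', 'can'],
    'bias': [True, False],
}
N_REPS = 4    # n_reps passed to research.run
N_ITERS = 10  # n_iters passed to research.run

# Every combination of option values is one configuration
configs = [dict(zip(options, values)) for values in product(*options.values())]

n_jobs = len(configs) * N_REPS       # one job per configuration and repetition
total_iterations = n_jobs * N_ITERS  # total shown by the progress bar

print(len(configs), n_jobs, total_iterations)  # prints: 4 16 160
```

The result matches the log line `Distributor has 16 jobs with 10 iterations. Totally: 160`.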

In [7]:
# Same research, but the data-loading root is passed separately from the training branch
research = (Research()
            .add_pipeline(root=train_root, branch=train_template, variables='loss', name='train')
            .add_grid(grid))

research.run(n_reps=4, n_iters=ITERATIONS, name='no_extra_dataload_research', bar=True)


Research no_extra_dataload_research is starting...


  0%|          | 0/160 [00:00<?, ?it/s]

Distributor has 16 jobs with 10 iterations. Totally: 160


100%|██████████| 160/160 [05:09<00:00,  1.94s/it]


<batchflow.research.research.Research at 0x7f051502e908>
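
The idea behind splitting a pipeline into a root and a branch can be illustrated without batchflow. The toy sketch below counts batch loads when every configuration loads its own batch versus when one batch is loaded once and handed to all configurations. `CountingLoader`, `train_step` and both `run_*` functions are hypothetical names made up for this illustration; they are not part of the library and say nothing about how Research actually schedules branches.

```python
class CountingLoader:
    """Toy data source that counts how many batches were loaded."""

    def __init__(self):
        self.loads = 0

    def next_batch(self, batch_size):
        self.loads += 1
        return list(range(batch_size))


def train_step(config, batch):
    """Placeholder for one training step; returns a dummy value."""
    return len(batch)


def run_without_sharing(configs, n_iters, batch_size):
    """Each configuration loads its own batch on every iteration."""
    loader = CountingLoader()
    for _ in range(n_iters):
        for config in configs:
            batch = loader.next_batch(batch_size)
            train_step(config, batch)
    return loader.loads


def run_with_shared_root(configs, n_iters, batch_size):
    """One batch is loaded per iteration and reused by all configurations."""
    loader = CountingLoader()
    for _ in range(n_iters):
        batch = loader.next_batch(batch_size)
        for config in configs:
            train_step(config, batch)
    return loader.loads


configs = ['cna+bias', 'cna', 'can+bias', 'can']
loads_separate = run_without_sharing(configs, n_iters=10, batch_size=64)
loads_shared = run_with_shared_root(configs, n_iters=10, batch_size=64)
print(loads_separate, loads_shared)  # prints: 40 10
```

In the two runs above the wall-clock times are close (04:47 and 05:09), so any saving from the root and branch split is not visible in this small example.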

## More complex execution strategies

In [8]:
grid = Option('model', [VGG7]) * Option('layout', ['cna', 'can']) #* Option('bias', [True, False])
        
model_config={
    'inputs': dict(images={'shape': (28, 28, 1)},
                   labels={'classes': 10, 'transform': 'ohe', 'name': 'targets'}),
    'initial_block/inputs': 'images',
    'body/block/layout': C('layout'),
    #'common/conv/use_bias': C('bias'),
}

train_template = (Pipeline()
            .init_variable('loss', init_on_each_run=list)
            .init_model('dynamic', C('model'), 'conv', config=model_config)
            .to_array()
            .train_model('conv', fetches='loss',
                         feed_dict={'images': B('images'),
                                    'labels': B('labels')},
                         save_to=V('loss', mode='w'))
)

In [9]:
test_root = mnist.test.p.run(BATCH_SIZE, shuffle=True, n_epochs=None, lazy=True)
# The test pipeline does not create a model: it imports 'conv'
# from the pipeline named by the 'import_from' config value
test_template = (Pipeline()
                 .init_variable('accuracy', init_on_each_run=list)
                 .import_model('conv', C('import_from'))
                 .to_array()
                 .predict_model('conv', 
                               fetches='output_accuracy',
                               feed_dict={'images': B('images'),
                                          'labels': B('labels')},
                               save_to=V('accuracy', mode='a')))

In [10]:
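# 'train' and 'test' are added as separate named pipelines;
# 'test' gets execute='%5' and import_from='train', so it takes its model from 'train'
research = (Research()
            .add_pipeline(root=train_root, branch=train_template, variables='loss', name='train')
            .add_pipeline(root=test_root, branch=test_template, variables='accuracy', name='test',
                         execute='%5', run=True, import_from='train')
            .add_grid(grid))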

research.run(n_reps=4, n_iters=ITERATIONS, name='my_research', bar=True)


Research my_research is starting...


  0%|          | 0/80 [00:00<?, ?it/s]

Distributor has 8 jobs with 10 iterations. Totally: 80


  6%|▋         | 5/80 [01:03<15:49, 12.66s/it]
Process Process-3:1:
Traceback (most recent call last):
  File "/home/antonina/miniconda3/lib/python3.7/site-packages/multiprocess/process.py", line 297, in _bootstrap
    self.run()
  File "/home/antonina/miniconda3/lib/python3.7/site-packages/multiprocess/process.py", line 99, in run
    self._target(*self._args, **self._kwargs)
  File "../../../batchflow/research/distributor.py", line 347, in _run_job
    self.run_job()
  File "../../../batchflow/research/workers.py", line 61, in run_job
    exceptions = job.parallel_execute_for(iteration, unit_name, exec_actions)
  File "../../../batchflow/research/job.py", line 80, in parallel_execute_for
    exceptions = self._parallel_run(iteration, name, batch, actions) #pylint:disable=assignment-from-no-return
  File "../../../batchflow/decorators.py", line 327, in wrapped_method
    x = wrap_with_threads(self, args, kwargs)
  File "../../../batchflow/decorators.py", line 224, in wrap_with_threads


KeyboardInterrupt: 
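
The `execute='%5'` argument on the test pipeline sets how often that pipeline is executed. The sketch below shows one plausible reading of such a rule, assuming that `'%n'` means "every n-th iteration" and that iterations are counted from 1; the library's actual counting convention may differ. `should_execute` is a hypothetical helper written for this illustration, not a batchflow function.

```python
def should_execute(iteration, rule):
    """Return True if a '%n' rule fires on a 1-based iteration number.

    Assumption: '%n' means every n-th iteration, counted from 1.
    """
    if not rule.startswith('%'):
        raise ValueError("only '%n' rules are handled in this sketch")
    step = int(rule[1:])
    return iteration % step == 0


N_ITERS = 10
test_iterations = [i for i in range(1, N_ITERS + 1) if should_execute(i, '%5')]
print(test_iterations)  # prints: [5, 10]
```

Under these assumptions the test pipeline would run twice per job over 10 training iterations.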