In [1]:
# default_exp train.loop

# Loop

Training should run in a loop: every subject and treatment gets the same model evaluation, and the results are stored centrally. The goal is to produce reproducible results quickly and automate the easy parts.

Here I gather a few utilities, practices, interfaces, and demonstrations to make this easier for everyone.

In [2]:
#hide
from nbdev.showdoc import *

In [5]:
import numpy as np
from scipy import stats

In [4]:
#export
def artifact_storage(*a, **kw):
    """Infer how the user intends to store
    artifacts: models, versions, treatments,
    and evaluation results."""
    
    def noop(subject, treatment, model, result): pass
    return noop

def generic_runner(subject, treatment, *a, **kw):
    """Naive concept. I have something smarter
    running somewhere, but I'm not sure where it is."""
    storage = artifact_storage(*a, **kw)
    model = None  # placeholder: no model is fitted yet
    result = treatment(subject)
    return storage(subject, treatment, model, result)
    
def runner(*a, **kw):
    """Infer the runner and its artifact storage."""
    return generic_runner

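def train_loop(subjects, treatments, *a, **kw):
    """The main training loop: run every treatment on every subject."""

    fn = runner(*a, **kw)

    # Iterate over the inputs directly, so any iterable works and the
    # exported module does not depend on numpy being imported.
    for subject in subjects:
        for treatment in treatments:
            fn(subject, treatment, *a, **kw)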


In [8]:
# Toy treatments for demonstration only. Real treatments are expected
# to use models, fit, predict, hyperparameters, etc.

def t(name, n, subject):
    result = round(subject + n, 2)
    print(f"{name}: {result}")
    return result

def t1(subject): return t('t1', 1, subject)
def t2(subject): return t('t2', 2, subject)

subjects = stats.norm.rvs(size=3) + 10
print(subjects)

train_loop(subjects, [t1, t2])

[10.42164023  9.2040922  10.05040423]
t1: 11.42
t2: 12.42
t1: 10.2
t2: 11.2
t1: 11.05
t2: 12.05


So, I don't have what I want yet, specifically:

* interfaces
* organizing and storing models
* reusing what has already been trained
* bringing in evaluation
* using a good train/test split
* using good evaluation, even for imbalanced classes
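
To illustrate the last item, here is a minimal sketch of why plain accuracy can mislead on imbalanced classes. The `balanced_accuracy` helper and the toy labels are assumptions made up for this example. Nothing here is part of the loop above.

```python
from collections import defaultdict


def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall, so every class counts equally."""
    hits = defaultdict(int)
    totals = defaultdict(int)
    for truth, pred in zip(y_true, y_pred):
        totals[truth] += 1
        if truth == pred:
            hits[truth] += 1
    recalls = [hits[c] / totals[c] for c in totals]
    return sum(recalls) / len(recalls)


# Made-up labels: 8 negatives, 2 positives, and a "model" that always predicts 0.
y_true = [0] * 8 + [1] * 2
y_pred = [0] * 10

plain_accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
score = balanced_accuracy(y_true, y_pred)

print(f"accuracy: {plain_accuracy}, balanced accuracy: {score}")
```

The always-negative predictor looks good on plain accuracy but scores no better than chance once each class is weighted equally.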

I don't want anything set up, but the parts involving storage, reuse, reproducibility, and transparency can be done here.
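
As a rough sketch of what storage and reuse could look like, the following keeps results in memory and skips any (subject, treatment) pair that has already run. `MemoryStore`, `add_one`, and `add_two` are hypothetical names for illustration. This is one possible shape, not the intended design.

```python
# Hypothetical sketch: none of these names exist in the module above.
class MemoryStore:
    """Keep results in a dict keyed by (subject, treatment name)."""

    def __init__(self):
        self.results = {}

    def key(self, subject, treatment):
        # Round the subject so float noise does not create duplicate keys.
        return (round(float(subject), 6), treatment.__name__)

    def run(self, subject, treatment):
        """Run the treatment only if no stored result exists yet."""
        k = self.key(subject, treatment)
        if k not in self.results:
            self.results[k] = treatment(subject)
        return self.results[k]


calls = []  # records every real treatment call, to show when reuse happens


def add_one(subject):
    calls.append(("add_one", subject))
    return subject + 1


def add_two(subject):
    calls.append(("add_two", subject))
    return subject + 2


store = MemoryStore()
for subject in [10.0, 9.5]:
    for treatment in [add_one, add_two]:
        store.run(subject, treatment)

# A repeated request is served from the store without calling add_one again.
again = store.run(10.0, add_one)
print(len(calls), again)
```

Keying on the treatment's `__name__` is a simplification. Two different functions with the same name would collide, so a real store would likely need a stronger identity, such as a version or a hash of the configuration.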