# Quickstart

A quickstart guide to `affe`.

# Preliminaries

In [25]:
# This is a code formatter; you can comment it out without losing functionality
%load_ext lab_black

## Imports

In [1]:
import affe
import numpy as np
import pandas as pd

In [2]:
from affe.execs import (
    CompositeExecutor,
    NativeExecutor,
    JoblibExecutor,
    GNUParallelExecutor,
)

In [3]:
from affe.tests import get_dummy_flow

# Basic Illustration: Flows saying _"hi"_

To illustrate, let us create three different workflows. Each of them says "hi" in its own signature way.

In [4]:
# Making a flow is very easy.
flows = [
    get_dummy_flow(message="hi" * (i + 1), content=dict(i=i * 10)) for i in range(3)
]

In [5]:
flow = flows[0]

In [6]:
flow.config

{'io': {'fs': {'root': '/Users/zissou/repos/affe',
   'cli': 'root',
   'data': 'root',
   'out': 'root',
   'scripts': 'root',
   'out.flow.config': 'out.flow',
   'out.flow.logs': 'out.flow',
   'out.flow.results': 'out.flow',
   'out.flow.models': 'out.flow',
   'out.flow.timings': 'out.flow',
   'out.flow.tmp': 'out.flow',
   'out.flow.flows': 'out.flow',
   'out.flow': 'out'}},
 'message': 'hi',
 'content': {'i': 0}}
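
The `fs` block in this config reads as a tree: every key names a directory and every value names its parent, with `root` holding the absolute path. The sketch below shows one way such a mapping could be resolved into a concrete path. It is an illustration only, not affe's actual implementation: the `resolve` helper is hypothetical, and it assumes that a directory's name is the last dotted part of its key.

```python
from pathlib import PurePosixPath

# Subset of the `fs` mapping printed above: each key names a directory,
# each value names its parent ("root" holds the absolute path itself).
fs = {
    "root": "/Users/zissou/repos/affe",
    "out": "root",
    "out.flow": "out",
    "out.flow.logs": "out.flow",
}


def resolve(key, fs):
    """Hypothetical helper: follow parent links up to 'root' and build a path."""
    if key == "root":
        return PurePosixPath(fs["root"])
    parent = fs[key]
    # Assumption: the directory name is the last dotted part of the key.
    name = key.split(".")[-1]
    return resolve(parent, fs) / name


log_dir = resolve("out.flow.logs", fs)
print(log_dir)
```

Under these assumptions, `out.flow.logs` resolves to a `logs` directory nested under `out/flow` inside the root.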

## Flow Execution

Now you can print some hello worlds, embedded in a Flow object.

In [7]:
flow.run()

Hello world
2 secs passed
hi


{'i': 0}

In [8]:
flow.run_with_log()

'/Users/zissou/repos/affe/out/flow/logs/logfilehi'

In [9]:
flows[1].run_with_log()

'/Users/zissou/repos/affe/out/flow/logs/logfilehihi'

## Flow Scheduling

Scheduling means executing multiple flows, for instance via a tool like `joblib`.

In [11]:
e = NativeExecutor
c_jl = JoblibExecutor(flows, e, n_jobs=3)

In [12]:
c_jl.run()

[{'i': 0}, {'i': 10}, {'i': 20}]
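
To get a feel for what scheduling amounts to, here is a minimal sketch that runs several independent tasks through a worker pool. It deliberately uses only the standard library (`concurrent.futures`) rather than `joblib` or affe's executors, and `make_task` is a hypothetical stand-in for a flow, so treat it as an analogy and not as a description of how `JoblibExecutor` works internally.

```python
from concurrent.futures import ThreadPoolExecutor


def make_task(i):
    """Hypothetical stand-in for a flow: a callable that returns its content."""

    def task():
        return {"i": i * 10}

    return task


tasks = [make_task(i) for i in range(3)]

# Run all tasks on up to three worker threads.
# `map` yields the results in the order the tasks were submitted.
with ThreadPoolExecutor(max_workers=3) as pool:
    results = list(pool.map(lambda t: t(), tasks))

print(results)
```

The key point is that each task is self-contained, so the pool is free to decide when and where each one runs.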

# Manual Creation of Flows

The "hi"-flows defined above were nice because they illustrate, in the simplest way possible, what a flow is and how it can be used. In this section, we dive a bit deeper into how you can make a Flow yourself, from scratch.

## Your workflow

Typically, you start from a certain workflow. As illustrated above, a _workflow_ is a piece of work you care about, and you want to be able to execute it in a controlled, experiment-like fashion.

Here, we assume you are interested in the archetypal machine learning task of predicting the species of an Iris flower.

In [33]:
from sklearn import datasets
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

# Load data
X, y = datasets.load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42
)

# Fit classifier
clf = DecisionTreeClassifier(max_depth=2)
clf.fit(X_train, y_train)

# Predict and Evaluate
y_pred = clf.predict(X_test)

score = accuracy_score(y_test, y_pred, normalize=True)

score

0.9777777777777777

## Make your _workflow_ into a _Flow_

Now that you know what you want to do, you want to obtain a flow that implements it. The advantage is that annoying things like

- logging
- timeouts
- execution
- scheduling

are all taken care of as soon as you succeed. This means removing boilerplate and relying on battle-tested code instead.
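
For comparison, this is roughly the kind of boilerplate you would otherwise write by hand for just two of those items, logging and timeouts. It is a standard-library sketch under stated assumptions, not what affe does internally; `run_with_timeout` is a hypothetical helper introduced only for this illustration.

```python
import logging
import time
from concurrent.futures import ThreadPoolExecutor, TimeoutError

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("workflow")


def run_with_timeout(fn, timeout_s):
    """Run fn() in a worker thread; return its result, or None on timeout."""
    pool = ThreadPoolExecutor(max_workers=1)
    future = pool.submit(fn)
    try:
        result = future.result(timeout=timeout_s)
        log.info("finished with result %r", result)
        return result
    except TimeoutError:
        # The worker thread is not killed; we only stop waiting for it.
        log.warning("timed out after %s seconds", timeout_s)
        return None
    finally:
        pool.shutdown(wait=False)


fast = run_with_timeout(lambda: "ok", timeout_s=1.0)
slow = run_with_timeout(lambda: time.sleep(0.5), timeout_s=0.1)
```

Here `fast` holds the task's return value, while `slow` is `None` because the task outlived its time budget. Even this small helper leaves open questions (where logs go, how to truly stop a runaway task), which is the kind of detail a framework is meant to settle once.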

### Implementation

- Subclass the Flow class
- Add anything you like

In [38]:
from affe import Flow

In [39]:
class IrisFlow(Flow):
    """Iris-Flow"""

    def __init__(self):
        return