# Basic Usage

Aim: Use the Controller (here, the `Cycle` class from `autora.workflow`) to recover a simple ground-truth theory from noisy data.

In [None]:
import numpy as np
from autora.experimentalist.pipeline import make_pipeline
from autora.variable import VariableCollection, Variable
from sklearn.linear_model import LinearRegression

from autora.workflow import Cycle
from itertools import takewhile

In [None]:
def ground_truth(x):
    return x + 1

The allowed x values are the integers from 0 to 10 inclusive. We also record the allowed range of the output y, here -20 to 20.

In [None]:
metadata_0 = VariableCollection(
    independent_variables=[Variable(name="x1", allowed_values=range(11))],
    dependent_variables=[Variable(name="y", value_range=(-20, 20))],
)

The experimentalist is used to propose experiments.
Since the space of values is so restricted, we can just sample them all each time.

In [None]:
example_experimentalist = make_pipeline(
    [metadata_0.independent_variables[0].allowed_values])

When we run a synthetic experiment, we get a reproducible noisy result:

In [None]:
def get_example_synthetic_experiment_runner():
    rng = np.random.default_rng(seed=180)
    def runner(x):
        return ground_truth(x) + rng.normal(0, 0.1, x.shape)
    return runner
example_synthetic_experiment_runner = get_example_synthetic_experiment_runner()
example_synthetic_experiment_runner(np.array([1]))
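
What "reproducible" means here can be checked with a minimal sketch. It assumes a hypothetical `make_runner` helper that mirrors the runner above: the seed fixes the whole sequence of noise draws, so a freshly created runner repeats the same values, while successive calls to one runner keep advancing the generator and return different noise.

```python
import numpy as np


def make_runner(seed=180):
    """Hypothetical helper: a noisy runner for y = x + 1 with its own seeded generator."""
    rng = np.random.default_rng(seed=seed)

    def runner(x):
        return (x + 1) + rng.normal(0, 0.1, x.shape)

    return runner


x = np.arange(11)
first_runner, second_runner = make_runner(), make_runner()

first_call = first_runner(x)    # first draw from generator A
second_call = first_runner(x)   # second draw from generator A: new noise
fresh_call = second_runner(x)   # first draw from generator B: same seed, same noise

same_after_reseed = np.allclose(first_call, fresh_call)
differs_between_calls = not np.allclose(first_call, second_call)
print(same_after_reseed, differs_between_calls)
```

This is why rerunning the whole notebook gives the same numbers, even though each cycle within one run sees different noise.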

The theorist "tries" to work out the best theory. We use a trivial scikit-learn regressor.

In [None]:
example_theorist = LinearRegression()

We initialize the Controller with the metadata describing the domain of the theory,
the theorist, experimentalist and experiment runner,
as well as a monitor which will let us know which cycle we're currently on.

In [None]:
cycle = Cycle(
    metadata=metadata_0,
    theorist=example_theorist,
    experimentalist=example_experimentalist,
    experiment_runner=example_synthetic_experiment_runner,
    monitor=lambda state: print(f"Generated {len(state.theories)} theories"),
)
cycle # doctest: +ELLIPSIS

We can run the cycle by calling the run method:

In [None]:
cycle.run(num_cycles=3)  # doctest: +ELLIPSIS
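
To make the loop concrete, here is a rough stand-alone sketch of what three cycles amount to conceptually. It is not AutoRA's implementation: it uses only NumPy, replaces the theorist with `np.polyfit`, and assumes that each cycle proposes all allowed x values, observes noisy results, and refits on all data collected so far.

```python
import numpy as np

rng = np.random.default_rng(seed=180)
allowed_x = np.arange(11)  # same domain as in this tutorial

all_x, all_y, fits = [], [], []
for _ in range(3):
    # "Experimentalist": propose every allowed condition.
    conditions = allowed_x
    # "Experiment runner": ground truth x + 1 plus Gaussian noise.
    observed = (conditions + 1) + rng.normal(0, 0.1, conditions.shape)
    all_x.append(conditions)
    all_y.append(observed)
    # "Theorist" stand-in: least-squares line through all data so far.
    slope, intercept = np.polyfit(
        np.concatenate(all_x), np.concatenate(all_y), deg=1
    )
    fits.append((slope, intercept))

for i, (slope, intercept) in enumerate(fits):
    print(f"cycle {i}: y = {slope:.4f} x + {intercept:.4f}")
```

Each entry of `fits` plays the role of one fitted theory, so the list grows by one per cycle.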

We can now interrogate the results. The first set of conditions which went into the
experiment runner were:

In [None]:
cycle.data.conditions[0]

The observations include the conditions and the results:

In [None]:
cycle.data.observations[0]

In the third cycle (index = 2), the first and last observations differ again from those of the earlier cycles, because new noise is drawn on every run:

In [None]:
cycle.data.observations[2][[0,-1]]

The best fit theory after the first cycle is:

In [None]:
cycle.data.theories[0]

In [None]:
def report_linear_fit(m: LinearRegression, precision=4):
    # Round both the gradient and the intercept to the requested precision.
    s = f"y = {np.round(m.coef_[0].item(), precision)} x " \
        f"+ {np.round(m.intercept_.item(), precision)}"
    return s
report_linear_fit(cycle.data.theories[0])

The best fit theory after all the cycles, including all the data, is:

In [None]:
report_linear_fit(cycle.data.theories[-1])

This is close to the ground truth theory of x -> (x + 1).

We can also run the cycle with more control over the execution flow:

In [None]:
next(cycle) # doctest: +ELLIPSIS

next(cycle) # doctest: +ELLIPSIS

In [None]:
next(cycle) # doctest: +ELLIPSIS

We can continue to run the cycle as long as we like,
with a simple arbitrary stopping condition like the number of theories generated:

In [None]:
_ = list(takewhile(lambda c: len(c.data.theories) < 9, cycle))
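
The interaction between `takewhile` and a stateful iterator is easy to misread, so here is a small sketch using a hypothetical `ToyCycle` class (not part of AutoRA). It assumes, as the lambda above implies, that each step returns the cycle object itself. `takewhile` first advances the iterator and only then evaluates the predicate, so the step that makes the predicate false has already been executed when iteration stops.

```python
from itertools import takewhile


class ToyCycle:
    """Hypothetical stand-in: each step appends one 'theory' and returns self."""

    def __init__(self):
        self.theories = []

    def __iter__(self):
        return self

    def __next__(self):
        self.theories.append(f"theory {len(self.theories)}")
        return self


toy = ToyCycle()
steps = list(takewhile(lambda c: len(c.theories) < 9, toy))

# Nine steps ran; the ninth failed the predicate, so only eight were kept.
print(len(toy.theories), len(steps))
```

The condition `< 9` therefore leaves the object holding exactly nine theories, not eight.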

... or the precision (here we keep iterating while the absolute difference between the fitted gradients
of the second-to-last and last cycles is larger than 1x10^-3).

In [None]:
_ = list(
        takewhile(
            lambda c: np.abs(c.data.theories[-1].coef_.item() -
                           c.data.theories[-2].coef_.item()) > 1e-3,
            cycle
        )
    )
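
The same convergence-style stopping rule can be tried on a deterministic toy sequence. The `ToyFitter` class below is hypothetical and only illustrates the pattern: its estimates are 1 + 0.5^k, so successive differences halve on every step. Note that the predicate reads the last two entries, so at least one estimate must already exist before iteration starts, just as the cycle above already holds several theories.

```python
from itertools import takewhile


class ToyFitter:
    """Hypothetical stand-in whose estimates converge towards 1.0."""

    def __init__(self):
        self.estimates = [2.0]  # initial estimate, 1 + 0.5**0

    def __iter__(self):
        return self

    def __next__(self):
        k = len(self.estimates)
        self.estimates.append(1.0 + 0.5 ** k)
        return self


fitter = ToyFitter()
_ = list(
    takewhile(
        lambda f: abs(f.estimates[-1] - f.estimates[-2]) > 1e-3,
        fitter,
    )
)

# Iteration stops at the first step whose change is at most 1e-3.
print(len(fitter.estimates), fitter.estimates[-1])
```

With noisy data the differences do not shrink monotonically, so a rule like this can stop early by chance; combining it with a maximum number of cycles is one way to guard against that.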


... or continue to run as long as we like:

In [20]:
_ = cycle.run(num_cycles=100) # doctest: +ELLIPSIS


Generated 112 theories
Generated 113 theories
Generated 114 theories
Generated 115 theories
Generated 116 theories
Generated 117 theories
Generated 118 theories
Generated 119 theories
Generated 120 theories
Generated 121 theories
Generated 122 theories
Generated 123 theories
Generated 124 theories
Generated 125 theories
Generated 126 theories
Generated 127 theories
Generated 128 theories
Generated 129 theories
Generated 130 theories
Generated 131 theories
Generated 132 theories
Generated 133 theories
Generated 134 theories
Generated 135 theories
Generated 136 theories
Generated 137 theories
Generated 138 theories
Generated 139 theories
Generated 140 theories
Generated 141 theories
Generated 142 theories
Generated 143 theories
Generated 144 theories
Generated 145 theories
Generated 146 theories
Generated 147 theories
Generated 148 theories
Generated 149 theories
Generated 150 theories
Generated 151 theories
Generated 152 theories
Generated 153 theories
Generated 154 theories
Generated 1