# Basic Introduction to Functions and States

Using the functions and objects in `autora.state`, we can build flexible pipelines and cycles which operate on state
objects.

## Theoretical Overview

The fundamental idea is this:
- We define a "state" object $S$ which can be modified with a "delta" (a new result) $\Delta S$.
- A new state at some point $i+1$ is $$S_{i+1} = S_i + \Delta S_{i+1}$$
- The cycle state after $n$ steps is thus $$S_n = S_{0} +  \sum^{n}_{i=1} \Delta S_{i}$$
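
The accumulation rule above can be sketched in plain Python. This is a minimal illustration only: `ToyState` and `ToyDelta` are hypothetical stand-ins invented for this sketch, not the `autora` classes, and they assume that adding a delta simply extends a list of results.

```python
from dataclasses import dataclass, field, replace
from typing import List


@dataclass(frozen=True)
class ToyDelta:
    """Hypothetical stand-in for a delta: the new results of one step."""
    results: List[float] = field(default_factory=list)


@dataclass(frozen=True)
class ToyState:
    """Hypothetical stand-in for a state S: all results collected so far."""
    results: List[float] = field(default_factory=list)

    def __add__(self, delta: ToyDelta) -> "ToyState":
        # S_{i+1} = S_i + Delta S_{i+1}: build a new state, leave the old one untouched.
        return replace(self, results=self.results + delta.results)


s_0 = ToyState()
deltas = [ToyDelta([1.0]), ToyDelta([2.0, 3.0]), ToyDelta([4.0])]

# S_n = S_0 + sum of all deltas, applied one step at a time.
s_n = s_0
for delta in deltas:
    s_n = s_n + delta

print(s_n.results)
```

Because each addition returns a new object, `s_0` stays empty and every intermediate state remains available for inspection.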

To represent $S$ and $\Delta S$ in code, you can use `autora.state.delta.State` and `autora.state.delta.Delta`
respectively. To operate on these, we define functions.

- Each operation in an AER cycle (theorist, experimentalist, experiment_runner, etc.) is implemented as a
function with $n$ arguments $s_j$ which are members of $S$ and $m$ others $a_k$ which are not.
  $$ f(s_1, ..., s_n, a_1, ..., a_m) \rightarrow \Delta S_{i+1}$$
- There is a wrapper function $h$ (`autora.state.delta.wrap_to_use_state`) which changes the signature of $f$ to
require $S_i$ and adds the resulting $\Delta S_{i+1}$ to it, returning $S_{i+1}$
  $$h\left[f(s_1, ..., s_n, a_1, ..., a_m) \rightarrow \Delta S_{i+1}\right] \rightarrow \left[ f^\prime(S_i, a_1, ..., a_m) \rightarrow S_{i} + \Delta S_{i+1} = S_{i+1}\right]$$

- Assuming that the other arguments $a_k$ are provided by partial evaluation of each $f_i^\prime$, the full AER cycle
can then be represented as:
  $$S_n = f_n^\prime(...f_2^\prime(f_1^\prime(S_0)))$$
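
The wrapper $h$ and the composed cycle can be sketched without `autora` as follows. Everything here is an assumption made for illustration: `ToyState`, `toy_wrap_to_use_state`, `propose` and `observe` are hypothetical names, deltas are plain dictionaries, and every field is extended rather than replaced. The real `wrap_to_use_state` may differ in its details.

```python
import inspect
from dataclasses import dataclass, field, fields, replace
from functools import partial, reduce, wraps
from typing import Dict, List


@dataclass(frozen=True)
class ToyState:
    """Hypothetical state S with two list-valued members."""
    conditions: List[float] = field(default_factory=list)
    observations: List[float] = field(default_factory=list)

    def __add__(self, delta: Dict[str, list]) -> "ToyState":
        # Extend every field named in the delta; other fields are kept as they are.
        updates = {name: getattr(self, name) + new for name, new in delta.items()}
        return replace(self, **updates)


def toy_wrap_to_use_state(f):
    """Hypothetical analogue of h: turn f(s_1, ..., a_1, ...) into f'(S, a_1, ...)."""
    params = inspect.signature(f).parameters

    @wraps(f)
    def f_prime(state, **other_args):
        state_fields = {fld.name for fld in fields(state)}
        # Arguments whose names match state fields are read from the state.
        from_state = {name: getattr(state, name) for name in params if name in state_fields}
        delta = f(**from_state, **other_args)
        return state + delta  # S_{i+1} = S_i + Delta S_{i+1}

    return f_prime


@toy_wrap_to_use_state
def propose(conditions, step=1.0):
    # Propose one new condition, one step beyond the latest one.
    last = conditions[-1] if conditions else 0.0
    return {"conditions": [last + step]}


@toy_wrap_to_use_state
def observe(conditions, gain=2.0):
    # "Measure" only the most recent condition.
    return {"observations": [gain * conditions[-1]]}


# Fix the non-state arguments a_k by partial evaluation, then compose:
# S_n = f'_n(... f'_2(f'_1(S_0)))
steps = [partial(propose, step=1.0), partial(observe, gain=2.0)] * 3
s_n = reduce(lambda state, step: step(state), steps, ToyState())

print(s_n.conditions, s_n.observations)
```

Reading the state members by parameter name keeps each operation a plain function of its inputs, so it can be tested on its own before it is wrapped.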

There are additional helper functions to wrap common experimentalists, experiment runners and theorists so that we
can define a full AER cycle using Python notation, as shown in the following example.

## Example

First, initialize the state. There are two variables: `x`, with a range of [-10, 10], and `y`, with an unspecified range.

In [None]:
from autora.state.bundled import BasicAERState
from autora.variable import VariableCollection, Variable

s_0 = BasicAERState(
    variables=VariableCollection(
        independent_variables=[Variable("x", value_range=(-10, 10))],
        dependent_variables=[Variable("y")]
    )
)

Specify the experimentalist using the standard function `random_pool_executor`.
By default, it draws 5 independent random samples (the number is configurable using an argument)
from the `value_range` of the independent variables and returns them in a DataFrame.

In [None]:
from autora.experimentalist.pooler.random_pooler import random_pool_executor
experimentalist = random_pool_executor
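
To make the idea of random pooling concrete, here is a minimal standard-library sketch. It is not the implementation of `random_pool_executor`: the function name `sample_uniform_pool`, its parameters, the uniform distribution and the list-of-dicts return type are all assumptions made for illustration.

```python
import random


def sample_uniform_pool(value_ranges, num_samples=5, seed=None):
    """Hypothetical helper: draw independent samples from each variable's range.

    Assumes uniform sampling; returns one dict per sample instead of a DataFrame.
    """
    rng = random.Random(seed)
    return [
        {name: rng.uniform(low, high) for name, (low, high) in value_ranges.items()}
        for _ in range(num_samples)
    ]


# One independent variable `x` with the same range as in the example state.
conditions = sample_uniform_pool({"x": (-10, 10)}, num_samples=5, seed=180)
for row in conditions:
    print(row)
```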

Specify the experiment runner. This calculates a linear function of `x`, adds noise, and assigns the result to the
`y` column in a new DataFrame.

In [None]:
import numpy as np
import pandas as pd
from autora.state.delta import Delta, wrap_to_use_state

rng = np.random.default_rng(180)

@wrap_to_use_state
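def experiment_runner(conditions: pd.DataFrame, c=(2, 4)):
    # c = (intercept, gradient) of the underlying linear function
    x = conditions["x"]
    noise = rng.normal(0, 1, len(x))  # Gaussian noise: mean 0, standard deviation 1
    y = c[0] + (c[1] * x) + noise
    experiment_data = conditions.assign(y=y)  # copy of the conditions with a new `y` column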
    return Delta(experiment_data=experiment_data)

Specify a theorist, using a standard LinearRegression from scikit-learn.

In [None]:
from sklearn.linear_model import LinearRegression
from autora.state.wrapper import theorist_from_estimator

theorist = theorist_from_estimator(LinearRegression(fit_intercept=True))

Define the cycle: run the experimentalist, experiment_runner and theorist ten times.

In [None]:
s_ = s_0
for i in range(10):
    s_ = experimentalist(s_)
    s_ = experiment_runner(s_)
    s_ = theorist(s_)

The experiment_data has 50 entries (10 cycles and 5 samples per cycle):

In [None]:
s_.experiment_data

|   | x | y |
|---|---|---|
| 0 | -4.451978 | -15.373958 |
| 1 | 0.323487 | 2.561481 |
| 2 | -2.867211 | -10.516852 |
| 3 | -2.030568 | -5.247614 |
| 4 | 2.913797 | 12.957584 |
| 5 | -7.340735 | -27.82003 |
| 6 | -6.019243 | -21.600574 |
| 7 | -8.893466 | -31.496807 |
| 8 | 6.613056 | 27.020377 |
| 9 | 4.825417 | 21.875249 |


The fitted coefficients are close to the original values (intercept = 2, gradient = 4):

In [None]:
print(s_.model.intercept_, s_.model.coef_)


[2.03390614] [[3.97374104]]
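
As a rough cross-check that does not need scikit-learn, the closed-form least-squares formulas can be applied to the ten rows of `experiment_data` listed earlier. This is a sketch only: the helper `fit_line` is a hypothetical name, and because it uses ten rows rather than the full data set, its estimates will not match the values printed above exactly.

```python
from statistics import mean

# The ten (x, y) rows of experiment_data shown earlier in this example.
x = [-4.451978, 0.323487, -2.867211, -2.030568, 2.913797,
     -7.340735, -6.019243, -8.893466, 6.613056, 4.825417]
y = [-15.373958, 2.561481, -10.516852, -5.247614, 12.957584,
     -27.82003, -21.600574, -31.496807, 27.020377, 21.875249]


def fit_line(xs, ys):
    """Hypothetical helper: ordinary least squares for y = intercept + gradient * x."""
    x_bar, y_bar = mean(xs), mean(ys)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(xs, ys))
    s_xx = sum((xi - x_bar) ** 2 for xi in xs)
    gradient = s_xy / s_xx
    intercept = y_bar - gradient * x_bar
    return intercept, gradient


intercept, gradient = fit_line(x, y)
print(intercept, gradient)
```

With noise of standard deviation 1 and only ten points, estimates within a few tenths of the true intercept and gradient are to be expected.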
