# 1.1 Installation

First, install `python3`:
* macOS: http://docs.python-guide.org/en/latest/starting/install3/osx/
* Linux: http://docs.python-guide.org/en/latest/starting/install3/linux/
* Windows: https://www.python.org/downloads/windows/

Then make sure `pip` is installed and up to date:
```bash
python3 -m pip install --upgrade pip
```
Assuming `git` is installed, clone the repository, check out `tutorial`, and install the package:
```bash
git clone https://github.com/cmu-mars/model-learner.git
cd model-learner
git checkout tutorial
python3 -m pip install --upgrade .
```
Now, you are ready to start!
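
Before moving on, it can help to confirm that the packages are visible to the interpreter you will run the tutorial with. The check below is a minimal sketch: `is_installed` is a hypothetical helper, and it assumes the package installs under the top-level name `learner`, as the imports later in this tutorial suggest.

```python
import importlib.util


def is_installed(module_name):
    """Return True if a top-level module can be located, without importing it."""
    return importlib.util.find_spec(module_name) is not None


# Report whether the two packages used in this tutorial can be found.
for name in ("numpy", "learner"):
    status = "found" if is_installed(name) else "MISSING"
    print(f"{name}: {status}")
```

If either package is reported as missing, the usual cause is that `pip` installed it for a different Python interpreter than the one running the check.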

# 1.2 Import libraries

In [3]:
# Import the libraries, classes, and functions we need for this tutorial
from learner.mlearner import MLearner
from learner.model import genModelTermsfromString, Model, genModelfromCoeff
import numpy as np

# 1.3 Define model

In [4]:
# Let's define a model, in this case a polynomial model.
# Polynomial models are a useful tool for determining which input factors drive a response and in which direction.
# Here we define a model with 20 dimensions. Each variable represents one dimension, and the variables differ in
# influence: for example, o0 has a coefficient of 1 while o1 has a coefficient of 2, so the effect of o1 is twice
# that of o0. The model also has two terms that represent interactions between variables, e.g., 3 * o3 * o6.

ndim = 20 # defines the dimension of the model, i.e., the number of variables in the model
true_model = """10 + 1.00 * o0 + 2.00 * o1 + 3.00 * o2 +
4.00 * o3 + 5.00 * o4 + 6.00 * o5 + 7.00 * o6 + 8.00 * o7 + 
1.00 * o8 + 2.00 * o9 + 3.00 * o10 + 4.00 * o11 + 5.00 * o12 + 
6.00 * o13 + 7.00 * o14 + 8.00 * o15 + 1.00 * o16 + 2.00 * o17 + 
3.00 * o18 + 4.00 * o19 + 1 * o0 * o1 + 3 * o3 * o6"""

In [5]:
# The model above is only a string representation, so we need to build a model object that we can evaluate on a given input
power_model_terms = genModelTermsfromString(true_model)
true_power_model = Model(power_model_terms, ndim)
print(true_power_model)

1.00 * o0 + 2.00 * o1 + 3.00 * o2 + 4.00 * o3 + 5.00 * o4 + 6.00 * o5 + 7.00 * o6 + 8.00 * o7 + 1.00 * o8 + 2.00 * o9 + 3.00 * o10 + 4.00 * o11 + 5.00 * o12 + 6.00 * o13 + 7.00 * o14 + 8.00 * o15 + 1.00 * o16 + 2.00 * o17 + 3.00 * o18 + 4.00 * o19 + 1.00 * o0 * o1 + 3.00 * o3 * o6 + 10.0
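
To make concrete what turning the string into an evaluable model involves, here is a small pure-Python sketch. It is not the implementation of `genModelTermsfromString` or `Model`; `parse_terms` and `evaluate_terms` are hypothetical helpers, and the sketch assumes terms are joined by `+` and variables are named `o<index>`, as in the string above (negative coefficients are not handled).

```python
def parse_terms(model_string):
    """Split a polynomial string into (coefficient, [variable indices]) pairs."""
    terms = []
    for chunk in model_string.replace("\n", " ").split("+"):
        factors = [f.strip() for f in chunk.split("*")]
        coefficient = float(factors[0])              # first factor is the number
        indices = [int(f[1:]) for f in factors[1:]]  # "o3" -> 3
        terms.append((coefficient, indices))
    return terms


def evaluate_terms(terms, x):
    """Sum coefficient * product of the selected inputs over all terms."""
    total = 0.0
    for coefficient, indices in terms:
        product = coefficient
        for i in indices:
            product *= x[i]
        total += product
    return total


# A shortened model with the same shape as the one in this tutorial.
example = "10 + 1.00 * o0 + 2.00 * o1 + 1 * o0 * o1"
terms = parse_terms(example)
print(terms)
print(evaluate_terms(terms, [1, 1]))  # 10 + 1 + 2 + 1
print(evaluate_terms(terms, [0, 0]))  # only the intercept remains
```

A term with an empty index list is the intercept, a single index is a linear term, and two indices form an interaction.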


In [6]:
# Let's see how to evaluate the model on a specific input
xTest = np.ones((1, 20))
yTest = true_power_model.evaluateModelFast(xTest)
xTest, yTest[0]

(array([[ 1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,
          1.,  1.,  1.,  1.,  1.,  1.,  1.]]), 96.0)

In [7]:
# Let's try an array of zeros. What do you expect the evaluation to return? 10?
xTest = np.zeros((1, 20))
yTest = true_power_model.evaluateModelFast(xTest)
xTest, yTest[0]

(array([[ 0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,
          0.,  0.,  0.,  0.,  0.,  0.,  0.]]), 10.0)
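
The two results above can be checked by hand: with all inputs set to 1 the model returns the intercept (10) plus the sum of the linear coefficients (82) plus the two interaction coefficients (1 + 3), which is 96; with all inputs set to 0 only the intercept remains. The sketch below reproduces this with plain NumPy. It is an independent cross-check, not the library's `evaluateModelFast`, and `evaluate_polynomial` is a hypothetical name.

```python
import numpy as np

# Coefficients copied from the true model string defined earlier.
intercept = 10.0
linear = np.array([1, 2, 3, 4, 5, 6, 7, 8,
                   1, 2, 3, 4, 5, 6, 7, 8,
                   1, 2, 3, 4], dtype=float)
interactions = [(1.0, 0, 1), (3.0, 3, 6)]  # (coefficient, i, j) for c * o_i * o_j


def evaluate_polynomial(X):
    """Evaluate the polynomial on each row of X (shape: n_samples x 20)."""
    X = np.atleast_2d(np.asarray(X, dtype=float))
    y = intercept + X @ linear
    for c, i, j in interactions:
        y = y + c * X[:, i] * X[:, j]
    return y


print(evaluate_polynomial(np.ones((1, 20))))   # 10 + 82 + 4 = 96
print(evaluate_polynomial(np.zeros((1, 20))))  # intercept only: 10
```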

# 1.3 Sample data

In [26]:
# Here we select how many sample data points we would like to use for learning
budget = 10000
learner = MLearner(budget, ndim, true_power_model)
learner.sample_random()
learner.X, learner.y

(array([[0, 1, 0, ..., 0, 1, 0],
        [1, 0, 0, ..., 0, 1, 0],
        [1, 1, 1, ..., 0, 0, 0],
        ..., 
        [0, 1, 0, ..., 1, 0, 1],
        [1, 0, 1, ..., 0, 0, 0],
        [1, 0, 1, ..., 0, 0, 1]]),
 array([ 56.,  44.,  58., ...,  53.,  54.,  41.]))
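
The printed `learner.X` contains only 0s and 1s, which suggests that each sample is a random binary configuration of the 20 variables and that `learner.y` holds the model's response for each row. The following is a sketch under that assumption; it is not the implementation of `sample_random`, and `sample_random_sketch` and `measure` are hypothetical names.

```python
import numpy as np


def sample_random_sketch(budget, ndim, measure, seed=None):
    """Draw `budget` random 0/1 configurations and measure a response for each.

    Assumption: every variable is binary and drawn uniformly at random.
    """
    rng = np.random.default_rng(seed)
    X = rng.integers(0, 2, size=(budget, ndim))  # upper bound 2 is exclusive
    y = measure(X)
    return X, y


# A stand-in response: intercept 10 plus weights 1..ndim on the inputs.
ndim = 20
weights = np.arange(1, ndim + 1, dtype=float)
X, y = sample_random_sketch(5, ndim, lambda X: 10.0 + X @ weights, seed=0)
print(X.shape, y.shape)
```

Passing a fixed `seed` makes the sample reproducible, which is convenient when comparing learning runs.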

# 1.4 Learn the model

In [27]:
# Let's start learning
learned_model = learner.discover()
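
What `discover()` does internally is not shown in this tutorial. One common way to recover a polynomial with pairwise interactions from sampled data is ordinary least squares on expanded features, and the sketch below illustrates that idea with plain NumPy on a small 4-variable example. The technique, the helper `interaction_features`, and the toy ground truth are all assumptions made for illustration, not the library's method.

```python
import itertools

import numpy as np


def interaction_features(X):
    """Build columns: intercept, each variable, then every product x_i * x_j (i < j)."""
    n, d = X.shape
    pairs = list(itertools.combinations(range(d), 2))
    columns = [np.ones(n)]
    columns += [X[:, i] for i in range(d)]
    columns += [X[:, i] * X[:, j] for i, j in pairs]
    return np.column_stack(columns), pairs


rng = np.random.default_rng(0)
ndim = 4
X = rng.integers(0, 2, size=(500, ndim)).astype(float)

# Hypothetical ground truth: 10 + 1*o0 + 2*o1 + 3*o0*o1
y = 10 + 1 * X[:, 0] + 2 * X[:, 1] + 3 * X[:, 0] * X[:, 1]

features, pairs = interaction_features(X)
coef, *_ = np.linalg.lstsq(features, y, rcond=None)

print("intercept:", round(float(coef[0]), 2))
print("linear:", np.round(coef[1:1 + ndim], 2))
for (i, j), c in zip(pairs, coef[1 + ndim:]):
    if abs(c) > 1e-6:  # hide interaction terms that are numerically zero
        print(f"o{i} * o{j}: {c:.2f}")
```

Because the data are noise-free and the feature set contains the true terms, the fit recovers the coefficients up to floating-point error; terms that are absent from the ground truth come out numerically zero.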

# 1.5 Generate output model representation

In [28]:
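# Read the fitted coefficients from the 'linear' step of the learned pipeline and turn them back into model terms,
# so the learned model can be printed in the same form as the true model
learned_power_model_terms = genModelfromCoeff(learned_model.named_steps['linear'].coef_, ndim)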
learned_power_model = Model(learned_power_model_terms, ndim)
print(learned_power_model)

1.00 * o0 + 2.00 * o1 + 3.00 * o2 + 4.00 * o3 + 5.00 * o4 + 6.00 * o5 + 7.00 * o6 + 8.00 * o7 + 1.00 * o8 + 2.00 * o9 + 3.00 * o10 + 4.00 * o11 + 5.00 * o12 + 6.00 * o13 + 7.00 * o14 + 8.00 * o15 + 1.00 * o16 + 2.00 * o17 + 3.00 * o18 + 4.00 * o19 + 0.00 * o0 * o1 + 0.00 * o0 * o2 + -0.00 * o0 * o3 + 0.00 * o0 * o4 + -0.00 * o0 * o5 + -0.00 * o0 * o6 + 0.00 * o0 * o7 + 0.00 * o0 * o8 + 0.00 * o0 * o9 + -0.00 * o0 * o10 + 0.00 * o0 * o11 + 0.00 * o0 * o12 + -0.00 * o0 * o13 + 0.00 * o0 * o14 + 0.00 * o0 * o15 + 0.00 * o0 * o16 + 0.00 * o0 * o17 + 0.00 * o0 * o18 + -0.00 * o0 * o19 + -0.00 * o1 * o2 + 0.00 * o1 * o3 + -0.00 * o1 * o4 + -0.00 * o1 * o5 + 0.00 * o1 * o6 + 0.00 * o1 * o7 + 0.00 * o1 * o8 + -0.00 * o1 * o9 + 0.00 * o1 * o10 + 0.00 * o1 * o11 + -0.00 * o1 * o12 + 0.00 * o1 * o13 + 0.00 * o1 * o14 + 0.00 * o1 * o15 + 0.00 * o1 * o16 + 0.00 * o1 * o17 + -0.00 * o1 * o18 + -0.00 * o1 * o19 + -0.00 * o2 * o3 + -0.00 * o2 * o4 + 0.00 * o2 * o5 + 0.00 * o2 * o6 + 0.00 * o2 * o7 + -