In this notebook we test how the model responds when it is asked to predict three outputs from one input. One of the outputs is approximately equal to the input, and another is the quotient of the other two.

First we import the required libraries. `numpy` lets us manipulate arrays efficiently. `pandas` provides DataFrames, which are our preferred way of storing data. `matplotlib.pyplot` lets us plot graphs of our data. `twinlab` is the main library we are using. Each library is imported under a short alias using `as` for convenience.

In [None]:
# Third-party imports
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Project imports
import twinlab as tl

In this cell we generate the training data. We set `n` to the number of training points we want.
- The first output $y_1$ is an array of length `n` filled with random floats drawn uniformly from $[0, 1)$.
- The third output $y_3$ is an array of `n` evenly spaced numbers between 0 and 1.
- The second output $y_2$ is an array in which each element is the corresponding $y_3$ element divided by the corresponding $y_1$ element.
- The only input is $X$, which is equal to $y_3$ plus a small amount of Gaussian noise (standard deviation 0.05).

In [None]:
dataset_id = "three_outputs.csv"
campaign_id = "three_outputs"

# Training data
n = 100
y1 = np.random.rand(n)
y3 = np.linspace(0, 1, n)
y2 = y3 / y1
X = y3 + np.random.normal(0, 0.05, n)

train_data = pd.DataFrame({"X": X, "y1": y1, "y2": y2, "y3": y3})
print(train_data)
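
As a quick check on this kind of data, the sketch below measures how strongly $X$ correlates with each output. It regenerates similar data with a fixed seed (the seed, `np.random.default_rng`, and the `corr` dictionary are additions for illustration, not part of the notebook's workflow), so the exact numbers will differ from the cell above.

```python
import numpy as np

# Assumption: a fixed seed, used here only to make the check repeatable
rng = np.random.default_rng(0)

n = 100
y1 = rng.random(n)                 # uniform random numbers in [0, 1)
y3 = np.linspace(0, 1, n)          # evenly spaced numbers between 0 and 1
y2 = y3 / y1                       # quotient of the other two outputs
X = y3 + rng.normal(0, 0.05, n)    # noisy copy of y3

# Pearson correlation between the input and each output
corr = {
    name: np.corrcoef(X, y)[0, 1]
    for name, y in [("y1", y1), ("y2", y2), ("y3", y3)]
}
for name, r in corr.items():
    print(f"corr(X, {name}) = {r:+.3f}")
```

Under these assumptions $X$ should be almost perfectly correlated with $y_3$ and close to uncorrelated with $y_1$, which is drawn independently of it.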

In this cell we set the parameters used to train the model: the dataset to train on, and which columns are inputs and which are outputs.

In [None]:
# Define the parameters used to train the model
prediction_params = {
    "filename": dataset_id,
    "inputs": ["X"],
    "outputs": ["y1", "y2", "y3"],
}

We now upload the training data to the twinLab cloud.

Whenever `verbose=True` is passed as an argument, the function prints information about what it is doing. This produces the grey text below the cells when they are run.

In [None]:
tl.upload_dataset(train_data, dataset_name=dataset_id, verbose=True)

`tl.list_datasets()` lets us check that the dataset we uploaded is in the right place.
`tl.query_dataset()` lets us view statistics about the data in our dataset.

In [None]:
_ = tl.list_datasets(verbose=True)
tl.query_dataset(dataset_id)

This cell trains the model on the dataset we uploaded, using the parameters we defined.

In [None]:
tl.train_campaign(prediction_params, campaign_id, verbose=True)

This lists the models currently on the twinLab cloud.

In [None]:
_ = tl.list_campaigns(verbose=True)

This displays information about the model we are using.

In [None]:
_ = tl.query_campaign(campaign_id, verbose=True)

This cell creates the input data for which the model will predict the three outputs.
The data is currently 1001 (set by `num_predictions`) evenly spaced numbers between 0 and 1.

We also put them in a pandas DataFrame.
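
One consequence of using 1001 points is a round grid spacing, since `np.linspace` includes both endpoints. A minimal check (the `grid` and `step` variables are introduced here for illustration):

```python
import numpy as np

num_predictions = 1001
grid = np.linspace(0.0, 1.0, num_predictions)

# Both endpoints are included, so there are num_predictions - 1 intervals
step = grid[1] - grid[0]
print(len(grid), grid[0], grid[-1], step)
```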

In [None]:
num_predictions = 1001
input_dict = {
    "X": np.linspace(0., 1., num_predictions).tolist()
}

prediction_inputs = pd.DataFrame(input_dict)
print(prediction_inputs)

We then give these inputs to the model, and it predicts the three outputs. `df_mean` holds the predicted values. `df_std` holds the model's uncertainty in each predicted value, expressed as a standard deviation.
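
One way to read a predicted mean and its uncertainty together is as a band of the mean plus or minus a multiple of the standard deviation. The sketch below uses made-up stand-in arrays (`mean`, `std`, and `observed` are hypothetical, not twinLab output) and counts how many observations fall inside the one- and two-sigma bands.

```python
import numpy as np

rng = np.random.default_rng(1)  # assumption: seed chosen only for repeatability

# Hypothetical stand-ins for one output column of df_mean and df_std
x = np.linspace(0.0, 1.0, 200)
mean = x                          # pretend the model predicts y = x
std = np.full_like(x, 0.05)       # with a constant uncertainty of 0.05

# Hypothetical observations scattered around the prediction
observed = mean + rng.normal(0.0, 0.05, x.size)

# Fraction of observations inside the mean +/- nsig * std band
coverage = {}
for nsig in [1, 2]:
    inside = np.abs(observed - mean) <= nsig * std
    coverage[nsig] = inside.mean()
    print(f"within {nsig} sigma: {coverage[nsig]:.2f}")
```

For a well-calibrated Gaussian uncertainty, roughly 68% and 95% of points would be expected inside the two bands.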

In [None]:
print(prediction_inputs)
df_mean, df_std = tl.predict_campaign(prediction_inputs, campaign_id, verbose=True)

Now we plot the data on three graphs: one for $X$ against $y_1$, one for $X$ against $y_2$, and one for $X$ against $y_3$.
- The black dots are the training data we gave the model.
- The darkest blue line is the `df_mean` value.
- The shaded blue bands either side show the uncertainty in `df_mean`, drawn at one and two multiples of `df_std` (set by `nsigs`).

What the graphs show:
- $y_1$ settles to a value of around 0.5, the average of the uniform random numbers it was trained on.
- The average of $y_2$ is currently around 2, but it would tend to increase with more data: dividing by a uniform random $y_1$ gives a distribution with no finite mean, so larger samples keep picking up more extreme values.
- $y_2$ also has some enormously high values, which occur whenever $y_1$ is very small, so the result of the division is very large.
- The model is good at predicting $y_3$, because in the training data it is almost the same as the $X$ value the model is given.
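
The behaviour of $y_2$ can be reproduced without the model. The sketch below (an illustration with an arbitrary seed and arbitrary sample sizes, not part of the original workflow) builds $y_2 = y_3 / y_1$ for increasing `n` and prints the mean, median, and maximum.

```python
import numpy as np

rng = np.random.default_rng(0)  # assumption: arbitrary seed for repeatability

stats = {}
for n in [100, 10_000, 1_000_000]:
    y1 = rng.random(n)              # uniform in [0, 1)
    y3 = np.linspace(0, 1, n)
    # Skip any exact zeros in y1 to avoid dividing by zero
    keep = y1 > 0
    y2 = y3[keep] / y1[keep]
    stats[n] = (y2.mean(), np.median(y2), y2.max())
    print(f"n={n:>9}: mean={stats[n][0]:.2f}  "
          f"median={stats[n][1]:.2f}  max={stats[n][2]:.1f}")
```

Because $1/y_1$ has no finite mean when $y_1$ is uniform on $(0, 1)$, the sample mean tends to drift upwards as `n` grows, pulled by the few points where $y_1$ is close to zero, while the median stays near 1.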

In [None]:
# Plot parameters
nsigs = [1, 2]
color = "blue"
alpha = 0.5
plot_training_data = True
plot_model_mean = True
plot_model_bands = True

# Plot results
for Y, Ylabel in zip(["y1", "y2", "y3"], ["$y_1$", "$y_2$", "$y_3$"]):
    grid = prediction_inputs["X"]
    mean = df_mean[Y]
    err = df_std[Y]
    if plot_model_bands:
        label = "Model prediction"
        plt.fill_between(grid, np.nan, np.nan, lw=0, color=color, alpha=alpha, label=label)
        for isig, nsig in enumerate(nsigs):
            plt.fill_between(grid, mean-nsig*err, mean+nsig*err, lw=0, color=color, alpha=alpha/(isig+1))
    if plot_model_mean:
        label = "Model prediction" if not plot_model_bands else None
        plt.plot(grid, mean, color=color, alpha=alpha, label=label)
    if plot_training_data:
        plt.plot(train_data["X"], train_data[Y], ".", color="black", label="Training data")
    plt.xlim((0., 1.))
    plt.xlabel("$X$")
    plt.ylabel(Ylabel)
    plt.legend()
    plt.show()

print(prediction_inputs)
print(df_mean)
print(df_std)

Finally, we can remove our dataset and trained model from the twinLab cloud.

In [None]:
# Delete campaign and dataset (if desired)
tl.delete_campaign(campaign_id, verbose=True)
tl.delete_dataset(dataset_id, verbose=True)

These results show that it is very difficult to determine two numbers based only on their quotient, as many different pairs of values give the same quotient.
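
To make the ambiguity concrete, the short sketch below lists several different pairs that share one quotient (the value 0.5 and the candidate arrays are chosen purely for illustration).

```python
import numpy as np

q = 0.5                                   # a single observed quotient y3 / y1
y1_candidates = np.array([0.2, 0.4, 0.8, 1.0])
y3_candidates = q * y1_candidates         # every pair (y1, q * y1) gives q

for a, b in zip(y1_candidates, y3_candidates):
    print(f"y1={a:.1f}, y3={b:.1f} -> y3/y1={b / a:.1f}")
```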