# Building Models with Multiple Outputs

In [1]:
import pandas as pd
import numpy as np
from numpy import unique
import matplotlib.pyplot as plt

In [2]:
from keras.layers import Input, Dense, Embedding, Flatten, Subtract, Add, Concatenate
from keras.models import Model
from keras.utils import plot_model

Using TensorFlow backend.


In [3]:
# load the data
full = pd.read_csv('./data/basket-ball/games_season.csv')
tournament = pd.read_csv('./data/basket-ball/games_tourney.csv')
full.shape, tournament.shape

((312178, 8), (4234, 9))

We'll build one model that makes two predictions: the scores of both teams in a given game. Our inputs will be the seed difference of the two teams, as well as the predicted score difference from the model we built in previous notebooks.

The output from our model will be the predicted score for team 1 as well as team 2. This is called "multiple target regression": one model making more than one prediction.

In [4]:
# Create a single input layer with 2 columns.
input_tensor = Input(shape=(2,))

# Connect this input to a Dense layer with 2 units.
output_tensor = Dense(2)(input_tensor)

# Create a model with input_tensor as the input and output_tensor as the output.
model = Model(input_tensor, output_tensor)

# Compile the model
model.compile(optimizer='adam', loss='mean_absolute_error')

Now that we've defined our 2-output model, we'll fit it to the tournament data. We'll split the data into `tournament_train` and `tournament_test` and fit on the training set only for now.

This model will use the pre-tournament seeds, as well as the pre-tournament predictions from the regular season model we built previously.

As a reminder, this model will predict the scores of both teams.

In [9]:
# load the data from disk
import feather

tournament = feather.read_dataframe('./tmp/tournament')
print(tournament.shape)
tournament.head()

(4234, 10)


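| | season | team_1 | team_2 | home | seed_diff | score_diff | score_1 | score_2 | won | pred |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1985 | 288 | 73 | 0 | -3 | -9 | 41 | 50 | 0 | 0.065246 |
| 1 | 1985 | 5929 | 73 | 0 | 4 | 6 | 61 | 55 | 1 | 0.120679 |
| 2 | 1985 | 9884 | 73 | 0 | 5 | -4 | 59 | 63 | 0 | 0.105372 |
| 3 | 1985 | 73 | 288 | 0 | 3 | 9 | 50 | 41 | 1 | 0.062881 |
| 4 | 1985 | 3920 | 410 | 0 | 1 | -9 | 54 | 63 | 0 | 0.185282 |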


In [12]:
# split tournament dataset into train and test
tournament_train, tournament_test = tournament[:3168], tournament[3168:]
tournament_train.shape, tournament_test.shape

((3168, 10), (1066, 10))

In [13]:
# Fit the model to the tournament_train dataset using 100 epochs and a batch size of 16384.
# The input columns are 'seed_diff', and 'pred'.
# The target columns are 'score_1' and 'score_2'.
model.fit(
    tournament_train[['seed_diff', 'pred']],
    tournament_train[['score_1', 'score_2']],
    verbose=True,
    epochs=100,
    batch_size=16384
)

Epoch 1/100
...
Epoch 100/100


<keras.callbacks.History at 0x7f66c5799ba8>

**Inspect the model**

We'll use the `.get_weights()` method to inspect the model's weights. For this single `Dense` layer it returns two arrays:

* the kernel has 4 weights: one for each of the 2 inputs times each of the 2 outputs.

* the bias has 2 weights, one for each output.
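As a rough sketch of what those six numbers do: a `Dense(2)` layer with no activation is a matrix multiply plus a bias. The NumPy snippet below mimics that arithmetic outside Keras. The kernel and bias values are copied from the `get_weights()` output printed in the next cell, and the single input row (`seed_diff = -3`, `pred = 0.065246`) is the first row of the tournament table, used here purely for illustration.

```python
import numpy as np

# Kernel and bias copied from the get_weights() output in this notebook.
kernel = np.array([[-0.01152442, 1.1569821],
                   [1.29076, -0.3129429]])    # shape: (2 inputs, 2 outputs)
bias = np.array([0.10000583, 0.10000583])     # shape: (2 outputs,)

# One illustrative game: columns are seed_diff and pred.
x = np.array([[-3.0, 0.065246]])

# A Dense layer without an activation computes x @ kernel + bias.
scores = x @ kernel + bias

# Total trainable parameters: 4 kernel weights + 2 biases.
n_params = kernel.size + bias.size
print(scores.shape, n_params)
```

The output has one row per game and one column per predicted score, which is all that "multiple target regression" means at the level of tensor shapes.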

In [16]:
# Print the model's weights
print(model.get_weights())
print()

# Print the column means of the training data
print(tournament_train.mean())

[array([[-0.01152442,  1.1569821 ],
       [ 1.29076   , -0.3129429 ]], dtype=float32), array([0.10000583, 0.10000583], dtype=float32)]

season        1997.045455
team_1        5546.025568
team_2        5546.025568
home             0.000000
seed_diff        0.000000
score_diff       0.000000
score_1         71.912247
score_2         71.912247
won              0.500000
pred             0.124636
dtype: float64


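In a well-fit model, both bias terms would be close to `72`, because on average a team scores about 72 points in the tournament (see the `score_1` and `score_2` means above). In the run shown here, both biases are only about `0.10`, so the model is still far from converged after 100 epochs.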
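The biases printed above are about `0.10`, well short of 72. One plausible explanation, assuming Keras's default Adam learning rate of 0.001 and a zero-initialized bias: with a batch size of 16384 and only 3168 training rows, each epoch is a single gradient step, and Adam moves a parameter by roughly the learning rate per step while the gradient keeps the same sign. The sketch below re-implements the textbook Adam update in plain Python for a single bias under absolute error. It is not Keras's exact code, and `adam_bias_after` and its arguments are hypothetical names.

```python
import math

def adam_bias_after(steps, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-7, target=72.0):
    """Run textbook Adam on one bias b minimizing |target - b|, starting at b = 0."""
    b, m, v = 0.0, 0.0, 0.0
    for t in range(1, steps + 1):
        # The gradient of |target - b| with respect to b is -1 while b < target.
        grad = -1.0 if b < target else 1.0
        m = beta1 * m + (1 - beta1) * grad        # first-moment estimate
        v = beta2 * v + (1 - beta2) * grad ** 2   # second-moment estimate
        m_hat = m / (1 - beta1 ** t)              # bias-corrected moments
        v_hat = v / (1 - beta2 ** t)
        b -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return b

bias_100 = adam_bias_after(100)   # roughly 100 steps * 0.001 per step
steps_to_target = 72.0 / 0.001    # order of magnitude needed at this rate
print(bias_100, steps_to_target)
```

Under these assumptions, reaching a bias near 72 would take tens of thousands of steps, so a larger learning rate, a smaller batch size, or more epochs would likely be needed.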

**Evaluate the model**

We'll evaluate the model on the tournament test set to see how well it performs on new data.

In [18]:
# Evaluate the model on the tournament test data, use the same inputs and outputs as the training set.
model.evaluate(
    tournament_test[['seed_diff', 'pred']], 
    tournament_test[['score_1', 'score_2']]
)



68.60479482969245
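For a sense of scale, it can help to compare that number against trivial predictors. The sketch below assumes that Keras's `'mean_absolute_error'` averages over both output columns and all rows. It scores two constant predictions on the five rows shown by `tournament.head()` above, which are not the actual test set, so the numbers are only indicative.

```python
import numpy as np

# score_1 and score_2 for the five rows shown by tournament.head().
y_true = np.array([[41, 50],
                   [61, 55],
                   [59, 63],
                   [50, 41],
                   [54, 63]], dtype=float)

def mean_absolute_error(y_true, y_pred):
    # Average the absolute error over every row and every output column.
    return float(np.mean(np.abs(y_true - y_pred)))

# Baseline 1: always predict 0 for both scores.
zero_mae = mean_absolute_error(y_true, np.zeros_like(y_true))

# Baseline 2: always predict the training mean score printed earlier.
mean_mae = mean_absolute_error(y_true, np.full_like(y_true, 71.912247))

print(zero_mae, mean_mae)
```

A test loss near 68.6 is much closer to what an all-zeros predictor would likely score than to the constant-mean baseline, which is consistent with a model that has not finished training.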

### Build a model that performs both Regression and Classification

We'll create a 2-output model that will

* predict the score difference (instead of both teams' scores) and

* predict the probability that team 1 won the game.

It will perform both classification and regression!

In this model, turn off the bias (intercept) for each layer. The inputs (seed difference and predicted score difference) have means very close to zero, and so does the `score_diff` target. The `won` target has a mean of 0.5, which is exactly what a sigmoid outputs at zero, so neither layer should need a bias term to fit the data well.

In [19]:
# Create an input layer with 2 columns
input_tensor = Input(shape=(2,))

# The first output layer should have 1 unit with 'linear' activation and no bias term.
output_tensor_1 = Dense(1, activation='linear', use_bias=False)(input_tensor)

# The second output layer should have 1 unit with 'sigmoid' activation and no bias term. 
# Use the first output layer as an input to this layer.
output_tensor_2 = Dense(1, activation='sigmoid', use_bias=False)(output_tensor_1)

# Create a model with 2 outputs
model = Model(input_tensor, [output_tensor_1, output_tensor_2])

Now that we have a model with 2 outputs, compile it with 2 loss functions: mean absolute error (MAE) for `'score_diff'` and binary cross-entropy (also known as logloss) for `'won'`. Then fit the model with `'seed_diff'` and `'pred'` as inputs. For outputs, predict `'score_diff'` and `'won'`.

This model can use the scores of the games to make sure that close games (small score diff) have lower win probabilities than blowouts (large score diff).

The regression problem is easier than the classification problem because MAE punishes the model less for a loss due to random chance. For example, if `score_diff` is `-1` and `won` is `0`, that means `team_1` had some bad luck and lost by a single free throw. The data for the easy problem helps the model find a solution to the hard problem.
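To make the two losses concrete, here is a small sketch that scores a single hypothetical game by hand. The helper names `absolute_error` and `binary_crossentropy` and the predicted values are illustrative assumptions, not output from the model.

```python
import math

def absolute_error(y_true, y_pred):
    # Per-example term of mean absolute error.
    return abs(y_true - y_pred)

def binary_crossentropy(y_true, p):
    # Per-example log loss for a predicted win probability p.
    return -(y_true * math.log(p) + (1 - y_true) * math.log(1 - p))

# Hypothetical game: team_1 lost by one point.
score_diff_true, won_true = -1.0, 0.0

# Two hypothetical predictions: a narrow win and a blowout win for team_1.
for score_diff_pred, win_prob_pred in [(1.0, 0.53), (10.0, 0.80)]:
    mae = absolute_error(score_diff_true, score_diff_pred)
    logloss = binary_crossentropy(won_true, win_prob_pred)
    print(score_diff_pred, mae, round(logloss, 3))
```

Both losses grow as the prediction moves further from what happened. Absolute error grows linearly with the miss, while log loss grows faster as the predicted probability becomes more confidently wrong.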

In [21]:
# Import the Adam optimizer
from keras.optimizers import Adam

# Compile the model with 2 losses: 'mean_absolute_error' and 'binary_crossentropy', 
# and use the Adam optimizer with a learning rate of 0.01.
# Note: newer Keras releases name this argument `learning_rate` instead of `lr`.
model.compile(loss=['mean_absolute_error', 'binary_crossentropy'], optimizer=Adam(lr=0.01))

# Fit the model with 'seed_diff' and 'pred' columns as the inputs and 
# 'score_diff' and 'won' columns as the targets. Use 10 epochs and a batch size of 16384.
model.fit(
    tournament_train[['seed_diff', 'pred']],
    [tournament_train[['score_diff']], tournament_train[['won']]],
    epochs=10,
    verbose=True,
    batch_size=16384
)

Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


<keras.callbacks.History at 0x7f66b4070908>

**Inspect the Model**

Examine the weights for the model. In particular, note the last weight of the model. This weight converts the predicted score difference to a predicted win probability. If you multiply the predicted score difference by the last weight of the model and then apply the sigmoid function, you get the win probability of the game.

In [24]:
# Print the model weights
print(model.get_weights())
print()

# Print the training data means
print(tournament_train.mean())

[array([[1.1135603],
       [0.9131878]], dtype=float32), array([[0.14484733]], dtype=float32)]

season        1997.045455
team_1        5546.025568
team_2        5546.025568
home             0.000000
seed_diff        0.000000
score_diff       0.000000
score_1         71.912247
score_2         71.912247
won              0.500000
pred             0.124636
dtype: float64


In [25]:
# Import the sigmoid function from scipy
from scipy.special import expit as sigmoid

# Weight from the model
weight = 0.14

# Print the approximate win probability of a predicted close game (1 point difference)
print(sigmoid(1 * weight))

# Print the approximate win probability of a predicted blowout game (10 point difference)
print(sigmoid(10 * weight)) 

0.5349429451582145
0.8021838885585818


So `sigmoid(1 * 0.14)` is about `0.53`, barely better than a coin flip, and `sigmoid(10 * 0.14)` is about `0.80`, a fairly likely win. In other words, when the model predicts a 1-point win, it is less sure of the win than when it predicts a 10-point win.
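The same calculation can be chained from the raw inputs. The sketch below reproduces the model's forward pass in NumPy, using the two weight arrays printed by `get_weights()` above. The input row (`seed_diff = -3`, `pred = 0.065246`) is the first row of the tournament table, chosen only as an example, and the local `sigmoid` helper is a stand-in for `scipy.special.expit`.

```python
import numpy as np

# Weights copied from the printed get_weights() output (no bias terms).
kernel_1 = np.array([[1.1135603],
                     [0.9131878]])    # maps (seed_diff, pred) -> score_diff
kernel_2 = np.array([[0.14484733]])   # maps score_diff -> pre-sigmoid value

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# One illustrative game: columns are seed_diff and pred.
x = np.array([[-3.0, 0.065246]])

score_diff = x @ kernel_1                  # first output (linear activation)
win_prob = sigmoid(score_diff @ kernel_2)  # second output (sigmoid activation)

print(score_diff, win_prob)
```

For this row the predicted score difference is negative, so the predicted win probability falls below 0.5, as expected for the lower-rated side.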

**Evaluate the Model**

We'll evaluate the model on the tournament test set to see how well it does on new data.

Note that in this case, Keras returns 3 numbers: the first is the sum of both losses, and the next 2 are the individual losses, in the order they were passed to `compile`.

In [27]:
# Evaluate the model on new data
model.evaluate(
    tournament_test[['seed_diff', 'pred']],
    [tournament_test[['score_diff']], tournament_test[['won']]]
)



[9.669232912403558, 9.085774332229013, 0.5834585309699597]
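As a quick sanity check, the three numbers above can be compared directly. This sketch assumes the default behavior where each loss has a weight of 1 (no `loss_weights` argument was passed to `compile`), so the total should equal the plain sum up to floating-point rounding.

```python
import math

# Values copied from the model.evaluate() output above.
total_loss, mae_loss, logloss = [9.669232912403558, 9.085774332229013, 0.5834585309699597]

# With equal loss weights, the total is the sum of the two individual losses.
matches = math.isclose(total_loss, mae_loss + logloss, rel_tol=1e-6)
print(matches)
```

Because the MAE term is more than ten times larger than the log loss here, it dominates the total that the optimizer minimizes.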