# Multiple Outputs

In this chapter, you will build neural networks with multiple outputs, which can be used to solve regression problems with multiple targets. You will also build a model that solves a regression problem and a classification problem simultaneously.

# (1) Two-output models 

## Simple model with 2 outputs

In [None]:
from keras.layers import Input, Concatenate, Dense
input_tensor = Input(shape=(1,))
output_tensor = Dense(2)(input_tensor)

<img src="image/Screenshot 2021-01-31 234418.png">

In [None]:
from keras.models import Model
model = Model(input_tensor, output_tensor)
model.compile(optimizer='adam', loss='mean_absolute_error')

<img src="image/Screenshot 2021-01-31 234612.png">

## Fitting a model with 2 outputs

In [None]:
games_tourney_train[['seed_diff', 'score_1', 'score_2']].head()

In [None]:
| | seed_diff | score_1 | score_2 |
| :-: | :-: | :-: | :-: |
| 0 | -3 | 41 | 50 |
| 1 | 4 | 61 | 55 |
| 2 | 5 | 59 | 63 |
| 3 | 3 | 50 | 41 |
| 4 | 1 | 54 | 63 |

In [None]:
X = games_tourney_train[['seed_diff']]
y = games_tourney_train[['score_1', 'score_2']]
model.fit(X, y, epochs=500)

## Inspecting a 2-output model

In [None]:
model.get_weights()

In [None]:
[array([[0.60714734]], dtype=float32), 
array([70.39491, 70.39306], dtype=float32)]

## Evaluating a model with 2 outputs

In [None]:
X = games_tourney_test[['seed_diff']]
y = games_tourney_test[['score_1', 'score_2']]
model.evaluate(X, y)

In [None]:
11.528035634635021
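
Even though the model has two output columns, `evaluate` reports a single number here because there is only one loss. The sketch below assumes that the mean absolute error is averaged over both score columns and over all games. The actual scores are the five rows from the table above; the predictions are made up purely for illustration.

```python
# Actual (score_1, score_2) pairs: the five rows shown in the table above.
actual = [(41, 50), (61, 55), (59, 63), (50, 41), (54, 63)]

# Hypothetical predictions: a made-up model that always predicts 55 for both teams.
predicted = [(55, 55)] * len(actual)

def mean_absolute_error(y_true, y_pred):
    """Average the absolute error over every target column and every row."""
    errors = [
        abs(t - p)
        for row_true, row_pred in zip(y_true, y_pred)
        for t, p in zip(row_true, row_pred)
    ]
    return sum(errors) / len(errors)

mae = mean_absolute_error(actual, predicted)
print(mae)  # 6.5
```

A value of 6.5 means the made-up predictions miss each team's score by 6.5 points on average.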

# Exercise I: Simple two-output model

In this exercise, you will use the tournament data to build one model that makes two predictions: the scores of both teams in a given game. Your inputs will be the seed difference of the two teams, as well as the predicted score difference from the model you built in chapter 3.

The output from your model will be the predicted score for team 1 as well as team 2. This is called "multiple target regression": one model making more than one prediction.

### Instructions

- Create a single input layer with 2 columns.
- Connect this input to a Dense layer with 2 units.
- Create a model with `input_tensor` as the input and `output_tensor` as the output.
- Compile the model with `'adam'` as the optimizer and `'mean_absolute_error'` as the loss function.


In [None]:
# Define the input
input_tensor = Input(shape=(2,))

# Define the output
output_tensor = Dense(2)(input_tensor)

# Create a model
model = Model(input_tensor, output_tensor)

# Compile the model
model.compile(optimizer='adam', loss='mean_absolute_error')

# Exercise II: Fit a model with two outputs

Now that you've defined your 2-output model, fit it to the tournament data. I've split the data into `games_tourney_train` and `games_tourney_test`, so use the training set to fit for now.

This model will use the pre-tournament seeds, as well as your pre-tournament predictions from the regular season model you built previously in this course.

As a reminder, this model will predict the scores of both teams.

### Instructions

- Fit the model to the `games_tourney_train` dataset using 100 epochs and a batch size of 16384.
- The input columns are `'seed_diff'` and `'pred'`.
- The target columns are `'score_1'` and `'score_2'`.


In [None]:
# Fit the model
model.fit(games_tourney_train[['seed_diff', 'pred']],
          games_tourney_train[['score_1', 'score_2']],
          verbose=True,
          epochs=100,
          batch_size=16384)

# Exercise III: Inspect the model (I)

Now that you've fit your model, let's take a look at it. You can use the `.get_weights()` method to inspect your model's weights.

The first array holds the 4 weights that connect the inputs to the outputs: 2 inputs times 2 outputs.

The second array holds the 2 bias weights, one for each output.
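
The weight counts above can be sketched in plain Python. This is only an illustration of what a `Dense(2)` layer on 2 inputs computes; the weight values and the input row are hypothetical, not taken from a fitted model.

```python
# Hypothetical weights for a Dense(2) layer with 2 inputs (illustrative values only).
kernel = [[0.5, -0.5],   # weights from input 0 (e.g. seed_diff) to each output
          [1.0,  1.0]]   # weights from input 1 (e.g. pred) to each output
bias = [70.0, 70.0]      # one bias per output

def dense_forward(x, kernel, bias):
    """Compute x @ kernel + bias for a single input row."""
    return [
        sum(x[i] * kernel[i][j] for i in range(len(x))) + bias[j]
        for j in range(len(bias))
    ]

n_kernel = sum(len(row) for row in kernel)  # 2 inputs * 2 outputs
n_bias = len(bias)                          # 1 per output

prediction = dense_forward([-3.0, -2.0], kernel, bias)
print(n_kernel, n_bias)  # 4 2
print(prediction)        # [66.5, 69.5]
```

Each output is a weighted sum of both inputs plus its own bias, which is why one layer can predict both teams' scores at once.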

### Instructions

- Print the `model`'s weights.
- Print the column means of the training data (`games_tourney_train`).


In [None]:
# Print the model's weights
print(model.get_weights())

# Print the column means of the training data
print(games_tourney_train.mean())

# Exercise IV: Evaluate the model

Now that you've fit your model and inspected its weights to make sure they make sense, evaluate it on the tournament test set to see how well it performs on new data.

### Instructions

- Evaluate the model on `games_tourney_test`.
- Use the same inputs and outputs as the training set.


In [None]:
# Evaluate the model on the tournament test data
print(model.evaluate(games_tourney_test[['seed_diff', 'pred']], games_tourney_test[['score_1', 'score_2']], verbose=False))

# (2) Single model for classification and regression

## Building a simple regressor/classifier

In [None]:
from keras.layers import Input, Dense
input_tensor = Input(shape=(1,))
output_tensor_reg = Dense(1)(input_tensor)
output_tensor_class = Dense(1, activation='sigmoid')(output_tensor_reg)

<img src="image/Screenshot 2021-02-01 001654.png">

## Make a regressor/classifier model

In [None]:
from keras.models import Model
model = Model(input_tensor, [output_tensor_reg, output_tensor_class])
model.compile(loss=['mean_absolute_error', 'binary_crossentropy'], optimizer='adam')

<img src="image/Screenshot 2021-02-01 001938.png">

## Fit the combination classifier/regressor

In [None]:
X = games_tourney_train[['seed_diff']]
y_reg = games_tourney_train[['score_diff']]
y_class = games_tourney_train['won']
model.fit(X, [y_reg, y_class], epochs=100)

## Look at the model's weights

In [None]:
model.get_weights()
[array([[1.2371823]], dtype=float32),
array([-0.05451894], dtype=float32),
array([0.13870609], dtype=float32),
array([0.00734114], dtype=float32)]

<img src="image/Screenshot 2021-02-01 002511.png">

In [None]:
from scipy.special import expit as sigmoid
print(sigmoid(1 * 0.13870609 + 0.00734114))

In [None]:
0.5364470465211318
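
To see how the two outputs are chained, here is a minimal sketch that pushes a seed difference through both layers by hand. It uses the four weights printed above and assumes they are ordered as regression weight, regression bias, classifier weight, classifier bias. The `predict` helper is hypothetical and only mirrors the architecture defined earlier.

```python
import math

# Weights as printed by model.get_weights() above, assumed to be ordered as
# [regression kernel, regression bias, classifier kernel, classifier bias].
reg_weight, reg_bias = 1.2371823, -0.05451894
class_weight, class_bias = 0.13870609, 0.00734114

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict(seed_diff):
    """Return (predicted score difference, predicted win probability)."""
    score_diff = seed_diff * reg_weight + reg_bias
    win_prob = sigmoid(score_diff * class_weight + class_bias)
    return score_diff, win_prob

score_diff, win_prob = predict(1.0)
print(score_diff, win_prob)
```

The classifier never sees the seed difference directly: it only sees the regression output, so its prediction depends entirely on the predicted score difference.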

## Evaluate the model on new data

In [None]:
X = games_tourney_test[['seed_diff']]
y_reg = games_tourney_test[['score_diff']]
y_class = games_tourney_test[['won']]
model.evaluate(X, [y_reg, y_class])

In [None]:
[9.866300069455413, 9.281179495657208, 0.585120575627864]

# Exercise V: Classification and regression in one model

Now you will create a different kind of 2-output model. This time, you will predict the score difference instead of both teams' scores, and then you will predict the probability that team 1 won the game. This is a pretty cool model: it is going to do both classification and regression!

In this model, turn off the bias, or intercept, for each layer. Your inputs (seed difference and predicted score difference) have means very close to zero, and your outputs both have means that are close to zero, so your model shouldn't need the bias term to fit the data well.
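
The reasoning about the bias term can be illustrated with a one-variable line fit. This sketch uses ordinary least squares as a stand-in (the Keras model is trained with a different loss and optimizer), and the data are made up so that both `x` and `y` average to zero.

```python
# Illustrative data only: both x and y are centered on zero.
x = [-2.0, -1.0, 0.0, 1.0, 2.0]
y = [-4.0, -2.5, 0.5, 2.0, 4.0]

def mean(values):
    return sum(values) / len(values)

x_bar, y_bar = mean(x), mean(y)

# Closed-form least-squares fit of y = slope * x + intercept.
slope = (
    sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
    / sum((a - x_bar) ** 2 for a in x)
)
intercept = y_bar - slope * x_bar

print(slope, intercept)
```

Because the intercept equals the mean of `y` minus the slope times the mean of `x`, it comes out as zero when both means are zero, so dropping it costs nothing in this toy case.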

### Instructions

- Create a single input layer with 2 columns.
- The first output layer should have 1 unit with `'linear'` activation and no bias term.
- The second output layer should have 1 unit with `'sigmoid'` activation and no bias term. Also, use the first output layer as an input to this layer.
- Create a model with this input and these outputs.


In [None]:
# Create an input layer with 2 columns
input_tensor = Input(shape=(2,))

# Create the first output
output_tensor_1 = Dense(1, activation='linear', use_bias=False)(input_tensor)

# Create the second output (use the first output as input here)
output_tensor_2 = Dense(1, activation='sigmoid', use_bias=False)(output_tensor_1)

# Create a model with 2 outputs
model = Model(input_tensor, [output_tensor_1, output_tensor_2])

# Exercise VI: Compile and fit the model

Now that you have a model with 2 outputs, compile it with 2 loss functions: mean absolute error (MAE) for `'score_diff'` and binary cross-entropy (also known as logloss) for `'won'`. Then fit the model with `'seed_diff'` and `'pred'` as inputs. For outputs, predict `'score_diff'` and `'won'`.

This model can use the scores of the games to make sure that close games (small score diff) have lower win probabilities than blowouts (large score diff).

The regression problem is easier than the classification problem because MAE punishes the model less for a loss due to random chance. For example, if `score_diff` is -1 and `won` is 0, that means `team_1` had some bad luck and lost by a single free throw. The data for the easy problem helps the model find a solution to the hard problem.
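
The free-throw example can be put into numbers. The sketch below assumes a hypothetical prediction of +1 for the score difference and uses the approximate weight of 0.14 that appears elsewhere in this chapter to turn it into a win probability.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

weight = 0.14          # approximate last weight of the model, as used in this chapter
predicted_diff = 1.0   # hypothetical: the model expects team 1 to win by 1 point
actual_diff = -1.0     # team 1 actually lost by 1 point
won = 0                # so the classification target is 0

# Regression loss for this game: absolute error in points.
abs_error = abs(actual_diff - predicted_diff)

# Classification loss for this game: binary cross-entropy.
p_win = sigmoid(predicted_diff * weight)
log_loss = -(won * math.log(p_win) + (1 - won) * math.log(1 - p_win))

# Loss of a model that always predicts a 50% win probability.
coin_flip_loss = math.log(2)

print(abs_error, log_loss, coin_flip_loss)
```

The regression error is only 2 points, while the classification loss for the same game is worse than that of a coin flip, even though the prediction was nearly right.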

### Instructions

- Import `Adam` from `keras.optimizers`.
- Compile the model with 2 losses: `'mean_absolute_error'` and `'binary_crossentropy'`, and use the Adam optimizer with a learning rate of 0.01.
- Fit the model with `'seed_diff'` and `'pred'` columns as the inputs and `'score_diff'` and `'won'` columns as the targets.
- Use 10 epochs and a batch size of 16384.


In [None]:
# Import the Adam optimizer
from keras.optimizers import Adam

# Compile the model with 2 losses and the Adam optimizer with a higher learning rate
model.compile(loss=['mean_absolute_error', 'binary_crossentropy'], optimizer=Adam(0.01))

# Fit the model to the tournament training data, with 2 inputs and 2 outputs
model.fit(games_tourney_train[['seed_diff', 'pred']],
          [games_tourney_train[['score_diff']], games_tourney_train[['won']]],
          epochs=10,
          verbose=True,
          batch_size=16384)

# Exercise VII: Inspect the model (II)

Now you should take a look at the weights for this model. In particular, note the last weight of the model. This weight converts the predicted score difference to a predicted win probability. If you multiply the predicted score difference by the last weight of the model and then apply the sigmoid function, you get the win probability of the game.

### Instructions 1/2

- Print the `model`'s weights.
- Print the column means of the training data (`games_tourney_train`).

In [None]:
# Print the model weights
print(model.get_weights())

# Print the training data means
print(games_tourney_train.mean())

### Instructions 2/2

- Print the approximate win probability predicted for a close game (1 point difference).
- Print the approximate win probability predicted for a blowout game (10 point difference).

In [None]:
# Import the sigmoid function from scipy
from scipy.special import expit as sigmoid

# Weight from the model
weight = 0.14

# Print the approximate win probability predicted for a close game
print(sigmoid(1 * weight))

# Print the approximate win probability predicted for a blowout game
print(sigmoid(10 * weight))

# Exercise VIII: Evaluate on new data with two metrics

Now that you've fit your model and inspected its weights to make sure they make sense, evaluate your model on the tournament test set to see how well it does on new data.

Note that in this case, Keras will return 3 numbers: the first number will be the sum of both the loss functions, and then the next 2 numbers will be the loss functions you used when defining the model.
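
As a quick check of this relationship, the sketch below takes the three values printed by `model.evaluate()` in the lecture example earlier in this chapter and confirms that the first is the sum of the other two. It assumes the model was compiled without custom loss weights, as in that example.

```python
import math

# The three values printed by model.evaluate() in the lecture example above.
total_loss, regression_loss, classification_loss = (
    9.866300069455413, 9.281179495657208, 0.585120575627864)

# The first value should match the sum of the other two, up to float rounding.
losses_add_up = math.isclose(
    total_loss, regression_loss + classification_loss, rel_tol=1e-6)

print(losses_add_up)  # True
```

Note that the two losses are on very different scales, so the total is dominated by the regression loss.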

### Instructions

- Evaluate the model on `games_tourney_test`.
- Use the same inputs and outputs as the training set.


In [None]:
# Evaluate the model on new data
print(model.evaluate(games_tourney_test[['seed_diff', 'pred']],
               [games_tourney_test[['score_diff']], games_tourney_test[['won']]], verbose=False))

# Wrap-up

## So far...
- Functional API
- Shared layers
- Categorical embeddings
- Multiple inputs
- Multiple outputs
- Regression / Classification in one model

## Shared layers

Useful for making comparisons. Known in the academic literature as Siamese networks.

- Basketball teams
- Image similarity / retrieval
- Document similarity
- [Link to blog post](https://medium.com/mlreview/implementing-malstm-on-kaggles-quora-question-pairs-competition-8b31b0b16a07)
- [Link to academic paper](http://people.csail.mit.edu/jonasmueller/info/MuellerThyagarajan_AAAI16.pdf)

<img src="image/Screenshot 2021-02-01 010129.png">

## Multiple inputs

<img src="image/Screenshot 2021-02-01 010227.png">

## Multiple outputs

<img src="image/Screenshot 2021-02-01 001938.png">

## Skip connections

In [None]:
input_tensor = Input((100,))
hidden_tensor = Dense(256, activation='relu')(input_tensor)
hidden_tensor = Dense(256, activation='relu')(hidden_tensor)
hidden_tensor = Dense(256, activation='relu')(hidden_tensor)
output_tensor = Concatenate()([input_tensor, hidden_tensor])
output_tensor = Dense(256, activation='relu')(output_tensor)

[Visualizing the Loss Landscape of Neural Nets](https://arxiv.org/pdf/1712.09913.pdf)

<img src="image/Screenshot 2021-02-01 010448.png">

# Best of luck!