# Two Input Networks Using Categorical Embeddings, Shared Layers, and Merge Layers
  
In this chapter, you will build two-input networks that use categorical embeddings to represent high-cardinality data, shared layers to specify re-usable building blocks, and merge layers to join multiple inputs to a single output. By the end of this chapter, you will have the foundational building blocks for designing neural networks with complex data flows.

## Resources
  
**Notebook Syntax**
  
<span style='color:#7393B3'>NOTE:</span>  
- Denotes additional information deemed to be *contextually* important
- Colored in blue, HEX #7393B3
  
<span style='color:#E74C3C'>WARNING:</span>  
- Significant information that is *functionally* critical  
- Colored in red, HEX #E74C3C
  
---
  
**Links**
  
[NumPy Documentation](https://numpy.org/doc/stable/user/index.html#user)  
[Pandas Documentation](https://pandas.pydata.org/docs/user_guide/index.html#user-guide)  
[Matplotlib Documentation](https://matplotlib.org/stable/index.html)  
[Seaborn Documentation](https://seaborn.pydata.org)  
[TensorFlow Documentation](https://www.tensorflow.org)  
[Scikit-Learn Documentation](https://scikit-learn.org/stable/)  
  
---
  
**Notable Functions**
  

<table>
  <tr>
    <th>Index</th>
    <th>Function or Class</th>
    <th>Use</th>
  </tr>
  <tr>
    <td>1</td>
    <td>numpy.array()</td>
    <td>Creates an array. An array is a grid of values and it contains information about the raw data, how to locate an element, and how to interpret an element. It has a grid of elements that can be indexed in various ways.</td>
  </tr>
  <tr>
    <td>2</td>
    <td>numpy.arange()</td>
    <td>Return evenly spaced values within a given interval. Params are start, stop, step</td>
  </tr>
  <tr>
    <td>3</td>
    <td>tensorflow.keras.models.Sequential</td>
    <td>Creates a sequential model in Keras, which is a linear stack of layers. This is the most common type of model in deep learning, where each layer is connected to the next in a sequential manner.</td>
  </tr>
  <tr>
    <td>4</td>
    <td>tensorflow.keras.layers.Dense</td>
    <td>A fully connected layer in a neural network. Dense layers are the most common type of layer used in deep learning models. They have a set of learnable weights and biases and each neuron is connected to every neuron in the previous layer.</td>
  </tr>
  <tr>
    <td>5</td>
    <td>tensorflow.keras.layers.Input</td>
    <td>Instantiates a Keras tensor: a symbolic tensor-like object augmented with attributes that let you build a Keras model just by knowing the inputs and outputs of the model. For instance, if a, b and c are Keras tensors, it becomes possible to do: model = Model(inputs=[a, b], outputs=c)</td>
  </tr>
  <tr>
    <td>6</td>
    <td>tensorflow.keras.layers.Flatten</td>
    <td>Flattens the input. Does not affect the batch size.</td>
  </tr>
  <tr>
    <td>7</td>
    <td>tensorflow.keras.layers.Embedding</td>
    <td>Turns positive integers (indexes) into dense vectors of fixed size. Dictionary-like</td>
  </tr>
  <tr>
    <td>8</td>
    <td>tensorflow.keras.layers.Subtract</td>
    <td>Layer that subtracts two inputs. Element-wise</td>
  </tr>
  <tr>
    <td>9</td>
    <td>tensorflow.keras.layers.Add</td>
    <td>Layer that adds two inputs. Element-wise</td>
  </tr>
  <tr>
    <td>10</td>
    <td>model.compile()</td>
    <td>Compiles a Keras model. It configures the model for training by specifying the optimizer, loss function, and evaluation metrics. This step is required before training a model.</td>
  </tr>
  <tr>
    <td>11</td>
    <td>model.evaluate()</td>
    <td>Returns the loss value and metrics for the model in test mode.</td>
  </tr>
  <tr>
    <td>12</td>
    <td>sklearn.model_selection.train_test_split()</td>
    <td>Splits arrays or matrices into random train and test subsets. This function is commonly used for evaluating the performance of machine learning models.</td>
  </tr>
  <tr>
    <td>13</td>
    <td>keras.models.Model</td>
    <td>A generic Keras model that allows creating complex architectures by connecting different layers together.</td>
  </tr>
</table>
  
---
  
**Language and Library Information**  
  
Python 3.11.0  
  
Name: numpy  
Version: 1.24.3  
Summary: Fundamental package for array computing in Python  
  
Name: pandas  
Version: 2.0.3  
Summary: Powerful data structures for data analysis, time series, and statistics  
  
Name: matplotlib  
Version: 3.7.2  
Summary: Python plotting package  
  
Name: seaborn  
Version: 0.12.2  
Summary: Statistical data visualization  
  
Name: tensorflow  
Version: 2.13.0  
Summary: TensorFlow is an open source machine learning framework for everyone.  
  
Name: scikit-learn  
Version: 1.3.0  
Summary: A set of python modules for machine learning and data mining  
  
---
  
**Miscellaneous Notes**
  
<span style='color:#7393B3'>NOTE:</span>  
  
`python3.11 -m IPython` : Starts an interactive IPython shell for Python 3.11 in the terminal.
  
`nohup ./relo_csv_D2S.sh > ./output/relo_csv_D2S.log &` : Runs the CSV data pipeline script in the background and redirects its output to a log file.  
  
`print(inspect.getsourcelines(test))` : Prints the source lines of a user-defined function (here named `test`).  
  
<span style='color:#7393B3'>NOTE:</span>  
  
Snippet to plot all built-in matplotlib styles :
  
```python

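import matplotlib.pyplot as plt
import numpy as np

# Sample cubic curve to draw in every style
x = np.arange(-2, 8, .1)
y = 0.1 * x ** 3 - x ** 2 + 3 * x + 2
fig = plt.figure(dpi=100, figsize=(10, 20), tight_layout=True)
available = ['default'] + plt.style.available
for i, style in enumerate(available):
    # Apply each style only while its subplot is created, drawn, and titled
    with plt.style.context(style):
        ax = fig.add_subplot(10, 3, i + 1)
        ax.plot(x, y)
        ax.set_title(style)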
```
  


In [1]:
import numpy as np                  # Numerical Python:         Arrays and linear algebra
import pandas as pd                 # Panel Datasets:           Dataset manipulation
import matplotlib.pyplot as plt     # MATLAB Plotting Library:  Visualizations
import seaborn as sns               # Seaborn:                  Visualizations
import tensorflow as tf             # TensorFlow:               Deep-Learning Neural Networks
from tensorflow import keras        # Keras:                    Tensorflow-Keras Integration


# Setting a standard figure size
plt.rcParams['figure.figsize'] = (8, 8)


2023-07-28 13:00:16.463960: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.


## Category embeddings
  
In chapter 1, our dataset of tournament games only contained about 4,000 rows. However, we have a much bigger dataset with over 300,000 regular season games. Let's see what we can learn from a much larger sample of data! In the 2 basketball datasets you will be using in this course, there are a little under 11,000 teams. Each team is coded as an integer starting with 1 and ending with 10,887. In this lesson, you will learn how to use those team IDs as inputs to a model that learns the strength of each team.
  
**Category embeddings**
  
Categorical embeddings are an advanced type of layer, only available in deep learning libraries. They are extremely useful for dealing with high cardinality categorical data. In this dataset, the team ID variable has high cardinality. Embedding layers are also very useful for dealing with text data, such as in Word2vec models, but that is beyond the scope of this course. To model these teams in the basketball data, you'll use a very simple model that learns a "strength" rating for each team and uses those ratings to make predictions. To map the integer team IDs to a decimal rating, we will use an embedding layer.
  
<center><img src='../_images/categorical-embeddings-nn.png' alt='img' width='500'></center>
  
**Inputs**
  
To get started with category embeddings, you will need an input layer. In this case, your input is a single number, ranging from 1 to 10,887, which represents each team's unique ID. Note that this dataset covers about 30 years of data, and has about 400 unique schools, giving us close to 12,000 IDs. We only have about 11,000 of those year/team combinations, because not every school has a basketball team every year.
  
```python
from tensorflow.keras.layers import Input
input_tensor = Input(shape=(1, ))
```
  
**Embedding Layer**
  
To create an embedding layer, use the `Embedding` layer from `tensorflow.keras.layers`. The lecture example treats the dataset as having 10,887 unique teams, so it sets the input dimension of the embedding layer to 10,887. Keep in mind that `input_dim` must be larger than the largest integer ID you pass in, because valid indices run from 0 to `input_dim - 1`. As you are representing each team as a single integer, use an input length of 1. You want to produce a single team strength rating, so use an output dimension of 1. Finally, name your layer, so you can easily find it when looking at the `model.summary()` output or a plot of the model. To use the embedding layer, connect it to the tensor produced by the input layer. This will produce an embedding output tensor.
  
```python
from tensorflow.keras.layers import Input, Embedding

input_tensor = Input(shape=(1, ))

N_TEAMS = 10887
embed_layer = Embedding(
    input_dim=N_TEAMS,
    input_length=1,
    output_dim=1,
    name='Team-Strength-Lookup'
)

embed_tensor = embed_layer(input_tensor)
```
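One detail worth checking before choosing `input_dim`: the embedding table is indexed from 0, so it needs one row per possible ID value, not just one per distinct ID seen. A minimal NumPy sketch, using a small hypothetical array of team IDs (not the course data), shows how the two counts can differ:

```python
import numpy as np

# Hypothetical team IDs for illustration; the real IDs come from the dataset
team_ids = np.array([0, 3, 7, 7, 2])

# Number of distinct IDs that actually appear
n_unique = np.unique(team_ids).shape[0]

# Smallest input_dim that makes every ID a valid row index (0 .. input_dim - 1)
min_input_dim = int(team_ids.max()) + 1

print(n_unique, min_input_dim)  # 4 8
```

When the IDs are contiguous and start at 0, the two numbers coincide, which is the situation in which counting unique IDs is a safe way to size the layer.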
  
**Flattening**
  
Embedding layers increase the dimensionality of your data. The input CSV has two dimensions (rows and columns), but embedding layers add a third dimension. This third dimension can be useful when dealing with images and text, but it is not relevant to this course. Therefore, we use the flatten layer to flatten the embeddings from 3D to 2D. The flatten layer is also the output layer for the embedding process. Flatten layers can be used to transform data from multiple dimensions back down to two dimensions, which makes them useful for dealing with time-series data, text data, and images.
  
```python
from tensorflow.keras.layers import Input, Embedding, Flatten

input_tensor = Input(shape=(1, ))

N_TEAMS = 10887
embed_layer = Embedding(
    input_dim=N_TEAMS,
    input_length=1,
    output_dim=1,
    name='Team-Strength-Lookup'
)

embed_tensor = embed_layer(input_tensor)

flatten_tensor = Flatten()(embed_tensor)
```
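To make the change in dimensionality concrete, the sketch below calls an embedding layer and a flatten layer directly on a small batch and prints the shapes. The embedding size of 5 and the three IDs are hypothetical values chosen only for illustration.

```python
import numpy as np
from tensorflow.keras.layers import Embedding, Flatten

# Tiny hypothetical embedding: 5 possible IDs, one learned number per ID
embed = Embedding(input_dim=5, output_dim=1)

# A batch of 3 rows, each holding a single integer ID
ids = np.array([[0], [3], [4]])

embedded = embed(ids)        # the embedding adds a third dimension
flat = Flatten()(embedded)   # flattening brings the data back to 2D

print(embedded.shape)  # (3, 1, 1)
print(flat.shape)      # (3, 1)
```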
  
**Put it all together**
  
Now you can wrap your embedding layer in a model. This will allow you to reuse the model for multiple inputs in the dataset. You do this by defining an input layer, then an embedding layer, then a flatten layer for the output. Finally, wrap the input tensor and flatten tensor in a model. This model can be treated exactly the same as a layer, and re-used inside of another model.
  
```python
from tensorflow.keras.layers import Input, Embedding, Flatten
from tensorflow.keras.models import Model

input_tensor = Input(shape=(1, ))

N_TEAMS = 10887
embed_layer = Embedding(
    input_dim=N_TEAMS,
    input_length=1,
    output_dim=1,
    name='Team-Strength-Lookup'
)

embed_tensor = embed_layer(input_tensor)

flatten_tensor = Flatten()(embed_tensor)

model = Model(input_tensor, flatten_tensor)
```
  

### Define team lookup
  
Shared layers allow a model to use the same weight matrix for multiple steps. In this exercise, you will build a "team strength" layer that represents each team by a single number. You will use this number for both teams in the model. The model will learn a number for each team that works well both when the team is `team_1` and when the team is `team_2` in the input data.
  
The `games_season` DataFrame is available in your workspace.
  
1. Count the number of unique teams.
2. Create an embedding layer that maps each team ID to a single number representing that team's strength.
3. The output shape should be 1 dimension (as we want to represent the teams by a single number).
4. The input length should be 1 dimension (as each team is represented by exactly one id).

In [2]:
games_season = pd.read_csv('../_datasets/games_season.csv')
print(games_season.shape)
games_season.head()

(312178, 8)


Unnamed: 0,season,team_1,team_2,home,score_diff,score_1,score_2,won
0,1985,3745,6664,0,17,81,64,1
1,1985,126,7493,1,7,77,70,1
2,1985,288,3593,1,7,63,56,1
3,1985,1846,9881,1,16,70,54,1
4,1985,2675,10298,1,12,86,74,1


In [3]:
games_season[['team_1', 'team_2']].nunique()

team_1    10888
team_2    10888
dtype: int64

In [4]:
np.unique(games_season['team_1']).shape

(10888,)

In [5]:
from keras.layers import Embedding

# Count the unique number of teams
N_TEAMS = np.unique(games_season['team_1']).shape[0]

# Create an embedding layer
team_lookup = Embedding(
    input_dim=N_TEAMS,
    output_dim=1,
    input_length=1,
    name='Team-Strength'
)


The embedding layer is a lot like a dictionary, but your model learns the values for each key.
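
The dictionary analogy can be sketched in plain NumPy: an embedding is a table with one row per ID, and a lookup is just row indexing. The table below is filled with random numbers as a stand-in for learned weights, and the team IDs are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for learned weights: one row (a single strength value) per team ID
n_teams = 5
strength_table = rng.normal(size=(n_teams, 1))

# Looking up IDs is row indexing; the same ID always returns the same row
team_ids = np.array([3, 0, 3])
looked_up = strength_table[team_ids]

print(looked_up.shape)  # (3, 1)
```

In the Keras layer, the values in this table are trainable weights that are updated during fitting instead of being fixed at random values.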

### Define team model
  
The team strength lookup has three components: an input, an embedding layer, and a flatten layer that creates the output.
  
If you wrap these three layers in a model with an input and output, you can re-use that stack of three layers at multiple places.
  
Note again that the weights will be shared everywhere we use this model. Only the embedding layer holds weights; the input and flatten layers have no parameters.
  
1. Create a 1D input layer for the team ID (which will be an integer). Be sure to set the correct input shape!
2. Pass this input to the team strength lookup layer you created previously.
3. Flatten the output of the team strength lookup.
4. Create a model that uses the 1D input as input and flattened team strength as output.

In [6]:
from keras.layers import Input, Flatten
from keras.models import Model

# Create an input layer for the team ID
teamid_in = Input(shape=(1, ))

# Lookup the input in the team strength embedding layer
strength_lookup = team_lookup(teamid_in)

# Flatten the output
strength_lookup_flat = Flatten()(strength_lookup)

# Combine the operations into a single, re-usable model
team_strength_model = Model(teamid_in, strength_lookup_flat, name='Team-Strength-Model')
team_strength_model.summary()

Model: "Team-Strength-Model"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 input_1 (InputLayer)        [(None, 1)]               0         
                                                                 
 Team-Strength (Embedding)   (None, 1, 1)              10888     
                                                                 
 flatten (Flatten)           (None, 1)                 0         
                                                                 
Total params: 10888 (42.53 KB)
Trainable params: 10888 (42.53 KB)
Non-trainable params: 0 (0.00 Byte)
_________________________________________________________________


The model will be reusable, so you can use it in two places in your final model.

## Shared layers
  
In this chapter, you will create a model with two inputs: one for each team in the basketball dataset. However, you want these two teams to each use the same embedding layer you defined in the previous lesson. Accomplishing this requires a shared layer.
  
- Requires the functional API  
- Very flexible  
  
Shared layers are an advanced deep learning concept, and are only possible with the Keras functional API. They allow you to define an operation and then apply the exact same operation (with the exact same weights) on different inputs. In this model, we will share team rating for both inputs. The learned rating will be the same, whether it applies to team 1 or team 2.
  
<center><img src='../_images/shared-layers-nn-functional.png' alt='img' width='750'></center>
  
To create a shared layer, you must first create two (or more) inputs, each of which will be passed to the shared layer. In this case, you will use two inputs.
  
Once you have two inputs, the magic of the Keras functional API becomes apparent. Recall from chapter 1 that calling `Dense()` returns a layer object that behaves like a function: it takes a tensor as input and produces a tensor as output. You can use that same layer object to create a shared layer! Doing so is as simple as calling it twice, with a different input tensor each time.
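
As a minimal sketch of that idea, the code below creates one `Dense` layer and calls it on two separate inputs. The variable names are illustrative, not from the course workspace.

```python
from tensorflow.keras.layers import Input, Dense
from tensorflow.keras.models import Model

# One layer object, created once
shared_dense = Dense(1)

# Two separate inputs
in_a = Input(shape=(1,))
in_b = Input(shape=(1,))

# Calling the same layer twice reuses the same weights for both inputs
out_a = shared_dense(in_a)
out_b = shared_dense(in_b)

model = Model([in_a, in_b], [out_a, out_b])

# Only one kernel weight and one bias exist, even though the layer is used twice
print(model.count_params())  # 2
```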
  
**Sharing multiple layers as a model**
  
Recall the category embedding model we made in the previous lesson. This model first embeds an input and then flattens it. You can also share models, not just layers. This is really cool and is part of what makes the functional API so useful. You can define modular components of models and then reuse them. We define an embedding layer and wrap it in a model. We then define 2 input tensors, and pass each one to the same model, producing 2 output tensors. This will use the same model, with the same layers and the same weights, for mapping each input to its corresponding output.
  
In other words, you can take an arbitrary sequence of keras layers, and wrap them up in a model. Once you have a model, you can re-use that model to share that sequence of steps for different input layers. Now you will create a shared layer using the team strength embedding model you made in the previous lesson.

### Defining two inputs
  
In this exercise, you will define two input layers for the two teams in your model. This allows you to specify later in the model how the data from each team will be used differently.
  
1. Create an input layer to use for team 1. Recall that our input dimension is 1.
2. Name the input "Team-1-In" so you can later distinguish it from team 2.
3. Create an input layer to use for team 2, named "Team-2-In".

In [7]:
# Load the input layer from keras.layers
from keras.layers import Input

# Input layer for team 1
team_in_1 = Input(shape=(1, ), name='Team-1-In')

# Separate input layer for team 2
team_in_2 = Input(shape=(1, ), name='Team-2-In')


These two inputs will be used later for the shared layer.

### Lookup both inputs in the same model
  
Now that you have a team strength model and an input layer for each team, you can look up the team inputs in the shared team strength model. The two inputs will share the same weights.
  
In this dataset, you have 10,888 unique teams. You want to learn a strength rating for each team, such that if any pair of teams plays each other, you can predict the score, even if those two teams have never played before. Furthermore, you want the strength rating to be the same, regardless of whether the team is the home team or the away team.
  
To achieve this, you use a shared layer, defined by the re-usable model (`team_strength_model`) you built in the "Define team model" exercise and the two input layers (`team_in_1` and `team_in_2`) from the previous exercise, all of which are available in your workspace.
  
1. Look up the first team ID in the team strength model.
2. Look up the second team ID in the team strength model.

In [8]:
# Lookup team 1 in the team strength model
team_1_strength = team_strength_model(team_in_1)

# Lookup team 2 in the team strength model
team_2_strength = team_strength_model(team_in_2)

Now your model knows how strong each team is.

## Merge layers
  
Now that you've got multiple inputs and a shared layer, you need to combine your inputs into a single layer that you can use to predict a single output. This requires a Merge layer. Merge layers allow you to define advanced, non-sequential network topologies. This can give you a lot of flexibility to creatively design networks to solve specific problems.
  
There are many kinds of merge layers available in Keras. `Add`, `Subtract`, and `Multiply` layers do simple arithmetic operations *by element* on their inputs, and require them to be the same shape. For example, if we wanted to multiply our team strength ratings together, we could use a `Multiply` layer. `Concatenate` layers simply join their inputs side by side, similar to `numpy.hstack()`. Unlike the other merge layers, the `Concatenate` layer can operate on inputs with different numbers of columns.
  
**Merge layers**
  
- Add
- Subtract
- Multiply
- Concatenate
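
To see what each merge layer does, the sketch below calls them directly on small constant tensors. The values are made up for illustration, and the layers are used outside of a model only to inspect their results.

```python
import tensorflow as tf
from tensorflow.keras.layers import Add, Subtract, Multiply, Concatenate

# Two inputs with the same shape (2 rows, 1 column)
a = tf.constant([[1.0], [2.0]])
b = tf.constant([[10.0], [20.0]])

# An input with a different number of columns (2 rows, 2 columns)
wide = tf.constant([[5.0, 6.0], [7.0, 8.0]])

added = Add()([a, b]).numpy()            # [[11.], [22.]]
subtracted = Subtract()([a, b]).numpy()  # [[-9.], [-18.]]
multiplied = Multiply()([a, b]).numpy()  # [[10.], [40.]]

# Concatenate joins columns, so the column counts may differ
joined = Concatenate()([a, wide]).numpy()
print(joined.shape)  # (2, 3)
```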
  
Let's build a simple Keras model that takes in two numbers and adds them together. You accomplish this by defining two input layers and using the Add layer to add them together.
  
```python
from tensorflow.keras.layers import Input, Add

in_tensor_1 = Input(shape=(1, ))
in_tensor_2 = Input(shape=(1, ))

out_tensor = Add()([in_tensor_1, in_tensor_2])
```
  
If you'd like to add together many inputs, you can pass a list with more than two elements to an Add layer. Note that all of the inputs are required to have the same shape, so they can be combined element-wise. The Subtract and Multiply layers work the same way.
  
**Create the model**
  
Now you can wrap the output from your `Add` layer inside a `Model`, which will then allow you to fit it to data. Note that the model takes in a list of inputs because it has more than one input.
  
```python
from tensorflow.keras.models import Model

model = Model([in_tensor_1, in_tensor_2], out_tensor)
```
  
**Compile the model**
  
As with other Keras models, you need to compile the model before fitting it. Pass `'adam'` as the `optimizer=` argument and `'mean_absolute_error'` as the `loss=` argument. Now you can practice using merge layers.
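
A self-contained sketch of the compile step for the two-input adding model might look like the following. It rebuilds the inputs and the `Add` layer so that it runs on its own.

```python
from tensorflow.keras.layers import Input, Add
from tensorflow.keras.models import Model

# Two inputs joined by an Add layer
in_tensor_1 = Input(shape=(1,))
in_tensor_2 = Input(shape=(1,))
out_tensor = Add()([in_tensor_1, in_tensor_2])

# The model takes a list of inputs because it has more than one
model = Model([in_tensor_1, in_tensor_2], out_tensor)

# Configure the optimizer and loss before fitting
model.compile(optimizer='adam', loss='mean_absolute_error')
```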

### Output layer using shared layer
  
Now that you've looked up how "strong" each team is, subtract the team strengths to determine which team is expected to win the game.
  
This is a bit like the seeds that the tournament committee uses, which are also a measure of team strength. But rather than using seed differences to predict score differences, you'll use the difference of your own team strength model to predict score differences.
  
The subtract layer combines the outputs of the two team strength lookups by subtracting the second from the first.
  
1. Import the `Subtract` layer from `keras.layers`.
2. Combine the two team strength lookups you did earlier.

In [9]:
from keras.layers import Subtract

# Create a subtract layer using the inputs from the previous exercise
score_diff = Subtract()([team_1_strength, team_2_strength])

This setup subtracts the team strength ratings to determine a winner.

### Model using two inputs and one output
  
Now that you have your two inputs (`team_in_1` and `team_in_2`) and output (`score_diff`), you can wrap them up in a model so you can use it later for fitting to data and evaluating on new data.
  
1. Define a model with the two teams as inputs and use the score difference as the output.
2. Compile the model with the `'adam'` `optimizer=` and `'mean_absolute_error'` `loss=`.

In [10]:
from keras.models import Model

# Create the model
model = Model([team_in_1, team_in_2], score_diff)

# Compile the model
model.compile(optimizer='adam', loss='mean_absolute_error')

Now your model is finalized and ready to fit to data.

## Fitting and Predicting with multiple inputs
  
Keras models with multiple inputs work just like Keras models with a single input. They use the same `.fit()`, `.evaluate()`, and `.predict()` methods. The only difference is that all of these methods take lists of inputs, rather than a single input.
  
**Fit with multiple inputs**
  
To fit a model with multiple inputs, provide the model a list of inputs. 
  
```python
model.fit([in_one, in_two], target)
```
  
In this case, since you have two inputs, the model needs to have an input list of length 2. You want to use this model to predict a single target, so the target for training is still a single object. While this network is very simple, the concept it illustrates is quite advanced. Later in the course, you will process different inputs to the network in different ways. In other words, multiple inputs let you do data pre-processing as part of the model you learn!
  
**Predict with multiple inputs**
  
To make predictions from a model with two inputs, you also need to provide two inputs to the model's `.predict()` method, again as a list. In this example, the model simply adds its two inputs. To add 1 and 2, first convert each number into a 2D NumPy array, then pass 1 as the first input and 2 as the second input. The model outputs 3. Note that the data type of the output is float32. You can also add other numbers with this simple model, e.g. 42 and 119, which add up to 161.
  
<center><img src='../_images/shared-layers-nn-functional1.png' alt='img' width='500'></center>
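
The prediction step can be sketched as follows, assuming the simple two-input model built from an `Add` layer. The model is rebuilt here so that the example runs on its own; the inputs are written as float arrays, which is a choice made for this sketch.

```python
import numpy as np
from tensorflow.keras.layers import Input, Add
from tensorflow.keras.models import Model

# A model that adds its two inputs; it has no weights to learn
in_tensor_1 = Input(shape=(1,))
in_tensor_2 = Input(shape=(1,))
model = Model([in_tensor_1, in_tensor_2], Add()([in_tensor_1, in_tensor_2]))

# Each input is a 2D array: one row per example, one column per feature
first = np.array([[1.0], [42.0]], dtype='float32')
second = np.array([[2.0], [119.0]], dtype='float32')

# Pass the inputs as a list, in the same order as the model's inputs
predictions = model.predict([first, second], verbose=0)
print(predictions)        # [[  3.] [161.]]
print(predictions.dtype)  # float32
```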
  
**Evaluate with multiple inputs**
  
To evaluate a model with multiple inputs, simply give it a list of inputs, along with a single output, and the model will return its loss on the new data. In this case, since I've hard-coded the model to add the 2 inputs, the evaluation error on the test data is zero. Now that you know how to pass lists to models with multiple inputs, fit your multiple-input basketball model to some data.
  
<center><img src='../_images/shared-layers-nn-functional2.png' alt='img' width='500'></center>
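
A minimal sketch of evaluation with two inputs, again assuming the simple adding model, is shown below. The test values are hypothetical; because the targets are exactly the sums of the inputs, the mean absolute error comes out as zero.

```python
import numpy as np
from tensorflow.keras.layers import Input, Add
from tensorflow.keras.models import Model

# A model that adds its two inputs
in_tensor_1 = Input(shape=(1,))
in_tensor_2 = Input(shape=(1,))
model = Model([in_tensor_1, in_tensor_2], Add()([in_tensor_1, in_tensor_2]))
model.compile(optimizer='adam', loss='mean_absolute_error')

# Hypothetical test data: the target is the exact sum of the two inputs
x1 = np.array([[1.0], [42.0], [3.0]], dtype='float32')
x2 = np.array([[2.0], [119.0], [4.0]], dtype='float32')
y = x1 + x2

# A list of inputs plus a single target; returns the loss on this data
loss = model.evaluate([x1, x2], y, verbose=0)
print(loss)  # 0.0
```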
  

### Fit the model to the regular season training data
  
Now that you've defined a complete team strength model, you can fit it to the basketball data! Since your model has two inputs now, you need to pass the input data as a list.
  
1. Assign the `'team_1'` and `'team_2'` columns from `games_season` to `input_1` and `input_2`, respectively.
2. Use `'score_diff'` column from `games_season` as the target.
3. Fit the model using 1 `epochs=`, a `batch_size=` of 2048, and a 10% `validation_split=`.

In [11]:
# Get the team_1 column from the regular season data
input_1 = games_season['team_1']

# Get the team_2 column from the regular season data
input_2 = games_season['team_2']

# Fit the model to input 1 and 2, using score diff as a target
model.fit(
    x=[input_1, input_2],
    y=games_season['score_diff'], 
    epochs=1, 
    batch_size=2048, 
    validation_split=0.1, 
    verbose=1
)




<keras.src.callbacks.History at 0x131ca6790>

Now our model has learned a strength rating for every team.
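
If you want to look at the learned ratings, one option is to read the weights of the embedding layer with `get_weights()`. The sketch below is a scaled-down stand-in for the basketball model: it uses 4 hypothetical teams and synthetic score differences rather than the course data, and omits `input_length` for brevity.

```python
import numpy as np
from tensorflow.keras.layers import Input, Embedding, Flatten, Subtract
from tensorflow.keras.models import Model

# Synthetic data: 4 hypothetical teams with made-up "true" strengths
n_teams = 4
rng = np.random.default_rng(0)
true_strength = np.array([-3.0, 0.0, 1.0, 4.0])
team_1 = rng.integers(0, n_teams, size=(256, 1))
team_2 = rng.integers(0, n_teams, size=(256, 1))
score_diff = true_strength[team_1] - true_strength[team_2]

# Re-usable team strength model: input -> embedding -> flatten
team_in = Input(shape=(1,))
lookup = Embedding(input_dim=n_teams, output_dim=1, name='Team-Strength')
strength_model = Model(team_in, Flatten()(lookup(team_in)))

# Two inputs share the same strength model; the output is their difference
in_1 = Input(shape=(1,), name='Team-1-In')
in_2 = Input(shape=(1,), name='Team-2-In')
diff = Subtract()([strength_model(in_1), strength_model(in_2)])
model = Model([in_1, in_2], diff)
model.compile(optimizer='adam', loss='mean_absolute_error')
model.fit([team_1, team_2], score_diff, epochs=1, batch_size=32, verbose=0)

# The embedding weights hold one learned rating per team ID
learned = lookup.get_weights()[0]
print(learned.shape)  # (4, 1)
```

After a single epoch the ratings are unlikely to be close to the made-up strengths; the point of the sketch is only where the learned values live.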

### Evaluate the model on the tournament test data
  
The model you fit to the regular season data (`model`) in the previous exercise and the tournament dataset (`games_tourney`) are available in your workspace.
  
In this exercise, you will evaluate the model on this new dataset. This evaluation will tell you how well you can predict the tournament games, based on a model trained with the regular season data. This is interesting because many teams play each other in the tournament that did not play in the regular season, so this is a very good check that your model is not overfitting.
  
1. Assign the `'team_1'` and `'team_2'` columns from `games_tourney` to `input_1` and `input_2`, respectively.
2. Evaluate the model.

In [12]:
games_tourney = pd.read_csv('../_datasets/games_tourney.csv')
print(games_tourney.shape)
games_tourney.head()

(4234, 9)


Unnamed: 0,season,team_1,team_2,home,seed_diff,score_diff,score_1,score_2,won
0,1985,288,73,0,-3,-9,41,50,0
1,1985,5929,73,0,4,6,61,55,1
2,1985,9884,73,0,5,-4,59,63,0
3,1985,73,288,0,3,9,50,41,1
4,1985,3920,410,0,1,-9,54,63,0


In [13]:
# Get team_1 from the tournament data
input_1 = games_tourney['team_1']

# Get team_2 from the tournament data
input_2 = games_tourney['team_2']

# Evaluate the model using these inputs
print(model.evaluate([input_1, input_2], games_tourney['score_diff'], verbose=0))

11.68185043334961


Great job! It's time to move on to models with more than two inputs.