# Two Input Networks Using Categorical Embeddings, Shared Layers, and Merge Layers
We're going to use a larger dataset to predict team strength. Each team has an integer ID.

In [1]:
import pandas as pd
import numpy as np
from keras.layers import Input
from keras.layers import Dense
from keras.models import Model
from keras.utils import plot_model
from keras.layers import Embedding
from keras.layers import Flatten
import matplotlib.pyplot as plt

Using TensorFlow backend.


In [10]:
df = pd.read_csv('basketball_data/games_season.csv')
df.head()

Unnamed: 0,season,team_1,team_2,home,score_diff,score_1,score_2,won
0,1985,3745,6664,0,17,81,64,1
1,1985,126,7493,1,7,77,70,1
2,1985,288,3593,1,7,63,56,1
3,1985,1846,9881,1,16,70,54,1
4,1985,2675,10298,1,12,86,74,1


## Category embeddings
Embedding layers are a more advanced type of layer. They are very useful for high-cardinality categorical data like our team IDs.

They are also useful in word-to-vector models.

We're going to use a categorical embedding layer to map each integer team ID to a floating-point rating.
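
To make the idea concrete, here is a minimal NumPy sketch of what an embedding lookup does. It is not Keras itself: the table size, the random initial values, and the names `ratings` and `team_ids` are assumptions for illustration. In a real model the table values would be learned during training.

```python
import numpy as np

rng = np.random.default_rng(0)

n_teams = 5      # hypothetical small number of teams
output_dim = 1   # one rating per team

# The embedding is a table with one row per team ID
# (random values here; a model would learn them)
ratings = rng.normal(size=(n_teams, output_dim))

# A batch of integer team IDs (illustrative values)
team_ids = np.array([3, 0, 3])

# Looking up an ID simply selects the corresponding row
looked_up = ratings[team_ids]

print(looked_up.shape)  # (3, 1)
```

The same ID always returns the same row, which is why the first and third entries of `looked_up` are identical.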

In [7]:
from keras.layers import Embedding
input_tensor = Input(shape=(1, ))
n_teams = 10887

embed_layer = Embedding(input_dim=n_teams,
                        input_length=1,  # each input is a single integer team ID
                        output_dim=1,    # we only want a single rating per team
                        name='Team-Strength-Lookup')
embed_tensor = embed_layer(input_tensor)



Embedding layers increase the dimensionality of the data: an input of shape (batch, 1) comes out with shape (batch, 1, 1). This extra dimension can be useful when dealing with images or text. We won't use it here, so we will flatten it.

The flatten layer is also the output layer for the embedding process.
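
The shape change can be sketched in plain NumPy. This only mimics the lookup and flatten steps under assumed values; the 4-row table and the names `ratings`, `ids`, `embedded`, and `flattened` are hypothetical.

```python
import numpy as np

# Hypothetical table: 4 teams, 1 rating each
ratings = np.arange(4, dtype=float).reshape(4, 1)

# A batch of 3 inputs, each of shape (1,), like Input(shape=(1,))
ids = np.array([[2], [0], [3]])

# The lookup appends the embedding dimension: (3, 1) -> (3, 1, 1)
embedded = ratings[ids]

# Flattening collapses everything after the batch axis: (3, 1, 1) -> (3, 1)
flattened = embedded.reshape(embedded.shape[0], -1)

print(embedded.shape, flattened.shape)
```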

In [8]:
from keras.layers import Flatten
flatten_tensor = Flatten()(embed_tensor)

We can wrap the whole thing in a model:

In [9]:
model = Model(input_tensor, flatten_tensor)

We can now use this model as a layer for another model. 

### Practice: Define team lookup
Shared layers allow a model to use the same weight matrix for multiple steps. In this exercise, you will build a "team strength" layer that represents each team by a single number. You will use this number for both teams in the model. The model will learn a number for each team that works well both when the team is team_1 and when the team is team_2 in the input data.

The games_season DataFrame is available in your workspace (it was loaded above as `df`).

In [12]:
# Imports
from keras.layers import Embedding
from numpy import unique

# Count the unique number of teams
n_teams = unique(df['team_1']).shape[0]

# Create an embedding layer
team_lookup = Embedding(input_dim=n_teams,
                        output_dim=1,
                        input_length=1,
                        name='Team-Strength')

# The embedding layer is a lot like a dictionary, but your model learns the values for each key.
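
One caveat worth checking, stated as a general property of embedding lookups rather than something verified against this dataset: IDs are used as row indices, so the table needs at least `max ID + 1` rows. Counting unique IDs gives that number only when the IDs run contiguously from 0. The sketch below uses made-up IDs (`toy_ids` is a hypothetical array, not the real data) to show the difference.

```python
import numpy as np

# Made-up, non-contiguous team IDs for illustration only
toy_ids = np.array([3, 7, 7, 12])

n_unique = np.unique(toy_ids).shape[0]   # number of distinct IDs
min_rows = int(toy_ids.max()) + 1        # rows needed so every ID is a valid index

print(n_unique, min_rows)
```

If the two numbers differ on the real data, using `max ID + 1` (taken over both team columns) as `input_dim` would be the safer choice.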

### Practice: Define team model
The team strength lookup has three components: an input, an embedding layer, and a flatten layer that creates the output.

If you wrap these three layers in a model with an input and output, you can re-use that stack of three layers at multiple places.

Note again that the weights will be shared everywhere we use this model. Only the embedding layer has trainable weights; the input and flatten layers have none.
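
Weight sharing can be illustrated with a small NumPy sketch. This is not the Keras model; `strength_table` and `team_strength` are hypothetical names, and the manual assignment stands in for a training update.

```python
import numpy as np

# One shared table of ratings: 4 hypothetical teams, 1 rating each
strength_table = np.zeros((4, 1))

def team_strength(team_ids):
    """Look up ratings in the shared table and flatten to (batch, 1)."""
    return strength_table[team_ids].reshape(len(team_ids), -1)

# Two inputs, as in a two-team game: team 2 appears in both roles
team_1 = np.array([[1], [2]])
team_2 = np.array([[2], [1]])

# Simulate a training update to team 2's rating
strength_table[2, 0] = 0.5

s1 = team_strength(team_1)
s2 = team_strength(team_2)

print(s1.ravel(), s2.ravel())
```

Because both calls read from the same table, the updated rating for team 2 shows up whether that team is passed in as `team_1` or as `team_2`.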

In [13]:
# Imports
from keras.layers import Input, Embedding, Flatten
from keras.models import Model

# Create an input layer for the team ID
teamid_in = Input(shape=(1,))

# Lookup the input in the team strength embedding layer
strength_lookup = team_lookup(teamid_in)

# Flatten the output
strength_lookup_flat = Flatten()(strength_lookup)

# Combine the operations into a single, re-usable model
team_strength_model = Model(teamid_in, strength_lookup_flat, name='Team-Strength-Model')

# The model will be reusable, so you can use it in two places in your final model.