# Training TFRS

 - [ ] Fix the data 
     - [x] Get a reasonable amount of data, make sure there is overlap in train/test 
     - [ ] Set up a flag so we can use all vs. subset of data depending on CPU/GPU
 - [ ] Set up eval procedure - **Clean this up a bit more**
     - [x] Metrics 
     - [ ] Coverage/Popularity
     - [x] Qualitative evaluation of predictions 
 - [x] Baselines - **Done, just need to clean**
     - [x] Most popular 
     - [x] Domain Knowledge 
     - [x] kNN
 - [ ] TFRS
     - [x] Simple model 
     - [ ] With Context Features
     - [ ] Sequential 
     - [ ] Memory Efficient
 - [ ] Serving 
     - [x] In memory 
     - [ ] TFS
 - [ ] E2E with TFX
 - [ ] Alternatives 
     - [ ] LightFM, Microsoft Recommenders, Transformer-based recommenders
 - [ ] Clean Notebook
     - [ ] References to Papers / Books
     - [ ] Evaluation notes
     - [ ] Shortcomings/Future work 
    
Once the model with context features is done, build a more advanced model on GPU, and then do the E2E pipeline with TFX.

In [1]:
from typing import Dict, Any, Text

import numpy as np 
import pandas as pd

import tensorflow as tf
import tensorflow_recommenders as tfrs
import tensorflow_data_validation as tfdv

# **Reading in the Data** 

First we will read in the training and test data. 

<div class="alert alert-block alert-info">
<b>NOTE:</b> See <code>EDA.ipynb</code> for analysis on the data and details on how the train and test sets were created. 
</div>

In the following cells we will cheat a bit and create an even smaller version of the dataset so that we can train in a reasonable amount of time on a CPU.

In [2]:
train_df = pd.read_csv('train.csv', dtype={'user_no': str, 'item_no': str})
test_df = pd.read_csv('test.csv', dtype={'user_no': str, 'item_no': str})

# For evaluation
item_info_df = pd.read_csv('item_info.csv', dtype={'item_no': str})

To create the smaller version of the dataset so that we can train quickly, we will just take the few thousand users with the most interactions. Note that this will significantly change the distribution of the features in the dataset, and that we will not be able to accurately assess how well any trained model deals with the user cold-start problem.

In [3]:
NUM_USERS = 2000

top_users = train_df['user_no'].value_counts()[:NUM_USERS].index

# Create smaller versions of the dataset
train_df_filtered = train_df.loc[train_df['user_no'].isin(top_users)]
test_df_filtered = test_df.loc[test_df['user_no'].isin(top_users)]
# Separately store the 'catalogue' of items so we can use them as our candidates
items = train_df_filtered['item_no'].unique()

print(len(train_df_filtered))
print(len(test_df_filtered))

36909
2273


In the following cell we create TensorFlow datasets out of the Pandas DataFrames.

In [4]:
train_dataset = tf.data.Dataset.from_tensor_slices(dict(train_df_filtered))
test_dataset = tf.data.Dataset.from_tensor_slices(dict(test_df_filtered))

items_dataset = tf.data.Dataset.from_tensor_slices(items)

2022-02-08 15:08:35.681987: I tensorflow/core/platform/cpu_feature_guard.cc:151] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.


In [5]:
for item in items_dataset.take(3):
    print(item)

tf.Tensor(b'-1119687312509640915', shape=(), dtype=string)
tf.Tensor(b'-3219910350938683317', shape=(), dtype=string)
tf.Tensor(b'1179978263120783371', shape=(), dtype=string)


In [6]:
for elem in train_dataset.take(3):
    print(elem)

{'user_no': <tf.Tensor: shape=(), dtype=string, numpy=b'-2683506524939646253'>, 'item_no': <tf.Tensor: shape=(), dtype=string, numpy=b'-1119687312509640915'>, 'gender_description': <tf.Tensor: shape=(), dtype=string, numpy=b'unisex'>, 'brand': <tf.Tensor: shape=(), dtype=string, numpy=b'reima'>, 'product_group': <tf.Tensor: shape=(), dtype=string, numpy=b'boots'>, 'first_interaction_month': <tf.Tensor: shape=(), dtype=int64, numpy=11>}
{'user_no': <tf.Tensor: shape=(), dtype=string, numpy=b'-8270295623916047084'>, 'item_no': <tf.Tensor: shape=(), dtype=string, numpy=b'-3219910350938683317'>, 'gender_description': <tf.Tensor: shape=(), dtype=string, numpy=b'boys'>, 'brand': <tf.Tensor: shape=(), dtype=string, numpy=b'moschino kid-teen'>, 'product_group': <tf.Tensor: shape=(), dtype=string, numpy=b'tops'>, 'first_interaction_month': <tf.Tensor: shape=(), dtype=int64, numpy=11>}
{'user_no': <tf.Tensor: shape=(), dtype=string, numpy=b'-1493854771764820101'>, 'item_no': <tf.Tensor: shape=()

In [7]:
print(f"There are {train_df_filtered['user_no'].nunique()} unique users in the training dataset")
print(f"There are {test_df_filtered['user_no'].nunique()} unique users in the test dataset")
print(f"There are {train_df_filtered['item_no'].nunique()} unique items in the training dataset")
print(f"There are {test_df_filtered['item_no'].nunique()} unique items in the test dataset")

num_new_items_in_test = len(set(test_df_filtered['item_no']) - set(train_df_filtered['item_no']))
print(f"There are {num_new_items_in_test} 'unseen' items in the test dataset")

There are 2000 unique users in the training dataset
There are 1568 unique users in the test dataset
There are 20177 unique items in the training dataset
There are 2111 unique items in the test dataset
There are 786 'unseen' items in the test dataset
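
Items that never appear in training cannot be retrieved by a model whose candidates are the training catalogue, so every test interaction with one of the 786 unseen items is a guaranteed miss. A minimal sketch of the resulting ceiling on top-k accuracy, using small made-up item lists rather than the real data:

```python
# Toy data (hypothetical): items seen in training and one item per test interaction.
train_items = {"a", "b", "c", "d"}
test_interactions = ["a", "b", "x", "c", "y", "a"]

# A test interaction can only be a hit if its item is in the candidate catalogue.
retrievable = [item in train_items for item in test_interactions]
ceiling = sum(retrievable) / len(test_interactions)

print(f"Upper bound on top-k accuracy: {ceiling:.3f}")
```

Note that the ceiling is computed per interaction, not per unique item, so popular unseen items lower it more than rare ones.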


# Creating the Model

We will start by creating a very simple model similar to the one in [the TFRS basic retrieval tutorial](https://www.tensorflow.org/recommenders/examples/basic_retrieval). Quoting from the tutorial, the model is composed of two sub-models:

> 1. A query model computing the query representation (normally a fixed-dimensionality embedding vector) using query features
> 2. A candidate model computing the candidate representation (an equally-sized vector) using the candidate features
> 
> The outputs of the two models are then multiplied together to give a query-candidate affinity score, with higher scores expressing a better match between the candidate and the query.

For our use case, we will pretend that we want to recommend items to users. As such, our **query** model will produce representations of the **users** (and potentially additional **context**, such as time, device, etc.) and our **candidate** model will produce representations of the **items**. 

For the rest of the notebook we will refer to the "query" model as the `user_model` and the "candidate" model as the `item_model`.
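
To make the "multiplied together" part concrete, here is a minimal NumPy sketch of the affinity computation. The sizes and the random embeddings are made up for illustration; in the real model both sets of vectors come from the trained towers:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: 3 users, 5 items, 4-dimensional embeddings.
user_embeddings = rng.normal(size=(3, 4))   # stand-in for the query tower output
item_embeddings = rng.normal(size=(5, 4))   # stand-in for the candidate tower output

# Affinity of every user with every item: one dot product per (user, item) pair.
scores = user_embeddings @ item_embeddings.T   # shape (3, 5)

# The best-matching item for each user is the one with the highest score.
best_item_per_user = scores.argmax(axis=1)
print(scores.shape, best_item_per_user)
```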

In [40]:
def get_vocab(df, feature, top_n=None):
    return df[feature].value_counts()[:top_n].index

def create_embedding_model(feature, num_oov_indices=1, embedding_dim=32):
    feature_vocab = get_vocab(train_df_filtered, feature)
    feature_input = tf.keras.Input(shape=(), dtype="string", name=feature)
    feature_lookup = tf.keras.layers.StringLookup(
        vocabulary=feature_vocab,
        mask_token=None,
        num_oov_indices=num_oov_indices,
        name=f"{feature}_lookup"
    )(feature_input)
    feature_embedding = tf.keras.layers.Embedding(len(feature_vocab) + num_oov_indices, 
                                                  embedding_dim)(feature_lookup)
    return tf.keras.models.Model(feature_input, feature_embedding)

class SimpleTFRSModel(tfrs.Model):

    def __init__(self, user_model, item_model, task):
        super().__init__()
        self.user_model: tf.keras.Model = user_model
        self.item_model: tf.keras.Model = item_model
        self.task: tf.keras.layers.Layer = task
            

    def compute_loss(self, features: Dict[Text, tf.Tensor], training=False) -> tf.Tensor:
        # We pick out the user features and pass them into the user model
        # and item features to pass to the item model. Use the returned embeddings 
        # to calculate the loss
        user_embeddings = self.user_model(features['user_no'])
        positive_item_embeddings = self.item_model(features['item_no'])
        # The task computes the loss and the metrics.
        return self.task(user_embeddings, positive_item_embeddings, compute_metrics=not training)
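
`create_embedding_model` chains a string lookup with an embedding table. Below is a rough NumPy sketch of what that combination does, assuming the Keras convention that with `mask_token=None` and `num_oov_indices=1` the out-of-vocabulary slot is index 0. The vocabulary and the `embed` helper are made up for illustration:

```python
import numpy as np

vocab = ["u1", "u2", "u3"]   # hypothetical vocabulary
num_oov_indices = 1
embedding_dim = 4

# Known tokens are shifted past the OOV slot: "u1" -> 1, "u2" -> 2, ...
token_to_index = {token: i + num_oov_indices for i, token in enumerate(vocab)}

# One row per vocabulary entry plus one row for the OOV slot.
rng = np.random.default_rng(0)
embedding_table = rng.normal(size=(len(vocab) + num_oov_indices, embedding_dim))

def embed(tokens):
    """Map string ids to embedding rows; unknown ids share the OOV row 0."""
    indices = [token_to_index.get(token, 0) for token in tokens]
    return embedding_table[indices]

vectors = embed(["u2", "never_seen"])
print(vectors.shape)
```

This is why the embedding layer is created with `len(feature_vocab) + num_oov_indices` rows: every index the lookup can emit needs its own row.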

In [53]:
user_model = create_embedding_model("user_no")
item_model = create_embedding_model("item_no")
metrics = tfrs.metrics.FactorizedTopK(
  candidates=items_dataset.batch(128).map(item_model)
)
task = tfrs.tasks.Retrieval(
  metrics=metrics
)

simple_tfrs_model = SimpleTFRSModel(user_model, item_model, task)

---
---

<div class="alert alert-block alert-warning">
<b>The above is just a convenience!</b> The following class is a simplified version of what
is actually going on under-the-hood:

```python 
class NonTFRSModel(tf.keras.Model):
    def __init__(self, user_model, item_model, metrics):
        """
        Note that we don't pass in the task! That's because we define 
        what the task is here.
        """
        super().__init__()
        self.user_model = user_model 
        self.item_model = item_model 
        # When we perform retrieval, the default loss is actually just good 
        # old CategoricalCrossentropy :) 
        self._loss = tf.keras.losses.CategoricalCrossentropy(
            from_logits=True, reduction=tf.keras.losses.Reduction.SUM
        )
        self._factorized_metrics = metrics

    def calc_loss(self, query_embeddings, candidate_embeddings): 
        scores = tf.linalg.matmul(
            query_embeddings, 
            candidate_embeddings, 
            transpose_b=True
        )
        num_queries, num_candidates = scores.shape
        labels = tf.eye(num_queries, num_candidates)
        loss = self._loss(y_true=labels, y_pred=scores)
        self._factorized_metrics.update_state(
            query_embeddings, 
            candidate_embeddings
        )
        return loss
    

    def train_step(self, features: Dict[Text, tf.Tensor]) -> tf.Tensor:
        with tf.GradientTape() as tape: 
            user_embeddings = self.user_model(features['user_no'])
            positive_item_embeddings = self.item_model(features['item_no'])
            loss = self.calc_loss(user_embeddings, positive_item_embeddings)

        gradients = tape.gradient(loss, self.trainable_variables)
        self.optimizer.apply_gradients(zip(gradients, self.trainable_variables))

        metrics = {metric.name: metric.result() for metric in self.metrics}
        return metrics 

    def test_step(self, features: Dict[Text, tf.Tensor]) -> tf.Tensor: 
        user_embeddings = self.user_model(features['user_no'])
        positive_item_embeddings = self.item_model(features['item_no'])

        loss = self.calc_loss(user_embeddings, positive_item_embeddings)

        metrics = {metric.name: metric.result() for metric in self.metrics}
        return metrics 
```

We can then instantiate and compile a model like so: 

```python 
simple_model = NonTFRSModel(user_model, item_model, metrics)
# Need to specify run_eagerly=True because we need the shape of the scores 
# in the calc_loss function
simple_model.compile(optimizer=tf.keras.optimizers.Adam(), run_eagerly=True)
```

After that we can just train the model the same as below :)

</div>
---
---
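
The loss in `calc_loss` treats the other items in the batch as negatives for each user. Here is a small NumPy sketch of that in-batch softmax cross-entropy, with random stand-in embeddings and made-up sizes (it mirrors the idea, not the exact TFRS code path):

```python
import numpy as np

rng = np.random.default_rng(0)
batch_size, dim = 4, 8
query_embeddings = rng.normal(size=(batch_size, dim))
candidate_embeddings = rng.normal(size=(batch_size, dim))

# scores[i, j] = affinity of query i with the candidate from row j of the batch.
scores = query_embeddings @ candidate_embeddings.T

# Row-wise log-softmax, shifted by the row max for numerical stability.
shifted = scores - scores.max(axis=1, keepdims=True)
log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))

# Identity labels: the positive for query i is candidate i, all others are negatives.
labels = np.eye(batch_size)
loss = -(labels * log_probs).sum()   # summed over the batch, like Reduction.SUM
print(loss)
```

One consequence of this setup is that the same item can appear twice in a batch and then acts as a "negative" for a user who actually interacted with it.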

In [54]:
train_dataset_interactions = train_dataset.map(lambda x: {
    'user_no': x['user_no'],
    'item_no': x['item_no']
})
test_dataset_interactions = test_dataset.map(lambda x: {
    'user_no': x['user_no'],
    'item_no': x['item_no']
})

cached_train = train_dataset_interactions.shuffle(1_000).batch(4096).cache()
cached_test = test_dataset_interactions.batch(512).cache()

In [55]:
callback_early_stopping = tf.keras.callbacks.EarlyStopping(
    monitor='val_factorized_top_k/top_100_categorical_accuracy', patience=3, restore_best_weights=True, mode="max")

simple_tfrs_model.compile(optimizer=tf.keras.optimizers.Adagrad(0.1))
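
The callback above monitors validation top-100 accuracy with `patience=3`. A simplified pure-Python sketch of that stopping rule follows. The function name and the accuracy values are made up, and Keras options such as `min_delta` are ignored:

```python
def early_stopping_epoch(values, patience=3):
    """Return (stop_epoch, best_epoch), both 0-indexed, for a metric to maximise."""
    best_value, best_epoch, wait = float("-inf"), 0, 0
    for epoch, value in enumerate(values):
        if value > best_value:
            # Improvement: remember this epoch and reset the patience counter.
            best_value, best_epoch, wait = value, epoch, 0
        else:
            wait += 1
            if wait >= patience:
                return epoch, best_epoch
    return len(values) - 1, best_epoch

# Made-up validation accuracies per epoch, not results from this notebook.
val_accuracy = [0.020, 0.035, 0.048, 0.047, 0.046, 0.045, 0.044]
stop_epoch, best_epoch = early_stopping_epoch(val_accuracy)
print(stop_epoch, best_epoch)
```

With `restore_best_weights=True`, the weights kept after stopping are the ones from `best_epoch`, not from the last epoch that ran.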

In [56]:
history = simple_tfrs_model.fit(cached_train, epochs=10, validation_data=cached_test,
                  callbacks=[callback_early_stopping])

Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10


## Evaluation

In [58]:
train_results = simple_tfrs_model.evaluate(cached_train, return_dict=True)
test_results = simple_tfrs_model.evaluate(cached_test, return_dict=True)



In [60]:
print(f"Train top-100 accuracy:  {train_results['factorized_top_k/top_100_categorical_accuracy']}")
print(f"Test top-100 accuracy:  {test_results['factorized_top_k/top_100_categorical_accuracy']}")

Train top-100 accuracy:  0.9409900307655334
Test top-100 accuracy:  0.047954246401786804


Severe overfitting! Top-100 accuracy is about 0.94 on the training set but only about 0.048 on the test set.
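
For reference, top-k categorical accuracy counts an interaction as a hit when the true item is among the k highest-scoring candidates for that user. A simplified NumPy sketch with made-up scores (it ignores ties and the way TFRS streams over candidates in batches):

```python
import numpy as np

def top_k_accuracy(scores, true_items, k):
    """Fraction of rows whose true item index is among the k highest scores."""
    # Indices of the k largest scores in each row (order within the k is irrelevant).
    top_k = np.argpartition(-scores, k - 1, axis=1)[:, :k]
    hits = (top_k == true_items[:, None]).any(axis=1)
    return hits.mean()

# Rows are interactions, columns are candidate items.
scores = np.array([
    [0.9, 0.1, 0.3, 0.2],
    [0.2, 0.8, 0.7, 0.1],
    [0.4, 0.3, 0.2, 0.1],
])
true_items = np.array([0, 2, 3])

print(top_k_accuracy(scores, true_items, k=1))
print(top_k_accuracy(scores, true_items, k=2))
```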

## Serving and Qualitative Evaluation

In [61]:
# Create an index that takes in raw query features and
# recommends items out of the entire items dataset.
index = tfrs.layers.factorized_top_k.BruteForce(simple_tfrs_model.user_model)
_ = index.index_from_dataset(
        tf.data.Dataset.zip((items_dataset.batch(100), 
                             items_dataset.batch(100).map(simple_tfrs_model.item_model))))
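
Conceptually, a brute-force index scores the query against every candidate and keeps the k best. A minimal NumPy sketch of that idea, with made-up identifiers and random embeddings (`brute_force_top_k` is a hypothetical helper, not a TFRS function):

```python
import numpy as np

rng = np.random.default_rng(0)
item_ids = np.array(["i0", "i1", "i2", "i3", "i4", "i5"])   # hypothetical identifiers
item_embeddings = rng.normal(size=(len(item_ids), 4))

def brute_force_top_k(query_embedding, k=3):
    """Score every candidate and return the k best (scores, ids), best first."""
    scores = item_embeddings @ query_embedding
    order = np.argsort(-scores)[:k]
    return scores[order], item_ids[order]

query = rng.normal(size=4)
top_scores, top_ids = brute_force_top_k(query)
print(top_ids, top_scores)
```

The cost grows linearly with the catalogue size, which is fine for the roughly 20k items here but is the reason approximate indexes exist for larger catalogues.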

In [62]:
random_user = np.random.choice(train_df_filtered['user_no'].unique())
train_df_filtered.loc[train_df_filtered['user_no'] == random_user]

Unnamed: 0,user_no,item_no,gender_description,brand,product_group,first_interaction_month
8923,-2397642298805843183,-904384509832430199,unisex,reima,boots,10
30226,-2397642298805843183,8252940203222144649,unisex,stoy,first toys and baby toys,11
95437,-2397642298805843183,-4483740504516161354,unisex,najell,carriers and slings,11
131933,-2397642298805843183,582854757623314903,girls,jacadi,all in ones,10
144638,-2397642298805843183,3470588284991950186,girls,bonpoint,eyewear,3
186084,-2397642298805843183,-7704887968268080568,unisex,kuling,fleeces and midlayers,10
192091,-2397642298805843183,6858248263439961548,unisex,bbhugme,breast feeding,2
193257,-2397642298805843183,-2927729873393290990,unisex,reima,gloves and mittens,10
204802,-2397642298805843183,8818496003429638190,unisex,stoy,first toys and baby toys,11
257828,-2397642298805843183,7345485044526784625,unisex,kuling,baselayers,11


In [63]:
%%time
# Get recommendations.
_, titles = index(tf.constant([random_user]))

CPU times: user 4.95 ms, sys: 3.33 ms, total: 8.28 ms
Wall time: 12.8 ms


In [64]:
%%time
items_to_exclude = train_df_filtered.loc[train_df_filtered['user_no'] == random_user]['item_no'].unique()
_, titles = index.query_with_exclusions(tf.constant([random_user]), 
                                       tf.constant([items_to_exclude]))

CPU times: user 557 ms, sys: 77 ms, total: 634 ms
Wall time: 817 ms
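
In this run the query with exclusions took noticeably longer than the plain query (817 ms vs. 12.8 ms wall time). One way to think about exclusions, which is not necessarily how TFRS implements them, is to mask the scores of already-seen items before taking the top k. The ids and scores below are made up:

```python
import numpy as np

item_ids = np.array(["i0", "i1", "i2", "i3", "i4"])   # hypothetical identifiers
scores = np.array([0.9, 0.8, 0.1, 0.7, 0.3])          # made-up scores for one user
already_seen = ["i0", "i3"]

# Push excluded items to the bottom of the ranking before taking the top k.
masked_scores = np.where(np.isin(item_ids, already_seen), -np.inf, scores)
top_k = item_ids[np.argsort(-masked_scores)[:2]]
print(top_k)
```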


In [65]:
recommendations = [item.numpy().decode() for item in titles[0]]
item_info_df.loc[item_info_df['item_no'].isin(recommendations)]

Unnamed: 0,item_no,colour,gender_description,brand,product_group,min_age,max_age
3824,-1690395835756691110,purple,girls,mini rodini,dresses,0.875,11.0
4725,4011112455912821856,beige,unisex,kuling,coveralls,1.0,12.0
9659,-92825680858531790,beige,unisex,kuling,fleeces and midlayers,0.125,2.0
12558,6592676930236846124,grey,unisex,buddy & hope,baby changing,,
17515,6032649609097040334,blue,unisex,stoy,vehicles,6.0,10.0
30965,-7258898587632840201,blue,boys,jacadi,coveralls,0.375,2.0
35980,6425170194131031062,black,unisex,kuling,coveralls,1.0,12.0
38850,2401791534489837026,pink,unisex,kuling,gloves and mittens,0.625,8.0
40033,1904724343631049611,grey,girls,adidas,tops,6.0,14.0
46028,8412152782576815204,brown,unisex,kuling,boots,0.875,11.0


---
---
---

## **Baselines**

### **Top Items**

**Let's find the top 100 items in the training dataset and always predict them for every interaction in the test dataset.**

In [None]:
NUM_TOP_ITEMS = 100
top_items = train_df_filtered['item_no'].value_counts()[:NUM_TOP_ITEMS].index

In [None]:
top_items_in_test_dataset = test_df_filtered.loc[test_df_filtered['item_no'].isin(top_items)]

print(len(top_items_in_test_dataset))
print(len(test_df_filtered['item_no'].unique()))
print(len(test_df_filtered))

In [None]:
ks = (1, 5, 10, 50, 100)
metrics = [tf.keras.metrics.Mean() for k in ks]

In [None]:
true_candidates = tf.expand_dims(tf.constant(test_df_filtered['item_no'].values), 1)

In [None]:
print(true_candidates)

In [None]:
retrieved_candidates = tf.expand_dims(top_items, 1)
retrieved_candidates = tf.transpose(tf.repeat(retrieved_candidates, tf.constant(true_candidates.shape[0]), axis=1))

In [None]:
ids_match = tf.cast(tf.math.equal(true_candidates, retrieved_candidates), tf.float32)

In [None]:
ids_match

In [None]:
for k, metric in zip(ks, metrics):
    # By slicing until :k we assume scores are sorted.
    # Clip to only count multiple matches once.
    match_found = tf.clip_by_value(
        tf.reduce_sum(ids_match[:, :k], axis=1, keepdims=True),
        0.0, 1.0
    )
    metric.update_state(match_found)

In [None]:
for metric in metrics:
    print(metric.result())
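
The same hit-rate-at-k idea can be written without TensorFlow. A small pure-Python sketch on toy interactions (the item ids and the `hit_rate_at_k` helper are made up for illustration):

```python
from collections import Counter

# Toy interactions (hypothetical item ids).
train_items = ["a", "a", "a", "b", "b", "c", "d"]
test_items = ["a", "c", "e", "b"]

# Items ranked by training popularity, most popular first.
ranked = [item for item, _ in Counter(train_items).most_common()]

def hit_rate_at_k(k):
    """Share of test interactions whose item is among the k most popular items."""
    top_k = set(ranked[:k])
    return sum(item in top_k for item in test_items) / len(test_items)

for k in (1, 2, 3):
    print(k, hit_rate_at_k(k))
```

Because every user gets the same list, this baseline says nothing about personalization; it only measures how concentrated the test interactions are on popular items.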

### **Top Items Domain Knowledge**

Since the test data is from November, let's keep only the product groups that are likely to be relevant at that time of year.

In [None]:
item_info_df.loc[item_info_df['item_no'].isin(top_items)]['product_group'].unique()

In [None]:
GROUPS_TO_INCLUDE = ['jumpers and knitwear', 'coveralls', 'boots', 'coats and jackets', 'stroller accessories', 
                      'fleeces and midlayers', 'winter sets', 'gloves and mittens', 'headwear']

items_to_consider = item_info_df.loc[item_info_df['product_group'].isin(GROUPS_TO_INCLUDE)]['item_no']

In [None]:
top_items_filtered = train_df_filtered[
    train_df_filtered['item_no'].isin(items_to_consider)]['item_no'].value_counts()[:100].index

In [None]:
len(set(top_items_filtered) - set(top_items))

In [None]:
retrieved_candidates = tf.expand_dims(top_items_filtered, 1)
retrieved_candidates = tf.transpose(tf.repeat(retrieved_candidates, tf.constant(true_candidates.shape[0]), axis=1))

In [None]:
ids_match = tf.cast(tf.math.equal(true_candidates, retrieved_candidates), tf.float32)

In [None]:
metrics = [tf.keras.metrics.Mean() for k in ks]
for k, metric in zip(ks, metrics):
    # By slicing until :k we assume scores are sorted.
    # Clip to only count multiple matches once.
    match_found = tf.clip_by_value(
        tf.reduce_sum(ids_match[:, :k], axis=1, keepdims=True),
        0.0, 1.0
    )
    metric.update_state(match_found)

In [None]:
for metric in metrics:
    print(metric.result())

## Content-Based

In [None]:
top_brands = train_df_filtered['brand'].value_counts()[:100].index
top_groups = train_df_filtered['product_group'].value_counts()[:50].index
train_df_filtered.loc[:, 'brand'] = train_df_filtered['brand'].apply(lambda x: x if x in top_brands else 'niche_brand')
train_df_filtered.loc[:, 'product_group'] = train_df_filtered['product_group'].apply(lambda x: x if x in top_groups else 'niche_group')

In [None]:
train_df_filtered

In [None]:
train_df_one_hot = pd.get_dummies(train_df_filtered[['user_no', 'gender_description', 'brand', 'product_group']], 
                                  columns=['gender_description', 'brand', 'product_group'])
train_df_one_hot

In [None]:
user_embeddings = train_df_one_hot.groupby('user_no').agg('mean')

user_embeddings
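
The cells above build each user's vector as the mean of the one-hot feature vectors of the items they interacted with. A tiny NumPy sketch of that construction and of the resulting dot-product scores, with made-up features and histories:

```python
import numpy as np

# One-hot item features (hypothetical): columns = [boys, girls, brand_x, brand_y].
item_features = {
    "i1": np.array([1.0, 0.0, 1.0, 0.0]),
    "i2": np.array([1.0, 0.0, 0.0, 1.0]),
    "i3": np.array([0.0, 1.0, 0.0, 1.0]),
}
user_history = {"u1": ["i1", "i2"], "u2": ["i3"]}

# A user's profile is the mean of the feature vectors of the items in their history.
user_profiles = {
    user: np.mean([item_features[item] for item in items], axis=0)
    for user, items in user_history.items()
}

# Scoring is a dot product between the user profile and each item's one-hot vector.
scores_u1 = {item: float(user_profiles["u1"] @ vec) for item, vec in item_features.items()}
print(user_profiles["u1"], scores_u1)
```

Items that share a gender, brand or product group with a user's history score higher, which also means that items with identical features get identical scores.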

In [None]:
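# Prepend a zero row so that lookup index 0 (out-of-vocabulary users) maps to a zero vector.
user_embeddings_matrix = np.concatenate((np.zeros((1, user_embeddings.shape[1])), user_embeddings.values))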

In [None]:
user_embedding_layer = tf.keras.layers.Embedding(*user_embeddings_matrix.shape, 
                                                 embeddings_initializer=tf.keras.initializers.Constant(user_embeddings_matrix),
                                                 trainable=False)

In [None]:
user_model = tf.keras.Sequential([
  tf.keras.layers.StringLookup(
      vocabulary=user_embeddings.index, 
      num_oov_indices=1),  # index 0 is the OOV slot, matching the prepended zero row
  user_embedding_layer
])

In [None]:
item_info_df.loc[:, 'brand'] = item_info_df['brand'].apply(lambda x: x if x in top_brands else 'niche_brand')
item_info_df.loc[:, 'product_group'] = item_info_df['product_group'].apply(lambda x: x if x in top_groups else 'niche_group')
item_embeddings = pd.get_dummies(item_info_df[['gender_description', 'brand', 'product_group']], 
                                 columns=['gender_description', 'brand', 'product_group'])

In [None]:
item_embeddings

In [None]:
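# Prepend a zero row so that lookup index 0 (out-of-vocabulary items) maps to a zero vector.
item_embeddings_matrix = np.concatenate((np.zeros((1, item_embeddings.shape[1])), item_embeddings.values))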

item_embedding_layer = tf.keras.layers.Embedding(*item_embeddings_matrix.shape, 
                                                 embeddings_initializer=tf.keras.initializers.Constant(item_embeddings_matrix),
                                                 trainable=False)

In [None]:
item_model = tf.keras.Sequential([
  tf.keras.layers.StringLookup(
      vocabulary=item_info_df['item_no'], 
      num_oov_indices=1),  # index 0 is the OOV slot, matching the prepended zero row
  item_embedding_layer
])

In [None]:
item_model('206890150141030846')

In [None]:
items_dataset = tf.data.Dataset.from_tensor_slices(item_info_df['item_no'])

In [None]:
# Create an index that takes in raw query features and
# recommends items out of the entire items dataset.
index = tfrs.layers.factorized_top_k.BruteForce(user_model)
index.index_from_dataset(
  tf.data.Dataset.zip((items_dataset.batch(100), items_dataset.batch(100).map(item_model)))
)

In [None]:
random_user = np.random.choice(train_df_filtered['user_no'].unique())
train_df_filtered.loc[train_df_filtered['user_no'] == random_user]

In [None]:
%%time
items_to_exclude = train_df_filtered.loc[train_df_filtered['user_no'] == random_user]['item_no'].unique()
_, titles = index.query_with_exclusions(tf.constant([random_user]), 
                                       tf.constant([items_to_exclude]))

In [None]:
recommendations = [item.numpy().decode() for item in titles[0]]
item_info_df.loc[item_info_df['item_no'].isin(recommendations)]

**Looks like it 'memorizes' users' tastes more**

In [None]:
test_users_dataset = tf.data.Dataset.from_tensor_slices(test_df_filtered['user_no'])

In [None]:
_, retrieved_items = index(test_df_filtered['user_no'], k=100)

In [None]:
ids_match = tf.cast(tf.math.equal(true_candidates, retrieved_items), tf.float32)

In [None]:
metrics = [tf.keras.metrics.Mean() for k in ks]
for k, metric in zip(ks, metrics):
    # By slicing until :k we assume scores are sorted.
    # Clip to only count multiple matches once.
    match_found = tf.clip_by_value(
        tf.reduce_sum(ids_match[:, :k], axis=1, keepdims=True),
        0.0, 1.0
    )
    metric.update_state(match_found)

In [None]:
for metric in metrics:
    print(metric.result())

---

## Context Features

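Now let's add context features: the item tower will use `gender_description`, `brand` and `product_group` in addition to `item_no`.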

In [None]:
class UserModel(tf.keras.Model):
    def __init__(self, unique_users, num_oov_indices=1, embedding_dim=32):
        super().__init__()
        
        self.user_embedding = tf.keras.Sequential([
            tf.keras.layers.StringLookup(vocabulary=unique_users, 
                                         num_oov_indices=num_oov_indices),
            tf.keras.layers.Embedding(len(unique_users) + num_oov_indices, embedding_dim)
        ])
        
    def call(self, inputs):
        return self.user_embedding(inputs['user_no'])
    
class ItemModel(tf.keras.Model):
    def __init__(self, 
                 items, 
                 gender_description,
                 top_brands, 
                 top_groups, 
                 num_oov_indices=1, 
                 embedding_dim=16):
        super().__init__()
        
        self.item_embedding = tf.keras.Sequential([
            tf.keras.layers.StringLookup(vocabulary=items, 
                                         num_oov_indices=num_oov_indices),
            tf.keras.layers.Embedding(len(items) + num_oov_indices, embedding_dim)
        ])
        
        self.gender_description_lookup = tf.keras.layers.StringLookup(vocabulary=gender_description, 
                                                                      output_mode='one_hot',
                                                                      num_oov_indices=0)
        self.brand_embedding = tf.keras.Sequential([
            tf.keras.layers.StringLookup(vocabulary=top_brands, 
                                         num_oov_indices=num_oov_indices),
            tf.keras.layers.Embedding(len(top_brands) + num_oov_indices, 8)
        ])
        self.product_group_embedding = tf.keras.Sequential([
            tf.keras.layers.StringLookup(vocabulary=top_groups, 
                                         num_oov_indices=num_oov_indices),
            tf.keras.layers.Embedding(len(top_groups) + num_oov_indices, 5)
        ])
        
    def call(self, inputs):
        return tf.concat([
             self.item_embedding(inputs['item_no']),
             self.gender_description_lookup(inputs['gender_description']),
             self.brand_embedding(inputs['brand']),
             self.product_group_embedding(inputs['product_group'])
        ], axis=1)
    
class TFRSContextModel(tfrs.models.Model):
    def __init__(self, 
                 unique_users,
                 items, 
                 gender_description,
                 top_brands, 
                 top_groups):
        super().__init__()
        self.query_model = tf.keras.Sequential([
            UserModel(unique_users), 
            #tf.keras.layers.Dense(32)
        ])
        self.candidate_model = tf.keras.Sequential([
            ItemModel(items, gender_description, top_brands, top_groups),
            #tf.keras.layers.Dense(32)
        ])
        self.task = tfrs.tasks.Retrieval(
            metrics=tfrs.metrics.FactorizedTopK(
                candidates=items_dataset_w_context.batch(128).map(self.candidate_model)
            )
        )
    def compute_loss(self, inputs, training=False):
        query_embeddings = self.query_model({
            'user_no': inputs['user_no']
        })
        candidate_embeddings = self.candidate_model({
            'item_no': inputs['item_no'],
            'gender_description': inputs['gender_description'],
            'brand': inputs['brand'],
            'product_group': inputs['product_group']
        })
        
        return self.task(query_embeddings, candidate_embeddings)
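
Because the retrieval task takes a dot product between the two towers, the concatenated item vector must have the same width as the user embedding. With the sizes used above (16 + 8 + 5, plus one one-hot column per gender value) this only adds up to 32 if there are exactly three gender values. That count is an assumption here, based on the values visible in the sample rows (`unisex`, `boys`, `girls`). A small NumPy sketch of the bookkeeping:

```python
import numpy as np

batch_size = 2
num_gender_values = 3   # assumption: boys, girls, unisex

# Stand-ins for the per-feature outputs of the item tower.
item_id_embedding = np.zeros((batch_size, 16))
gender_one_hot = np.eye(num_gender_values)[[0, 2]]   # one-hot rows for two items
brand_embedding = np.zeros((batch_size, 8))
product_group_embedding = np.zeros((batch_size, 5))

item_vector = np.concatenate(
    [item_id_embedding, gender_one_hot, brand_embedding, product_group_embedding],
    axis=1,
)
user_vector = np.zeros((batch_size, 32))

# The dot product is only defined when both towers agree on the last dimension.
assert item_vector.shape[1] == user_vector.shape[1]
print(item_vector.shape)
```

Projecting both towers through a final dense layer of the same width, as the commented-out `Dense(32)` lines suggest, would remove this coupling between feature sizes.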

**FIX ITEMS DATASET!!!**

In [None]:
items_df = item_info_df.loc[item_info_df['item_no'].isin(items)][
    ['item_no', 'gender_description', 'brand', 'product_group']]

items_dataset_w_context = tf.data.Dataset.from_tensor_slices(dict(items_df))

In [None]:
model = TFRSContextModel(unique_users, items, gender_description, top_brands, top_groups)

In [None]:
model.compile(optimizer=tf.keras.optimizers.Adam())

In [None]:
cached_train = train_dataset.shuffle(1_000).batch(1024).cache()
cached_test = test_dataset.batch(512).cache()

In [None]:
history = model.fit(cached_train, epochs=5)

In [None]:
results = model.evaluate(cached_test, return_dict=True)

In [None]:
# Create an index that takes in raw query features and
# recommends items out of the entire items dataset.
index = tfrs.layers.factorized_top_k.BruteForce(model.query_model)
_ = index.index_from_dataset(
        tf.data.Dataset.zip((items_dataset.batch(100), 
                             items_dataset_w_context.batch(100).map(model.candidate_model))))

In [None]:
for item in train_dataset.take(3).batch(3):
    print(item)

In [None]:
unique_users

In [None]:
top_brands = train_df_filtered['brand'].value_counts()[:100].index
top_groups = train_df_filtered['product_group'].value_counts()[:50].index
gender_description = train_df_filtered['gender_description'].unique()
item_model = ItemModel(items, gender_description, top_brands, top_groups)

In [None]:
for item in train_dataset.take(3).batch(3):
    print(item)

In [None]:
item_model(item)

In [None]:
train_df['gender_description'].unique()

In [None]:
tf.keras.layers.StringLookup?

In [None]:
gender_lookup = tf.keras.layers.StringLookup(vocabulary=train_df['gender_description'].unique(), 
                                             output_mode='one_hot', 
                                             num_oov_indices=0)

In [None]:
gender_lookup(tf.constant(['boys']))
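
For comparison, here is what a one-hot lookup without an OOV slot amounts to, written in plain NumPy. The vocabulary order and the `one_hot` helper are made up, and the real layer's column order follows the vocabulary it is given:

```python
import numpy as np

vocabulary = ["unisex", "boys", "girls"]   # hypothetical order
index = {token: i for i, token in enumerate(vocabulary)}

def one_hot(tokens):
    """One row per token, with a single 1 in the column of that token."""
    encoded = np.zeros((len(tokens), len(vocabulary)), dtype=np.float32)
    for row, token in enumerate(tokens):
        encoded[row, index[token]] = 1.0   # raises KeyError for unknown tokens
    return encoded

print(one_hot(["boys"]))
```

With no OOV slot, an unseen value is an error rather than a fallback row, so the vocabulary has to cover every value that can appear at serving time.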

In [None]:
class MovieModel(tf.keras.Model):

  def __init__(self):
    super().__init__()

    max_tokens = 10_000

    self.title_embedding = tf.keras.Sequential([
      tf.keras.layers.StringLookup(
          vocabulary=unique_movie_titles, mask_token=None),
      tf.keras.layers.Embedding(len(unique_movie_titles) + 1, 32)
    ])

    self.title_vectorizer = tf.keras.layers.TextVectorization(
        max_tokens=max_tokens)

    self.title_text_embedding = tf.keras.Sequential([
      self.title_vectorizer,
      tf.keras.layers.Embedding(max_tokens, 32, mask_zero=True),
      tf.keras.layers.GlobalAveragePooling1D(),
    ])

    self.title_vectorizer.adapt(movies)

  def call(self, titles):
    return tf.concat([
        self.title_embedding(titles),
        self.title_text_embedding(titles),
    ], axis=1)

In [None]:
for item in train_dataset.take(1):
    print(item)