diff --git a/examples/keras_rs/img/two_stage_rs_with_marketing_interaction/architecture.jpg b/examples/keras_rs/img/two_stage_rs_with_marketing_interaction/architecture.jpg
new file mode 100644
index 0000000000..05e81acfa3
Binary files /dev/null and b/examples/keras_rs/img/two_stage_rs_with_marketing_interaction/architecture.jpg differ
diff --git a/examples/keras_rs/img/two_stage_rs_with_marketing_interaction/two_stage_rs_with_marketing_interaction_13_60.png b/examples/keras_rs/img/two_stage_rs_with_marketing_interaction/two_stage_rs_with_marketing_interaction_13_60.png
new file mode 100644
index 0000000000..72409279ad
Binary files /dev/null and b/examples/keras_rs/img/two_stage_rs_with_marketing_interaction/two_stage_rs_with_marketing_interaction_13_60.png differ
diff --git a/examples/keras_rs/img/two_stage_rs_with_marketing_interaction/two_stage_rs_with_marketing_interaction_9_90.png b/examples/keras_rs/img/two_stage_rs_with_marketing_interaction/two_stage_rs_with_marketing_interaction_9_90.png
new file mode 100644
index 0000000000..0fb9ff322f
Binary files /dev/null and b/examples/keras_rs/img/two_stage_rs_with_marketing_interaction/two_stage_rs_with_marketing_interaction_9_90.png differ
diff --git a/examples/keras_rs/ipynb/two_stage_rs_with_marketing_interaction.ipynb b/examples/keras_rs/ipynb/two_stage_rs_with_marketing_interaction.ipynb
new file mode 100644
index 0000000000..66c3ce85c0
--- /dev/null
+++ b/examples/keras_rs/ipynb/two_stage_rs_with_marketing_interaction.ipynb
@@ -0,0 +1,699 @@
+{
+ "cells": [
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "colab_type": "text"
+ },
+ "source": [
+ "# Two Stage Recommender System with Marketing Interaction\n",
+ "\n",
+ "**Author:** Mansi Mehta \n",
+ "**Date created:** 26/11/2025 \n",
+ "**Last modified:** 26/11/2025 \n",
+ "**Description:** Recommender system with retrieval and ranking models for marketing interactions."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "colab_type": "text"
+ },
+ "source": [
+ "# **Introduction**\n",
+ "\n",
+ "This tutorial demonstrates a critical business scenario: a user lands on a website, and a\n",
+ "marketing engine must decide which specific ad to display from an inventory of thousands.\n",
+ "The goal is to maximize the Click-Through Rate (CTR). Showing irrelevant ads wastes\n",
+ "marketing budget and annoys the user. Therefore, we need a system that predicts the\n",
+ "probability of a specific user clicking on a specific ad based on their demographics and\n",
+ "browsing habits.\n",
+ "\n",
+ "**Architecture**\n",
+ "1. **The Retrieval Stage:** Efficiently selects an initial set of roughly 10-100\n",
+ "candidates from the full ad inventory. It weeds out ads the user is unlikely to be\n",
+ "interested in.\n",
+ "    - User Tower: Embeds user features (ID, demographics, behavior) into a vector.\n",
+ "    - Item Tower: Embeds ad features (Ad ID, Topic) into a vector.\n",
+ "    - Interaction: The dot product of these two vectors represents similarity.\n",
+ "2. **The Ranking Stage:** Takes the output of the retrieval model and fine-tunes the\n",
+ "order to select the single best ad to show.\n",
+ "    - Model: A deep neural network (MLP).\n",
+ "    - Interaction: It takes the User Embedding, Ad Embedding, and their similarity score to\n",
+ "    predict a precise probability (0% to 100%) that the user will click.\n",
+ "\n",
+ ""
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "colab_type": "text"
+ },
+ "source": [
+ "# **Dataset**\n",
+ "We will use the [Ad Click\n",
+ "Prediction](https://www.kaggle.com/datasets/mafrojaakter/ad-click-data) dataset from\n",
+ "Kaggle.\n",
+ "\n",
+ "**Feature distribution of the dataset:**\n",
+ "- The User Tower describes who is looking. Its features are Gender, City, Country, Age,\n",
+ "Daily Internet Usage, Daily Time Spent on Site, and Area Income.\n",
+ "- The Item Tower describes what is being shown. Its features are Ad Topic Line and Ad ID.\n",
+ "\n",
+ "In this tutorial, we are going to build and train a Two-Tower (User Tower and Ad Tower)\n",
+ "model using the Ad Click Prediction dataset from Kaggle.\n",
+ "We're going to:\n",
+ "1. **Data Pipeline:** Get our data and preprocess it for both Retrieval (implicit\n",
+ "feedback) and Ranking (explicit labels).\n",
+ "2. **Retrieval:** Implement and train a Two-Tower model to generate candidates.\n",
+ "3. **Ranking:** Implement and train a Neural Ranking model to predict click probabilities.\n",
+ "4. **Inference:** Run an end-to-end test (Retrieval --> Ranking) to generate\n",
+ "recommendations for a specific user."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 0,
+ "metadata": {
+ "colab_type": "code"
+ },
+ "outputs": [],
+ "source": [
+ "!pip install -q keras-rs"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 0,
+ "metadata": {
+ "colab_type": "code"
+ },
+ "outputs": [],
+ "source": [
+ "import os\n",
+ "\n",
+ "os.environ[\"KERAS_BACKEND\"] = \"tensorflow\"\n",
+ "import keras\n",
+ "import matplotlib.pyplot as plt\n",
+ "import numpy as np\n",
+ "import tensorflow as tf\n",
+ "import pandas as pd\n",
+ "import keras_rs\n",
+ "from keras import layers\n",
+ "from concurrent.futures import ThreadPoolExecutor\n",
+ "from sklearn.model_selection import train_test_split\n",
+ "from sklearn.preprocessing import MinMaxScaler\n",
+ ""
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "colab_type": "text"
+ },
+ "source": [
+ "# **Preparing Dataset**"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 0,
+ "metadata": {
+ "colab_type": "code"
+ },
+ "outputs": [],
+ "source": [
+ "!pip install -q kaggle\n",
+ "# Download the dataset (requires a Kaggle API key in ~/.kaggle/kaggle.json)\n",
+ "!kaggle datasets download -d mafrojaakter/ad-click-data --unzip -p ./ad_click_dataset"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 0,
+ "metadata": {
+ "colab_type": "code"
+ },
+ "outputs": [],
+ "source": [
+ "data_path = \"./ad_click_dataset/Ad_click_data.csv\"\n",
+ "if not os.path.exists(data_path):\n",
+ "    # Fallback for filenames with spaces or different casing\n",
+ "    data_path = \"./ad_click_dataset/Ad Click Data.csv\"\n",
+ "\n",
+ "ads_df = pd.read_csv(data_path)\n",
+ "# Clean column names\n",
+ "ads_df.columns = ads_df.columns.str.strip()\n",
+ "# Rename columns to short snake_case feature names\n",
+ "ads_df = ads_df.rename(\n",
+ "    columns={\n",
+ "        \"Male\": \"gender\",\n",
+ "        \"Ad Topic Line\": \"ad_topic\",\n",
+ "        \"City\": \"city\",\n",
+ "        \"Country\": \"country\",\n",
+ "        \"Daily Time Spent on Site\": \"time_on_site\",\n",
+ "        \"Daily Internet Usage\": \"internet_usage\",\n",
+ "        \"Area Income\": \"area_income\",\n",
+ "    }\n",
+ ")\n",
+ "# Add user_id and ad_id columns\n",
+ "ads_df[\"user_id\"] = \"user_\" + ads_df.index.astype(str)\n",
+ "ads_df[\"ad_id\"] = \"ad_\" + ads_df[\"ad_topic\"].astype(\"category\").cat.codes.astype(str)\n",
+ "# Remove rows with nulls\n",
+ "ads_df = ads_df.dropna()\n",
+ "# Scale the numeric features to the [0, 1] range\n",
+ "numeric_cols = [\"time_on_site\", \"internet_usage\", \"area_income\", \"Age\"]\n",
+ "scaler = MinMaxScaler()\n",
+ "ads_df[numeric_cols] = scaler.fit_transform(ads_df[numeric_cols])\n",
+ "\n",
+ "# Split the train and test datasets\n",
+ "x_train, x_test = train_test_split(ads_df, test_size=0.2, random_state=42)\n",
+ "\n",
+ "\n",
+ "def dict_to_tensor_features(df_features, continuous_features):\n",
+ "    tensor_dict = {}\n",
+ "    for k, v in df_features.items():\n",
+ "        if k in continuous_features:\n",
+ "            tensor_dict[k] = tf.expand_dims(tf.constant(v, dtype=\"float32\"), axis=-1)\n",
+ "        else:\n",
+ "            v_str = np.array(v).astype(str).tolist()\n",
+ "            tensor_dict[k] = tf.expand_dims(tf.constant(v_str, dtype=\"string\"), axis=-1)\n",
+ "    return tensor_dict\n",
+ "\n",
+ "\n",
+ "def create_retrieval_dataset(\n",
+ "    data_df,\n",
+ "    all_ads_features,\n",
+ "    all_ad_ids,\n",
+ "    user_features_list,\n",
+ "    ad_features_list,\n",
+ "    continuous_features_list,\n",
+ "):\n",
+ "\n",
+ "    # Filter for positive interactions (clicks)\n",
+ "    positive_interactions = data_df[data_df[\"Clicked on Ad\"] == 1].copy()\n",
+ "\n",
+ "    if positive_interactions.empty:\n",
+ "        return None\n",
+ "\n",
+ "    def sample_negative(positive_ad_id):\n",
+ "        neg_ad_id = positive_ad_id\n",
+ "        while neg_ad_id == positive_ad_id:\n",
+ "            neg_ad_id = np.random.choice(all_ad_ids)\n",
+ "        return neg_ad_id\n",
+ "\n",
+ "    def create_triplets_row(pos_row):\n",
+ "        pos_ad_id = pos_row.ad_id\n",
+ "        neg_ad_id = sample_negative(pos_ad_id)\n",
+ "\n",
+ "        neg_ad_row = all_ads_features[all_ads_features[\"ad_id\"] == neg_ad_id].iloc[0]\n",
+ "        user_features_dict = {\n",
+ "            name: getattr(pos_row, name) for name in user_features_list\n",
+ "        }\n",
+ "        pos_ad_features_dict = {\n",
+ "            name: getattr(pos_row, name) for name in ad_features_list\n",
+ "        }\n",
+ "        neg_ad_features_dict = {name: neg_ad_row[name] for name in ad_features_list}\n",
+ "\n",
+ "        return {\n",
+ "            \"user\": user_features_dict,\n",
+ "            \"positive_ad\": pos_ad_features_dict,\n",
+ "            \"negative_ad\": neg_ad_features_dict,\n",
+ "        }\n",
+ "\n",
+ "    with ThreadPoolExecutor(max_workers=8) as executor:\n",
+ "        triplets = list(\n",
+ "            executor.map(\n",
+ "                create_triplets_row, positive_interactions.itertuples(index=False)\n",
+ "            )\n",
+ "        )\n",
+ "\n",
+ "    triplets_df = pd.DataFrame(triplets)\n",
+ "    user_df = triplets_df[\"user\"].apply(pd.Series)\n",
+ "    pos_ad_df = triplets_df[\"positive_ad\"].apply(pd.Series)\n",
+ "    neg_ad_df = triplets_df[\"negative_ad\"].apply(pd.Series)\n",
+ "\n",
+ "    user_features_tensor = dict_to_tensor_features(\n",
+ "        user_df.to_dict(\"list\"), continuous_features_list\n",
+ "    )\n",
+ "    pos_ad_features_tensor = dict_to_tensor_features(\n",
+ "        pos_ad_df.to_dict(\"list\"), continuous_features_list\n",
+ "    )\n",
+ "    neg_ad_features_tensor = dict_to_tensor_features(\n",
+ "        neg_ad_df.to_dict(\"list\"), continuous_features_list\n",
+ "    )\n",
+ "\n",
+ "    features = {\n",
+ "        \"user\": user_features_tensor,\n",
+ "        \"positive_ad\": pos_ad_features_tensor,\n",
+ "        \"negative_ad\": neg_ad_features_tensor,\n",
+ "    }\n",
+ "    y_true = tf.ones((triplets_df.shape[0], 1), dtype=tf.float32)\n",
+ "    dataset = tf.data.Dataset.from_tensor_slices((features, y_true))\n",
+ "    buffer_size = len(triplets_df)\n",
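+ "    # Cache before shuffling so each epoch is reshuffled instead of replaying\n",
+ "    # one fixed, cached order.\n",
+ "    dataset = (\n",
+ "        dataset.cache()\n",
+ "        .shuffle(buffer_size=buffer_size)\n",
+ "        .batch(64)\n",
+ "        .prefetch(tf.data.AUTOTUNE)\n",
+ "    )\n",
+ "    return dataset\n",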
+ "\n",
+ "\n",
+ "user_clicked_ads = (\n",
+ "    x_train[x_train[\"Clicked on Ad\"] == 1]\n",
+ "    .groupby(\"user_id\")[\"ad_id\"]\n",
+ "    .apply(set)\n",
+ "    .to_dict()\n",
+ ")\n",
+ "\n",
+ "for u in x_train[\"user_id\"].unique():\n",
+ "    if u not in user_clicked_ads:\n",
+ "        user_clicked_ads[u] = set()\n",
+ "\n",
+ "AD_FEATURES = [\"ad_id\", \"ad_topic\"]\n",
+ "USER_FEATURES = [\n",
+ "    \"user_id\",\n",
+ "    \"gender\",\n",
+ "    \"city\",\n",
+ "    \"country\",\n",
+ "    \"time_on_site\",\n",
+ "    \"internet_usage\",\n",
+ "    \"area_income\",\n",
+ "    \"Age\",\n",
+ "]\n",
+ "continuous_features = [\"time_on_site\", \"internet_usage\", \"area_income\", \"Age\"]\n",
+ "\n",
+ "all_ads_features = x_train[AD_FEATURES].drop_duplicates().reset_index(drop=True)\n",
+ "all_ad_ids = all_ads_features[\"ad_id\"].tolist()\n",
+ "\n",
+ "retrieval_train_dataset = create_retrieval_dataset(\n",
+ "    data_df=x_train,\n",
+ "    all_ads_features=all_ads_features,\n",
+ "    all_ad_ids=all_ad_ids,\n",
+ "    user_features_list=USER_FEATURES,\n",
+ "    ad_features_list=AD_FEATURES,\n",
+ "    continuous_features_list=continuous_features,\n",
+ ")\n",
+ "\n",
+ "retrieval_test_dataset = create_retrieval_dataset(\n",
+ "    data_df=x_test,\n",
+ "    all_ads_features=all_ads_features,\n",
+ "    all_ad_ids=all_ad_ids,\n",
+ "    user_features_list=USER_FEATURES,\n",
+ "    ad_features_list=AD_FEATURES,\n",
+ "    continuous_features_list=continuous_features,\n",
+ ")"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "colab_type": "text"
+ },
+ "source": [
+ "# **Implement the Retrieval Model**\n",
+ "For the Retrieval stage, we will build a Two-Tower Model.\n",
+ "\n",
+ "**The Architecture Components:**\n",
+ "\n",
+ "1. User Tower: Takes user features (User ID, demographics, and behavior metrics such as\n",
+ "time_on_site) and encodes these mixed features into a fixed-size vector called the User\n",
+ "Embedding.\n",
+ "2. Item (Ad) Tower: Takes ad features (Ad ID, Ad Topic Line) and encodes them into a\n",
+ "fixed-size vector called the Item Embedding.\n",
+ "3. Interaction (Similarity): We calculate the dot product between the User Embedding and\n",
+ "the Item Embedding."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 0,
+ "metadata": {
+ "colab_type": "code"
+ },
+ "outputs": [],
+ "source": [
+ "keras.utils.set_random_seed(42)\n",
+ "\n",
+ "vocab_map = {\n",
+ "    \"user_id\": x_train[\"user_id\"].unique(),\n",
+ "    \"gender\": x_train[\"gender\"].astype(str).unique(),\n",
+ "    \"city\": x_train[\"city\"].unique(),\n",
+ "    \"country\": x_train[\"country\"].unique(),\n",
+ "    \"ad_id\": x_train[\"ad_id\"].unique(),\n",
+ "    \"ad_topic\": x_train[\"ad_topic\"].unique(),\n",
+ "}\n",
+ "cont_feats = [\"time_on_site\", \"internet_usage\", \"area_income\", \"Age\"]\n",
+ "\n",
+ "normalizers = {}\n",
+ "for f in cont_feats:\n",
+ "    norm = layers.Normalization(axis=None)\n",
+ "    norm.adapt(x_train[f].values.astype(\"float32\"))\n",
+ "    normalizers[f] = norm\n",
+ "\n",
+ "\n",
+ "def build_tower(feature_names, continuous_names=None, embed_dim=64, name=\"tower\"):\n",
+ "    inputs, embeddings = {}, []\n",
+ "\n",
+ "    for feat in feature_names:\n",
+ "        if feat in vocab_map:\n",
+ "            inp = keras.Input(shape=(1,), dtype=tf.string, name=feat)\n",
+ "            inputs[feat] = inp\n",
+ "            vocab = list(vocab_map[feat])\n",
+ "            x = layers.StringLookup(vocabulary=vocab)(inp)\n",
+ "            x = layers.Embedding(\n",
+ "                len(vocab) + 1, embed_dim, embeddings_regularizer=\"l2\"\n",
+ "            )(x)\n",
+ "            embeddings.append(layers.Flatten()(x))\n",
+ "\n",
+ "    if continuous_names:\n",
+ "        for feat in continuous_names:\n",
+ "            inp = keras.Input(shape=(1,), dtype=tf.float32, name=feat)\n",
+ "            inputs[feat] = inp\n",
+ "            embeddings.append(normalizers[feat](inp))\n",
+ "\n",
+ "    x = layers.Concatenate()(embeddings)\n",
+ "    x = layers.Dense(128, activation=\"relu\")(x)\n",
+ "    x = layers.Dropout(0.2)(x)\n",
+ "    x = layers.Dense(64, activation=\"relu\")(x)\n",
+ "    output = layers.Dense(embed_dim)(layers.Dropout(0.2)(x))\n",
+ "\n",
+ "    return keras.Model(inputs=inputs, outputs=output, name=name)\n",
+ "\n",
+ "\n",
+ "user_tower = build_tower(\n",
+ "    [\"user_id\", \"gender\", \"city\", \"country\"], cont_feats, name=\"user_tower\"\n",
+ ")\n",
+ "ad_tower = build_tower([\"ad_id\", \"ad_topic\"], name=\"ad_tower\")\n",
+ "\n",
+ "\n",
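+ "def bpr_hinge_loss(y_true, y_pred):\n",
+ "    # Despite the name, this is the BPR (log-sigmoid) loss applied to the score\n",
+ "    # difference y_pred = pos_score - neg_score. No margin is used and y_true is ignored.\n",
+ "    return -tf.math.log(tf.nn.sigmoid(y_pred) + 1e-10)\n",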
+ "\n",
+ "\n",
+ "class RetrievalModel(keras.Model):\n",
+ "    def __init__(self, user_tower_instance, ad_tower_instance, **kwargs):\n",
+ "        super().__init__(**kwargs)\n",
+ "        # Use the towers passed to the constructor, not the module-level globals.\n",
+ "        self.user_tower = user_tower_instance\n",
+ "        self.ad_tower = ad_tower_instance\n",
+ "        self.ln_user = layers.LayerNormalization()\n",
+ "        self.ln_ad = layers.LayerNormalization()\n",
+ "\n",
+ "    def call(self, inputs):\n",
+ "        u_emb = self.ln_user(self.user_tower(inputs[\"user\"]))\n",
+ "        pos_emb = self.ln_ad(self.ad_tower(inputs[\"positive_ad\"]))\n",
+ "        neg_emb = self.ln_ad(self.ad_tower(inputs[\"negative_ad\"]))\n",
+ "        pos_score = keras.ops.sum(u_emb * pos_emb, axis=1, keepdims=True)\n",
+ "        neg_score = keras.ops.sum(u_emb * neg_emb, axis=1, keepdims=True)\n",
+ "        return pos_score - neg_score\n",
+ "\n",
+ "    def get_embeddings(self, inputs):\n",
+ "        u_emb = self.ln_user(self.user_tower(inputs[\"user\"]))\n",
+ "        ad_emb = self.ln_ad(self.ad_tower(inputs[\"positive_ad\"]))\n",
+ "        dot_interaction = keras.ops.sum(u_emb * ad_emb, axis=1, keepdims=True)\n",
+ "        return u_emb, ad_emb, dot_interaction\n",
+ "\n",
+ "\n",
+ "retrieval_model = RetrievalModel(user_tower, ad_tower)\n",
+ "retrieval_model.compile(\n",
+ "    optimizer=keras.optimizers.Adam(learning_rate=1e-3), loss=bpr_hinge_loss\n",
+ ")\n",
+ "history = retrieval_model.fit(retrieval_train_dataset, epochs=30)\n",
+ "\n",
+ "pd.DataFrame(history.history).plot(\n",
+ "    subplots=True, layout=(1, 3), figsize=(12, 4), title=\"Retrieval Model Metrics\"\n",
+ ")\n",
+ "plt.show()"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "colab_type": "text"
+ },
+ "source": [
+ "# **Predictions of Retrieval Model**\n",
+ "Now that the Two-Tower model is trained, we can use it to generate candidates.\n",
+ "\n",
+ "We implement the inference pipeline in three steps:\n",
+ "1. Indexing: Run the Item Tower once for all available ads to generate their\n",
+ "embeddings.\n",
+ "2. Query Encoding: When a user arrives, pass their features through the User Tower to\n",
+ "generate a User Embedding.\n",
+ "3. Nearest Neighbor Search: Search the index to find the Ad Embeddings closest to the\n",
+ "User Embedding (highest dot product).\n",
+ "\n",
+ "The Keras-RS [BruteForceRetrieval\n",
+ "layer](https://keras.io/keras_rs/api/retrieval_layers/brute_force_retrieval/) calculates\n",
+ "the dot product between the user and every single item in the index to find the exact\n",
+ "top-K matches. The wrapper class below performs the same exhaustive search directly with\n",
+ "TensorFlow ops."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 0,
+ "metadata": {
+ "colab_type": "code"
+ },
+ "outputs": [],
+ "source": [
+ "USER_CATEGORICAL = [\"user_id\", \"gender\", \"city\", \"country\"]\n",
+ "CONTINUOUS_FEATURES = [\"time_on_site\", \"internet_usage\", \"area_income\", \"Age\"]\n",
+ "USER_FEATURES = USER_CATEGORICAL + CONTINUOUS_FEATURES\n",
+ "\n",
+ "\n",
+ "class BruteForceRetrievalWrapper:\n",
+ "    def __init__(self, model, ads_df, ad_features, user_features, k=10):\n",
+ "        self.model, self.k = model, k\n",
+ "        self.user_features = user_features\n",
+ "        unique_ads = ads_df[ad_features].drop_duplicates(\"ad_id\").reset_index(drop=True)\n",
+ "        self.ids = unique_ads[\"ad_id\"].values\n",
+ "        self.topic_map = dict(zip(unique_ads[\"ad_id\"], unique_ads[\"ad_topic\"]))\n",
+ "        ad_inputs = {\n",
+ "            \"ad_id\": tf.constant(self.ids.astype(str)),\n",
+ "            \"ad_topic\": tf.constant(unique_ads[\"ad_topic\"].astype(str).values),\n",
+ "        }\n",
+ "        self.candidate_embs = model.ln_ad(model.ad_tower(ad_inputs))\n",
+ "\n",
+ "    def query_batch(self, user_df):\n",
+ "        inputs = {\n",
+ "            k: tf.constant(\n",
+ "                user_df[k].values.astype(float if k in CONTINUOUS_FEATURES else str)\n",
+ "            )\n",
+ "            for k in self.user_features\n",
+ "            if k in user_df.columns\n",
+ "        }\n",
+ "        u_emb = self.model.ln_user(self.model.user_tower(inputs))\n",
+ "        scores = tf.linalg.matmul(u_emb, self.candidate_embs, transpose_b=True)\n",
+ "        top_scores, top_indices = tf.math.top_k(scores, k=self.k)\n",
+ "        return top_scores.numpy(), top_indices.numpy()\n",
+ "\n",
+ "    def decode_results(self, scores, indices):\n",
+ "        results = []\n",
+ "        for row_scores, row_indices in zip(scores, indices):\n",
+ "            retrieved_ids = self.ids[row_indices]\n",
+ "            results.append(\n",
+ "                [\n",
+ "                    {\"ad_id\": aid, \"ad_topic\": self.topic_map[aid], \"score\": float(s)}\n",
+ "                    for aid, s in zip(retrieved_ids, row_scores)\n",
+ "                ]\n",
+ "            )\n",
+ "        return results\n",
+ "\n",
+ "\n",
+ "retrieval_engine = BruteForceRetrievalWrapper(\n",
+ "    model=retrieval_model,\n",
+ "    ads_df=ads_df,\n",
+ "    ad_features=[\"ad_id\", \"ad_topic\"],\n",
+ "    user_features=USER_FEATURES,\n",
+ "    k=10,\n",
+ ")\n",
+ "sample_user = pd.DataFrame([x_test.iloc[0]])\n",
+ "scores, indices = retrieval_engine.query_batch(sample_user)\n",
+ "top_ads = retrieval_engine.decode_results(scores, indices)[0]"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "colab_type": "text"
+ },
+ "source": [
+ "# **Implementation of Ranking Model**\n",
+ "The retrieval model only calculates a simple similarity score (dot product). It doesn't\n",
+ "account for complex feature interactions.\n",
+ "So we build a ranking model on top of the retrieval model.\n",
+ "\n",
+ "**Architecture**\n",
+ "1. **Feature Extraction:** We reuse the trained User Tower and Ad Tower from the\n",
+ "Retrieval stage. We freeze these towers (trainable = False) so their weights don't\n",
+ "change.\n",
+ "2. **Interaction:** Instead of just a dot product, we concatenate three inputs: the User\n",
+ "Embedding, the Ad Embedding, and the Dot Product (Similarity).\n",
+ "3. **Scorer (MLP):** These concatenated inputs are fed into a Multi-Layer Perceptron, a\n",
+ "stack of Dense layers. This network learns the non-linear relationships between the user\n",
+ "and the ad.\n",
+ "4. **Output:** The final layer uses a Sigmoid activation to output a single probability\n",
+ "between 0.0 and 1.0 (Likelihood of a Click)."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 0,
+ "metadata": {
+ "colab_type": "code"
+ },
+ "outputs": [],
+ "source": [
+ "retrieval_model.trainable = False\n",
+ "\n",
+ "\n",
+ "def create_ranking_ds(df):\n",
+ "    inputs = {\n",
+ "        \"user\": dict_to_tensor_features(df[USER_FEATURES], continuous_features),\n",
+ "        \"positive_ad\": dict_to_tensor_features(df[AD_FEATURES], continuous_features),\n",
+ "    }\n",
+ "    return (\n",
+ "        tf.data.Dataset.from_tensor_slices(\n",
+ "            (inputs, df[\"Clicked on Ad\"].values.astype(\"float32\"))\n",
+ "        )\n",
+ "        .shuffle(10000)\n",
+ "        .batch(256)\n",
+ "        .prefetch(tf.data.AUTOTUNE)\n",
+ "    )\n",
+ "\n",
+ "\n",
+ "ranking_train_dataset = create_ranking_ds(x_train)\n",
+ "ranking_test_dataset = create_ranking_ds(x_test)\n",
+ "\n",
+ "\n",
+ "class RankingModel(keras.Model):\n",
+ "    def __init__(self, retrieval_model, **kwargs):\n",
+ "        super().__init__(**kwargs)\n",
+ "        self.retrieval = retrieval_model\n",
+ "        self.mlp = keras.Sequential(\n",
+ "            [\n",
+ "                layers.Dense(256, activation=\"relu\"),\n",
+ "                layers.Dropout(0.2),\n",
+ "                layers.Dense(128, activation=\"relu\"),\n",
+ "                layers.Dropout(0.2),\n",
+ "                layers.Dense(64, activation=\"relu\"),\n",
+ "                layers.Dense(1, activation=\"sigmoid\"),\n",
+ "            ]\n",
+ "        )\n",
+ "\n",
+ "    def call(self, inputs):\n",
+ "        u_emb, ad_emb, dot = self.retrieval.get_embeddings(inputs)\n",
+ "        return self.mlp(keras.ops.concatenate([u_emb, ad_emb, dot], axis=-1))\n",
+ "\n",
+ "\n",
+ "ranking_model = RankingModel(retrieval_model)\n",
+ "ranking_model.compile(\n",
+ "    optimizer=keras.optimizers.Adam(1e-4),\n",
+ "    loss=\"binary_crossentropy\",\n",
+ "    metrics=[\"AUC\", \"accuracy\"],\n",
+ ")\n",
+ "history1 = ranking_model.fit(ranking_train_dataset, epochs=20)\n",
+ "\n",
+ "pd.DataFrame(history1.history).plot(\n",
+ "    subplots=True, layout=(1, 3), figsize=(12, 4), title=\"Ranking Model Metrics\"\n",
+ ")\n",
+ "plt.show()\n",
+ "\n",
+ "ranking_model.evaluate(ranking_test_dataset)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "colab_type": "text"
+ },
+ "source": [
+ "# **Predictions of Ranking Model**\n",
+ "The retrieval model gave us a list of ads that are generally relevant (high dot product\n",
+ "similarity). The ranking model will now calculate the specific probability (0% to 100%)\n",
+ "that the user will click each of those ads.\n",
+ "\n",
+ "The Ranking model expects pairs of (User, Ad). Since we are scoring 10 ads for 1 user, we\n",
+ "cannot just pass the user features once. Instead, we repeat the user's features 10 times\n",
+ "to create a batch."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 0,
+ "metadata": {
+ "colab_type": "code"
+ },
+ "outputs": [],
+ "source": [
+ "\n",
+ "def rerank_ads_for_user(user_row, retrieved_ads, ranking_model):\n",
+ "    ads_df = pd.DataFrame(retrieved_ads)\n",
+ "    num_ads = len(ads_df)\n",
+ "    # Repeat the single user's features once per candidate ad.\n",
+ "    user_inputs = {\n",
+ "        k: tf.fill(\n",
+ "            (num_ads, 1),\n",
+ "            str(user_row[k]) if k not in continuous_features else float(user_row[k]),\n",
+ "        )\n",
+ "        for k in USER_FEATURES\n",
+ "    }\n",
+ "    ad_inputs = {\n",
+ "        k: tf.reshape(tf.constant(ads_df[k].astype(str).values), (-1, 1))\n",
+ "        for k in AD_FEATURES\n",
+ "    }\n",
+ "    scores = (\n",
+ "        ranking_model({\"user\": user_inputs, \"positive_ad\": ad_inputs}).numpy().flatten()\n",
+ "    )\n",
+ "    ads_df[\"ranking_score\"] = scores\n",
+ "    return ads_df.sort_values(\"ranking_score\", ascending=False).to_dict(\"records\")\n",
+ "\n",
+ "\n",
+ "sample_user = x_test.iloc[0]\n",
+ "scores, indices = retrieval_engine.query_batch(pd.DataFrame([sample_user]))\n",
+ "top_ads = retrieval_engine.decode_results(scores, indices)[0]\n",
+ "final_ranked_ads = rerank_ads_for_user(sample_user, top_ads, ranking_model)\n",
+ "print(f\"User: {sample_user['user_id']}\")\n",
+ "print(f\"{'Ad ID':<10} | {'Topic':<30} | {'Retrieval Score':<15} | {'Rank Probability'}\")\n",
+ "for item in final_ranked_ads:\n",
+ "    print(\n",
+ "        f\"{item['ad_id']:<10} | {item['ad_topic'][:28]:<30} | {item['score']:<15.4f} | {item['ranking_score']*100:.2f}%\"\n",
+ "    )"
+ ]
+ }
+ ],
+ "metadata": {
+ "accelerator": "GPU",
+ "colab": {
+ "collapsed_sections": [],
+ "name": "two_stage_rs_with_marketing_interaction",
+ "private_outputs": false,
+ "provenance": [],
+ "toc_visible": true
+ },
+ "kernelspec": {
+ "display_name": "Python 3",
+ "language": "python",
+ "name": "python3"
+ },
+ "language_info": {
+ "codemirror_mode": {
+ "name": "ipython",
+ "version": 3
+ },
+ "file_extension": ".py",
+ "mimetype": "text/x-python",
+ "name": "python",
+ "nbconvert_exporter": "python",
+ "pygments_lexer": "ipython3",
+ "version": "3.7.0"
+ }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 0
+}
\ No newline at end of file
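Note on the retrieval stage in the notebook above: its core is a dot product between tower outputs, a loss on the positive-minus-negative score difference, and an exhaustive top-K search. The sketch below illustrates those three pieces in plain NumPy so they can be checked without TensorFlow. It is only an illustration under stated assumptions: the embeddings are random stand-ins rather than tower outputs, and `bpr_loss` and `brute_force_top_k` are hypothetical helper names that do not exist in the notebook or in Keras-RS.

```python
import numpy as np


def bpr_loss(pos_scores, neg_scores):
    """BPR (log-sigmoid) loss on the positive-minus-negative score gap."""
    diff = pos_scores - neg_scores
    # -log(sigmoid(diff)) equals log(1 + exp(-diff)); logaddexp keeps it stable.
    return np.logaddexp(0.0, -diff)


def brute_force_top_k(user_emb, ad_embs, k):
    """Score one user against every ad and return the k best (indices, scores)."""
    scores = ad_embs @ user_emb  # one dot product per ad
    top_idx = np.argsort(-scores)[:k]  # ad indices, highest score first
    return top_idx, scores[top_idx]


rng = np.random.default_rng(0)
user_emb = rng.normal(size=8)  # assumption: stand-in for a User Tower output
ad_embs = rng.normal(size=(20, 8))  # assumption: stand-in for 20 Ad Tower outputs

top_idx, top_scores = brute_force_top_k(user_emb, ad_embs, k=5)
# First pair ranks the positive ad higher, second pair ranks it lower.
loss = bpr_loss(np.array([2.0, 0.0]), np.array([0.0, 2.0]))
print(top_idx, top_scores)
print(loss)
```

The loss is small when the positive ad outscores the negative one and grows when the order is reversed, which is the behavior the notebook's `bpr_hinge_loss` relies on when it receives `pos_score - neg_score`.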
diff --git a/examples/keras_rs/md/two_stage_rs_with_marketing_interaction.md b/examples/keras_rs/md/two_stage_rs_with_marketing_interaction.md
new file mode 100644
index 0000000000..87c31267c8
--- /dev/null
+++ b/examples/keras_rs/md/two_stage_rs_with_marketing_interaction.md
@@ -0,0 +1,836 @@
+# Two Stage Recommender System with Marketing Interaction
+
+**Author:** Mansi Mehta
+**Date created:** 26/11/2025
+**Last modified:** 26/11/2025
+**Description:** Recommender system with retrieval and ranking models for marketing interactions.
+
+
+ [**View in Colab**](https://colab.research.google.com/github/keras-team/keras-io/blob/master/examples/keras_rs/ipynb/two_stage_rs_with_marketing_interaction.ipynb) • [**GitHub source**](https://github.com/keras-team/keras-io/blob/master/examples/keras_rs/two_stage_rs_with_marketing_interaction.py)
+
+
+
+# **Introduction**
+
+This tutorial demonstrates a critical business scenario: a user lands on a website, and a
+marketing engine must decide which specific ad to display from an inventory of thousands.
+The goal is to maximize the Click-Through Rate (CTR). Showing irrelevant ads wastes
+marketing budget and annoys the user. Therefore, we need a system that predicts the
+probability of a specific user clicking on a specific ad based on their demographics and
+browsing habits.
+
+**Architecture**
+1. **The Retrieval Stage:** Efficiently selects an initial set of roughly 10-100
+candidates from the full ad inventory. It weeds out ads the user is unlikely to be
+interested in.
+    - User Tower: Embeds user features (ID, demographics, behavior) into a vector.
+    - Item Tower: Embeds ad features (Ad ID, Topic) into a vector.
+    - Interaction: The dot product of these two vectors represents similarity.
+2. **The Ranking Stage:** Takes the output of the retrieval model and fine-tunes the
+order to select the single best ad to show.
+    - Model: A deep neural network (MLP).
+    - Interaction: It takes the User Embedding, Ad Embedding, and their similarity score to
+    predict a precise probability (0% to 100%) that the user will click.
+
+
+
+# **Dataset**
+We will use the [Ad Click
+Prediction](https://www.kaggle.com/datasets/mafrojaakter/ad-click-data) dataset from
+Kaggle.
+
+**Feature distribution of the dataset:**
+- The User Tower describes who is looking. Its features are Gender, City, Country, Age,
+Daily Internet Usage, Daily Time Spent on Site, and Area Income.
+- The Item Tower describes what is being shown. Its features are Ad Topic Line and Ad ID.
+
+In this tutorial, we are going to build and train a Two-Tower (User Tower and Ad Tower)
+model using the Ad Click Prediction dataset from Kaggle.
+We're going to:
+1. **Data Pipeline:** Get our data and preprocess it for both Retrieval (implicit
+feedback) and Ranking (explicit labels).
+2. **Retrieval:** Implement and train a Two-Tower model to generate candidates.
+3. **Ranking:** Implement and train a Neural Ranking model to predict click probabilities.
+4. **Inference:** Run an end-to-end test (Retrieval --> Ranking) to generate
+recommendations for a specific user.
+
+
+```python
+!pip install -q keras-rs
+```
+
+
+
+```python
+import os
+
+os.environ["KERAS_BACKEND"] = "tensorflow"
+import keras
+import matplotlib.pyplot as plt
+import numpy as np
+import tensorflow as tf
+import pandas as pd
+import keras_rs
+from keras import layers
+from concurrent.futures import ThreadPoolExecutor
+from sklearn.model_selection import train_test_split
+from sklearn.preprocessing import MinMaxScaler
+
+```
+
+
+
+# **Preparing Dataset**
+
+
+```python
+!pip install -q kaggle
+# Download the dataset (requires a Kaggle API key in ~/.kaggle/kaggle.json)
+!kaggle datasets download -d mafrojaakter/ad-click-data --unzip -p ./ad_click_dataset
+```
+
+
+
+```
+Dataset URL: https://www.kaggle.com/datasets/mafrojaakter/ad-click-data
+License(s): unknown
+
+Downloading ad-click-data.zip to ./ad_click_dataset
+```
+
+
+
+
+```python
+data_path = "./ad_click_dataset/Ad_click_data.csv"
+if not os.path.exists(data_path):
+    # Fallback for filenames with spaces or different casing
+    data_path = "./ad_click_dataset/Ad Click Data.csv"
+
+ads_df = pd.read_csv(data_path)
+# Clean column names
+ads_df.columns = ads_df.columns.str.strip()
+# Rename columns to short snake_case feature names
+ads_df = ads_df.rename(
+    columns={
+        "Male": "gender",
+        "Ad Topic Line": "ad_topic",
+        "City": "city",
+        "Country": "country",
+        "Daily Time Spent on Site": "time_on_site",
+        "Daily Internet Usage": "internet_usage",
+        "Area Income": "area_income",
+    }
+)
+# Add user_id and ad_id columns
+ads_df["user_id"] = "user_" + ads_df.index.astype(str)
+ads_df["ad_id"] = "ad_" + ads_df["ad_topic"].astype("category").cat.codes.astype(str)
+# Remove rows with nulls
+ads_df = ads_df.dropna()
+# Scale the numeric features to the [0, 1] range
+numeric_cols = ["time_on_site", "internet_usage", "area_income", "Age"]
+scaler = MinMaxScaler()
+ads_df[numeric_cols] = scaler.fit_transform(ads_df[numeric_cols])
+
+# Split the train and test datasets
+x_train, x_test = train_test_split(ads_df, test_size=0.2, random_state=42)
+
+
+def dict_to_tensor_features(df_features, continuous_features):
+    tensor_dict = {}
+    for k, v in df_features.items():
+        if k in continuous_features:
+            tensor_dict[k] = tf.expand_dims(tf.constant(v, dtype="float32"), axis=-1)
+        else:
+            v_str = np.array(v).astype(str).tolist()
+            tensor_dict[k] = tf.expand_dims(tf.constant(v_str, dtype="string"), axis=-1)
+    return tensor_dict
+
+
+def create_retrieval_dataset(
+    data_df,
+    all_ads_features,
+    all_ad_ids,
+    user_features_list,
+    ad_features_list,
+    continuous_features_list,
+):
+
+    # Filter for positive interactions (clicks)
+    positive_interactions = data_df[data_df["Clicked on Ad"] == 1].copy()
+
+    if positive_interactions.empty:
+        return None
+
+    def sample_negative(positive_ad_id):
+        neg_ad_id = positive_ad_id
+        while neg_ad_id == positive_ad_id:
+            neg_ad_id = np.random.choice(all_ad_ids)
+        return neg_ad_id
+
+    def create_triplets_row(pos_row):
+        pos_ad_id = pos_row.ad_id
+        neg_ad_id = sample_negative(pos_ad_id)
+
+        neg_ad_row = all_ads_features[all_ads_features["ad_id"] == neg_ad_id].iloc[0]
+        user_features_dict = {
+            name: getattr(pos_row, name) for name in user_features_list
+        }
+        pos_ad_features_dict = {
+            name: getattr(pos_row, name) for name in ad_features_list
+        }
+        neg_ad_features_dict = {name: neg_ad_row[name] for name in ad_features_list}
+
+        return {
+            "user": user_features_dict,
+            "positive_ad": pos_ad_features_dict,
+            "negative_ad": neg_ad_features_dict,
+        }
+
+    with ThreadPoolExecutor(max_workers=8) as executor:
+        triplets = list(
+            executor.map(
+                create_triplets_row, positive_interactions.itertuples(index=False)
+            )
+        )
+
+    triplets_df = pd.DataFrame(triplets)
+    user_df = triplets_df["user"].apply(pd.Series)
+    pos_ad_df = triplets_df["positive_ad"].apply(pd.Series)
+    neg_ad_df = triplets_df["negative_ad"].apply(pd.Series)
+
+    user_features_tensor = dict_to_tensor_features(
+        user_df.to_dict("list"), continuous_features_list
+    )
+    pos_ad_features_tensor = dict_to_tensor_features(
+        pos_ad_df.to_dict("list"), continuous_features_list
+    )
+    neg_ad_features_tensor = dict_to_tensor_features(
+        neg_ad_df.to_dict("list"), continuous_features_list
+    )
+
+    features = {
+        "user": user_features_tensor,
+        "positive_ad": pos_ad_features_tensor,
+        "negative_ad": neg_ad_features_tensor,
+    }
+    y_true = tf.ones((triplets_df.shape[0], 1), dtype=tf.float32)
+    dataset = tf.data.Dataset.from_tensor_slices((features, y_true))
+    buffer_size = len(triplets_df)
+ dataset = (
+ dataset.shuffle(buffer_size=buffer_size)
+ .batch(64)
+ .cache()
+ .prefetch(tf.data.AUTOTUNE)
+ )
+ return dataset
+
+
+user_clicked_ads = (
+ x_train[x_train["Clicked on Ad"] == 1]
+ .groupby("user_id")["ad_id"]
+ .apply(set)
+ .to_dict()
+)
+
+for u in x_train["user_id"].unique():
+ if u not in user_clicked_ads:
+ user_clicked_ads[u] = set()
+
+AD_FEATURES = ["ad_id", "ad_topic"]
+USER_FEATURES = [
+ "user_id",
+ "gender",
+ "city",
+ "country",
+ "time_on_site",
+ "internet_usage",
+ "area_income",
+ "Age",
+]
+continuous_features = ["time_on_site", "internet_usage", "area_income", "Age"]
+
+all_ads_features = x_train[AD_FEATURES].drop_duplicates().reset_index(drop=True)
+all_ad_ids = all_ads_features["ad_id"].tolist()
+
+retrieval_train_dataset = create_retrieval_dataset(
+ data_df=x_train,
+ all_ads_features=all_ads_features,
+ all_ad_ids=all_ad_ids,
+ user_features_list=USER_FEATURES,
+ ad_features_list=AD_FEATURES,
+ continuous_features_list=continuous_features,
+)
+
+retrieval_test_dataset = create_retrieval_dataset(
+ data_df=x_test,
+ all_ads_features=all_ads_features,
+ all_ad_ids=all_ad_ids,
+ user_features_list=USER_FEATURES,
+ ad_features_list=AD_FEATURES,
+ continuous_features_list=continuous_features,
+)
+```
+
+# **Implement the Retrieval Model**
+For the Retrieval stage, we will build a Two-Tower Model.
+
+**The Architecture Components:**
+
+1. User Tower: Takes user features (User ID, demographics, and behavior metrics such as
+time_on_site) and encodes them into a fixed-size vector called the User Embedding.
+2. Item (Ad) Tower: Takes ad features (Ad ID, Ad Topic Line) and encodes them into a
+fixed-size vector called the Item Embedding.
+3. Interaction (Similarity): We calculate the dot product between the User Embedding and
+the Item Embedding.
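+
+As a minimal sketch of the interaction step, the snippet below computes the row-wise dot
+product with NumPy. The 4-dimensional arrays are made-up values chosen for readability
+(the towers in this tutorial output 64-dimensional embeddings), and the variable names
+are illustrative assumptions rather than part of the model code.
+
+```python
+import numpy as np
+
+# Hypothetical embeddings for 2 users and 2 ads (4 dims for readability).
+user_emb = np.array([[1.0, 0.0, 2.0, 0.0], [0.0, 1.0, 0.0, 1.0]])
+ad_emb = np.array([[0.5, 0.0, 1.0, 0.0], [1.0, 1.0, 0.0, 0.0]])
+
+# Row-wise dot product: one similarity score per (user, ad) pair.
+scores = np.sum(user_emb * ad_emb, axis=1, keepdims=True)
+print(scores.shape)  # (2, 1)
+```
+
+A larger score indicates a closer match between the user and the ad, which is the signal
+the retrieval model is trained to produce for clicked ads.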
+
+
+```python
+keras.utils.set_random_seed(42)
+
+vocab_map = {
+ "user_id": x_train["user_id"].unique(),
+ "gender": x_train["gender"].astype(str).unique(),
+ "city": x_train["city"].unique(),
+ "country": x_train["country"].unique(),
+ "ad_id": x_train["ad_id"].unique(),
+ "ad_topic": x_train["ad_topic"].unique(),
+}
+cont_feats = ["time_on_site", "internet_usage", "area_income", "Age"]
+
+normalizers = {}
+for f in cont_feats:
+ norm = layers.Normalization(axis=None)
+ norm.adapt(x_train[f].values.astype("float32"))
+ normalizers[f] = norm
+
+
+def build_tower(feature_names, continuous_names=None, embed_dim=64, name="tower"):
+ inputs, embeddings = {}, []
+
+ for feat in feature_names:
+ if feat in vocab_map:
+ inp = keras.Input(shape=(1,), dtype=tf.string, name=feat)
+ inputs[feat] = inp
+ vocab = list(vocab_map[feat])
+ x = layers.StringLookup(vocabulary=vocab)(inp)
+ x = layers.Embedding(
+ len(vocab) + 1, embed_dim, embeddings_regularizer="l2"
+ )(x)
+ embeddings.append(layers.Flatten()(x))
+
+ if continuous_names:
+ for feat in continuous_names:
+ inp = keras.Input(shape=(1,), dtype=tf.float32, name=feat)
+ inputs[feat] = inp
+ embeddings.append(normalizers[feat](inp))
+
+ x = layers.Concatenate()(embeddings)
+ x = layers.Dense(128, activation="relu")(x)
+ x = layers.Dropout(0.2)(x)
+ x = layers.Dense(64, activation="relu")(x)
+ output = layers.Dense(embed_dim)(layers.Dropout(0.2)(x))
+
+ return keras.Model(inputs=inputs, outputs=output, name=name)
+
+
+user_tower = build_tower(
+ ["user_id", "gender", "city", "country"], cont_feats, name="user_tower"
+)
+ad_tower = build_tower(["ad_id", "ad_topic"], name="ad_tower")
+
+
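+def bpr_hinge_loss(y_true, y_pred):
+    # BPR loss on y_pred = pos_score - neg_score; y_true is unused.
+    # Despite the name, no hinge margin is applied.
+    return -tf.math.log(tf.nn.sigmoid(y_pred) + 1e-10)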
+
+
+class RetrievalModel(keras.Model):
+    def __init__(self, user_tower_instance, ad_tower_instance, **kwargs):
+        super().__init__(**kwargs)
+        # Use the towers passed in rather than the module-level globals.
+        self.user_tower = user_tower_instance
+        self.ad_tower = ad_tower_instance
+        self.ln_user = layers.LayerNormalization()
+        self.ln_ad = layers.LayerNormalization()
+
+    def call(self, inputs):
+        u_emb = self.ln_user(self.user_tower(inputs["user"]))
+        pos_emb = self.ln_ad(self.ad_tower(inputs["positive_ad"]))
+        neg_emb = self.ln_ad(self.ad_tower(inputs["negative_ad"]))
+        pos_score = keras.ops.sum(u_emb * pos_emb, axis=1, keepdims=True)
+        neg_score = keras.ops.sum(u_emb * neg_emb, axis=1, keepdims=True)
+        # Training signal: how much the clicked ad outscores the sampled negative.
+        return pos_score - neg_score
+
+    def get_embeddings(self, inputs):
+        u_emb = self.ln_user(self.user_tower(inputs["user"]))
+        ad_emb = self.ln_ad(self.ad_tower(inputs["positive_ad"]))
+        dot_interaction = keras.ops.sum(u_emb * ad_emb, axis=1, keepdims=True)
+        return u_emb, ad_emb, dot_interaction
+
+
+retrieval_model = RetrievalModel(user_tower, ad_tower)
+retrieval_model.compile(
+ optimizer=keras.optimizers.Adam(learning_rate=1e-3), loss=bpr_hinge_loss
+)
+history = retrieval_model.fit(retrieval_train_dataset, epochs=30)
+
+pd.DataFrame(history.history).plot(
+    subplots=True, layout=(1, 3), figsize=(12, 4), title="Retrieval Model Metrics"
+)
+plt.show()
+```
+
+
+```
+Epoch 1/30
+
+6/6 ━━━━━━━━━━━━━━━━━━━━ 2s 2ms/step - loss: 2.8117
+
+Epoch 2/30
+
+6/6 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 1.3631
+
+Epoch 3/30
+
+6/6 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 1.0918
+
+Epoch 4/30
+
+6/6 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.9143
+
+Epoch 5/30
+
+6/6 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.7872
+
+Epoch 6/30
+
+6/6 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.6925
+
+Epoch 7/30
+
+6/6 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.6203
+
+Epoch 8/30
+
+6/6 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.5641
+
+Epoch 9/30
+
+6/6 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.5190
+
+Epoch 10/30
+
+6/6 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.4817
+
+Epoch 11/30
+
+6/6 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.4499
+
+Epoch 12/30
+
+6/6 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.4220
+
+Epoch 13/30
+
+6/6 ━━━━━━━━━━━━━━━━━━━━ 0s 8ms/step - loss: 0.3970
+
+Epoch 14/30
+
+6/6 ━━━━━━━━━━━━━━━━━━━━ 0s 6ms/step - loss: 0.3743
+
+Epoch 15/30
+
+6/6 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.3537
+
+Epoch 16/30
+
+6/6 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.3346
+
+Epoch 17/30
+
+6/6 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - loss: 0.3171
+
+Epoch 18/30
+
+6/6 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - loss: 0.3009
+
+Epoch 19/30
+
+6/6 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - loss: 0.2858
+
+Epoch 20/30
+
+6/6 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - loss: 0.2718
+
+Epoch 21/30
+
+6/6 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - loss: 0.2587
+
+Epoch 22/30
+
+6/6 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - loss: 0.2465
+
+Epoch 23/30
+
+6/6 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - loss: 0.2350
+
+Epoch 24/30
+
+6/6 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - loss: 0.2243
+
+Epoch 25/30
+
+6/6 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - loss: 0.2142
+
+Epoch 26/30
+
+6/6 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - loss: 0.2046
+
+Epoch 27/30
+
+6/6 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - loss: 0.1956
+
+Epoch 28/30
+
+6/6 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - loss: 0.1871
+
+Epoch 29/30
+
+6/6 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - loss: 0.1791
+
+Epoch 30/30
+
+6/6 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - loss: 0.1715
+```
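+
+The loss passed to `compile` above is the Bayesian Personalized Ranking (BPR) objective,
+`-log(sigmoid(pos_score - neg_score))`. It is small when the clicked ad outscores the
+sampled negative and grows when the order is reversed. Below is a minimal NumPy sketch;
+the helper name `bpr_loss` and the score differences are illustrative assumptions, not
+model outputs.
+
+```python
+import numpy as np
+
+
+def bpr_loss(score_diff, eps=1e-10):
+    # score_diff plays the role of pos_score - neg_score from RetrievalModel.call.
+    return -np.log(1.0 / (1.0 + np.exp(-score_diff)) + eps)
+
+
+# Negative ad wins, tie, positive ad wins.
+diffs = np.array([-2.0, 0.0, 2.0])
+losses = bpr_loss(diffs)
+print(np.round(losses, 4))
+```
+
+The three losses decrease from roughly 2.13 to 0.69 to 0.13, so the falling training
+loss above means the clicked ads are increasingly scored above the sampled negatives.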
+
+
+
+
+
+
+# **Predictions of Retrieval Model**
+Now that the Two-Tower model is trained, we use it to generate candidates.
+
+We can implement the inference pipeline in three steps:
+1. Indexing: We run the Item Tower once for all available ads to generate their
+embeddings.
+2. Query Encoding: When a user arrives, we pass their features through the User Tower to
+generate a User Embedding.
+3. Nearest Neighbor Search: We search the index to find the Ad Embeddings closest to the
+User Embedding (highest dot product).
+
+The Keras-RS [BruteForceRetrieval
+layer](https://keras.io/keras_rs/api/retrieval_layers/brute_force_retrieval/) calculates
+the dot product between the user and every single item in the index to find the exact
+top-K matches. The wrapper below performs the same exhaustive search with TensorFlow ops.
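+
+The scoring and search steps can be sketched in a few lines of NumPy. The random
+embeddings, the index size of 100 ads, and the 8-dimensional vectors are assumptions
+made for illustration only; in the tutorial these arrays come from the trained towers.
+
+```python
+import numpy as np
+
+rng = np.random.default_rng(0)
+candidate_embs = rng.normal(size=(100, 8))  # hypothetical index: 100 ads, 8 dims
+user_emb = rng.normal(size=(1, 8))  # hypothetical query embedding
+
+# 1. Score every candidate with a dot product.
+scores = user_emb @ candidate_embs.T  # shape (1, 100)
+
+# 2. Keep the k highest-scoring candidates, best first.
+k = 10
+top_indices = np.argsort(-scores, axis=1)[:, :k]
+top_scores = np.take_along_axis(scores, top_indices, axis=1)
+print(top_indices.shape)  # (1, 10)
+```
+
+Because every candidate is scored, the result is exact. The cost grows linearly with the
+number of ads, which is acceptable for a small inventory like this one.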
+
+
+```python
+USER_CATEGORICAL = ["user_id", "gender", "city", "country"]
+CONTINUOUS_FEATURES = ["time_on_site", "internet_usage", "area_income", "Age"]
+USER_FEATURES = USER_CATEGORICAL + CONTINUOUS_FEATURES
+
+
+class BruteForceRetrievalWrapper:
+ def __init__(self, model, ads_df, ad_features, user_features, k=10):
+ self.model, self.k = model, k
+ self.user_features = user_features
+ unique_ads = ads_df[ad_features].drop_duplicates("ad_id").reset_index(drop=True)
+ self.ids = unique_ads["ad_id"].values
+ self.topic_map = dict(zip(unique_ads["ad_id"], unique_ads["ad_topic"]))
+ ad_inputs = {
+ "ad_id": tf.constant(self.ids.astype(str)),
+ "ad_topic": tf.constant(unique_ads["ad_topic"].astype(str).values),
+ }
+ self.candidate_embs = model.ln_ad(model.ad_tower(ad_inputs))
+
+ def query_batch(self, user_df):
+ inputs = {
+ k: tf.constant(
+ user_df[k].values.astype(float if k in CONTINUOUS_FEATURES else str)
+ )
+ for k in self.user_features
+ if k in user_df.columns
+ }
+ u_emb = self.model.ln_user(self.model.user_tower(inputs))
+ scores = tf.linalg.matmul(u_emb, self.candidate_embs, transpose_b=True)
+ top_scores, top_indices = tf.math.top_k(scores, k=self.k)
+ return top_scores.numpy(), top_indices.numpy()
+
+ def decode_results(self, scores, indices):
+ results = []
+ for row_scores, row_indices in zip(scores, indices):
+ retrieved_ids = self.ids[row_indices]
+ results.append(
+ [
+ {"ad_id": aid, "ad_topic": self.topic_map[aid], "score": float(s)}
+ for aid, s in zip(retrieved_ids, row_scores)
+ ]
+ )
+ return results
+
+
+retrieval_engine = BruteForceRetrievalWrapper(
+ model=retrieval_model,
+ ads_df=ads_df,
+ ad_features=["ad_id", "ad_topic"],
+ user_features=USER_FEATURES,
+ k=10,
+)
+sample_user = pd.DataFrame([x_test.iloc[0]])
+scores, indices = retrieval_engine.query_batch(sample_user)
+top_ads = retrieval_engine.decode_results(scores, indices)[0]
+```
+
+# **Implementation of Ranking Model**
+The retrieval model only calculates a simple similarity score (dot product). It doesn't
+account for complex feature interactions, so we build a ranking model that runs after
+the retrieval model.
+
+**Architecture**
+1. **Feature Extraction:** We reuse the trained User Tower and Ad Tower from the
+Retrieval stage. We freeze these towers (trainable = False) so their weights don't
+change.
+2. **Interaction:** Instead of just a dot product, we concatenate three inputs: the User
+Embedding, the Ad Embedding, and the Dot Product (Similarity).
+3. **Scorer (MLP):** These concatenated inputs are fed into a Multi-Layer Perceptron, a
+stack of Dense layers. This network learns the non-linear relationships between the user
+and the ad.
+4. **Output:** The final layer uses a Sigmoid activation to output a single probability
+between 0.0 and 1.0 (likelihood of a click).
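+
+To make the interaction step concrete, here is a small NumPy sketch of how the MLP input
+is assembled. The random arrays stand in for the frozen towers' 64-dimensional outputs,
+and the batch size of 3 is an arbitrary assumption.
+
+```python
+import numpy as np
+
+batch, dim = 3, 64
+rng = np.random.default_rng(0)
+u_emb = rng.normal(size=(batch, dim))  # stand-in for user embeddings
+ad_emb = rng.normal(size=(batch, dim))  # stand-in for ad embeddings
+dot = np.sum(u_emb * ad_emb, axis=1, keepdims=True)
+
+# The MLP input is [user embedding | ad embedding | dot product].
+mlp_input = np.concatenate([u_emb, ad_emb, dot], axis=-1)
+print(mlp_input.shape)  # (3, 129)
+```
+
+Each row therefore has 64 + 64 + 1 = 129 features, so the MLP sees both raw embeddings
+in addition to the similarity score used at retrieval time.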
+
+
+```python
+retrieval_model.trainable = False
+
+
+def create_ranking_ds(df):
+ inputs = {
+ "user": dict_to_tensor_features(df[USER_FEATURES], continuous_features),
+ "positive_ad": dict_to_tensor_features(df[AD_FEATURES], continuous_features),
+ }
+ return (
+ tf.data.Dataset.from_tensor_slices(
+ (inputs, df["Clicked on Ad"].values.astype("float32"))
+ )
+ .shuffle(10000)
+ .batch(256)
+ .prefetch(tf.data.AUTOTUNE)
+ )
+
+
+ranking_train_dataset = create_ranking_ds(x_train)
+ranking_test_dataset = create_ranking_ds(x_test)
+
+
+class RankingModel(keras.Model):
+ def __init__(self, retrieval_model, **kwargs):
+ super().__init__(**kwargs)
+ self.retrieval = retrieval_model
+ self.mlp = keras.Sequential(
+ [
+ layers.Dense(256, activation="relu"),
+ layers.Dropout(0.2),
+ layers.Dense(128, activation="relu"),
+ layers.Dropout(0.2),
+ layers.Dense(64, activation="relu"),
+ layers.Dense(1, activation="sigmoid"),
+ ]
+ )
+
+ def call(self, inputs):
+ u_emb, ad_emb, dot = self.retrieval.get_embeddings(inputs)
+ return self.mlp(keras.ops.concatenate([u_emb, ad_emb, dot], axis=-1))
+
+
+ranking_model = RankingModel(retrieval_model)
+ranking_model.compile(
+ optimizer=keras.optimizers.Adam(1e-4),
+ loss="binary_crossentropy",
+ metrics=["AUC", "accuracy"],
+)
+history1 = ranking_model.fit(ranking_train_dataset, epochs=20)
+
+pd.DataFrame(history1.history).plot(
+ subplots=True, layout=(1, 3), figsize=(12, 4), title="Ranking Model Metrics"
+)
+plt.show()
+
+ranking_model.evaluate(ranking_test_dataset)
+```
+
+
+```
+Epoch 1/20
+
+3/3 ━━━━━━━━━━━━━━━━━━━━ 1s 5ms/step - AUC: 0.6079 - accuracy: 0.4961 - loss: 0.6890
+
+Epoch 2/20
+
+3/3 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - AUC: 0.8329 - accuracy: 0.5748 - loss: 0.6423
+
+Epoch 3/20
+
+3/3 ━━━━━━━━━━━━━━━━━━━━ 0s 5ms/step - AUC: 0.9284 - accuracy: 0.7467 - loss: 0.5995
+
+Epoch 4/20
+
+3/3 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - AUC: 0.9636 - accuracy: 0.8766 - loss: 0.5599
+
+Epoch 5/20
+
+3/3 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - AUC: 0.9763 - accuracy: 0.9213 - loss: 0.5229
+
+Epoch 6/20
+
+3/3 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - AUC: 0.9824 - accuracy: 0.9304 - loss: 0.4876
+
+Epoch 7/20
+
+3/3 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - AUC: 0.9862 - accuracy: 0.9331 - loss: 0.4540
+
+Epoch 8/20
+
+3/3 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - AUC: 0.9880 - accuracy: 0.9357 - loss: 0.4224
+
+Epoch 9/20
+
+3/3 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - AUC: 0.9898 - accuracy: 0.9436 - loss: 0.3920
+
+Epoch 10/20
+
+3/3 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - AUC: 0.9911 - accuracy: 0.9475 - loss: 0.3633
+
+Epoch 11/20
+
+3/3 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - AUC: 0.9914 - accuracy: 0.9528 - loss: 0.3361
+
+Epoch 12/20
+
+3/3 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - AUC: 0.9923 - accuracy: 0.9580 - loss: 0.3103
+
+Epoch 13/20
+
+3/3 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - AUC: 0.9925 - accuracy: 0.9619 - loss: 0.2866
+
+Epoch 14/20
+
+3/3 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - AUC: 0.9931 - accuracy: 0.9633 - loss: 0.2643
+
+Epoch 15/20
+
+3/3 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - AUC: 0.9935 - accuracy: 0.9633 - loss: 0.2436
+
+Epoch 16/20
+
+3/3 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - AUC: 0.9938 - accuracy: 0.9659 - loss: 0.2247
+
+Epoch 17/20
+
+3/3 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - AUC: 0.9942 - accuracy: 0.9646 - loss: 0.2076
+
+Epoch 18/20
+
+3/3 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - AUC: 0.9945 - accuracy: 0.9659 - loss: 0.1918
+
+Epoch 19/20
+
+3/3 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - AUC: 0.9947 - accuracy: 0.9672 - loss: 0.1777
+
+Epoch 20/20
+
+3/3 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - AUC: 0.9953 - accuracy: 0.9685 - loss: 0.1645
+```
+
+
+
+
+
+
+
+
+```
+1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 230ms/step - AUC: 0.9904 - accuracy: 0.9476 - loss: 0.2319
+
+[0.2318607121706009, 0.9903508424758911, 0.9476439952850342]
+```
+
+
+# **Predictions of Ranking Model**
+The retrieval model gave us a list of ads that are generally relevant (high dot product
+similarity). The ranking model will now calculate the specific probability (0% to 100%)
+that the user will click each of those ads.
+
+The ranking model expects (User, Ad) pairs. Since we are scoring 10 ads for 1 user, we
+cannot just pass the user features once. Instead, we repeat the user's features 10 times
+to create a batch with one row per candidate ad.
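+
+The repetition step can be sketched with NumPy as follows. The `user_row` dictionary and
+its values are illustrative assumptions; the tutorial code does the same thing with
+`tf.fill` over the full list of user features.
+
+```python
+import numpy as np
+
+num_ads = 10
+user_row = {"user_id": "user_216", "time_on_site": 0.42}  # illustrative values
+
+# Repeat each user feature so it lines up with the 10 candidate ads.
+user_batch = {name: np.full((num_ads, 1), value) for name, value in user_row.items()}
+print(user_batch["user_id"].shape)  # (10, 1)
+```
+
+Every row of the batch now pairs the same user with a different candidate ad, which is
+the input layout the ranking model was trained on.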
+
+
+```python
+
+def rerank_ads_for_user(user_row, retrieved_ads, ranking_model):
+ ads_df = pd.DataFrame(retrieved_ads)
+ num_ads = len(ads_df)
+ user_inputs = {
+ k: tf.fill(
+ (num_ads, 1),
+ str(user_row[k]) if k not in continuous_features else float(user_row[k]),
+ )
+ for k in USER_FEATURES
+ }
+ ad_inputs = {
+ k: tf.reshape(tf.constant(ads_df[k].astype(str).values), (-1, 1))
+ for k in AD_FEATURES
+ }
+ scores = (
+ ranking_model({"user": user_inputs, "positive_ad": ad_inputs}).numpy().flatten()
+ )
+ ads_df["ranking_score"] = scores
+ return ads_df.sort_values("ranking_score", ascending=False).to_dict("records")
+
+
+sample_user = x_test.iloc[0]
+scores, indices = retrieval_engine.query_batch(pd.DataFrame([sample_user]))
+top_ads = retrieval_engine.decode_results(scores, indices)[0]
+final_ranked_ads = rerank_ads_for_user(sample_user, top_ads, ranking_model)
+print(f"User: {sample_user['user_id']}")
+print(f"{'Ad ID':<10} | {'Topic':<30} | {'Retrieval Score':<11} | {'Rank Probability'}")
+for item in final_ranked_ads:
+ print(
+ f"{item['ad_id']:<10} | {item['ad_topic'][:28]:<30} | {item['score']:.4f} |{item['ranking_score']*100:.2f}%"
+ )
+```
+
+
+```
+User: user_216
+Ad ID | Topic | Retrival Score | Rank Probability
+ad_305 | Front-line fault-tolerant in | 8.2131 |99.27%
+ad_318 | Front-line upward-trending g | 7.6231 |99.17%
+ad_758 | Right-sized multi-tasking so | 7.1814 |99.06%
+ad_767 | Robust object-oriented Graph | 7.2068 |99.02%
+ad_620 | Polarized modular function | 7.2857 |98.92%
+ad_522 | Open-architected full-range | 7.0892 |98.82%
+ad_771 | Robust web-enabled attitude | 7.3828 |98.81%
+ad_810 | Sharable optimal capacity | 6.7046 |98.69%
+ad_31 | Ameliorated well-modulated c | 6.9498 |98.40%
+ad_104 | Configurable 24/7 hub | 6.7244 |98.39%
+```
+
diff --git a/examples/keras_rs/two_stage_rs_with_marketing_interaction.py b/examples/keras_rs/two_stage_rs_with_marketing_interaction.py
new file mode 100644
index 0000000000..b2c1e572ca
--- /dev/null
+++ b/examples/keras_rs/two_stage_rs_with_marketing_interaction.py
@@ -0,0 +1,556 @@
+"""
+Title: Two Stage Recommender System with Marketing Interaction
+Author: Mansi Mehta
+Date created: 26/11/2025
+Last modified: 26/11/2025
+Description: Recommender System with Ranking and Retrieval model for Marketing interaction.
+Accelerator: GPU
+"""
+
+"""
+# **Introduction**
+
+This tutorial demonstrates a critical business scenario: a user lands on a website, and a
+marketing engine must decide which specific ad to display from an inventory of thousands.
+The goal is to maximize the Click-Through Rate (CTR). Showing irrelevant ads wastes
+marketing budget and annoys the user. Therefore, we need a system that predicts the
+probability of a specific user clicking on a specific ad based on their demographics and
+browsing habits.
+
+**Architecture**
+1. **The Retrieval Stage:** Efficiently selects an initial set of roughly 10-100
+candidates from millions of possibilities. It weeds out items the user is definitely not
+interested in.
+User Tower: Embeds user features (ID, demographics, behavior) into a vector.
+Item Tower: Embeds ad features (Ad ID, Topic) into a vector.
+Interaction: The dot product of these two vectors represents similarity.
+2. **The Ranking Stage:** Takes the output of the retrieval model and fine-tunes the
+order to select the single best ad to show.
+Model: A Deep Neural Network (MLP).
+Interaction: It takes the User Embedding, Ad Embedding, and their similarity score to
+predict a precise probability (0% to 100%) that the user will click.
+
+
+"""
+
+"""
+# **Dataset**
+We will use the [Ad Click
+Prediction](https://www.kaggle.com/datasets/mafrojaakter/ad-click-data) dataset from
+Kaggle.
+
+**Feature Distribution of dataset:**
+The User Tower describes who is looking. Its features are Gender, City, Country, Age,
+Daily Internet Usage, Daily Time Spent on Site, and Area Income.
+The Item Tower describes what is being shown. Its features are Ad Topic Line and Ad ID.
+
+In this tutorial, we are going to build and train a Two-Tower (User Tower and Ad Tower)
+model using the Ad Click Prediction dataset from Kaggle.
+We're going to:
+1. **Data Pipeline:** Get our data and preprocess it for both Retrieval (implicit
+feedback) and Ranking (explicit labels).
+2. **Retrieval:** Implement and train a Two-Tower model to generate candidates.
+3. **Ranking:** Implement and train a Neural Ranking model to predict click probabilities.
+4. **Inference:** Run an end-to-end test (Retrieval --> Ranking) to generate
+recommendations for a specific user.
+"""
+
+"""shell
+!pip install -q keras-rs
+"""
+
+import os
+
+os.environ["KERAS_BACKEND"] = "tensorflow"
+import keras
+import matplotlib.pyplot as plt
+import numpy as np
+import tensorflow as tf
+import pandas as pd
+import keras_rs
+import tensorflow_datasets as tfds
+from mpl_toolkits.axes_grid1 import make_axes_locatable
+from keras import layers
+from concurrent.futures import ThreadPoolExecutor
+from sklearn.model_selection import train_test_split
+from sklearn.preprocessing import MinMaxScaler
+
+
+"""
+# **Preparing Dataset**
+"""
+
+"""shell
+pip install -q kaggle
+# Download the dataset (requires Kaggle API key in ~/.kaggle/kaggle.json)
+kaggle datasets download -d mafrojaakter/ad-click-data --unzip -p ./ad_click_dataset
+"""
+data_path = "./ad_click_dataset/Ad_click_data.csv"
+if not os.path.exists(data_path):
+ # Fallback for filenames with spaces or different casing
+ data_path = "./ad_click_dataset/Ad Click Data.csv"
+
+ads_df = pd.read_csv(data_path)
+# Clean column names
+ads_df.columns = ads_df.columns.str.strip()
+# Rename columns to shorter, consistent names
+ads_df = ads_df.rename(
+ columns={
+ "Male": "gender",
+ "Ad Topic Line": "ad_topic",
+ "City": "city",
+ "Country": "country",
+ "Daily Time Spent on Site": "time_on_site",
+ "Daily Internet Usage": "internet_usage",
+ "Area Income": "area_income",
+ }
+)
+# Add user_id and ad_id columns
+ads_df["user_id"] = "user_" + ads_df.index.astype(str)
+ads_df["ad_id"] = "ad_" + ads_df["ad_topic"].astype("category").cat.codes.astype(str)
+# Remove rows with nulls
+ads_df = ads_df.dropna()
+# Scale the numeric features to the [0, 1] range
+numeric_cols = ["time_on_site", "internet_usage", "area_income", "Age"]
+scaler = MinMaxScaler()
+ads_df[numeric_cols] = scaler.fit_transform(ads_df[numeric_cols])
+
+# Split the train and test datasets
+x_train, x_test = train_test_split(ads_df, test_size=0.2, random_state=42)
+
+
+def dict_to_tensor_features(df_features, continuous_features):
+ tensor_dict = {}
+ for k, v in df_features.items():
+ if k in continuous_features:
+ tensor_dict[k] = tf.expand_dims(tf.constant(v, dtype="float32"), axis=-1)
+ else:
+ v_str = np.array(v).astype(str).tolist()
+ tensor_dict[k] = tf.expand_dims(tf.constant(v_str, dtype="string"), axis=-1)
+ return tensor_dict
+
+
+def create_retrieval_dataset(
+ data_df,
+ all_ads_features,
+ all_ad_ids,
+ user_features_list,
+ ad_features_list,
+ continuous_features_list,
+):
+
+    # Filter for positive interactions (clicks)
+ positive_interactions = data_df[data_df["Clicked on Ad"] == 1].copy()
+
+ if positive_interactions.empty:
+ return None
+
+ def sample_negative(positive_ad_id):
+ neg_ad_id = positive_ad_id
+ while neg_ad_id == positive_ad_id:
+ neg_ad_id = np.random.choice(all_ad_ids)
+ return neg_ad_id
+
+ def create_triplets_row(pos_row):
+ pos_ad_id = pos_row.ad_id
+ neg_ad_id = sample_negative(pos_ad_id)
+
+ neg_ad_row = all_ads_features[all_ads_features["ad_id"] == neg_ad_id].iloc[0]
+ user_features_dict = {
+ name: getattr(pos_row, name) for name in user_features_list
+ }
+ pos_ad_features_dict = {
+ name: getattr(pos_row, name) for name in ad_features_list
+ }
+ neg_ad_features_dict = {name: neg_ad_row[name] for name in ad_features_list}
+
+ return {
+ "user": user_features_dict,
+ "positive_ad": pos_ad_features_dict,
+ "negative_ad": neg_ad_features_dict,
+ }
+
+ with ThreadPoolExecutor(max_workers=8) as executor:
+ triplets = list(
+ executor.map(
+ create_triplets_row, positive_interactions.itertuples(index=False)
+ )
+ )
+
+ triplets_df = pd.DataFrame(triplets)
+ user_df = triplets_df["user"].apply(pd.Series)
+ pos_ad_df = triplets_df["positive_ad"].apply(pd.Series)
+ neg_ad_df = triplets_df["negative_ad"].apply(pd.Series)
+
+ user_features_tensor = dict_to_tensor_features(
+ user_df.to_dict("list"), continuous_features_list
+ )
+ pos_ad_features_tensor = dict_to_tensor_features(
+ pos_ad_df.to_dict("list"), continuous_features_list
+ )
+ neg_ad_features_tensor = dict_to_tensor_features(
+ neg_ad_df.to_dict("list"), continuous_features_list
+ )
+
+ features = {
+ "user": user_features_tensor,
+ "positive_ad": pos_ad_features_tensor,
+ "negative_ad": neg_ad_features_tensor,
+ }
+ y_true = tf.ones((triplets_df.shape[0], 1), dtype=tf.float32)
+ dataset = tf.data.Dataset.from_tensor_slices((features, y_true))
+ buffer_size = len(triplets_df)
+ dataset = (
+ dataset.shuffle(buffer_size=buffer_size)
+ .batch(64)
+ .cache()
+ .prefetch(tf.data.AUTOTUNE)
+ )
+ return dataset
+
+
+user_clicked_ads = (
+ x_train[x_train["Clicked on Ad"] == 1]
+ .groupby("user_id")["ad_id"]
+ .apply(set)
+ .to_dict()
+)
+
+for u in x_train["user_id"].unique():
+ if u not in user_clicked_ads:
+ user_clicked_ads[u] = set()
+
+AD_FEATURES = ["ad_id", "ad_topic"]
+USER_FEATURES = [
+ "user_id",
+ "gender",
+ "city",
+ "country",
+ "time_on_site",
+ "internet_usage",
+ "area_income",
+ "Age",
+]
+continuous_features = ["time_on_site", "internet_usage", "area_income", "Age"]
+
+all_ads_features = x_train[AD_FEATURES].drop_duplicates().reset_index(drop=True)
+all_ad_ids = all_ads_features["ad_id"].tolist()
+
+retrieval_train_dataset = create_retrieval_dataset(
+ data_df=x_train,
+ all_ads_features=all_ads_features,
+ all_ad_ids=all_ad_ids,
+ user_features_list=USER_FEATURES,
+ ad_features_list=AD_FEATURES,
+ continuous_features_list=continuous_features,
+)
+
+retrieval_test_dataset = create_retrieval_dataset(
+ data_df=x_test,
+ all_ads_features=all_ads_features,
+ all_ad_ids=all_ad_ids,
+ user_features_list=USER_FEATURES,
+ ad_features_list=AD_FEATURES,
+ continuous_features_list=continuous_features,
+)
+
+"""
+# **Implement the Retrieval Model**
+For the Retrieval stage, we will build a Two-Tower Model.
+
+**The Architecture Components:**
+
+1. User Tower: Takes user features (User ID, demographics, and behavior metrics such as
+time_on_site) and encodes them into a fixed-size vector called the User Embedding.
+2. Item (Ad) Tower: Takes ad features (Ad ID, Ad Topic Line) and encodes them into a
+fixed-size vector called the Item Embedding.
+3. Interaction (Similarity): We calculate the dot product between the User Embedding and
+the Item Embedding.
+"""
+
+keras.utils.set_random_seed(42)
+
+vocab_map = {
+ "user_id": x_train["user_id"].unique(),
+ "gender": x_train["gender"].astype(str).unique(),
+ "city": x_train["city"].unique(),
+ "country": x_train["country"].unique(),
+ "ad_id": x_train["ad_id"].unique(),
+ "ad_topic": x_train["ad_topic"].unique(),
+}
+cont_feats = ["time_on_site", "internet_usage", "area_income", "Age"]
+
+normalizers = {}
+for f in cont_feats:
+ norm = layers.Normalization(axis=None)
+ norm.adapt(x_train[f].values.astype("float32"))
+ normalizers[f] = norm
+
+
+def build_tower(feature_names, continuous_names=None, embed_dim=64, name="tower"):
+ inputs, embeddings = {}, []
+
+ for feat in feature_names:
+ if feat in vocab_map:
+ inp = keras.Input(shape=(1,), dtype=tf.string, name=feat)
+ inputs[feat] = inp
+ vocab = list(vocab_map[feat])
+ x = layers.StringLookup(vocabulary=vocab)(inp)
+ x = layers.Embedding(
+ len(vocab) + 1, embed_dim, embeddings_regularizer="l2"
+ )(x)
+ embeddings.append(layers.Flatten()(x))
+
+ if continuous_names:
+ for feat in continuous_names:
+ inp = keras.Input(shape=(1,), dtype=tf.float32, name=feat)
+ inputs[feat] = inp
+ embeddings.append(normalizers[feat](inp))
+
+ x = layers.Concatenate()(embeddings)
+ x = layers.Dense(128, activation="relu")(x)
+ x = layers.Dropout(0.2)(x)
+ x = layers.Dense(64, activation="relu")(x)
+ output = layers.Dense(embed_dim)(layers.Dropout(0.2)(x))
+
+ return keras.Model(inputs=inputs, outputs=output, name=name)
+
+
+user_tower = build_tower(
+ ["user_id", "gender", "city", "country"], cont_feats, name="user_tower"
+)
+ad_tower = build_tower(["ad_id", "ad_topic"], name="ad_tower")
+
+
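+def bpr_hinge_loss(y_true, y_pred):
+    # BPR loss on y_pred = pos_score - neg_score; y_true is unused.
+    # Despite the name, no hinge margin is applied.
+    return -tf.math.log(tf.nn.sigmoid(y_pred) + 1e-10)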
+
+
+class RetrievalModel(keras.Model):
+    def __init__(self, user_tower_instance, ad_tower_instance, **kwargs):
+        super().__init__(**kwargs)
+        # Use the towers passed in rather than the module-level globals.
+        self.user_tower = user_tower_instance
+        self.ad_tower = ad_tower_instance
+        self.ln_user = layers.LayerNormalization()
+        self.ln_ad = layers.LayerNormalization()
+
+    def call(self, inputs):
+        u_emb = self.ln_user(self.user_tower(inputs["user"]))
+        pos_emb = self.ln_ad(self.ad_tower(inputs["positive_ad"]))
+        neg_emb = self.ln_ad(self.ad_tower(inputs["negative_ad"]))
+        pos_score = keras.ops.sum(u_emb * pos_emb, axis=1, keepdims=True)
+        neg_score = keras.ops.sum(u_emb * neg_emb, axis=1, keepdims=True)
+        # Training signal: how much the clicked ad outscores the sampled negative.
+        return pos_score - neg_score
+
+    def get_embeddings(self, inputs):
+        u_emb = self.ln_user(self.user_tower(inputs["user"]))
+        ad_emb = self.ln_ad(self.ad_tower(inputs["positive_ad"]))
+        dot_interaction = keras.ops.sum(u_emb * ad_emb, axis=1, keepdims=True)
+        return u_emb, ad_emb, dot_interaction
+
+
+retrieval_model = RetrievalModel(user_tower, ad_tower)
+retrieval_model.compile(
+ optimizer=keras.optimizers.Adam(learning_rate=1e-3), loss=bpr_hinge_loss
+)
+history = retrieval_model.fit(retrieval_train_dataset, epochs=30)
+
+pd.DataFrame(history.history).plot(
+    subplots=True, layout=(1, 3), figsize=(12, 4), title="Retrieval Model Metrics"
+)
+plt.show()
+
+"""
+# **Predictions of Retrieval Model**
+Now that the Two-Tower model is trained, we use it to generate candidates.
+
+We can implement the inference pipeline in three steps:
+1. Indexing: We run the Item Tower once for all available ads to generate their
+embeddings.
+2. Query Encoding: When a user arrives, we pass their features through the User Tower to
+generate a User Embedding.
+3. Nearest Neighbor Search: We search the index to find the Ad Embeddings closest to the
+User Embedding (highest dot product).
+
+The Keras-RS [BruteForceRetrieval
+layer](https://keras.io/keras_rs/api/retrieval_layers/brute_force_retrieval/) calculates
+the dot product between the user and every single item in the index to find the exact
+top-K matches. The wrapper below performs the same exhaustive search with TensorFlow ops.
+"""
+
+USER_CATEGORICAL = ["user_id", "gender", "city", "country"]
+CONTINUOUS_FEATURES = ["time_on_site", "internet_usage", "area_income", "Age"]
+USER_FEATURES = USER_CATEGORICAL + CONTINUOUS_FEATURES
+
+
+class BruteForceRetrievalWrapper:
+    def __init__(self, model, ads_df, ad_features, user_features, k=10):
+        self.model, self.k = model, k
+        self.user_features = user_features
+        unique_ads = ads_df[ad_features].drop_duplicates("ad_id").reset_index(drop=True)
+        self.ids = unique_ads["ad_id"].values
+        self.topic_map = dict(zip(unique_ads["ad_id"], unique_ads["ad_topic"]))
+        ad_inputs = {
+            "ad_id": tf.constant(self.ids.astype(str)),
+            "ad_topic": tf.constant(unique_ads["ad_topic"].astype(str).values),
+        }
+        self.candidate_embs = model.ln_ad(model.ad_tower(ad_inputs))
+
+    def query_batch(self, user_df):
+        inputs = {
+            k: tf.constant(
+                user_df[k].values.astype(float if k in CONTINUOUS_FEATURES else str)
+            )
+            for k in self.user_features
+            if k in user_df.columns
+        }
+        u_emb = self.model.ln_user(self.model.user_tower(inputs))
+        scores = tf.linalg.matmul(u_emb, self.candidate_embs, transpose_b=True)
+        top_scores, top_indices = tf.math.top_k(scores, k=self.k)
+        return top_scores.numpy(), top_indices.numpy()
+
+    def decode_results(self, scores, indices):
+        results = []
+        for row_scores, row_indices in zip(scores, indices):
+            retrieved_ids = self.ids[row_indices]
+            results.append(
+                [
+                    {"ad_id": aid, "ad_topic": self.topic_map[aid], "score": float(s)}
+                    for aid, s in zip(retrieved_ids, row_scores)
+                ]
+            )
+        return results
+
+
+retrieval_engine = BruteForceRetrievalWrapper(
+    model=retrieval_model,
+    ads_df=ads_df,
+    ad_features=["ad_id", "ad_topic"],
+    user_features=USER_FEATURES,
+    k=10,
+)
+sample_user = pd.DataFrame([x_test.iloc[0]])
+scores, indices = retrieval_engine.query_batch(sample_user)
+top_ads = retrieval_engine.decode_results(scores, indices)[0]
+
+"""
+# **Implementation of Ranking Model**
+The retrieval model only computes a simple similarity score (dot product). It doesn't
+account for complex feature interactions.
+So we build a ranking model on top of the retrieval model to re-score the candidates.
+
+**Architecture**
+1. **Feature Extraction:** We reuse the trained User Tower and Ad Tower from the
+Retrieval stage. We freeze these towers (`trainable = False`) so their weights don't
+change.
+2. **Interaction:** Instead of relying on the dot product alone, we concatenate three
+inputs: the User Embedding, the Ad Embedding, and their Dot Product (similarity).
+3. **Scorer (MLP):** The concatenated inputs are fed into a Multi-Layer Perceptron, a
+stack of Dense layers. This network learns the non-linear relationships between the user
+and the ad.
+4. **Output:** The final layer uses a Sigmoid activation to output a single probability
+between 0.0 and 1.0 (likelihood of a click).
+"""
+
+retrieval_model.trainable = False
+
+
+def create_ranking_ds(df):
+    inputs = {
+        "user": dict_to_tensor_features(df[USER_FEATURES], continuous_features),
+        "positive_ad": dict_to_tensor_features(df[AD_FEATURES], continuous_features),
+    }
+    return (
+        tf.data.Dataset.from_tensor_slices(
+            (inputs, df["Clicked on Ad"].values.astype("float32"))
+        )
+        .shuffle(10000)
+        .batch(256)
+        .prefetch(tf.data.AUTOTUNE)
+    )
+
+
+ranking_train_dataset = create_ranking_ds(x_train)
+ranking_test_dataset = create_ranking_ds(x_test)
+
+
+class RankingModel(keras.Model):
+    def __init__(self, retrieval_model, **kwargs):
+        super().__init__(**kwargs)
+        self.retrieval = retrieval_model
+        self.mlp = keras.Sequential(
+            [
+                layers.Dense(256, activation="relu"),
+                layers.Dropout(0.2),
+                layers.Dense(128, activation="relu"),
+                layers.Dropout(0.2),
+                layers.Dense(64, activation="relu"),
+                layers.Dense(1, activation="sigmoid"),
+            ]
+        )
+
+    def call(self, inputs):
+        u_emb, ad_emb, dot = self.retrieval.get_embeddings(inputs)
+        return self.mlp(keras.ops.concatenate([u_emb, ad_emb, dot], axis=-1))
+
+
+ranking_model = RankingModel(retrieval_model)
+ranking_model.compile(
+    optimizer=keras.optimizers.Adam(1e-4),
+    loss="binary_crossentropy",
+    metrics=["AUC", "accuracy"],
+)
+history1 = ranking_model.fit(ranking_train_dataset, epochs=20)
+
+pd.DataFrame(history1.history).plot(
+    subplots=True, layout=(1, 3), figsize=(12, 4), title="Ranking Model Metrics"
+)
+plt.show()
+
+ranking_model.evaluate(ranking_test_dataset)
+
+"""
+# **Predictions of Ranking Model**
+The retrieval model gave us a list of ads that are generally relevant (high dot product
+similarity). The ranking model will now calculate the specific probability (0% to 100%)
+that the user will click each of those ads.
+
+The Ranking model expects pairs of (User, Ad). Since we are scoring 10 ads for 1 user, we
+cannot just pass the user features once. Instead, we repeat the user's features 10 times
+to create a batch with one row per candidate ad.
+"""
+
+
+def rerank_ads_for_user(user_row, retrieved_ads, ranking_model):
+    ads_df = pd.DataFrame(retrieved_ads)
+    num_ads = len(ads_df)
+    user_inputs = {
+        k: tf.fill(
+            (num_ads, 1),
+            str(user_row[k]) if k not in continuous_features else float(user_row[k]),
+        )
+        for k in USER_FEATURES
+    }
+    ad_inputs = {
+        k: tf.reshape(tf.constant(ads_df[k].astype(str).values), (-1, 1))
+        for k in AD_FEATURES
+    }
+    scores = (
+        ranking_model({"user": user_inputs, "positive_ad": ad_inputs}).numpy().flatten()
+    )
+    ads_df["ranking_score"] = scores
+    return ads_df.sort_values("ranking_score", ascending=False).to_dict("records")
+
+
+sample_user = x_test.iloc[0]
+scores, indices = retrieval_engine.query_batch(pd.DataFrame([sample_user]))
+top_ads = retrieval_engine.decode_results(scores, indices)[0]
+final_ranked_ads = rerank_ads_for_user(sample_user, top_ads, ranking_model)
+print(f"User: {sample_user['user_id']}")
+print(f"{'Ad ID':<10} | {'Topic':<30} | {'Retrieval Score':<15} | {'Rank Probability'}")
+for item in final_ranked_ads:
+    print(
+        f"{item['ad_id']:<10} | {item['ad_topic'][:28]:<30} | {item['score']:<15.4f} | {item['ranking_score']*100:.2f}%"
+    )
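
A minimal NumPy sketch of the retrieve-then-rerank flow that `BruteForceRetrievalWrapper` and `rerank_ads_for_user` implement above. It assumes random stand-in embeddings, and `toy_ranker` is a hypothetical placeholder for the trained `RankingModel`; only the shapes and the order of operations are meant to mirror the example.

```python
import numpy as np


def retrieve_top_k(user_emb, ad_embs, k):
    """Brute-force retrieval: score every ad by dot product, keep the k best."""
    scores = ad_embs @ user_emb  # shape (num_ads,)
    top = np.argsort(-scores)[:k]  # ad indices by descending score
    return top, scores[top]


def toy_ranker(user_batch, ad_batch):
    """Hypothetical stand-in for the ranking MLP (fixed weights, illustration only)."""
    dot = np.sum(user_batch * ad_batch, axis=1, keepdims=True)
    # Same interaction as the ranking model: [user embedding, ad embedding, dot product].
    features = np.concatenate([user_batch, ad_batch, dot], axis=1)
    weights = np.full(features.shape[1], 0.1)
    return 1.0 / (1.0 + np.exp(-(features @ weights)))  # sigmoid -> (0, 1)


def rerank(user_emb, ad_embs, candidate_idx):
    """Repeat the user once per candidate, score each (user, ad) pair, sort by score."""
    user_batch = np.tile(user_emb, (len(candidate_idx), 1))
    probs = toy_ranker(user_batch, ad_embs[candidate_idx])
    order = np.argsort(-probs)
    return candidate_idx[order], probs[order]


rng = np.random.default_rng(0)
user_emb = rng.normal(size=8)  # stand-in for one user embedding
ad_embs = rng.normal(size=(50, 8))  # stand-in for the indexed ad embeddings

candidates, retrieval_scores = retrieve_top_k(user_emb, ad_embs, k=10)
ranked_ads, click_probs = rerank(user_emb, ad_embs, candidates)
```

The reranked list contains the same ten candidates as the retrieval output; only their order changes, which is why the final table in the example can report both a retrieval score and a rank probability for each ad.
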
diff --git a/two_stage_rs_with_marketing_interaction.ipynb b/two_stage_rs_with_marketing_interaction.ipynb
new file mode 100644
index 0000000000..cb1843b016
--- /dev/null
+++ b/two_stage_rs_with_marketing_interaction.ipynb
@@ -0,0 +1,1126 @@
+{
+ "nbformat": 4,
+ "nbformat_minor": 0,
+ "metadata": {
+ "colab": {
+ "provenance": []
+ },
+ "kernelspec": {
+ "name": "python3",
+ "display_name": "Python 3"
+ },
+ "language_info": {
+ "name": "python"
+ }
+ },
+ "cells": [
+ {
+ "cell_type": "markdown",
+ "source": [
+ "# **Introduction**\n",
+ "\n",
+ "This tutorial demonstrates a critical business scenario: a user lands on a website, and a marketing engine must decide which specific ad to display from an inventory of thousands.\n",
+ "The goal is to maximize the Click-Through Rate (CTR). Showing irrelevant ads wastes marketing budget and annoys the user. Therefore, we need a system that predicts the probability of a specific user clicking on a specific ad based on their demographics and browsing habits.\n",
+ "\n",
+ "**Architecture**\n",
+ "1. **The Retrieval Stage:** Efficiently selects an initial set of roughly 10-100 candidates from the full ad inventory. It weeds out items the user is definitely not interested in.\n",
+ "    - User Tower: Embeds user features (ID, demographics, behavior) into a vector.\n",
+ "    - Item Tower: Embeds ad features (Ad ID, Topic) into a vector.\n",
+ "    - Interaction: The dot product of these two vectors represents similarity.\n",
+ "2. **The Ranking Stage:** Takes the output of the retrieval model and fine-tunes the order to select the single best ad to show.\n",
+ "    - Model: A Deep Neural Network (MLP).\n",
+ "    - Interaction: It takes the User Embedding, Ad Embedding, and their similarity score to predict a precise probability (0% to 100%) that the user will click.\n",
+ "\n",
+ ""
+ ],
+ "metadata": {
+ "id": "y5jO6Y78Vf-N"
+ }
+ },
+ {
+ "cell_type": "markdown",
+ "source": [
+ "# **Dataset**\n",
+ "We will use the [Ad Click Prediction](https://www.kaggle.com/datasets/mafrojaakter/ad-click-data) dataset from Kaggle.\n",
+ "\n",
+ "**Feature distribution of the dataset:**\n",
+ "- The User Tower describes who is looking. Its features are Gender, City, Country, Age, Daily Internet Usage, Daily Time Spent on Site, and Area Income.\n",
+ "- The Item Tower describes what is being shown. Its features are Ad Topic Line and Ad ID.\n",
+ "\n",
+ "In this tutorial, we are going to build and train a Two-Tower (User Tower and Ad Tower) model using the Ad Click Prediction dataset from Kaggle.\n",
+ "We're going to:\n",
+ "1. **Data Pipeline:** Get our data and preprocess it for both Retrieval (implicit feedback) and Ranking (explicit labels).\n",
+ "2. **Retrieval:** Implement and train a Two-Tower model to generate candidates.\n",
+ "3. **Ranking:** Implement and train a Neural Ranking model to predict click probabilities.\n",
+ "4. **Inference:** Run an end-to-end test (Retrieval --> Ranking) to generate recommendations for a specific user."
+ ],
+ "metadata": {
+ "id": "xcJBUXmeaavN"
+ }
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 1,
+ "metadata": {
+ "id": "AL5vdFd8QOZl",
+ "colab": {
+ "base_uri": "https://localhost:8080/"
+ },
+ "outputId": "8b519c48-1e1a-4e58-9325-6108cfb7b4da"
+ },
+ "outputs": [
+ {
+ "output_type": "stream",
+ "name": "stdout",
+ "text": [
+ "\u001b[?25l \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m0.0/92.5 kB\u001b[0m \u001b[31m?\u001b[0m eta \u001b[36m-:--:--\u001b[0m\r\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m92.5/92.5 kB\u001b[0m \u001b[31m2.8 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
+ "\u001b[?25h"
+ ]
+ }
+ ],
+ "source": [
+ "!pip install -q keras-rs"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 2,
+ "metadata": {
+ "id": "2cdPdsiFQOZm"
+ },
+ "outputs": [],
+ "source": [
+ "import os\n",
+ "os.environ[\"KERAS_BACKEND\"] = \"tensorflow\"\n",
+ "import keras\n",
+ "import matplotlib.pyplot as plt\n",
+ "import numpy as np\n",
+ "import tensorflow as tf\n",
+ "import pandas as pd\n",
+ "import keras_rs\n",
+ "import tensorflow_datasets as tfds\n",
+ "from mpl_toolkits.axes_grid1 import make_axes_locatable\n",
+ "from keras import layers\n",
+ "from concurrent.futures import ThreadPoolExecutor\n",
+ "from sklearn.model_selection import train_test_split\n",
+ "from sklearn.preprocessing import MinMaxScaler\n",
+ "\n"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "source": [
+ "# **Preparing Dataset**"
+ ],
+ "metadata": {
+ "id": "fdhb5tuL9UBe"
+ }
+ },
+ {
+ "cell_type": "code",
+ "source": [
+ "from google.colab import files\n",
+ "files.upload()"
+ ],
+ "metadata": {
+ "colab": {
+ "base_uri": "https://localhost:8080/",
+ "height": 91
+ },
+ "id": "RJN16Th-9W8E",
+ "outputId": "bfa060e0-25fe-41a4-cddd-b46aea023352"
+ },
+ "execution_count": 3,
+ "outputs": [
+ {
+ "output_type": "display_data",
+ "data": {
+ "text/plain": [
+ ""
+ ],
+ "text/html": [
+ "\n",
+ " \n",
+ " \n",
+ " Upload widget is only available when the cell has been executed in the\n",
+ " current browser session. Please rerun this cell to enable.\n",
+ " \n",
+ " "
+ ]
+ },
+ "metadata": {}
+ },
+ {
+ "output_type": "stream",
+ "name": "stdout",
+ "text": [
+ "Saving kaggle (1).json to kaggle (1).json\n"
+ ]
+ },
+ {
+ "output_type": "execute_result",
+ "data": {
+ "text/plain": [
+ "{'kaggle (1).json': b'{\"username\":\"<redacted>\",\"key\":\"<redacted>\"}'}"
+ ]
+ },
+ "metadata": {},
+ "execution_count": 3
+ }
+ ]
+ },
+ {
+ "cell_type": "code",
+ "source": [
+ "!mkdir -p ~/.kaggle\n",
+ "!mv kaggle.json ~/.kaggle/\n",
+ "!chmod 600 ~/.kaggle/kaggle.json"
+ ],
+ "metadata": {
+ "id": "G4JgdNRp9tI3"
+ },
+ "execution_count": 4,
+ "outputs": []
+ },
+ {
+ "cell_type": "code",
+ "source": [
+ "!kaggle datasets download -d mafrojaakter/ad-click-data\n",
+ "!unzip -o ad-click-data.zip -d ./ad_click_data"
+ ],
+ "metadata": {
+ "colab": {
+ "base_uri": "https://localhost:8080/"
+ },
+ "id": "NOhaq3bl-bmp",
+ "outputId": "bcd54c95-28dc-42b5-8f82-39f8763db18a"
+ },
+ "execution_count": 5,
+ "outputs": [
+ {
+ "output_type": "stream",
+ "name": "stdout",
+ "text": [
+ "Dataset URL: https://www.kaggle.com/datasets/mafrojaakter/ad-click-data\n",
+ "License(s): unknown\n",
+ "Downloading ad-click-data.zip to /content\n",
+ " 0% 0.00/37.6k [00:00, ?B/s]\n",
+ "100% 37.6k/37.6k [00:00<00:00, 138MB/s]\n",
+ "Archive: ad-click-data.zip\n",
+ " inflating: ./ad_click_data/Ad Click Data.csv \n"
+ ]
+ }
+ ]
+ },
+ {
+ "cell_type": "code",
+ "source": [
+ "ads_df = pd.read_csv('/content/ad_click_data/Ad Click Data.csv')\n",
+ "# Clean column names\n",
+ "ads_df.columns = ads_df.columns.str.strip()\n",
+ "# Rename the columns\n",
+ "ads_df = ads_df.rename(columns={\n",
+ "    'Male': 'gender',\n",
+ "    'Ad Topic Line': 'ad_topic',\n",
+ "    'City': 'city',\n",
+ "    'Country': 'country',\n",
+ "    'Daily Time Spent on Site': 'time_on_site',\n",
+ "    'Daily Internet Usage': 'internet_usage',\n",
+ "    'Area Income': 'area_income'\n",
+ "})\n",
+ "# Add user_id and ad_id columns\n",
+ "ads_df['user_id'] = \"user_\" + ads_df.index.astype(str)\n",
+ "ads_df['ad_id'] = \"ad_\" + ads_df['ad_topic'].astype('category').cat.codes.astype(str)\n",
+ "# Remove rows with nulls\n",
+ "ads_df = ads_df.dropna()\n",
+ "# Scale continuous features to [0, 1]\n",
+ "numeric_cols = [\"time_on_site\", \"internet_usage\", \"area_income\", \"Age\"]\n",
+ "scaler = MinMaxScaler()\n",
+ "ads_df[numeric_cols] = scaler.fit_transform(ads_df[numeric_cols])"
+ ],
+ "metadata": {
+ "id": "Nyq64PZo-axX"
+ },
+ "execution_count": 6,
+ "outputs": []
+ },
+ {
+ "cell_type": "code",
+ "source": [
+ "# Split into train and test sets\n",
+ "x_train, x_test = train_test_split(ads_df, test_size=0.2, random_state=42)"
+ ],
+ "metadata": {
+ "id": "2xBrCTZkuD6p"
+ },
+ "execution_count": 7,
+ "outputs": []
+ },
+ {
+ "cell_type": "code",
+ "source": [
+ "def dict_to_tensor_features(df_features, continuous_features):\n",
+ "    tensor_dict = {}\n",
+ "    for k, v in df_features.items():\n",
+ "        if k in continuous_features:\n",
+ "            tensor_dict[k] = tf.expand_dims(tf.constant(v, dtype='float32'), axis=-1)\n",
+ "        else:\n",
+ "            v_str = np.array(v).astype(str).tolist()\n",
+ "            tensor_dict[k] = tf.expand_dims(tf.constant(v_str, dtype='string'), axis=-1)\n",
+ "    return tensor_dict\n",
+ "\n",
+ "def create_retrieval_dataset(data_df,all_ads_features,all_ad_ids,\n",
+ "                             user_features_list,ad_features_list,continuous_features_list):\n",
+ "\n",
+ "    # Filter for positive interactions (clicks)\n",
+ "    positive_interactions = data_df[data_df[\"Clicked on Ad\"] == 1].copy()\n",
+ "\n",
+ "    if positive_interactions.empty:\n",
+ "        return None\n",
+ "\n",
+ "    def sample_negative(positive_ad_id):\n",
+ "        neg_ad_id = positive_ad_id\n",
+ "        while neg_ad_id == positive_ad_id:\n",
+ "            neg_ad_id = np.random.choice(all_ad_ids)\n",
+ "        return neg_ad_id\n",
+ "\n",
+ "    def create_triplets_row(pos_row):\n",
+ "        pos_ad_id = pos_row.ad_id\n",
+ "        neg_ad_id = sample_negative(pos_ad_id)\n",
+ "\n",
+ "        neg_ad_row = all_ads_features[all_ads_features['ad_id'] == neg_ad_id].iloc[0]\n",
+ "        user_features_dict = {name: getattr(pos_row, name) for name in user_features_list}\n",
+ "        pos_ad_features_dict = {name: getattr(pos_row, name) for name in ad_features_list}\n",
+ "        neg_ad_features_dict = {name: neg_ad_row[name] for name in ad_features_list}\n",
+ "\n",
+ "        return {\n",
+ "            \"user\": user_features_dict,\n",
+ "            \"positive_ad\": pos_ad_features_dict,\n",
+ "            \"negative_ad\": neg_ad_features_dict\n",
+ "        }\n",
+ "\n",
+ "    with ThreadPoolExecutor(max_workers=8) as executor:\n",
+ "        triplets = list(executor.map(create_triplets_row, positive_interactions.itertuples(index=False)))\n",
+ "\n",
+ "    triplets_df = pd.DataFrame(triplets)\n",
+ "    user_df = triplets_df[\"user\"].apply(pd.Series)\n",
+ "    pos_ad_df = triplets_df[\"positive_ad\"].apply(pd.Series)\n",
+ "    neg_ad_df = triplets_df[\"negative_ad\"].apply(pd.Series)\n",
+ "\n",
+ "    user_features_tensor = dict_to_tensor_features(user_df.to_dict('list'), continuous_features_list)\n",
+ "    pos_ad_features_tensor = dict_to_tensor_features(pos_ad_df.to_dict('list'), continuous_features_list)\n",
+ "    neg_ad_features_tensor = dict_to_tensor_features(neg_ad_df.to_dict('list'), continuous_features_list)\n",
+ "\n",
+ "    features = {\n",
+ "        \"user\": user_features_tensor,\n",
+ "        \"positive_ad\": pos_ad_features_tensor,\n",
+ "        \"negative_ad\": neg_ad_features_tensor,\n",
+ "    }\n",
+ "    y_true = tf.ones((triplets_df.shape[0], 1), dtype=tf.float32)\n",
+ "    dataset = tf.data.Dataset.from_tensor_slices((features, y_true))\n",
+ "    buffer_size = len(triplets_df)\n",
+ "    dataset = dataset.shuffle(buffer_size=buffer_size).batch(64).cache().prefetch(tf.data.AUTOTUNE)\n",
+ "    return dataset\n",
+ "\n",
+ "user_clicked_ads = (\n",
+ "    x_train[x_train[\"Clicked on Ad\"] == 1]\n",
+ "    .groupby(\"user_id\")[\"ad_id\"]\n",
+ "    .apply(set)\n",
+ "    .to_dict()\n",
+ ")\n",
+ "\n",
+ "for u in x_train[\"user_id\"].unique():\n",
+ "    if u not in user_clicked_ads:\n",
+ "        user_clicked_ads[u] = set()\n",
+ "\n",
+ "AD_FEATURES = [\"ad_id\", \"ad_topic\"]\n",
+ "USER_FEATURES = [\"user_id\", \"gender\", \"city\", \"country\", \"time_on_site\", \"internet_usage\", \"area_income\", \"Age\"]\n",
+ "continuous_features = [\"time_on_site\", \"internet_usage\", \"area_income\", \"Age\"]\n",
+ "\n",
+ "all_ads_features = x_train[AD_FEATURES].drop_duplicates().reset_index(drop=True)\n",
+ "all_ad_ids = all_ads_features['ad_id'].tolist()\n",
+ "\n",
+ "retrieval_train_dataset = create_retrieval_dataset(\n",
+ "    data_df=x_train,\n",
+ "    all_ads_features=all_ads_features,\n",
+ "    all_ad_ids=all_ad_ids,\n",
+ "    user_features_list=USER_FEATURES,\n",
+ "    ad_features_list=AD_FEATURES,\n",
+ "    continuous_features_list=continuous_features\n",
+ ")\n",
+ "\n",
+ "retrieval_test_dataset = create_retrieval_dataset(\n",
+ "    data_df=x_test,\n",
+ "    all_ads_features=all_ads_features,\n",
+ "    all_ad_ids=all_ad_ids,\n",
+ "    user_features_list=USER_FEATURES,\n",
+ "    ad_features_list=AD_FEATURES,\n",
+ "    continuous_features_list=continuous_features\n",
+ ")"
+ ],
+ "metadata": {
+ "id": "D0eSXIKpsUSM"
+ },
+ "execution_count": 28,
+ "outputs": []
+ },
+ {
+ "cell_type": "markdown",
+ "source": [
+ "# **Implement the Retrieval Model**\n",
+ "For the Retrieval stage, we will build a Two-Tower Model.\n",
+ "\n",
+ "**The Architecture Components:**\n",
+ "\n",
+ "1. User Tower: Takes user features (User ID, demographics, behavior metrics like time_on_site) and encodes these mixed features into a fixed-size vector representation called the User Embedding.\n",
+ "2. Item (Ad) Tower: Takes ad features (Ad ID, Ad Topic Line) and encodes them into a fixed-size vector representation called the Item Embedding.\n",
+ "3. Interaction (Similarity): We calculate the dot product between the User Embedding and the Item Embedding."
+ ],
+ "metadata": {
+ "id": "48AtiZBm1N6W"
+ }
+ },
+ {
+ "cell_type": "code",
+ "source": [
+ "keras.utils.set_random_seed(42)"
+ ],
+ "metadata": {
+ "id": "07SgZrFa7BFy"
+ },
+ "execution_count": 29,
+ "outputs": []
+ },
+ {
+ "cell_type": "code",
+ "source": [
+ "vocab_map = {\n",
+ "    \"user_id\": x_train[\"user_id\"].unique(),\n",
+ "    \"gender\": x_train[\"gender\"].astype(str).unique(),\n",
+ "    \"city\": x_train[\"city\"].unique(),\n",
+ "    \"country\": x_train[\"country\"].unique(),\n",
+ "    \"ad_id\": x_train[\"ad_id\"].unique(),\n",
+ "    \"ad_topic\": x_train[\"ad_topic\"].unique()\n",
+ "}\n",
+ "cont_feats = [\"time_on_site\", \"internet_usage\", \"area_income\", \"Age\"]\n",
+ "\n",
+ "normalizers = {}\n",
+ "for f in cont_feats:\n",
+ "    norm = layers.Normalization(axis=None)\n",
+ "    norm.adapt(x_train[f].values.astype('float32'))\n",
+ "    normalizers[f] = norm\n",
+ "\n",
+ "def build_tower(feature_names, continuous_names=None, embed_dim=64, name=\"tower\"):\n",
+ "    inputs, embeddings = {}, []\n",
+ "\n",
+ "    for feat in feature_names:\n",
+ "        if feat in vocab_map:\n",
+ "            inp = keras.Input(shape=(1,), dtype=tf.string, name=feat)\n",
+ "            inputs[feat] = inp\n",
+ "            vocab = list(vocab_map[feat])\n",
+ "            x = layers.StringLookup(vocabulary=vocab)(inp)\n",
+ "            x = layers.Embedding(len(vocab) + 1, embed_dim, embeddings_regularizer='l2')(x)\n",
+ "            embeddings.append(layers.Flatten()(x))\n",
+ "\n",
+ "    if continuous_names:\n",
+ "        for feat in continuous_names:\n",
+ "            inp = keras.Input(shape=(1,), dtype=tf.float32, name=feat)\n",
+ "            inputs[feat] = inp\n",
+ "            embeddings.append(normalizers[feat](inp))\n",
+ "\n",
+ "    x = layers.Concatenate()(embeddings)\n",
+ "    x = layers.Dense(128, activation=\"relu\")(x)\n",
+ "    x = layers.Dropout(0.2)(x)\n",
+ "    x = layers.Dense(64, activation=\"relu\")(x)\n",
+ "    output = layers.Dense(embed_dim)(layers.Dropout(0.2)(x))\n",
+ "\n",
+ "    return keras.Model(inputs=inputs, outputs=output, name=name)"
+ ],
+ "metadata": {
+ "id": "1xNPuYzU_Zgj"
+ },
+ "execution_count": 30,
+ "outputs": []
+ },
+ {
+ "cell_type": "code",
+ "source": [
+ "user_tower = build_tower([\"user_id\", \"gender\", \"city\", \"country\"], cont_feats, name=\"user_tower\")\n",
+ "ad_tower = build_tower([\"ad_id\", \"ad_topic\"], name=\"ad_tower\")"
+ ],
+ "metadata": {
+ "id": "oR97oiPV_f5v"
+ },
+ "execution_count": 31,
+ "outputs": []
+ },
+ {
+ "cell_type": "code",
+ "source": [
+ "def bpr_hinge_loss(y_true, y_pred):\n",
+ "    # Despite the name, this is the BPR log-sigmoid loss on y_pred = pos_score - neg_score; y_true is unused.\n",
+ "    return -tf.math.log(tf.nn.sigmoid(y_pred) + 1e-10)"
+ ],
+ "metadata": {
+ "id": "pt7mR-WxJFwx"
+ },
+ "execution_count": 32,
+ "outputs": []
+ },
+ {
+ "cell_type": "code",
+ "source": [
+ "class RetrievalModel(keras.Model):\n",
+ "    def __init__(self, user_tower_instance, ad_tower_instance, **kwargs):\n",
+ "        super().__init__(**kwargs)\n",
+ "        self.user_tower = user_tower_instance\n",
+ "        self.ad_tower = ad_tower_instance\n",
+ "        self.ln_user = layers.LayerNormalization()\n",
+ "        self.ln_ad = layers.LayerNormalization()\n",
+ "\n",
+ "\n",
+ "    def call(self, inputs):\n",
+ "        u_emb = self.ln_user(self.user_tower(inputs[\"user\"]))\n",
+ "        pos_emb = self.ln_ad(self.ad_tower(inputs[\"positive_ad\"]))\n",
+ "        neg_emb = self.ln_ad(self.ad_tower(inputs[\"negative_ad\"]))\n",
+ "        pos_score = keras.ops.sum(u_emb * pos_emb, axis=1, keepdims=True)\n",
+ "        neg_score = keras.ops.sum(u_emb * neg_emb, axis=1, keepdims=True)\n",
+ "        return pos_score - neg_score\n",
+ "\n",
+ "\n",
+ "    def get_embeddings(self, inputs):\n",
+ "        u_emb = self.ln_user(self.user_tower(inputs[\"user\"]))\n",
+ "        ad_emb = self.ln_ad(self.ad_tower(inputs[\"positive_ad\"]))\n",
+ "        dot_interaction = keras.ops.sum(u_emb * ad_emb, axis=1, keepdims=True)\n",
+ "        return u_emb, ad_emb, dot_interaction"
+ ],
+ "metadata": {
+ "id": "Mx-PbEFOeCMf"
+ },
+ "execution_count": 33,
+ "outputs": []
+ },
+ {
+ "cell_type": "code",
+ "source": [
+ "retrieval_model = RetrievalModel(user_tower, ad_tower)\n",
+ "retrieval_model.compile(optimizer=keras.optimizers.Adam(learning_rate=1e-3),loss=bpr_hinge_loss)\n",
+ "history = retrieval_model.fit(retrieval_train_dataset,epochs=30)"
+ ],
+ "metadata": {
+ "id": "K2i_5VPiF2F_",
+ "colab": {
+ "base_uri": "https://localhost:8080/"
+ },
+ "outputId": "af8bfbdb-f97d-4cbc-cc22-83c19ca6e478"
+ },
+ "execution_count": 34,
+ "outputs": [
+ {
+ "output_type": "stream",
+ "name": "stdout",
+ "text": [
+ "Epoch 1/30\n",
+ "\u001b[1m6/6\u001b[0m \u001b[32m━━━━━━━━━━━━━━━━━━━━\u001b[0m\u001b[37m\u001b[0m \u001b[1m6s\u001b[0m 9ms/step - loss: 2.9548\n",
+ "Epoch 2/30\n",
+ "\u001b[1m6/6\u001b[0m \u001b[32m━━━━━━━━━━━━━━━━━━━━\u001b[0m\u001b[37m\u001b[0m \u001b[1m0s\u001b[0m 7ms/step - loss: 1.3977 \n",
+ "Epoch 3/30\n",
+ "\u001b[1m6/6\u001b[0m \u001b[32m━━━━━━━━━━━━━━━━━━━━\u001b[0m\u001b[37m\u001b[0m \u001b[1m0s\u001b[0m 7ms/step - loss: 1.1149 \n",
+ "Epoch 4/30\n",
+ "\u001b[1m6/6\u001b[0m \u001b[32m━━━━━━━━━━━━━━━━━━━━\u001b[0m\u001b[37m\u001b[0m \u001b[1m0s\u001b[0m 7ms/step - loss: 0.9265 \n",
+ "Epoch 5/30\n",
+ "\u001b[1m6/6\u001b[0m \u001b[32m━━━━━━━━━━━━━━━━━━━━\u001b[0m\u001b[37m\u001b[0m \u001b[1m0s\u001b[0m 8ms/step - loss: 0.7926 \n",
+ "Epoch 6/30\n",
+ "\u001b[1m6/6\u001b[0m \u001b[32m━━━━━━━━━━━━━━━━━━━━\u001b[0m\u001b[37m\u001b[0m \u001b[1m0s\u001b[0m 8ms/step - loss: 0.6924 \n",
+ "Epoch 7/30\n",
+ "\u001b[1m6/6\u001b[0m \u001b[32m━━━━━━━━━━━━━━━━━━━━\u001b[0m\u001b[37m\u001b[0m \u001b[1m0s\u001b[0m 7ms/step - loss: 0.6163 \n",
+ "Epoch 8/30\n",
+ "\u001b[1m6/6\u001b[0m \u001b[32m━━━━━━━━━━━━━━━━━━━━\u001b[0m\u001b[37m\u001b[0m \u001b[1m0s\u001b[0m 8ms/step - loss: 0.5574 \n",
+ "Epoch 9/30\n",
+ "\u001b[1m6/6\u001b[0m \u001b[32m━━━━━━━━━━━━━━━━━━━━\u001b[0m\u001b[37m\u001b[0m \u001b[1m0s\u001b[0m 8ms/step - loss: 0.5107 \n",
+ "Epoch 10/30\n",
+ "\u001b[1m6/6\u001b[0m \u001b[32m━━━━━━━━━━━━━━━━━━━━\u001b[0m\u001b[37m\u001b[0m \u001b[1m0s\u001b[0m 12ms/step - loss: 0.4725\n",
+ "Epoch 11/30\n",
+ "\u001b[1m6/6\u001b[0m \u001b[32m━━━━━━━━━━━━━━━━━━━━\u001b[0m\u001b[37m\u001b[0m \u001b[1m0s\u001b[0m 8ms/step - loss: 0.4401 \n",
+ "Epoch 12/30\n",
+ "\u001b[1m6/6\u001b[0m \u001b[32m━━━━━━━━━━━━━━━━━━━━\u001b[0m\u001b[37m\u001b[0m \u001b[1m0s\u001b[0m 7ms/step - loss: 0.4120 \n",
+ "Epoch 13/30\n",
+ "\u001b[1m6/6\u001b[0m \u001b[32m━━━━━━━━━━━━━━━━━━━━\u001b[0m\u001b[37m\u001b[0m \u001b[1m0s\u001b[0m 8ms/step - loss: 0.3869 \n",
+ "Epoch 14/30\n",
+ "\u001b[1m6/6\u001b[0m \u001b[32m━━━━━━━━━━━━━━━━━━━━\u001b[0m\u001b[37m\u001b[0m \u001b[1m0s\u001b[0m 7ms/step - loss: 0.3644 \n",
+ "Epoch 15/30\n",
+ "\u001b[1m6/6\u001b[0m \u001b[32m━━━━━━━━━━━━━━━━━━━━\u001b[0m\u001b[37m\u001b[0m \u001b[1m0s\u001b[0m 7ms/step - loss: 0.3438 \n",
+ "Epoch 16/30\n",
+ "\u001b[1m6/6\u001b[0m \u001b[32m━━━━━━━━━━━━━━━━━━━━\u001b[0m\u001b[37m\u001b[0m \u001b[1m0s\u001b[0m 8ms/step - loss: 0.3249 \n",
+ "Epoch 17/30\n",
+ "\u001b[1m6/6\u001b[0m \u001b[32m━━━━━━━━━━━━━━━━━━━━\u001b[0m\u001b[37m\u001b[0m \u001b[1m0s\u001b[0m 7ms/step - loss: 0.3075 \n",
+ "Epoch 18/30\n",
+ "\u001b[1m6/6\u001b[0m \u001b[32m━━━━━━━━━━━━━━━━━━━━\u001b[0m\u001b[37m\u001b[0m \u001b[1m0s\u001b[0m 7ms/step - loss: 0.2914 \n",
+ "Epoch 19/30\n",
+ "\u001b[1m6/6\u001b[0m \u001b[32m━━━━━━━━━━━━━━━━━━━━\u001b[0m\u001b[37m\u001b[0m \u001b[1m0s\u001b[0m 7ms/step - loss: 0.2765 \n",
+ "Epoch 20/30\n",
+ "\u001b[1m6/6\u001b[0m \u001b[32m━━━━━━━━━━━━━━━━━━━━\u001b[0m\u001b[37m\u001b[0m \u001b[1m0s\u001b[0m 7ms/step - loss: 0.2627 \n",
+ "Epoch 21/30\n",
+ "\u001b[1m6/6\u001b[0m \u001b[32m━━━━━━━━━━━━━━━━━━━━\u001b[0m\u001b[37m\u001b[0m \u001b[1m0s\u001b[0m 8ms/step - loss: 0.2498 \n",
+ "Epoch 22/30\n",
+ "\u001b[1m6/6\u001b[0m \u001b[32m━━━━━━━━━━━━━━━━━━━━\u001b[0m\u001b[37m\u001b[0m \u001b[1m0s\u001b[0m 7ms/step - loss: 0.2378 \n",
+ "Epoch 23/30\n",
+ "\u001b[1m6/6\u001b[0m \u001b[32m━━━━━━━━━━━━━━━━━━━━\u001b[0m\u001b[37m\u001b[0m \u001b[1m0s\u001b[0m 8ms/step - loss: 0.2265 \n",
+ "Epoch 24/30\n",
+ "\u001b[1m6/6\u001b[0m \u001b[32m━━━━━━━━━━━━━━━━━━━━\u001b[0m\u001b[37m\u001b[0m \u001b[1m0s\u001b[0m 8ms/step - loss: 0.2160 \n",
+ "Epoch 25/30\n",
+ "\u001b[1m6/6\u001b[0m \u001b[32m━━━━━━━━━━━━━━━━━━━━\u001b[0m\u001b[37m\u001b[0m \u001b[1m0s\u001b[0m 7ms/step - loss: 0.2061 \n",
+ "Epoch 26/30\n",
+ "\u001b[1m6/6\u001b[0m \u001b[32m━━━━━━━━━━━━━━━━━━━━\u001b[0m\u001b[37m\u001b[0m \u001b[1m0s\u001b[0m 7ms/step - loss: 0.1968 \n",
+ "Epoch 27/30\n",
+ "\u001b[1m6/6\u001b[0m \u001b[32m━━━━━━━━━━━━━━━━━━━━\u001b[0m\u001b[37m\u001b[0m \u001b[1m0s\u001b[0m 7ms/step - loss: 0.1880 \n",
+ "Epoch 28/30\n",
+ "\u001b[1m6/6\u001b[0m \u001b[32m━━━━━━━━━━━━━━━━━━━━\u001b[0m\u001b[37m\u001b[0m \u001b[1m0s\u001b[0m 7ms/step - loss: 0.1798 \n",
+ "Epoch 29/30\n",
+ "\u001b[1m6/6\u001b[0m \u001b[32m━━━━━━━━━━━━━━━━━━━━\u001b[0m\u001b[37m\u001b[0m \u001b[1m0s\u001b[0m 8ms/step - loss: 0.1720 \n",
+ "Epoch 30/30\n",
+ "\u001b[1m6/6\u001b[0m \u001b[32m━━━━━━━━━━━━━━━━━━━━\u001b[0m\u001b[37m\u001b[0m \u001b[1m0s\u001b[0m 9ms/step - loss: 0.1646 \n"
+ ]
+ }
+ ]
+ },
+ {
+ "cell_type": "code",
+ "source": [
+ "pd.DataFrame(history.history).plot(subplots=True, layout=(1, 3), figsize=(12, 4), title=\"Retrieval Model Metrics\")\n",
+ "plt.show()"
+ ],
+ "metadata": {
+ "colab": {
+ "base_uri": "https://localhost:8080/",
+ "height": 408
+ },
+ "id": "1tKldoRSdyu0",
+ "outputId": "155cac09-7007-4974-d6c4-231a35cd0683"
+ },
+ "execution_count": 37,
+ "outputs": [
+ {
+ "output_type": "display_data",
+ "data": {
+ "text/plain": [
+ ""
+ ],
+ "image/png": "iVBORw0KGgoAAAANSUhEUgAAAlIAAAGHCAYAAAB7xLxyAAAAOnRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjEwLjAsIGh0dHBzOi8vbWF0cGxvdGxpYi5vcmcvlHJYcgAAAAlwSFlzAAAPYQAAD2EBqD+naQAAPERJREFUeJzt3Xl8VPW9//H3ZJtsk4RA9gQIiywii0EQUEShUgQKVhGhLajAdYlaBGvL7UWtyy/1WitVcUGvcKtFEBH1oojIKgVURFRAEJAlQHbITPaEzPn9ETISSYBMZjLJzOv5eJxHMmfOmfOZEx7Ht9/v93yPyTAMQwAAAGg0P08XAAAA0FoRpAAAAJxEkAIAAHASQQoAAMBJBCkAAAAnEaQAAACcRJACAABwEkEKAADASQQpAAAAJxGkAACtXseOHXXbbbe57fM3bNggk8mkDRs2uO0YTWUymfToo482er/Dhw/LZDJp0aJFLq+pOXj6b0OQAgA4ZdGiRTKZTI4lICBASUlJuu2223T8+HGnPnPPnj169NFHdfjwYdcW20zOPiebN28+533DMJSSkiKTyaQxY8Z4oELn1QYWk8mkN998s95thgwZIpPJpF69ejl1jMWLF2vevHlNqLL5BXi6AABA6/bYY48pNTVV5eXl2rZtmxYtWqTNmzdr165dCg4ObtRn7dmzR3/5y180bNgwdezY8aL327dvn/z8Wk7bQHBwsBYvXqyrrrqqzvqNGzfq2LFjMpvNHqqs6Wq/229/+9s66w8fPqwtW7Y0+m9+tsWLF2vXrl2aOXPmRe8zdOhQlZWVKSgoyOnjNkXL+VcHAGiVRo0apd/+9reaPn26XnvtNT344IM6ePCgPvjgA7ce1zAMlZWVSZLMZrMCAwPderzGuOGGG7Rs2TKdPn26zvrFixcrLS1N8fHxHqqs6W644QatWbNG+fn5ddYvXrxYcXFx6t+/f7PUUV5eLrvdLj8/PwUHB3ssSBOkAAAudfXVV0uSDh48WGf93r17dfPNNys6OlrBwcHq379/nbC1aNEiTZgwQZJ07bXXOrqRase+dOzYUWPGjNHq1avVv39/hYSE6JVXXnG8VztGavv27TKZTPrf//3fc2pbvXq1TCaTVq5cKUk6cuSI7rnnHnXr1k0hISFq27atJkyY0OSuxUmTJqmgoEBr1qxxrKusrNQ777yjyZMn17tPSUmJZs+erZSUFJnNZnXr1k1/+9vfZBhGne0qKir0wAMPKCYmRhaLRb/61a907Nixej/z+PHjuuOOOxQXFyez2axLL71Ur7/+epO+27hx42Q2m7Vs2bI66xcvXqxbbrlF/v7+9e735ptvKi0tTSEhIYqOjtatt96qzMxMx/vDhg3Thx9+qCNHjjj+9rWtkrXdikuWLNF//dd/KSkpSaGhobLZbA2Okfr88891ww03qE2bNgoLC1Pv3r31j3/8w/F+dna2br/9diUnJ8tsNishIUHjxo1r9N+erj0AgEvV/oeoTZs2jnW7d+/WkCFDlJSUpD/96U8KCwvT22+/rfHjx2v58uW68cYbNXToUN1///167rnn9J//+Z/q0aOHJDl+SjVdeJMmTdKdd96pGTNmqFu3buccv3///urUqZPefvttTZ06tc57S5cuVZs2bTRy5EhJ0pdffqktW7bo1ltvVXJysg4fPqyXXnpJw4YN0549exQaGurUOejYsaMGDRqkt956S6NGjZIkrVq1SlarVbfeequee+65OtsbhqFf/epXWr9+vaZNm6a+fftq9erV+sMf/qDjx4/r2WefdWw7ffp0vfnmm5o8ebIGDx6sdevWafTo0efUkJOToyuvvFImk0n33nuvYmJitGrVKk2bNk02m61R3WdnCw0N1bhx4/TWW2/p7rvvliR988032r17t1577TV9++235+zz5JNPau7cubrllls0ffp05eXl6fnnn9fQoUP19ddfKyoqSn/+859ltVp1
7Ngxx/cNDw+v8zmPP/64goKC9OCDD6qioqLB7rw1a9ZozJgxSkhI0O9//3vFx8fr+++/18qVK/X73/9eknTTTTdp9+7duu+++9SxY0fl5uZqzZo1Onr0aKO6lWUAAOCEhQsXGpKMTz/91MjLyzMyMzONd955x4iJiTHMZrORmZnp2Hb48OHGZZddZpSXlzvW2e12Y/DgwUbXrl0d65YtW2ZIMtavX3/O8Tp06GBIMj7++ON635s6darj9Zw5c4zAwEDj5MmTjnUVFRVGVFSUcccddzjWlZaWnvNZW7duNSQZ//znPx3r1q9f32Bd9Z2TL7/80njhhRcMi8XiOMaECROMa6+91lHv6NGjHfu99957hiTjiSeeqPN5N998s2EymYwDBw4YhmEYO3fuNCQZ99xzT53tJk+ebEgyHnnkEce6adOmGQkJCUZ+fn6dbW+99VYjMjLSUdehQ4cMScbChQvP+91qz8GyZcuMlStXGiaTyTh69KhhGIbxhz/8wejUqZNhGIZxzTXXGJdeeqljv8OHDxv+/v7Gk08+WefzvvvuOyMgIKDO+tGjRxsdOnRo8NidOnU652/287/N6dOnjdTUVKNDhw7GqVOn6mxrt9sNwzCMU6dOGZKMp59++rzf+WLQtQcAaJIRI0YoJiZGKSkpuvnmmxUWFqYPPvhAycnJkqSTJ09q3bp1uuWWW1RUVKT8/Hzl5+eroKBAI0eO1P79+y/6Lr/U1FRHa9L5TJw4UVVVVXr33Xcd6z755BMVFhZq4sSJjnUhISGO36uqqlRQUKAuXbooKipKO3bsuNhTUK9bbrlFZWVlWrlypYqKirRy5coGu/U++ugj+fv76/7776+zfvbs2TIMQ6tWrXJsJ+mc7X7eumQYhpYvX66xY8fKMAzHOc/Pz9fIkSNltVqb9P2uv/56RUdHa8mSJTIMQ0uWLNGkSZPq3fbdd9+V3W7XLbfcUqeO+Ph4de3aVevXr7/o406dOrXO36w+X3/9tQ4dOqSZM2cqKiqqznsmk0lSzd89KChIGzZs0KlTpy76+PWhaw8A0CTz58/XJZdcIqvVqtdff12bNm2qc1fagQMHZBiG5s6dq7lz59b7Gbm5uUpKSrrgsVJTUy+qpj59+qh79+5aunSppk2bJqmmW69du3a67rrrHNuVlZUpIyNDCxcu1PHjx+uMR7JarRd1rIbExMRoxIgRWrx4sUpLS1VdXa2bb7653m2PHDmixMREWSyWOutruzWPHDni+Onn56fOnTvX2e7nXZx5eXkqLCzUggULtGDBgnqPmZub69T3kqTAwEBNmDBBixcv1oABA5SZmdlgSNy/f78Mw1DXrl0b/KyLdTF//9qxeeebgsFsNuupp57S7NmzFRcXpyuvvFJjxozRlClTGn0jAEEKANAkAwYMcNypNX78eF111VWaPHmy9u3bp/DwcNntdknSgw8+2GBrUpcuXS7qWBdqjTjbxIkT9eSTTyo/P18Wi0UffPCBJk2apICAn/7Td99992nhwoWaOXOmBg0apMjISJlMJt16662Oupti8uTJmjFjhrKzszVq1KhzWkjcpbb23/72t+eME6vVu3fvJh1j8uTJevnll/Xoo4+qT58+6tmzZ4O1mEwmrVq1qt6B6D8fB3U+jfn7X8jMmTM1duxYvffee1q9erXmzp2rjIwMrVu3Tv369bvozyFIAQBcxt/fXxkZGbr22mv1wgsv6E9/+pM6deokqablYcSIEefdv7brxRUmTpyov/zlL1q+fLni4uJks9l066231tnmnXfe0dSpU/XMM8841pWXl6uwsNAlNdx444268847tW3bNi1durTB7Tp06KBPP/1URUVFdVql9u7d63i/9qfdbtfBgwfrtELt27evzufV3tFXXV19wXPurKuuukrt27fXhg0b9NRTTzW4XefOnWUYhlJTU3XJJZec9zNd8fevba3btWvXBb97586dNXv2bM2ePVv79+9X37599cwzzzQ4
4Wh9GCMFAHCpYcOGacCAAZo3b57Ky8sVGxurYcOG6ZVXXlFWVtY52+fl5Tl+DwsLkySXBJkePXrosssu09KlS7V06VIlJCRo6NChdbbx9/c/Z3qB559/XtXV1U0+vlTT2vLSSy/p0Ucf1dixYxvc7oYbblB1dbVeeOGFOuufffZZmUwmx51/tT9/ftffz2cD9/f310033aTly5dr165d5xzv7HPuLJPJpOeee06PPPKIfve73zW43a9//Wv5+/vrL3/5yznn2jAMFRQUOF6HhYU1uUv18ssvV2pqqubNm3fOv6Pa45eWlqq8vLzOe507d5bFYlFFRUWjjkeLFADA5f7whz9owoQJWrRoke666y7Nnz9fV111lS677DLNmDFDnTp1Uk5OjrZu3apjx47pm2++kST17dtX/v7+euqpp2S1WmU2m3XdddcpNjbWqTomTpyohx9+WMHBwZo2bdo5kzaOGTNGb7zxhiIjI9WzZ09t3bpVn376qdq2bdvkc1Croa61s40dO1bXXnut/vznP+vw4cPq06ePPvnkE73//vuaOXOmo5Wlb9++mjRpkl588UVZrVYNHjxYa9eu1YEDB875zL/+9a9av369Bg4cqBkzZqhnz546efKkduzYoU8//VQnT55s8ncbN26cxo0bd95tOnfurCeeeEJz5szR4cOHNX78eFksFh06dEgrVqzQf/zHf+jBBx+UJKWlpWnp0qWaNWuWrrjiCoWHh583gNbHz89PL730ksaOHau+ffvq9ttvV0JCgvbu3avdu3dr9erV+uGHHzR8+HDdcsst6tmzpwICArRixQrl5OSc02p5QU2+7w8A4JPOvtX/56qrq43OnTsbnTt3Nk6fPm0YhmEcPHjQmDJlihEfH28EBgYaSUlJxpgxY4x33nmnzr6vvvqq0alTJ8Pf37/Obe0/nzLgbD+f/qDW/v37DUmGJGPz5s3nvH/q1Cnj9ttvN9q1a2eEh4cbI0eONPbu3XvO5zkz/cH51PddioqKjAceeMBITEw0AgMDja5duxpPP/2045b9WmVlZcb9999vtG3b1ggLCzPGjh1rZGZmnjP9gWEYRk5OjpGenm6kpKQYgYGBRnx8vDF8+HBjwYIFjm2cmf7gfH4+/UGt5cuXG1dddZURFhZmhIWFGd27dzfS09ONffv2ObYpLi42Jk+ebERFRRmSHFMhnO/YDf1tNm/ebPziF78wLBaLERYWZvTu3dt4/vnnDcMwjPz8fCM9Pd3o3r27ERYWZkRGRhoDBw403n777fN+t/qYDONn7WwAAAC4KIyRAgAAcBJBCgAAwEkEKQAAACcRpAAAAJxEkAIAAHASQQoAAMBJBCkAAAAnEaQAAACcRJACAABwEkEKAADASQQpAAAAJxGkAAAAnESQAgAAcBJBCgAAwEkEKQAAACcRpAAAAJxEkAIAAHASQQoAAMBJBCkAAAAnEaQAAACcRJACAABwEkEKAADASQQpAAAAJwV4uoCLYbfbdeLECVksFplMJk+XA6CVMQxDRUVFSkxMlJ8f//8IwHVaRZA6ceKEUlJSPF0GgFYuMzNTycnJni4DgBdpFUHKYrFIqrkIRkREeLgaAK2NzWZTSkqK41oCAK7SKoJUbXdeREQEQQqA0xgaAMDVGCwAAADgJIIUAACAkwhSAAAATmoVY6QAX2a321VZWenpMlq8oKAgpjYA0OwIUkALVllZqUOHDslut3u6lBbPz89PqampCgoK8nQpAHwIQQpooQzDUFZWlvz9/ZWSkkJry3nUTtqblZWl9u3bc3cegGZDkAJaqNOnT6u0tFSJiYkKDQ31dDktXkxMjE6cOKHTp08rMDDQ0+UA8BH8Ly7QQlVXV0sSXVUXqfY81Z43AGgOBCmghaOb6uJwngB4AkEKAADASV4ZpO564yuNe2GzjhaUeroUwOcMGzZMM2fO9HQZANAsvDJIfXfcqm+OWXWqlLl3AACA+3hlkLIE19yMWFR+2sOVAAAA
b+aVQSrcXBOkiiuqPFwJ4NtOnTqlKVOmqE2bNgoNDdWoUaO0f/9+x/tHjhzR2LFj1aZNG4WFhenSSy/VRx995Nj3N7/5jWJiYhQSEqKuXbtq4cKFnvoqAFAvr5xHKpwWKXghwzBUVuWZW/tDAv2duivutttu0/79+/XBBx8oIiJCf/zjH3XDDTdoz549CgwMVHp6uiorK7Vp0yaFhYVpz549Cg8PlyTNnTtXe/bs0apVq9SuXTsdOHBAZWVlrv5qANAkXhmkLME1k/ERpOBNyqqq1fPh1R459p7HRio0qHGXi9oA9e9//1uDBw+WJP3rX/9SSkqK3nvvPU2YMEFHjx7VTTfdpMsuu0yS1KlTJ8f+R48eVb9+/dS/f39JUseOHV3zZQDAhby8a48gBXjK999/r4CAAA0cONCxrm3bturWrZu+//57SdL999+vJ554QkOGDNEjjzyib7/91rHt3XffrSVLlqhv37566KGHtGXLlmb/DgBwIV7aIkWQgvcJCfTXnsdGeuzY7jB9+nSNHDlSH374oT755BNlZGTomWee0X333adRo0bpyJEj+uijj7RmzRoNHz5c6enp+tvf/uaWWgDAGV7dIkXXHryJyWRSaFCARxZnxkf16NFDp0+f1ueff+5YV1BQoH379qlnz56OdSkpKbrrrrv07rvvavbs2Xr11Vcd78XExGjq1Kl68803NW/ePC1YsKBpJxEAXMyrW6SKyrlrD/CUrl27aty4cZoxY4ZeeeUVWSwW/elPf1JSUpLGjRsnSZo5c6ZGjRqlSy65RKdOndL69evVo0cPSdLDDz+stLQ0XXrppaqoqNDKlSsd7wFAS+HVLVJ07QGetXDhQqWlpWnMmDEaNGiQDMPQRx99pMDAmhtCqqurlZ6erh49euiXv/ylLrnkEr344ouSah5CPGfOHPXu3VtDhw6Vv7+/lixZ4smvAwDnMBmGYXi6iAux2WyKjIyU1WpVRETEBbf/eFeW7npzh/p3aKN37h7cDBUCrldeXq5Dhw4pNTVVwcHBni6nxTvf+WrsNQQALpZXtkjVTn9AixQAAHAnrwxSDDYHAADNwTuDFIPNAQBAM/DKIGU5a7B5KxgCBgAAWinvDFJnxkjZDam00jPPJgMAAN7PK4NUcKCf/P1qJhBkwDlaO1pVLw7nCYAneOWEnCaTSeHmAFnLqlRUflpx3O2MVigwMFAmk0l5eXmKiYlxanZxX2EYhvLy8mQymRxzVAFAc/DKICXJEaRokUJr5e/vr+TkZB07dkyHDx/2dDktnslkUnJysvz93fNcQACoj9cGKR4TA28QHh6url27qqqKf8cXEhgYSIgC0Oy8PkgVM5cUWjl/f38CAgC0UF452Fw6a1JOuvYAAICbeG2QcjwmhhYpAADgJl4bpH6a3ZwgBQAA3MNrg9RPs5szSBcAALiH1wap8LMeEwMAAOAOXhukLHTtAQAAN/PaIBV+ZrA5QQoAALiL9wYpuvYAAICbeW2QYkJOAADgbl4fpHhEDAAAcBevDVLMbA4AANytUUEqIyNDV1xxhSwWi2JjYzV+/Hjt27fvvPssWrRIJpOpzhIcHNykoi9G7YScxRWnZRiG248HAAB8T6OC1MaNG5Wenq5t27ZpzZo1qqqq0vXXX6+SkpLz7hcREaGsrCzHcuTIkSYVfTEizty1ZxhSaWW1248HAAB8T0BjNv7444/rvF60aJFiY2P11VdfaejQoQ3uZzKZFB8f71yFTjIH+CnAz6TTdkNF5acVZm7UVwUAALigJo2RslqtkqTo6OjzbldcXKwOHTooJSVF48aN0+7du8+7fUVFhWw2W52lsUwm01ndeww4BwAArud0kLLb7Zo5c6aGDBmiXr16Nbhdt27d9Prrr+v999/Xm2++KbvdrsGDB+vYsWMN7pORkaHIyEjHkpKS4lSNjgHnTIEAAADcwGQ4ORL77rvv1qpVq7R582YlJydf9H5VVVXq0aOHJk2apMcff7ze
bSoqKlRRUeF4bbPZlJKSIqvVqoiIiIs+1qh/fKbvs2x6Y9oAXd015qL3A+BdbDabIiMjG30NAYALcWrg0L333quVK1dq06ZNjQpRkhQYGKh+/frpwIEDDW5jNptlNpudKa0OCy1SAADAjRrVtWcYhu69916tWLFC69atU2pqaqMPWF1dre+++04JCQmN3rexwpndHAAAuFGjWqTS09O1ePFivf/++7JYLMrOzpYkRUZGKiQkRJI0ZcoUJSUlKSMjQ5L02GOP6corr1SXLl1UWFiop59+WkeOHNH06dNd/FXOxaScAADAnRoVpF566SVJ0rBhw+qsX7hwoW677TZJ0tGjR+Xn91ND16lTpzRjxgxlZ2erTZs2SktL05YtW9SzZ8+mVX4ReEwMAABwp0YFqYsZl75hw4Y6r5999lk9++yzjSrKVejaAwAA7uS1z9qTfhpsXkzXHgAAcAPvDlJnHhPDGCkAAOAOXh2kmJATAAC4k3cHKccYKQabAwAA1/PqIMUYKQAA4E7eHaTOjJHirj0AAOAOXh2kwoMZIwUAANzHu4NUbdde5WnZ7U49mxkAAKBBXh2kamc2NwyptKraw9UAAABv49VByhzgp0B/kyQeEwMAAFzPq4OUyWT6qXuPcVIAAMDFvDpISWcNOGcKBAAA4GJeH6QsZqZAAAAA7uH1QYopEAAAgLt4fZD6aXZzBpsDAADX8vogRYsUAABwF68PUrVzSfG8PQAA4GpeH6TCzww2p0UKAAC4mtcHKUeLFEEKAAC4mNcHKceEnHTtAQAAF/P6IGVhQk4AAOAmXh+kalukeNYeAABwNe8PUoyRAgAAbuL1QSoi+MwjYujaAwAALub1Qeqnrj2CFAAAcC3vD1JnTchptxsergYAAHgT7w9SZ1qkJKmkklYpAADgOl4fpIID/RXkX/M1GScFAABcyeuDlMSDiwEAgHv4RpBiwDkAAHADnwpSdO0BAABX8okgxYOLAQCAO/hUkOIxMQAAwJV8IkjRtQcAANzBJ4KU5cxjYhhsDgAAXMknghTTHwAAAHfwjSDl6NpjjBQAAHAdnwhSlmDGSAEAANfzqSBF1x4AAHAlnwhS4WYGmwMAANfzkSBF1x4AAHC9RgWpjIwMXXHFFbJYLIqNjdX48eO1b9++C+63bNkyde/eXcHBwbrsssv00UcfOV2wM5jZHAAAuEOjgtTGjRuVnp6ubdu2ac2aNaqqqtL111+vkpKSBvfZsmWLJk2apGnTpunrr7/W+PHjNX78eO3atavJxV8sBpsDAAB3MBmGYTi7c15enmJjY7Vx40YNHTq03m0mTpyokpISrVy50rHuyiuvVN++ffXyyy9f1HFsNpsiIyNltVoVERHR6DoLiiuU9sSnkqSD/+8G+fuZGv0ZAFqvpl5DAKAhTRojZbVaJUnR0dENbrN161aNGDGizrqRI0dq69atDe5TUVEhm81WZ2mK2gk5JamkklYpAADgGk4HKbvdrpkzZ2rIkCHq1atXg9tlZ2crLi6uzrq4uDhlZ2c3uE9GRoYiIyMdS0pKirNlSpLMAf4KCqj5qoyTAgAAruJ0kEpPT9euXbu0ZMkSV9YjSZozZ46sVqtjyczMbPJnWszMJQUAAFwr4MKbnOvee+/VypUrtWnTJiUnJ5932/j4eOXk5NRZl5OTo/j4+Ab3MZvNMpvNzpTWoPDgABWUVPKYGAAA4DKNapEyDEP33nuvVqxYoXXr1ik1NfWC+wwaNEhr166ts27NmjUaNGhQ4yptonBapAAAgIs1qkUqPT1dixcv1vvvvy+LxeIY5xQZGamQkBBJ0pQpU5SUlKSMjAxJ0u9//3tdc801euaZZzR69GgtWbJE27dv14IFC1z8Vc6PKRAAAICrNapF6qWXXpLVatWwYcOUkJDgWJYuXerY5ujRo8rKynK8Hjx4sBYvXqwFCxaoT58+euedd/Tee++dd4C6O/CYGAAA4GqNapG6mCmnNmzYcM66CRMmaMKE
CY05lMsxuzkAAHA1n3jWnnTWGCm69gAAgIv4TJCiRQoAALiazwSp2tnNi8qZ/gAAALiGzwSp2gk5uWsPAAC4iu8EqeCau/YIUgAAwFV8JkgxIScAAHA13wlSjJECAAAu5jtBijFSAADAxXwmSEXUjpGiaw8AALiIzwSp2q69kspqVdsvPEM7AADAhfhMkAoz+zt+p3sPAAC4gs8EKXOAv4ICar4uQQoAALiCzwQpSYrgMTEAAMCFfCpI/TSXFFMgAACApvOtIFU7lxRdewAAwAV8KkhZzEyBAAAAXMenglRtixSDzQEAgCv4VJCyMEYKAAC4kE8FqXDu2gMAAC7kU0HKwmBzAADgQj4VpMLPDDYvokUKAAC4gG8FKbr2AACAC/lUkKodbM5dewAAwBV8K0gxRgoAALiQTwUpHhEDAABcybeCFGOkAACAC/lUkIoIPvOIGLr2AACAC/hUkKrt2iutrFa13fBwNQAAoLXzqSAVdiZISXTvAQCApvOpIBUU4CdzQM1XLqpgwDkAAGganwpSkmRhnBQAAHARHwxS3LkHAABcw+eC1E9zSRGkAABA0/hukKJrDwAANJHPBSm69gAAgKv4XJCqnd2cx8QAAICm8rkgZTnTtcddewAAoKl8L0idmf6AweYAAKCpfC5IOR5cTIsUAABoIt8LUmbGSAEAANfwuSBloUUKAAC4iO8GKcZIAQCAJmp0kNq0aZPGjh2rxMREmUwmvffee+fdfsOGDTKZTOcs2dnZztbcJOHmM4PNaZECAABN1OggVVJSoj59+mj+/PmN2m/fvn3KyspyLLGxsY09tEvwiBgAAOAqAY3dYdSoURo1alSjDxQbG6uoqKiL2raiokIVFRWO1zabrdHHawhdewAAwFWabYxU3759lZCQoF/84hf697//fd5tMzIyFBkZ6VhSUlJcVkdtkCqrqtbparvLPhcAAPgetwephIQEvfzyy1q+fLmWL1+ulJQUDRs2TDt27Ghwnzlz5shqtTqWzMxMl9UTZv6pEY479wAAQFM0umuvsbp166Zu3bo5Xg8ePFgHDx7Us88+qzfeeKPefcxms8xms1vqCfT3U3Cgn8qr7CoqP62o0CC3HAcAAHg/j0x/MGDAAB04cMATh5b002NiaJECAABN4ZEgtXPnTiUkJHji0JJ4cDEAAHCNRnftFRcX12lNOnTokHbu3Kno6Gi1b99ec+bM0fHjx/XPf/5TkjRv3jylpqbq0ksvVXl5uV577TWtW7dOn3zyieu+RSPVPm+Px8QAAICmaHSQ2r59u6699lrH61mzZkmSpk6dqkWLFikrK0tHjx51vF9ZWanZs2fr+PHjCg0NVe/evfXpp5/W+YzmxlxSAADAFUyGYRieLuJCbDabIiMjZbVaFRER0eTPu/ON7Vq9O0dP3thLvxnYwQUVAmjJXH0NAYBaPvesPemnx8QwKScAAGgKnwxSlmC69gAAQNP5ZJAK5649AADgAj4ZpGiRAgAAruCTQYrpDwAAgCv4ZpBi+gMAAOACPhmkkqJCJEl7s22qtrf42R8AAEAL5ZNBqk9KlCzmAJ0qrdKu41ZPlwMAAFopnwxSgf5+GtKlnSRp4w95Hq4GAAC0Vj4ZpCTpmm4xkghSAADAeT4bpIZeUhOkvj56StZS7t4DAACN57NBKikqRF1jw2U3pM0H8j1dDgAAaIV8NkhJ0jWX1Hbv5Xq4EgAA0Br5dpA6a5yUYTANAgAAaByfDlJXdIxWcKCfcmwV+iGn2NPlAACAVsang1RwoL+u7NRWEt17AACg8Xw6SElnj5NiGgQAANA4BKkzQerLQ6dUUsGz9wAAwMXz+SCV2i5MKdEhqqy2a9uPBZ4uBwAAtCI+H6RMJhPdewAAwCk+H6Qk6ZpLYiURpAAAQOMQpCQN6txWgf4mHSko1eH8Ek+XAwAAWgmClKRwc4D6d4iWRKsUAAC4eASpM86e5RwAAOBiEKTOqB1wvvVggcqrqj1cDQAAaA0I
Umd0j7co1mJWWVW1th8+5elyAABAK0CQOqPuNAg8LgYAAFwYQeosQ5lPCgAANAJB6ixXdWknP5P0Q06xThSWebocAADQwhGkztImLEh9UqIkSZtolQIAABdAkPqZ2nFSm/YTpAAAwPkRpH6mNkh9tj9fp6vtHq4GAAC0ZASpn+mdHKWo0EAVlZ/WzsxCT5cDAABaMILUz/j7mXR1V+7eAwAAF0aQqsc1TIMAAAAuAkGqHkO7tpMkfXvMqvziCg9XAwAAWiqCVD1iI4LVMyFCkrRqV7aHqwEAAC0VQaoBN6clS5L+57MfVW03PFwNAABoiQhSDbh1QIqiQgN1uKBUn+ymVQoAAJyLINWA0KAATbmygyTp5Y0HZRi0SgEAgLoIUucxZXBHmQP89M0xqz4/dNLT5QAAgBam0UFq06ZNGjt2rBITE2UymfTee+9dcJ8NGzbo8ssvl9lsVpcuXbRo0SInSm1+7cLNmtC/ZqzUKxsPergaAADQ0jQ6SJWUlKhPnz6aP3/+RW1/6NAhjR49Wtdee6127typmTNnavr06Vq9enWji/WE6Vd1kp9JWr8vT3uzbZ4uBwAAtCABjd1h1KhRGjVq1EVv//LLLys1NVXPPPOMJKlHjx7avHmznn32WY0cObKxh292HduFaVSvBH34XZYWbPpRf7+lr6dLAgAALYTbx0ht3bpVI0aMqLNu5MiR2rp1a4P7VFRUyGaz1Vk86T+GdpIkfbDzhE4Ulnm0FgAA0HK4PUhlZ2crLi6uzrq4uDjZbDaVldUfSjIyMhQZGelYUlJS3F3mefVJidKgTm112m7o9c2HPFoLAABoOVrkXXtz5syR1Wp1LJmZmZ4uSXdeU9Mq9dYXR2UtrfJwNQAAoCVwe5CKj49XTk5OnXU5OTmKiIhQSEhIvfuYzWZFRETUWTztmkti1D3eopLKar35+RFPlwMAAFoAtwepQYMGae3atXXWrVmzRoMGDXL3oV3KZDI5WqUW/vuwyquqPVwRAADwtEYHqeLiYu3cuVM7d+6UVDO9wc6dO3X06FFJNd1yU6ZMcWx/11136ccff9RDDz2kvXv36sUXX9Tbb7+tBx54wDXfoBmN6Z2oxMhg5RdXaMXXxz1dDgAA8LBGB6nt27erX79+6tevnyRp1qxZ6tevnx5++GFJUlZWliNUSVJqaqo+/PBDrVmzRn369NEzzzyj1157rVVMffBzgf5+mnZ1TavUq5t4mDEAAL7OZLSCh8jZbDZFRkbKarV6fLxUScVpDf7rOlnLqvTyb9P0y17xHq0HwIW1pGsIAO/SIu/aa8nCzAGaMoiHGQMAAIKUU6YO7qigAD/tzCzUl4dPebocAADgIQQpJ7QLN2tCWs3DjF/ccMDD1QAAAE8hSDlpxtWd5O9n0oZ9eVq9O9vT5QAAAA8gSDmpY7sw3XnmGXxz39slaxmznQMA4GsIUk1w//Cu6tQuTLlFFfrrqu89XQ4AAGhmBKkmCA70V8avL5MkvfVFprYeLPBwRQAAoDkRpJpoYKe2+s3A9pKkOe9+y6NjAADwIQQpF/jjqO6KjwjW4YJSPfvpD54uBwAANBOClAtEBAfq8fG9JEmvfXZIu45bPVwRAABoDgQpF/lFzziN7p2garuhh975VlXVdk+XBAAA3Iwg5UKPjr1UkSGB2pNl06uf/ejpcgAAgJsRpFwoxmLW3DE9JUnzPt2vH/OKPVwRAABwJ4KUi910eZKu7tpOlaft+tO738lu56HGAAB4K4KUi5lMJv2/Gy9TSKC/vjh0Um99edTTJQEAADchSLlBSnSoHhzZTZL014/2Ksta5uGKAACAOxCk3OS2wR3VNyVKRRWnddebO1RWyUSdAAB4G4KUm/j7mTRvYl9FhQbqm8xCPbB0J+OlAADwMgQpN+rYLkwLftdfQf5++nh3tp76eK+nSwIAAC5EkHKzAanR+u+be0uSXtn0o/71+REPVwQAAFyFINUMxvdL0gMjLpEkPfz+bm38
Ic/DFQEAAFcgSDWT+4d30a/7Janabij9Xzu0L7vI0yUBAIAmIkg1E5PJpIybLtPA1GgVV5zWHYu+VK6t3NNlAQCAJiBINSNzgL9e+V2aOrUL0/HCMk3/53aVVp72dFkAAMBJBKlmFhUapNdvu0JtQgP17TGrfr9kp6qZFgEAgFaJIOUBHduF6dUpNdMirNmToyc//F6GQZgCAKC1IUh5SP+O0Xp6Qs20CK//+5D+c8Uuna62e7gqAADQGAQpDxrXN0mPj+8lk0l664ujuvONrxgzBQBAK0KQ8rDfXdlBL/0mTeYAP63dm6tJr36uguIKT5cFAAAuAkGqBfhlr3gtnjHQ8Vy+m17aoiMFJZ4uCwAAXABBqoVI6xCt5XcPVnKbEB0uKNWvX9yibzILPV0WAAA4D4JUC9I5Jlzv3jNYlyZGqKCkUrcu2KZ1e3M8XRYAAGgAQaqFibUEa+mdg3R113Yqq6rWjH9+paVfHvV0WQAAoB4EqRYo3Byg12+7Qjddnqxqu6E/Lv9Oj6/co/Kqak+XBgAAzkKQaqEC/f30twm9dd91XSRJ/7P5kMY+v1m7jls9XBkAAKhFkGrBTCaTZl/fTf8ztb/ahZu1P7dY4+f/W8+v3c/knQAAtAAEqVZgeI84ffLAUI3qFa/TdkPPrPlBN7+8VT/mFXu6NAAAfBpBqpWIDgvSi7+5XM9O7CNLcIB2Zhbqhuc+0z+3Hpadhx4DAOARBKlWxGQy6cZ+yVo9c6iGdGmr8iq7Hn5/t6Yu/EJZ1jJPlwcAgM8hSLVCiVEheuOOgfrLry5VcKCfPtufr+v/vkkvbTjInX0AADQjk2EYLb5fyGazKTIyUlarVREREZ4up0U5mFesWW9/45gFPTEyWLOv76Yb+yXJz8/k2eKAFoJrCAB3capFav78+erYsaOCg4M1cOBAffHFFw1uu2jRIplMpjpLcHCw0wWjrs4x4Vpx92A9M6GPEiODdcJartnLvtHo5zfrs/15ni4PAACv1uggtXTpUs2aNUuPPPKIduzYoT59+mjkyJHKzc1tcJ+IiAhlZWU5liNHjjSpaNTl52fSTWnJWvfgMP3xl91lMQfo+yybfvc/X2jK619ozwmbp0sEAMArNTpI/f3vf9eMGTN0++23q2fPnnr55ZcVGhqq119/vcF9TCaT4uPjHUtcXFyTikb9ggP9dfewztr40LW6fUhHBfqbtOmHPI1+/jPNfvsbHTtV6ukSAQDwKo0KUpWVlfrqq680YsSInz7Az08jRozQ1q1bG9yvuLhYHTp0UEpKisaNG6fdu3ef9zgVFRWy2Wx1Fly86LAgPTL2Un066xqN7p0gw5CW7zima57eoPvf+lrfHWN2dAAAXKFRQSo/P1/V1dXntCjFxcUpOzu73n26deum119/Xe+//77efPNN2e12DR48WMeOHWvwOBkZGYqMjHQsKSkpjSkTZ3RoG6b5ky/XinsGa0iXtqq2G/rgmxMa+8Jm3bpgq9btzWEOKgAAmqBRd+2dOHFCSUlJ2rJliwYNGuRY/9BDD2njxo36/PPPL/gZVVVV6tGjhyZNmqTHH3+83m0qKipUUVHheG2z2ZSSksIdN02067hVr332o/7v2yxVnwlQXWLDNePqVI3rm6TgQH8PVwi4B3ftAXCXRrVItWvXTv7+/srJyamzPicnR/Hx8Rf1GYGBgerXr58OHDjQ4DZms1kRERF1FjRdr6RIzbu1nz576Fr9x9BOCjcH6EBusf64/Dtd9dQ6Pbd2v3Js5Z4uEwCAVqNRQSooKEhpaWlau3atY53dbtfatWvrtFCdT3V1tb777jslJCQ0rlK4TGJUiP7zhh7aOuc6/dfoHkqMDFZ+caX+vuYHDf7rOk1b9KU+3pWtKh6MDADAeQU0dodZs2Zp6tSp6t+/vwYMGKB58+appKREt99+uyRpypQpSkpKUkZGhiTpscce05VXXqkuXbqosLBQTz/9tI4cOaLp06e79pug0SzBgZp+
dSdNHdxRH32XpTe3HdGXh09p7d5crd2bq3bhQbqxX5Ju6Z+irnEWT5cLAECL0+ggNXHiROXl5enhhx9Wdna2+vbtq48//tgxAP3o0aPy8/upoevUqVOaMWOGsrOz1aZNG6WlpWnLli3q2bOn674FmiTQ30/j+iZpXN8kHcwr1rLtx7R8xzHlFVXo1c8O6dXPDqlf+yjd0j9FY3onyBIc6OmSAQBoEXhEDOpVVW3Xxn15Wro9U+v25joGpwcF+GnYJTEa3TtBw3vEKdzc6CwONDuuIQDchSCFC8otKteKHcf19vZMHcwrcaw3B/jp2m6xGt07Qdd1j1UYoQotFNcQAO5CkMJFMwxDe7OL9OG3WVr57QkdLvhppvTgQD9d1z1Woy9L1DXdYmipQovCNQSAuxCk4BTDMLQny6YPv83Sh99l6chZoSrI308DO0VrRI84De8Rq+Q2oR6sFOAaAsB9CFJoMsMwtPuETSu/zdLHu7LqtFRJUrc4i4b3iNXwHnHqmxIlfz+ThyqFr+IaAsBdCFJwKcMwdDCvROv25ujT73O1/fBJnf0UmrZhQbrmkhhdfUk7DenSTrGWYM8VC5/BNQSAuxCk4FaFpZXasC9Pa/fmasO+XBWVn67zfvd4i4ZeEqOrurTTgNRoHlMDt+AaAsBdCFJoNlXVdn15+KQ2/ZCvzQfytOu4rc77QQF+GtAxWld1bafBnduqZ0KEAvwbNfk+UC+uIQDchSAFjykortC/Dxbosx/ytPlAvrKsdZ/zF24OUP+ObXRlp7YamBqtXkmRCiRYwQlcQwC4C0EKLULt2KrP9udp8/58fXH45DndgKFB/krrUDdY0RWIi8E1BIC7EKTQIlXbDX2fZdO2Hwv0+aGT+uLQSVnLqupsE+Tvp15JEUrr0EZpHdro8g5tGLyOenENAeAuBCm0Cna7oX05RTXB6seT2n7kpPKLK8/Zrn10qCNU9UuJUrd4C92B4BoCwG0IUmiVDMPQ0ZOl+urIKceyL6dIP//XbA7w06WJEeqdHKW+KVHqkxKljm1DZTIxl5Uv4RoCwF0IUvAatvIqfZNZqO2HT2nH0VP6JrNQtp+Ns5KkiOAA9UmJUu/kSPVKjNSliZFKiQ4hXHkxriEA3IUgBa9lGIYOF5Tqm8xCfXOsUN9kFmr3CZsqTtvP2TYiOEA9EyNqglVSzc9OMeHMwu4luIYAcBeCFHxKVbVd+7KL9O0xq749VhOs9mUXqbL63HAVHOinbvER6hFvUfd4i7onRKh7vEVRoUEeqBxNwTUEgLsQpODzKk/bdSC3WLtOWLXnhE27jlu1J8um0srqerdPiAyuE6y6xlrUKSaMqRhaMK4hANyFIAXUo9pu6FB+ifZm27Q3q6jmZ3aRjp0qq3d7P5PUsW2YusSG65I4i7rGhROwWhCuIQDchSAFNIKtvEo/ZBfp++wi7c2qCVc/5BSdM3loLT+T1KFtmDrHhKlzTLg6nfnZOSZcbcLoImwuXEMAuAtBCmgiwzCUW1Sh/TnF+iGnSPtzixy/13fXYK3osCB1jglTp3bhSo0JU8e2YUptF6YObUNpxXIxriEA3IUgBbhJbcA6mFusg3nFOphXooN5xfoxr0THC+vvIpQkk0lKiAhWx3Zh6tguTKlta352aBuqlDahCgkiZDUW1xAA7kKQAjygtPK0fswr0Y/5JTqYW6zDBSU6nF+iQ/kl523FkqRYi1kd2oaqfXSY2keH1vzeNlTto0PVNiyI+bDqwTUEgLsQpIAWxDAMnSyp1OGCEh3KL60JVwUlOlJQoiMFpQ2OxaoVEuiv5DYhSm4TopTo0DO/17RkJbcJUVRooE8GLa4hANwlwNMFAPiJyWRS23Cz2oabldYh+pz3C0srdaSgVEdP1iy1AevoyVJl28pVVlWt/bnF2p9bXO/nhwb5KzEqRElRIUpqc+ZnVEjNujYhirOYFcCzCQHgohGkgFYkKjRIUaFB6pMSdc57laftOlFYpmOnypR5
qlSZJ0vP+r1M+cUVKq2s1oHcYh1oIGj5maRYS7ASooKVGBmihMhgxUcGKzGq5veEyBDFWMzM+A4AZxCkAC8RFODnGKBen/Kqap0oLNPxwrKan6fKdLywXMcLS3WisFxZ1jJVVRvKtpUr21aur1VY7+f4+5kUE25WXGSw4iPMio8IPvN7sOP3WItZ4eYAn+xGBOBbCFKAjwgO9FenmHB1igmv9/1qu6GC4gqdsJYrq7BMWdaacHXCWq7sM+tyiipUbf8pbH1znuOFBvkr1mJWbERNsIo78zM2wqxYS7BiLGbFhJt9dtwWAO9AkAIgqaalKTYiWLERwepbT9ehVBO28osrlG2tCVI5tvJzfs+1Vaio4rRKK6t1uKBUhwtKz3vcQP+aFq4Yy1nLmXFi7cLNahcepHYWs9qFmRURQisXgJaFIAXgovn7mRQXEay4iGD1Oc92pZWnlWurUG5RhXJs5cotqlDumZ85tnLlFVUor7hChaVVqqo2dMJarhPW8gseP8jfT23Dg9Q2PEgd24bphcmXu+7LAYATCFIAXC40KEAd2wU0OF6rVsXpahUUVyq3qKImXNUuxeUqKK5UfnGF8osrlV9U08pVWW0/0+VY3uBDpQGgORGkAHiMOaBmOobEqJALblteVa2CkkoVFFcov7iiGaoDgAsjSAFoFYID/R3zXgFAS8HMewAAAE4iSAEAADiJIAUAAOAkghQAAICTCFIAAABOIkgBAAA4iSAFAADgJIIUAACAkwhSAAAATiJIAQAAOKlVPCLGMAxJks1m83AlAFqj2mtH7bUEAFylVQSpoqIiSVJKSoqHKwHQmhUVFSkyMtLTZQDwIiajFfwvmt1u14kTJ2SxWGQymS64vc1mU0pKijIzMxUREdEMFfoGzqt7cF5d7+fn1DAMFRUVKTExUX5+jGgA4DqtokXKz89PycnJjd4vIiKC/zC5AefVPTivrnf2OaUlCoA78L9mAAAATiJIAQAAOMkrg5TZbNYjjzwis9ns6VK8CufVPTivrsc5BdBcWsVgcwAAgJbIK1ukAAAAmgNBCgAAwEkEKQAAACcRpAAAAJxEkAIAAHCSVwap+fPnq2PHjgoODtbAgQP1xRdfeLqkVmXTpk0aO3asEhMTZTKZ9N5779V53zAMPfzww0pISFBISIhGjBih/fv3e6bYViIjI0NXXHGFLBaLYmNjNX78eO3bt6/ONuXl5UpPT1fbtm0VHh6um266STk5OR6quHV46aWX1Lt3b8cM5oMGDdKqVasc73NOAbib1wWppUuXatasWXrkkUe0Y8cO9enTRyNHjlRubq6nS2s1SkpK1KdPH82fP7/e9//7v/9bzz33nF5++WV9/vnnCgsL08iRI1VeXt7MlbYeGzduVHp6urZt26Y1a9aoqqpK119/vUpKShzbPPDAA/q///s/LVu2TBs3btSJEyf061//2oNVt3zJycn661//qq+++krbt2/Xddddp3Hjxmn37t2SOKcAmoHhZQYMGGCkp6c7XldXVxuJiYlGRkaGB6tqvSQZK1ascLy22+1GfHy88fTTTzvWFRYWGmaz2Xjrrbc8UGHrlJuba0gyNm7caBhGzTkMDAw0li1b5tjm+++/NyQZW7du9VSZrVKbNm2M1157jXMKoFl4VYtUZWWlvvrqK40YMcKxzs/PTyNGjNDWrVs9WJn3OHTokLKzs+uc48jISA0cOJBz3AhWq1WSFB0dLUn66quvVFVVVee8du/eXe3bt+e8XqTq6motWbJEJSUlGjRoEOcUQLMI8HQBrpSfn6/q6mrFxcXVWR8XF6e9e/d6qCrvkp2dLUn1nuPa93B+drtdM2fO1JAhQ9SrVy9JNec1KChIUVFRdbblvF7Yd999p0GDBqm8vFzh4eFasWKFevbsqZ07d3JOAbidVwUpoDVIT0/Xrl27tHnzZk+X4hW6deumnTt3ymq16p133tHUqVO1ceNGT5cFwEd4Vddeu3bt5O/vf85dOTk5
OYqPj/dQVd6l9jxyjp1z7733auXKlVq/fr2Sk5Md6+Pj41VZWanCwsI623NeLywoKEhdunRRWlqaMjIy1KdPH/3jH//gnAJoFl4VpIKCgpSWlqa1a9c61tntdq1du1aDBg3yYGXeIzU1VfHx8XXOsc1m0+eff845Pg/DMHTvvfdqxYoVWrdunVJTU+u8n5aWpsDAwDrndd++fTp69CjntZHsdrsqKio4pwCahdd17c2aNUtTp05V//79NWDAAM2bN08lJSW6/fbbPV1aq1FcXKwDBw44Xh86dEg7d+5UdHS02rdvr5kzZ+qJJ55Q165dlZqaqrlz5yoxMVHjx4/3XNEtXHp6uhYvXqz3339fFovFMUYnMjJSISEhioyM1LRp0zRr1ixFR0crIiJC9913nwYNGqQrr7zSw9W3XHPmzNGoUaPUvn17FRUVafHixdqwYYNWr17NOQXQPDx926A7PP/880b79u2NoKAgY8CAAca2bds8XVKrsn79ekPSOcvUqVMNw6iZAmHu3LlGXFycYTabjeHDhxv79u3zbNEtXH3nU5KxcOFCxzZlZWXGPffcY7Rp08YIDQ01brzxRiMrK8tzRbcCd9xxh9GhQwcjKCjIiImJMYYPH2588sknjvc5pwDczWQYhuGhDAcAANCqedUYKQAAgOZEkAIAAHASQQoAAMBJBCkAAAAnEaQAAACcRJACAABwEkEKAADASQQpAAAAJxGkAAAAnESQAgAAcBJBCgAAwEn/Hy0V00jT1u5WAAAAAElFTkSuQmCC\n"
+ },
+ "metadata": {}
+ }
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "source": [
+    "# **Predictions of Retrieval Model**\n",
+    "Now that the Two-Tower model is trained, we can use it to generate candidates.\n",
+    "\n",
+    "We can implement the inference pipeline in three steps:\n",
+    "1. Indexing: We run the Ad (item) Tower once over all available ads to generate their embeddings.\n",
+    "2. Query Encoding: When a user arrives, we pass their features through the User Tower to generate a User Embedding.\n",
+    "3. Nearest Neighbor Search: We search the index for the Ad Embeddings closest to the User Embedding (highest dot product).\n",
+    "\n",
+    "The Keras-RS [BruteForceRetrieval layer](https://keras.io/keras_rs/api/retrieval_layers/brute_force_retrieval/) computes the dot product between the user embedding and every item in the index to find the exact top-K matches. The `BruteForceRetrievalWrapper` class below performs the same exhaustive search directly, using a matrix multiplication followed by `tf.math.top_k`."
+ ],
+ "metadata": {
+ "id": "_o0ILppGcknp"
+ }
+ },
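+  {
+   "cell_type": "markdown",
+   "source": [
+    "The three steps above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the Keras-RS implementation: the 3-dimensional embeddings are made-up values standing in for tower outputs, and `dot` and `brute_force_top_k` are hypothetical helper names.\n",
+    "\n",
+    "```python\n",
+    "import heapq\n",
+    "\n",
+    "# Step 1 (indexing): toy ad embeddings, assumed to be precomputed by the ad tower.\n",
+    "ad_index = {\n",
+    "    'ad_a': [1.0, 0.0, 0.5],\n",
+    "    'ad_b': [0.2, 0.9, 0.1],\n",
+    "    'ad_c': [0.8, 0.1, 0.7],\n",
+    "}\n",
+    "\n",
+    "# Step 2 (query encoding): toy user embedding, assumed to come from the user tower.\n",
+    "user_embedding = [1.0, 0.2, 1.0]\n",
+    "\n",
+    "\n",
+    "def dot(u, v):\n",
+    "    # Dot product of two equal-length vectors.\n",
+    "    return sum(a * b for a, b in zip(u, v))\n",
+    "\n",
+    "\n",
+    "def brute_force_top_k(user_emb, index, k):\n",
+    "    # Step 3 (search): score every ad, then keep the k highest dot products.\n",
+    "    scored = [(dot(user_emb, emb), ad_id) for ad_id, emb in index.items()]\n",
+    "    return heapq.nlargest(k, scored)\n",
+    "\n",
+    "\n",
+    "top_2 = brute_force_top_k(user_embedding, ad_index, k=2)\n",
+    "print(top_2)\n",
+    "```\n",
+    "\n",
+    "Scoring every ad gives exact results at the cost of one dot product per indexed ad, which is usually acceptable for small catalogs."
+   ],
+   "metadata": {}
+  },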
+ {
+ "cell_type": "code",
+ "source": [
+ "USER_CATEGORICAL = [\"user_id\", \"gender\", \"city\", \"country\"]\n",
+ "CONTINUOUS_FEATURES = [\"time_on_site\", \"internet_usage\", \"area_income\", \"Age\"]\n",
+ "USER_FEATURES = USER_CATEGORICAL + CONTINUOUS_FEATURES\n",
+ "\n",
+    "class BruteForceRetrievalWrapper:\n",
+    "    \"\"\"Exhaustive dot-product search over precomputed ad embeddings.\"\"\"\n",
+    "\n",
+    "    def __init__(self, model, ads_df, ad_features, user_features, k=10):\n",
+    "        self.model, self.k = model, k\n",
+    "        self.user_features = user_features\n",
+    "        # Index each ad once, keeping a single row per ad_id.\n",
+    "        unique_ads = ads_df[ad_features].drop_duplicates(\"ad_id\").reset_index(drop=True)\n",
+    "        self.ids = unique_ads[\"ad_id\"].values\n",
+    "        self.topic_map = dict(zip(unique_ads[\"ad_id\"], unique_ads[\"ad_topic\"]))\n",
+    "        ad_inputs = {\n",
+    "            \"ad_id\": tf.constant(self.ids.astype(str)),\n",
+    "            \"ad_topic\": tf.constant(unique_ads[\"ad_topic\"].astype(str).values),\n",
+    "        }\n",
+    "        # Precompute the ad embeddings that every query is scored against.\n",
+    "        self.candidate_embs = model.ln_ad(model.ad_tower(ad_inputs))\n",
+ "\n",
+ " def query_batch(self, user_df):\n",
+ " inputs = {k: tf.constant(user_df[k].values.astype(float if k in CONTINUOUS_FEATURES else str))\n",
+ " for k in self.user_features if k in user_df.columns\n",
+ " }\n",
+ " u_emb = self.model.ln_user(self.model.user_tower(inputs))\n",
+ " scores = tf.linalg.matmul(u_emb, self.candidate_embs, transpose_b=True)\n",
+ " top_scores, top_indices = tf.math.top_k(scores, k=self.k)\n",
+ " return top_scores.numpy(), top_indices.numpy()\n",
+ "\n",
+ " def decode_results(self, scores, indices):\n",
+ " results = []\n",
+ " for row_scores, row_indices in zip(scores, indices):\n",
+ " retrieved_ids = self.ids[row_indices]\n",
+ " results.append([\n",
+ " {\"ad_id\": aid, \"ad_topic\": self.topic_map[aid], \"score\": float(s)}\n",
+ " for aid, s in zip(retrieved_ids, row_scores)\n",
+ " ])\n",
+ " return results\n",
+ "\n",
+    "retrieval_engine = BruteForceRetrievalWrapper(\n",
+    "    model=retrieval_model,\n",
+    "    ads_df=ads_df,\n",
+    "    ad_features=[\"ad_id\", \"ad_topic\"],\n",
+    "    user_features=USER_FEATURES,\n",
+    "    k=10,\n",
+    ")\n",
+ "sample_user = pd.DataFrame([x_test.iloc[0]])\n",
+ "scores, indices = retrieval_engine.query_batch(sample_user)\n",
+ "top_ads = retrieval_engine.decode_results(scores, indices)[0]"
+ ],
+ "metadata": {
+ "id": "QrHPBLIml8Si"
+ },
+ "execution_count": 51,
+ "outputs": []
+ },
+ {
+ "cell_type": "markdown",
+ "source": [
+    "# **Implementation of Ranking Model**\n",
+    "The retrieval model only computes a simple similarity score (the dot product), so it cannot capture complex feature interactions.\n",
+    "We therefore add a ranking model after the retrieval model to rescore the retrieved candidates.\n",
+    "\n",
+    "**Architecture**\n",
+    "1. **Feature Extraction:** We reuse the trained User Tower and Ad Tower from the Retrieval stage. We freeze these towers (`trainable = False`) so their weights don't change.\n",
+    "2. **Interaction:** Instead of just a dot product, we concatenate three inputs: the User Embedding, the Ad Embedding, and their Dot Product (Similarity).\n",
+    "3. **Scorer (MLP):** These concatenated inputs are fed into a Multi-Layer Perceptron, a stack of Dense layers. This network learns the non-linear relationships between the user and the ad.\n",
+    "4. **Output:** The final layer uses a Sigmoid activation to output a single probability between 0.0 and 1.0 (Likelihood of a Click)."
+ ],
+ "metadata": {
+ "id": "xQtLgCfyeqYS"
+ }
+ },
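+  {
+   "cell_type": "markdown",
+   "source": [
+    "The interaction and scoring steps can be illustrated with a toy forward pass. This is only a sketch under stated assumptions: the 2-dimensional embeddings are made-up values, and a single linear unit with hypothetical weights stands in for the full MLP.\n",
+    "\n",
+    "```python\n",
+    "import math\n",
+    "\n",
+    "# Toy embeddings standing in for the frozen user and ad tower outputs.\n",
+    "user_emb = [0.5, -1.0]\n",
+    "ad_emb = [2.0, 0.5]\n",
+    "\n",
+    "# Interaction features: [user embedding, ad embedding, dot product].\n",
+    "similarity = sum(u * a for u, a in zip(user_emb, ad_emb))\n",
+    "features = user_emb + ad_emb + [similarity]\n",
+    "\n",
+    "# A single linear unit stands in for the MLP; these weights are made up.\n",
+    "weights = [0.1, -0.2, 0.3, 0.4, 0.5]\n",
+    "bias = 0.0\n",
+    "logit = sum(w * x for w, x in zip(weights, features)) + bias\n",
+    "\n",
+    "# The sigmoid squashes the logit into a click probability in (0, 1).\n",
+    "click_probability = 1.0 / (1.0 + math.exp(-logit))\n",
+    "print(features, round(click_probability, 4))\n",
+    "```\n",
+    "\n",
+    "Keeping the dot product as an explicit feature lets the scorer start from the retrieval similarity and learn corrections on top of it."
+   ],
+   "metadata": {}
+  },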
+ {
+ "cell_type": "code",
+ "source": [
+    "# Freeze the retrieval towers so that ranking training does not update them.\n",
+    "retrieval_model.trainable = False\n",
+    "\n",
+    "def create_ranking_ds(df):\n",
+    "    inputs = {\n",
+    "        \"user\": dict_to_tensor_features(df[USER_FEATURES], continuous_features),\n",
+    "        \"positive_ad\": dict_to_tensor_features(df[AD_FEATURES], continuous_features),\n",
+    "    }\n",
+    "    labels = df[\"Clicked on Ad\"].values.astype(\"float32\")\n",
+    "    dataset = tf.data.Dataset.from_tensor_slices((inputs, labels))\n",
+    "    return dataset.shuffle(10000).batch(256).prefetch(tf.data.AUTOTUNE)"
+ ],
+ "metadata": {
+ "id": "_j2PAllRvDOb"
+ },
+ "execution_count": 39,
+ "outputs": []
+ },
+ {
+ "cell_type": "code",
+ "source": [
+    "ranking_train_dataset = create_ranking_ds(x_train)\n",
+ "ranking_test_dataset = create_ranking_ds(x_test)"
+ ],
+ "metadata": {
+ "id": "uhKCsNa8v0Uo"
+ },
+ "execution_count": 40,
+ "outputs": []
+ },
+ {
+ "cell_type": "code",
+ "source": [
+ "class RankingModel(keras.Model):\n",
+ " def __init__(self, retrieval_model, **kwargs):\n",
+ " super().__init__(**kwargs)\n",
+ " self.retrieval = retrieval_model\n",
+ " self.mlp = keras.Sequential([\n",
+ " layers.Dense(256, activation=\"relu\"), layers.Dropout(0.2),\n",
+ " layers.Dense(128, activation=\"relu\"), layers.Dropout(0.2),\n",
+ " layers.Dense(64, activation=\"relu\"),\n",
+ " layers.Dense(1, activation=\"sigmoid\")\n",
+ " ])\n",
+ "\n",
+ " def call(self, inputs):\n",
+ " u_emb, ad_emb, dot = self.retrieval.get_embeddings(inputs)\n",
+ " return self.mlp(keras.ops.concatenate([u_emb, ad_emb, dot], axis=-1))"
+ ],
+ "metadata": {
+ "id": "mQCXdFFqvDRC"
+ },
+ "execution_count": 41,
+ "outputs": []
+ },
+ {
+ "cell_type": "code",
+ "source": [
+ "ranking_model = RankingModel(retrieval_model)\n",
+ "ranking_model.compile(optimizer=keras.optimizers.Adam(1e-4), loss=\"binary_crossentropy\", metrics=[\"AUC\", \"accuracy\"])\n",
+ "history1 = ranking_model.fit(ranking_train_dataset, epochs=20)"
+ ],
+ "metadata": {
+ "colab": {
+ "base_uri": "https://localhost:8080/"
+ },
+ "id": "w5JPRvJ_vDUS",
+ "outputId": "cdc8c321-8722-48a9-f6e3-8516c9f5caa1"
+ },
+ "execution_count": 42,
+ "outputs": [
+ {
+ "output_type": "stream",
+ "name": "stdout",
+ "text": [
+ "Epoch 1/20\n",
+ "\u001b[1m3/3\u001b[0m \u001b[32m━━━━━━━━━━━━━━━━━━━━\u001b[0m\u001b[37m\u001b[0m \u001b[1m6s\u001b[0m 75ms/step - AUC: 0.7137 - accuracy: 0.4999 - loss: 0.6688\n",
+ "Epoch 2/20\n",
+ "\u001b[1m3/3\u001b[0m \u001b[32m━━━━━━━━━━━━━━━━━━━━\u001b[0m\u001b[37m\u001b[0m \u001b[1m0s\u001b[0m 43ms/step - AUC: 0.8871 - accuracy: 0.6535 - loss: 0.6237\n",
+ "Epoch 3/20\n",
+ "\u001b[1m3/3\u001b[0m \u001b[32m━━━━━━━━━━━━━━━━━━━━\u001b[0m\u001b[37m\u001b[0m \u001b[1m0s\u001b[0m 51ms/step - AUC: 0.9528 - accuracy: 0.8104 - loss: 0.5837\n",
+ "Epoch 4/20\n",
+ "\u001b[1m3/3\u001b[0m \u001b[32m━━━━━━━━━━━━━━━━━━━━\u001b[0m\u001b[37m\u001b[0m \u001b[1m0s\u001b[0m 27ms/step - AUC: 0.9704 - accuracy: 0.8531 - loss: 0.5561 \n",
+ "Epoch 5/20\n",
+ "\u001b[1m3/3\u001b[0m \u001b[32m━━━━━━━━━━━━━━━━━━━━\u001b[0m\u001b[37m\u001b[0m \u001b[1m0s\u001b[0m 23ms/step - AUC: 0.9826 - accuracy: 0.9023 - loss: 0.5173\n",
+ "Epoch 6/20\n",
+ "\u001b[1m3/3\u001b[0m \u001b[32m━━━━━━━━━━━━━━━━━━━━\u001b[0m\u001b[37m\u001b[0m \u001b[1m0s\u001b[0m 47ms/step - AUC: 0.9875 - accuracy: 0.9188 - loss: 0.4851\n",
+ "Epoch 7/20\n",
+ "\u001b[1m3/3\u001b[0m \u001b[32m━━━━━━━━━━━━━━━━━━━━\u001b[0m\u001b[37m\u001b[0m \u001b[1m0s\u001b[0m 58ms/step - AUC: 0.9866 - accuracy: 0.9337 - loss: 0.4533\n",
+ "Epoch 8/20\n",
+ "\u001b[1m3/3\u001b[0m \u001b[32m━━━━━━━━━━━━━━━━━━━━\u001b[0m\u001b[37m\u001b[0m \u001b[1m0s\u001b[0m 29ms/step - AUC: 0.9914 - accuracy: 0.9448 - loss: 0.4224 \n",
+ "Epoch 9/20\n",
+ "\u001b[1m3/3\u001b[0m \u001b[32m━━━━━━━━━━━━━━━━━━━━\u001b[0m\u001b[37m\u001b[0m \u001b[1m0s\u001b[0m 23ms/step - AUC: 0.9903 - accuracy: 0.9441 - loss: 0.3910\n",
+ "Epoch 10/20\n",
+ "\u001b[1m3/3\u001b[0m \u001b[32m━━━━━━━━━━━━━━━━━━━━\u001b[0m\u001b[37m\u001b[0m \u001b[1m0s\u001b[0m 40ms/step - AUC: 0.9910 - accuracy: 0.9502 - loss: 0.3671\n",
+ "Epoch 11/20\n",
+ "\u001b[1m3/3\u001b[0m \u001b[32m━━━━━━━━━━━━━━━━━━━━\u001b[0m\u001b[37m\u001b[0m \u001b[1m0s\u001b[0m 20ms/step - AUC: 0.9938 - accuracy: 0.9616 - loss: 0.3386\n",
+ "Epoch 12/20\n",
+ "\u001b[1m3/3\u001b[0m \u001b[32m━━━━━━━━━━━━━━━━━━━━\u001b[0m\u001b[37m\u001b[0m \u001b[1m0s\u001b[0m 22ms/step - AUC: 0.9922 - accuracy: 0.9628 - loss: 0.3158\n",
+ "Epoch 13/20\n",
+ "\u001b[1m3/3\u001b[0m \u001b[32m━━━━━━━━━━━━━━━━━━━━\u001b[0m\u001b[37m\u001b[0m \u001b[1m0s\u001b[0m 22ms/step - AUC: 0.9940 - accuracy: 0.9676 - loss: 0.2864\n",
+ "Epoch 14/20\n",
+ "\u001b[1m3/3\u001b[0m \u001b[32m━━━━━━━━━━━━━━━━━━━━\u001b[0m\u001b[37m\u001b[0m \u001b[1m0s\u001b[0m 24ms/step - AUC: 0.9948 - accuracy: 0.9657 - loss: 0.2607\n",
+ "Epoch 15/20\n",
+ "\u001b[1m3/3\u001b[0m \u001b[32m━━━━━━━━━━━━━━━━━━━━\u001b[0m\u001b[37m\u001b[0m \u001b[1m0s\u001b[0m 14ms/step - AUC: 0.9951 - accuracy: 0.9685 - loss: 0.2452\n",
+ "Epoch 16/20\n",
+ "\u001b[1m3/3\u001b[0m \u001b[32m━━━━━━━━━━━━━━━━━━━━\u001b[0m\u001b[37m\u001b[0m \u001b[1m0s\u001b[0m 15ms/step - AUC: 0.9943 - accuracy: 0.9689 - loss: 0.2243\n",
+ "Epoch 17/20\n",
+ "\u001b[1m3/3\u001b[0m \u001b[32m━━━━━━━━━━━━━━━━━━━━\u001b[0m\u001b[37m\u001b[0m \u001b[1m0s\u001b[0m 13ms/step - AUC: 0.9945 - accuracy: 0.9701 - loss: 0.2068\n",
+ "Epoch 18/20\n",
+ "\u001b[1m3/3\u001b[0m \u001b[32m━━━━━━━━━━━━━━━━━━━━\u001b[0m\u001b[37m\u001b[0m \u001b[1m0s\u001b[0m 13ms/step - AUC: 0.9942 - accuracy: 0.9682 - loss: 0.1947\n",
+ "Epoch 19/20\n",
+ "\u001b[1m3/3\u001b[0m \u001b[32m━━━━━━━━━━━━━━━━━━━━\u001b[0m\u001b[37m\u001b[0m \u001b[1m0s\u001b[0m 12ms/step - AUC: 0.9955 - accuracy: 0.9719 - loss: 0.1764\n",
+ "Epoch 20/20\n",
+ "\u001b[1m3/3\u001b[0m \u001b[32m━━━━━━━━━━━━━━━━━━━━\u001b[0m\u001b[37m\u001b[0m \u001b[1m0s\u001b[0m 15ms/step - AUC: 0.9943 - accuracy: 0.9725 - loss: 0.1623\n"
+ ]
+ }
+ ]
+ },
+ {
+ "cell_type": "code",
+ "source": [
+ "pd.DataFrame(history1.history).plot(subplots=True, layout=(1, 3), figsize=(12, 4), title=\"Ranking Model Metrics\")\n",
+ "plt.show()"
+ ],
+ "metadata": {
+ "colab": {
+ "base_uri": "https://localhost:8080/",
+ "height": 408
+ },
+ "id": "WoodoIYnFgsx",
+ "outputId": "ee8c8243-6c85-4831-f44a-167d1ecf7b06"
+ },
+ "execution_count": 43,
+ "outputs": [
+ {
+ "output_type": "display_data",
+ "data": {
+ "text/plain": [
+ ""
+ ],
+ "image/png": "\n"
+ },
+ "metadata": {}
+ }
+ ]
+ },
+ {
+ "cell_type": "code",
+ "source": [
+ "ranking_model.evaluate(ranking_test_dataset)"
+ ],
+ "metadata": {
+ "colab": {
+ "base_uri": "https://localhost:8080/"
+ },
+ "id": "RD4UirtNvDXT",
+ "outputId": "9964607f-eea1-4c1a-d117-2a847416cfec"
+ },
+ "execution_count": 44,
+ "outputs": [
+ {
+ "output_type": "stream",
+ "name": "stdout",
+ "text": [
+ "\u001b[1m1/1\u001b[0m \u001b[32m━━━━━━━━━━━━━━━━━━━━\u001b[0m\u001b[37m\u001b[0m \u001b[1m1s\u001b[0m 630ms/step - AUC: 0.9867 - accuracy: 0.9372 - loss: 0.2243\n"
+ ]
+ },
+ {
+ "output_type": "execute_result",
+ "data": {
+ "text/plain": [
+ "[0.2243196964263916, 0.9866776466369629, 0.9371727705001831]"
+ ]
+ },
+ "metadata": {},
+ "execution_count": 44
+ }
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "source": [
+ "# **Predictions of Ranking Model**\n",
+ "The retrieval model gave us a list of ads that are generally relevant (high dot product similarity). The ranking model will now calculate the specific probability (0% to 100%) that the user will click each of those ads.\n",
+ "\n",
+    "The Ranking model expects pairs of (User, Ad). Since we are scoring 10 ads for 1 user, we cannot pass the user features just once. Instead, we repeat the user's features 10 times, once per retrieved ad, to create a batch."
+ ],
+ "metadata": {
+ "id": "XaLAPapNjdYm"
+ }
+ },
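+  {
+   "cell_type": "markdown",
+   "source": [
+    "The pairing step can be sketched with plain Python lists before any tensors are involved. This is a minimal illustration: the user row and ads are made-up values, and `build_pairs` is a hypothetical helper, not part of the notebook's pipeline.\n",
+    "\n",
+    "```python\n",
+    "# Toy user row and retrieved ads; the field names mirror the notebook's columns.\n",
+    "user_row = {'user_id': 'user_1', 'Age': 35.0}\n",
+    "retrieved_ads = [\n",
+    "    {'ad_id': 'ad_1', 'ad_topic': 'topic one'},\n",
+    "    {'ad_id': 'ad_2', 'ad_topic': 'topic two'},\n",
+    "    {'ad_id': 'ad_3', 'ad_topic': 'topic three'},\n",
+    "]\n",
+    "\n",
+    "\n",
+    "def build_pairs(user, ads):\n",
+    "    # Repeat each user feature once per ad so that row i pairs the user with ads[i].\n",
+    "    num_ads = len(ads)\n",
+    "    user_batch = {name: [value] * num_ads for name, value in user.items()}\n",
+    "    ad_batch = {name: [ad[name] for ad in ads] for name in ads[0]}\n",
+    "    return {'user': user_batch, 'positive_ad': ad_batch}\n",
+    "\n",
+    "\n",
+    "batch = build_pairs(user_row, retrieved_ads)\n",
+    "print(batch['user']['user_id'])\n",
+    "print(batch['positive_ad']['ad_id'])\n",
+    "```\n",
+    "\n",
+    "Every feature list has the same length as the number of retrieved ads, so the ranking model can score all candidates in a single forward pass."
+   ],
+   "metadata": {}
+  },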
+ {
+ "cell_type": "code",
+ "source": [
+ "def rerank_ads_for_user(user_row, retrieved_ads, ranking_model):\n",
+ " ads_df = pd.DataFrame(retrieved_ads)\n",
+ " num_ads = len(ads_df)\n",
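+    "    # Repeat the user's features once per candidate ad so that each row is a (user, ad) pair.\n",
+    "    user_inputs = {\n",
+    "        k: tf.fill((num_ads, 1), float(user_row[k]) if k in continuous_features else str(user_row[k]))\n",
+    "        for k in USER_FEATURES\n",
+    "    }\n",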
+ " ad_inputs = {k: tf.reshape(tf.constant(ads_df[k].astype(str).values), (-1, 1)) for k in AD_FEATURES}\n",
+ " scores = ranking_model({\"user\": user_inputs, \"positive_ad\": ad_inputs}).numpy().flatten()\n",
+ " ads_df[\"ranking_score\"] = scores\n",
+ " return ads_df.sort_values(\"ranking_score\", ascending=False).to_dict(\"records\")\n",
+ "\n",
+ "sample_user = x_test.iloc[0]\n",
+ "scores, indices = retrieval_engine.query_batch(pd.DataFrame([sample_user]))\n",
+ "top_ads = retrieval_engine.decode_results(scores, indices)[0]\n",
+ "final_ranked_ads = rerank_ads_for_user(sample_user, top_ads, ranking_model)\n",
+ "print(f\"User: {sample_user['user_id']}\")\n",
+    "print(f\"{'Ad ID':<10} | {'Topic':<30} | {'Retrieval Score':<11} | {'Rank Probability'}\")\n",
+    "for item in final_ranked_ads:\n",
+    "    print(f\"{item['ad_id']:<10} | {item['ad_topic'][:28]:<30} | {item['score']:.4f} | {item['ranking_score']*100:.2f}%\")"
+   ],
+   "metadata": {
+    "id": "MvPsCaw_vDaT",
+    "colab": {
+     "base_uri": "https://localhost:8080/"
+    },
+    "outputId": "7b16a6ac-679e-41b6-cce8-67b4f193b91a"
+   },
+   "execution_count": 49,
+   "outputs": [
+    {
+     "output_type": "stream",
+     "name": "stdout",
+     "text": [
+      "User: user_216\n",
+      "Ad ID      | Topic                          | Retrieval Score | Rank Probability\n",
+ "ad_660 | Profound optimizing utilizat | 8.1021 | 99.19%\n",
+ "ad_318 | Front-line upward-trending g | 6.6563 | 99.07%\n",
+ "ad_311 | Front-line methodical utiliz | 6.6728 | 98.77%\n",
+ "ad_31 | Ameliorated well-modulated c | 6.4871 | 98.65%\n",
+ "ad_861 | Synergized clear-thinking pr | 6.2368 | 98.57%\n",
+ "ad_387 | Implemented didactic support | 5.9674 | 98.47%\n",
+ "ad_799 | Self-enabling optimal initia | 5.8983 | 98.43%\n",
+ "ad_984 | Vision-oriented contextually | 5.9103 | 98.29%\n",
+ "ad_706 | Re-engineered demand-driven | 6.5815 | 98.22%\n",
+ "ad_916 | Universal multi-state system | 5.6566 | 98.17%\n"
+ ]
+ }
+ ]
+ },
+ {
+ "cell_type": "code",
+ "source": [],
+ "metadata": {
+ "id": "ECqj1I91JUgg"
+ },
+ "execution_count": 45,
+ "outputs": []
+ }
+ ]
+}
\ No newline at end of file