# Advertisement Ranking and Recommendation System

## Overview

An **Advertisement Ranking and Recommendation System** is designed to optimize ad placements by predicting the most relevant ads for users based on their interaction history, profiles, and ad attributes. The primary goal is to maximize user engagement, improve click-through rates (CTR), and ultimately generate more revenue for both advertisers and platforms.

## Motivation

The core motivation is to maximize the effectiveness of advertisements by delivering the most relevant ads to users. This can be achieved by:

- **Increasing Revenue**: More relevant ads lead to higher click-through and conversion rates, benefiting both the platform and advertisers.
- **Improving User Experience**: Personalized recommendations prevent ad fatigue and increase user engagement.
- **Optimizing Ad Spend**: Ad targeting ensures advertisers' budgets are spent more efficiently by delivering high-value ads to relevant users.

The system uses machine learning to predict ad relevance and ranks ads according to their likelihood of user interaction, enabling the optimization of ad revenue.

---

## 1. Entities and Data Structures

Entities in the recommendation system include users, ads, campaigns, and their interactions. Here is how data could be structured in JSON format:

```json
{
  "user": {
    "user_id": "12345",
    "age": 30,
    "gender": "Male",
    "location": "New York",
    "interests": ["Technology", "Fitness", "Travel"],
    "device_type": "mobile",
    "historical_clicks": [
      {"ad_id": "ad123", "timestamp": "2025-01-10T12:00:00", "click": true},
      {"ad_id": "ad456", "timestamp": "2025-01-12T15:30:00", "click": false}
    ]
  },
  "ad": {
    "ad_id": "ad123",
    "campaign_id": "camp567",
    "creative_type": "image",
    "targeting": {"age_range": "25-35", "interests": ["Technology", "Fitness"]},
    "budget": 10000,
    "start_date": "2025-01-01",
    "end_date": "2025-01-31",
    "clicks": 1200,
    "impressions": 15000
  },
  "campaign": {
    "campaign_id": "camp567",
    "ad_group_id": "grp123",
    "ad_ids": ["ad123", "ad456", "ad789"],
    "total_spent": 5000,
    "goal": "maximize_clicks",
    "bidding_strategy": "CPC"
  }
}
```
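As a rough illustration of how the `ad` entity might be represented in code, the sketch below models a subset of its fields as a Python dataclass and derives the historical CTR from `clicks` and `impressions`. The class name `Ad`, the methods `ctr` and `is_active`, and the inclusive treatment of the date range are assumptions made for this example, not part of the schema above.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Ad:
    """Subset of the `ad` entity; field names follow the JSON above."""
    ad_id: str
    campaign_id: str
    start_date: date
    end_date: date
    clicks: int
    impressions: int

    def ctr(self) -> float:
        # Historical click-through rate; 0.0 when the ad has never been shown.
        return self.clicks / self.impressions if self.impressions else 0.0

    def is_active(self, on: date) -> bool:
        # Assumption: both start_date and end_date are inclusive.
        return self.start_date <= on <= self.end_date


ad = Ad("ad123", "camp567", date(2025, 1, 1), date(2025, 1, 31), 1200, 15000)
print(ad.ctr())                         # 0.08
print(ad.is_active(date(2025, 1, 15)))  # True
```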

---

## 2. Candidate Generation

Generate a list of candidate ads for each user based on:

- **User Attributes**: Age, gender, interests, historical interactions.
- **Ad Attributes**: Type of creative (image, video), targeting (location, interests), past performance.
- **Collaborative Filtering**: Ads seen by similar users or liked by users with similar interests.

Example JSON for candidate generation:

```json
{
  "user_id": "12345",
  "ad_candidates": [
    {"ad_id": "ad123", "relevance_score": 0.85, "creative_type": "image", "targeting_match": 0.9},
    {"ad_id": "ad456", "relevance_score": 0.75, "creative_type": "video", "targeting_match": 0.8},
    {"ad_id": "ad789", "relevance_score": 0.65, "creative_type": "carousel", "targeting_match": 0.85}
  ]
}
```
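One simple way to produce such a candidate list is rule-based filtering on the targeting fields. The sketch below is a minimal illustration under stated assumptions: the functions `targeting_match` and `generate_candidates`, the `min_match` threshold, and the ad `ad999` are hypothetical, and the overlap-based score is not the formula behind the `targeting_match` values shown above.

```python
def targeting_match(user: dict, ad: dict) -> float:
    """Hypothetical score: share of the ad's targeted interests the user has."""
    targeting = ad["targeting"]
    low, high = (int(part) for part in targeting["age_range"].split("-"))
    if not low <= user["age"] <= high:
        return 0.0  # Assumption: age targeting is a hard filter.
    targeted = set(targeting["interests"])
    if not targeted:
        return 1.0  # No interest targeting means every user matches.
    return len(targeted & set(user["interests"])) / len(targeted)


def generate_candidates(user: dict, ads: list, min_match: float = 0.5, limit: int = 100) -> list:
    """Keep ads whose match score reaches min_match, best match first."""
    scored = [(targeting_match(user, ad), ad["ad_id"]) for ad in ads]
    kept = sorted((pair for pair in scored if pair[0] >= min_match),
                  key=lambda pair: pair[0], reverse=True)
    return [{"ad_id": ad_id, "targeting_match": score} for score, ad_id in kept[:limit]]


user = {"user_id": "12345", "age": 30, "interests": ["Technology", "Fitness", "Travel"]}
ads = [
    {"ad_id": "ad123", "targeting": {"age_range": "25-35", "interests": ["Technology", "Fitness"]}},
    # Hypothetical ad outside the user's age range; it should be filtered out.
    {"ad_id": "ad999", "targeting": {"age_range": "40-50", "interests": ["Travel"]}},
]
candidates = generate_candidates(user, ads)
print(candidates)
```

In practice this rule-based step would usually be combined with the collaborative-filtering signal listed above, and the ranking model would then score only the surviving candidates.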

---

## 3. Feature Extraction

Key features for both users and ads include:

- **User Features**: Age, gender, location, historical interaction.
- **Ad Features**: Creative type, targeting, past performance.
- **Contextual Features**: Time of day, device type, geolocation.

Training data example:

```json
{
  "user_id": "12345",
  "ad_id": "ad123",
  "user_age": 30,
  "user_gender": "Male",
  "ad_creative_type_image": 1,
  "ad_targeting_match": 0.9,
  "historical_click_rate": 0.15
}
```
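A row like this could be assembled from the raw entities roughly as follows. This is a sketch under assumptions: the helper names `one_hot` and `build_features`, the `hour_of_day` and `device_is_mobile` columns, and the vocabulary `CREATIVE_TYPES` are illustrative, and `historical_click_rate` is assumed here to be the share of the user's logged impressions that were clicked, which the text above does not define.

```python
# Creative types that appear in the examples above; a real system may have more.
CREATIVE_TYPES = ("image", "video", "carousel")


def one_hot(value: str, vocabulary: tuple, prefix: str) -> dict:
    """Encode a categorical value as 0/1 columns named '<prefix>_<category>'."""
    return {f"{prefix}_{category}": int(category == value) for category in vocabulary}


def build_features(user: dict, ad: dict, hour_of_day: int) -> dict:
    history = user.get("historical_clicks", [])
    clicks = sum(1 for event in history if event["click"])
    row = {
        "user_id": user["user_id"],
        "ad_id": ad["ad_id"],
        "user_age": user["age"],
        # Contextual features (assumed encodings).
        "device_is_mobile": int(user["device_type"] == "mobile"),
        "hour_of_day": hour_of_day,
        # Assumption: clicked impressions divided by logged impressions.
        "historical_click_rate": clicks / len(history) if history else 0.0,
    }
    row.update(one_hot(ad["creative_type"], CREATIVE_TYPES, "ad_creative_type"))
    return row


user = {
    "user_id": "12345", "age": 30, "device_type": "mobile",
    "historical_clicks": [
        {"ad_id": "ad123", "timestamp": "2025-01-10T12:00:00", "click": True},
        {"ad_id": "ad456", "timestamp": "2025-01-12T15:30:00", "click": False},
    ],
}
ad = {"ad_id": "ad123", "creative_type": "image"}
row = build_features(user, ad, hour_of_day=12)
print(row)
```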

---

## 4. Two-Tower Neural Network Architecture

This architecture uses two distinct neural networks (or “towers”) for learning representations of users and ads independently, followed by a combination of their outputs to predict the likelihood of a user clicking on an ad.

### Model Structure:

1. **User Tower**: Takes user-related features (age, gender, location) as input and learns a user embedding.
2. **Ad Tower**: Takes ad-related features (creative type, targeting, past performance) and learns an ad embedding.
3. **Interaction Layer**: The embeddings of both towers are concatenated and passed through a dense layer to predict the probability of a click.

### Mathematical Formulation:

- **User Embedding**: $\mathbf{u} = f_u(\mathbf{x}_u)$, where $\mathbf{x}_u$ is the user feature vector and $f_u$ is the user tower.

- **Ad Embedding**: $\mathbf{a} = f_a(\mathbf{x}_a)$, where $\mathbf{x}_a$ is the ad feature vector and $f_a$ is the ad tower.

- **Interaction Layer**: The output of the interaction layer is a scalar value, representing the probability of a click:
$$
P(\text{click}) = \sigma(\mathbf{w}^T (\mathbf{u} \oplus \mathbf{a}))
$$
where $\oplus$ represents concatenation, and $\sigma$ is the sigmoid activation function.
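Alternatively, the two embeddings can be combined with a dot product, $P(\text{click}) = \sigma(\mathbf{u}^T \mathbf{a})$, which removes the interaction layer's weights and keeps the scoring step small.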

### Loss Function:

The system uses **Binary Cross-Entropy** as the loss function:

$$
L = -\frac{1}{N} \sum_{i=1}^{N} \left[ y_i \log(\hat{y}_i) + (1 - y_i) \log(1 - \hat{y}_i) \right]
$$
where:
- $y_i$ is the actual label (1 for click, 0 for no click),
- $\hat{y}_i$ is the predicted probability of a click,
- $N$ is the number of training examples.
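The loss can be checked by hand with a few lines of plain Python. This is a minimal sketch rather than a training implementation: the function name `binary_cross_entropy` and the clipping constant `eps` (added so that `log(0)` is never evaluated) are assumptions of this example.

```python
import math


def binary_cross_entropy(labels, probs, eps=1e-12):
    """Mean binary cross-entropy over N (label, predicted probability) pairs."""
    total = 0.0
    for y, p in zip(labels, probs):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        total += y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return -total / len(labels)


# A confident correct prediction costs little; a confident wrong one costs a lot.
good = binary_cross_entropy([1, 0], [0.9, 0.1])
bad = binary_cross_entropy([1, 0], [0.1, 0.9])
print(good, bad)
```

When training with PyTorch, the built-in `torch.nn.BCELoss` computes the same quantity on the sigmoid outputs of the model.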

---

## 5. Alternative Model: XGBoost

An alternative approach is to use **XGBoost**, a gradient-boosted decision tree library that is widely used for classification and regression problems and also supports learning-to-rank objectives:

- **Input Features**: Use user and ad features.
- **Ranking Objective**: Instead of predicting a binary outcome, XGBoost can be trained to rank ads for a user by optimizing a **pairwise ranking loss function**.

### Mathematical Formulation for XGBoost:

- **Ranking Loss Function**: For two ads $i$ and $j$ shown to the same user, where ad $i$ should rank above ad $j$ (for example, $i$ was clicked and $j$ was not), XGBoost can minimize the **Pairwise Log Loss**:
$$
L_{rank}(i, j) = \log(1 + \exp(-(f_i - f_j)))
$$
where $f_i$ and $f_j$ are the predicted scores for ads $i$ and $j$, respectively.
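To make the behaviour of this loss concrete, the sketch below evaluates it in plain Python, assuming ad $i$ is the one that should rank higher (for example, the clicked ad). The function name `pairwise_log_loss` is hypothetical, and the code only evaluates the formula; it does not reproduce XGBoost's internal training procedure.

```python
import math


def pairwise_log_loss(f_i: float, f_j: float) -> float:
    """log(1 + exp(-(f_i - f_j))), where ad i is assumed to be the preferred ad."""
    diff = f_i - f_j
    # Numerically stable evaluation: never exponentiate a large positive number.
    if diff >= 0:
        return math.log1p(math.exp(-diff))
    return -diff + math.log1p(math.exp(diff))


print(pairwise_log_loss(2.0, 0.0))  # small: the preferred ad already scores higher
print(pairwise_log_loss(0.0, 2.0))  # large: the pair is ordered the wrong way
```

The loss shrinks toward zero as the preferred ad's score pulls ahead and grows roughly linearly when the pair is mis-ordered, which is what pushes the model to fix wrongly ordered pairs first.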

---

## 6. Evaluation

Evaluation metrics include:

- **CTR**: Click-through rate for ad performance.
- **NDCG**: Normalized Discounted Cumulative Gain for ranking quality.
- **AUC-ROC**: Measures the model's ability to separate clicked from non-clicked impressions.
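The three metrics can be computed with small reference implementations such as the ones below. This is a sketch under assumptions: the function names are illustrative, NDCG uses linear gain (which equals the exponential-gain form for binary click labels), and AUC is computed by brute-force pair counting, which is only practical for small samples.

```python
import math


def ctr(clicks: int, impressions: int) -> float:
    return clicks / impressions if impressions else 0.0


def dcg_at_k(relevances, k):
    # Position 0 is the top slot, so its discount is log2(0 + 2) = 1.
    return sum(rel / math.log2(rank + 2) for rank, rel in enumerate(relevances[:k]))


def ndcg_at_k(relevances, k):
    """relevances: labels of the ads in the order the model ranked them."""
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    return dcg_at_k(relevances, k) / ideal if ideal > 0 else 0.0


def auc_roc(labels, scores):
    """Probability that a random clicked item outscores a random non-clicked one."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


print(ctr(1200, 15000))
print(ndcg_at_k([0, 1, 0], k=3))
print(auc_roc([1, 0, 1, 0], [0.9, 0.8, 0.7, 0.1]))  # 0.75
```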

---

## 7. Deployment Strategy

- **Model Deployment**: Use Docker and Kubernetes for containerization and orchestration.
- **A/B Testing**: Test new models against existing ones to evaluate improvement before full deployment.
- **Real-time Feedback**: Incorporate user feedback and interactions to continuously improve ad recommendations.
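For the A/B testing step, one common way to judge whether a new model's CTR differs from the existing model's is a two-proportion z-test. The sketch below is an illustration only: the test choice, the function name `two_proportion_z_test`, and the traffic numbers are assumptions of this example, not something the system above prescribes.

```python
import math


def two_proportion_z_test(clicks_a, views_a, clicks_b, views_b):
    """Return (z, two-sided p-value) for CTR of variant B versus variant A."""
    p_a = clicks_a / views_a
    p_b = clicks_b / views_b
    pooled = (clicks_a + clicks_b) / (views_a + views_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / views_a + 1 / views_b))
    if se == 0:
        return 0.0, 1.0  # no variation at all, so no evidence of a difference
    z = (p_b - p_a) / se
    # Two-sided p-value under the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical traffic: A is the existing model, B is the new model.
z, p_value = two_proportion_z_test(1200, 15000, 1320, 15000)
print(z, p_value)
```

A small p-value suggests the CTR difference is unlikely to be noise, but the significance threshold and the minimum sample size should be fixed before the experiment starts.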

---

## Conclusion

The Two-Tower architecture, with XGBoost as an alternative, gives the system two ways to rank and recommend ads: learned user and ad embeddings, or a tree-based model over engineered features. Either approach can be used to optimize ad placements, with the aim of improving user engagement and generating more revenue for advertisers and the platform.



## Model Architecture Example using PyTorch

```python
import torch
import torch.nn as nn


class TwoTowerModel(nn.Module):
    def __init__(self, user_input_dim, ad_input_dim, embedding_dim=64):
        super().__init__()

        # User tower: maps user features to a user embedding
        self.user_tower = nn.Sequential(
            nn.Linear(user_input_dim, embedding_dim),
            nn.ReLU(),
            nn.Linear(embedding_dim, embedding_dim),
            nn.ReLU()
        )

        # Ad tower: maps ad features to an ad embedding
        self.ad_tower = nn.Sequential(
            nn.Linear(ad_input_dim, embedding_dim),
            nn.ReLU(),
            nn.Linear(embedding_dim, embedding_dim),
            nn.ReLU()
        )

        # Interaction layer: scores the concatenated embeddings
        self.fc_interaction = nn.Linear(embedding_dim * 2, 1)

        # Sigmoid turns the score into a click probability
        self.sigmoid = nn.Sigmoid()

    def forward(self, user_input, ad_input):
        # User tower forward pass
        user_embedding = self.user_tower(user_input)

        # Ad tower forward pass
        ad_embedding = self.ad_tower(ad_input)

        # Concatenate user and ad embeddings
        combined_embedding = torch.cat((user_embedding, ad_embedding), dim=-1)

        # Interaction layer produces one raw score (logit) per example
        click_logit = self.fc_interaction(combined_embedding)

        # Apply sigmoid activation to get the probability of a click
        return self.sigmoid(click_logit)


# Example input dimensions
user_input_dim = 10  # example user features
ad_input_dim = 8     # example ad features

# Instantiate the model
model = TwoTowerModel(user_input_dim, ad_input_dim)

# Print the model architecture
print(model)
```


Output:

```
TwoTowerModel(
  (user_tower): Sequential(
    (0): Linear(in_features=10, out_features=64, bias=True)
    (1): ReLU()
    (2): Linear(in_features=64, out_features=64, bias=True)
    (3): ReLU()
  )
  (ad_tower): Sequential(
    (0): Linear(in_features=8, out_features=64, bias=True)
    (1): ReLU()
    (2): Linear(in_features=64, out_features=64, bias=True)
    (3): ReLU()
  )
  (fc_interaction): Linear(in_features=128, out_features=1, bias=True)
  (sigmoid): Sigmoid()
)
```
