# ELO-Based Dynamic Rating System Demo

This notebook demonstrates the ELO rating system for dating app matching.

## Key Differences from Static Scores:

| Feature | Static (PageRank/Composite) | Dynamic (ELO) |
|---------|----------------------------|---------------|
| Updates | Once at profile creation | After each interaction |
| Based on | Profile quality metrics | Actual match success |
| Reflects | Potential attractiveness | Market value |
| Self-correcting | No | Yes |
| New users | May be misranked | Start at average, adjust quickly |

## How ELO Works:

1. All users start at an ELO rating of 1200.
2. When user A likes user B:
   - If B also liked A (match): both gain points
   - If B hasn't decided: A gets a small boost, B gets a larger boost
   - If B rejected A: A loses points, B gains points
3. The size of each rating change depends on how expected the outcome was:
   - High-rated user matching a low-rated user: small change
   - Equal-rated users matching: larger change
4. Final ratings reflect actual matching success over time.
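
The dependence on expected outcome in step 3 can be illustrated with the textbook Elo formula. This is a minimal sketch, not the code behind `matchmaker.models.elo`: the K-factor of 32 and the 400-point scale are the conventional chess defaults and are assumptions here, as are the helper names `expected_score` and `elo_delta` and the encoding of a like as outcome 1.0 and a rejection as 0.0.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Expected outcome for A against B on the standard Elo logistic curve."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))


def elo_delta(rating_a: float, rating_b: float, outcome: float, k: float = 32.0) -> float:
    """Rating change for A: K times (actual outcome minus expected outcome).

    k=32 is an assumed value, not taken from EloConfig.
    """
    return k * (outcome - expected_score(rating_a, rating_b))


# Positive outcome between equally rated users
equal = elo_delta(1200.0, 1200.0, outcome=1.0)
# Positive outcome for the higher-rated user (the expected result)
favourite = elo_delta(1400.0, 1200.0, outcome=1.0)
# Positive outcome for the lower-rated user (the surprising result)
upset = elo_delta(1200.0, 1400.0, outcome=1.0)

print(f"equal: {equal:+.2f}, favourite: {favourite:+.2f}, upset: {upset:+.2f}")
```

With these assumed constants, a positive outcome between two 1200-rated users is worth 16 points, while a 1400-rated user who gets the expected positive outcome against a 1200-rated user gains only about 7.7.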

In [1]:
import sys
sys.path.insert(0, '../src')

# Force reload of modules
for module_name in ('matchmaker.models.elo', 'matchmaker.engine', 'matchmaker'):
    sys.modules.pop(module_name, None)

import cudf
import pandas as pd
import plotly.express as px
import plotly.graph_objects as go
from plotly.subplots import make_subplots

from matchmaker.engine import MatchingEngine
from matchmaker.models.elo import EloConfig


## Load Data and Run Models

In [2]:
# Initialize engine
engine = MatchingEngine()

In [3]:
# Load interactions with gender information
engine.load_interactions(
    "data/swipes_clean.csv", 
    decider_col='decidermemberid',
    other_col='othermemberid', 
    like_col='like', 
    timestamp_col='timestamp',
    gender_col='decidergender'
)

Reading data... ✅
Fitting ALS... 
🚀 Preparing data...
🎯 Training male→female ALS...
100%|██████████| 15/15 [00:00<00:00, 15.02it/s]
🎯 Training female→male ALS...
100%|██████████| 15/15 [00:00<00:00, 249.50it/s]
🔄 Converting factors to CuPy arrays...
✅ Trained M2F ALS with 33173 males × 33358 females
✅ Trained F2M ALS with 10882 females × 44241 males
Complete! ✅


In [4]:
# Compute engagement scores (useful for comparison)
engine.run_engagement()

User DF updated ✅


In [5]:
# Compute ELO ratings; the PageRank leagues used below are read from the same user table
engine.run_elo()

User DF updated ✅


In [6]:
# Keep data in cudf for GPU-accelerated processing
user_gdf = engine.user_df
interaction_gdf = engine.interaction_df

print(f"Total users: {len(user_gdf)}")
print(f"Total interactions: {len(interaction_gdf)}")
print("\nPageRank League distribution:")
league_counts = user_gdf['league'].value_counts().to_pandas().sort_index()
print(league_counts)

Total users: 171012
Total interactions: 9827888

PageRank League distribution:
league
Bronze      27024
Diamond      9010
Gold        18017
Platinum    18016
Silver      18016
Name: count, dtype: int64


## Compute ELO Ratings

**What ELO Measures in Dating Apps:**
- **DESIRABILITY**: How often you get liked when others swipe on you
- **NOT selectivity**: Your own swiping behavior doesn't affect your ELO
- High ELO = you're frequently liked (desirable)
- Low ELO = you're frequently rejected (less desirable)

**GPU-Accelerated Implementation:**
- **9.8M interactions** processed in ~3-4 seconds
- **171K users** scored with chunked batch updates (100K interactions per chunk)
- Uses CuPy scatter-add operations (`cp.add.at`) for efficient rating accumulation
- **Rating bounds**: Clamped to the range 100 to 10,000 to prevent extreme outliers
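
The chunked update pattern described above can be sketched in plain Python. This is an illustration under stated assumptions, not the engine's GPU code: a dictionary of per-user deltas stands in for the CuPy scatter-add, only the swiped-on user's rating moves (following the desirability bullet earlier in this section), and the K-factor of 32, the 400-point scale, and the names `apply_chunk` and `run_batched` are hypothetical. The rating bounds are the ones quoted above.

```python
from collections import defaultdict

RATING_MIN, RATING_MAX = 100.0, 10_000.0  # bounds quoted in the text


def expected(rating_a, rating_b):
    """Standard Elo expectation for A against B (400-point scale assumed)."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))


def apply_chunk(ratings, chunk, k=32.0):
    """Apply one chunk of (decider_id, target_id, liked) tuples in place.

    Every delta is computed against the ratings as they stood at the start
    of the chunk, summed per user, then applied once and clamped.
    """
    deltas = defaultdict(float)
    for decider, target, liked in chunk:
        outcome = 1.0 if liked else 0.0
        # Accumulate per-user deltas: the CPU analogue of a scatter-add
        deltas[target] += k * (outcome - expected(ratings[target], ratings[decider]))
    for user, delta in deltas.items():
        ratings[user] = min(RATING_MAX, max(RATING_MIN, ratings[user] + delta))


def run_batched(ratings, interactions, chunk_size):
    """Process interactions in order, chunk_size rows at a time."""
    for start in range(0, len(interactions), chunk_size):
        apply_chunk(ratings, interactions[start:start + chunk_size], k=32.0)
    return ratings


demo = run_batched(
    {"a": 1200.0, "b": 1200.0, "c": 1200.0},
    [("a", "b", True), ("c", "b", True), ("a", "c", False)],
    chunk_size=2,
)
print(demo)
```

Because deltas within a chunk are computed from frozen ratings, the result depends on the chunk size; smaller chunks track a strictly sequential update more closely at the cost of more passes.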

**Gender-Specific Pools:**
- Separate rating scales for males (M) and females (F)
- Males are only compared to other males
- Females are only compared to other females
- Accounts for different market dynamics (e.g., women typically get more likes)
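
One way to read ratings from separate pools is to rank each user only against others of the same gender. The sketch below is a hypothetical illustration of that idea, not a function of `MatchingEngine`; the helper name `percentile_within_pool` and the toy ratings are made up.

```python
from bisect import bisect_right
from collections import defaultdict


def percentile_within_pool(users):
    """Map user_id to the fraction of same-gender users rated at or below them.

    `users` is a list of (user_id, gender, rating) tuples.
    """
    pools = defaultdict(list)
    for _, gender, rating in users:
        pools[gender].append(rating)
    for pool in pools.values():
        pool.sort()
    return {
        user_id: bisect_right(pools[gender], rating) / len(pools[gender])
        for user_id, gender, rating in users
    }


toy_users = [(1, "M", 1100.0), (2, "M", 1300.0), (3, "F", 1500.0), (4, "F", 1700.0)]
ranks = percentile_within_pool(toy_users)
print(ranks)
```

In this toy data, user 2 sits at the top of the male pool even though both female users have higher raw ratings, which is why raw ratings should not be compared across pools.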

In [7]:
# Refresh user data
user_gdf = engine.user_df

# Show sample users with ELO ratings
sample_cols = ['user_id', 'gender', 'league', 'elo_rating', 'interaction_count', 'is_stable']
sample = user_gdf[sample_cols].sample(20).to_pandas()
print("\nSample users with ELO ratings:")
print(sample)


Sample users with ELO ratings:
        user_id gender    league   elo_rating  interaction_count is_stable
110689   205596      M    Silver  1192.038696               25.0      True
125591   413047      M      None  1191.539429                7.0     False
39024    481600      M    Bronze  1174.808838               32.0      True
16626    174241      M   Diamond  1317.785156               18.0      True
108165  3258364      F    Silver  1196.538452              198.0      True
8183    1949030      M    Silver  1204.478027               24.0      True
76090    843113      M      None  1175.924683               30.0      True
138848  1079413      M      Gold  1209.495117               18.0      True
26302   3651988      M    Bronze  1090.957886              449.0      True
166461  2267616      M      None  1190.373779                8.0     False
162260   901203      M      Gold  1211.710083               19.0      True
55111     69971      M  Platinum  1226.246216               54.0    

## Analyze ELO Rating Distribution

In [12]:
# Plot ELO rating distribution
user_pd = user_gdf[['elo_rating', 'gender', 'interaction_count']].dropna().to_pandas()
# Restrict the plot to users with at least 100 interactions
user_pd = user_pd[user_pd['interaction_count'] >= 100]

fig = px.histogram(
    user_pd, 
    x='elo_rating', 
    color='gender',
    nbins=50,
    title='ELO Rating Distribution by Gender',
    labels={'elo_rating': 'ELO Rating'},
    barmode='overlay',
    opacity=0.7
)

fig.show()

In [11]:
# Box plot of ELO ratings by PageRank league
user_pd_league = user_gdf[['elo_rating', 'league', 'gender']].dropna().to_pandas()

fig = px.box(
    user_pd_league,
    x='league',
    y='elo_rating',
    color='gender',
    title='ELO Rating Distribution by PageRank League',
    category_orders={'league': ['Bronze', 'Silver', 'Gold', 'Platinum', 'Diamond']}
)
fig.show()