# Language Exchange Matchmaking System
## AI for Social Good - Course Project

This notebook demonstrates an intelligent matchmaking system for language exchange partners using:
- **LinUCB (Contextual Bandits)** - For learning user preferences
- **Bipartite Graph Matching** - For optimal pairing
- **Personalization** - Per-user learning
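
As a rough illustration of the first bullet, here is a minimal LinUCB sketch in plain Python. It is not the project's implementation: the class name `LinUCBArm` and its methods are hypothetical, and it assumes a single linear model with ridge regularization 1. The score is the estimated reward plus an exploration bonus scaled by `alpha`, so `alpha=0` means pure exploitation.

```python
import math

class LinUCBArm:
    """Hypothetical single-model LinUCB scorer (illustration only)."""

    def __init__(self, dim, alpha=1.0):
        self.alpha = alpha
        # Inverse of A = I + sum(x x^T), kept up to date incrementally
        self.A_inv = [[1.0 if i == j else 0.0 for j in range(dim)] for i in range(dim)]
        self.b = [0.0] * dim  # sum of reward * x

    @staticmethod
    def _matvec(M, v):
        return [sum(m * x for m, x in zip(row, v)) for row in M]

    def score(self, x):
        theta = self._matvec(self.A_inv, self.b)          # ridge estimate
        mean = sum(t * xi for t, xi in zip(theta, x))     # predicted reward
        Ax = self._matvec(self.A_inv, x)
        var = sum(xi * a for xi, a in zip(x, Ax))         # x^T A^-1 x
        return mean + self.alpha * math.sqrt(max(var, 0.0))

    def update(self, x, reward):
        # Sherman-Morrison rank-1 update of A^-1 after observing (x, reward)
        Ax = self._matvec(self.A_inv, x)
        denom = 1.0 + sum(xi * a for xi, a in zip(x, Ax))
        n = len(x)
        for i in range(n):
            for j in range(n):
                self.A_inv[i][j] -= Ax[i] * Ax[j] / denom
        self.b = [bi + reward * xi for bi, xi in zip(self.b, x)]


arm = LinUCBArm(dim=2, alpha=1.0)
arm.update([1.0, 0.0], reward=1.0)   # one accepted proposal along feature 0
seen = arm.score([1.0, 0.0])         # learned reward plus a shrinking bonus
unseen = arm.score([0.0, 1.0])       # no evidence yet: bonus only
```

With `alpha=0.0` the unseen direction would score 0, which is why exploration matters for pairs the system has never tried.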

---

## Table of Contents
1. **Setup** - Install dependencies and load the project
2. **Interactive Demo** - Register users, run matching, accept/reject
3. **Experiments** - Automated experiments comparing algorithms

---

# Part 1: Setup

## 1.1 Auto-setup (recommended)

This notebook is designed to run **without manual file uploads**.

**Best option:** host this project on GitHub, then set `REPO_URL` in the setup cell and run: **Runtime → Run all**.

**Alternative (no GitHub):** upload the project ZIP to Colab, unzip it, and rerun the setup cell.



In [None]:
# ===== Colab Auto-Setup =====
import os
import sys
import subprocess
from pathlib import Path

IN_COLAB = 'google.colab' in sys.modules

# Repository to clone (public). This is set automatically for the submitted project.
REPO_URL = 'https://github.com/shimona4321-collab/learning-languages-project.git'
BRANCH = 'main'
PROJECT_DIR = 'learning-languages-project'  # folder name after clone


def find_project_root():
    # Look for a folder that contains: requirements.txt AND app/__init__.py
    candidates = [Path('.'), Path(f'./{PROJECT_DIR}'), Path('./Learning_languages_project')]
    for c in candidates:
        if (c / 'requirements.txt').exists() and (c / 'app' / '__init__.py').exists():
            return c
    for c in Path('.').iterdir():
        if c.is_dir() and (c / 'requirements.txt').exists() and (c / 'app' / '__init__.py').exists():
            return c
    return None


root = find_project_root()

# If not found and we're in Colab, try to clone from GitHub
if root is None and IN_COLAB and REPO_URL:
    if not Path(PROJECT_DIR).exists():
        print(f'Cloning repo: {REPO_URL} -> {PROJECT_DIR}')
        subprocess.run(['git', 'clone', '--depth', '1', '-b', BRANCH, REPO_URL, PROJECT_DIR], check=True)
    root = find_project_root()

if root is None:
    msg = (
        'ERROR: Project folder not found.\n\n'
        'Fix:\n'
        '  1) Ensure REPO_URL points to your GitHub repo.\n'
        '  2) Rerun this cell.\n'
    )
    print(msg)
    raise RuntimeError('Project not found')

# Move into project root
os.chdir(root)
print('Working directory:', os.getcwd())

# Install dependencies
subprocess.run([sys.executable, '-m', 'pip', 'install', '-r', 'requirements.txt', '-q'], check=True)

# Import project
os.environ["LANGMATCH_VERBOSE"] = "0"  # silence internal per-round prints
import app as lm
from app.matching.matcher import set_matcher_verbose
set_matcher_verbose(False)
print('Imported project successfully!')
print('STATE_FILE:', lm.STATE_FILE)
print('SciPy available (Hungarian):', lm.SCIPY_AVAILABLE)


## 1.2 (Optional) How to share with the instructor

For the easiest grading experience, share **one Colab link** that runs everything.

Recommended workflow:
1. Upload this project to GitHub.
2. Open this notebook in Colab.
3. Set `REPO_URL` in the setup cell (above) and click **Runtime → Run all**.



In [None]:
# Quick sanity check: confirm project files exist
from pathlib import Path

assert Path('requirements.txt').exists(), 'requirements.txt missing'
assert Path('app').exists(), 'app/ folder missing'
assert Path('app/main.py').exists(), 'app/main.py missing'
assert Path('app/checking_langmatc.py').exists(), 'app/checking_langmatc.py missing'

print('Project structure looks good!')



In [None]:
# Optional: Mount Google Drive (only if you stored the project ZIP in Drive)
# from google.colab import drive
# drive.mount('/content/drive')
# %cd /content/drive/MyDrive
# !unzip -q Learning_languages_project_ready.zip
# %cd Learning_languages_project



## 1.3 Import Project Modules

If the setup cell ran successfully, the project is already imported as `lm` (alias for `app`).


In [None]:
# If you rerun this notebook from the middle, ensure imports are available
import random
import numpy as np
import matplotlib.pyplot as plt
import app as lm

print('Successfully imported app as lm!')
print('Algorithm parameters:')
print('  ALPHA_PERSONAL =', lm.ALPHA_PERSONAL)
print('  ALPHA_GLOBAL   =', lm.ALPHA_GLOBAL)
print('  PAIR_FEATURE_DIM =', lm.PAIR_FEATURE_DIM)
MODULE_LOADED = True


# Silence internal matching debug prints (keeps notebook output clean)
from app.matching.matcher import set_matcher_verbose
set_matcher_verbose(False)


---
# Part 2: Interactive Demo

This section allows you to interact with the system:
- Register users manually
- Run matching rounds
- View and respond to proposals
- Visualize the matching graph

## 2.0 Run the full interactive CLI (optional)

If you want the exact same interactive experience as running locally (Admin/User menus), run the next cell.
In Colab you can type your answers directly in the input prompts.


In [None]:
# OPTIONAL: Run the interactive CLI (Admin/User)
# By default this is disabled so 'Runtime -> Run all' won't block on input().
# Set RUN_INTERACTIVE=True and re-run this cell if you want to try it.

import sys
import subprocess

RUN_INTERACTIVE = False
if RUN_INTERACTIVE:
    subprocess.run([sys.executable, '-m', 'app.main'], check=True)
else:
    print('Interactive mode is OFF. Set RUN_INTERACTIVE=True and rerun this cell to start the CLI.')


## 2.1 Initialize System State

In [None]:
# Create fresh application state
state = lm.AppState()
state.users.clear()
state.user_bandits.clear()
state.user_bandits_recent.clear()
state.proposals.clear()
state.pair_cooldowns.clear()
state.round_index = 0
state.global_bandit = lm.create_bandit(lm.PAIR_FEATURE_DIM)

print("System initialized!")
print(f"Current users: {len(state.users)}")
print(f"Current proposals: {len(state.proposals)}")

## 2.2 Register Users

Run the cells below to register users. You can modify the parameters.

In [None]:
def register_user(user_id, native_lang, target_lang, target_level, availability, travel_interest, chess_interest, verbose: bool = False):
    """
    Register a new user in the system.
    
    Parameters:
    - user_id: Unique identifier (string)
    - native_lang: "Hebrew" or "English"
    - target_lang: "Hebrew" or "English" (opposite of native)
    - target_level: 0-10 (language proficiency goal)
    - availability: 0-10 (hours per week)
    - travel_interest: 0-10
    - chess_interest: 0-10
    """
    user = lm.User(
        user_id=user_id,
        native_language=native_lang,
        target_language=target_lang,
        target_level_raw=int(target_level),
        availability_raw=int(availability),
        topic_interest_raw={"Travel": int(travel_interest), "Chess": int(chess_interest)},
    )
    state.users[user_id] = user
    lm.ensure_user_bandits(state, user_id)
    if verbose:
        print(f"Registered: {user_id} (Native: {native_lang}, Target: {target_lang})")
        print(f"  Level: {target_level}, Availability: {availability}")
        print(f"  Interests - Travel: {travel_interest}, Chess: {chess_interest}")
    return user

In [None]:
# Example: Register Hebrew speakers (learning English)
register_user("david", "Hebrew", "English", target_level=7, availability=5, travel_interest=8, chess_interest=3)
register_user("sarah", "Hebrew", "English", target_level=5, availability=8, travel_interest=2, chess_interest=9)
register_user("yosef", "Hebrew", "English", target_level=6, availability=6, travel_interest=7, chess_interest=7);


In [None]:
# Example: Register English speakers (learning Hebrew)
_ = register_user("john", "English", "Hebrew", target_level=4, availability=6, travel_interest=9, chess_interest=2)
_ = register_user("emily", "English", "Hebrew", target_level=8, availability=7, travel_interest=3, chess_interest=8)
_ = register_user("mike", "English", "Hebrew", target_level=5, availability=5, travel_interest=6, chess_interest=6)
_ = register_user("lisa", "English", "Hebrew", target_level=6, availability=9, travel_interest=8, chess_interest=4);


In [None]:
# View all registered users
print(f"\n{'='*60}")
print(f"REGISTERED USERS: {len(state.users)}")
print(f"{'='*60}")

hebrew_users = [u for u in state.users.values() if u.native_language == "Hebrew"]
english_users = [u for u in state.users.values() if u.native_language == "English"]

print(f"\nHebrew speakers ({len(hebrew_users)}):")
for u in hebrew_users:
    print(f"  {u.user_id}: Level={u.target_level_raw}, Avail={u.availability_raw}, "
          f"Travel={u.topic_interest_raw.get('Travel',0)}, Chess={u.topic_interest_raw.get('Chess',0)}")

print(f"\nEnglish speakers ({len(english_users)}):")
for u in english_users:
    print(f"  {u.user_id}: Level={u.target_level_raw}, Avail={u.availability_raw}, "
          f"Travel={u.topic_interest_raw.get('Travel',0)}, Chess={u.topic_interest_raw.get('Chess',0)}")

## 2.3 Run Matching Round

The system will use LinUCB to find optimal matches.
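
To make "optimal matches" concrete, the sketch below picks the pairing that maximizes the total score over a small score matrix. It is an illustration under stated assumptions, not the project's matcher: `best_assignment` is a hypothetical helper that uses brute force over permutations, which only works for tiny inputs. The setup cell reports whether SciPy's Hungarian solver is available, and that would be the scalable choice.

```python
from itertools import permutations

def best_assignment(scores):
    """Return (total, pairs) maximizing the summed score.

    scores[i][j] is the score of pairing left user i with right user j.
    Assumes len(scores) <= len(scores[0]); brute force, tiny inputs only.
    """
    n_left, n_right = len(scores), len(scores[0])
    best_total, best_pairs = float("-inf"), []
    for perm in permutations(range(n_right), n_left):
        total = sum(scores[i][j] for i, j in enumerate(perm))
        if total > best_total:
            best_total, best_pairs = total, list(enumerate(perm))
    return best_total, best_pairs

# Two left users, three right users (made-up scores)
scores = [
    [0.90, 0.80, 0.10],
    [0.85, 0.20, 0.30],
]
total, pairs = best_assignment(scores)
print(pairs)  # [(0, 1), (1, 0)]
```

A greedy pass would give user 0 their top choice (column 0) and leave user 1 with column 2, for a total of 1.20; the joint optimum swaps them and reaches 1.65.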

In [None]:
# Run a matching round
print("Running matching round...")
print(f"Round #{state.round_index + 1}")
print("-" * 40)

lm.run_matching_round(state)

print(f"\nMatching complete!")
print(f"Active proposals: {len(state.proposals)}")

## 2.4 View Proposals (safe output)


In [None]:
# View active proposals (SAFE: limited output to avoid Colab freezing)
import itertools

MAX_SHOW = 20   # You can increase this if you want to inspect more proposals
VERBOSE = False # If True, prints extra per-user topic fields

print(f"\n{'='*60}")
print("ACTIVE PROPOSALS")
print(f"{'='*60}")

if not state.proposals:
    print("No active proposals.")
else:
    total = len(state.proposals)
    print(f"Active proposals count: {total}")
    shown = 0

    for prop_id, prop in itertools.islice(state.proposals.items(), MAX_SHOW):
        shown += 1
        u1 = state.users.get(prop.user1_id)
        u2 = state.users.get(prop.user2_id)

        # Always show a short one-line summary
        score = getattr(prop, 'score_at_offer', None)
        score_str = f"{score:.3f}" if isinstance(score, (int, float)) else "?"
        u1_native = getattr(u1, 'native_language', '?') if u1 else '?'
        u2_native = getattr(u2, 'native_language', '?') if u2 else '?'
        u1_status = 'Accepted' if getattr(prop, 'user1_accepted', False) else 'Pending'
        u2_status = 'Accepted' if getattr(prop, 'user2_accepted', False) else 'Pending'

        print(f"{shown:>3}. {prop.user1_id} <-> {prop.user2_id} | score={score_str} | {u1_native} ↔ {u2_native} | {u1_status}/{u2_status}")

        # Optional: show a few topic-interest fields (kept short to avoid huge output)
        if VERBOSE and u1 and u2:
            t1 = getattr(u1, 'topic_interest_raw', {}) or {}
            t2 = getattr(u2, 'topic_interest_raw', {}) or {}
            print(f"     {prop.user1_id} topics: Travel={t1.get('Travel',0)}, Chess={t1.get('Chess',0)}")
            print(f"     {prop.user2_id} topics: Travel={t2.get('Travel',0)}, Chess={t2.get('Chess',0)}")

    if total > shown:
        print(f"... displayed {shown}/{total}. Increase MAX_SHOW or set VERBOSE=True if needed.")


## 2.5 Respond to Proposals (Accept/Reject)

In [None]:
def respond_to_proposal(user_id, accept=True):
    """
    Respond to a proposal for a specific user.
    
    Parameters:
    - user_id: The user responding
    - accept: True to accept, False to reject
    """
    user = state.users.get(user_id)
    if not user:
        print(f"User '{user_id}' not found!")
        return
    
    # Find proposal for this user
    proposal = None
    for p in state.proposals.values():
        if p.user1_id == user_id or p.user2_id == user_id:
            proposal = p
            break
    
    if not proposal:
        print(f"No active proposal for '{user_id}'")
        return
    
    partner_id = proposal.user2_id if proposal.user1_id == user_id else proposal.user1_id
    action = "ACCEPTS" if accept else "REJECTS"
    print(f"{user_id} {action} proposal with {partner_id}")
    
    lm.handle_proposal_response(state, user, accepted=accept)
    
    # Check if match was formed
    if user.status == "matched":
        print(f"  -> MATCH FORMED! {user_id} is now matched with {user.current_partner_id}")

In [None]:
# Example: David accepts his proposal
respond_to_proposal("david", accept=True)

In [None]:
# Example: David's proposed partner also accepts
# Find who was proposed to david
for p in state.proposals.values():
    if p.user1_id == "david" or p.user2_id == "david":
        partner = p.user2_id if p.user1_id == "david" else p.user1_id
        respond_to_proposal(partner, accept=True)
        break

In [None]:
# Example: Sarah rejects her proposal
respond_to_proposal("sarah", accept=False)

## 2.6 View System Status

In [None]:
# View current status of all users
print(f"\n{'='*60}")
print("SYSTEM STATUS")
print(f"{'='*60}")
print(f"Round: {state.round_index}")
print(f"Total users: {len(state.users)}")
print(f"Active proposals: {len(state.proposals)}")

matched = [u for u in state.users.values() if u.status == "matched"]
idle = [u for u in state.users.values() if u.status == "idle"]
pending = [u for u in state.users.values() if u.status == "pending"]

print(f"\nUser Status:")
print(f"  Matched: {len(matched)}")
for u in matched:
    print(f"    - {u.user_id} <-> {u.current_partner_id}")
print(f"  Pending: {len(pending)}")
for u in pending:
    print(f"    - {u.user_id}")
print(f"  Idle: {len(idle)}")
for u in idle:
    print(f"    - {u.user_id}")

## 2.7 Visualize Matching Graph

In [None]:
# 2.7 Visualize Matching Graph (safe for Colab 'Run all')

import matplotlib.pyplot as plt

try:
    import networkx as nx
except ImportError:
    # NetworkX is usually available in Colab, but install it if missing
    !pip -q install networkx
    import networkx as nx

# Toggle visualization. OFF by default so 'Run all' stays fast; set to True to draw
# the graph (the automatic safety limits below still apply).
VISUALIZE_MATCHING_GRAPH = False
MAX_PER_SIDE = 15        # if there are more users, we show only a sample (first N per side)
MAX_LABELS = 30          # if more than this many nodes, we skip drawing labels to avoid freezing
SAVE_GRAPH_PNG = True

def visualize_matching_safe():
    """Visualize current proposals/matches as a bipartite graph (safe for large graphs).

    - If the graph is large, we automatically show only a sampled subset to keep Colab responsive.
    - Labels are skipped when there are too many nodes (label rendering can freeze notebooks).
    """
    hebrew_users = [u.user_id for u in state.users.values() if u.native_language == 'Hebrew']
    english_users = [u.user_id for u in state.users.values() if u.native_language == 'English']

    nL, nR = len(hebrew_users), len(english_users)
    n_props = len(state.proposals)
    n_matches = sum(1 for u in state.users.values() if u.status == 'matched' and u.current_partner_id)

    if not VISUALIZE_MATCHING_GRAPH:
        print('Skipping graph visualization (VISUALIZE_MATCHING_GRAPH=False).')
        print(f'Hebrew={nL}, English={nR}, proposals={n_props}, matches={n_matches}')
        return

    # Auto-limit for performance
    if nL > MAX_PER_SIDE or nR > MAX_PER_SIDE:
        print('Graph is large; showing a sampled view to keep Colab responsive.')
        print(f'Hebrew={nL}, English={nR}, proposals={n_props}, matches={n_matches}')
        hebrew_users_s = hebrew_users[:MAX_PER_SIDE]
        english_users_s = english_users[:MAX_PER_SIDE]
    else:
        hebrew_users_s = hebrew_users
        english_users_s = english_users

    Lset, Rset = set(hebrew_users_s), set(english_users_s)

    G = nx.Graph()
    for uid in hebrew_users_s:
        G.add_node(uid, bipartite=0)
    for uid in english_users_s:
        G.add_node(uid, bipartite=1)

    # Edge colors keyed by undirected edge
    edge_color = {}

    # Pending proposals (orange)
    for prop in state.proposals.values():
        a, b = prop.user1_id, prop.user2_id
        if a in Lset and b in Rset:
            G.add_edge(a, b)
            edge_color[tuple(sorted((a, b)))] = 'orange'
        elif b in Lset and a in Rset:
            G.add_edge(b, a)
            edge_color[tuple(sorted((a, b)))] = 'orange'

    # Confirmed matches (green)
    for u in state.users.values():
        if u.status == 'matched' and u.current_partner_id:
            a, b = u.user_id, u.current_partner_id
            if a in Lset and b in Rset:
                G.add_edge(a, b)
                edge_color[tuple(sorted((a, b)))] = 'green'
            elif b in Lset and a in Rset:
                G.add_edge(b, a)
                edge_color[tuple(sorted((a, b)))] = 'green'

    # Positions with normalized vertical spacing
    def vertical_positions(n):
        if n <= 1:
            return [0.5]
        return [1.0 - i / (n - 1) for i in range(n)]

    pos = {}
    for uid, y in zip(hebrew_users_s, vertical_positions(len(hebrew_users_s))):
        pos[uid] = (0.0, y)
    for uid, y in zip(english_users_s, vertical_positions(len(english_users_s))):
        pos[uid] = (2.0, y)

    fig_h = max(4.0, 0.35 * (max(len(hebrew_users_s), len(english_users_s)) + 2))
    plt.figure(figsize=(10, fig_h))

    nx.draw_networkx_nodes(G, pos, nodelist=hebrew_users_s, node_size=1200, label='Hebrew')
    nx.draw_networkx_nodes(G, pos, nodelist=english_users_s, node_size=1200, label='English')

    ecolors = [edge_color.get(tuple(sorted(e)), 'gray') for e in G.edges()]
    nx.draw_networkx_edges(G, pos, edge_color=ecolors, width=2)

    if len(G.nodes()) <= MAX_LABELS:
        nx.draw_networkx_labels(G, pos, font_size=9)
    else:
        print('Skipping node labels (too many nodes).')

    plt.title('Language Exchange Matching Graph (green=matched, orange=pending)')
    plt.axis('off')
    plt.tight_layout()
    if SAVE_GRAPH_PNG:
        plt.savefig('matching_graph.png', dpi=150)
        print('Saved matching_graph.png')
    plt.show()

visualize_matching_safe()


## 2.8 Bulk Generate Random Users (for testing)

In [None]:
def generate_random_users(n_hebrew=10, n_english=10, seed=42):
    """Generate random users for testing."""
    rng = random.Random(seed)
    
    print(f"Generating {n_hebrew} Hebrew users and {n_english} English users...")
    
    for i in range(1, n_hebrew + 1):
        uid = f"he_user_{i:03d}"
        register_user(
            uid, "Hebrew", "English",
            target_level=rng.randint(1, 10),
            availability=rng.randint(1, 10),
            travel_interest=rng.randint(0, 10),
            chess_interest=rng.randint(0, 10)
        )
    
    for i in range(1, n_english + 1):
        uid = f"en_user_{i:03d}"
        register_user(
            uid, "English", "Hebrew",
            target_level=rng.randint(1, 10),
            availability=rng.randint(1, 10),
            travel_interest=rng.randint(0, 10),
            chess_interest=rng.randint(0, 10)
        )
    
    print(f"\nTotal users: {len(state.users)}")

# Uncomment to generate random users:
# generate_random_users(n_hebrew=5, n_english=5, seed=123)

## 2.9 Run Multiple Rounds (Simulation)

In [None]:
def run_simulation(n_rounds=5, auto_accept_prob=0.7):
    """
    Run multiple matching rounds with simulated responses.
    
    Parameters:
    - n_rounds: Number of rounds to run
    - auto_accept_prob: Probability that a user accepts their proposal
    """
    print(f"Running {n_rounds} rounds simulation...")
    print(f"Accept probability: {auto_accept_prob}")
    print("="*50)
    
    for round_num in range(n_rounds):
        print(f"\n--- Round {round_num + 1} ---")
        
        # Reset matched users to idle for new round
        for u in state.users.values():
            if u.status == "matched":
                u.status = "idle"
                u.current_partner_id = None
        state.proposals.clear()
        
        # Run matching
        lm.run_matching_round(state)
        
        # Auto-respond to proposals
        n_proposals = len(state.proposals)  # count before responses resolve them
        matches_formed = 0
        for prop in list(state.proposals.values()):
            u1 = state.users.get(prop.user1_id)
            u2 = state.users.get(prop.user2_id)
            
            if u1 and u2:
                accept1 = random.random() < auto_accept_prob
                accept2 = random.random() < auto_accept_prob
                
                lm.handle_proposal_response(state, u1, accepted=accept1)
                lm.handle_proposal_response(state, u2, accepted=accept2)
                
                if accept1 and accept2:
                    matches_formed += 1
        
        print(f"  Proposals: {n_proposals}, Matches formed: {matches_formed}")
    
    print("\n" + "="*50)
    print("Simulation complete!")

# Uncomment to run simulation:
# run_simulation(n_rounds=5, auto_accept_prob=0.7)

---
# Part 3: Experiments

Automated experiments comparing algorithm performance.

## Helper Functions

In [None]:
# Common functions for experiments
CHESS_THRESHOLD = 5

def fresh_state():
    """Create a fresh application state."""
    st = lm.AppState()
    st.users.clear()
    st.user_bandits.clear()
    st.user_bandits_recent.clear()
    st.proposals.clear()
    st.pair_cooldowns.clear()
    st.round_index = 0
    st.global_bandit = lm.create_bandit(lm.PAIR_FEATURE_DIM)
    return st

def add_user_exp(st, user_id, native, target, level, avail, travel, chess):
    """Add a user to the state."""
    u = lm.User(
        user_id=user_id,
        native_language=native,
        target_language=target,
        target_level_raw=int(level),
        availability_raw=int(avail),
        topic_interest_raw={"Travel": int(travel), "Chess": int(chess)},
    )
    st.users[user_id] = u
    lm.ensure_user_bandits(st, user_id)
    return u

def hebrew_decision(partner):
    return int(partner.topic_interest_raw.get("Chess", 0)) > CHESS_THRESHOLD

print("Helper functions defined.")

## Experiment 1: Random vs Bandit-based Matching
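
The plots in this part report a Preference Alignment Score (PAS). As computed in the cells below, it is the mean of the proposed partners' interest in the topic the Hebrew speaker cares about, rescaled from 0-10 to 0-1. A minimal sketch of that calculation follows; the helper name `preference_alignment_score` is hypothetical and does not exist in the project.

```python
def preference_alignment_score(partner_interests, scale=10.0):
    """Mean partner interest rescaled to [0, 1]; 0.0 if there are no proposals."""
    if not partner_interests:
        return 0.0
    return sum(v / scale for v in partner_interests) / len(partner_interests)

# Three proposed partners with Chess interest 10, 5 and 0
example_pas = preference_alignment_score([10, 5, 0])
print(example_pas)  # 0.5
```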

In [None]:
# Experiment 1 Parameters
SEED = 444
N_HEBREW = 10
N_ENGLISH = 100
ROUNDS = 50

random.seed(SEED)
np.random.seed(SEED)
rng = random.Random(SEED)

# Generate users
hebrew_specs = [(f"he_{i:04d}", "Hebrew", "English",
                 rng.randint(0, 10), rng.randint(0, 10),
                 rng.randint(0, 10), rng.randint(0, 10)) for i in range(1, N_HEBREW + 1)]

english_specs = [(f"en_{i:04d}", "English", "Hebrew",
                  rng.randint(0, 10), rng.randint(0, 10),
                  rng.randint(0, 10), rng.randint(0, 10)) for i in range(1, N_ENGLISH + 1)]

print(f"Experiment 1: Random vs Bandit Matching")
print(f"Hebrew users: {N_HEBREW}, English users: {N_ENGLISH}, Rounds: {ROUNDS}")

In [None]:
# Run Random Matching
print("Running Random Matching...")

state_random = fresh_state()
hebrew_ids = [s[0] for s in hebrew_specs]
english_ids = [s[0] for s in english_specs]

for spec in hebrew_specs:
    add_user_exp(state_random, *spec)
for spec in english_specs:
    add_user_exp(state_random, *spec)

pas_random = []
for t in range(ROUNDS):
    shuffled = list(english_ids)
    random.shuffle(shuffled)
    chess_scores = [state_random.users[shuffled[i]].topic_interest_raw.get("Chess", 0) / 10.0 
                    for i in range(min(len(hebrew_ids), len(shuffled)))]
    pas_random.append(np.mean(chess_scores))

print(f"Random Matching - Mean PAS: {np.mean(pas_random):.4f}")

In [None]:
# Run Bandit-based Matching
print("Running Bandit-based Matching...")

state_bandit = fresh_state()
for spec in hebrew_specs:
    add_user_exp(state_bandit, *spec)
for spec in english_specs:
    add_user_exp(state_bandit, *spec)

pas_bandit = []
for t in range(ROUNDS):
    state_bandit.proposals.clear()
    for u in state_bandit.users.values():
        if u.status == "matched":
            u.status = "idle"
            u.current_partner_id = None
    
    lm.run_matching_round(state_bandit)
    
    chess_scores = []
    for he_id in hebrew_ids:
        for p in state_bandit.proposals.values():
            if p.user1_id == he_id or p.user2_id == he_id:
                en_id = p.user2_id if p.user1_id == he_id else p.user1_id
                partner = state_bandit.users.get(en_id)
                if partner:
                    chess_scores.append(partner.topic_interest_raw.get("Chess", 0) / 10.0)
                    he_accept = hebrew_decision(partner)
                    lm.handle_proposal_response(state_bandit, state_bandit.users[he_id], accepted=he_accept)
                    lm.handle_proposal_response(state_bandit, partner, accepted=True)
                break
    
    pas_bandit.append(np.mean(chess_scores) if chess_scores else 0)

print(f"Bandit Matching - Mean PAS: {np.mean(pas_bandit):.4f}")

In [None]:
# Plot Experiment 1
plt.figure(figsize=(10, 6))
plt.plot(range(1, ROUNDS + 1), pas_random, label='Random Matching', alpha=0.8, linestyle='--', marker='o', markersize=3)
plt.plot(range(1, ROUNDS + 1), pas_bandit, label='Bandit-based Matching', alpha=0.8, linestyle='-', marker='s', markersize=3)
plt.xlabel('Round')
plt.ylabel('Preference Alignment Score (PAS)')
plt.title('Experiment 1: Random vs Bandit-based Matching')
plt.legend()
plt.grid(True, alpha=0.3)
plt.savefig('exp1_results.png', dpi=150)
plt.show()

print(f"\nExperiment 1 Summary:")
print(f"  Random: {np.mean(pas_random):.4f}")
print(f"  Bandit: {np.mean(pas_bandit):.4f}")
print(f"  Improvement: {((np.mean(pas_bandit)-np.mean(pas_random))/np.mean(pas_random)*100):.1f}%")

## Experiment 2: Exploration ON vs OFF
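
The experiment cell notes that the `ALPHA_*` values must be patched in two modules. The sketch below shows why, using two stand-in modules built with `types.ModuleType`; the names `config` and `scoring` are placeholders, not the project's modules. A `from module import NAME` statement copies the binding at import time, so later reassigning the name on the source module does not reach the importer.

```python
import types

# Stand-in for the module that defines the constant
config = types.ModuleType("config")
config.ALPHA = 1.0

# Stand-in for a module that ran `from config import ALPHA` at import time
scoring = types.ModuleType("scoring")
scoring.ALPHA = config.ALPHA

# Patching only the source module leaves the importer's copy unchanged
config.ALPHA = 0.0
stale_value = scoring.ALPHA   # still 1.0

# Patching both modules is what actually changes the behavior
scoring.ALPHA = 0.0
patched_value = scoring.ALPHA  # now 0.0
```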

In [None]:
# Experiment 2: Effect of Exploration (alpha) in LinUCB
# -------------------------------------------------
# IMPORTANT:
# In the codebase, the scoring module imports ALPHA_* values at import-time.
# Therefore, to truly toggle exploration you must patch BOTH:
#   1) app.ALPHA_PERSONAL / app.ALPHA_GLOBAL
#   2) app.matching.scoring.ALPHA_PERSONAL / ALPHA_GLOBAL

from app.matching import scoring as scoring_mod

SEED2 = 42
ROUNDS2 = 60
POP_SIZE_PER_SIDE = 12
OVERLAP_THRESHOLD = 10  # minimum summed topic overlap for a pair to accept

# Pure exploitation vs. exploration
ALPHA_OFF = 0.0
ALPHA_ON  = 1.0

# The acceptance rule below depends on user interests, so exploration should help
# discover better pairings over time.

def run_novelty_exp(alpha_val: float, name: str):
    # Save current values so they can be restored afterwards
    orig_lm_p, orig_lm_g = lm.ALPHA_PERSONAL, lm.ALPHA_GLOBAL
    orig_sc_p, orig_sc_g = scoring_mod.ALPHA_PERSONAL, scoring_mod.ALPHA_GLOBAL

    try:
        # Patch exploration strengths in both modules
        lm.ALPHA_PERSONAL = lm.ALPHA_GLOBAL = float(alpha_val)
        scoring_mod.ALPHA_PERSONAL = scoring_mod.ALPHA_GLOBAL = float(alpha_val)

        # Build the same seeded population for both runs with the experiment helpers
        random.seed(SEED2)
        np.random.seed(SEED2)
        rng2 = random.Random(SEED2)
        st = fresh_state()
        for i in range(1, POP_SIZE_PER_SIDE + 1):
            add_user_exp(st, f"he2_{i:04d}", "Hebrew", "English",
                         rng2.randint(0, 10), rng2.randint(0, 10),
                         rng2.randint(0, 10), rng2.randint(0, 10))
        for i in range(1, POP_SIZE_PER_SIDE + 1):
            add_user_exp(st, f"en2_{i:04d}", "English", "Hebrew",
                         rng2.randint(0, 10), rng2.randint(0, 10),
                         rng2.randint(0, 10), rng2.randint(0, 10))

        seen_pairs = set()  # pairs matched in any earlier round
        novelty_rates = []
        for r in range(ROUNDS2):
            # Start a new proposal cycle (same reset as Experiments 1 and 3)
            st.proposals.clear()
            for u in st.users.values():
                if u.status == "matched":
                    u.status, u.current_partner_id = "idle", None

            lm.run_matching_round(st)

            # Deterministic responses: both users accept if their topic overlap
            # is strong enough (this generates learning signal for the bandit)
            for p in list(st.proposals.values()):
                u1 = st.users[p.user1_id]
                u2 = st.users[p.user2_id]
                topics = set(u1.topic_interest_raw) | set(u2.topic_interest_raw)
                overlap = sum(min(u1.topic_interest_raw.get(t, 0),
                                  u2.topic_interest_raw.get(t, 0)) for t in topics)
                accept = overlap >= OVERLAP_THRESHOLD
                lm.handle_proposal_response(st, u1, accepted=accept)
                lm.handle_proposal_response(st, u2, accepted=accept)

            # Novelty rate = fraction of this round's matched pairs never matched before
            matched_pairs = {
                tuple(sorted((uid, u.current_partner_id)))
                for uid, u in st.users.items()
                if u.status == "matched" and u.current_partner_id is not None
            }
            novel = len(matched_pairs - seen_pairs)
            novelty_rates.append(novel / max(1, len(matched_pairs)))
            seen_pairs |= matched_pairs

        print(f"{name}: Mean novelty = {np.mean(novelty_rates):.4f}")
        return novelty_rates

    finally:
        # Restore values
        lm.ALPHA_PERSONAL, lm.ALPHA_GLOBAL = orig_lm_p, orig_lm_g
        scoring_mod.ALPHA_PERSONAL, scoring_mod.ALPHA_GLOBAL = orig_sc_p, orig_sc_g

# Names match the plotting and summary cells below
novelty_off = run_novelty_exp(ALPHA_OFF, 'Exploration OFF (alpha=0)')
novelty_on  = run_novelty_exp(ALPHA_ON,  'Exploration ON (alpha=1)')


In [None]:
plt.figure(figsize=(10, 6))
plt.plot(range(1, ROUNDS2 + 1), novelty_off, label='Exploration OFF', alpha=0.7, linestyle='--', marker='o', markersize=3)
plt.plot(range(1, ROUNDS2 + 1), novelty_on, label='Exploration ON', alpha=0.7, linestyle='-', marker='s', markersize=3)
plt.xlabel('Round')
plt.ylabel('Novelty Rate')
plt.title('Experiment 2: Exploration ON vs OFF')
plt.legend()
plt.grid(True, alpha=0.3)
plt.savefig('exp2_results.png', dpi=150)
plt.show()

## Experiment 3: Personalization ON vs OFF

In [None]:
print("Running Experiment 3: Personalization ON vs OFF")
SEED3 = 222
ROUNDS3 = 50
TRAVEL_THRESHOLD = 6

random.seed(SEED3)
rng3 = random.Random(SEED3)

chess_pref = [(f"heC_{i:04d}", "Hebrew", "English",
               rng3.randint(0,10), rng3.randint(0,10),
               rng3.randint(0,10), rng3.randint(0,10)) for i in range(1, 6)]
travel_pref = [(f"heT_{i:04d}", "Hebrew", "English",
                rng3.randint(0,10), rng3.randint(0,10),
                rng3.randint(0,10), rng3.randint(0,10)) for i in range(1, 6)]
en_specs_3 = [(f"en3_{i:04d}", "English", "Hebrew",
               rng3.randint(0,10), rng3.randint(0,10),
               rng3.randint(0,10), rng3.randint(0,10)) for i in range(1, 101)]

chess_ids = [s[0] for s in chess_pref]
travel_ids = [s[0] for s in travel_pref]
all_he_3 = chess_ids + travel_ids

def decision3(he_id, partner):
    if he_id in chess_ids:
        return partner.topic_interest_raw.get("Chess", 0) > CHESS_THRESHOLD
    return partner.topic_interest_raw.get("Travel", 0) > TRAVEL_THRESHOLD

def run_personal_exp(personal_on, name):
    st = fresh_state()
    if not personal_on:
        st.w_personal, st.w_global = 0.0, 1.0

    for spec in chess_pref + travel_pref:
        add_user_exp(st, *spec)
    for spec in en_specs_3:
        add_user_exp(st, *spec)
    
    pas = []
    for t in range(ROUNDS3):
        st.proposals.clear()
        for u in st.users.values():
            if u.status == "matched": u.status, u.current_partner_id = "idle", None
        lm.run_matching_round(st)
        
        zs = []
        for he_id in all_he_3:
            for p in st.proposals.values():
                if p.user1_id == he_id or p.user2_id == he_id:
                    en_id = p.user2_id if p.user1_id == he_id else p.user1_id
                    partner = st.users.get(en_id)
                    if partner:
                        if he_id in chess_ids:
                            zs.append(partner.topic_interest_raw.get("Chess", 0) / 10.0)
                        else:
                            zs.append(partner.topic_interest_raw.get("Travel", 0) / 10.0)
                        lm.handle_proposal_response(st, st.users[he_id], accepted=decision3(he_id, partner))
                        lm.handle_proposal_response(st, partner, accepted=True)
                    break
        pas.append(np.mean(zs) if zs else 0)
    
    print(f"{name}: Mean PAS = {np.mean(pas):.4f}")
    return pas

pas_off = run_personal_exp(False, "Personalization OFF")
pas_on = run_personal_exp(True, "Personalization ON")

In [None]:
plt.figure(figsize=(10, 6))
plt.plot(range(1, ROUNDS3 + 1), pas_off, label='Personalization OFF', alpha=0.7)
plt.plot(range(1, ROUNDS3 + 1), pas_on, label='Personalization ON', alpha=0.7)
plt.xlabel('Round')
plt.ylabel('Preference Alignment Score (PAS)')
plt.title('Experiment 3: Personalization ON vs OFF')
plt.legend()
plt.grid(True, alpha=0.3)
plt.savefig('exp3_results.png', dpi=150)
plt.show()

---
# Summary

In [None]:
print("="*60)
print("SUMMARY OF ALL EXPERIMENTS")
print("="*60)

print("\nExperiment 1: Random vs Bandit-based Matching")
print(f"  Random: {np.mean(pas_random):.4f}")
print(f"  Bandit: {np.mean(pas_bandit):.4f}")
print(f"  -> Bandit improves by {((np.mean(pas_bandit)-np.mean(pas_random))/np.mean(pas_random)*100):.1f}%")

print("\nExperiment 2: Exploration ON vs OFF")
print(f"  OFF: {np.mean(novelty_off):.4f}")
print(f"  ON:  {np.mean(novelty_on):.4f}")
print(f"  -> Novelty change with exploration: {np.mean(novelty_on) - np.mean(novelty_off):+.4f}")

print("\nExperiment 3: Personalization ON vs OFF")
print(f"  OFF: {np.mean(pas_off):.4f}")
print(f"  ON:  {np.mean(pas_on):.4f}")
print(f"  -> PAS change with personalization: {np.mean(pas_on) - np.mean(pas_off):+.4f}")

print("\n" + "="*60)
print("CONCLUSION")
print("="*60)
print("A positive change above means the bandit, exploration, or personalization")
print("beat its baseline in this run (single seed, no significance test).")

---

**Course:** AI for Social Good (#55982)  
**Institution:** Hebrew University - Business School  
**Semester:** A 2025-2026