# Language Exchange Matchmaking System
## AI for Social Good - Course Project

This notebook demonstrates an intelligent matchmaking system for language exchange partners using:
- **LinUCB (Contextual Bandits)** - For learning user preferences
- **Bipartite Graph Matching** - For optimal pairing
- **Personalization** - Per-user learning
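
To make the first bullet concrete, here is a minimal LinUCB sketch in pure Python. It is an illustration only, not the project's implementation: the class name `TinyLinUCB` and its methods are hypothetical, and the project's own bandit (created via `lm.create_bandit`) may differ in details. The score is the estimated reward plus an exploration bonus scaled by `alpha`.

```python
import math

class TinyLinUCB:
    """Illustrative LinUCB model: ridge regression plus an upper-confidence bonus."""

    def __init__(self, dim, alpha=1.0):
        self.dim = dim
        self.alpha = alpha
        # A_inv starts as the identity, i.e. the inverse of the ridge term A = I.
        self.A_inv = [[1.0 if i == j else 0.0 for j in range(dim)] for i in range(dim)]
        self.b = [0.0] * dim

    def _matvec(self, x):
        """Multiply A_inv by the vector x."""
        return [sum(row[j] * x[j] for j in range(self.dim)) for row in self.A_inv]

    def score(self, x):
        """Estimated reward for features x plus alpha * sqrt(x^T A^-1 x)."""
        a_inv_x = self._matvec(x)
        theta = self._matvec(self.b)  # theta = A^-1 b
        mean = sum(t * xi for t, xi in zip(theta, x))
        variance = sum(xi * v for xi, v in zip(x, a_inv_x))
        return mean + self.alpha * math.sqrt(max(variance, 0.0))

    def update(self, x, reward):
        """Fold in one observation (A += x x^T, b += reward * x) via Sherman-Morrison."""
        a_inv_x = self._matvec(x)
        denom = 1.0 + sum(xi * v for xi, v in zip(x, a_inv_x))
        for i in range(self.dim):
            for j in range(self.dim):
                self.A_inv[i][j] -= a_inv_x[i] * a_inv_x[j] / denom
        self.b = [bi + reward * xi for bi, xi in zip(self.b, x)]
```

A larger `alpha` inflates the bonus for feature directions the model has rarely seen, which is the knob Experiment 2 varies.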

---

## Table of Contents
1. **Setup** - Clone the repository and install dependencies
2. **Interactive Demo** - Register users, run matching, accept/reject
3. **Experiments** - Automated experiments comparing algorithms

---

# Part 1: Setup

## 1.1 Auto-setup (recommended)

This notebook is designed to run **without manual file uploads**.

**Best option:** host this project on GitHub, then set `REPO_URL` in the setup cell and run: **Runtime → Run all**.

**Alternative (no GitHub):** upload the project ZIP to Colab, unzip it, and rerun the setup cell.



In [None]:
# ===== Colab Auto-Setup =====
# This cell makes the notebook runnable with a single "Runtime → Run all".

import os, sys, subprocess

REPO_URL = 'https://github.com/shimona4321-collab/learning-languages-project.git'
PROJECT_DIR = 'learning-languages-project'  # folder name after clone

# 1) Clone (only if not already present)
if not os.path.exists(PROJECT_DIR):
    subprocess.run(['git', 'clone', '--depth', '1', REPO_URL, PROJECT_DIR], check=True)
else:
    print('Repo folder already exists, skipping clone.')

# 2) Enter repo root
os.chdir(PROJECT_DIR)
print('Working directory:', os.getcwd())

# 3) Install Python deps
subprocess.run([sys.executable, '-m', 'pip', 'install', '-r', 'requirements.txt', '-q'], check=True)

print('Setup complete.')


## 1.2 (Optional) How to share with the instructor

For the easiest grading experience, share **one Colab link** that runs everything.

Recommended workflow:
1. Upload this project to GitHub.
2. Open this notebook in Colab.
3. Set `REPO_URL` in the setup cell (above) and click **Runtime → Run all**.



In [None]:
# Quick sanity check: confirm project files exist
from pathlib import Path

assert Path('requirements.txt').exists(), 'requirements.txt missing'
assert Path('app').exists(), 'app/ folder missing'
assert Path('app/main.py').exists(), 'app/main.py missing'
assert Path('app/checking_langmatc.py').exists(), 'app/checking_langmatc.py missing'

print('Project structure looks good!')



In [None]:
# Optional: Mount Google Drive (only if you stored the project ZIP in Drive)
# from google.colab import drive
# drive.mount('/content/drive')
# %cd /content/drive/MyDrive
# !unzip -q Learning_languages_project_ready.zip
# %cd Learning_languages_project



## 1.3 Import Project Modules

If the setup cell ran successfully, the next cell imports the project as `lm` (an alias for the `app` package).


In [None]:
import os, sys, random, io, contextlib
import numpy as np

# (This notebook assumes it's being run from the repo root in Colab after setup.)
# If you run locally, make sure your working directory is the project root.

# Silence heavy internal prints by wrapping calls (Colab has no terminal-like TTY)
QUIET_MODE = True
def qcall(fn, *args, **kwargs):
    if not QUIET_MODE:
        return fn(*args, **kwargs)
    buf = io.StringIO()
    with contextlib.redirect_stdout(buf), contextlib.redirect_stderr(buf):
        return fn(*args, **kwargs)

import app as lm
print("Imported project successfully! (QUIET_MODE =", QUIET_MODE, ")")


---
# Part 2: Interactive Demo

This section allows you to interact with the system:
- Register users manually
- Run matching rounds
- View and respond to proposals
- Visualize the matching graph

## 2.0 Run the full interactive CLI (optional)

If you want the exact same interactive experience as running locally (Admin/User menus), run the next cell.
In Colab you can type your answers directly in the input prompts.


In [None]:
# Optional: Interactive CLI demo (OFF by default so "Run all" won't hang)
#
# If you want to try the full CLI inside Colab:
# 1) Set RUN_INTERACTIVE = True
# 2) Rerun this cell
# 3) When prompted, type your choices in the input box that appears.

RUN_INTERACTIVE = False  # keep False for a non-interactive "Run all"; set to True to start the CLI

if RUN_INTERACTIVE:
    import importlib

    # Safety: if an old variable named `app_main` exists from a previous run,
    # it might point to a function (not a module) and cause AttributeError.
    if "app_main" in globals():
        del globals()["app_main"]

    # Force-load the *module* `app.main` (not a function alias).
    app_main_mod = importlib.import_module("app.main")
    importlib.reload(app_main_mod)  # helpful if you updated code and want the latest version

    try:
        # Run the CLI in-process (NOT via subprocess), so input() works in Colab.
        app_main_mod.main()
    except SystemExit:
        # app_main may call sys.exit(); catch it so the kernel doesn't complain.
        print("Interactive CLI exited.")
    except KeyboardInterrupt:
        print("Stopped interactive CLI.")
else:
    print('Interactive CLI is OFF (recommended for "Run all").')
    print('To start it, set RUN_INTERACTIVE=True and rerun this cell.')


## 2.1 Initialize System State

In [None]:
# Create fresh application state
state = lm.AppState()
state.users.clear()
state.user_bandits.clear()
state.user_bandits_recent.clear()
state.proposals.clear()
state.pair_cooldowns.clear()
state.round_index = 0
state.global_bandit = lm.create_bandit(lm.PAIR_FEATURE_DIM)

print("System initialized!")
print(f"Current users: {len(state.users)}")
print(f"Current proposals: {len(state.proposals)}")

## 2.2 Register Users

Run the cells below to register users. You can modify the parameters.

In [None]:
import random
from dataclasses import asdict

DEMO_VERBOSE = False  # set True if you want to print every registered user

def register_demo_users(state, seed=123):
    """Register a tiny deterministic set of users for the demo."""
    random.seed(seed)

    # Example users
    demo_users = [
        ("john",  "English", "Hebrew", 4, 6, {"Travel": 9, "Chess": 2}),
        ("emily", "English", "Hebrew", 8, 7, {"Travel": 3, "Chess": 8}),
        ("mike",  "English", "Hebrew", 5, 5, {"Travel": 6, "Chess": 6}),
        ("lisa",  "English", "Hebrew", 6, 9, {"Travel": 8, "Chess": 4}),
        ("david", "Hebrew",  "English", 7, 8, {"Travel": 4, "Chess": 7}),
    ]

    for user_id, native, target, level, avail, topics in demo_users:
        u = lm.User(
            user_id=user_id,
            native_language=native,
            target_language=target,
            target_level_raw=level,
            availability_raw=avail,
            topic_interest_raw=topics,
        )
        state.users[user_id] = u
        if DEMO_VERBOSE:
            print(f"Registered: {user_id} (Native: {native}, Target: {target})")
            print(f"  Level: {level}, Availability: {avail}")
            print("  Interests - " + ", ".join(f"{k}: {v}" for k, v in topics.items()))

    return demo_users

print(f"Demo helpers ready (DEMO_VERBOSE={DEMO_VERBOSE}).")


In [None]:
# Example: Register Hebrew speakers (learning English)
# Each tuple: (user_id, target_level, availability, travel_interest, chess_interest)
for uid, level, avail, travel, chess in [("david", 7, 5, 8, 3), ("sarah", 5, 8, 2, 9), ("yosef", 6, 6, 7, 7)]:
    state.users[uid] = lm.User(user_id=uid, native_language="Hebrew", target_language="English",
                               target_level_raw=level, availability_raw=avail,
                               topic_interest_raw={"Travel": travel, "Chess": chess})
    lm.ensure_user_bandits(state, uid)


In [None]:
# Example: Register English speakers (learning Hebrew)
# Each tuple: (user_id, target_level, availability, travel_interest, chess_interest)
for uid, level, avail, travel, chess in [("john", 4, 6, 9, 2), ("emily", 8, 7, 3, 8),
                                         ("mike", 5, 5, 6, 6), ("lisa", 6, 9, 8, 4)]:
    state.users[uid] = lm.User(user_id=uid, native_language="English", target_language="Hebrew",
                               target_level_raw=level, availability_raw=avail,
                               topic_interest_raw={"Travel": travel, "Chess": chess})
    lm.ensure_user_bandits(state, uid)


In [None]:
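# --- Demo population setup (quiet by default) ---
# If you want to see the full list of generated users, set this to True.
SHOW_REGISTERED_USERS = False
DEMO_SEED = 123  # assumed value (not defined elsewhere in this notebook); any fixed integer works

print("Creating demo population...")
# fresh_state() and add_user_exp() are only defined in Part 3, so this cell adds users
# directly to the existing `state`. Users registered above are kept, which lets the
# "david"/"sarah" examples in section 2.5 find their users.

random.seed(DEMO_SEED)
rng = random.Random(DEMO_SEED)

# Add a few native Hebrew speakers and native English speakers
he_specs = [(f"he_{i:02d}", "Hebrew", "English",
             rng.randint(0,10), rng.randint(0,10),
             rng.randint(0,10), rng.randint(0,10)) for i in range(1, 6)]
en_specs = [(f"en_{i:02d}", "English", "Hebrew",
             rng.randint(0,10), rng.randint(0,10),
             rng.randint(0,10), rng.randint(0,10)) for i in range(1, 6)]

for user_id, native, target, level, avail, travel, chess in he_specs + en_specs:
    state.users[user_id] = lm.User(
        user_id=user_id,
        native_language=native,
        target_language=target,
        target_level_raw=level,
        availability_raw=avail,
        topic_interest_raw={"Travel": travel, "Chess": chess},
    )
    lm.ensure_user_bandits(state, user_id)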

print("Demo users created.")

if SHOW_REGISTERED_USERS:
    print("\n--- Registered users (verbose) ---")
    for u in state.users.values():
        print(u)


## 2.3 Run Matching Round

The system scores candidate pairs with LinUCB, then selects pairings via bipartite graph matching.
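
As a rough illustration of the pairing step, the sketch below picks the one-to-one assignment with the highest total score by exhaustive search. This is an assumption-laden toy: `best_assignment` is a hypothetical helper, the scores are made up, and the project's `lm.run_matching_round` presumably uses a more efficient matching algorithm than brute force.

```python
from itertools import permutations

def best_assignment(weights):
    """Return (total, pairs) maximizing the summed weight of one-to-one pairings.

    weights[i][j] is the score for pairing left user i with right user j.
    Assumes len(weights) <= len(weights[0]). Exhaustive, so only for tiny inputs.
    """
    n_left, n_right = len(weights), len(weights[0])
    best_total, best_pairs = float("-inf"), []
    for cols in permutations(range(n_right), n_left):
        total = sum(weights[i][j] for i, j in enumerate(cols))
        if total > best_total:
            best_total, best_pairs = total, list(enumerate(cols))
    return best_total, best_pairs

# Made-up scores: 2 Hebrew speakers (rows) x 3 English speakers (columns).
scores = [
    [0.9, 0.8, 0.1],
    [0.8, 0.2, 0.3],
]
total, pairs = best_assignment(scores)
print(pairs)  # [(0, 1), (1, 0)]
```

Note that a greedy choice would take the single best edge (0.9) first and end with a total of 1.2, whereas the optimal assignment reaches 1.6. This is why matching is solved jointly over all pairs.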

In [None]:
print("Running one matching round (demo)...")
qcall(lm.run_matching_round, state)

# Minimal summary:
print(f"Proposals generated this round: {len(state.proposals)}")
print("Done.")


## 2.4 View Proposals (safe output)


In [None]:
# --- Optional: print proposals (verbose) ---
# Keeping this OFF makes "Run all" much cleaner.
SHOW_PROPOSALS = False

if SHOW_PROPOSALS:
    if not state.proposals:
        print("No proposals currently.")
    else:
        print("Current proposals:")
        for p in state.proposals.values():
            print(p)
else:
    print('Skipping proposal printout (set SHOW_PROPOSALS=True to display).')


## 2.5 Respond to Proposals (Accept/Reject)

In [None]:
def respond_to_proposal(user_id, accept=True):
    """
    Respond to a proposal for a specific user.
    
    Parameters:
    - user_id: The user responding
    - accept: True to accept, False to reject
    """
    user = state.users.get(user_id)
    if not user:
        print(f"User '{user_id}' not found!")
        return
    
    # Find proposal for this user
    proposal = None
    for p in state.proposals.values():
        if p.user1_id == user_id or p.user2_id == user_id:
            proposal = p
            break
    
    if not proposal:
        print(f"No active proposal for '{user_id}'")
        return
    
    partner_id = proposal.user2_id if proposal.user1_id == user_id else proposal.user1_id
    action = "ACCEPTS" if accept else "REJECTS"
    print(f"{user_id} {action} proposal with {partner_id}")
    
    qcall(lm.handle_proposal_response, state, user, accepted=accept)
    
    # Check if match was formed
    if user.status == "matched":
        print(f"  -> MATCH FORMED! {user_id} is now matched with {user.current_partner_id}")

In [None]:
# Example: David accepts his proposal
respond_to_proposal("david", accept=True)

In [None]:
# Example: David's proposed partner also accepts
# Find who was proposed to david
for p in state.proposals.values():
    if p.user1_id == "david" or p.user2_id == "david":
        partner = p.user2_id if p.user1_id == "david" else p.user1_id
        respond_to_proposal(partner, accept=True)
        break

In [None]:
# Example: Sarah rejects her proposal
respond_to_proposal("sarah", accept=False)

## 2.6 View System Status

In [None]:
# --- Optional: print final user status (verbose) ---
SHOW_FINAL_STATUS = False

if SHOW_FINAL_STATUS:
    print("Final user status:")
    for u in state.users.values():
        print(u.user_id, u.status, u.current_partner_id)
else:
    print('Skipping final-status printout (set SHOW_FINAL_STATUS=True to display).')


## 2.7 Visualize Matching Graph

In [None]:
# 2.7 Visualize Matching Graph (safe for Colab 'Run all')

import matplotlib.pyplot as plt

try:
    import networkx as nx
except Exception:
    # NetworkX is usually available in Colab, but install if missing
    !pip -q install networkx
    import networkx as nx

# Toggle visualization. Keep it ON by default, but with automatic safety limits.
VISUALIZE_MATCHING_GRAPH = True
MAX_PER_SIDE = 15        # if there are more users, we show only a sample (first N per side)
MAX_LABELS = 30          # if more than this many nodes, we skip drawing labels to avoid freezing
SAVE_GRAPH_PNG = True

def visualize_matching_safe():
    """Visualize current proposals/matches as a bipartite graph (safe for large graphs).

    - If the graph is large, we automatically show only a sampled subset to keep Colab responsive.
    - Labels are skipped when there are too many nodes (label rendering can freeze notebooks).
    """
    hebrew_users = [u.user_id for u in state.users.values() if u.native_language == 'Hebrew']
    english_users = [u.user_id for u in state.users.values() if u.native_language == 'English']

    nL, nR = len(hebrew_users), len(english_users)
    n_props = len(state.proposals)
    n_matches = sum(1 for u in state.users.values() if u.status == 'matched' and u.current_partner_id)

    if not VISUALIZE_MATCHING_GRAPH:
        print('Skipping graph visualization (VISUALIZE_MATCHING_GRAPH=False).')
        print(f'Hebrew={nL}, English={nR}, proposals={n_props}, matches={n_matches}')
        return

    # Auto-limit for performance
    if nL > MAX_PER_SIDE or nR > MAX_PER_SIDE:
        print('Graph is large; showing a sampled view to keep Colab responsive.')
        print(f'Hebrew={nL}, English={nR}, proposals={n_props}, matches={n_matches}')
        hebrew_users_s = hebrew_users[:MAX_PER_SIDE]
        english_users_s = english_users[:MAX_PER_SIDE]
    else:
        hebrew_users_s = hebrew_users
        english_users_s = english_users

    Lset, Rset = set(hebrew_users_s), set(english_users_s)

    G = nx.Graph()
    for uid in hebrew_users_s:
        G.add_node(uid, bipartite=0)
    for uid in english_users_s:
        G.add_node(uid, bipartite=1)

    # Edge colors keyed by undirected edge
    edge_color = {}

    # Pending proposals (orange)
    for prop in state.proposals.values():
        a, b = prop.user1_id, prop.user2_id
        if a in Lset and b in Rset:
            G.add_edge(a, b)
            edge_color[tuple(sorted((a, b)))] = 'orange'
        elif b in Lset and a in Rset:
            G.add_edge(b, a)
            edge_color[tuple(sorted((a, b)))] = 'orange'

    # Confirmed matches (green)
    for u in state.users.values():
        if u.status == 'matched' and u.current_partner_id:
            a, b = u.user_id, u.current_partner_id
            if a in Lset and b in Rset:
                G.add_edge(a, b)
                edge_color[tuple(sorted((a, b)))] = 'green'
            elif b in Lset and a in Rset:
                G.add_edge(b, a)
                edge_color[tuple(sorted((a, b)))] = 'green'

    # Positions with normalized vertical spacing
    def vertical_positions(n):
        if n <= 1:
            return [0.5]
        return [1.0 - i / (n - 1) for i in range(n)]

    pos = {}
    for uid, y in zip(hebrew_users_s, vertical_positions(len(hebrew_users_s))):
        pos[uid] = (0.0, y)
    for uid, y in zip(english_users_s, vertical_positions(len(english_users_s))):
        pos[uid] = (2.0, y)

    fig_h = max(4.0, 0.35 * (max(len(hebrew_users_s), len(english_users_s)) + 2))
    plt.figure(figsize=(10, fig_h))

    nx.draw_networkx_nodes(G, pos, nodelist=hebrew_users_s, node_size=1200, label='Hebrew')
    nx.draw_networkx_nodes(G, pos, nodelist=english_users_s, node_size=1200, label='English')

    ecolors = [edge_color.get(tuple(sorted(e)), 'gray') for e in G.edges()]
    nx.draw_networkx_edges(G, pos, edge_color=ecolors, width=2)

    if len(G.nodes()) <= MAX_LABELS:
        nx.draw_networkx_labels(G, pos, font_size=9)
    else:
        print('Skipping node labels (too many nodes).')

    plt.title('Language Exchange Matching Graph (green=matched, orange=pending)')
    plt.axis('off')
    plt.tight_layout()
    if SAVE_GRAPH_PNG:
        plt.savefig('matching_graph.png', dpi=150)
        print('Saved matching_graph.png')
    plt.show()

visualize_matching_safe()


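## 2.8 Visualize Edge Weights (optional, for debugging)

In [None]:
# --- Optional: visualize the weighted matching graph (can be slow / noisy) ---
# Note: this reassigns the VISUALIZE_MATCHING_GRAPH toggle from 2.7, so calling
# visualize_matching_safe() again after this cell will skip drawing.
VISUALIZE_MATCHING_GRAPH = False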

if VISUALIZE_MATCHING_GRAPH:
    # Rebuild the weighted bipartite graph used for matching and visualize it.
    # Note: This is mainly for debugging/illustration.
    import networkx as nx
    import matplotlib.pyplot as plt

    # Build graph
    G = nx.Graph()
    left = [u.user_id for u in state.users.values() if u.native_language == "Hebrew"]
    right = [u.user_id for u in state.users.values() if u.native_language == "English"]
    G.add_nodes_from(left, bipartite=0)
    G.add_nodes_from(right, bipartite=1)

    # Add edges based on the current matching scores
    for he_id in left:
        for en_id in right:
            w = float(lm.compute_edge_weight(state, he_id, en_id))
            G.add_edge(he_id, en_id, weight=w)

    # Draw (simple layout)
    pos = {}
    for i, n in enumerate(left):
        pos[n] = (0, i)
    for i, n in enumerate(right):
        pos[n] = (1, i)

    plt.figure(figsize=(10, 6))
    nx.draw(G, pos, with_labels=True, node_size=1200, font_size=9)
    edge_labels = {(u, v): f"{d['weight']:.2f}" for u, v, d in G.edges(data=True)}
    nx.draw_networkx_edge_labels(G, pos, edge_labels=edge_labels, font_size=7)
    plt.title("Demo Matching Graph (edge weights)")
    plt.show()
else:
    print('Skipping graph visualization (set VISUALIZE_MATCHING_GRAPH=True to run).')


## 2.9 Run Multiple Rounds (Simulation)

In [None]:
def run_simulation(n_rounds=5, auto_accept_prob=0.7):
    """
    Run multiple matching rounds with simulated responses.
    
    Parameters:
    - n_rounds: Number of rounds to run
    - auto_accept_prob: Probability that a user accepts their proposal
    """
    print(f"Running {n_rounds} rounds simulation...")
    print(f"Accept probability: {auto_accept_prob}")
    print("="*50)
    
    for round_num in range(n_rounds):
        print(f"\n--- Round {round_num + 1} ---")
        
        # Reset matched users to idle for new round
        for u in state.users.values():
            if u.status == "matched":
                u.status = "idle"
                u.current_partner_id = None
        state.proposals.clear()
        
        # Run matching
        qcall(lm.run_matching_round, state)
        
        # Auto-respond to proposals
        n_proposals = len(state.proposals)  # count before responding, since responses may remove proposals
        matches_formed = 0
        for prop in list(state.proposals.values()):
            u1 = state.users.get(prop.user1_id)
            u2 = state.users.get(prop.user2_id)

            if u1 and u2:
                accept1 = random.random() < auto_accept_prob
                accept2 = random.random() < auto_accept_prob

                qcall(lm.handle_proposal_response, state, u1, accepted=accept1)
                qcall(lm.handle_proposal_response, state, u2, accepted=accept2)

                if accept1 and accept2:
                    matches_formed += 1

        print(f"  Proposals: {n_proposals}, Matches formed: {matches_formed}")
    
    print("\n" + "="*50)
    print("Simulation complete!")

# Uncomment to run simulation:
# run_simulation(n_rounds=5, auto_accept_prob=0.7)

---
# Part 3: Experiments

Automated experiments comparing algorithm performance.
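
The experiments report a Preference Alignment Score (PAS). As computed in the experiment cells, the PAS of a round is the mean raw topic interest (0 to 10) of the proposed partners, divided by 10. The helper below is a small standalone sketch of that calculation; the function name `preference_alignment_score` is hypothetical and is not part of the project.

```python
from statistics import mean

def preference_alignment_score(partner_interests, scale=10.0):
    """Mean of the proposed partners' raw topic interest, rescaled to [0, 1]."""
    if not partner_interests:
        return 0.0  # same fallback the experiment cells use when a round has no proposals
    return mean(value / scale for value in partner_interests)

# Three proposed partners with Chess interest 8, 2, and 5 (made-up values).
example_pas = preference_alignment_score([8, 2, 5])
```

A PAS near 1.0 means the proposed partners score high on the topic the simulated users care about; uniformly random partners with interests drawn from 0 to 10 would average around 0.5.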

## Helper Functions

In [None]:
# Common functions for experiments
CHESS_THRESHOLD = 5

def fresh_state():
    """Create a fresh application state."""
    st = lm.AppState()
    st.users.clear()
    st.user_bandits.clear()
    st.user_bandits_recent.clear()
    st.proposals.clear()
    st.pair_cooldowns.clear()
    st.round_index = 0
    st.global_bandit = lm.create_bandit(lm.PAIR_FEATURE_DIM)
    return st

def add_user_exp(st, user_id, native, target, level, avail, travel, chess):
    """Add a user to the state."""
    u = lm.User(
        user_id=user_id,
        native_language=native,
        target_language=target,
        target_level_raw=int(level),
        availability_raw=int(avail),
        topic_interest_raw={"Travel": int(travel), "Chess": int(chess)},
    )
    st.users[user_id] = u
    lm.ensure_user_bandits(st, user_id)
    return u

def hebrew_decision(partner):
    return int(partner.topic_interest_raw.get("Chess", 0)) > CHESS_THRESHOLD

print("Helper functions defined.")

## Experiment 1: Random vs Bandit-based Matching

In [None]:
# Experiment 1 Parameters
SEED = 444
N_HEBREW = 10
N_ENGLISH = 100
ROUNDS = 50

random.seed(SEED)
np.random.seed(SEED)
rng = random.Random(SEED)

# Generate users
hebrew_specs = [(f"he_{i:04d}", "Hebrew", "English",
                 rng.randint(0, 10), rng.randint(0, 10),
                 rng.randint(0, 10), rng.randint(0, 10)) for i in range(1, N_HEBREW + 1)]

english_specs = [(f"en_{i:04d}", "English", "Hebrew",
                  rng.randint(0, 10), rng.randint(0, 10),
                  rng.randint(0, 10), rng.randint(0, 10)) for i in range(1, N_ENGLISH + 1)]

print(f"Experiment 1: Random vs Bandit Matching")
print(f"Hebrew users: {N_HEBREW}, English users: {N_ENGLISH}, Rounds: {ROUNDS}")

In [None]:
# Run Random Matching
print("Running Random Matching...")

state_random = fresh_state()
hebrew_ids = [s[0] for s in hebrew_specs]
english_ids = [s[0] for s in english_specs]

for spec in hebrew_specs:
    add_user_exp(state_random, *spec)
for spec in english_specs:
    add_user_exp(state_random, *spec)

pas_random = []
for t in range(ROUNDS):
    shuffled = list(english_ids)
    random.shuffle(shuffled)
    chess_scores = [state_random.users[shuffled[i]].topic_interest_raw.get("Chess", 0) / 10.0 
                    for i in range(min(len(hebrew_ids), len(shuffled)))]
    pas_random.append(np.mean(chess_scores))

print(f"Random Matching - Mean PAS: {np.mean(pas_random):.4f}")

In [None]:
# Run Bandit-based Matching
print("Running Bandit-based Matching...")

state_bandit = fresh_state()
for spec in hebrew_specs:
    add_user_exp(state_bandit, *spec)
for spec in english_specs:
    add_user_exp(state_bandit, *spec)

pas_bandit = []
for t in range(ROUNDS):
    state_bandit.proposals.clear()
    for u in state_bandit.users.values():
        if u.status == "matched":
            u.status = "idle"
            u.current_partner_id = None
    
    qcall(lm.run_matching_round, state_bandit)
    
    chess_scores = []
    for he_id in hebrew_ids:
        for p in state_bandit.proposals.values():
            if p.user1_id == he_id or p.user2_id == he_id:
                en_id = p.user2_id if p.user1_id == he_id else p.user1_id
                partner = state_bandit.users.get(en_id)
                if partner:
                    chess_scores.append(partner.topic_interest_raw.get("Chess", 0) / 10.0)
                    he_accept = hebrew_decision(partner)
                    qcall(lm.handle_proposal_response, state_bandit, state_bandit.users[he_id], accepted=he_accept)
                    qcall(lm.handle_proposal_response, state_bandit, partner, accepted=True)
                break
    
    pas_bandit.append(np.mean(chess_scores) if chess_scores else 0)

print(f"Bandit Matching - Mean PAS: {np.mean(pas_bandit):.4f}")

In [None]:
# Plot Experiment 1
plt.figure(figsize=(10, 6))
plt.plot(range(1, ROUNDS + 1), pas_random, label='Random Matching', alpha=0.8, linestyle='--', marker='o', markersize=3)
plt.plot(range(1, ROUNDS + 1), pas_bandit, label='Bandit-based Matching', alpha=0.8, linestyle='-', marker='s', markersize=3)
plt.xlabel('Round')
plt.ylabel('Preference Alignment Score (PAS)')
plt.title('Experiment 1: Random vs Bandit-based Matching')
plt.legend()
plt.grid(True, alpha=0.3)
plt.savefig('exp1_results.png', dpi=150)
plt.show()

print(f"\nExperiment 1 Summary:")
print(f"  Random: {np.mean(pas_random):.4f}")
print(f"  Bandit: {np.mean(pas_bandit):.4f}")
print(f"  Improvement: {((np.mean(pas_bandit)-np.mean(pas_random))/np.mean(pas_random)*100):.1f}%")

## Experiment 2: Exploration ON vs OFF
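
This experiment tracks a novelty rate: the fraction of a round's proposals in which a user is offered a partner they have not been offered before. The sketch below restates that bookkeeping as a standalone function; `novelty_rate` is a hypothetical helper name and the user IDs are made up.

```python
def novelty_rate(round_pairs, seen):
    """Fraction of (user, partner) proposals where the partner is new to that user.

    `seen` maps each user to the set of partners already proposed; it is updated in place.
    """
    novel = 0
    for user_id, partner_id in round_pairs:
        partners = seen.setdefault(user_id, set())
        if partner_id not in partners:
            novel += 1
            partners.add(partner_id)
    return novel / len(round_pairs) if round_pairs else 0.0

seen = {}
first = novelty_rate([("he_1", "en_1"), ("he_2", "en_2")], seen)   # every pairing is new
second = novelty_rate([("he_1", "en_1"), ("he_2", "en_3")], seen)  # only he_2's partner is new
print(first, second)  # 1.0 0.5
```

Because the `seen` sets only grow, the rate tends to fall over rounds unless the matcher keeps proposing unfamiliar partners.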

In [None]:
# ============================
# Experiment 2: Exploration ON vs OFF (novelty)
# ============================

print("Running Experiment 2: Exploration ON vs OFF (Novelty)")
SEED2 = 111
ROUNDS2 = 50
POP_SIZE_PER_SIDE = 40

import random, numpy as np, io, contextlib, importlib

# Make this cell deterministic
random.seed(SEED2)
rng2 = random.Random(SEED2)
np.random.seed(SEED2)

# Build synthetic users (bigger pool on EN side to make exploration visible)
he_specs_2 = [
    (f"he2_{i:04d}", "Hebrew", "English",
     rng2.randint(0,10), rng2.randint(0,10),
     rng2.randint(0,10), rng2.randint(0,10))
    for i in range(1, 1 + POP_SIZE_PER_SIDE)
]
en_specs_2 = [
    (f"en2_{i:04d}", "English", "Hebrew",
     rng2.randint(0,10), rng2.randint(0,10),
     rng2.randint(0,10), rng2.randint(0,10))
    for i in range(1, 1 + 5*POP_SIZE_PER_SIDE)
]

# Alpha values:
# - OFF: tiny exploration (almost greedy, but avoids degenerate zeros)
# - ON:  strong exploration
ALPHA_OFF = 0.001
ALPHA_ON  = 5.0

def run_novelty_exp(alpha_val, name):
    # Set alpha in the module actually used by the matching code
    scoring_mod = importlib.import_module("app.matching.scoring")
    orig_p, orig_g = scoring_mod.ALPHA_PERSONAL, scoring_mod.ALPHA_GLOBAL

    try:
        scoring_mod.ALPHA_PERSONAL = float(alpha_val)
        scoring_mod.ALPHA_GLOBAL  = float(alpha_val)

        # Also set any re-exports (harmless if they exist)
        if hasattr(lm, "ALPHA_PERSONAL"):
            lm.ALPHA_PERSONAL = float(alpha_val)
        if hasattr(lm, "ALPHA_GLOBAL"):
            lm.ALPHA_GLOBAL = float(alpha_val)

        st = fresh_state()

        # Add users quietly
        for spec in he_specs_2:
            qcall(add_user_exp, st, *spec)
        for spec in en_specs_2:
            qcall(add_user_exp, st, *spec)

        he_ids = [s[0] for s in he_specs_2]
        seen = {h: set() for h in he_ids}
        rates = []

        for _ in range(ROUNDS2):
            st.proposals.clear()
            for u in st.users.values():
                if u.status == "matched":
                    u.status, u.current_partner_id = "idle", None

            # Run a round quietly
            qcall(lm.run_matching_round, st)

            novel, total = 0, 0
            # For each Hebrew user, look at the proposal (if any) and record novelty
            for he_id in he_ids:
                # find proposal involving this he user
                prop = None
                for p in st.proposals.values():
                    if p.user1_id == he_id or p.user2_id == he_id:
                        prop = p
                        break
                if prop is None:
                    continue

                en_id = prop.user2_id if prop.user1_id == he_id else prop.user1_id
                total += 1
                if en_id not in seen[he_id]:
                    novel += 1
                    seen[he_id].add(en_id)

                # Simulate responses to produce bandit feedback
                partner = st.users.get(en_id)
                if partner is not None:
                    qcall(lm.handle_proposal_response, st, st.users[he_id], accepted=hebrew_decision(partner))
                    qcall(lm.handle_proposal_response, st, partner, accepted=True)

            rates.append(novel/total if total > 0 else 0.0)

        print(f"{name}: Mean Novelty = {float(np.mean(rates)):.4f}")
        return rates

    finally:
        # Restore alpha
        scoring_mod.ALPHA_PERSONAL, scoring_mod.ALPHA_GLOBAL = orig_p, orig_g
        if hasattr(lm, "ALPHA_PERSONAL"):
            lm.ALPHA_PERSONAL = orig_p
        if hasattr(lm, "ALPHA_GLOBAL"):
            lm.ALPHA_GLOBAL = orig_g

novelty_off = run_novelty_exp(ALPHA_OFF, f"Exploration OFF (alpha={ALPHA_OFF})")
novelty_on  = run_novelty_exp(ALPHA_ON,  f"Exploration ON  (alpha={ALPHA_ON})")

# Plot
plt.figure()
plt.plot(novelty_off, label=f"OFF (alpha={ALPHA_OFF})")
plt.plot(novelty_on, label=f"ON  (alpha={ALPHA_ON})")
plt.xlabel("Round")
plt.ylabel("Novelty rate")
plt.title("Experiment 2: Novelty with/without exploration")
plt.legend()
plt.grid(True)
plt.show()


In [None]:
plt.figure(figsize=(10, 6))
plt.plot(range(1, ROUNDS2 + 1), novelty_off, label='Exploration OFF', alpha=0.7, linestyle='--', marker='o', markersize=3)
plt.plot(range(1, ROUNDS2 + 1), novelty_on, label='Exploration ON', alpha=0.7, linestyle='-', marker='s', markersize=3)
plt.xlabel('Round')
plt.ylabel('Novelty Rate')
plt.title('Experiment 2: Exploration ON vs OFF')
plt.legend()
plt.grid(True, alpha=0.3)
plt.savefig('exp2_results.png', dpi=150)
plt.show()

## Experiment 3: Personalization ON vs OFF
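
The OFF arm of this experiment sets `st.w_personal, st.w_global = 0.0, 1.0`. One plausible reading, which is an assumption and not confirmed by the project code shown in this notebook, is that each candidate's score is a weighted sum of a per-user score and a shared global score. The toy below shows how such weights could change a ranking; `blended_score`, `rank`, and all numbers are hypothetical.

```python
def blended_score(personal, global_, w_personal, w_global):
    """Weighted sum of a per-user score and a shared (global) score."""
    return w_personal * personal + w_global * global_

# Hypothetical (personal, global) scores for two candidate partners.
candidates = {"en_a": (0.9, 0.4), "en_b": (0.2, 0.7)}

def rank(w_personal, w_global):
    """Order candidates from best to worst under the given weights."""
    return sorted(
        candidates,
        key=lambda c: blended_score(*candidates[c], w_personal, w_global),
        reverse=True,
    )

ranking_off = rank(0.0, 1.0)  # personalization OFF: only the global score counts
ranking_on = rank(0.5, 0.5)   # personalization ON: both scores contribute
print(ranking_off, ranking_on)  # ['en_b', 'en_a'] ['en_a', 'en_b']
```

With the personal weight at zero, a partner this particular user likes (`en_a`) loses to one that is popular on average (`en_b`), which is the effect the experiment is designed to expose.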

In [None]:
print("Running Experiment 3: Personalization ON vs OFF")
SEED3 = 222
ROUNDS3 = 50
TRAVEL_THRESHOLD = 6

random.seed(SEED3)
rng3 = random.Random(SEED3)

chess_pref = [(f"heC_{i:04d}", "Hebrew", "English",
               rng3.randint(0,10), rng3.randint(0,10),
               rng3.randint(0,10), rng3.randint(0,10)) for i in range(1, 6)]
travel_pref = [(f"heT_{i:04d}", "Hebrew", "English",
                rng3.randint(0,10), rng3.randint(0,10),
                rng3.randint(0,10), rng3.randint(0,10)) for i in range(1, 6)]
en_specs_3 = [(f"en3_{i:04d}", "English", "Hebrew",
               rng3.randint(0,10), rng3.randint(0,10),
               rng3.randint(0,10), rng3.randint(0,10)) for i in range(1, 101)]

chess_ids = [s[0] for s in chess_pref]
travel_ids = [s[0] for s in travel_pref]
all_he_3 = chess_ids + travel_ids

def decision3(he_id, partner):
    if he_id in chess_ids:
        return partner.topic_interest_raw.get("Chess", 0) > CHESS_THRESHOLD
    return partner.topic_interest_raw.get("Travel", 0) > TRAVEL_THRESHOLD

def run_personal_exp(personal_on, name):
    st = fresh_state()
    if not personal_on: st.w_personal, st.w_global = 0.0, 1.0
    
    for spec in chess_pref + travel_pref: add_user_exp(st, *spec)
    for spec in en_specs_3: add_user_exp(st, *spec)
    
    pas = []
    for t in range(ROUNDS3):
        st.proposals.clear()
        for u in st.users.values():
            if u.status == "matched": u.status, u.current_partner_id = "idle", None
        qcall(lm.run_matching_round, st)
        
        zs = []
        for he_id in all_he_3:
            for p in st.proposals.values():
                if p.user1_id == he_id or p.user2_id == he_id:
                    en_id = p.user2_id if p.user1_id == he_id else p.user1_id
                    partner = st.users.get(en_id)
                    if partner:
                        if he_id in chess_ids:
                            zs.append(partner.topic_interest_raw.get("Chess", 0) / 10.0)
                        else:
                            zs.append(partner.topic_interest_raw.get("Travel", 0) / 10.0)
                        qcall(lm.handle_proposal_response, st, st.users[he_id], accepted=decision3(he_id, partner))
                        qcall(lm.handle_proposal_response, st, partner, accepted=True)
                    break
        pas.append(np.mean(zs) if zs else 0)
    
    print(f"{name}: Mean PAS = {np.mean(pas):.4f}")
    return pas

pas_off = run_personal_exp(False, "Personalization OFF")
pas_on = run_personal_exp(True, "Personalization ON")

In [None]:
plt.figure(figsize=(10, 6))
plt.plot(range(1, ROUNDS3 + 1), pas_off, label='Personalization OFF', alpha=0.7)
plt.plot(range(1, ROUNDS3 + 1), pas_on, label='Personalization ON', alpha=0.7)
plt.xlabel('Round')
plt.ylabel('Preference Alignment Score (PAS)')
plt.title('Experiment 3: Personalization ON vs OFF')
plt.legend()
plt.grid(True, alpha=0.3)
plt.savefig('exp3_results.png', dpi=150)
plt.show()

---
# Summary

In [None]:
print("="*60)
print("SUMMARY OF ALL EXPERIMENTS")
print("="*60)

print("\nExperiment 1: Random vs Bandit-based Matching")
print(f"  Random: {np.mean(pas_random):.4f}")
print(f"  Bandit: {np.mean(pas_bandit):.4f}")
print(f"  -> Bandit improves by {((np.mean(pas_bandit)-np.mean(pas_random))/np.mean(pas_random)*100):.1f}%")

print("\nExperiment 2: Exploration ON vs OFF")
print(f"  OFF: {np.mean(novelty_off):.4f}")
print(f"  ON:  {np.mean(novelty_on):.4f}")
print("  -> Exploration " + ("increases" if np.mean(novelty_on) > np.mean(novelty_off) else "does not increase") + " novelty in this run")

print("\nExperiment 3: Personalization ON vs OFF")
print(f"  OFF: {np.mean(pas_off):.4f}")
print(f"  ON:  {np.mean(pas_on):.4f}")
print("  -> Personalization " + ("improves" if np.mean(pas_on) > np.mean(pas_off) else "does not improve") + " alignment in this run")

print("\n" + "="*60)
print("CONCLUSION")
print("="*60)
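if np.mean(pas_bandit) > np.mean(pas_random):
    print("In this run, the LinUCB-based matching system achieved a higher mean PAS")
    print("than random matching (single seed; no significance test was run).")
else:
    print("In this run, the LinUCB-based matching system did not beat random matching on mean PAS.")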

---

**Course:** AI for Social Good (#55982)  
**Institution:** Hebrew University - Business School  
**Semester:** A 2025-2026