# Film Recommender: LO1–LO3 **complete** (labels, time split, ablation, edge weights)
*Generated: 2025-09-12T16:00:05 UTC*

This notebook extends your project with:
- ✅ **Real labels** from the Letterboxd export (ratings/watchlist)
- ✅ **Time-based split** (train on older interactions, test on newer ones)
- ✅ **Edge-weighted GNN** (GATv2 with `edge_attr` from `comp_*`/`cos`/`final`)
- ✅ **Ablation & baselines**: LO1 (KGE-only), LO1+LO2 (`final`), LO3 (GNN), **ensemble with λ sweep**
- ✅ **Exports**: top-K lists & results table

> Place this notebook in `letterboxd-KG/gnn/`. All paths are relative to that folder.


In [1]:
# === Configuration & seeds ===
# Paths match the recorded output of this cell (relative to letterboxd-KG/gnn/).
CSV_PATH = "../data/kg/rerank_by_logical_rules.csv"
LETTERBOXD_DIR = "../data/letterboxd_export"
OUTPUT_DIR = "../data/kg/outputs"

# Split date: None -> automatically use the 80% quantile of the positive interaction timestamps
SPLIT_DATE = None

# Reproducibility
SEED = 42
TOPK = 10
NEG_PER_POS = 3
LAMBDA_SWEEP = [0.2, 0.4, 0.6, 0.8]  # ensemble weight of the LO3 (GNN) score

import os, re, random, numpy as np, pandas as pd
from pathlib import Path

def set_all_seeds(seed=42):
    random.seed(seed)
    np.random.seed(seed)
    try:
        import torch
        torch.manual_seed(seed)
        torch.cuda.manual_seed_all(seed)
        torch.backends.cudnn.deterministic = True
        torch.backends.cudnn.benchmark = False
    except Exception:
        pass

set_all_seeds(SEED)
os.makedirs(OUTPUT_DIR, exist_ok=True)

print("CSV_PATH:", CSV_PATH)
print("LETTERBOXD_DIR:", LETTERBOXD_DIR)
print("OUTPUT_DIR:", OUTPUT_DIR)

CSV_PATH: ../data/kg/rerank_by_logical_rules.csv
LETTERBOXD_DIR: ../data/letterboxd_export
OUTPUT_DIR: ../data/kg/outputs


## (Optional) Installation
Run this cell locally if PyTorch/PyG are missing.


In [2]:
# !pip install --upgrade pip
# !pip install torch --index-url https://download.pytorch.org/whl/cpu
# !pip install torch-geometric torch-scatter torch-sparse torch-cluster torch-spline-conv -f https://data.pyg.org/whl/torch-$(python -c "import torch;print(torch.__version__.split('+')[0])").html


## 1) Load & prepare the rerank CSV

In [3]:
assert Path(CSV_PATH).exists(), f"CSV not found: {CSV_PATH}"
df = pd.read_csv(CSV_PATH)
print(df.shape)
print(df.columns.tolist())
df.head(3)

(200, 25)
['candidate_id', 'candidate_title', 'year', 'cos', 'meta', 'final', 'seed', 'comp_genres', 'comp_keywords', 'comp_cast', 'comp_director', 'comp_runtime', 'comp_language', 'comp_popularity', 'comp_vote', 'name_norm', 'year_str', 'genre_list', 'director_list', 'watchlist_priority', 'genre_boost', 'director_boost', 'genre_penalty', 'director_penalty', 'score']


| | candidate_id | candidate_title | year | cos | meta | final | seed | comp_genres | comp_keywords | comp_cast | ... | name_norm | year_str | genre_list | director_list | watchlist_priority | genre_boost | director_boost | genre_penalty | director_penalty | score |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1420 | Breakfast on Pluto | 2005.0 | 0.0 | 0.4557 | 0.1823 | Pride | 1.0 | 0.0769 | 0.0 | ... | breakfast on pluto | 2005.0 | [] | [] | True | False | False | False | False | 2 |
| 1 | 624860 | The Matrix Resurrections | 2021.0 | 0.0 | 0.4696 | 0.1879 | The Matrix | 0.6667 | 0.2667 | 0.0526 | ... | the matrix resurrections | 2021.0 | [] | [] | True | False | False | False | False | 2 |
| 2 | 1523 | The Last King of Scotland | 2006.0 | 0.0 | 0.4507 | 0.1803 | The Social Network | 1.0 | 0.0392 | 0.0 | ... | the last king of scotland | 2006.0 | [] | [] | True | False | False | False | False | 2 |


In [4]:
# Clean numeric columns (non-numeric values become 0.0)
num_like = ['cos','final','score','comp_genres','comp_keywords','comp_cast','comp_director',
            'comp_runtime','comp_language','comp_popularity','comp_vote']
for c in num_like:
    if c in df.columns:
        df[c] = pd.to_numeric(df[c], errors='coerce').fillna(0.0)

df['seed'] = df['seed'].astype(str)
df['candidate_title'] = df['candidate_title'].astype(str)

def minmax(x):
    x = np.asarray(x, dtype=float)
    mn, mx = np.nanmin(x), np.nanmax(x)
    if not np.isfinite(mn) or not np.isfinite(mx) or mx<=mn:
        return np.zeros_like(x)
    return (x - mn) / (mx - mn)

print("Rows:", len(df))

Rows: 200
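
One edge case of `minmax` is worth keeping in mind: a constant column maps to all zeros. If, for example, every `cos` value were 0.0 (as in the three rows displayed above), that relation would end up with zero edge weights throughout. The sketch below restates the same rule without NumPy, purely for illustration; `minmax_list` is a hypothetical name and not part of the notebook.

```python
def minmax_list(values):
    """Scale values to [0, 1]; a constant input maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi <= lo:  # constant column: there is no spread to normalize
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

constant = minmax_list([0.0, 0.0, 0.0])
spread = minmax_list([1.0, 2.0, 3.0])
print(constant)  # [0.0, 0.0, 0.0]
print(spread)    # [0.0, 0.5, 1.0]
```

Checking `df[c].nunique()` for each relation column before building the graph is one cheap way to spot such columns.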


## 2) Graph from the CSV: edges + weights (edge_weight)

In [5]:
movies = pd.unique(pd.concat([df['seed'], df['candidate_title']], ignore_index=True))
movie2id = {m:i for i,m in enumerate(movies)}
id2movie = {i:m for m,i in movie2id.items()}

rel_cols = [c for c in ['cos','final','comp_genres','comp_keywords','comp_cast','comp_director',
                        'comp_runtime','comp_language','comp_popularity','comp_vote'] if c in df.columns]

# Raw edges: one (seed, candidate, value) triple per relation column
edges = {c: [] for c in rel_cols}
for _, row in df.iterrows():
    s = row['seed']; c = row['candidate_title']
    if pd.isna(s) or pd.isna(c): 
        continue
    u, v = movie2id[s], movie2id[c]
    for rc in rel_cols:
        edges[rc].append((u, v, float(row.get(rc, 0.0))))

# Normalized weights (0..1) per relation
norm_edges = {}
for rc, lst in edges.items():
    if not lst:
        continue
    w = np.array([w for (_,_,w) in lst], dtype=float)
    wn = minmax(w)
    norm_edges[rc] = [(u,v,float(wn[i])) for i,(u,v,_) in enumerate(lst)]

for rc, lst in norm_edges.items():
    print(rc, "Edges:", len(lst))

cos Edges: 200
final Edges: 200
comp_genres Edges: 200
comp_keywords Edges: 200
comp_cast Edges: 200
comp_director Edges: 200
comp_runtime Edges: 200
comp_language Edges: 200
comp_popularity Edges: 200
comp_vote Edges: 200


## 3) Letterboxd labels (ratings/watchlist) + time split

In [6]:
def smart_find(base_dir, primary_name_patterns, fallback_exts=('csv', 'CSV')):
    base = Path(base_dir)
    if not base.exists():
        return None
    for pat in primary_name_patterns:
        for ext in fallback_exts:
            cand = base / f"{pat}.{ext}"
            if cand.exists():
                return str(cand)
    for p in base.rglob("*"):
        name = p.name.lower()
        for pat in primary_name_patterns:
            if pat.lower() in name and p.suffix.lower() in ('.csv',):
                return str(p)
    return None

ratings_path   = smart_find(LETTERBOXD_DIR, ['ratings', 'ratings-export', 'ratings-2'])
watched_path   = smart_find(LETTERBOXD_DIR, ['watched', 'diary'])
watchlist_path = smart_find(LETTERBOXD_DIR, ['watchlist'])

print("ratings_path:", ratings_path)
print("watched_path:", watched_path)
print("watchlist_path:", watchlist_path)

assert ratings_path or watched_path or watchlist_path, "No Letterboxd CSV found. Check LETTERBOXD_DIR."

import pandas as pd, numpy as np, re
def normalize_title(t):
    t = str(t).lower()
    t = re.sub(r"[^a-z0-9]+", " ", t)
    t = re.sub(r"\b(the|a|an)\b", " ", t)
    t = re.sub(r"\s+", " ", t).strip()
    return t

def parse_date_series(s):
    if s is None:
        return pd.Series(dtype='datetime64[ns]')
    try:
        return pd.to_datetime(s, errors='coerce', utc=True)
    except Exception:
        return pd.to_datetime(s.astype(str), errors='coerce', utc=True)

def parse_rating_series(s):
    if s is None:
        return pd.Series(dtype=float)
    def to_num(x):
        if pd.isna(x):
            return np.nan
        try:
            return float(x)
        except (TypeError, ValueError):
            # Fallback for star glyphs such as '★★★½'
            txt = str(x)
            stars = txt.count('★')
            half  = '½' in txt
            return stars + (0.5 if half else 0.0)
    return s.apply(to_num).astype(float)

frames = []
if ratings_path:
    r = pd.read_csv(ratings_path)
    title = r.get('Name', r.get('Title'))
    year  = r.get('Year')
    rating= parse_rating_series(r.get('Rating'))
    date  = parse_date_series(r.get('Date', r.get('WatchedDate')))
    frames.append(pd.DataFrame({'title': title, 'year': year, 'rating': rating, 'date': date, 'watchlist': False}))
if watched_path:
    w = pd.read_csv(watched_path)
    title = w.get('Name', w.get('Title'))
    year  = w.get('Year')
    rating= parse_rating_series(w.get('Rating'))
    date  = parse_date_series(w.get('Date', w.get('WatchedDate')))
    frames.append(pd.DataFrame({'title': title, 'year': year, 'rating': rating, 'date': date, 'watchlist': False}))
if watchlist_path:
    wl = pd.read_csv(watchlist_path)
    title = wl.get('Name', wl.get('Title'))
    year  = wl.get('Year')
    date  = parse_date_series(wl.get('AddedDate', wl.get('Date')))
    frames.append(pd.DataFrame({'title': title, 'year': year, 'rating': np.nan, 'date': date, 'watchlist': True}))

inter = pd.concat(frames, ignore_index=True).dropna(subset=['title'])
inter['title_norm'] = inter['title'].apply(normalize_title)
inter['year'] = pd.to_numeric(inter['year'], errors='coerce')
inter['date'] = parse_date_series(inter['date'])
# Positive: rating >= 4 or on the watchlist
inter['is_positive'] = (inter['rating'] >= 4.0) | (inter['watchlist'] == True)

# Map interactions to the movies in our candidate space
cand_norm = {normalize_title(t): t for t in pd.unique(df['candidate_title'])}
seed_norm = {normalize_title(t): t for t in pd.unique(df['seed'])}
all_norm  = {**seed_norm, **cand_norm}
inter['cand_match'] = inter['title_norm'].map(all_norm)

mapped_pos = inter[(inter['is_positive']) & inter['date'].notna() & inter['cand_match'].notna()].copy()
assert len(mapped_pos) > 0, "No positive interactions could be mapped. Check title normalization."

split_date = pd.to_datetime(SPLIT_DATE, utc=True) if SPLIT_DATE else mapped_pos['date'].quantile(0.8)
print("Split date:", split_date)

# Note: titles are not de-duplicated, so a film listed in several exports
# (e.g. watchlist entry first, rating later) can land on both sides of the split.
train_pos = mapped_pos[mapped_pos['date'] <= split_date]['cand_match'].tolist()
test_pos  = mapped_pos[mapped_pos['date'] >  split_date]['cand_match'].tolist()
train_pos_ids = [movie2id[t] for t in train_pos if t in movie2id]
test_pos_set  = set(t for t in test_pos if t in movie2id)

print("Train positive IDs:", len(train_pos_ids), "| Test positive unique:", len(test_pos_set))

ratings_path: ../data/letterboxd_export/ratings.csv
watched_path: ../data/letterboxd_export/watched.csv
watchlist_path: ../data/letterboxd_export/watchlist.csv
Split date: 2024-05-22 04:48:00+00:00
Train positive IDs: 47 | Test positive unique: 12
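
A caveat of the split above: ratings, watched and watchlist rows are concatenated without de-duplication, so the same title may carry an old and a new date and show up in both train and test. The following is a minimal sketch of one possible guard, assuming we keep only the earliest event per title. The function `time_split` and the toy events are hypothetical and only illustrate the idea with the standard library.

```python
from datetime import date

def time_split(events, cutoff):
    """events: iterable of (title, date). Returns (train_titles, test_titles)."""
    first_seen = {}
    for title, when in events:
        # Keep only the earliest event per title.
        if title not in first_seen or when < first_seen[title]:
            first_seen[title] = when
    train = {t for t, d in first_seen.items() if d <= cutoff}
    test = {t for t, d in first_seen.items() if d > cutoff}
    return train, test

events = [
    ("Heat", date(2023, 1, 5)),    # toy data: added to the watchlist
    ("Heat", date(2024, 8, 1)),    # toy data: rated later, same title
    ("Bound", date(2024, 9, 12)),
]
train, test = time_split(events, cutoff=date(2024, 5, 22))
print(sorted(train), sorted(test))  # ['Heat'] ['Bound']
```

Whether the earliest or the latest event should win is a modelling choice; the point is only that each title ends up on exactly one side of the cutoff.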


## 4) Evaluation helpers (NumPy 2.0 compatible)

In [7]:
import numpy as np
def _to_float_array(x): return np.asarray(x, dtype=float)

def dcg_at_k(rel, k=10):
    r = _to_float_array(rel)[:k]
    return float(np.sum((np.power(2.0, r) - 1.0) / np.log2(np.arange(2, r.size + 2))))

def ndcg_at_k(rel, k=10):
    r = _to_float_array(rel)
    dcg = dcg_at_k(r, k)
    ideal = np.sort(r)[::-1]
    idcg = dcg_at_k(ideal, k)
    return float(dcg / idcg) if idcg > 0 else 0.0

def hit_at_k(rel, k=10):
    r = _to_float_array(rel)[:k]
    return float(np.any(r > 0))

def recall_at_k(rel, total_pos, k=10):
    r = _to_float_array(rel)[:k]
    found = int(np.sum(r))
    return float(found / total_pos) if total_pos > 0 else 0.0

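def eval_grouped_with_test(df, score_col, k=10, test_pos_set=None):
    # An empty set keeps `isin` valid when no test positives are passed.
    test_pos_set = set() if test_pos_set is None else test_pos_set
    hits, ndcgs, recalls, cnt = [], [], [], 0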
    for seed, g in df.groupby('seed', sort=False):
        g = g.sort_values(score_col, ascending=False).reset_index(drop=True)
        rel = (g['candidate_title'].isin(test_pos_set)).astype(int).to_numpy()
        total_pos = int(rel.sum())
        hits.append(hit_at_k(rel, k))
        ndcgs.append(ndcg_at_k(rel, k))
        recalls.append(recall_at_k(rel, total_pos, k))
        cnt += 1
    return float(np.mean(hits)), float(np.mean(ndcgs)), float(np.mean(recalls)), cnt
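
To make the NDCG helper concrete, here is a small worked example with binary relevance. It is a plain-Python restatement of the same gain and discount, (2^rel - 1) / log2(rank + 1) with ranks starting at 1, written only for illustration; the relevance lists are toy inputs.

```python
import math

def dcg_at_k(rel, k=10):
    # Same gain/discount as the helper above; enumerate() is 0-based, hence i + 2.
    return sum((2.0 ** r - 1.0) / math.log2(i + 2) for i, r in enumerate(rel[:k]))

def ndcg_at_k(rel, k=10):
    idcg = dcg_at_k(sorted(rel, reverse=True), k)
    return dcg_at_k(rel, k) / idcg if idcg > 0 else 0.0

first = ndcg_at_k([1, 0, 0])   # relevant item ranked first
second = ndcg_at_k([0, 1, 0])  # relevant item ranked second
print(round(first, 3), round(second, 3))  # 1.0 0.631
```

A seed group without any relevant candidate yields 0.0 for all three metrics and still counts in the mean, which keeps the averages small when only a few seeds have test positives.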

## 5) Baselines (LO1 & LO1+LO2): evaluation & export

In [8]:
# LO1: KGE-only (use 'cos' if present; otherwise fall back to 'score')
if 'cos' in df.columns:
    df['cos_norm'] = df.groupby('seed')['cos'].transform(lambda x: (x - x.min()) / (x.max()-x.min() + 1e-9))
    h, n, r, cnt = eval_grouped_with_test(df, 'cos_norm', k=TOPK, test_pos_set=test_pos_set)
    print(f"LO1 KGE-only (`cos_norm`) — Hit@{TOPK}: {h:.3f} | NDCG@{TOPK}: {n:.3f} | Recall@{TOPK}: {r:.3f} (Seeds: {cnt})")
    df.sort_values(['seed','cos_norm'], ascending=[True, False]).groupby('seed').head(TOPK).to_csv(f"{OUTPUT_DIR}/top{TOPK}_lo1_kge.csv", index=False)
elif 'score' in df.columns:
    df['score_norm'] = df.groupby('seed')['score'].transform(lambda x: (x - x.min()) / (x.max()-x.min() + 1e-9))
    h, n, r, cnt = eval_grouped_with_test(df, 'score_norm', k=TOPK, test_pos_set=test_pos_set)
    print(f"LO1 KGE-only (`score_norm`) — Hit@{TOPK}: {h:.3f} | NDCG@{TOPK}: {n:.3f} | Recall@{TOPK}: {r:.3f} (Seeds: {cnt})")
    df.sort_values(['seed','score_norm'], ascending=[True, False]).groupby('seed').head(TOPK).to_csv(f"{OUTPUT_DIR}/top{TOPK}_lo1_kge.csv", index=False)
else:
    print("Warning: no KGE column ('cos' or 'score') found; skipping the LO1 view.")

# LO1+LO2: final
if 'final' in df.columns:
    df['final_norm'] = df.groupby('seed')['final'].transform(lambda x: (x - x.min()) / (x.max()-x.min() + 1e-9))
    h, n, r, cnt = eval_grouped_with_test(df, 'final_norm', k=TOPK, test_pos_set=test_pos_set)
    print(f"LO1+LO2 (`final_norm`) — Hit@{TOPK}: {h:.3f} | NDCG@{TOPK}: {n:.3f} | Recall@{TOPK}: {r:.3f} (Seeds: {cnt})")
    df.sort_values(['seed','final_norm'], ascending=[True, False]).groupby('seed').head(TOPK).to_csv(f"{OUTPUT_DIR}/top{TOPK}_lo1_lo2_baseline.csv", index=False)
else:
    print("Warning: no 'final' column found; skipping the LO1+LO2 view.")

LO1 KGE-only (`cos_norm`) — Hit@10: 0.031 | NDCG@10: 0.031 | Recall@10: 0.031 (Seeds: 130)
LO1+LO2 (`final_norm`) — Hit@10: 0.031 | NDCG@10: 0.023 | Recall@10: 0.031 (Seeds: 130)
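
Hit@10 and Recall@10 are identical for both baselines, and there is a structural reason to expect that here: 200 rows spread over 130 seeds is about 1.5 candidates per seed, so for most seeds every candidate already sits inside the top 10 and the ranking cannot change whether a relevant title is found. Only a rank-sensitive metric such as NDCG can separate the variants in that regime. A value of 0.031 is also what 4 seeds out of 130 with a hit would give (4/130 ≈ 0.031), although the notebook does not print that count. The sketch below uses toy labels and plain-Python stand-ins for the helpers to check the effect over every possible ordering.

```python
from itertools import permutations

def hit_at_k(rel, k):
    return float(any(r > 0 for r in rel[:k]))

def recall_at_k(rel, k):
    total = sum(rel)
    return sum(rel[:k]) / total if total else 0.0

labels = (1, 0, 0)  # toy seed group: 3 candidates, 1 relevant
orderings = list(permutations(labels))

hits_k10 = {hit_at_k(p, 10) for p in orderings}
recalls_k10 = {recall_at_k(p, 10) for p in orderings}
hits_k1 = {hit_at_k(p, 1) for p in orderings}

print(hits_k10, recalls_k10)  # {1.0} {1.0}
print(sorted(hits_k1))        # [0.0, 1.0]
```

Evaluating at a smaller cutoff, or only on seeds with more than K candidates, would make Hit and Recall sensitive to the ranking again.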


## 6) GNN training (LO3) with **edge_attr** (relation weights)

In [9]:
import importlib.util, warnings  # find_spec lives in the importlib.util submodule
warnings.filterwarnings('ignore')
has_torch = importlib.util.find_spec('torch') is not None
has_pyg   = importlib.util.find_spec('torch_geometric') is not None
print("Torch installed:", has_torch, "| PyG installed:", has_pyg)
if not (has_torch and has_pyg):
    print("Skipping the GNN part (packages missing).")

Torch installed: True | PyG installed: True


In [10]:
if has_torch and has_pyg:
    import torch
    from torch import nn
    import torch.nn.functional as F
    from torch_geometric.data import HeteroData
    from torch_geometric.nn import HeteroConv, GATv2Conv

    set_all_seeds(SEED)

    data = HeteroData()
    num_movies = len(movie2id)
    d = 64

    # movie->movie edges + edge_attr = weight (one scalar per edge)
    for rc, lst in norm_edges.items():
        if not lst: continue
        src = torch.tensor([u for (u,_,_) in lst], dtype=torch.long)
        dst = torch.tensor([v for (_,v,_) in lst], dtype=torch.long)
        w   = torch.tensor([w for (*_,w) in lst], dtype=torch.float32).view(-1,1)
        data['movie', rc, 'movie'].edge_index = torch.stack([src, dst], dim=0)
        data['movie', rc, 'movie'].edge_attr  = w

    # user->movie positives from the training split
    pos_ids = [movie2id[t] for t in train_pos if t in movie2id]
    assert len(pos_ids) > 0, "No training positives found in the CSV candidate space."
    u_src = torch.zeros(len(pos_ids), dtype=torch.long)
    m_dst = torch.tensor(pos_ids, dtype=torch.long)
    data['user','likes','movie'].edge_index = torch.stack([u_src, m_dst], dim=0)

    # Negative Samples
    all_movie_ids = torch.arange(num_movies, dtype=torch.long)
    pos_set = set(m_dst.tolist())
    neg_pool = [int(i) for i in all_movie_ids.tolist() if i not in pos_set]
    neg_pairs = [(0, np.random.choice(neg_pool)) for _ in range(len(pos_ids) * NEG_PER_POS)]
    un_src = torch.tensor([p[0] for p in neg_pairs], dtype=torch.long)
    mn_dst = torch.tensor([p[1] for p in neg_pairs], dtype=torch.long)

    # Only relations with dst='movie' (this also keeps ('user','likes','movie'), which carries no edge_attr)
    conv_edge_types = [et for et in data.edge_types if et[2] == 'movie']
    edge_index_dict = {et: data[et].edge_index for et in conv_edge_types}
    edge_attr_dict  = {et: data[et].edge_attr  for et in conv_edge_types if 'edge_attr' in data[et]}

    class HeteroRecommender(nn.Module):
        def __init__(self, num_movies, dim=64, layers=2):
            super().__init__()
            self.movie_emb = nn.Embedding(num_movies, dim)
            self.user_emb  = nn.Embedding(1, dim)
            self.layers = nn.ModuleList()
            for _ in range(layers):
                self.layers.append(
                    HeteroConv(
                        { et: GATv2Conv((-1, -1), dim, edge_dim=1, add_self_loops=False)
                          for et in conv_edge_types },
                        aggr='sum'
                    )
                )

        def forward(self, edge_index_dict, edge_attr_dict=None):
            x = {'movie': self.movie_emb.weight, 'user': self.user_emb.weight}
            for conv in self.layers:
                if edge_attr_dict:
                    out = conv(x, edge_index_dict, edge_attr_dict)
                else:
                    out = conv(x, edge_index_dict)
                out = {k: F.relu(v) for k, v in out.items()}
                x.update(out)
            return x

        @staticmethod
        def score(user_vec, item_vec):
            return (user_vec * item_vec).sum(dim=-1)

    model = HeteroRecommender(num_movies=num_movies, dim=d, layers=2)
    opt = torch.optim.Adam(model.parameters(), lr=1e-3, weight_decay=1e-5)
    bce = nn.BCEWithLogitsLoss()

    # Training
    for epoch in range(1, 201):
        model.train()
        opt.zero_grad()
        x_out = model(edge_index_dict, edge_attr_dict)
        user_pos = x_out['user'][u_src]
        item_pos = x_out['movie'][m_dst]
        user_neg = x_out['user'][un_src]
        item_neg = x_out['movie'][mn_dst]
        pos_logit = HeteroRecommender.score(user_pos, item_pos)
        neg_logit = HeteroRecommender.score(user_neg, item_neg)
        loss = bce(pos_logit, torch.ones_like(pos_logit)) + bce(neg_logit, torch.zeros_like(neg_logit))
        loss.backward()
        opt.step()
        if epoch % 50 == 0:
            print(f"Epoch {epoch:3d} | Loss {loss.item():.4f}")

    # Score all candidates
    model.eval()
    with torch.no_grad():
        x_out = model(edge_index_dict, edge_attr_dict)
        user_vec = x_out['user'][0:1]

    gnn_scores = []
    for _, row in df.iterrows():
        cand = row['candidate_title']
        mid = movie2id.get(cand, None)
        if mid is None:
            gnn_scores.append(np.nan); continue
        item_vec = x_out['movie'][mid:mid+1]
        s = float((user_vec * item_vec).sum(dim=-1))
        gnn_scores.append(s)

    df['s_gnn'] = gnn_scores
    df['s_gnn_norm'] = df.groupby('seed')['s_gnn'].transform(lambda x: (x - x.min()) / (x.max() - x.min() + 1e-9))
    print("GNN scores added: 's_gnn'/'s_gnn_norm'")

Epoch  50 | Loss 0.0103
Epoch 100 | Loss 0.0043
Epoch 150 | Loss 0.0025
Epoch 200 | Loss 0.0017
GNN scores added: 's_gnn'/'s_gnn_norm'
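
The training objective in the loop above is the sum of two `BCEWithLogitsLoss` terms, one over positive and one over negative user-movie pairs, each averaged over its own batch (the default `reduction='mean'`). Because of that averaging, the positive and the negative term weigh equally even though there are `NEG_PER_POS = 3` negatives per positive. The plain-Python sketch below recomputes the same quantity for hand-picked logits; the function names and logit values are illustrative assumptions, not taken from the trained model.

```python
import math

def bce_with_logits(logit, target):
    # Numerically stable form: max(x, 0) - x*t + log(1 + exp(-|x|))
    return max(logit, 0.0) - logit * target + math.log1p(math.exp(-abs(logit)))

def mean_bce(logits, target):
    return sum(bce_with_logits(x, target) for x in logits) / len(logits)

pos_logits = [4.0, 5.0]                             # toy scores of liked movies
neg_logits = [-4.0, -5.0, -6.0, -4.0, -5.0, -6.0]   # toy scores of sampled negatives

# Same structure as the notebook loss: mean over positives + mean over negatives.
loss = mean_bce(pos_logits, 1.0) + mean_bce(neg_logits, 0.0)
# With all logits at 0 (no preference learned) the loss is 2 * ln(2).
untrained = mean_bce([0.0], 1.0) + mean_bce([0.0], 0.0)
print(round(loss, 4), round(untrained, 4))
```

A training loss near zero, as in the log above, only says that the sampled train pairs are separated well; it does not by itself indicate how the model ranks held-out titles.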


## 7) Ablation (LO1→LO3) & ensemble (λ sweep): evaluation & exports

In [11]:
results = []

def eval_and_log(name, col):
    if col not in df.columns: 
        print(f"[SKIP] {name}: column '{col}' is missing.")
        return
    h, n, r, cnt = eval_grouped_with_test(df, col, k=TOPK, test_pos_set=test_pos_set)
    results.append({'variant': name, 'metric': f'@{TOPK}', 'hit': h, 'ndcg': n, 'recall': r, 'seeds': cnt})
    print(f"{name:>22} — Hit@{TOPK}: {h:.3f} | NDCG@{TOPK}: {n:.3f} | Recall@{TOPK}: {r:.3f} (Seeds: {cnt})")

# LO1
if 'cos_norm' in df.columns: eval_and_log("LO1 KGE-only (cos)", 'cos_norm')
elif 'score_norm' in df.columns: eval_and_log("LO1 KGE-only (score)", 'score_norm')

# LO1+LO2
eval_and_log("LO1+LO2 (final)", 'final_norm')

# LO3 (GNN)
eval_and_log("LO3 (GNN)", 's_gnn_norm')

# Ensemble Sweep
best = None
for lam in LAMBDA_SWEEP:
    col = f"ensemble_{lam:.1f}"
    if 's_gnn_norm' in df.columns and 'final_norm' in df.columns:
        df[col] = lam * df['s_gnn_norm'] + (1.0 - lam) * df['final_norm']
        h, n, r, cnt = eval_grouped_with_test(df, col, k=TOPK, test_pos_set=test_pos_set)
        results.append({'variant': f'Ensemble λ={lam:.1f}', 'metric': f'@{TOPK}', 'hit': h, 'ndcg': n, 'recall': r, 'seeds': cnt})
        if best is None or n > best['ndcg']:
            best = {'lam': lam, 'hit': h, 'ndcg': n, 'recall': r}
        print(f"Ensemble (λ={lam:.1f}) — Hit@{TOPK}: {h:.3f} | NDCG@{TOPK}: {n:.3f} | Recall@{TOPK}: {r:.3f}")

# Save results
res_df = pd.DataFrame(results)
res_path = f"{OUTPUT_DIR}/ablation_results.csv"
res_df.to_csv(res_path, index=False)
print("Ablation results:", res_path)
if best:
    print(f"Best ensemble (by NDCG): λ={best['lam']:.1f} | Hit@{TOPK}={best['hit']:.3f} | NDCG@{TOPK}={best['ndcg']:.3f} | Recall@{TOPK}={best['recall']:.3f}")

# Top-K exports
def export_topk(df, col, fname):
    if col not in df.columns: return None
    out = df.sort_values(['seed', col], ascending=[True, False]).groupby('seed').head(TOPK).copy()
    out['test_relevant'] = out['candidate_title'].isin(test_pos_set)
    path = f"{OUTPUT_DIR}/{fname}"
    out.to_csv(path, index=False); return path

exports = []
if 'cos_norm' in df.columns: exports.append(export_topk(df, 'cos_norm',     f"top{TOPK}_lo1_kge.csv"))
if 'final_norm' in df.columns: exports.append(export_topk(df, 'final_norm', f"top{TOPK}_lo1_lo2_baseline.csv"))
if 's_gnn_norm' in df.columns: exports.append(export_topk(df, 's_gnn_norm', f"top{TOPK}_lo3_gnn.csv"))
for lam in LAMBDA_SWEEP:
    col = f"ensemble_{lam:.1f}"
    if col in df.columns:
        exports.append(export_topk(df, col, f"top{TOPK}_ensemble_lam{lam:.1f}.csv"))

print("Top-K exports:")
for p in exports:
    if p: print(" -", p)

    LO1 KGE-only (cos) — Hit@10: 0.031 | NDCG@10: 0.031 | Recall@10: 0.031 (Seeds: 130)
       LO1+LO2 (final) — Hit@10: 0.031 | NDCG@10: 0.023 | Recall@10: 0.031 (Seeds: 130)
             LO3 (GNN) — Hit@10: 0.031 | NDCG@10: 0.031 | Recall@10: 0.031 (Seeds: 130)
Ensemble (λ=0.2) — Hit@10: 0.031 | NDCG@10: 0.023 | Recall@10: 0.031
Ensemble (λ=0.4) — Hit@10: 0.031 | NDCG@10: 0.023 | Recall@10: 0.031
Ensemble (λ=0.6) — Hit@10: 0.031 | NDCG@10: 0.023 | Recall@10: 0.031
Ensemble (λ=0.8) — Hit@10: 0.031 | NDCG@10: 0.023 | Recall@10: 0.031
Ablation results: ../data/kg/outputs/ablation_results.csv
Best ensemble (by NDCG): λ=0.2 | Hit@10=0.031 | NDCG@10=0.023 | Recall@10=0.031
Top-K exports:
 - ../data/kg/outputs/top10_lo1_kge.csv
 - ../data/kg/outputs/top10_lo1_lo2_baseline.csv
 - ../data/kg/outputs/top10_lo3_gnn.csv
 - ../data/kg/outputs/top10_ensemble_lam0.2.csv
 - ../data/kg/outputs/top10_ensemble_lam0.4.csv
 - ../data/kg/outputs/top10_ensemble_lam0.6.csv
 - ../data/kg/outputs/top10
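
The ensemble column is a convex combination, `λ * s_gnn_norm + (1 - λ) * final_norm`, so λ only matters for seeds where the two component scores disagree about the order. The toy sketch below shows the top candidate flipping as λ grows; the candidate names `A`/`B` and their scores are made up for illustration.

```python
def ensemble(s_gnn, s_final, lam):
    """Blend two per-candidate score dicts with weight lam on the GNN score."""
    return {c: lam * s_gnn[c] + (1.0 - lam) * s_final[c] for c in s_gnn}

s_gnn = {"A": 0.1, "B": 0.9}    # toy GNN scores: prefers B
s_final = {"A": 1.0, "B": 0.3}  # toy rule-based scores: prefers A

top_by_lambda = {}
for lam in [0.2, 0.4, 0.6, 0.8]:
    scores = ensemble(s_gnn, s_final, lam)
    top_by_lambda[lam] = max(scores, key=scores.get)

print(top_by_lambda)  # {0.2: 'A', 0.4: 'A', 0.6: 'B', 0.8: 'B'}
```

If a sweep returns the same metrics for every λ, as in the output above, it is worth checking how many seed groups contain more than one candidate with distinct component scores before reading anything into the "best" λ.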

## 8) Qualitative case study (2 seeds): comparison

In [12]:
seeds_with_pos = []
for seed, g in df.groupby('seed'):
    if (g['candidate_title'].isin(test_pos_set)).any():
        seeds_with_pos.append(seed)
sample_seeds = seeds_with_pos[:2] if len(seeds_with_pos)>=2 else df['seed'].unique()[:2]

def show_top(g, col, name):
    if col in g.columns:
        top = g.sort_values(col, ascending=False).head(5)[['candidate_title', col]].copy()
        top['is_test_rel'] = top['candidate_title'].isin(test_pos_set)
        print(f"{name} Top-5:")
        print(top.to_string(index=False))

for s in sample_seeds:
    print("\n=== Seed:", s, "===")
    g = df[df['seed'] == s].copy()
    show_top(g, 'cos_norm', 'LO1 KGE')
    show_top(g, 'final_norm', 'LO1+LO2 Final')
    show_top(g, 's_gnn_norm', 'LO3 GNN')
    for lam in LAMBDA_SWEEP:
        show_top(g, f'ensemble_{lam:.1f}', f'Ensemble λ={lam:.1f}')


=== Seed: Godzilla ===
LO1 KGE Top-5:
          candidate_title  cos_norm  is_test_rel
The War of the Gargantuas       0.0         True
LO1+LO2 Final Top-5:
          candidate_title  final_norm  is_test_rel
The War of the Gargantuas         0.0         True
LO3 GNN Top-5:
          candidate_title  s_gnn_norm  is_test_rel
The War of the Gargantuas         0.0         True
Ensemble λ=0.2 Top-5:
          candidate_title  ensemble_0.2  is_test_rel
The War of the Gargantuas           0.0         True
Ensemble λ=0.4 Top-5:
          candidate_title  ensemble_0.4  is_test_rel
The War of the Gargantuas           0.0         True
Ensemble λ=0.6 Top-5:
          candidate_title  ensemble_0.6  is_test_rel
The War of the Gargantuas           0.0         True
Ensemble λ=0.8 Top-5:
          candidate_title  ensemble_0.8  is_test_rel
The War of the Gargantuas           0.0         True

=== Seed: Good Time ===
LO1 KGE Top-5:
 candidate_title  cos_norm  is_test_rel
           Bound       0.0     