# League of Legends Data Analysis - Research Notebook

This notebook collects experiments, figures, code snippets, and notes to track the evolution of the project for the thesis.

- Project: Data analysis of League of Legends matches
- Author: Speazyy
- Goal: Build analyses (KDA, gold, damage, items, win rates, movement heatmaps) and document iterations.

---

## Index
- Setup
- Data sources and artifacts
- Experiments
  - E1: Analyze single match (KDA, Gold, Damage, Items)
  - E2: Item names via Data Dragon
  - E3: Player movement heatmaps with names
  - E4: Player win/loss by side (blue/red)
  - E5: Champion and player winrates (multi-match)
- Findings and next steps


In [None]:
# Setup: common imports and helpers
import os, json, time
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from dotenv import load_dotenv

load_dotenv()

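# Utility: pretty-print a JSON-serializable object, truncated to maxlen characters
def jprint(obj, maxlen=800):
    s = json.dumps(obj, indent=2)
    # Append "..." only when the output was actually cut off
    print(s[:maxlen] + ("..." if len(s) > maxlen else ""))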

## Data sources and artifacts

- Matches info JSON: `arutnevjr_ajr_matches_info.json`
- Timeline & map: `fetch_api_timeline.py`, `lol_map.png`
- Player movement extraction: `extract_positions.py`
- Match analysis: `analyze_match.py`
- Win/loss by side: `fetch_player_winloss_side.py`, `plot_player_winloss_side.py`
- Helpers: `helper.py`, `lookup.py`, `summoner_list.py`

## Experiments

### E1: Single match analysis (KDA, Gold, Damage, Items)
Goal: Reproduce and iterate on the analyses from `analyze_match.py` (KDA, average gold per team, average damage taken, most used items).

In [None]:
# Load match info and build DataFrame like analyze_match.py
with open("arutnevjr_ajr_matches_info.json", "r") as f:
    matches_info = json.load(f)
match = matches_info[0]

item_cols = [f"item{i}" for i in range(7)]
# One column per item slot; players with fewer than 7 recorded items are padded with 0
item_data = {
    col: [player_items[j] if j < len(player_items) else 0 for player_items in match["items"]]
    for j, col in enumerate(item_cols)
}

df = pd.DataFrame({
    "champion": match["picks"],
    "position": match["position"],
    "team": match["teams"],
    "kills": match["kills"],
    "deaths": match["deaths"],
    "assists": match["assists"],
    "gold": match["gold"],
    "win": match["win"],
    **item_data
})

# KDA
df["KDA"] = (df["kills"] + df["assists"]) / df["deaths"].replace(0, 1)
plt.figure(figsize=(8, 4))
ax = sns.barplot(x="champion", y="KDA", hue="team", data=df)
plt.title("Player KDA")
plt.ylabel("KDA")
plt.xlabel("Champion")
plt.legend()
for p in ax.patches:
    h = p.get_height()
    if pd.notna(h):
        ax.annotate(f"{h:.2f}", (p.get_x() + p.get_width()/2., h), ha='center', va='bottom', fontsize=9, color='black', xytext=(0,3), textcoords='offset points')
plt.tight_layout()
plt.show()

# Average gold per team
avg_gold = df.groupby("team")["gold"].mean()
plt.figure(figsize=(6, 4))
ax = avg_gold.plot(kind="bar", color=["blue", "red"])
plt.title("Average Gold per Team")
plt.ylabel("Average Gold")
plt.xlabel("Team")
for i, v in enumerate(avg_gold):
    ax.annotate(f"{v:.0f}", (i, v), ha='center', va='bottom', fontsize=9, color='black', xytext=(0,3), textcoords='offset points')
plt.tight_layout()
plt.show()

# Average damage taken per team (if available)
if "damageTaken" in match:
    df["damageTaken"] = match["damageTaken"]
    avg_damage_taken = df.groupby("team")["damageTaken"].mean()
    plt.figure(figsize=(6, 4))
    ax = avg_damage_taken.plot(kind="bar", color=["blue", "red"])
    plt.title("Average Damage Taken per Team")
    plt.ylabel("Average Damage Taken")
    plt.xlabel("Team")
    for i, v in enumerate(avg_damage_taken):
        ax.annotate(f"{v:.0f}", (i, v), ha='center', va='bottom', fontsize=9, color='black', xytext=(0,3), textcoords='offset points')
    plt.tight_layout()
    plt.show()

# Most used items
items = []
for col in item_cols:
    items.extend(df[col].tolist())
items = [item for item in items if item != 0]
most_used_items = pd.Series(items).value_counts().head(5)
plt.figure(figsize=(8, 4))
ax = most_used_items.plot(kind="bar")
plt.title("Most Used Items in the Game (item0-item6)")
plt.ylabel("Count")
plt.xlabel("Item ID")
for i, v in enumerate(most_used_items):
    ax.annotate(f"{v}", (i, v), ha='center', va='bottom', fontsize=9, color='black', xytext=(0,3), textcoords='offset points')
plt.tight_layout()
plt.show()

### E2: Item name mapping via Data Dragon
Goal: Map item IDs to human-readable names using Riot Data Dragon item dataset.
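
If the item file is not yet available locally, it could be fetched once and cached. The sketch below assumes the public Data Dragon endpoints (`api/versions.json` for the version list and `cdn/<version>/data/<locale>/item.json` for the item data) and uses only the standard library. The helper names `item_url` and `download_items` and the `en_US` locale are illustrative choices, not part of the project scripts.

```python
import json
import urllib.request

DDRAGON = "https://ddragon.leagueoflegends.com"

def item_url(version, locale="en_US"):
    """Build the Data Dragon item file URL for a given patch version."""
    return f"{DDRAGON}/cdn/{version}/data/{locale}/item.json"

def download_items(path="items.json", locale="en_US"):
    """Fetch the item data for the latest version and save it to `path`."""
    # versions.json is a JSON list of version strings, newest first
    with urllib.request.urlopen(f"{DDRAGON}/api/versions.json") as resp:
        latest = json.load(resp)[0]
    with urllib.request.urlopen(item_url(latest, locale)) as resp:
        items = json.load(resp)
    with open(path, "w", encoding="utf-8") as f:
        json.dump(items, f)
    return latest
```

Calling `download_items()` once would write `items.json` to the project root, which is the path the cell in this experiment reads.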

In [None]:
# Data Dragon item mapping
# Download the items file manually once (or via requests if needed) and save it
# in the project root as items.json, which is the path this cell reads.
try:
    with open("items.json", "r", encoding="utf-8") as f:
        items_dd = json.load(f)
    id_to_name = {int(k): v.get("name", str(k)) for k, v in items_dd.get("data", {}).items()}
    # Preview: name the most used item IDs from E1 if that cell has run,
    # otherwise fall back to the first 5 IDs in the mapping
    try:
        sample_ids = [int(k) for k in most_used_items.index]
    except NameError:
        sample_ids = list(id_to_name)[:5]
    print("Sample item names:")
    for k in sample_ids:
        print(k, id_to_name.get(k, f"unknown ({k})"))
except FileNotFoundError:
    print("items.json not found. Place Data Dragon items JSON in the project root as items.json.")

### E3: Movement heatmaps with player names
Goal: Ensure heatmap captions and titles show the correct player names, using the participantId/puuid mapping from the match info. Add snapshots and notes here as they are generated.
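
One way to keep titles consistent is to build the participantId-to-name mapping once and reuse it for every plot. This is a minimal sketch, assuming a match-v5 style participant list with `participantId` and either `riotIdGameName` or `summonerName`; the local JSON layout may differ. The helpers `build_name_map` and `heatmap_title` are hypothetical names, not functions from `extract_positions.py`.

```python
def build_name_map(participants):
    """Map participantId -> display name, falling back to a placeholder."""
    names = {}
    for p in participants:
        pid = p["participantId"]
        # Prefer the Riot ID game name, then the legacy summoner name
        names[pid] = p.get("riotIdGameName") or p.get("summonerName") or f"Participant {pid}"
    return names

def heatmap_title(names, participant_id, champion=None):
    """Build a plot title that always carries a player label."""
    name = names.get(participant_id, f"Participant {participant_id}")
    return f"Movement heatmap: {name}" + (f" ({champion})" if champion else "")

# Hypothetical sample data, for illustration only
participants = [
    {"participantId": 1, "riotIdGameName": "PlayerOne", "summonerName": "OldName"},
    {"participantId": 2, "summonerName": "PlayerTwo"},
    {"participantId": 3},
]
names = build_name_map(participants)
print(heatmap_title(names, 1, "Ahri"))
```

The explicit fallback makes a missing name visible in the figure instead of silently mislabeling a player.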

### E4: Win/Loss by side (Blue/Red)
Goal: Fetch the last N matches, compute wins and losses by side, and visualize them as a pie chart. The JSON file (`player_winloss_side.json`) stores the summoner name, tagline, and per-match results.
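
The per-match entries could be derived from a match payload roughly as follows. This sketch assumes a match-v5 style structure (`info.participants[]` with `puuid`, `teamId`, and `win`), where team 100 is the blue side and team 200 the red side. `entry_from_match` is a hypothetical helper, not the function used in `fetch_player_winloss_side.py`.

```python
def entry_from_match(match, puuid):
    """Return {"side": "blue" | "red", "win": bool} for the player, or None."""
    for p in match["info"]["participants"]:
        if p["puuid"] == puuid:
            side = "blue" if p["teamId"] == 100 else "red"
            return {"side": side, "win": bool(p["win"])}
    return None  # player not found in this match

# Hypothetical payload, for illustration only
sample = {"info": {"participants": [
    {"puuid": "abc", "teamId": 100, "win": True},
    {"puuid": "def", "teamId": 200, "win": False},
]}}
results = [entry_from_match(sample, "abc"), entry_from_match(sample, "def")]
print(results)
```

Each entry has the `side` and `win` keys that the counting loop in this experiment expects.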

In [None]:
# Win/Loss by side pie chart
with open("player_winloss_side.json", "r") as f:
    json_data = json.load(f)

SUMMONER_NAME = json_data.get("summoner_name", "?")
TAGLINE = json_data.get("tagline", "?")
results = json_data.get("results", [])

counts = {"Blue Win": 0, "Blue Loss": 0, "Red Win": 0, "Red Loss": 0}
for entry in results:
    if entry["side"] == "blue":
        counts["Blue Win" if entry["win"] else "Blue Loss"] += 1
    else:
        counts["Red Win" if entry["win"] else "Red Loss"] += 1

labels = [f"{k} ({v})" for k, v in counts.items()]
sizes = list(counts.values())
colors = ["#4A90E2", "#B0C4DE", "#E94B3C", "#F7CAC9"]

total_wins = counts["Blue Win"] + counts["Red Win"]
total_losses = counts["Blue Loss"] + counts["Red Loss"]
TOTAL = total_wins + total_losses
winrate = (total_wins / TOTAL * 100) if TOTAL else 0

plt.figure(figsize=(7,7))
plt.pie(sizes, labels=labels, autopct='%1.1f%%', colors=colors, startangle=90, counterclock=False)
plt.title(f"{SUMMONER_NAME}#{TAGLINE} — Total: {TOTAL} | Wins: {total_wins} | Losses: {total_losses} | Winrate: {winrate:.1f}%")
plt.axis('equal')
plt.show()

### E5: Player and champion winrates (multi-match)
Notes: Aggregate results across matches per player and per champion, compute winrates, and visualize them with bar charts. To be implemented with batched match data.
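
As a starting point, the champion aggregation could look like the sketch below. It assumes each match record uses the same parallel per-player lists as E1 (`picks` and `win`); `champion_winrates` and the `min_games` threshold are illustrative, and the sample matches are made up.

```python
from collections import defaultdict

def champion_winrates(matches, min_games=1):
    """Aggregate games, wins, and winrate (%) per champion across matches."""
    games = defaultdict(int)
    wins = defaultdict(int)
    for match in matches:
        # "picks" and "win" are assumed to be parallel per-player lists, as in E1
        for champion, won in zip(match["picks"], match["win"]):
            games[champion] += 1
            wins[champion] += int(bool(won))
    return {
        champ: {"games": n, "wins": wins[champ], "winrate": wins[champ] / n * 100}
        for champ, n in games.items()
        if n >= min_games
    }

# Hypothetical data, for illustration only
matches = [
    {"picks": ["Ahri", "Garen"], "win": [True, False]},
    {"picks": ["Ahri", "Lux"], "win": [False, True]},
]
stats = champion_winrates(matches)
print(stats["Ahri"])
```

The `min_games` filter is there because winrates over one or two games are mostly noise. The resulting dictionary can be turned into a table with `pd.DataFrame.from_dict(stats, orient="index")` for plotting, and the same pattern would apply per player given a per-player name list.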

## Findings and next steps
- Document key learnings per experiment here.
- Track bugs, rate limit issues (429), data corrections, and mapping fixes.
- Outline improvements and follow-up actions for the thesis chapters.
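
For the rate limit issue, one option is a small retry wrapper around each API call. This is a sketch under stated assumptions: `fetch` is a hypothetical callable returning `(status_code, headers, body)`, and the server is assumed to send a `Retry-After` header in seconds with a 429 response. When the header is absent, the wrapper falls back to exponential backoff.

```python
import time

def call_with_retry(fetch, max_retries=3, default_wait=1.0, sleep=time.sleep):
    """Call fetch() until it returns a non-429 status or retries run out."""
    for attempt in range(max_retries + 1):
        status, headers, body = fetch()
        if status != 429:
            return status, body
        if attempt == max_retries:
            break
        # Honor Retry-After (seconds) when present, else back off exponentially
        wait = float(headers.get("Retry-After", default_wait * 2 ** attempt))
        sleep(wait)
    raise RuntimeError(f"Still rate limited after {max_retries} retries")
```

The `sleep` parameter is injectable so the wrapper can be exercised in the notebook without actually waiting.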