# Capstone Project - League of Legends Champion Recommender

> Author: Ryan Yong

**Summary:**   
- Develop a Recommender System for recommending champions to users based on their account mastery points.
- Training data: Account & Champion Data

There are a total of 7 notebooks for this project:  
 1. `01a_data_scrape.ipynb`   
 2. `01b_wiki_scrape_fail.ipynb`   
 3. `02_champion_dataset_EDA.ipynb`
 4. `03_account_dataset_EDA.ipynb`
 5. `04_intial_recommender_system.ipynb`
 6. `05_final_hybrid_system.ipynb`
 7. `06_implementation.ipynb`

---
**This Notebook**
- Represents a sample use case based on the final hybrid system created
- Serves as a precursor to the Streamlit App Module.

In [156]:
import pickle
import requests
import numpy as np
import pandas as pd
from riotwatcher import RiotWatcher, ApiError, LolWatcher

pd.set_option('display.max_rows', None)
pd.set_option('display.max_columns', None)

In [157]:
# Replace 'YOUR_API_KEY' with your actual Riot Games API key
API_KEY = 'YOUR_API_KEY'

# Initialize RiotWatcher with your API key
riot_watcher = RiotWatcher(API_KEY)
lol_watcher = LolWatcher(API_KEY)

In [158]:
# Load each variable from the pickle file
with open('../pickles/train_predicted_matrix.pkl', 'rb') as f:
    train_predicted_matrix = pickle.load(f)

with open('../pickles/Vt_k.pkl', 'rb') as f:
    Vt_k = pickle.load(f)

with open('../pickles/champion_similarity_df.pkl', 'rb') as f:
    champion_similarity_df = pickle.load(f)

with open('../pickles/scaler.pkl', 'rb') as f:
    scaler = pickle.load(f)

with open('../pickles/svd.pkl', 'rb') as f:
    svd = pickle.load(f)   

with open('../pickles/cols.pkl', 'rb') as f:
    cols = pickle.load(f) 

## League of Legends Data Variables

This document outlines key variables used in a script designed to interact with the Riot Games API for retrieving data related to the game "League of Legends."

### 1. `summoner_names`
This variable is a list of summoner names, where each name is a string formatted as `"game name#tag name"`. This format combines the player's in-game name with their associated tag in a single string.
- **Example Usage**:
  - `'Atrophy#Fiend'` represents a summoner with the game name "Atrophy" and the tag "Fiend".
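
A minimal sketch of splitting such a string into its two parts, assuming exactly one `#` separates the game name from the tag. The helper name `parse_riot_id` is hypothetical and is not part of this notebook or of any Riot library.

```python
def parse_riot_id(riot_id):
    """Split 'game name#tag name' into (game_name, tag_name)."""
    # partition() splits on the first '#' and reports whether it was found
    game_name, sep, tag_name = riot_id.partition('#')
    if not sep or not game_name or not tag_name:
        raise ValueError(f"Expected 'game name#tag name', got {riot_id!r}")
    return game_name, tag_name

print(parse_riot_id('Atrophy#Fiend'))  # ('Atrophy', 'Fiend')
```

Validating up front gives a clear error for a malformed entry instead of a failed API lookup later on.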

### 2. `account_region`
Specifies the account region for the Riot Games API calls. This is necessary to target the correct regional database when fetching player-specific data.
- **Example Values**:
  - `'asia'`: Represents the Asian server region.
  - For the full list, see `01a_data_scrape.ipynb`.

### 3. `summoner_region`
Used to specify the game server region for summoner-related queries, which can differ from the account region.
- **Example Value**:
  - `'SG2'`: Represents the Singapore server region.
  - Other common values include `'EUW1'` for Europe West and `'EUN1'` for Europe Nordic & East.
  - For the full list, see `01a_data_scrape.ipynb`.

### 4. `data_dragon_url`
This variable holds the URL to the Data Dragon service provided by Riot Games. Data Dragon offers access to static data from the game, such as champion information.
- **URL Content**:
  - The URL `'https://ddragon.leagueoflegends.com/cdn/14.8.1/data/en_US/champion.json'` points to JSON data for champions in the English language, based on game patch 14.8.1.
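
Because the patch version is embedded in the URL, it can help to build the URL from a version string rather than hard-coding it. The sketch below assumes the URL pattern shown above holds for other patches and locales; `champion_json_url` is a hypothetical helper name.

```python
def champion_json_url(version, locale='en_US'):
    """Build the Data Dragon champion.json URL for a given patch and locale."""
    return f'https://ddragon.leagueoflegends.com/cdn/{version}/data/{locale}/champion.json'

data_dragon_url = champion_json_url('14.8.1')
print(data_dragon_url)
```

Data Dragon also lists the available patch versions at `https://ddragon.leagueoflegends.com/api/versions.json`, which could be used to pick a current version instead of a fixed one.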

**Purpose**: These variables are set up to support scripts that perform API calls to obtain player data and champion details from Riot's services. They ensure that the script interacts with the correct regional databases and accesses up-to-date static game data.

**Note**: It is essential to adjust these variables as per specific requirements and ensure that they correspond to the correct versions and regions to avoid discrepancies in fetched data.


In [159]:
# List of summoner names in the format "game name#tag name"
summoner_names = [
    'Atrophy#Fiend'
]

# Region to search for summoners
account_region = 'asia'  # Replace with desired region 
summoner_region = 'SG2'

# Data Dragon URL for champion data
data_dragon_url = 'https://ddragon.leagueoflegends.com/cdn/14.8.1/data/en_US/champion.json'

## Riot API Helper Functions

This document describes a set of Python functions designed to interact with the Riot Games API. These functions facilitate the retrieval of player and champion data for the game "League of Legends."

### 1. `get_puuid(game_name, tag_name, account_region)`
This function retrieves the PUUID (Player Universally Unique Identifier) for a player based on their game name and tag.
- **Parameters**:
  - `game_name`: The in-game name of the player.
  - `tag_name`: The tag associated with the player's game name.
  - `account_region`: The region in which the player's account is registered.

### 2. `get_champion_masteries(puuid, summoner_region)`
Fetches all champion mastery data associated with a player's PUUID.
- **Parameters**:
  - `puuid`: The PUUID of the player.
  - `summoner_region`: The game server region for the summoner.

### 3. `create_champion_id_name_mapping()`
Creates a mapping of champion IDs to their respective names using the Data Dragon service provided by Riot Games. This function does not require any parameters.
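
To illustrate the mapping without a network call, the sketch below applies the same logic to a hand-written, heavily trimmed sample shaped like `champion.json`. Only the fields used here are included, and `build_id_name_mapping` is a hypothetical name for illustration.

```python
# Hand-written sample mimicking the structure of champion.json (trimmed;
# the real file has many more fields per champion).
sample_payload = {
    'data': {
        'Aatrox': {'id': 'Aatrox', 'key': '266', 'name': 'Aatrox'},
        'Khazix': {'id': 'Khazix', 'key': '121', 'name': "Kha'Zix"},
    }
}

def build_id_name_mapping(payload):
    """Map numeric champion IDs (stored as strings under 'key') to display names."""
    return {int(champ['key']): champ['name'] for champ in payload['data'].values()}

mapping = build_id_name_mapping(sample_payload)
print(mapping)  # {266: 'Aatrox', 121: "Kha'Zix"}
```

The conversion to `int` matters because the champion mastery endpoint returns `championId` as a number, while Data Dragon stores the same ID as a string.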

### 4. `get_summoner_id(puuid, summoner_region)`
Retrieves the summoner ID associated with a given PUUID.
- **Parameters**:
  - `puuid`: The PUUID of the player.
  - `summoner_region`: The game server region for the summoner.

### 5. `get_summoner_rank(summoner_id, summoner_region)`
Fetches the ranked solo queue tier and rank for a summoner based on their summoner ID.
- **Parameters**:
  - `summoner_id`: The summoner ID of the player.
  - `summoner_region`: The game server region for the summoner.

**Error Handling:** Each function is designed to handle API errors gracefully. If an API request fails, the function prints an error message detailing the issue and returns `None` or an empty list, as appropriate, to ensure the stability of the calling program.

**Usage Note:** These functions rely on the `riotwatcher` Python library. The `RiotWatcher` and `LolWatcher` clients (`riot_watcher` and `lol_watcher` in this notebook) must be initialized with an API key obtained from Riot Games.


In [160]:
def get_puuid(game_name, tag_name, account_region):
    try:
        account = riot_watcher.account.by_riot_id(account_region, game_name, tag_name)
        return account['puuid']
    except ApiError as e:
        print(f"Error occurred while fetching PUUID for {game_name}#{tag_name}: {e.response.text}")
        return None

def get_champion_masteries(puuid, summoner_region):
    if puuid:
        try:
            champion_masteries = lol_watcher.champion_mastery.by_puuid(summoner_region, puuid)
            return champion_masteries
        except ApiError as e:
            print(f"Error occurred while fetching champion masteries for {puuid}: {e.response.text}")
    return []

# Function to fetch champion data from Data Dragon and create a mapping
def create_champion_id_name_mapping():
    try:
        response = requests.get(data_dragon_url, timeout=10)
        response.raise_for_status()  # Treat HTTP error statuses as failures
        champion_data = response.json()['data']
        # Data Dragon stores the numeric champion ID as a string under 'key'
        return {int(champion['key']): champion['name'] for champion in champion_data.values()}
    except Exception as e:
        print(f"Error occurred while fetching champion data from Data Dragon: {e}")
        return None
    
def get_summoner_id(puuid, summoner_region):
    try:
        summoner = lol_watcher.summoner.by_puuid(region=summoner_region, encrypted_puuid=puuid)
        return summoner['id']
    except ApiError as e:
        print(f"Error occurred while fetching summoner ID for {puuid}: {e.response.text}")
        return None

def get_summoner_rank(summoner_id, summoner_region):
    try:
        ranks = lol_watcher.league.by_summoner(region=summoner_region, encrypted_summoner_id=summoner_id)
        ranked_solo_rank = next((entry for entry in ranks if entry['queueType'] == 'RANKED_SOLO_5x5'), None)
        if ranked_solo_rank:
            return ranked_solo_rank['tier'], ranked_solo_rank['rank']
        else:
            return None, None  # No ranked solo queue data available
    except ApiError as e:
        print(f"Error occurred while fetching summoner rank for {summoner_id}: {e.response.text}")
        return None, None

## Champion Mastery Data Retrieval and Processing

This script is designed to retrieve and process champion mastery data for each summoner from the Riot Games API, and then format this data for further analysis.

### Steps and Variables:

#### 1. `mastery_data`
A dictionary initialized to store the champion masteries for each summoner. Each key in the dictionary represents a summoner's name, and the corresponding value is another dictionary mapping champion IDs to mastery points.

#### 2. Scrape Champion Masteries
- **Process**:
  - For each summoner in the `summoner_names` list, the summoner's name is split into `game_name` and `tag_name`.
  - The PUUID (Player Universally Unique Identifier) is retrieved using the `get_puuid()` function.
  - Champion masteries are fetched using the `get_champion_masteries()` function, using the retrieved PUUID.
  - Each summoner's mastery data is structured as a dictionary mapping from champion ID to the number of mastery points and stored in `mastery_data`.

#### 3. `users_df`
- **Creation**:
  - A DataFrame is created from `mastery_data` using `pd.DataFrame.from_dict()`, with summoners as rows and champion IDs as columns.
- **Modification**:
  - The DataFrame is oriented by index (`orient='index'`), meaning the row labels are the summoners' names.
  - NaN values in the DataFrame are filled with 0 to represent no mastery points for certain champions.

#### 4. `champion_id_name_mapping`
- A dictionary is created using `create_champion_id_name_mapping()` to map champion IDs to their names. This utilizes static data fetched from Data Dragon.

#### 5. Column Renaming in DataFrame
- The columns of `users_df` are renamed according to `champion_id_name_mapping` to replace champion IDs with their names for better readability.
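
Steps 3 to 5 can be sketched on toy data as follows. The player names and mastery points are made up for illustration, and the ID-to-name dictionary is written by hand rather than fetched from Data Dragon.

```python
import pandas as pd

# Made-up mastery data: summoner -> {champion ID: mastery points}
toy_mastery_data = {
    'PlayerA#TAG': {266: 1500, 121: 300},
    'PlayerB#TAG': {121: 9000},  # no points on champion 266
}
toy_id_to_name = {266: 'Aatrox', 121: "Kha'Zix"}

# Rows are summoners, columns are champion IDs; missing champions become NaN
toy_df = pd.DataFrame.from_dict(toy_mastery_data, orient='index')

# Replace NaN with 0 and swap champion IDs for readable names
toy_df = toy_df.fillna(0).rename(columns=toy_id_to_name)
print(toy_df)
```

Any champion ID missing from the mapping keeps its numeric column label, which is a quick way to spot a Data Dragon version older than the mastery data.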

#### 6. Displaying DataFrame
- `users_df.head()` is used to display the first few rows of the DataFrame, showing a snippet of the data.

### Usage:
This script provides a per-champion view of mastery points across one or more summoners, useful for analyzing patterns in champion preferences across different players.

**Note**: This script assumes that the helper functions `get_puuid()`, `get_champion_masteries()`, and `create_champion_id_name_mapping()` are already defined, and it requires the `pandas` library for DataFrame operations. It also assumes the `riotwatcher` clients have been set up with a valid API key.


In [161]:
# Dictionary to store champion masteries for each summoner
mastery_data = {}

# Scrape champion masteries for each summoner
for summoner_name in summoner_names:
    game_name, tag_name = summoner_name.split('#')
    puuid = get_puuid(game_name, tag_name, account_region)
    champion_masteries = get_champion_masteries(puuid, summoner_region)
    mastery_data[summoner_name] = {mastery['championId']:mastery['championPoints'] for mastery in champion_masteries}

# Create DataFrame from mastery_data
users_df = pd.DataFrame.from_dict(mastery_data, orient='index')

# Fill NaN values with 0
users_df.fillna(0, inplace=True)

# Create a dictionary mapping champion IDs to their names
champion_id_name_mapping = create_champion_id_name_mapping()

# Rename DataFrame columns using the champion ID name mapping
users_df.rename(columns=champion_id_name_mapping, inplace=True)

users_df.head()


Unnamed: 0,Kha'Zix,Riven,LeBlanc,Ekko,Zed,Fizz,Vayne,Hecarim,Jayce,Gangplank,Lee Sin,Kassadin,Orianna,Jhin,Syndra,Rengar,Katarina,Ezreal,Kennen,Twisted Fate,Nasus,Viktor,Azir,Gragas,Corki,Ahri,Cassiopeia,Ryze,Diana,Yasuo,Akali,Lucian,Elise,Graves,Fiora,Varus,Lissandra,Caitlyn,Vladimir,Vel'Koz,Jarvan IV,Taliyah,Kayle,Darius,Kalista,Talon,Irelia,Shyvana,Miss Fortune,Brand,Udyr,Rammus,Ziggs,Nidalee,Blitzcrank,Jinx,Xerath,Draven,Tristana,Lulu,Evelynn,Jax,Lux,Sejuani,Rek'Sai,Malzahar,Gnar,Twitch,Urgot,Kindred,Nocturne,Dr. Mundo,Zac,Kog'Maw,Cho'Gath,Karthus,Bard,Rumble,Quinn,Karma,Anivia,Veigar,Annie,Volibear,Xin Zhao,Master Yi,Zoe,Ashe,Alistar,Singed,Nautilus,Heimerdinger,Shaco,Zyra,Trundle,Mordekaiser,Swain,Poppy,Thresh,Shen,Aurelion Sol,Sona,Pyke,Soraka,Galio,Nunu & Willump,Morgana,Sivir,Kayn,Olaf,Skarner,Camille,Maokai,Vi,Renekton,Malphite,Qiyana,Pantheon,Wukong,Fiddlesticks,Sylas,Illaoi,Xayah,Ivern,Braum,Tahm Kench,Warwick,Kai'Sa,Kled,Teemo,Sion,Leona,Zilean,Aatrox,Tryndamere,Senna,Ornn,Yuumi,Neeko,Amumu,Lillia,Janna,Viego,Yorick
FattyAcids97#SG2,103124,91058,81539,72977,70337,55034,53179,52934,50533,47932,45054,44636,43152,42135,41683,39349,39188,38744,36833,35602,33704,32683,31597,31583,29312,28725,28713,28521,28064,27970,25326,24683,24258,23907,23578,23330,22897,22746,22457,22308,21357,20090,19656,19459,18831,18509,18326,17379,17274,16999,16237,15803,15412,15244,15052,15006,14834,14762,13817,13793,13567,13550,13450,12959,12867,12808,12776,12718,12630,12567,12567,12049,11526,11475,11471,10631,10370,9872,9780,9776,9353,9308,9060,8469,8404,8400,8131,7740,7069,6972,6873,6645,6634,6362,6028,5878,5635,5481,5306,5194,5173,5155,4993,4974,4779,4617,4580,4469,4350,4112,4011,3948,3912,3793,3611,3579,3301,3076,3025,3009,2987,2948,2888,2862,2603,2331,2233,2160,2157,2098,1758,1734,1659,1601,1600,1474,932,855,797,747,706,596,540,225


In [162]:
# Dictionary to store summoner ranks
summoner_rank_data = {}

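# NOTE: in game, the tier order is PLATINUM < EMERALD < DIAMOND. The values below
# are left unchanged on the assumption that they match the encoding used to build
# train_predicted_matrix; changing them here alone would misalign the two.
rank_mapping = {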
    None: 0,
    'IRON': 1,
    'BRONZE': 2,
    'SILVER': 3,
    'GOLD': 4,
    'PLATINUM': 5,
    'DIAMOND': 6,
    'EMERALD': 7,
    'MASTER': 8,
    'GRANDMASTER': 9,
    'CHALLENGER': 10
}

# Scrape summoner ranks for each summoner
for summoner_name in summoner_names:
    game_name, tag_name = summoner_name.split('#')
    puuid = get_puuid(game_name, tag_name, account_region)
    summoner_id = get_summoner_id(puuid, summoner_region)
    if summoner_id:
        # Only the tier (e.g. 'GOLD') is used; the division (e.g. 'II') is discarded
        rank_tier, _ = get_summoner_rank(summoner_id, summoner_region)
        summoner_rank_data[summoner_name] = {'rank': rank_tier}
    else:
        summoner_rank_data[summoner_name] = {'rank': None}

# Create DataFrame from summoner_rank_data
rank_df = pd.DataFrame.from_dict(summoner_rank_data, orient='index')


In [163]:
# Convert tier names to their ordinal values, then display the result
rank_df['rank'] = rank_df['rank'].map(rank_mapping)
rank_df

In [164]:
# Merge users_df with rank_df
users_df_with_rank = users_df.merge(rank_df, left_index=True, right_index=True)

users_df_with_rank = users_df_with_rank.reindex(columns=cols)

## Function `process_and_predict(dataframe)`

This function preprocesses the input DataFrame (fill, log-transform, scale; the 'rank' column is excluded from these steps), projects it into the latent space of the fitted Singular Value Decomposition (SVD) model, and reconstructs predicted scores for every champion.

### Detailed Steps:

#### 1. **Exclusion of 'rank' Column**:
   - The function starts by excluding the 'rank' column from any transformations. This preserves the original rank values, which `hybrid_recommendation_with_rank` later uses to weight existing users.

#### 2. **Filling Missing Values**:
   - The function fills any missing values in the DataFrame with 0 using `dataframe.fillna(0)`. This step ensures that there are no null values which might lead to errors during transformations or predictions.

#### 3. **Log Transformation**:
   - A log transformation (using `np.log1p`) is applied to the columns specified in `columns_to_transform`. This transformation is often used to reduce skewness of data distributions.

#### 4. **Scaling**:
   - The data in `columns_to_transform` is then scaled with the previously fitted scaler loaded from `scaler.pkl` (`scaler.transform`), so a new user is scaled with the same statistics as the training data.

#### 5. **Dropping 'rank' for Prediction**:
   - The 'rank' column is temporarily dropped for the purpose of making predictions, ensuring it does not influence the outcome.

#### 6. **Matrix Factorization Using SVD**:
   - The fitted SVD model projects the transformed data into the latent space (`svd.transform`), giving the user factors `predicted_U`.
   - Multiplying `predicted_U` by `Vt_k` (already the transposed, truncated item-factor matrix, so no further transpose is applied) yields the complete predicted matrix.

#### 7. **Reinsert 'rank' Column**:
   - After predictions, the 'rank' column is reinserted back into the predicted matrix to maintain the original structure of the DataFrame.

#### 8. **Return Predicted DataFrame**:
   - The function returns the DataFrame containing the predicted data along with the original 'rank' data.
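
The pipeline above can be sketched with NumPy alone. This is an illustration under stated assumptions, not the notebook's implementation: the data is randomly generated, the manual standardization stands in for the pickled `scaler`, and the projection `X @ Vt_k.T` stands in for `svd.transform` (this is how scikit-learn's `TruncatedSVD.transform` behaves, but the types of the pickled objects are not shown here).

```python
import numpy as np

rng = np.random.default_rng(0)

# Made-up "training" mastery points: 6 users x 4 champions
train_points = rng.integers(0, 50_000, size=(6, 4)).astype(float)

# Log-transform, then standardize with statistics learned on the training data
train_log = np.log1p(train_points)
mean, std = train_log.mean(axis=0), train_log.std(axis=0)
train_scaled = (train_log - mean) / std

# Keep the top-k right singular vectors (stand-in for the pickled Vt_k)
k = 2
_, _, Vt = np.linalg.svd(train_scaled, full_matrices=False)
Vt_k = Vt[:k]

# New user: identical preprocessing, reusing the training statistics
new_points = np.array([[12_000.0, 0.0, 800.0, 30_000.0]])
new_scaled = (np.log1p(new_points) - mean) / std

# Project into the latent space, then map back to champion space
predicted_U = new_scaled @ Vt_k.T
predicted = predicted_U @ Vt_k
print(predicted.shape)  # (1, 4)
```

The reconstruction is a rank-`k` approximation of the user's scaled vector, which is why champions the user has never played can still receive a non-trivial predicted score.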

### Example Usage:
```python
# Assuming the DataFrame is read from a CSV file
dataframe = pd.read_csv('path_to_file.csv')
predictions = process_and_predict(dataframe)
print(predictions.head())
```


In [165]:
def process_and_predict(dataframe):
    """Log-transform, scale and project mastery data, then rebuild predicted scores."""
    # Exclude 'rank' from transformations
    columns_to_transform = dataframe.columns.drop('rank')

    # Fill missing values (fillna returns a copy, so the caller's DataFrame is untouched)
    dataframe = dataframe.fillna(0)

    # Apply log transformation to reduce skew in mastery points
    dataframe[columns_to_transform] = np.log1p(dataframe[columns_to_transform])

    # Scale the data with the previously fitted scaler
    dataframe[columns_to_transform] = scaler.transform(dataframe[columns_to_transform])

    # Drop 'rank' so it does not enter the SVD projection
    dataframe_without_rank = dataframe.drop(['rank'], axis=1)

    # Project the user(s) into the latent space of the fitted SVD
    predicted_U = svd.transform(dataframe_without_rank)

    # Calculate all user predicted ratings
    predicted_matrix = np.dot(predicted_U, Vt_k)
    predicted_matrix = pd.DataFrame(
        predicted_matrix,
        index=dataframe_without_rank.index,
        columns=dataframe_without_rank.columns,
    )

    # Reinsert the 'rank' column
    predicted_matrix.insert(loc=0, column='rank', value=dataframe['rank'])

    return predicted_matrix

# Example usage:
# dataframe = pd.read_csv('path_to_file.csv')  # Assuming the DataFrame is read from a CSV file
# predictions = process_and_predict(dataframe)
# print(predictions.head())


## Function `hybrid_recommendation_with_rank(new_user_matrix, top_n=10)`

This function provides a hybrid recommendation by integrating rank-based weighting with collaborative and content-based filtering techniques. It predicts which League of Legends champions a user is likely to prefer.

### Detailed Steps:

#### 1. **Calculate Rank-Based Weights**:
   - Extracts the 'rank' column from the training predicted matrix to determine the ranks of all existing users.
   - Calculates two sets of weights:
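     - `primary_weights`: The signed rank difference `(ranks - user_rank) / 10`, so existing users ranked above the new user get positive weights, those ranked below get negative weights, and same-rank users get zero.
     - `secondary_weights`: Exponential decay `exp(-|ranks - user_rank| / 10)` of the absolute difference, which shrinks the contribution of users whose rank is far from the new user's.
   - Computes `total_weights` as the product of the primary and secondary weights, combining the direction of the rank difference with the distance-based decay.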
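
A small worked example of the weight calculation, using three made-up existing users with encoded ranks 3, 5 and 8 and a new user at rank 5. The user labels are hypothetical.

```python
import numpy as np
import pandas as pd

ranks = pd.Series([3, 5, 8], index=['user_a', 'user_b', 'user_c'])
user_rank = 5

# Signed, linearly scaled rank difference
primary_weights = (ranks - user_rank) / 10.0
# Exponential decay on the absolute rank difference
secondary_weights = np.exp(-np.abs(ranks - user_rank) / 10.0)
total_weights = primary_weights * secondary_weights

print(total_weights.round(4))
# user_a   -0.1637
# user_b    0.0000
# user_c    0.2222
```

With this formula a user at exactly the same rank contributes nothing, and a lower-ranked user contributes with a negative sign, which is worth keeping in mind when interpreting the collaborative scores.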

#### 2. **Weight Collaborative Predictions**:
   - Removes the 'rank' column from the training matrix to focus only on the collaborative prediction data.
   - Multiplies the collaborative predictions by the `total_weights` to adjust the influence of each existing user based on their rank relative to the new user.

#### 3. **Aggregate Collaborative Predictions**:
   - Takes the mean of the weighted predictions across all users to consolidate into a single predicted rating for each champion.

#### 4. **Compute Content-Based Scores from Champion Similarity**:
   - Drops the 'rank' column from the new user matrix.
   - Multiplies the new user's predicted preference vector by the precomputed champion similarity matrix (`champion_similarity_df`) with a dot product, generating content-based scores. No cosine similarity is computed at this step; the similarities are assumed to be already encoded in that matrix.

#### 5. **Combine Collaborative and Content-Based Scores**:
   - Adds the content-based scores to the collaborative predictions to form combined scores for each champion.

#### 6. **Sort and Select Top N Recommendations**:
   - Converts combined scores into a pandas Series for easier manipulation.
   - Sorts the scores in descending order and selects the top `N` champions as recommended.
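
Steps 4 to 6 can be illustrated on a three-champion toy example. The similarity values, user vector and collaborative scores below are made up; only the arithmetic mirrors the function.

```python
import numpy as np
import pandas as pd

champions = ['Ahri', 'Azir', 'Zyra']

# Made-up symmetric champion similarity matrix
similarity = pd.DataFrame(
    [[1.0, 0.5, 0.2],
     [0.5, 1.0, 0.1],
     [0.2, 0.1, 1.0]],
    index=champions, columns=champions,
)

new_user = np.array([[1.0, 0.0, 2.0]])                       # predicted preferences (1 x 3)
collaborative = pd.Series([0.1, 0.4, -0.2], index=champions)  # rank-weighted mean scores

# Content-based scores: user vector times the similarity matrix
content_scores = np.dot(new_user, similarity.to_numpy()).flatten()

# Add both signals, then sort and keep the top N
combined = pd.Series(content_scores + collaborative.values, index=champions)
top = combined.sort_values(ascending=False).head(2)
print(top)
# Zyra    2.0
# Ahri    1.5
```

Because the two score sets are simply added, their relative scales determine which signal dominates the ranking.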

### Example Usage:
```python
# Assuming new_user_matrix and necessary matrices are predefined
top_recommendations = hybrid_recommendation_with_rank(new_user_matrix, top_n=10)
print(top_recommendations)
```


In [166]:
def hybrid_recommendation_with_rank(new_user_matrix, top_n=10):
    # Calculate the rank-based weights for all existing users
    ranks = train_predicted_matrix['rank']
    
    user_rank = new_user_matrix['rank'].item()

    primary_weights = (ranks - user_rank) / 10.0
    secondary_weights = np.exp(-np.abs(ranks - user_rank) / 10.0)
    total_weights = primary_weights * secondary_weights
    train_predicted_matrix_without_rank = train_predicted_matrix.drop('rank',axis=1)
    # Weight the collaborative predictions by these rank weights
    weighted_predictions = train_predicted_matrix_without_rank.mul(total_weights, axis=0)

    # Combine the predictions into a single predicted rating for each champion
    collaborative_predictions = weighted_predictions.mean(axis=0)

    new_user_matrix_without_rank = new_user_matrix.drop('rank', axis = 1)

    # Content-based scores: dot product of the new user's predicted preferences with the champion similarity matrix
    content_based_scores = np.dot(new_user_matrix_without_rank.values, champion_similarity_df)

    # Add the content-based scores to the collaborative predictions
    combined_scores = content_based_scores.flatten() + collaborative_predictions.values

    # Convert combined_scores to a Series for easy manipulation
    combined_scores = pd.Series(combined_scores, index=champion_similarity_df.index)

    # Sort and pick top N recommendations
    return combined_scores.sort_values(ascending=False).head(top_n)


In [None]:
user_predicted_matrix = process_and_predict(users_df_with_rank)

user_predicted_matrix

In [170]:
print(f"Recommended Champion for {summoner_names[0]}")

hybrid_recommendation_with_rank(user_predicted_matrix)

Recommended Champion for FattyAcids97#SG2


Champion
Udyr            17.377257
Azir            17.317563
Zyra            17.295997
Talyah          16.901508
Thresh          16.857535
Aurelion Sol    16.737879
LeBlanc         16.698497
Ahri            16.690087
Cassiopeia      16.326877
Corki           16.170978
dtype: float64