## Websites for Anime
1. MyAnimeList (Where the data was extracted) - https://myanimelist.net/
2. Jikan API (Tool used to extract data from MAL) - https://jikan.moe/
3. JikanPy (Python Wrapper for Jikan, includes GitHub and documentation) - https://github.com/abhinavk99/jikanpy

# cooledtured Anime Recommender System

## Instructions
1. Before running this code, upload the following two files to this Colab:
    - **client_secret_407060465423-3r8r7du6bg9l4gpivsiqrmas6cj8voru.apps.googleusercontent.com.json**
    - **peak-castle-454322-n7-b67cde6c3167.json**
    - *These files are located in the Google Drive via Recommendation System -> Anime RecSystem Required Files (https://drive.google.com/drive/u/1/folders/1Zxd0uJxAkPma1ZJ_X628ULP8krwRNQVj)*
    - *These files will give you access to the database stored within cooledtured's google sheets drive*
    - *NOTE: Every time you run this model, you need to upload these files to the Colab again, since Google disconnects the runtime and deletes uploaded files after an extended amount of time.*
2. The first section of this Colab (Installing Required Packages) installs the packages required to run the model. The Colab *should* already have these packages installed from previous runs, but run all the lines starting with "!pip" just in case.
3. The second section of this Colab (Getting Access to Model/Dashboard) sets up a working dashboard that recommends anime from MyAnimeList's "Top Anime" section. Run every cell in it.
    - The code creates a new file called tab1_top_anime_data.csv. You can ignore this file; the model uses it.
    - The database does not include *every* anime from MyAnimeList, but it does include a substantial number (10,000+ anime). This should be suitable for recommendations and for cooledtured's initiatives.
    - At the bottom of this section, there should be a new dashboard where you can enter any anime you want, with several filtering options. Have fun trying it out!
    - *NOTE: The dashboard also includes a "public URL" that you can click to view the dashboard on a larger screen. This link only lasts until the Colab disconnects (i.e. after the code in this file has not been run for an extended amount of time).*

## Potential Errors
1. There may be an error telling you that one of these two files does not exist:
    - client_secret_407060465423-3r8r7du6bg9l4gpivsiqrmas6cj8voru.apps.googleusercontent.com.json
    - peak-castle-454322-n7-b67cde6c3167.json
    - **This means that you did not upload the files!** Make sure to upload them before running this code.
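
To catch this error before any Google Sheets call is made, a quick pre-flight check can list which credential files are missing. The following is a minimal sketch, not part of the original notebook; `find_missing_files` is a hypothetical helper, and it assumes the files are uploaded to the Colab working directory.

```python
from pathlib import Path

# File names taken from the instructions above.
REQUIRED_FILES = [
    "client_secret_407060465423-3r8r7du6bg9l4gpivsiqrmas6cj8voru.apps.googleusercontent.com.json",
    "peak-castle-454322-n7-b67cde6c3167.json",
]

def find_missing_files(filenames, directory="."):
    """Return the names in `filenames` that are not present in `directory`."""
    base = Path(directory)
    return [name for name in filenames if not (base / name).is_file()]

missing = find_missing_files(REQUIRED_FILES)
if missing:
    print("Upload these files before running the notebook:")
    for name in missing:
        print(f"  - {name}")
else:
    print("All credential files found.")
```

Running this as the first cell gives a readable message instead of a `FileNotFoundError` traceback further down.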

## Documentation for Anime RecSystem
### (Includes everything mentioned above + more)
https://docs.google.com/document/d/1TYxpXEpBzYitrHQUjruQqaQRtexd6kSF-8qRUvOb8n4/edit?usp=sharing

# Installing Required Packages

In [None]:
!pip install jikanpy-v4 # documentation: https://jikanpy.readthedocs.io/en/latest/

Collecting jikanpy-v4
  Downloading jikanpy_v4-1.0.2-py3-none-any.whl.metadata (6.4 kB)
Downloading jikanpy_v4-1.0.2-py3-none-any.whl (15 kB)
Installing collected packages: jikanpy-v4
Successfully installed jikanpy-v4-1.0.2


In [None]:
!pip install --upgrade google-api-python-client google-auth-httplib2 google-auth-oauthlib gspread

Collecting google-api-python-client
  Downloading google_api_python_client-2.175.0-py3-none-any.whl.metadata (7.0 kB)
Downloading google_api_python_client-2.175.0-py3-none-any.whl (13.7 MB)
Installing collected packages: google-api-python-client
  Attempting uninstall: google-api-python-client
    Found existing installation: google-api-python-client 2.174.0
    Uninstalling google-api-python-client-2.174.0:
      Successfully uninstalled google-api-python-client-2.174.0
Successfully installed google-api-python-client-2.175.0


In [None]:
!pip install scikit-learn



In [None]:
!pip install gradio



# Getting Access to Model/Dashboard

In [None]:
from jikanpy import Jikan # For MyAnimeList
import pandas as pd
import gspread
from google.auth.transport.requests import Request
from google.oauth2.service_account import Credentials
import time
import requests
import json
import csv
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity
import numpy as np
from sklearn.preprocessing import MinMaxScaler
import gradio as gr

SCOPES = ['https://www.googleapis.com/auth/spreadsheets', 'https://www.googleapis.com/auth/drive']
creds = Credentials.from_service_account_file('peak-castle-454322-n7-b67cde6c3167.json', scopes=SCOPES)
client = gspread.authorize(creds)
sheet = client.open_by_key('14rlMEGcS52vgP_BrFYPlQ-q_iFL2GXJng1tPxb3Vj38')

FileNotFoundError: [Errno 2] No such file or directory: 'peak-castle-454322-n7-b67cde6c3167.json'

In [None]:
class GoogleSheet:
    def __init__(self, credentials_file, sheet_id, sheet_tab=0):
        """
        Initialize the GoogleSheet connection.

        :param credentials_file: Path to the Google service account credentials.
        :param sheet_id: Google Sheet ID.
        :param sheet_tab: Sheet tab name or index (default is 0, the first sheet).
        """
        self.credentials_file = credentials_file
        self.sheet_id = sheet_id
        self.sheet_tab = sheet_tab
        self.client = None
        self.sheet = None

    def __enter__(self):
        """Establish the connection when entering the context."""
        creds = Credentials.from_service_account_file(self.credentials_file, scopes=SCOPES)
        self.client = gspread.authorize(creds)
        spreadsheet = self.client.open_by_key(self.sheet_id)

        # Select the sheet tab (by index or name)
        if isinstance(self.sheet_tab, int):
            self.sheet = spreadsheet.get_worksheet(self.sheet_tab)  # Select by index (0-based)
        else:
            self.sheet = spreadsheet.worksheet(self.sheet_tab)  # Select by name

        return self

    def get_data(self):
        """Fetch all data from the selected sheet tab."""
        return self.sheet.get_all_values()

    def __exit__(self, exc_type, exc_value, traceback):
        """Handle cleanup if needed."""
        pass


In [None]:
def save_to_csv(data, filename):
    """Save list of lists (Google Sheet data) to a CSV file."""
    with open(filename, mode='w', newline='', encoding='utf-8') as file:
        writer = csv.writer(file)
        writer.writerows(data)

# Define credentials and sheet details
credentials_file = 'peak-castle-454322-n7-b67cde6c3167.json'
sheet_id = '14rlMEGcS52vgP_BrFYPlQ-q_iFL2GXJng1tPxb3Vj38'
tab_name_or_index = 0

with GoogleSheet(credentials_file, sheet_id, tab_name_or_index) as gs:
    data = gs.get_data()

save_to_csv(data, 'tab1_top_anime_data.csv')
print("CSV file saved successfully!")

CSV file saved successfully!


In [None]:
'''
Recommender for top anime
'''
# Step 1: Load and clean base DataFrame
df = pd.read_csv("tab1_top_anime_data.csv")
df['genres'] = df['genres'].fillna('')
df['synopsis'] = df['synopsis'].fillna('')
df['rating'] = df['rating'].fillna('Unknown')

# Step 2: Normalize rating labels
rating_map = {
    'G - All Ages': 'G',
    'PG - Children': 'PG',
    'PG-13 - Teens 13 or older': 'PG-13',
    'R - 17+ (violence & profanity)': 'R',
    'R+ - Mild Nudity': 'R+',
    'Rx - Hentai': 'Rx'
}
df["rating_clean"] = df["rating"].map(rating_map)

# Step 3: Filter function
def filter_by_rating(dataframe, allowed_ratings):
    return dataframe[dataframe['rating_clean'].isin(allowed_ratings)]

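# Step 4: Apply rating filter before computing TF-IDF.
# reset_index keeps row labels aligned with the row positions used by cosine_sim;
# copy() avoids pandas' SettingWithCopyWarning when the 'content' column is added below.
filtered_df = filter_by_rating(df, ['PG-13', 'R', 'G', 'PG', 'R+', 'Rx']).reset_index(drop=True).copy()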

# Step 5: Combine genres and synopsis for content
filtered_df['content'] = filtered_df['genres'] + " " + filtered_df['synopsis']

# Step 6: Compute TF-IDF and Cosine Similarity on filtered data
tfidf = TfidfVectorizer(stop_words='english')
tfidf_matrix = tfidf.fit_transform(filtered_df['content'])
cosine_sim = cosine_similarity(tfidf_matrix, tfidf_matrix)

# Step 7: Recommender function
def get_filtered_recommendations(title, df, cosine_sim, min_score=7, min_popularity=1000, base_type="N/A", base_rating="N/A"):
    title_lower = title.lower()

    # Match by original or English title
    match_idx = df[df['title'].str.lower() == title_lower].index
    if match_idx.empty:
        match_idx = df[df['title_english'].str.lower() == title_lower].index

    if match_idx.empty:
        return f"Anime '{title}' not found in dataset."

    idx = match_idx[0]
    sim_scores = list(enumerate(cosine_sim[idx]))
    sim_scores = sorted(sim_scores, key=lambda x: x[1], reverse=True)
    anime_indices = [i[0] for i in sim_scores]

    filtered_anime = df.iloc[anime_indices].copy()

    # Apply filtering
    if base_type != "N/A":
        filtered_anime = filtered_anime[filtered_anime['type'] == base_type]
    if base_rating != "N/A":
        filtered_anime = filtered_anime[filtered_anime['rating_clean'] == base_rating]

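    # Note: despite its name, min_popularity acts as an upper bound on the
    # popularity value (the dashboard labels it "Maximum Popularity").
    filtered_anime = filtered_anime[(filtered_anime['score'] >= min_score) &
                                    (filtered_anime['popularity'] < min_popularity)]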

    # Exclude the base anime
    filtered_anime = filtered_anime[~filtered_anime['title'].str.lower().str.contains(title_lower, regex=False)]
    filtered_anime = filtered_anime[~filtered_anime['title_english'].str.lower().str.contains(title_lower, regex=False, na=False)]

    # Trending score
    filtered_anime['trending_score'] = (
        0.2 * filtered_anime['score'] +
        0.4 * filtered_anime['favorites'] +
        0.4 * (1 / (filtered_anime['popularity'] + 1))
    )

    return filtered_anime[['title', 'title_english', 'genres', 'score', 'rating_clean', 'type', 'popularity', 'trending_score']].head(10)

# Step 8: Example usage
filtered_recommendations = get_filtered_recommendations("Your Name.", filtered_df, cosine_sim, base_type="TV", base_rating="R")
print(filtered_recommendations)


A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  filtered_df['content'] = filtered_df['genres'] + " " + filtered_df['synopsis']


                                         title  \
548                                 Durarara!!   
1609                            Gakkougurashi!   
122                     Yojouhan Shinwa Taikei   
1700                             Kekkai Sensen   
1859                                      No.6   
19          Code Geass: Hangyaku no Lelouch R2   
236   Mushoku Tensei: Isekai Ittara Honki Dasu   
827          Full Metal Panic! The Second Raid   
942                            Tokyo Revengers   
2783                    Toaru Majutsu no Index   

                                title_english  \
548                                Durarara!!   
1609                             School-Live!   
122                         The Tatami Galaxy   
1700               Blood Blockade Battlefront   
1859                                    No. 6   
19    Code Geass: Lelouch of the Rebellion R2   
236     Mushoku Tensei: Jobless Reincarnation   
827         Full Metal Panic! The Second Raid   
942     
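
The `trending_score` in the recommender above mixes values on very different scales: `score` (0 to 10 in the dashboard slider), raw `favorites` counts, and `1 / (popularity + 1)` (at most 1), so `favorites` is likely to dominate the weighted sum. One possible adjustment, which is not part of the original notebook, is to min-max scale each component before weighting. The sketch below uses plain Python and hypothetical sample rows; `min_max` and `scaled_trending_scores` are illustrative names.

```python
def min_max(values):
    """Rescale a list of numbers to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values are identical, so none is more "trending" than another.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

def scaled_trending_scores(rows, weights=(0.2, 0.4, 0.4)):
    """rows: list of dicts with 'score', 'favorites' and 'popularity' keys."""
    w_score, w_fav, w_pop = weights
    scores = min_max([r["score"] for r in rows])
    favorites = min_max([r["favorites"] for r in rows])
    inverse_pop = min_max([1 / (r["popularity"] + 1) for r in rows])
    return [
        w_score * s + w_fav * f + w_pop * p
        for s, f, p in zip(scores, favorites, inverse_pop)
    ]

# Hypothetical sample rows, for illustration only.
sample = [
    {"title": "A", "score": 9.0, "favorites": 50000, "popularity": 10},
    {"title": "B", "score": 8.0, "favorites": 1000, "popularity": 500},
    {"title": "C", "score": 7.0, "favorites": 20, "popularity": 3000},
]
for row, trending in zip(sample, scaled_trending_scores(sample)):
    print(row["title"], round(trending, 3))
```

With every component in [0, 1], the 0.2 / 0.4 / 0.4 weights express the relative importance they appear to be intended to have. The notebook already imports `MinMaxScaler`, which could serve the same purpose on DataFrame columns.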

In [None]:
import gradio as gr
import pandas as pd

# Make sure these are defined globally before the interface is launched
# df = ...  # Your full anime dataset
# cosine_sim = ...  # Your similarity matrix

# Safely wrap the function to capture errors and debug
def recommend_interface(title, min_score, min_popularity, base_type, base_rating):
    try:
        result = get_filtered_recommendations(
            title,
            df=filtered_df,  # same DataFrame that cosine_sim was computed from
            cosine_sim=cosine_sim,
            min_score=min_score,
            min_popularity=min_popularity,
            base_type=base_type,
            base_rating=base_rating
        )
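        # The recommender returns a plain string when the title is not found
        if isinstance(result, str):
            return pd.DataFrame([{"Message": result}])
        if result is None or result.empty:
            return pd.DataFrame([{"Message": "No recommendations found."}])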
        return result
    except Exception as e:
        return pd.DataFrame([{"Error": str(e)}])

# Inputs
inputs = [
    gr.Textbox(label="Anime Title"),
    gr.Slider(minimum=0, maximum=10, step=0.1, value=7, label="Minimum Score"),
    gr.Number(value=1000, label="Maximum Popularity"),
    # Use value= (not placeholder=) so an untouched box really sends "N/A" and skips the filter
    gr.Textbox(value="N/A", label="Base Type (e.g., TV, TV Special, Movie, OVA, Music, ONA, Special, PV, CM, N/A)"),
    gr.Textbox(value="N/A", label="Base Rating (e.g., PG-13, R, R+, PG, G, Rx, N/A)")
]

# Interface
gr.Interface(
    fn=recommend_interface,
    inputs=inputs,
    outputs=gr.Dataframe(),
    title="Anime Recommender System",
    description="Search for anime recommendations with customizable filters!"
).launch()

It looks like you are running Gradio on a hosted a Jupyter notebook. For the Gradio app to work, sharing must be enabled. Automatically setting `share=True` (you can turn this off by setting `share=False` in `launch()` explicitly).

Colab notebook detected. To show errors in colab notebook, set debug=True in launch()
* Running on public URL: https://b0af73d1c2602e6646.gradio.live

This share link expires in 1 week. For free permanent hosting and GPU upgrades, run `gradio deploy` from the terminal in the working directory to deploy to Hugging Face Spaces (https://huggingface.co/spaces)




# Movie Recommendation System

# **Instructions**

Before running this notebook, upload one essential file:  
**`optimum-web-454302-e0-f5b48548795c.json`**

* This file allows the script to connect and write to a Google Sheet using the `gspread` library.
* **Important:** You’ll need to re-upload this file each time you reconnect to Colab, because uploaded files are cleared when your session ends.

---

# **Library Setup and Authorization**

* Installs required packages and authorizes access to Google Sheets and the TMDB API.
* Imports libraries, loads your credentials, and links to the appropriate Google Sheet via its key.

---

# **Retrieving Genre Info**

* Uses the TMDB API to fetch genre mappings for both movies and TV series.
* These mappings are used to convert numerical genre IDs into readable genre names.
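
As a rough illustration of that conversion, the sketch below builds a lookup from a genre payload and turns a list of IDs into readable names. The payload is a small hand-written sample in the `{"genres": [{"id": ..., "name": ...}]}` shape that the scraping code reads, not a live API response, and `build_genre_lookup` and `genre_names` are hypothetical helper names.

```python
def build_genre_lookup(payload):
    """Map genre ID -> genre name from a genre-list payload."""
    return {genre["id"]: genre["name"] for genre in payload.get("genres", [])}

def genre_names(genre_ids, lookup):
    """Convert a list of genre IDs into a comma-separated string of names."""
    return ", ".join(lookup.get(gid, "Unknown") for gid in genre_ids)

# Hand-written sample payload, for illustration only.
sample_payload = {
    "genres": [
        {"id": 28, "name": "Action"},
        {"id": 12, "name": "Adventure"},
    ]
}

lookup = build_genre_lookup(sample_payload)
print(genre_names([28, 12, 999], lookup))
```

IDs missing from the lookup fall back to "Unknown" rather than raising an error, which matches how the scraping code handles them.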

---

# **Utility Functions for Data Retrieval**

* Checks if a specific worksheet already exists, or creates it if it doesn't.
* Pulls the top 3 actors and any listed directors for a given movie or show using its TMDB ID.
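
The cast and crew extraction can be tried offline on a sample credits payload. This is a minimal sketch under the assumption that the payload has `cast` and `crew` lists whose entries carry `name` and (for crew) `job` keys, as the scraping code expects; the names in `sample_credits` are placeholders and `top_cast_and_directors` is a hypothetical helper.

```python
def top_cast_and_directors(credits, max_actors=3):
    """Return (actors, directors) as comma-separated strings."""
    actors = [person["name"] for person in credits.get("cast", [])[:max_actors]]
    directors = [
        person["name"]
        for person in credits.get("crew", [])
        if person.get("job") == "Director"  # .get avoids a KeyError when 'job' is absent
    ]
    return ", ".join(actors), ", ".join(directors)

# Placeholder payload, for illustration only.
sample_credits = {
    "cast": [
        {"name": "Actor One"},
        {"name": "Actor Two"},
        {"name": "Actor Three"},
        {"name": "Actor Four"},
    ],
    "crew": [
        {"name": "Director One", "job": "Director"},
        {"name": "Writer One", "job": "Writer"},
        {"name": "No Job Listed"},
    ],
}

actors, directors = top_cast_and_directors(sample_credits)
print(actors)
print(directors)
```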

---

# **Getting Popular Titles from TMDB**

* Calls TMDB’s API for popular movies and shows, scanning up to 500 pages for each category.
* Gathers details like:
  * Title
  * ID
  * Release Date
  * Rating
  * Genres
  * Summary
  * Cast
  * Crew
* Results are automatically sorted from most to least popular.
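
The page-scanning and sorting logic can be sketched independently of the network. In the example below, `fetch_page` stands in for the real HTTP request and returns `None` on failure; both `collect_pages` and `fake_fetch` are hypothetical names, and the page contents are made up for illustration.

```python
def collect_pages(fetch_page, max_pages):
    """Gather items page by page, stop at the first failure, then rank by popularity."""
    items = []
    for page in range(1, max_pages + 1):
        results = fetch_page(page)
        if results is None:  # treat None as a failed request and stop scanning
            break
        items.extend(results)
    return sorted(items, key=lambda item: item["popularity"], reverse=True)

# Made-up pages standing in for API responses.
FAKE_PAGES = {
    1: [{"title": "First", "popularity": 12.5}, {"title": "Second", "popularity": 80.1}],
    2: [{"title": "Third", "popularity": 45.0}],
}

def fake_fetch(page):
    return FAKE_PAGES.get(page)

ranked = collect_pages(fake_fetch, max_pages=500)
print([item["title"] for item in ranked])
```

Passing the fetch function as an argument keeps the loop testable without an API key or network access.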

---

# **Sending Data to Google Sheets**

* Merges the collected movie and show data, then uploads it to the Google Sheet.
* Data is written into a tab named **"Ranked Content"** and includes the following columns:

| **Column**         | **Description** |
|:-------------------|:----------------|
| **Title**          | Name of the movie or TV show |
| **Content ID**     | TMDB ID |
| **Release Date**   | Date released |
| **Rating**         | Average TMDB rating |
| **Genre**          | Genre(s) as text |
| **Popularity**     | TMDB popularity metric |
| **Synopsis**       | Short description |
| **Content Link**   | Clickable TMDB link |
| **Type**           | Movie or TV Show |
| **Actors**         | Top 3 billed cast members |
| **Directors**      | Director(s) |

**Note:** Depending on your connection and TMDB’s rate limits, the upload process may take several minutes.

---

# **Launching the Gradio Interface**

Creates an interactive app that lets users:

* Search for any title and view content-based recommendations.
* Filter results by:
  * Minimum rating
  * Content type (Movie, TV Show, or Both)
  * Genre
* Explore two types of recommendations:
  * Content-based (based on synopsis and genre)
  * Cast-based (based on shared actors or directors)

**Tip:** If nothing shows up, try searching for a more common or exact title name.
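
When a search returns nothing, suggesting near matches can help users find the exact title. The sketch below uses `difflib.get_close_matches` from the Python standard library; `suggest_titles` and the small `catalog` list are hypothetical and only illustrate the idea, they are not how the notebook currently searches.

```python
from difflib import get_close_matches

def suggest_titles(query, titles, limit=3, cutoff=0.6):
    """Return up to `limit` titles that closely resemble `query`, ignoring case."""
    lowered = {title.lower(): title for title in titles}
    hits = get_close_matches(query.strip().lower(), list(lowered), n=limit, cutoff=cutoff)
    return [lowered[hit] for hit in hits]

# Small illustrative catalog.
catalog = ["Inception", "Interstellar", "The Dark Knight"]
print(suggest_titles("Incepton", catalog))
```

Raising `cutoff` makes suggestions stricter; lowering it returns looser matches.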

---

# **Potential Errors**

* **If the notebook doesn't run, it's likely because the required files may not have been downloaded. Make sure to download them!**
* Some titles contain special characters, such as "é". Searching without them may return no results, so enter the exact title in that case.
* Because of the sheer number of movies and shows, some titles were simply not retrieved and are missing from the dataset.
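
One possible way to make searches tolerant of accented characters such as "é" is to fold both the query and the stored titles to an accent-free, case-insensitive form before comparing. This is a sketch using the standard-library `unicodedata` module; `fold_title` is a hypothetical helper and is not part of the current notebook.

```python
import unicodedata

def fold_title(title):
    """Strip accents, surrounding whitespace and case differences from a title."""
    decomposed = unicodedata.normalize("NFKD", title)
    without_accents = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return without_accents.casefold().strip()

print(fold_title("Amélie") == fold_title("amelie"))
```

Comparing `fold_title(query)` against a precomputed folded title column would let "Amelie" match "Amélie" without requiring the exact spelling.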


## Websites for Movies

1. TMDB (Where the data was extracted) - https://www.themoviedb.org/?language=en-US
2. TMDB API Key - 'b8efb431ca874795fa3bd90a9216e38b'
3. TMDB genre list endpoint - "https://api.themoviedb.org/3/genre/{content_type}/list?api_key={api_key}&language=en-US"

Google Sheet ID (the key passed to `open_by_key`): 1ztcWL119qy67Ox7JsHp1cmiQ158yJ9abKlfAjUj6_AY

In [None]:
!pip install gradio
import pandas as pd
import gradio as gr
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity
from difflib import get_close_matches

# 🔗 Google Sheet CSV URL
google_sheet_url = "https://docs.google.com/spreadsheets/d/1ztcWL119qy67Ox7JsHp1cmiQ158yJ9abKlfAjUj6_AY/export?format=csv&gid=1096224009"
df = pd.read_csv(google_sheet_url)


Collecting gradio
  Downloading gradio-5.28.0-py3-none-any.whl.metadata (16 kB)
Collecting aiofiles<25.0,>=22.0 (from gradio)
  Downloading aiofiles-24.1.0-py3-none-any.whl.metadata (10 kB)
Collecting fastapi<1.0,>=0.115.2 (from gradio)
  Downloading fastapi-0.115.12-py3-none-any.whl.metadata (27 kB)
Collecting ffmpy (from gradio)
  Downloading ffmpy-0.5.0-py3-none-any.whl.metadata (3.0 kB)
Collecting gradio-client==1.10.0 (from gradio)
  Downloading gradio_client-1.10.0-py3-none-any.whl.metadata (7.1 kB)
Collecting groovy~=0.1 (from gradio)
  Downloading groovy-0.1.2-py3-none-any.whl.metadata (6.1 kB)
Collecting pydub (from gradio)
  Downloading pydub-0.25.1-py2.py3-none-any.whl.metadata (1.4 kB)
Collecting python-multipart>=0.0.18 (from gradio)
  Downloading python_multipart-0.0.20-py3-none-any.whl.metadata (1.8 kB)
Collecting ruff>=0.9.3 (from gradio)
  Downloading ruff-0.11.7-py3-none-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (25 kB)
Collecting safehttpx<0.2.0,>=0.1.6

# Installing Required Packages

# Movie Recommender System

In [None]:
!pip install gradio

Collecting gradio
  Downloading gradio-5.27.0-py3-none-any.whl.metadata (16 kB)
Collecting aiofiles<25.0,>=22.0 (from gradio)
  Downloading aiofiles-24.1.0-py3-none-any.whl.metadata (10 kB)
Collecting fastapi<1.0,>=0.115.2 (from gradio)
  Downloading fastapi-0.115.12-py3-none-any.whl.metadata (27 kB)
Collecting ffmpy (from gradio)
  Downloading ffmpy-0.5.0-py3-none-any.whl.metadata (3.0 kB)
Collecting gradio-client==1.9.0 (from gradio)
  Downloading gradio_client-1.9.0-py3-none-any.whl.metadata (7.1 kB)
Collecting groovy~=0.1 (from gradio)
  Downloading groovy-0.1.2-py3-none-any.whl.metadata (6.1 kB)
Collecting pydub (from gradio)
  Downloading pydub-0.25.1-py2.py3-none-any.whl.metadata (1.4 kB)
Collecting python-multipart>=0.0.18 (from gradio)
  Downloading python_multipart-0.0.20-py3-none-any.whl.metadata (1.8 kB)
Collecting ruff>=0.9.3 (from gradio)
  Downloading ruff-0.11.7-py3-none-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (25 kB)
Collecting safehttpx<0.2.0,>=0.1.6 (

In [None]:
!pip install scikit-learn

# Scraping the data

In [None]:
# Will take hours to run

import requests
import gspread
from oauth2client.service_account import ServiceAccountCredentials
from datetime import datetime

# TMDB API Key
api_key = 'b8efb431ca874795fa3bd90a9216e38b'

# Google Sheets setup
scope = ["https://spreadsheets.google.com/feeds",
         "https://www.googleapis.com/auth/spreadsheets",
         "https://www.googleapis.com/auth/drive.file",
         "https://www.googleapis.com/auth/drive"]

# Load credentials
creds = ServiceAccountCredentials.from_json_keyfile_name(
    r"C:\Users\12407\Downloads\optimum-web-454302-e0-f5b48548795c.json", scope
)
client = gspread.authorize(creds)

# Open the Google Sheet
spreadsheet = client.open_by_key("1ztcWL119qy67Ox7JsHp1cmiQ158yJ9abKlfAjUj6_AY")

# Function to check if a sheet exists or create a new one
def get_or_create_sheet(spreadsheet, sheet_name, headers):
    try:
        sheet = spreadsheet.worksheet(sheet_name)
        print(f"Found sheet: {sheet_name}.")
    except gspread.exceptions.WorksheetNotFound:
        sheet = spreadsheet.add_worksheet(title=sheet_name, rows="10000", cols="11")  # Update to 11 columns for new data
        print(f"Created new sheet: {sheet_name}.")

    # Clear the sheet before updating it
    sheet.clear()
    sheet.insert_row(headers, 1)
    return sheet

# Function to get genre dictionary
def get_genres(content_type):
    url = f"https://api.themoviedb.org/3/genre/{content_type}/list?api_key={api_key}&language=en-US"
    response = requests.get(url)

    if response.status_code == 200:
        genres = {genre['id']: genre['name'] for genre in response.json().get('genres', [])}
        return genres
    else:
        print(f"Failed to fetch {content_type} genres. Status: {response.status_code}")
        return {}

# Fetch movie and TV genres
movie_genres = get_genres("movie")
tv_genres = get_genres("tv")

# Function to get actors and directors for a specific content ID (movie or TV show)
def get_actors_and_directors(content_type, content_id):
    url = f"https://api.themoviedb.org/3/{content_type}/{content_id}/credits?api_key={api_key}&language=en-US"
    response = requests.get(url)

    if response.status_code == 200:
        data = response.json()
        # Get top 3 actors
        actors = [actor['name'] for actor in data.get('cast', [])[:3]]  # Adjust the number of actors as needed
        # Get director(s)
        directors = [crew['name'] for crew in data.get('crew', []) if crew.get('job') == 'Director']
        return ", ".join(actors), ", ".join(directors)
    else:
        print(f"Failed to fetch credits for {content_type} with ID {content_id}. Status: {response.status_code}")
        return "", ""  # Return empty strings if unable to fetch

# Function to fetch popular content with actors and directors
def fetch_popular_content_with_actors_and_directors(url, content_type, genres, pages=50):
    items = []

    for page in range(1, pages + 1):
        print(f"Fetching page {page} for {content_type}...")
        response = requests.get(f"{url}&page={page}")

        if response.status_code == 200:
            data = response.json()
            for item in data.get('results', []):
                title = item.get('title', item.get('name', 'Unknown'))
                content_id = item.get('id', 'N/A')
                release_date = item.get('release_date') if content_type == "movie" else item.get('first_air_date')
                rating = item.get('vote_average', 'N/A')
                popularity = item.get('popularity', 0)
                genre_ids = item.get('genre_ids', [])
                genre_names = ", ".join([genres.get(gid, "Unknown") for gid in genre_ids])
                synopsis = item.get('overview', 'No synopsis available.')
                content_link = f"https://www.themoviedb.org/{content_type}/{content_id}"

                # Get actors and directors
                actors, directors = get_actors_and_directors(content_type, content_id)

                # Add content type (movie or tv)
                content_type_label = "movie" if content_type == "movie" else "tv show"
                items.append([title, content_id, release_date or 'N/A', rating, genre_names, popularity, synopsis, content_link, content_type_label, actors, directors])
        else:
            print(f"Failed to get {content_type} data. Status: {response.status_code}")
            break  # Stop if any request fails

    return items

# Get or create the sheet for ranked movies and TV shows with actors and directors columns
ranked_content_sheet = get_or_create_sheet(spreadsheet, "Ranked Content",
                                           ["Title", "Content ID", "Release Date", "Rating", "Genre", "Popularity", "Synopsis", "Content Link", "Type", "Actors", "Directors"])

# Fetch and store popular movies (up to 500 pages)
popular_movies_url = f"https://api.themoviedb.org/3/movie/popular?api_key={api_key}&language=en-US"
popular_movies_data = fetch_popular_content_with_actors_and_directors(popular_movies_url, "movie", movie_genres, pages=500)

# Fetch and store popular TV shows (up to 500 pages)
popular_tv_url = f"https://api.themoviedb.org/3/tv/popular?api_key={api_key}&language=en-US"
popular_tv_data = fetch_popular_content_with_actors_and_directors(popular_tv_url, "tv", tv_genres, pages=500)

# Combine movies and TV shows data
combined_data = popular_movies_data + popular_tv_data

# Sort combined data by popularity
combined_data.sort(key=lambda x: x[5], reverse=True)

# Insert data into the sheet
if combined_data:
    ranked_content_sheet.append_rows(combined_data)

print("Ranked Movies and TV Shows have been updated successfully!")


# Getting the Recommendations + Gradio

In [None]:
!pip install gradio

Collecting gradio
  Downloading gradio-5.28.0-py3-none-any.whl.metadata (16 kB)
Collecting aiofiles<25.0,>=22.0 (from gradio)
  Downloading aiofiles-24.1.0-py3-none-any.whl.metadata (10 kB)
Collecting fastapi<1.0,>=0.115.2 (from gradio)
  Downloading fastapi-0.115.12-py3-none-any.whl.metadata (27 kB)
Collecting ffmpy (from gradio)
  Downloading ffmpy-0.5.0-py3-none-any.whl.metadata (3.0 kB)
Collecting gradio-client==1.10.0 (from gradio)
  Downloading gradio_client-1.10.0-py3-none-any.whl.metadata (7.1 kB)
Collecting groovy~=0.1 (from gradio)
  Downloading groovy-0.1.2-py3-none-any.whl.metadata (6.1 kB)
Collecting pydub (from gradio)
  Downloading pydub-0.25.1-py2.py3-none-any.whl.metadata (1.4 kB)
Collecting python-multipart>=0.0.18 (from gradio)
  Downloading python_multipart-0.0.20-py3-none-any.whl.metadata (1.8 kB)
Collecting ruff>=0.9.3 (from gradio)
  Downloading ruff-0.11.8-py3-none-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (25 kB)
Collecting safehttpx<0.2.0,>=0.1.6

In [None]:
import pandas as pd
import gradio as gr
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Load dataset from Google Sheets
google_sheet_url = "https://docs.google.com/spreadsheets/d/1ztcWL119qy67Ox7JsHp1cmiQ158yJ9abKlfAjUj6_AY/export?format=csv&gid=1096224009"
df = pd.read_csv(google_sheet_url)

# Fill missing values
df.fillna('', inplace=True)

# Normalize genres
def normalize_genres(genre_str):
    separators = [",", "|", "/", ";"]
    genre_str = genre_str.lower()
    for sep in separators:
        genre_str = genre_str.replace(sep, ",")
    return set(g.strip() for g in genre_str.split(",") if g.strip())

df['Normalized_Genre'] = df['Genre'].apply(normalize_genres)

# Vectorize synopsis using TF-IDF
tfidf = TfidfVectorizer(stop_words='english')
tfidf_matrix = tfidf.fit_transform(df['Synopsis'].astype(str))

# Cosine similarity matrix
cosine_sim = cosine_similarity(tfidf_matrix, tfidf_matrix)

# Helper: Get all entries matching the user's title
def get_matching_titles(user_title):
    user_title = user_title.strip().lower()
    matches = df[df['Title'].str.lower() == user_title]
    if matches.empty:
        # Try loose matches
        matches = df[df['Title'].str.lower().str.contains(user_title, regex=False)]  # literal substring match, not a regex
    return matches

# Helper: Recommend titles based on selected index and content type
def get_recommendations(selected_index, extra_genre, min_rating, num_recs, content_type):
    original = df.iloc[selected_index]
    original_title = original['Title'].strip().lower()
    original_release = original['Release Date']
    original_genres = original['Normalized_Genre']

    sim_scores = list(enumerate(cosine_sim[selected_index]))
    sim_scores = sorted(sim_scores, key=lambda x: x[1], reverse=True)

    recommendations = []
    seen_entries = set()


    # Filter recommendations based on content type (Movie, TV Show, or Both)
    for idx, score in sim_scores[1:]:  # Start from index 1 to skip the original
        row = df.iloc[idx]
        row_title = row['Title'].strip().lower()
        row_release = row['Release Date']
        row_type = row['Type'].lower()

        # Skip if same title and release date as original
        if row_title == original_title and row_release == original_release:
            continue

        entry_key = (row_title, row_release)
        if entry_key in seen_entries:
            continue


        row_genres = row['Normalized_Genre']
        if not original_genres.issubset(row_genres):
            continue

        if extra_genre:
            if extra_genre.lower().strip() not in row_genres:
                continue

        try:
            if float(row['Rating']) < min_rating:
                continue
        except (TypeError, ValueError):  # skip rows whose rating is not numeric
            continue

        # Filter by content type
        if content_type != 'both' and content_type != row_type:
            continue

        seen_entries.add(entry_key)


        recommendations.append({
            "Title": row['Title'],
            "Release Date": row['Release Date'],
            "Rating": row['Rating'],
            "Genre": row['Genre'],
            "Popularity": row['Popularity'],
            "Synopsis": row['Synopsis'],
            "Content Link": row['Content Link'],
            "Type": row['Type'],
            "Actors": row['Actors'],
            "Directors": row['Directors']
        })

        if len(recommendations) >= num_recs:
            break

    return recommendations


# New Helper: Get additional recs based on shared actors or directors
def get_shared_cast_recommendations(selected_index, max_results=10):
    original = df.iloc[selected_index]
    original_title = original['Title'].strip().lower()
    original_release = original['Release Date']
    original_actors = set(actor.strip().lower() for actor in original['Actors'].split(",") if actor.strip())
    original_directors = set(d.strip().lower() for d in original['Directors'].split(",") if d.strip())

    recs = []
    seen = set()

    for i, row in df.iterrows():
        if i == selected_index:
            continue

        title = row['Title'].strip().lower()
        release = row['Release Date']
        if title == original_title and release == original_release:
            continue

        # Check for shared actors or directors
        row_actors = set(actor.strip().lower() for actor in row['Actors'].split(",") if actor.strip())
        row_directors = set(d.strip().lower() for d in row['Directors'].split(",") if d.strip())

        shared_actor = original_actors.intersection(row_actors)
        shared_director = original_directors.intersection(row_directors)

        if shared_actor or shared_director:
            key = f"{title}_{release}"
            if key in seen:
                continue
            seen.add(key)
            recs.append({
                "Title": row['Title'],
                "Release Date": row['Release Date'],
                "Rating": row['Rating'],
                "Genre": row['Genre'],
                "Popularity": row['Popularity'],
                "Synopsis": row['Synopsis'],
                "Content Link": row['Content Link'],
                "Type": row['Type'],
                "Actors": row['Actors'],
                "Directors": row['Directors'],
                "Shared With": "Actors" if shared_actor else "Directors"
            })
        if len(recs) >= max_results:
            break

    return recs


def step2_generate(selected_index, extra_genre, min_rating, num_recs, content_type):
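    # Guard against clicking "Generate" before a title version has been selected
    if selected_index is None:
        return "Please search for a title and select a version first."
    recs = get_recommendations(selected_index, extra_genre, min_rating, num_recs, content_type)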
    if not recs:
        return "No recommendations found with these filters."

    original = df.iloc[selected_index]
    original_title = original['Title'].strip().lower()
    original_actors = set(a.strip().lower() for a in str(original['Actors']).split(',') if a.strip())
    original_directors = set(d.strip().lower() for d in str(original['Directors']).split(',') if d.strip())

    def format_names(name_set):
        return ', '.join(name.title() for name in sorted(name_set))

    display = "### 🎯 **Top Content-Based Recommendations:**\n\n"
    for i, rec in enumerate(recs, 1):
        display += f"**{i}. {rec['Title']} ({rec['Release Date']})**\n"
        display += f"- Genre: {rec['Genre']}\n"
        display += f"- Rating: {rec['Rating']}\n"
        display += f"- Synopsis: {rec['Synopsis'][:300]}...\n"
        display += f"- Link: {rec['Content Link']}\n\n"

    # Helper to track matches
    actor_matches = []
    director_matches = []
    seen_titles = set([original_title])  # Start with original title to avoid showing it again

    for idx, row in df.iterrows():
        row_title = row['Title'].strip()
        row_lower_title = row_title.lower()
        if row_lower_title in seen_titles:
            continue

        actors = set(a.strip().lower() for a in str(row['Actors']).split(',') if a.strip())
        directors = set(d.strip().lower() for d in str(row['Directors']).split(',') if d.strip())

        shared_actors = original_actors.intersection(actors)
        shared_directors = original_directors.intersection(directors)

        if shared_actors:
            actor_matches.append((len(shared_actors), row, shared_actors))
        if shared_directors:
            director_matches.append((len(shared_directors), row, shared_directors))

        seen_titles.add(row_lower_title)

    # Sort by popularity (higher = more popular) and keep the top 10
    top_actor_matches = sorted(
        actor_matches,
        key=lambda x: -float(x[1]['Popularity']) if str(x[1]['Popularity']).replace('.', '', 1).isdigit() else 0
    )[:10]

    top_director_matches = sorted(
        director_matches,
        key=lambda x: -float(x[1]['Popularity']) if str(x[1]['Popularity']).replace('.', '', 1).isdigit() else 0
    )[:10]


    display += "---\n\n### 🎭 **Top 10 Recommendations (Shared Actors):**\n\n"
    for count, row, shared in top_actor_matches:
        display += f"**{row['Title']} ({row['Release Date']})**\n"
        display += f"- Shared Actor(s): {format_names(shared)}\n"
        display += f"- Rating: {row['Rating']}\n"
        display += f"- Synopsis: {row['Synopsis'][:300]}...\n"
        display += f"- Link: {row['Content Link']}\n\n"

    display += "---\n\n### 🎬 **Top 10 Recommendations (Shared Directors):**\n\n"
    for count, row, shared in top_director_matches:
        display += f"**{row['Title']} ({row['Release Date']})**\n"
        display += f"- Shared Director(s): {format_names(shared)}\n"
        display += f"- Rating: {row['Rating']}\n"
        display += f"- Synopsis: {row['Synopsis'][:300]}...\n"
        display += f"- Link: {row['Content Link']}\n\n"

    return display



# Step 1: Ask for title input
def step1_title_input(user_title):
    matches = get_matching_titles(user_title)
    if matches.empty:
        return f"No matches found for '{user_title}'", None

    seen_titles = set()
    options = []

    for i, row in matches.iterrows():
        title_key = f"{row['Title'].strip().lower()}_{row['Release Date']}"
        if title_key in seen_titles:
            continue
        seen_titles.add(title_key)

        label = f"{row['Title']} ({row['Release Date']}) - {row['Synopsis'][:100]}..."
        options.append((label, i))

    if not options:
        return "No unique matches found.", None

    return "Please select the correct title version:", gr.update(choices=options)


# Gradio Interface
with gr.Blocks() as demo:
    gr.Markdown("🎬 **Movie/TV Recommendation System**")

    with gr.Row():
        title_input = gr.Textbox(label="Enter Movie/TV Title")
        title_search_btn = gr.Button("Search Title")

    title_status = gr.Markdown()
    title_dropdown = gr.Dropdown(label="Select the correct version", choices=[], interactive=True, visible=True)

    with gr.Row():
        extra_genre_input = gr.Textbox(label="Optional: Extra Genre (must also contain original genre)")
        min_rating_slider = gr.Slider(0, 10, value=5.0, label="Minimum Rating")
        num_recs_slider = gr.Slider(1, 20, value=10, step=1, label="Number of Recommendations")

    with gr.Row():
        content_type_input = gr.Radio(["both", "movie", "tv show"], label="Content Type", value="both")

    generate_btn = gr.Button("Generate Recommendations")
    output = gr.Markdown()

    title_search_btn.click(fn=step1_title_input, inputs=title_input, outputs=[title_status, title_dropdown])
    generate_btn.click(fn=step2_generate,
                       inputs=[title_dropdown, extra_genre_input, min_rating_slider, num_recs_slider, content_type_input],
                       outputs=output)

demo.launch()


It looks like you are running Gradio on a hosted a Jupyter notebook. For the Gradio app to work, sharing must be enabled. Automatically setting `share=True` (you can turn this off by setting `share=False` in `launch()` explicitly).

Colab notebook detected. To show errors in colab notebook, set debug=True in launch()
* Running on public URL: https://a1576e504da5d256e1.gradio.live

This share link expires in 1 week. For free permanent hosting and GPU upgrades, run `gradio deploy` from the terminal in the working directory to deploy to Hugging Face Spaces (https://huggingface.co/spaces)




# RecommenderSystem ---- Video Game


In [None]:
!pip install gspread gspread_dataframe oauth2client gradio



In [None]:
import pandas as pd
import gspread
from google.colab import auth
from google.auth import default
from gspread_dataframe import get_as_dataframe
from oauth2client.service_account import ServiceAccountCredentials
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import linear_kernel
from sklearn.preprocessing import MinMaxScaler
from difflib import get_close_matches

## Important Notice
This application requires access to a private Google Sheet containing the scraped game dataset.

To run the code successfully, you must:
* Have a Google service account with permission to access the target Google Sheet.
* Ensure the service account email is added as a viewer or editor on the Google Sheet.
* Run the notebook inside Google Colab, because **load_googlesheet()** uses Colab's built-in user authentication flow.
* To run in another environment, place the credentials JSON file in your project directory and authenticate with it before calling **load_googlesheet()**.

In that case, **load_googlesheet()** must be modified to use:

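      from google.oauth2.service_account import Credentials

      # Assumption: service account credentials need explicit Sheets and Drive scopes for gspread
      scopes = ['https://www.googleapis.com/auth/spreadsheets', 'https://www.googleapis.com/auth/drive']
      creds = Credentials.from_service_account_file('your_credentials.json', scopes=scopes)

      client = gspread.authorize(creds)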

## Functions
This section provides the functions that make up the recommender system.

**load_googlesheet function** accesses and loads the scraped game data. It connects to a specific Google Sheet using the gspread and pandas libraries to retrieve the up-to-date information used by the recommender system.

Instructions:
1. Ensure that you have authorized access to the Google Sheet by generating and downloading a service account credentials file.
2. Copy the Google Sheet link and the sheet tab name into variables.
3. Run the load_googlesheet() function to load the dataset into a DataFrame.
   Example: df = load_googlesheet(sheet_url, worksheet_name)
4. The returned df should contain all the columns of the Google Sheet.

Notes:
If you modify the structure of the Google Sheet, update the load_googlesheet function accordingly to maintain compatibility.


**parse_owners function** transforms the owners column of the game dataset. The owners column holds range strings such as "20,000 - 50,000". To make the column easier to analyze, this function converts each range into a single value, the midpoint of the range.

Instructions:
1. Create a new column named 'owners_avg' by applying the parse_owners function to the owners column.
 Example: df['owners_avg'] = df['owners'].apply(parse_owners)

Notes:
Ensure that the owners column contains valid range strings. If the format varies, additional error handling might be needed.  
If there are missing or invalid entries (e.g., "N/A"), consider handling those separately before applying this function; as written, parse_owners returns 0 for any value it cannot parse.
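
As a rough illustration of the conversion, the sketch below applies a range-to-midpoint parser to a few sample strings. The helper name `parse_owner_range` and the sample values are assumptions made for this example; the notebook's own `parse_owners` is defined in the code cell further down.

```python
def parse_owner_range(owner_range):
    """Return the midpoint of a range string such as "20,000 - 50,000".

    Hypothetical helper for illustration only; it returns 0 when the input
    is not a valid "low - high" string.
    """
    try:
        # Drop thousands separators, then split into exactly two bounds
        low_text, high_text = str(owner_range).replace(",", "").split("-")
        return (int(low_text) + int(high_text)) // 2
    except ValueError:
        # Raised both by a failed unpack (no "-") and by non-numeric text
        return 0

samples = ["20,000 - 50,000", "0 - 20,000", "N/A", None]
midpoints = [parse_owner_range(s) for s in samples]
print(midpoints)  # [35000, 10000, 0, 0]
```

Because invalid entries silently become 0, it may be worth counting how many zeros appear in 'owners_avg' before relying on it in the popularity score.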

**get_closest_title(title)** helps users find the most similar game title in the dataset in case they make a typo or enter a slightly incorrect name.
*   n = number of matches to return; set to 1 by default, and can be increased to receive a list of titles
*   cutoff = minimum similarity threshold; set to 0.7 by default, and can be modified

How it works:
*   Uses Python's difflib.get_close_matches to compare the input title to known game titles (indices.index)
*   If a close match (with a similarity score of at least the cutoff, 0.7) is found, it returns that title
*   If no match is found, returns None.
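
The matching behavior can be tried in isolation with a small sketch. The title list below is a made-up stand-in for `indices.index`, and `closest_title` is a hypothetical variant of the notebook's function that takes the list as an argument.

```python
from difflib import get_close_matches

# Hypothetical title list standing in for indices.index
known_titles = ["Cities: Skylines", "Portal 2", "Stardew Valley"]

def closest_title(title, titles, cutoff=0.7):
    """Return the best fuzzy match scoring at least `cutoff`, else None."""
    matches = get_close_matches(title, titles, n=1, cutoff=cutoff)
    return matches[0] if matches else None

print(closest_title("Cites Skylines", known_titles))  # Cities: Skylines
print(closest_title("zzzz", known_titles))            # None
```

The comparison is character-based and case-sensitive, so normalizing case on both sides before matching may improve results for lowercase input.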


**recommend_games(title, num_recommendations=10)** is a basic recommender model that recommends similar games based on text similarity, using TF-IDF and cosine similarity on the combination of game description, genres, and tags.

Parameters:
* title = game title
* num_recommendations = number of recommendations to generate; defaults to 10.

How it works:
1. Checks whether the given title exists in the dataset (indices). If it is not found, returns an error message.
2. Looks up the similarity scores between the given title and all other games in a precomputed cosine similarity matrix (cosine_sim).
3. Sorts the games by similarity score (from highest to lowest).
4. Returns the top n most similar games.
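
The lookup-and-rank steps above can be sketched without any libraries. The titles and similarity values below are invented for illustration, and `top_similar` is a hypothetical helper rather than the notebook's function. Unlike the notebook's `recommend_games`, which drops the first sorted entry, this sketch excludes the query game by its index.

```python
def top_similar(title, titles, sim_matrix, n=2):
    """Rank the other titles by similarity to `title`, highest first."""
    if title not in titles:
        return None, f"'{title}' not found."
    idx = titles.index(title)
    # Pair each position with its score, skipping the query game itself
    scored = [(i, s) for i, s in enumerate(sim_matrix[idx]) if i != idx]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return [(titles[i], s) for i, s in scored[:n]], None

# Toy symmetric similarity matrix (assumed values, 1.0 on the diagonal)
titles = ["Game A", "Game B", "Game C", "Game D"]
sim_matrix = [
    [1.0, 0.2, 0.8, 0.5],
    [0.2, 1.0, 0.1, 0.3],
    [0.8, 0.1, 1.0, 0.4],
    [0.5, 0.3, 0.4, 1.0],
]

recs, error = top_similar("Game A", titles, sim_matrix)
print(recs)  # [('Game C', 0.8), ('Game D', 0.5)]
```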

**recommend_hybrid(title, num_recommendations=10)** is a hybrid recommender model that combines content similarity with popularity metrics.

Parameters:
* title = game title
* num_recommendations = number of recommendations to generate; defaults to 10.

How it works:
1. Checks whether the given title exists in the dataset (indices). If it is not found, returns an error message.
2. Gets the content-based similarity scores.
3. Combines the similarity score with the popularity score using a weighted average. The weights can be modified:
  * similarity weight = 75%
  * popularity weight = 25%
  * **Caution: if the popularity score is weighted too heavily, the output will be biased, because some video games such as CSGO have a very large popularity score that significantly affects the recommendations.**
4. Output DataFrame: game title, genre, tags, popularity score, current players, peak players today, review counts (positive/negative), short_description
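
To make the caution about weighting concrete, here is a minimal sketch of the weighted average with two invented candidates. The names, scores, and the `rank` helper are assumptions for illustration; both scores are assumed to lie in [0, 1].

```python
def hybrid_score(similarity, popularity, sim_weight=0.75, pop_weight=0.25):
    """Weighted average of a similarity score and a popularity score."""
    return sim_weight * similarity + pop_weight * popularity

# (similarity, popularity) pairs with made-up values
candidates = {
    "Niche, very similar": (0.80, 0.05),
    "Blockbuster, less similar": (0.40, 1.00),
}

def rank(sim_weight, pop_weight):
    """Order candidate names by hybrid score, best first."""
    return sorted(
        candidates,
        key=lambda name: hybrid_score(*candidates[name], sim_weight, pop_weight),
        reverse=True,
    )

print(rank(0.75, 0.25)[0])  # Niche, very similar
print(rank(0.50, 0.50)[0])  # Blockbuster, less similar
```

With the 75/25 split the closer match stays on top; at 50/50 the highly popular game overtakes it, which is the bias the caution describes.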

In [None]:
def load_googlesheet(sheet_url, worksheet_name):
    auth.authenticate_user()
    creds, _ = default()
    client = gspread.authorize(creds)
    spreadsheet = client.open_by_url(sheet_url)
    worksheet = spreadsheet.worksheet(worksheet_name)
    df = get_as_dataframe(worksheet)
    # Reset the index so row labels match the row positions used by cosine_sim
    df = df.dropna(how='all').reset_index(drop=True)
    return df

sheet_url = "https://docs.google.com/spreadsheets/d/1J5NcGXWvWs7NNLKJw96WVKjMKN8Dno0Wk2ohYbjJLZY"
worksheet_name = "Cleaned_game_details"
df = load_googlesheet(sheet_url, worksheet_name)


tfidf = TfidfVectorizer(stop_words='english')
tfidf_matrix = tfidf.fit_transform(df['combined_features'])

cosine_sim = linear_kernel(tfidf_matrix, tfidf_matrix)
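indices = pd.Series(df.index, index=df['name'])
# Keep the first row for each game name so a title lookup returns a single index
indices = indices[~indices.index.duplicated(keep='first')]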

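def parse_owners(owner_range):
    # Convert a range string such as "20,000 - 50,000" to its midpoint
    try:
        parts = owner_range.replace(',', '').split('-')
        low = int(parts[0].strip())
        high = int(parts[1].strip())
        return (low + high) // 2
    except (AttributeError, ValueError, IndexError):
        # Missing values or malformed ranges fall back to 0
        return 0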

df['owners_avg'] = df['owners'].apply(parse_owners)
for col in ['Peak_Today', 'Current_Players', 'owners', 'positive', 'negative']:
    df[col] = df[col].fillna(0)

scaler = MinMaxScaler()
df[['peak_score', 'player_score', 'owner_score', 'pos_score', 'neg_score']] = scaler.fit_transform(
    df[['Peak_Today', 'Current_Players', 'owners_avg', 'positive', 'negative']]
)
df['popularity_score'] = (
    0.3 * df['peak_score'] +
    0.2 * df['player_score'] +
    0.2 * df['owner_score'] +
    0.2 * df['pos_score'] -
    0.1 * df['neg_score']
)

def get_closest_title(title):
    matches = get_close_matches(title, indices.index, n=1, cutoff=0.7)
    return matches[0] if matches else None

def recommend_games(title, num_recommendations=10):
    if title not in indices:
        return None, f"'{title}' not found."

    idx = indices[title]
    sim_scores = list(enumerate(cosine_sim[idx]))
    sim_scores = sorted(sim_scores, key=lambda x: x[1], reverse=True)
    sim_scores = sim_scores[1:num_recommendations+1]

    game_indices = [i[0] for i in sim_scores]
    return df.iloc[game_indices].copy().assign(score=[x[1] for x in sim_scores]), None

def recommend_hybrid(title, num_recommendations=10):
    if title not in indices:
        return None, f"'{title}' not found in dataset."

    idx = indices[title]
    sim_scores = list(enumerate(cosine_sim[idx]))

    # Use weighted blend instead of multiplication
    hybrid_scores = [
        (i, 0.75 * sim + 0.25 * df.iloc[i]['popularity_score']) for i, sim in sim_scores if i != idx
    ]

    # Sort and select top results
    hybrid_scores = sorted(hybrid_scores, key=lambda x: x[1], reverse=True)
    top_indices = [i[0] for i in hybrid_scores[:num_recommendations]]

    # Return your preferred columns + relevance score
    # top_indices are row positions (from cosine_sim), so select with iloc
    result_df = df.iloc[top_indices][['name', 'genre', 'tags', 'popularity_score', 'Current_Players', 'Peak_Today', 'positive', 'negative', 'short_description']].copy()
    result_df['score'] = [x[1] for x in hybrid_scores[:num_recommendations]]

    return result_df, None

## Interactive Dashboard
This section creates a web-based user interface using Gradio, allowing users to personalize the game recommendations by selecting between basic and hybrid models and applying filters.


**recommend_interface()** serves as the controller for filtering the results and formatting the output.

Parameters:
* title: Game title input by user.
* num_recommendations: Number of games to recommend (5–20).
* mode: Recommendation strategy — "Content-based" or "Hybrid".
* selected_genres: Optional list of genres to filter results.
* min_popularity: Minimum popularity score threshold.
* min_positive: Minimum number of positive reviews.

How it works:
1. If the title is not found, attempts a fuzzy match using **get_close_matches()**.
2. Calls either **recommend_games()** or **recommend_hybrid()**, depending on the user's selection.
3. Filters the result set by the selected genres (if any), popularity score, and positive reviews.
4. Constructs a readable markdown block for each game, including genre, popularity, peak players, reviews, relevance score, and the short description if available.
5. Returns the results inside a scrollable window. If no games meet the criteria, returns a message instead.
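
The filtering step can be sketched on plain dictionaries. The sample rows and the `filter_results` helper are assumptions for illustration. Note one deliberate difference: the notebook checks whether a selected genre appears as a substring of the genre text, while this sketch compares against the individual comma-separated genre names.

```python
def filter_results(results, selected_genres=None, min_popularity=0.0, min_positive=0):
    """Keep games matching any selected genre and both minimum thresholds."""
    kept = []
    for game in results:
        genres = [g.strip() for g in game["genre"].split(",")]
        # An empty or missing genre selection means no genre filtering
        if selected_genres and not any(g in genres for g in selected_genres):
            continue
        if game["popularity_score"] < min_popularity or game["positive"] < min_positive:
            continue
        kept.append(game)
    return kept

# Made-up recommendation rows
results = [
    {"name": "Game A", "genre": "Strategy, Simulation", "popularity_score": 0.40, "positive": 12000},
    {"name": "Game B", "genre": "Action", "popularity_score": 0.70, "positive": 50000},
    {"name": "Game C", "genre": "Simulation", "popularity_score": 0.05, "positive": 300},
]

kept = filter_results(results, ["Simulation"], min_popularity=0.1, min_positive=1000)
print([g["name"] for g in kept])  # ['Game A']
```

Because filtering happens after the top results are selected, strict filters can leave fewer games than the requested number of recommendations.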


Before launching the interface, make sure the genres are cleaned and prepared for the dropdown filter.

**gr.Interface()** powers the dashboard and displays the recommendations based on the user's customization.

Inputs:
* Textbox: Game title (default: Cities: Skylines)
* Slider: Number of recommendations (5–20)
* Radio: Recommendation mode (Content-based or Hybrid)
* Dropdown: Genre filter (multi-select enabled)
* Slider: Minimum popularity score (0.0–1.0)
* Slider: Minimum number of positive reviews

Output: Displays recommendation results with text and optional image

In [None]:
import gradio as gr
from difflib import get_close_matches

# Clean genre list for dropdown
all_genres = df['genre'].dropna().str.split(',')
flat_genres = all_genres.explode().str.strip()
unique_genres = sorted(flat_genres.unique())

# Main recommender interface function
def recommend_interface(title, num_recommendations, mode, selected_genres, min_popularity, min_positive):
    try:
        if title not in indices:
            matches = get_close_matches(title, indices.index, n=1, cutoff=0.6)
            if matches:
                title = matches[0]
            else:
                return "Game not found."

        if mode == "Content-based":
            results, _ = recommend_games(title, num_recommendations)
        else:
            results, _ = recommend_hybrid(title, num_recommendations)

        if results is None or isinstance(results, str):
            return "No results found."

        # Filters
        if selected_genres:
            # Rows with a missing genre (NaN) are treated as not matching
            results = results[results['genre'].apply(
                lambda g: isinstance(g, str) and any(genre in g for genre in selected_genres)
            )]
        results = results[
            (results['popularity_score'] >= min_popularity) &
            (results['positive'] >= min_positive)
        ]

        if results.empty:
            return "No recommendations match your filters."

        # Output formatting
        output = ""
        for _, row in results.iterrows():
            output += f""" 🎮 {row['name']}\n
**Genre:** {row.get('genre', 'N/A')}\n
**Tags:** {row.get('tags', 'N/A')}\n
**Popularity Score:** {row.get('popularity_score', 0):.2f}\n
**Peak Today:** {row.get('Peak_Today', 0)}\n
**Positive Reviews:** {row.get('positive', 0)}\n
**Relevance Score:** {row.get('score', 0):.4f}\n
"""
            if 'short_description' in row and isinstance(row['short_description'], str):
                output += f"{row['short_description']}\n\n"
            if 'header_image_url' in row and isinstance(row['header_image_url'], str):
                output += f"![Game Image]({row['header_image_url']})\n\n"
            output += "---\n"

        # Wrap in scrollable div
        scroll_wrapper = f"""
<div style="max-height: 600px; overflow-y: auto; padding-right: 10px;">
{output}
</div>
"""
        return scroll_wrapper

    except Exception as e:
        return f"Error: {e}"

# Launch Gradio interface
gr.Interface(
    fn=recommend_interface,
    inputs=[
        gr.Textbox(label="Enter a game title", value="Cities: Skylines"),
        gr.Slider(5, 20, value=10, step=1, label="Number of Recommendations"),
        gr.Radio(["Content-based", "Hybrid"], value="Hybrid", label="Recommendation Mode"),
        gr.Dropdown(choices=unique_genres, label="Select Genres (optional)", multiselect=True),
        gr.Slider(0.0, 1.0, value=0.0, step=0.05, label="Minimum Popularity Score"),
        gr.Slider(0, int(df['positive'].max()), value=0, step=1000, label="Minimum Positive Reviews"),
    ],
    outputs=gr.Markdown(label="Recommendations"),
    title="🎮 Game Recommender",
    description="Discover games based on content similarity and popularity. Filter by genre, reviews, and more!"
).launch()


It looks like you are running Gradio on a hosted a Jupyter notebook. For the Gradio app to work, sharing must be enabled. Automatically setting `share=True` (you can turn this off by setting `share=False` in `launch()` explicitly).

Colab notebook detected. To show errors in colab notebook, set debug=True in launch()
* Running on public URL: https://8838eba899e0225bc3.gradio.live

This share link expires in 1 week. For free permanent hosting and GPU upgrades, run `gradio deploy` from the terminal in the working directory to deploy to Hugging Face Spaces (https://huggingface.co/spaces)




# Successes and Failures of our Recommender Systems

This section is mainly for interns who want to take over this project and improve on the existing models.

https://docs.google.com/document/d/17iQQPQWxp-NsMrdwiiMe6mz_K0F3ygvKkj7D_XZG7Rs/edit?usp=drive_link