# Watchify

A Flask Web App:

Uses your Spotify listening history from the past 30 days to produce a custom, downloadable visual recommendation for a movie or TV show.

* To run the code snippets that use the Spotify API and Spotipy, open the authorization link that the notebook prints under "Please go to the following URL to authorize:"
https://accounts.spotify.com/authorize?client_id=ecec60c9a316409a84a45c923f7473ee&response_type=code&redirect_uri=http%3A%2F%2Flocalhost%2F&scope=user-top-read

* After you authorize, the browser is redirected to an address that begins with http://localhost/?code= and the code is everything after the "=" in the address bar.

* Copy the code, paste it at the prompt "Enter the code from the URL (localhost/code='...'):", and press Enter.
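Copying the code by hand can go wrong if the redirect address carries other query parameters. As a minimal sketch, the code can instead be pulled out of the pasted address with Python's standard `urllib.parse`. The helper name `extract_auth_code` and the sample address are illustrative and not part of the notebook.

```python
from urllib.parse import urlparse, parse_qs


def extract_auth_code(redirect_url: str) -> str:
    """Return the value of the `code` query parameter from a redirect URL."""
    query = parse_qs(urlparse(redirect_url).query)
    codes = query.get("code")
    if not codes:
        raise ValueError("No 'code' parameter found in the redirect URL")
    return codes[0]


# Hypothetical redirect address; a real authorization code is much longer.
example_url = "http://localhost/?code=AQD_example-code_123&state=xyz"
print(extract_auth_code(example_url))
```

With a helper like this, the whole address can be pasted at the prompt rather than only the part after the "=".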

## Movie Genres Datasets, Implementation and Filtering Logic

Data Source: [Kaggle, IMDB](https://www.kaggle.com/datasets/rajugc/imdb-movies-dataset-based-on-genre)

- The datasets were last updated about 6 months ago, so they are reasonably current.

- Steps: open the link, download the zip folder containing the 16 genre datasets, run the cell below and upload the 16 files, save the resulting "combined_movies.csv", then upload that file to the Colab content folder (Files).


### 1.  Concatenating Movie Genre Data Using pandas

#### Code

In [2]:
import pandas as pd
from google.colab import files
import io

# Step 1: Upload the files
uploaded_files = files.upload()

all_data = []

# Step 2, 3, and 4: Read each uploaded CSV, add genre column, and concatenate
for filename in uploaded_files.keys():
    # Read the file content into a pandas DataFrame
    df = pd.read_csv(io.StringIO(uploaded_files[filename].decode('utf-8')))

    # Assume that the filename before '.csv' is the genre name
    genre_name = filename.split('.csv')[0]
    df['genre'] = genre_name

    all_data.append(df)

# Concatenate all dataframes
final_df = pd.concat(all_data, ignore_index=True)

# Step 5: Save the combined DataFrame into a new CSV file
final_df.to_csv('combined_movies.csv', index=False)

# Download the combined CSV
files.download('combined_movies.csv')


Saving action.csv to action.csv
Saving adventure.csv to adventure.csv
Saving animation.csv to animation.csv
Saving biography.csv to biography.csv
Saving crime.csv to crime.csv
Saving family.csv to family.csv
Saving fantasy.csv to fantasy.csv
Saving film-noir.csv to film-noir.csv
Saving history.csv to history.csv
Saving horror.csv to horror.csv
Saving mystery.csv to mystery.csv
Saving romance.csv to romance.csv
Saving scifi.csv to scifi.csv
Saving sports.csv to sports.csv
Saving thriller.csv to thriller.csv
Saving war.csv to war.csv



#### Code Breakdown

* Imports:

  The necessary libraries are imported. Pandas is for data handling, the files module from Google Colab enables file interactions, and the io module allows for stream operations.

* Uploading Files:

  The user is prompted to upload one or more CSV files using Google Colab's file upload utility.

* Data Aggregation Initialization:

  An empty list named all_data is initialized. This will be used to store the data from each uploaded file after processing.

* Processing Each Uploaded File:

  For each uploaded file:
  The content of the file is read into a pandas DataFrame.
  The genre is determined from the filename itself (it assumes the filename without the '.csv' extension is the genre name).
  A new column named 'genre' is added to the DataFrame, and it's filled with the determined genre name.
  This processed DataFrame is then added to the all_data list.

* Combining DataFrames:

  All the DataFrames in the all_data list are concatenated (combined) into a single DataFrame. This combined DataFrame includes all the data from the uploaded files, with an additional 'genre' column indicating the genre of each movie.

* Saving and Downloading:

  The final combined DataFrame is saved to a new CSV file named 'combined_movies.csv'.
  This combined CSV file is then automatically downloaded to the user's computer using Google Colab's file download utility.
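The same steps can be tried outside Colab. The sketch below assumes the uploads are available as a dictionary of file name to raw bytes, which is the shape `files.upload()` returns. The function name `combine_genre_csvs` and the two tiny CSVs are made up for illustration. It uses `Path(filename).stem` instead of `split('.csv')` to derive the genre, which is a swapped-in choice, not the notebook's method.

```python
import io
from pathlib import Path

import pandas as pd


def combine_genre_csvs(uploaded: dict) -> pd.DataFrame:
    """Concatenate uploaded CSVs, tagging each row with a genre taken from the file name."""
    frames = []
    for filename, raw_bytes in uploaded.items():
        frame = pd.read_csv(io.BytesIO(raw_bytes))
        # Path(...).stem drops the final extension: "film-noir.csv" -> "film-noir"
        frame["genre"] = Path(filename).stem
        frames.append(frame)
    return pd.concat(frames, ignore_index=True)


# Made-up stand-in for the dict returned by files.upload() (file name -> bytes)
uploaded = {
    "action.csv": b"movie_name,rating\nMovie A,8.1\n",
    "film-noir.csv": b"movie_name,rating\nMovie B,7.9\nMovie C,8.4\n",
}

combined = combine_genre_csvs(uploaded)
print(combined)
```

Because the genre label comes straight from the file name, any later lookup has to use exactly those labels (for example "scifi", as uploaded, not "sci-fi").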

### 2. Fetch Spotify User's Top 5 Most Listened Genres Over the Past 30 days

#### Code

In [3]:
!pip install spotipy

Collecting spotipy
  Downloading spotipy-2.23.0-py3-none-any.whl (29 kB)
Collecting redis>=3.5.3 (from spotipy)
  Downloading redis-5.0.1-py3-none-any.whl (250 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 250.3/250.3 kB 4.5 MB/s eta 0:00:00
Installing collected packages: redis, spotipy
Successfully installed redis-5.0.1 spotipy-2.23.0


In [30]:
import spotipy
from spotipy.oauth2 import SpotifyOAuth

# Spotify API Credentials
SPOTIPY_CLIENT_ID = 'ecec60c9a316409a84a45c923f7473ee'
SPOTIPY_CLIENT_SECRET = '3129b9225476463d86ddc4074cfc8500'
SPOTIPY_REDIRECT_URI = 'http://localhost/'

# Create OAuth2 object
sp_oauth = SpotifyOAuth(client_id=SPOTIPY_CLIENT_ID,
                        client_secret=SPOTIPY_CLIENT_SECRET,
                        redirect_uri=SPOTIPY_REDIRECT_URI,
                        scope=["user-top-read"])

# Get the authorization URL
auth_url = sp_oauth.get_authorize_url()
print("Please go to the following URL to authorize:")
print(auth_url)

# After clicking the URL and authorizing, you'll be redirected to your specified redirect_uri.
# Extract the code parameter from that URL.
code = input("Enter the code from the URL (localhost/code='...'): ")
token = sp_oauth.get_access_token(code, as_dict=False)
sp = spotipy.Spotify(auth=token)

# Fetch user's top artists over the past 30 days
top_artists = sp.current_user_top_artists(time_range='short_term')

# Count the frequency of each genre
genre_count = {}
for artist in top_artists['items']:
    for genre in artist['genres']:
        genre_capitalized = genre.capitalize()  # Capitalize first letter of each genre
        if genre_capitalized in genre_count:
            genre_count[genre_capitalized] += 1
        else:
            genre_count[genre_capitalized] = 1

# Sort genres by frequency and get top 5
top_5_genres = sorted(genre_count, key=genre_count.get, reverse=True)[:5]

print("Your Top 5 Genres from Spotify over the past 30 days:")
for genre in top_5_genres:
    print(genre)

Please go to the following URL to authorize:
https://accounts.spotify.com/authorize?client_id=ecec60c9a316409a84a45c923f7473ee&response_type=code&redirect_uri=http%3A%2F%2Flocalhost%2F&scope=user-top-read
Enter the code from the URL (localhost/code='...'): AQCKH9AmlNd5qE0sD10hauq_TGp1MU42TT4Hol7eBBfDW_4y-utKfJgYzu_Xcnvens1txgs6pYtsV3OTvQdHW5x5HGWIKGGy1CkKu1mq1WlfEka3TOjz58Bf0N35H93RGCGeMsM4Qy4_v22ByCxC2CTTAfsfReAtF4T0u1E3c3dEakbD
Your Top 5 Genres from Spotify over the past 30 days:
Pop
Reggaeton
Trap latino
Urbano latino
Art pop


#### Code Breakdown


* Imports:

  spotipy: A Python library to interact with the Spotify Web API.
  SpotifyOAuth: A helper class in the spotipy library for OAuth2 authentication with Spotify.

* Spotify API Credentials:

  The code defines the Spotify client ID, client secret, and a redirect URI as constants.
  These are used to authenticate with the Spotify API.

* Create OAuth2 Object:

  An instance of SpotifyOAuth is created using the aforementioned credentials, as well as a scope user-top-read which gives permission to access the user's top tracks and artists.

* Authorization:

  A URL is generated that the user should visit in their browser to authorize the application's access to their Spotify data.
  Once the user authorizes the app, they will be redirected to the specified SPOTIPY_REDIRECT_URI (in this case, http://localhost/). The URL will contain a code parameter.
  The user is prompted to input the code from the redirected URL.
  This code is then used to obtain an access token, which will allow the app to make requests on the user's behalf.

* Fetch User's Top Artists:

  Using the spotipy instance (with the access token), the code fetches the user's top artists by calling the current_user_top_artists method with time_range='short_term'. Spotify describes this range as approximately the last 4 weeks, which this project treats as the past 30 days.

* Genre Frequency Count:

  An empty dictionary genre_count is initialized.
  For each artist in the retrieved top artists, and for each genre associated with that artist, the genre name is capitalized.
  If the capitalized genre is already in the genre_count dictionary, its count is incremented by 1; if not, it is added to the dictionary with a count of 1.

* Get Top 5 Genres:

  The genres are sorted based on their frequencies (from the highest to the lowest) and the top 5 are selected.

* Display Top 5 Genres:

  The top 5 genres are printed to the console.
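The counting and sorting steps above can also be written with `collections.Counter` from the standard library. This is a compact sketch and not the notebook's code: the function name `top_genres` and the sample response are made up, and the sample keeps only the fields the loop reads.

```python
from collections import Counter


def top_genres(top_artists: dict, n: int = 5) -> list:
    """Return the n most frequent genres across the artists in a top-artists response."""
    counts = Counter(
        genre.capitalize()
        for artist in top_artists["items"]
        for genre in artist["genres"]
    )
    return [genre for genre, _ in counts.most_common(n)]


# Made-up sample shaped like the response used above
sample = {
    "items": [
        {"name": "Artist A", "genres": ["pop", "art pop"]},
        {"name": "Artist B", "genres": ["pop", "reggaeton"]},
        {"name": "Artist C", "genres": ["pop", "reggaeton", "trap latino"]},
    ]
}

print(top_genres(sample, n=2))  # ['Pop', 'Reggaeton']
```

Note that a genre's count is the number of top artists tagged with it, so it measures how many favourite artists share a genre, not listening time.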

### 3. Creating a Genre Map


The code below was created with the help of ChatGPT, based on my description of how the columns in the file "combined_movies.csv" are labelled.

#### Code

In [25]:
# Genre mapping
genre_mapping = {
    # Main genres
    'pop': ['action', 'adventure', 'romance'],
    'art pop': ['animation', 'fantasy', 'romance'],
    'reggaeton': ['action', 'adventure'],
    'urbano latino': ['action', 'adventure'],
    'trap latino': ['crime', 'thriller'],
    'rock': ['action', 'adventure', 'war'],
    'indie rock': ['drama', 'romance', 'adventure'],
    'classical': ['biography', 'history', 'romance'],
    'hip hop': ['action', 'crime', 'drama'],
    'jazz': ['film-noir', 'romance', 'biography'],
    'country': ['family', 'romance', 'history'],
    'electronic': ['sci-fi', 'mystery', 'thriller'],
    'metal': ['action', 'horror', 'war'],
    'folk': ['family', 'history', 'biography'],
    'blues': ['drama', 'biography', 'film-noir'],
    'r&b': ['drama', 'romance', 'crime'],
    'soul': ['drama', 'family', 'romance'],
    'punk': ['action', 'thriller', 'mystery'],
    'disco': ['comedy', 'romance', 'family'],
    'house': ['sci-fi', 'thriller', 'mystery'],
    'techno': ['sci-fi', 'action'],
    'edm': ['action', 'adventure', 'sci-fi'],
    'latin': ['romance', 'family', 'adventure'],
    'reggae': ['comedy', 'adventure', 'family'],
    'funk': ['comedy', 'romance', 'action'],
    'k-pop': ['romance', 'comedy', 'action'],
    'psychedelic': ['animation', 'fantasy', 'mystery'],
    'world': ['history', 'family', 'biography'],
    'ambient': ['mystery', 'sci-fi', 'animation'],

    # Subgenres
    'lo-fi beats': ['drama', 'romance', 'animation'],
    'vaporwave': ['sci-fi', 'mystery', 'animation'],
    'emo': ['drama', 'romance', 'mystery'],
    'hardcore': ['action', 'thriller', 'war'],
    'dubstep': ['sci-fi', 'action', 'thriller'],
    'ska': ['comedy', 'family', 'adventure'],
    'swing': ['history', 'romance', 'family'],
    'trance': ['sci-fi', 'fantasy', 'thriller'],
    'grime': ['action', 'crime', 'drama'],
    'bluegrass': ['family', 'history', 'drama'],
    'new wave': ['drama', 'sci-fi', 'mystery'],
    'post-punk': ['drama', 'mystery', 'thriller'],
    'trip hop': ['mystery', 'drama', 'sci-fi'],
    'neosoul': ['drama', 'romance', 'family'],
    'afrobeat': ['history', 'drama', 'family'],
    'chillhop': ['drama', 'animation', 'romance'],
    'synthwave': ['sci-fi', 'action', 'drama']
}

# Convert genre_mapping to DataFrame
df = pd.DataFrame.from_dict(genre_mapping, orient='index').reset_index()
df.columns = ['Music Genre', 'Movie Genre 1', 'Movie Genre 2', 'Movie Genre 3']

# Capitalize the first letter of each entry in the specified columns
df['Music Genre'] = df['Music Genre'].str.capitalize()
df['Movie Genre 1'] = df['Movie Genre 1'].str.capitalize()
df['Movie Genre 2'] = df['Movie Genre 2'].str.capitalize()
df['Movie Genre 3'] = df['Movie Genre 3'].str.capitalize()

# Display DataFrame
df

| | Music Genre | Movie Genre 1 | Movie Genre 2 | Movie Genre 3 |
|---|---|---|---|---|
| 0 | Pop | Action | Adventure | Romance |
| 1 | Art pop | Animation | Fantasy | Romance |
| 2 | Reggaeton | Action | Adventure | |
| 3 | Urbano latino | Action | Adventure | |
| 4 | Trap latino | Crime | Thriller | |
| 5 | Rock | Action | Adventure | War |
| 6 | Indie rock | Drama | Romance | Adventure |
| 7 | Classical | Biography | History | Romance |
| 8 | Hip hop | Action | Crime | Drama |
| 9 | Jazz | Film-noir | Romance | Biography |


#### Code Breakdown

* Genre Mapping:

  A dictionary named genre_mapping is defined.
  Each key in the dictionary represents a music genre, and the corresponding value is a list of movie genres that are (presumably) related or similar in theme or mood.
  The music genres are divided into main genres (like 'pop', 'rock', 'jazz', etc.) and subgenres (like 'lo-fi beats', 'vaporwave', etc.).
  For example, for the music genre 'pop', the related movie genres are 'action', 'adventure', and 'romance'.

* Convert genre_mapping to DataFrame:

  The dictionary is transformed into a pandas DataFrame using the from_dict method.
  The orient='index' argument specifies that the dictionary keys should become the DataFrame's index. The reset_index method is then used to move these indices into a column and provide a default integer index.
  The columns of the DataFrame are renamed to 'Music Genre', 'Movie Genre 1', 'Movie Genre 2', and 'Movie Genre 3' for better clarity.

* Display DataFrame:

  The resulting DataFrame, df, represents the music genres and their corresponding movie genres in a tabular form. When this code is run in a Jupyter Notebook or another interactive Python environment, the DataFrame will be displayed, showing each music genre alongside up to three related movie genres. Music genres mapped to only two movie genres (such as 'reggaeton') leave the 'Movie Genre 3' column empty.
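Because the mapping is written by hand, it may be worth checking it against the genre labels that actually exist in "combined_movies.csv". The sketch below assumes those labels are the 16 uploaded file names from section 1. The helper `unmatched_movie_genres` is hypothetical, and `sample_mapping` copies only a few entries from `genre_mapping`.

```python
def unmatched_movie_genres(genre_mapping: dict, dataset_genres: set) -> dict:
    """Map each music genre to the movie genres that have no exact label in the dataset."""
    problems = {}
    for music_genre, movie_genres in genre_mapping.items():
        missing = [g for g in movie_genres if g not in dataset_genres]
        if missing:
            problems[music_genre] = missing
    return problems


# Genre labels as derived from the 16 uploaded file names in section 1
dataset_genres = {
    "action", "adventure", "animation", "biography", "crime", "family",
    "fantasy", "film-noir", "history", "horror", "mystery", "romance",
    "scifi", "sports", "thriller", "war",
}

# A few entries copied from genre_mapping above
sample_mapping = {
    "pop": ["action", "adventure", "romance"],
    "hip hop": ["action", "crime", "drama"],
    "electronic": ["sci-fi", "mystery", "thriller"],
    "disco": ["comedy", "romance", "family"],
}

print(unmatched_movie_genres(sample_mapping, dataset_genres))
# {'hip hop': ['drama'], 'electronic': ['sci-fi'], 'disco': ['comedy']}
```

Under that assumption, 'drama' and 'comedy' have no corresponding file and 'sci-fi' is spelled 'scifi' in the dataset, so those mapped genres would never match a movie row unless the labels are aligned.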

### 4. Matching Spotify Top 5 Genres to Movie Genres with Genre Mapping

#### Code

In [6]:
import spotipy
from spotipy.oauth2 import SpotifyOAuth
import pandas as pd

# Spotify API Credentials
SPOTIPY_CLIENT_ID = 'ecec60c9a316409a84a45c923f7473ee'
SPOTIPY_CLIENT_SECRET = '3129b9225476463d86ddc4074cfc8500'
SPOTIPY_REDIRECT_URI = 'http://localhost/'

# Create OAuth2 object
sp_oauth = SpotifyOAuth(client_id=SPOTIPY_CLIENT_ID,
                        client_secret=SPOTIPY_CLIENT_SECRET,
                        redirect_uri=SPOTIPY_REDIRECT_URI,
                        scope=["user-top-read"])

# Get the authorization URL
auth_url = sp_oauth.get_authorize_url()
print("Please go to the following URL to authorize:")
print(auth_url)

# Extract the code from the redirected URL
code = input("Enter the code from the URL (localhost/code='...'): ")
token = sp_oauth.get_access_token(code, as_dict=False)
sp = spotipy.Spotify(auth=token)

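# Fetch user's top artists over the past 30 days (without time_range, Spotipy defaults to 'medium_term')
top_artists = sp.current_user_top_artists(time_range='short_term')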

# Count frequency of each genre
genre_count = {}
for artist in top_artists['items']:
    for genre in artist['genres']:
        if genre in genre_count:
            genre_count[genre] += 1
        else:
            genre_count[genre] = 1

# Sort genres by frequency and get top 5
top_5_genres = sorted(genre_count, key=genre_count.get, reverse=True)[:5]


# Genre mapping
genre_mapping = {
    # Main genres
    'pop': ['action', 'adventure', 'romance'],
    'art pop': ['animation', 'fantasy', 'romance'],
    'reggaeton': ['action', 'adventure'],
    'urbano latino': ['action', 'adventure'],
    'trap latino': ['crime', 'thriller'],
    'rock': ['action', 'adventure', 'war'],
    'indie rock': ['drama', 'romance', 'adventure'],
    'classical': ['biography', 'history', 'romance'],
    'hip hop': ['action', 'crime', 'drama'],
    'jazz': ['film-noir', 'romance', 'biography'],
    'country': ['family', 'romance', 'history'],
    'electronic': ['sci-fi', 'mystery', 'thriller'],
    'metal': ['action', 'horror', 'war'],
    'folk': ['family', 'history', 'biography'],
    'blues': ['drama', 'biography', 'film-noir'],
    'r&b': ['drama', 'romance', 'crime'],
    'soul': ['drama', 'family', 'romance'],
    'punk': ['action', 'thriller', 'mystery'],
    'disco': ['comedy', 'romance', 'family'],
    'house': ['sci-fi', 'thriller', 'mystery'],
    'techno': ['sci-fi', 'action'],
    'edm': ['action', 'adventure', 'sci-fi'],
    'latin': ['romance', 'family', 'adventure'],
    'reggae': ['comedy', 'adventure', 'family'],
    'funk': ['comedy', 'romance', 'action'],
    'k-pop': ['romance', 'comedy', 'action'],
    'psychedelic': ['animation', 'fantasy', 'mystery'],
    'world': ['history', 'family', 'biography'],
    'ambient': ['mystery', 'sci-fi', 'animation'],

    # Subgenres
    'lo-fi beats': ['drama', 'romance', 'animation'],
    'vaporwave': ['sci-fi', 'mystery', 'animation'],
    'emo': ['drama', 'romance', 'mystery'],
    'hardcore': ['action', 'thriller', 'war'],
    'dubstep': ['sci-fi', 'action', 'thriller'],
    'ska': ['comedy', 'family', 'adventure'],
    'swing': ['history', 'romance', 'family'],
    'trance': ['sci-fi', 'fantasy', 'thriller'],
    'grime': ['action', 'crime', 'drama'],
    'bluegrass': ['family', 'history', 'drama'],
    'new wave': ['drama', 'sci-fi', 'mystery'],
    'post-punk': ['drama', 'mystery', 'thriller'],
    'trip hop': ['mystery', 'drama', 'sci-fi'],
    'neosoul': ['drama', 'romance', 'family'],
    'afrobeat': ['history', 'drama', 'family'],
    'chillhop': ['drama', 'animation', 'romance'],
    'synthwave': ['sci-fi', 'action', 'drama']
}

# Using the genre mapping to find corresponding movie genres
matching_genres = set()
for spotify_genre in top_5_genres:
    if spotify_genre in genre_mapping:
        for movie_genre in genre_mapping[spotify_genre]:
            matching_genres.add(movie_genre.title())  # capitalize each word

print("Matching Genres between Spotify and Movies:", list(matching_genres))

Please go to the following URL to authorize:
https://accounts.spotify.com/authorize?client_id=ecec60c9a316409a84a45c923f7473ee&response_type=code&redirect_uri=http%3A%2F%2Flocalhost%2F&scope=user-top-read
Enter the code from the URL (localhost/code='...'): AQBuHPmnRgOJ1TbLox4Jxnc9EIT3TPqG6EvWPTJSnAUTAnsRZP4cGyKTBXQTgIWVFaipMsgPOguNRnlOtPUlfQK-xU6asQqd7BLbreBZTNU9Lxc6M9MpxcOEyTp15Ku82O6vBSUJUp94VcZhI9K6xBtw3pBnFeagdliCiedUAnfpDnp9
Matching Genres between Spotify and Movies: ['Crime', 'Adventure', 'Action', 'Animation', 'Romance', 'Fantasy', 'Thriller']


#### Code Breakdown



* Fetching User's Top Artists and Genres:

  Once the user provides the authorization code, an access token is obtained.
  This token allows the retrieval of the user's top artists from Spotify.
  For each artist, the code keeps a count of how frequently each genre is associated with the top artists.
  The genres are then sorted based on frequency, and the top 5 are extracted to top_5_genres.

* Genre Mapping:

  A dictionary named genre_mapping is defined. It maps music genres to movie genres.
  For instance, the music genre 'pop' is associated with the movie genres 'action', 'adventure', and 'romance'.

* Matching Music Genres with Movie Genres:

  The code then cross-references the top_5_genres from Spotify with the genre_mapping dictionary.
  For each of the top music genres, the code retrieves the associated movie genres.
  These movie genres are added to the matching_genres set, which ensures that there are no duplicates.

* Displaying Results:

  The code prints the list of movie genres that match with the user's top Spotify genres.
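One detail to watch: the keys of `genre_mapping` are lowercase, while the section 2 cell capitalizes genre names before printing them ("Trap latino"). The cell above works because it keeps the names exactly as Spotify returns them. A lookup that tolerates either form could be sketched as follows, where the function name `match_movie_genres` and the sample inputs are illustrative.

```python
def match_movie_genres(top_genres: list, genre_mapping: dict) -> set:
    """Collect the movie genres mapped to each music genre, ignoring case and stray spaces."""
    matches = set()
    for spotify_genre in top_genres:
        key = spotify_genre.strip().lower()
        # Genres without a mapping entry are skipped silently, as in the cell above
        for movie_genre in genre_mapping.get(key, []):
            matches.add(movie_genre.title())
    return matches


# A few entries copied from genre_mapping above
sample_mapping = {
    "pop": ["action", "adventure", "romance"],
    "trap latino": ["crime", "thriller"],
}

# Capitalized names as printed in section 2, plus one genre with no mapping entry
top = ["Pop", "Trap latino", "Bedroom pop"]
print(sorted(match_movie_genres(top, sample_mapping)))
# ['Action', 'Adventure', 'Crime', 'Romance', 'Thriller']
```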

### 5. Movie Recommendation

#### Code

In [8]:
import spotipy
from spotipy.oauth2 import SpotifyOAuth
import pandas as pd
import random

# Spotify API Credentials
SPOTIPY_CLIENT_ID = 'ecec60c9a316409a84a45c923f7473ee'
SPOTIPY_CLIENT_SECRET = '3129b9225476463d86ddc4074cfc8500'
SPOTIPY_REDIRECT_URI = 'http://localhost/'

# Create OAuth2 object
sp_oauth = SpotifyOAuth(client_id=SPOTIPY_CLIENT_ID,
                        client_secret=SPOTIPY_CLIENT_SECRET,
                        redirect_uri=SPOTIPY_REDIRECT_URI,
                        scope=["user-top-read"])

# Get the authorization URL
auth_url = sp_oauth.get_authorize_url()
print("Please go to the following URL to authorize:")
print(auth_url)

# Extract the code from the redirected URL
code = input("Enter the code from the URL (localhost/code='...'): ")
token = sp_oauth.get_access_token(code, as_dict=False)
sp = spotipy.Spotify(auth=token)

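# Fetch user's top artists over the past 30 days (without time_range, Spotipy defaults to 'medium_term')
top_artists = sp.current_user_top_artists(time_range='short_term')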

# Count frequency of each genre
genre_count = {}
for artist in top_artists['items']:
    for genre in artist['genres']:
        if genre in genre_count:
            genre_count[genre] += 1
        else:
            genre_count[genre] = 1

# Sort genres by frequency and get top 5
top_5_genres = sorted(genre_count, key=genre_count.get, reverse=True)[:5]

# Genre mapping
genre_mapping = {
    # Main genres
    'pop': ['action', 'adventure', 'romance'],
    'art pop': ['animation', 'fantasy', 'romance'],
    'reggaeton': ['action', 'adventure'],
    'urbano latino': ['action', 'adventure'],
    'trap latino': ['crime', 'thriller'],
    'rock': ['action', 'adventure', 'war'],
    'indie rock': ['drama', 'romance', 'adventure'],
    'classical': ['biography', 'history', 'romance'],
    'hip hop': ['action', 'crime', 'drama'],
    'jazz': ['film-noir', 'romance', 'biography'],
    'country': ['family', 'romance', 'history'],
    'electronic': ['sci-fi', 'mystery', 'thriller'],
    'metal': ['action', 'horror', 'war'],
    'folk': ['family', 'history', 'biography'],
    'blues': ['drama', 'biography', 'film-noir'],
    'r&b': ['drama', 'romance', 'crime'],
    'soul': ['drama', 'family', 'romance'],
    'punk': ['action', 'thriller', 'mystery'],
    'disco': ['comedy', 'romance', 'family'],
    'house': ['sci-fi', 'thriller', 'mystery'],
    'techno': ['sci-fi', 'action'],
    'edm': ['action', 'adventure', 'sci-fi'],
    'latin': ['romance', 'family', 'adventure'],
    'reggae': ['comedy', 'adventure', 'family'],
    'funk': ['comedy', 'romance', 'action'],
    'k-pop': ['romance', 'comedy', 'action'],
    'psychedelic': ['animation', 'fantasy', 'mystery'],
    'world': ['history', 'family', 'biography'],
    'ambient': ['mystery', 'sci-fi', 'animation'],

    # Subgenres
    'lo-fi beats': ['drama', 'romance', 'animation'],
    'vaporwave': ['sci-fi', 'mystery', 'animation'],
    'emo': ['drama', 'romance', 'mystery'],
    'hardcore': ['action', 'thriller', 'war'],
    'dubstep': ['sci-fi', 'action', 'thriller'],
    'ska': ['comedy', 'family', 'adventure'],
    'swing': ['history', 'romance', 'family'],
    'trance': ['sci-fi', 'fantasy', 'thriller'],
    'grime': ['action', 'crime', 'drama'],
    'bluegrass': ['family', 'history', 'drama'],
    'new wave': ['drama', 'sci-fi', 'mystery'],
    'post-punk': ['drama', 'mystery', 'thriller'],
    'trip hop': ['mystery', 'drama', 'sci-fi'],
    'neosoul': ['drama', 'romance', 'family'],
    'afrobeat': ['history', 'drama', 'family'],
    'chillhop': ['drama', 'animation', 'romance'],
    'synthwave': ['sci-fi', 'action', 'drama']
}

# Using the genre mapping to find corresponding movie genres
matching_genres_weighted = []
for spotify_genre in top_5_genres:
    if spotify_genre in genre_mapping:
        for movie_genre in genre_mapping[spotify_genre]:
            # Add the movie genre according to its frequency
            matching_genres_weighted.extend([movie_genre] * genre_count[spotify_genre])

# Read the CSV
df = pd.read_csv('/content/combined_movies.csv')

# Filter movies with rating 8 and above
df = df[df['rating'] >= 8]

# Filter movies with votes greater than or equal to 20,000
df = df[df['votes'] >= 20000]

# Filter movies based on matching genres
filtered_movies = df[df['genre'].isin(matching_genres_weighted)]

# Keep only the weighted genres that still have at least one movie after filtering,
# so that .sample() is never called on an empty selection
available_genres = set(filtered_movies['genre'])
candidate_genres = [g for g in matching_genres_weighted if g in available_genres]

# If there are no movies matching the criteria, print a message
if not candidate_genres:
    print("No movies found matching the criteria.")
else:
    # Recommend a movie from the filtered movies based on the weighted genres
    recommended_genre = random.choice(candidate_genres)
    recommended_movie = filtered_movies[filtered_movies['genre'] == recommended_genre].sample().iloc[0]

    print(f"Recommended Movie: {recommended_movie['movie_name']} (Genre: {recommended_movie['genre'].capitalize()}, Rating: {recommended_movie['rating']}, Release Year: {recommended_movie['year']})")

Please go to the following URL to authorize:
https://accounts.spotify.com/authorize?client_id=ecec60c9a316409a84a45c923f7473ee&response_type=code&redirect_uri=http%3A%2F%2Flocalhost%2F&scope=user-top-read
Enter the code from the URL (localhost/code='...'): /content/combined_movies.csv
Recommended Movie: Forrest Gump (Genre: Romance, Rating: 8.8, Release Year: 1994)


#### Code Breakdown


* Fetch User's Top Artists:

  Once authorized, the notebook fetches the user's top artists from Spotify.

* Analyze Top Artists’ Genres:

  The genres of the top artists are analyzed, and a frequency count of each genre is maintained in the genre_count dictionary.

* Extract Top 5 Genres:

  The genres are sorted by frequency, and the top 5 are selected.

* Genre-to-Movie Genre Mapping:

  A predefined genre_mapping dictionary maps Spotify music genres to corresponding movie genres. For example, if the user likes 'pop' music, they might enjoy movies in the 'action', 'adventure', and 'romance' genres.

* Get Weighted List of Matching Movie Genres:

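  Using the top 5 music genres, a weighted list of corresponding movie genres is created. Each movie genre is added once for every top artist tagged with the music genre it maps from, so movie genres tied to frequently occurring music genres are more likely to be picked.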

* Read and Filter Movies Dataset:

  The notebook reads a CSV file containing movie data.
  Movies are filtered based on:
  Rating being 8 and above.
  Number of votes being 20,000 or more.
  The genre matching the genres in the weighted list.

* Recommend a Movie:

  If there are movies that match the criteria, one movie is randomly recommended to the user based on the weighted genres list.
  If no movies match the criteria, the user is informed accordingly.
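The weighted pick can be illustrated without Spotify or the dataset. The sketch below is one possible formulation and not the notebook's exact code: it uses `random.choices` with explicit weights and skips genres that have no candidate movies. The names `pick_genre`, `weighted`, and `available` are hypothetical.

```python
import random
from collections import Counter


def pick_genre(weighted_genres: list, available_genres: set, rng: random.Random):
    """Pick a genre in proportion to its weight, ignoring genres with no candidate movies."""
    weights = Counter(g for g in weighted_genres if g in available_genres)
    if not weights:
        return None  # nothing to recommend
    genres = list(weights)
    return rng.choices(genres, weights=[weights[g] for g in genres], k=1)[0]


# Made-up weighted list: 'drama' is mapped but has no movies in the filtered data
weighted = ["action"] * 3 + ["romance"] * 3 + ["drama"] * 2
available = {"action", "romance"}

rng = random.Random(0)  # seeded only to make the example repeatable
print(pick_genre(weighted, available, rng))
```

Restricting the draw to genres that are actually present avoids selecting a genre whose movie subset is empty.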

## TV Show Dataset, Implementation and Filtering Logic

Data Source: [Kaggle, IMDB](https://www.kaggle.com/datasets/payamamanat/imbd-dataset/data)

- The dataset was last updated 16 days ago, so it is current.
- Steps: open the link, download the dataset, name the file "tvshowdata.csv", and upload it to the Colab content folder (Files).


### 1. Extracting TV Show Genre Data Using pandas


#### Code

Install the Spotipy library, which simplifies calls to the Spotify Web API.

In [9]:
!pip install spotipy



In [12]:
import pandas as pd

# Read the CSV file into a DataFrame
df = pd.read_csv('/content/tvshowdata.csv')

# Filter rows where the 'certificate' column contains "TV"
df = df[df['certificate'].str.contains("TV", na=False)]

# Drop rows where the 'genre' column is NaN
df = df.dropna(subset=['genre'])

# Split the genres on commas and flatten the list
all_genres = [genre.strip() for sublist in df['genre'].str.split(',') for genre in sublist]

# Get unique genres using a set
unique_genres = set(all_genres)

# Convert unique genres into a DataFrame
df_genres = pd.DataFrame(sorted(unique_genres), columns=['Genres'])
df_genres

| | Genres |
|---|---|
| 0 | Action |
| 1 | Adventure |
| 2 | Animation |
| 3 | Biography |
| 4 | Comedy |
| 5 | Crime |
| 6 | Documentary |
| 7 | Drama |
| 8 | Family |
| 9 | Fantasy |


#### Code Breakdown

* Importing Required Library:

  The pandas library, a popular data handling library in Python, is imported with the alias pd.

* Reading Data:

  The CSV file located at /content/tvshowdata.csv is read into a DataFrame named df. A DataFrame is a two-dimensional labeled data structure in pandas, similar to a table.

* Filtering by Certificate:

  The code filters the DataFrame to retain only rows where the 'certificate' column contains the string "TV". This operation is useful when the goal is to focus only on TV-certified shows.

* Handling Missing Genre Data:

  The DataFrame is further refined by dropping (removing) rows where the 'genre' column has a missing value (NaN or Not a Number).

* Extracting and Cleaning Genres:

  The genres in the 'genre' column are split on commas. This means if a row has "Drama, Thriller", it gets split into two separate strings "Drama" and "Thriller".
  The result of the split is a list of lists. To get a single list of genres, the code "flattens" it by iterating over each sublist and each genre within that sublist.
  While doing so, it also strips (removes) any extra spaces from each genre using the strip() method.

* Obtaining Unique Genres:

  To get a list of unique genres without any duplicates, the code converts the list of all genres into a set named unique_genres. In Python, sets store unique values.

* Creating a DataFrame for Unique Genres:

  The unique genres are converted back to a list (and sorted) to form a DataFrame named df_genres with a single column labeled 'Genres'.

* Output:

  Displays the df_genres DataFrame.
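The split-and-flatten step can also be expressed with pandas' own `Series.explode`, which turns each list element into its own row. The sketch below runs on a small made-up table that uses the same column names as the notebook ('certificate', 'genre'). It is an alternative formulation, not the code used above.

```python
import pandas as pd

# Made-up rows; only the 'certificate' and 'genre' columns matter here
shows = pd.DataFrame(
    {
        "title": ["Show A", "Show B", "Show C", "Show D"],
        "certificate": ["TV-MA", "PG-13", "TV-14", None],
        "genre": ["Drama, Thriller", "Action", "Comedy,Drama", "Crime"],
    }
)

# Keep rows whose certificate contains "TV"; missing certificates count as no match
tv_only = shows[shows["certificate"].str.contains("TV", na=False)]

genres = (
    tv_only["genre"]
    .dropna()         # drop rows without a genre
    .str.split(",")   # "Drama, Thriller" -> ["Drama", " Thriller"]
    .explode()        # one genre per row
    .str.strip()      # remove the leftover spaces
)

unique_genres = sorted(genres.unique())
print(unique_genres)  # ['Comedy', 'Drama', 'Thriller']
```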

### 2. Fetch Spotify User's Top 5 Most Listened Genres Over the Past 30 days

#### Code

In [31]:
import spotipy
from spotipy.oauth2 import SpotifyOAuth

# Spotify API Credentials
SPOTIPY_CLIENT_ID = 'ecec60c9a316409a84a45c923f7473ee'
SPOTIPY_CLIENT_SECRET = '3129b9225476463d86ddc4074cfc8500'
SPOTIPY_REDIRECT_URI = 'http://localhost/'

# Create OAuth2 object
sp_oauth = SpotifyOAuth(client_id=SPOTIPY_CLIENT_ID,
                        client_secret=SPOTIPY_CLIENT_SECRET,
                        redirect_uri=SPOTIPY_REDIRECT_URI,
                        scope=["user-top-read"])

# Get the authorization URL
auth_url = sp_oauth.get_authorize_url()
print("Please go to the following URL to authorize:")
print(auth_url)

# After clicking the URL and authorizing, you'll be redirected to your specified redirect_uri.
# Extract the code parameter from that URL.
code = input("Enter the code from the URL (localhost/code='...'): ")
token = sp_oauth.get_access_token(code, as_dict=False)
sp = spotipy.Spotify(auth=token)

# Fetch user's top artists over the past 30 days
top_artists = sp.current_user_top_artists(time_range='short_term')

# Count the frequency of each genre
genre_count = {}
for artist in top_artists['items']:
    for genre in artist['genres']:
        genre_capitalized = genre.capitalize()  # Capitalize first letter of each genre
        if genre_capitalized in genre_count:
            genre_count[genre_capitalized] += 1
        else:
            genre_count[genre_capitalized] = 1

# Sort genres by frequency and get top 5
top_5_genres = sorted(genre_count, key=genre_count.get, reverse=True)[:5]

print("Your Top 5 Genres from Spotify over the past 30 days:")
for genre in top_5_genres:
    print(genre)

Please go to the following URL to authorize:
https://accounts.spotify.com/authorize?client_id=ecec60c9a316409a84a45c923f7473ee&response_type=code&redirect_uri=http%3A%2F%2Flocalhost%2F&scope=user-top-read
Enter the code from the URL (localhost/code='...'): AQCKH9AmlNd5qE0sD10hauq_TGp1MU42TT4Hol7eBBfDW_4y-utKfJgYzu_Xcnvens1txgs6pYtsV3OTvQdHW5x5HGWIKGGy1CkKu1mq1WlfEka3TOjz58Bf0N35H93RGCGeMsM4Qy4_v22ByCxC2CTTAfsfReAtF4T0u1E3c3dEakbD
Your Top 5 Genres from Spotify over the past 30 days:
Pop
Reggaeton
Trap latino
Urbano latino
Art pop


#### Code Breakdown

* Library Imports:

  The spotipy library is imported, which is a Python client for the Spotify Web API.
  The SpotifyOAuth class is imported from spotipy.oauth2 to handle OAuth2 authorization with Spotify.

* Spotify API Credentials:

  Three constants (SPOTIPY_CLIENT_ID, SPOTIPY_CLIENT_SECRET, and SPOTIPY_REDIRECT_URI) are defined to store the Spotify application's credentials and the URI to redirect after a successful authorization.

* Setting up OAuth2 Authentication:

  An instance of SpotifyOAuth is created using the previously defined credentials and the desired scope of "user-top-read", which means it will have access to the user's top played tracks and artists.

* Authorization Flow:

  An authorization URL is obtained using the get_authorize_url() method of the OAuth2 object. The user is prompted to visit this URL to provide permission.
  After the user gives permission on the Spotify website, they'll be redirected to the redirect_uri which contains a 'code' parameter in the URL. This code is input manually by the user to continue the authentication flow.
  The get_access_token() method exchanges the provided code for an access token which is then used to create an authenticated instance of spotipy.Spotify.

* Fetching Top Artists:

  The method current_user_top_artists() of the spotipy instance fetches the user's top artists over the past 30 days using the time_range='short_term' parameter.

* Counting Genre Frequencies:

  An empty dictionary called genre_count is initialized.
  For each artist retrieved, their associated genres are extracted.
  The frequency of each genre is counted and stored in the genre_count dictionary.

* Identifying Top 5 Genres:

  The genres are sorted based on their frequency in descending order, and the top 5 genres are selected.

* Displaying Results:

  The code finally prints out the user's top 5 genres from Spotify over the past 30 days.
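
For the authorization step described above, the manual copy can be made less error-prone by pasting the whole redirected URL and parsing the code out of it. The following is a minimal sketch using only the standard library; the helper name `extract_auth_code` and the example URL are illustrative assumptions, not part of the notebook.

```python
from urllib.parse import urlparse, parse_qs


def extract_auth_code(redirect_url):
    """Return the value of the 'code' query parameter, or None if it is absent."""
    query = parse_qs(urlparse(redirect_url).query)
    values = query.get("code")
    return values[0] if values else None


# Hypothetical redirect URL; a real authorization code is much longer
example_url = "http://localhost/?code=EXAMPLE_CODE_123&state=xyz"
print(extract_auth_code(example_url))
```

The returned string could then be passed to get_access_token() in place of the hand-copied value.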

### 3. Creating a Genre Map


#### Code


  The code below was created with the help of ChatGPT, based on the genres I had previously extracted from the dataset.

In [24]:
# Genre mapping
genre_mapping = {
    # Main genres
    'pop': ['Action', 'Adventure', 'Romance'],
    'art pop': ['Animation', 'Fantasy', 'Romance'],
    'reggaeton': ['Action', 'Adventure'],
    'urbano latino': ['Action', 'Adventure'],
    'trap latino': ['Crime', 'Thriller'],
    'rock': ['Action', 'Adventure', 'War'],
    'indie rock': ['Drama', 'Romance', 'Adventure'],
    'classical': ['Biography', 'History', 'Romance'],
    'hip hop': ['Action', 'Crime', 'Drama'],
    'jazz': ['Musical', 'Romance', 'Biography'],
    'country': ['Family', 'Romance', 'History'],
    'electronic': ['Sci-Fi', 'Mystery', 'Thriller'],
    'metal': ['Action', 'Horror', 'War'],
    'folk': ['Family', 'History', 'Biography'],
    'blues': ['Drama', 'Biography', 'Music'],
    'r&b': ['Drama', 'Romance', 'Crime'],
    'soul': ['Drama', 'Family', 'Romance'],
    'punk': ['Action', 'Thriller', 'Mystery'],
    'disco': ['Comedy', 'Romance', 'Family'],
    'house': ['Sci-Fi', 'Thriller', 'Mystery'],
    'techno': ['Sci-Fi', 'Action'],
    'edm': ['Action', 'Adventure', 'Sci-Fi'],
    'latin': ['Romance', 'Family', 'Adventure'],
    'reggae': ['Comedy', 'Adventure', 'Family'],
    'funk': ['Comedy', 'Romance', 'Action'],
    'k-pop': ['Romance', 'Comedy', 'Action'],
    'psychedelic': ['Animation', 'Fantasy', 'Mystery'],
    'world': ['History', 'Family', 'Biography'],
    'ambient': ['Mystery', 'Sci-Fi', 'Animation'],

    # Subgenres
    'lo-fi beats': ['Drama', 'Romance', 'Animation'],
    'vaporwave': ['Sci-Fi', 'Mystery', 'Animation'],
    'emo': ['Drama', 'Romance', 'Mystery'],
    'hardcore': ['Action', 'Thriller', 'War'],
    'dubstep': ['Sci-Fi', 'Action', 'Thriller'],
    'ska': ['Comedy', 'Family', 'Adventure'],
    'swing': ['History', 'Romance', 'Family'],
    'trance': ['Sci-Fi', 'Fantasy', 'Thriller'],
    'grime': ['Action', 'Crime', 'Drama'],
    'bluegrass': ['Family', 'History', 'Drama'],
    'new wave': ['Drama', 'Sci-Fi', 'Mystery'],
    'post-punk': ['Drama', 'Mystery', 'Thriller'],
    'trip hop': ['Mystery', 'Drama', 'Sci-Fi'],
    'neosoul': ['Drama', 'Romance', 'Family'],
    'afrobeat': ['History', 'Drama', 'Family'],
    'chillhop': ['Drama', 'Animation', 'Romance'],
    'synthwave': ['Sci-Fi', 'Action', 'Drama']
}

# Convert genre_mapping to DataFrame
df = pd.DataFrame.from_records([(key,) + tuple(val) + (None,) * (3 - len(val)) for key, val in genre_mapping.items()])
df.columns = ['Music Genre', 'TV Genre 1', 'TV Genre 2', 'TV Genre 3']

# Capitalize the first letter of every music genre in the 'Music Genre' column
df['Music Genre'] = df['Music Genre'].str.capitalize()

# Display DataFrame
df

|   | Music Genre | TV Genre 1 | TV Genre 2 | TV Genre 3 |
|---|---|---|---|---|
| 0 | Pop | Action | Adventure | Romance |
| 1 | Art pop | Animation | Fantasy | Romance |
| 2 | Reggaeton | Action | Adventure | None |
| 3 | Urbano latino | Action | Adventure | None |
| 4 | Trap latino | Crime | Thriller | None |
| 5 | Rock | Action | Adventure | War |
| 6 | Indie rock | Drama | Romance | Adventure |
| 7 | Classical | Biography | History | Romance |
| 8 | Hip hop | Action | Crime | Drama |
| 9 | Jazz | Musical | Romance | Biography |


#### Code Breakdown

* Genre Mapping Dictionary:

  genre_mapping is a dictionary where each key is a music genre, and the associated value is a list of up to three TV or movie genres that correspond to the music genre.
  The music genres include main genres like "pop" and "rock", as well as sub-genres like "lo-fi beats" and "vaporwave".
  Each music genre maps to two or three TV or movie genres, intended to represent what a fan of that music genre might enjoy.

* Converting genre_mapping to a DataFrame:

  A comprehension is used to convert the dictionary into a list of tuples, where each tuple contains the music genre followed by its associated TV/movie genres.
  To ensure each tuple has the same length (and hence can fit into a DataFrame with a consistent number of columns), (None,) * (3 - len(val)) is added. This will append None values to the tuple until it has a length of 4 (1 for the music genre and 3 for the TV/movie genres).
  pd.DataFrame.from_records() then converts this list of tuples into a pandas DataFrame.

* Setting DataFrame Columns:

  The columns of the DataFrame are named 'Music Genre', 'TV Genre 1', 'TV Genre 2', and 'TV Genre 3'.
  'Music Genre' represents the music genre from the dictionary, and the next three columns represent the associated TV/movie genres.

* Display DataFrame:

  The resulting DataFrame df displays each music genre and its corresponding TV or movie genres in a tabular format.
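
The padding step described above can be tried in isolation. This is a small sketch in plain Python; `pad_row` and `sample_mapping` are hypothetical names introduced for illustration, and the resulting rows have the same shape as the tuples handed to pd.DataFrame.from_records() in the cell above.

```python
def pad_row(music_genre, screen_genres, width=3):
    """Build one fixed-length row: the music genre plus `width` genre slots."""
    padding = (None,) * (width - len(screen_genres))
    return (music_genre,) + tuple(screen_genres) + padding


# Two entries taken from the mapping: one with three genres, one with two
sample_mapping = {
    "pop": ["Action", "Adventure", "Romance"],
    "techno": ["Sci-Fi", "Action"],
}

rows = [pad_row(key, val) for key, val in sample_mapping.items()]
for row in rows:
    print(row)
```

Every row ends up with four elements, so the two-genre entry gets one trailing None.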

### 4. Matching Spotify Top 5 Genres to TV Show Genres with Genre Mapping

#### Code

In [16]:
import spotipy
from spotipy.oauth2 import SpotifyOAuth
import pandas as pd

# Spotify API Credentials
SPOTIPY_CLIENT_ID = 'ecec60c9a316409a84a45c923f7473ee'
SPOTIPY_CLIENT_SECRET = '3129b9225476463d86ddc4074cfc8500'
SPOTIPY_REDIRECT_URI = 'http://localhost/'

# Create OAuth2 object
sp_oauth = SpotifyOAuth(client_id=SPOTIPY_CLIENT_ID,
                        client_secret=SPOTIPY_CLIENT_SECRET,
                        redirect_uri=SPOTIPY_REDIRECT_URI,
                        scope=["user-top-read"])

# Get the authorization URL
auth_url = sp_oauth.get_authorize_url()
print("Please go to the following URL to authorize:")
print(auth_url)

# Extract the code from the redirected URL
code = input("Enter the code from the URL (localhost/code='...'): ")
token = sp_oauth.get_access_token(code, as_dict=False)
sp = spotipy.Spotify(auth=token)

# Fetch user's top artists over roughly the last 4 weeks (the default is 'medium_term')
top_artists = sp.current_user_top_artists(time_range='short_term')

# Count frequency of each genre
genre_count = {}
for artist in top_artists['items']:
    for genre in artist['genres']:
        if genre in genre_count:
            genre_count[genre] += 1
        else:
            genre_count[genre] = 1

# Sort genres by frequency and get top 5
top_5_genres = sorted(genre_count, key=genre_count.get, reverse=True)[:5]

# Genre mapping for TV Shows
genre_mapping = {
    # Main genres
    'pop': ['Action', 'Adventure', 'Romance'],
    'art pop': ['Animation', 'Fantasy', 'Romance'],
    'reggaeton': ['Action', 'Adventure'],
    'urbano latino': ['Action', 'Adventure'],
    'trap latino': ['Crime', 'Thriller'],
    'rock': ['Action', 'Adventure', 'War'],
    'indie rock': ['Drama', 'Romance', 'Adventure'],
    'classical': ['Biography', 'History', 'Romance'],
    'hip hop': ['Action', 'Crime', 'Drama'],
    'jazz': ['Musical', 'Romance', 'Biography'],
    'country': ['Family', 'Romance', 'History'],
    'electronic': ['Sci-Fi', 'Mystery', 'Thriller'],
    'metal': ['Action', 'Horror', 'War'],
    'folk': ['Family', 'History', 'Biography'],
    'blues': ['Drama', 'Biography', 'Music'],
    'r&b': ['Drama', 'Romance', 'Crime'],
    'soul': ['Drama', 'Family', 'Romance'],
    'punk': ['Action', 'Thriller', 'Mystery'],
    'disco': ['Comedy', 'Romance', 'Family'],
    'house': ['Sci-Fi', 'Thriller', 'Mystery'],
    'techno': ['Sci-Fi', 'Action'],
    'edm': ['Action', 'Adventure', 'Sci-Fi'],
    'latin': ['Romance', 'Family', 'Adventure'],
    'reggae': ['Comedy', 'Adventure', 'Family'],
    'funk': ['Comedy', 'Romance', 'Action'],
    'k-pop': ['Romance', 'Comedy', 'Action'],
    'psychedelic': ['Animation', 'Fantasy', 'Mystery'],
    'world': ['History', 'Family', 'Biography'],
    'ambient': ['Mystery', 'Sci-Fi', 'Animation'],

    # Subgenres
    'lo-fi beats': ['Drama', 'Romance', 'Animation'],
    'vaporwave': ['Sci-Fi', 'Mystery', 'Animation'],
    'emo': ['Drama', 'Romance', 'Mystery'],
    'hardcore': ['Action', 'Thriller', 'War'],
    'dubstep': ['Sci-Fi', 'Action', 'Thriller'],
    'ska': ['Comedy', 'Family', 'Adventure'],
    'swing': ['History', 'Romance', 'Family'],
    'trance': ['Sci-Fi', 'Fantasy', 'Thriller'],
    'grime': ['Action', 'Crime', 'Drama'],
    'bluegrass': ['Family', 'History', 'Drama'],
    'new wave': ['Drama', 'Sci-Fi', 'Mystery'],
    'post-punk': ['Drama', 'Mystery', 'Thriller'],
    'trip hop': ['Mystery', 'Drama', 'Sci-Fi'],
    'neosoul': ['Drama', 'Romance', 'Family'],
    'afrobeat': ['History', 'Drama', 'Family'],
    'chillhop': ['Drama', 'Animation', 'Romance'],
    'synthwave': ['Sci-Fi', 'Action', 'Drama']
}

# Using the genre mapping to find corresponding TV show genres
matching_genres = set()
for spotify_genre in top_5_genres:
    if spotify_genre in genre_mapping:
        for tvshow_genre in genre_mapping[spotify_genre]:
            matching_genres.add(tvshow_genre)

print("Matching Genres between Spotify and TV Shows:", list(matching_genres))

Please go to the following URL to authorize:
https://accounts.spotify.com/authorize?client_id=ecec60c9a316409a84a45c923f7473ee&response_type=code&redirect_uri=http%3A%2F%2Flocalhost%2F&scope=user-top-read
Enter the code from the URL (localhost/code='...'): AQDozcMYbc1m9HJMKHOWi7Y_MPD6dOLqks7vWDKodPJiudctLN6vKHkhordKOSBDGJyD5Q-Pew7QtpEWXAdio3N8FaTtf5O1VMNDsoPy5s87pIKCXTfVqj7QC6ObejEyVbd7W9k0Uca7Uy5PVvHZHxY3ljDTqX_6lAM0JfMAHoulhL7S
Matching Genres between Spotify and TV Shows: ['Crime', 'Adventure', 'Action', 'Animation', 'Romance', 'Fantasy', 'Thriller']


#### Code Breakdown


* Fetching User's Top Artists:

  The top artists of the user are fetched using the current_user_top_artists method.

* Genre Frequency Count:

  For each artist in the user's top artists, their associated genres are extracted.
  A dictionary (genre_count) keeps track of the frequency of each genre.

* Sorting Genres:

  The genres are sorted based on their frequency, and the top 5 genres (top_5_genres) are selected.

* Genre Mapping for TV Shows:

  A dictionary (genre_mapping) is created that maps Spotify music genres to corresponding TV show genres. This acts as a bridge to recommend TV show genres based on a user's preferred music genres.

* Mapping Spotify Genres to TV Show Genres:

  For each of the top_5_genres from Spotify, the script checks if they are in the genre_mapping dictionary.
  If they are, the corresponding TV show genres are added to the matching_genres set.

* Displaying Results:
  
  Finally, the script prints out the TV show genres that match with the user's top Spotify genres.
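
The counting, sorting, and mapping steps above can be sketched without calling the Spotify API. The example below uses collections.Counter as a shorter alternative to the manual dictionary loop; `mock_items` and `sample_mapping` are made-up data for illustration only.

```python
from collections import Counter

# Mock of the 'items' list returned for top artists (illustrative data only)
mock_items = [
    {"name": "Artist A", "genres": ["pop", "art pop"]},
    {"name": "Artist B", "genres": ["pop", "reggaeton"]},
    {"name": "Artist C", "genres": ["reggaeton", "trap latino", "pop"]},
]

# A small excerpt in the style of genre_mapping
sample_mapping = {
    "pop": ["Action", "Adventure", "Romance"],
    "reggaeton": ["Action", "Adventure"],
    "trap latino": ["Crime", "Thriller"],
}

# Count how often each genre appears across all artists
genre_count = Counter(genre for artist in mock_items for genre in artist["genres"])

# Keep the two most frequent genres (the notebook keeps five)
top_genres = [genre for genre, _ in genre_count.most_common(2)]

# Collect the mapped screen genres; the set removes duplicates
matching = set()
for genre in top_genres:
    matching.update(sample_mapping.get(genre, []))

print(top_genres)
print(sorted(matching))
```

Because the result is a set, a screen genre reached through several music genres appears only once, which is why the printed list in the cell output has no repeats.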

### 5. TV Show Recommendation

#### Code

In [18]:
import spotipy
from spotipy.oauth2 import SpotifyOAuth
import pandas as pd
import random

# Spotify API Credentials
SPOTIPY_CLIENT_ID = 'ecec60c9a316409a84a45c923f7473ee'
SPOTIPY_CLIENT_SECRET = '3129b9225476463d86ddc4074cfc8500'
SPOTIPY_REDIRECT_URI = 'http://localhost/'

# Create OAuth2 object
sp_oauth = SpotifyOAuth(client_id=SPOTIPY_CLIENT_ID,
                        client_secret=SPOTIPY_CLIENT_SECRET,
                        redirect_uri=SPOTIPY_REDIRECT_URI,
                        scope=["user-top-read"])

# Get the authorization URL
auth_url = sp_oauth.get_authorize_url()
print("Please go to the following URL to authorize:")
print(auth_url)

# Extract the code from the redirected URL
code = input("Enter the code from the URL (localhost/code='...'): ")
token = sp_oauth.get_access_token(code, as_dict=False)
sp = spotipy.Spotify(auth=token)

# Fetch user's top artists over roughly the last 4 weeks (the default is 'medium_term')
top_artists = sp.current_user_top_artists(time_range='short_term')

# Count frequency of each genre
genre_count = {}
for artist in top_artists['items']:
    for genre in artist['genres']:
        if genre in genre_count:
            genre_count[genre] += 1
        else:
            genre_count[genre] = 1

# Sort genres by frequency and get top 5
top_5_genres = sorted(genre_count, key=genre_count.get, reverse=True)[:5]

# Update the genre mapping with TV Show genres
genre_mapping = {
    # Main genres
    'pop': ['Action', 'Adventure', 'Romance'],
    'art pop': ['Animation', 'Fantasy', 'Romance'],
    'reggaeton': ['Action', 'Adventure'],
    'urbano latino': ['Action', 'Adventure'],
    'trap latino': ['Crime', 'Thriller'],
    'rock': ['Action', 'Adventure', 'War'],
    'indie rock': ['Drama', 'Romance', 'Adventure'],
    'classical': ['Biography', 'History', 'Romance'],
    'hip hop': ['Action', 'Crime', 'Drama'],
    'jazz': ['Musical', 'Romance', 'Biography'],
    'country': ['Family', 'Romance', 'History'],
    'electronic': ['Sci-Fi', 'Mystery', 'Thriller'],
    'metal': ['Action', 'Horror', 'War'],
    'folk': ['Family', 'History', 'Biography'],
    'blues': ['Drama', 'Biography', 'Music'],
    'r&b': ['Drama', 'Romance', 'Crime'],
    'soul': ['Drama', 'Family', 'Romance'],
    'punk': ['Action', 'Thriller', 'Mystery'],
    'disco': ['Comedy', 'Romance', 'Family'],
    'house': ['Sci-Fi', 'Thriller', 'Mystery'],
    'techno': ['Sci-Fi', 'Action'],
    'edm': ['Action', 'Adventure', 'Sci-Fi'],
    'latin': ['Romance', 'Family', 'Adventure'],
    'reggae': ['Comedy', 'Adventure', 'Family'],
    'funk': ['Comedy', 'Romance', 'Action'],
    'k-pop': ['Romance', 'Comedy', 'Action'],
    'psychedelic': ['Animation', 'Fantasy', 'Mystery'],
    'world': ['History', 'Family', 'Biography'],
    'ambient': ['Mystery', 'Sci-Fi', 'Animation'],

    # Subgenres
    'lo-fi beats': ['Drama', 'Romance', 'Animation'],
    'vaporwave': ['Sci-Fi', 'Mystery', 'Animation'],
    'emo': ['Drama', 'Romance', 'Mystery'],
    'hardcore': ['Action', 'Thriller', 'War'],
    'dubstep': ['Sci-Fi', 'Action', 'Thriller'],
    'ska': ['Comedy', 'Family', 'Adventure'],
    'swing': ['History', 'Romance', 'Family'],
    'trance': ['Sci-Fi', 'Fantasy', 'Thriller'],
    'grime': ['Action', 'Crime', 'Drama'],
    'bluegrass': ['Family', 'History', 'Drama'],
    'new wave': ['Drama', 'Sci-Fi', 'Mystery'],
    'post-punk': ['Drama', 'Mystery', 'Thriller'],
    'trip hop': ['Mystery', 'Drama', 'Sci-Fi'],
    'neosoul': ['Drama', 'Romance', 'Family'],
    'afrobeat': ['History', 'Drama', 'Family'],
    'chillhop': ['Drama', 'Animation', 'Romance'],
    'synthwave': ['Sci-Fi', 'Action', 'Drama']
}

# Using the genre mapping to find corresponding TV show genres
matching_genres_weighted = []
for spotify_genre in top_5_genres:
    if spotify_genre in genre_mapping:
        for tvshow_genre in genre_mapping[spotify_genre]:
            # Add the TV show genre according to its frequency
            matching_genres_weighted.extend([tvshow_genre] * genre_count[spotify_genre])

# Read the CSV for TV Shows
df = pd.read_csv('/content/tvshowdata.csv')

# Make a copy of the original dataframe to avoid SettingWithCopyWarning
df_copy = df.copy()

# Convert "votes" to strings
df_copy['votes'] = df_copy['votes'].astype(str)

# Remove commas and convert to float
df_copy['votes'] = df_copy['votes'].str.replace(',', '').astype(float)

# Now, convert to integers, but only for non-NaN values
df_copy.loc[df_copy['votes'].notna(), 'votes'] = df_copy['votes'].dropna().astype(int)

# Convert 'genre' column to string type
df_copy['genre'] = df_copy['genre'].astype(str)

# Now proceed with the genre filtering
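# After str.split, each value is a list, so a missing genre shows up as ['nan'] rather than 'nan'
filtered_tvshows = df_copy[df_copy['genre'].str.split(', ').apply(lambda x: bool(set(x) & set(matching_genres_weighted)) if x != ['nan'] else False)]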

# Continue with the rating and votes filters
filtered_tvshows = filtered_tvshows[filtered_tvshows['rating'] >= 8]
filtered_tvshows = filtered_tvshows[filtered_tvshows['votes'] >= 20000]

# Recommend a TV show from the filtered shows based on the weighted genres
if filtered_tvshows.empty:
    print("No TV shows found matching the criteria.")
else:
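    # Keep only the weighted genres that still have at least one show after filtering,
    # so that .sample() is never called on an empty selection
    available_genres = [g for g in matching_genres_weighted
                        if filtered_tvshows['genre'].str.contains(g, regex=False).any()]
    recommended_genre = random.choice(available_genres)
    recommended_show = filtered_tvshows[filtered_tvshows['genre'].str.contains(recommended_genre, regex=False)].sample().iloc[0]
    # Extract the earliest year from the 'year' column and remove any preceding '-'
    earliest_year = str(recommended_show['year']).split('–')[0].strip().replace('-', '')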
    print(f"Recommended TV Show: {recommended_show['title']} (Genre: {recommended_genre}, Rating: {recommended_show['rating']}, Release Year: {earliest_year.strip('()')})")


Please go to the following URL to authorize:
https://accounts.spotify.com/authorize?client_id=ecec60c9a316409a84a45c923f7473ee&response_type=code&redirect_uri=http%3A%2F%2Flocalhost%2F&scope=user-top-read
Enter the code from the URL (localhost/code='...'): AQDozcMYbc1m9HJMKHOWi7Y_MPD6dOLqks7vWDKodPJiudctLN6vKHkhordKOSBDGJyD5Q-Pew7QtpEWXAdio3N8FaTtf5O1VMNDsoPy5s87pIKCXTfVqj7QC6ObejEyVbd7W9k0Uca7Uy5PVvHZHxY3ljDTqX_6lAM0JfMAHoulhL7S
Recommended TV Show: Planet of the Apes (Genre: Adventure, Rating: 8.0, Release Year: 1968)


#### Code Breakdown


* Fetching User's Top Artists & Genres

  The user's top artists are fetched from Spotify.
  The genres for each artist are tallied to see how often they appear.
  The five most frequent genres are stored in top_5_genres.

* Genre Mapping

  A dictionary, genre_mapping, links music genres from Spotify with TV show genres. It provides a way to recommend TV shows based on a user's music preferences.

* Genre Weighting
  
  For the user's top 5 genres, their TV show genre matches are added multiple times based on the frequency of the music genre. This creates a weighted list, matching_genres_weighted.

* Loading & Filtering TV Show Data

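  The TV show data is loaded from a CSV file into a DataFrame.
  A few data cleaning steps are applied:
  - Votes are converted from comma-formatted strings to numbers.
  - Genre data is cast to strings so it can be split and compared.

  TV shows are then filtered based on their genres, their ratings (>= 8), and the number of votes they have (>= 20,000).

* Recommendation

  A genre is drawn at random from matching_genres_weighted, so music genres that appear more often are more likely to drive the pick, and one show of that genre is sampled from the filtered list. The earliest year found in the show's 'year' field is printed as the release year.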
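
The genre weighting described above can be illustrated with a small, self-contained sketch. The function name `build_weighted_genres` and the sample counts are assumptions made for this example; the loop mirrors the one in the cell.

```python
import random
from collections import Counter


def build_weighted_genres(top_genres, genre_count, mapping):
    """Repeat each mapped screen genre once per occurrence of its music genre."""
    weighted = []
    for music_genre in top_genres:
        for screen_genre in mapping.get(music_genre, []):
            weighted.extend([screen_genre] * genre_count[music_genre])
    return weighted


# Illustrative data: 'pop' was seen three times, 'trap latino' once
genre_count = {"pop": 3, "trap latino": 1}
mapping = {"pop": ["Action", "Romance"], "trap latino": ["Crime"]}

weighted = build_weighted_genres(["pop", "trap latino"], genre_count, mapping)
print(Counter(weighted))

# random.choice draws uniformly from the list, so repeated entries act as weights
rng = random.Random(0)
print(rng.choice(weighted))
```

With these counts, "Action" and "Romance" each fill three of the seven slots and "Crime" fills one, so a pop-derived genre is three times as likely to be drawn as "Crime".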

# Custom Recommendation Based on User Choice

#### Code

In [19]:
! pip install pandas



In [20]:
! pip install spotipy



In [23]:
import spotipy
from spotipy.oauth2 import SpotifyOAuth
import pandas as pd
import random
import ipywidgets as widgets
from IPython.display import display

# Spotify API Credentials
SPOTIPY_CLIENT_ID = 'ecec60c9a316409a84a45c923f7473ee'
SPOTIPY_CLIENT_SECRET = '3129b9225476463d86ddc4074cfc8500'
SPOTIPY_REDIRECT_URI = 'http://localhost/'

# Create OAuth2 object
sp_oauth = SpotifyOAuth(client_id=SPOTIPY_CLIENT_ID,
                        client_secret=SPOTIPY_CLIENT_SECRET,
                        redirect_uri=SPOTIPY_REDIRECT_URI,
                        scope=["user-top-read"])

# Get the authorization URL
auth_url = sp_oauth.get_authorize_url()
print("Please go to the following URL to authorize:")
print(auth_url)

# Extract the code from the redirected URL
code = input("Enter the code from the URL (localhost/code='...'): ")
token = sp_oauth.get_access_token(code, as_dict=False)
sp = spotipy.Spotify(auth=token)

# Fetch user's top artists over roughly the last 4 weeks (the default is 'medium_term')
top_artists = sp.current_user_top_artists(time_range='short_term')

# Count frequency of each genre
genre_count = {}
for artist in top_artists['items']:
    for genre in artist['genres']:
        if genre in genre_count:
            genre_count[genre] += 1
        else:
            genre_count[genre] = 1

# Sort genres by frequency and get top 5
top_5_genres = sorted(genre_count, key=genre_count.get, reverse=True)[:5]


# Movie genre mapping
movie_genre_mapping = {
   # Main genres
    'pop': ['action', 'adventure', 'romance'],
    'art pop': ['animation', 'fantasy', 'romance'],
    'reggaeton': ['action', 'adventure'],
    'urbano latino': ['action', 'adventure'],
    'trap latino': ['crime', 'thriller'],
    'rock': ['action', 'adventure', 'war'],
    'indie rock': ['drama', 'romance', 'adventure'],
    'classical': ['biography', 'history', 'romance'],
    'hip hop': ['action', 'crime', 'drama'],
    'jazz': ['film-noir', 'romance', 'biography'],
    'country': ['family', 'romance', 'history'],
    'electronic': ['sci-fi', 'mystery', 'thriller'],
    'metal': ['action', 'horror', 'war'],
    'folk': ['family', 'history', 'biography'],
    'blues': ['drama', 'biography', 'film-noir'],
    'r&b': ['drama', 'romance', 'crime'],
    'soul': ['drama', 'family', 'romance'],
    'punk': ['action', 'thriller', 'mystery'],
    'disco': ['comedy', 'romance', 'family'],
    'house': ['sci-fi', 'thriller', 'mystery'],
    'techno': ['sci-fi', 'action'],
    'edm': ['action', 'adventure', 'sci-fi'],
    'latin': ['romance', 'family', 'adventure'],
    'reggae': ['comedy', 'adventure', 'family'],
    'funk': ['comedy', 'romance', 'action'],
    'k-pop': ['romance', 'comedy', 'action'],
    'psychedelic': ['animation', 'fantasy', 'mystery'],
    'world': ['history', 'family', 'biography'],
    'ambient': ['mystery', 'sci-fi', 'animation'],

    # Subgenres
    'lo-fi beats': ['drama', 'romance', 'animation'],
    'vaporwave': ['sci-fi', 'mystery', 'animation'],
    'emo': ['drama', 'romance', 'mystery'],
    'hardcore': ['action', 'thriller', 'war'],
    'dubstep': ['sci-fi', 'action', 'thriller'],
    'ska': ['comedy', 'family', 'adventure'],
    'swing': ['history', 'romance', 'family'],
    'trance': ['sci-fi', 'fantasy', 'thriller'],
    'grime': ['action', 'crime', 'drama'],
    'bluegrass': ['family', 'history', 'drama'],
    'new wave': ['drama', 'sci-fi', 'mystery'],
    'post-punk': ['drama', 'mystery', 'thriller'],
    'trip hop': ['mystery', 'drama', 'sci-fi'],
    'neosoul': ['drama', 'romance', 'family'],
    'afrobeat': ['history', 'drama', 'family'],
    'chillhop': ['drama', 'animation', 'romance'],
    'synthwave': ['sci-fi', 'action', 'drama']
}

# TV genre mapping
tv_genre_mapping = {
    # Main genres
    'pop': ['Action', 'Adventure', 'Romance'],
    'art pop': ['Animation', 'Fantasy', 'Romance'],
    'reggaeton': ['Action', 'Adventure'],
    'urbano latino': ['Action', 'Adventure'],
    'trap latino': ['Crime', 'Thriller'],
    'rock': ['Action', 'Adventure', 'War'],
    'indie rock': ['Drama', 'Romance', 'Adventure'],
    'classical': ['Biography', 'History', 'Romance'],
    'hip hop': ['Action', 'Crime', 'Drama'],
    'jazz': ['Musical', 'Romance', 'Biography'],
    'country': ['Family', 'Romance', 'History'],
    'electronic': ['Sci-Fi', 'Mystery', 'Thriller'],
    'metal': ['Action', 'Horror', 'War'],
    'folk': ['Family', 'History', 'Biography'],
    'blues': ['Drama', 'Biography', 'Music'],
    'r&b': ['Drama', 'Romance', 'Crime'],
    'soul': ['Drama', 'Family', 'Romance'],
    'punk': ['Action', 'Thriller', 'Mystery'],
    'disco': ['Comedy', 'Romance', 'Family'],
    'house': ['Sci-Fi', 'Thriller', 'Mystery'],
    'techno': ['Sci-Fi', 'Action'],
    'edm': ['Action', 'Adventure', 'Sci-Fi'],
    'latin': ['Romance', 'Family', 'Adventure'],
    'reggae': ['Comedy', 'Adventure', 'Family'],
    'funk': ['Comedy', 'Romance', 'Action'],
    'k-pop': ['Romance', 'Comedy', 'Action'],
    'psychedelic': ['Animation', 'Fantasy', 'Mystery'],
    'world': ['History', 'Family', 'Biography'],
    'ambient': ['Mystery', 'Sci-Fi', 'Animation'],

    # Subgenres
    'lo-fi beats': ['Drama', 'Romance', 'Animation'],
    'vaporwave': ['Sci-Fi', 'Mystery', 'Animation'],
    'emo': ['Drama', 'Romance', 'Mystery'],
    'hardcore': ['Action', 'Thriller', 'War'],
    'dubstep': ['Sci-Fi', 'Action', 'Thriller'],
    'ska': ['Comedy', 'Family', 'Adventure'],
    'swing': ['History', 'Romance', 'Family'],
    'trance': ['Sci-Fi', 'Fantasy', 'Thriller'],
    'grime': ['Action', 'Crime', 'Drama'],
    'bluegrass': ['Family', 'History', 'Drama'],
    'new wave': ['Drama', 'Sci-Fi', 'Mystery'],
    'post-punk': ['Drama', 'Mystery', 'Thriller'],
    'trip hop': ['Mystery', 'Drama', 'Sci-Fi'],
    'neosoul': ['Drama', 'Romance', 'Family'],
    'afrobeat': ['History', 'Drama', 'Family'],
    'chillhop': ['Drama', 'Animation', 'Romance'],
    'synthwave': ['Sci-Fi', 'Action', 'Drama']
}

def recommend(choice):
    matching_genres_weighted = []
    if choice == "movie":
        genre_mapping = movie_genre_mapping
    elif choice == "tvshow":
        genre_mapping = tv_genre_mapping
    else:
        print("Invalid choice!")
        return

    for spotify_genre in top_5_genres:
        if spotify_genre in genre_mapping:
            for corresponding_genre in genre_mapping[spotify_genre]:
                matching_genres_weighted.extend([corresponding_genre] * genre_count[spotify_genre])

    # Movie Filtering
    if choice == "movie":
        df = pd.read_csv('/content/combined_movies.csv', on_bad_lines='warn')

        # Filtering criteria
        df = df[df['rating'] >= 8]
        df = df[df['votes'] >= 20000]
        filtered_movies = df[df['genre'].isin(matching_genres_weighted)]

        if filtered_movies.empty:
            print("No movies found matching the criteria.")
        else:
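            # Choose only among weighted genres that still have movies after filtering,
            # so that .sample() is never called on an empty selection
            present_genres = set(filtered_movies['genre'])
            available_genres = [g for g in matching_genres_weighted if g in present_genres]
            recommended_genre = random.choice(available_genres)
            recommended_movie = filtered_movies[filtered_movies['genre'] == recommended_genre].sample().iloc[0]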
            print(f"Recommended Movie: {recommended_movie['movie_name']} (Genre: {recommended_movie['genre'].capitalize()}, Rating: {recommended_movie['rating']}, Release Year: {recommended_movie['year']})")

    # TV Show Filtering
    elif choice == "tvshow":
        df = pd.read_csv('/content/tvshowdata.csv', on_bad_lines='warn')
        df_copy = df.copy()
        df_copy['votes'] = df_copy['votes'].astype(str)
        df_copy['votes'] = df_copy['votes'].str.replace(',', '').astype(float)
        df_copy.loc[df_copy['votes'].notna(), 'votes'] = df_copy['votes'].dropna().astype(int)
        df_copy['genre'] = df_copy['genre'].astype(str)

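        # After str.split, each value is a list, so a missing genre shows up as ['nan'] rather than 'nan'
        filtered_tvshows = df_copy[df_copy['genre'].str.split(', ').apply(lambda x: bool(set(x) & set(matching_genres_weighted)) if x != ['nan'] else False)]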
        filtered_tvshows = filtered_tvshows[filtered_tvshows['rating'] >= 8]
        filtered_tvshows = filtered_tvshows[filtered_tvshows['votes'] >= 20000]

        if filtered_tvshows.empty:
            print("No TV shows found matching the criteria.")
        else:
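            # Choose only among weighted genres that still have shows after filtering
            available_genres = [g for g in matching_genres_weighted
                                if filtered_tvshows['genre'].str.contains(g, regex=False).any()]
            recommended_genre = random.choice(available_genres)
            recommended_show = filtered_tvshows[filtered_tvshows['genre'].str.contains(recommended_genre, regex=False)].sample().iloc[0]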
            earliest_year = str(recommended_show['year']).split('–')[0].strip().replace('-', '')
            print(f"Recommended TV Show: {recommended_show['title']} (Genre: {recommended_genre}, Rating: {recommended_show['rating']}, Release Year: {earliest_year.strip('()')})")


def on_button_click(button):
    choice = button.description.lower().replace(' ', '')
    recommend(choice)

movie_button = widgets.Button(description="Movie")
tv_button = widgets.Button(description="TV Show")

movie_button.on_click(on_button_click)
tv_button.on_click(on_button_click)

# Display the buttons
display(movie_button, tv_button)


Please go to the following URL to authorize:
https://accounts.spotify.com/authorize?client_id=ecec60c9a316409a84a45c923f7473ee&response_type=code&redirect_uri=http%3A%2F%2Flocalhost%2F&scope=user-top-read
Enter the code from the URL (localhost/code='...'): AQDozcMYbc1m9HJMKHOWi7Y_MPD6dOLqks7vWDKodPJiudctLN6vKHkhordKOSBDGJyD5Q-Pew7QtpEWXAdio3N8FaTtf5O1VMNDsoPy5s87pIKCXTfVqj7QC6ObejEyVbd7W9k0Uca7Uy5PVvHZHxY3ljDTqX_6lAM0JfMAHoulhL7S


Button(description='Movie', style=ButtonStyle())

Button(description='TV Show', style=ButtonStyle())

Recommended TV Show: The Legend of Korra (Genre: Adventure, Rating: 8.4, Release Year: 2012)


#### Code Breakdown

1. Libraries
- spotipy: Used to interact with the Spotify API.
- pandas: A data manipulation and analysis library.
- random: Provides functions to work with randomness.
- ipywidgets: For creating interactive GUIs in Jupyter notebooks.
- IPython.display: Allows for the display of GUI elements in Jupyter.

2. Spotify API Credentials: Constants for API integration
- OAuth2: Initializes the Spotify OAuth2 authentication.
- Authorization URL: Directs the user to Spotify for permission.
- Code Extraction: Asks the user to provide the authentication code from the redirect URL to authenticate.

3. Fetch Spotify Data

- Fetch User's Top Artists: The script fetches the top artists for the authenticated user.
- Genre Counting: It then counts the number of times each genre appears among the user's top artists.

4. Movie & TV Genre Mapping

- movie_genre_mapping: Maps Spotify music genres to corresponding movie genres.
- tv_genre_mapping: Does the same as above but for TV shows.

5. Recommendation Function (recommend):

- This function provides a recommendation based on the user's preference (movie or TV show) and the genres of their top Spotify artists.
- It first selects the right genre mapping dictionary based on the choice.
- Then, it matches the top Spotify genres to movie or TV show genres.
- The function then filters a dataset of movies or TV shows based on these genres and some additional criteria (like ratings and number of votes).
- Finally, it randomly selects a recommendation from this filtered list and prints it.

6. Interactive Buttons and Callbacks

- on_button_click: Function that gets triggered when a button (Movie or TV Show) is clicked.
- movie_button and tv_button: Interactive buttons for the user to choose between getting a movie or TV show recommendation.
- The on_click method of these buttons is used to bind the buttons to the on_button_click function.

7. Display the Buttons:

- Using the display function from IPython.display to show the buttons on the Jupyter notebook, enabling the user to interact with the recommendation system.
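
The button callback and the final selection step can be sketched without widgets or a DataFrame. In the example below, plain dictionaries stand in for dataset rows, and `normalize_choice` and `pick_recommendation` are hypothetical helpers, not functions from the notebook. The sketch also guards against the case where the randomly drawn genre has no remaining titles after filtering.

```python
import random


def normalize_choice(description):
    """Turn a button label such as 'TV Show' into the key 'tvshow'."""
    return description.lower().replace(" ", "")


def pick_recommendation(rows, weighted_genres, rng=random):
    """Pick a weighted genre that has at least one row, then one row of that genre."""
    available = [g for g in weighted_genres if any(g in r["genres"] for r in rows)]
    if not available:
        return None
    genre = rng.choice(available)
    candidates = [r for r in rows if genre in r["genres"]]
    return genre, rng.choice(candidates)


# Illustrative rows that survived the rating and votes filters
rows = [
    {"title": "Show A", "genres": ["Action", "Adventure"]},
    {"title": "Show B", "genres": ["Crime"]},
]

print(normalize_choice("TV Show"))
print(pick_recommendation(rows, ["War", "Crime"]))
```

Here "War" is dropped because no row carries it, so the only possible result is the "Crime" show. Restricting the draw to genres that still have candidates keeps the weighting among the remaining genres while avoiding an empty selection.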


  


# Flask Web Application

#### Suggested Directory Setup

In [None]:
/movierecs
  /templates
    - index.html
    - loginpage.html
    - displayrecommendation.html
    - displayhistory.html
  - __pycache__
  - .DS_Store
  - .cache
  - .env
  - .gitignore
  - LICENSE.md
  - README.md
  - Watchify.py
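
The listing above includes a .env file, which suggests keeping the Spotify credentials out of the source code. One possible way to read them is sketched below, assuming the values have already been exported as environment variables; how the .env file gets loaded (for example with a package such as python-dotenv) is left out. The helper name `load_spotify_credentials` and the placeholder values are assumptions for illustration.

```python
import os


def load_spotify_credentials(env=os.environ):
    """Read Spotify credentials from environment variables, failing early if one is missing."""
    names = ("SPOTIPY_CLIENT_ID", "SPOTIPY_CLIENT_SECRET", "SPOTIPY_REDIRECT_URI")
    missing = [name for name in names if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {name: env[name] for name in names}


# A plain dict stands in for the real environment in this example
fake_env = {
    "SPOTIPY_CLIENT_ID": "your-client-id",
    "SPOTIPY_CLIENT_SECRET": "your-client-secret",
    "SPOTIPY_REDIRECT_URI": "http://localhost:5000/callback",
}

creds = load_spotify_credentials(fake_env)
print(sorted(creds))
```

The three variable names match the ones spotipy itself looks up when no credentials are passed to SpotifyOAuth, so the same environment also works with spotipy's defaults.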

#### Code

In [None]:
from flask import Flask, render_template, request, redirect, url_for, flash, session
import spotipy
from spotipy.oauth2 import SpotifyOAuth
import pandas as pd
import random

app = Flask(__name__)
app.secret_key = 'SECRET_KEY'

SPOTIPY_CLIENT_ID = 'ecec60c9a316409a84a45c923f7473ee'
SPOTIPY_CLIENT_SECRET = '3129b9225476463d86ddc4074cfc8500'
SPOTIPY_REDIRECT_URI = 'http://localhost:5000/callback'

# Spotify session
sp = None

# Helper functions for fetching and counting the user's top genres

def get_user_top_genres(token, limit=50):
    global sp
    sp = spotipy.Spotify(auth=token)
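    # 'short_term' covers roughly the last 4 weeks (the default is 'medium_term')
    results = sp.current_user_top_artists(limit=limit, time_range='short_term')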

    genres = []
    for artist in results['items']:
        genres.extend(artist['genres'])
    return genres

def get_top_genres_and_counts(token, top_n=5):
    genres = get_user_top_genres(token)
    genre_counts = pd.Series(genres).value_counts()
    top_5_genres = genre_counts.head(top_n).index.tolist()
    return top_5_genres, genre_counts

def get_recommendation(genres_weighted):
    return random.choice(genres_weighted)

# Movie genre mapping
movie_genre_mapping = {
   # Main genres
    'pop': ['action', 'adventure', 'romance'],
    'art pop': ['animation', 'fantasy', 'romance'],
    'reggaeton': ['action', 'adventure'],
    'urbano latino': ['action', 'adventure'],
    'trap latino': ['crime', 'thriller'],
    'rock': ['action', 'adventure', 'war'],
    'indie rock': ['drama', 'romance', 'adventure'],
    'classical': ['biography', 'history', 'romance'],
    'hip hop': ['action', 'crime', 'drama'],
    'jazz': ['film-noir', 'romance', 'biography'],
    'country': ['family', 'romance', 'history'],
    'electronic': ['sci-fi', 'mystery', 'thriller'],
    'metal': ['action', 'horror', 'war'],
    'folk': ['family', 'history', 'biography'],
    'blues': ['drama', 'biography', 'film-noir'],
    'r&b': ['drama', 'romance', 'crime'],
    'soul': ['drama', 'family', 'romance'],
    'punk': ['action', 'thriller', 'mystery'],
    'disco': ['comedy', 'romance', 'family'],
    'house': ['sci-fi', 'thriller', 'mystery'],
    'techno': ['sci-fi', 'action'],
    'edm': ['action', 'adventure', 'sci-fi'],
    'latin': ['romance', 'family', 'adventure'],
    'reggae': ['comedy', 'adventure', 'family'],
    'funk': ['comedy', 'romance', 'action'],
    'k-pop': ['romance', 'comedy', 'action'],
    'psychedelic': ['animation', 'fantasy', 'mystery'],
    'world': ['history', 'family', 'biography'],
    'ambient': ['mystery', 'sci-fi', 'animation'],

    # Subgenres
    'lo-fi beats': ['drama', 'romance', 'animation'],
    'vaporwave': ['sci-fi', 'mystery', 'animation'],
    'emo': ['drama', 'romance', 'mystery'],
    'hardcore': ['action', 'thriller', 'war'],
    'dubstep': ['sci-fi', 'action', 'thriller'],
    'ska': ['comedy', 'family', 'adventure'],
    'swing': ['history', 'romance', 'family'],
    'trance': ['sci-fi', 'fantasy', 'thriller'],
    'grime': ['action', 'crime', 'drama'],
    'bluegrass': ['family', 'history', 'drama'],
    'new wave': ['drama', 'sci-fi', 'mystery'],
    'post-punk': ['drama', 'mystery', 'thriller'],
    'trip hop': ['mystery', 'drama', 'sci-fi'],
    'neosoul': ['drama', 'romance', 'family'],
    'afrobeat': ['history', 'drama', 'family'],
    'chillhop': ['drama', 'animation', 'romance'],
    'synthwave': ['sci-fi', 'action', 'drama']
}

# TV genre mapping
tv_genre_mapping = {
    # Main genres
    'pop': ['Action', 'Adventure', 'Romance'],
    'art pop': ['Animation', 'Fantasy', 'Romance'],
    'reggaeton': ['Action', 'Adventure'],
    'urbano latino': ['Action', 'Adventure'],
    'trap latino': ['Crime', 'Thriller'],
    'rock': ['Action', 'Adventure', 'War'],
    'indie rock': ['Drama', 'Romance', 'Adventure'],
    'classical': ['Biography', 'History', 'Romance'],
    'hip hop': ['Action', 'Crime', 'Drama'],
    'jazz': ['Musical', 'Romance', 'Biography'],
    'country': ['Family', 'Romance', 'History'],
    'electronic': ['Sci-Fi', 'Mystery', 'Thriller'],
    'metal': ['Action', 'Horror', 'War'],
    'folk': ['Family', 'History', 'Biography'],
    'blues': ['Drama', 'Biography', 'Music'],
    'r&b': ['Drama', 'Romance', 'Crime'],
    'soul': ['Drama', 'Family', 'Romance'],
    'punk': ['Action', 'Thriller', 'Mystery'],
    'disco': ['Comedy', 'Romance', 'Family'],
    'house': ['Sci-Fi', 'Thriller', 'Mystery'],
    'techno': ['Sci-Fi', 'Action'],
    'edm': ['Action', 'Adventure', 'Sci-Fi'],
    'latin': ['Romance', 'Family', 'Adventure'],
    'reggae': ['Comedy', 'Adventure', 'Family'],
    'funk': ['Comedy', 'Romance', 'Action'],
    'k-pop': ['Romance', 'Comedy', 'Action'],
    'psychedelic': ['Animation', 'Fantasy', 'Mystery'],
    'world': ['History', 'Family', 'Biography'],
    'ambient': ['Mystery', 'Sci-Fi', 'Animation'],

    # Subgenres
    'lo-fi beats': ['Drama', 'Romance', 'Animation'],
    'vaporwave': ['Sci-Fi', 'Mystery', 'Animation'],
    'emo': ['Drama', 'Romance', 'Mystery'],
    'hardcore': ['Action', 'Thriller', 'War'],
    'dubstep': ['Sci-Fi', 'Action', 'Thriller'],
    'ska': ['Comedy', 'Family', 'Adventure'],
    'swing': ['History', 'Romance', 'Family'],
    'trance': ['Sci-Fi', 'Fantasy', 'Thriller'],
    'grime': ['Action', 'Crime', 'Drama'],
    'bluegrass': ['Family', 'History', 'Drama'],
    'new wave': ['Drama', 'Sci-Fi', 'Mystery'],
    'post-punk': ['Drama', 'Mystery', 'Thriller'],
    'trip hop': ['Mystery', 'Drama', 'Sci-Fi'],
    'neosoul': ['Drama', 'Romance', 'Family'],
    'afrobeat': ['History', 'Drama', 'Family'],
    'chillhop': ['Drama', 'Animation', 'Romance'],
    'synthwave': ['Sci-Fi', 'Action', 'Drama']
}

@app.route('/')
def index():
    return render_template('index.html')

@app.route('/login')
def login():
    sp_oauth = SpotifyOAuth(client_id=SPOTIPY_CLIENT_ID,
                            client_secret=SPOTIPY_CLIENT_SECRET,
                            redirect_uri=SPOTIPY_REDIRECT_URI,
                            scope=["user-top-read"])
    auth_url = sp_oauth.get_authorize_url()
    return render_template('loginpage.html', auth_url=auth_url)

@app.route('/callback')
def callback():
    sp_oauth = SpotifyOAuth(client_id=SPOTIPY_CLIENT_ID,
                            client_secret=SPOTIPY_CLIENT_SECRET,
                            redirect_uri=SPOTIPY_REDIRECT_URI,
                            scope=["user-top-read"])
    token_info = sp_oauth.get_access_token(request.args['code'])
    session['token'] = token_info['access_token']
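    # /recommend only accepts POST, so redirecting there (a GET) would return 405.
    # Send the user back to the index page, assumed to hold the movie/TV choice form.
    return redirect(url_for('index'))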

@app.route('/recommend', methods=['POST'])
def recommend():
    choice = request.form.get('choice', None)

    if not choice:
        flash('Choice is missing!', 'danger')
        return redirect(url_for('index'))

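    # The access token is stored in the session by /callback; without it, ask the user to log in
    token = session.get('token')
    if not token:
        flash('Please log in with Spotify first.', 'danger')
        return redirect(url_for('login'))

    # Compute the top genres and the per-genre counts from the user's top artists
    top_5_genres, genre_count = get_top_genres_and_counts(token)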

    matching_genres_weighted = []

    if choice == "movie":
        genre_mapping = movie_genre_mapping
    elif choice == "tvshow":
        genre_mapping = tv_genre_mapping
    else:
        flash('Invalid choice!', 'danger')
        return redirect(url_for('index'))

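    for spotify_genre in top_5_genres:
        if spotify_genre in genre_mapping:
            # Repeat each mapped genre once per occurrence of the Spotify genre
            weight = int(genre_count[spotify_genre])
            for corresponding_genre in genre_mapping[spotify_genre]:
                matching_genres_weighted.extend([corresponding_genre] * weight)

    # random.choice raises IndexError on an empty list, so stop if nothing matched
    if not matching_genres_weighted:
        flash('No matching genres found for your listening history.', 'warning')
        return redirect(url_for('index'))

    # Pick one genre at random; genres that appear more often are more likely
    result = get_recommendation(matching_genres_weighted)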

    return render_template('displayrecommendation.html', recommendation=result)

if __name__ == '__main__':
    app.run(debug=True)


 * Tip: There are .env or .flaskenv files present. Do "pip install python-dotenv" to use them.
 * Serving Flask app '__main__'
 * Debug mode: on
 * Running on http://127.0.0.1:5000
INFO:werkzeug:Press CTRL+C to quit
INFO:werkzeug: * Restarting with stat

SystemExit: ignored

#### Code Breakdown

Module Imports and Initialization
1. Flask Framework:

- Flask: It's the core class that all Flask applications must create an instance of. The instance acts as the central object. In this application, it's instantiated as app.
- render_template: This function is crucial for integrating Flask with Jinja2 templates. It allows the back-end code to render front-end HTML templates and potentially pass Python values into them.
- request: Represents the client's HTTP request and contains all the information sent by the client. This module helps retrieve form data or query parameters.
- redirect: Used to redirect a user to a different endpoint.
- url_for: Generates URLs to different views, making it easier to change URLs in the future without breaking links.
- flash: A utility to send one-time alerts or messages to the rendered HTML templates.
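- session: Stores information specific to a user from one request to the next. By default Flask keeps this data in a cookie on the client, signed with app.secret_key, so the user can read it but cannot modify it without invalidating the signature.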

2. spotipy & OAuth:
- spotipy: A lightweight Python library for accessing the Spotify Web API.
- SpotifyOAuth: The class in spotipy.oauth2 that manages the authorization flow: it builds the authorization URL and exchanges the returned code for an access token.

3. Miscellaneous Libraries:
- pandas (as pd): A data manipulation and analysis library. It is used here only to count how often each genre occurs.
- random: Provides pseudo-random selection. Here random.choice picks from a list in which genres are repeated, which makes the choice weighted.

4. Initialization:
- The Flask application is created, and app.secret_key is assigned a value. Flask uses this key to sign the session cookie, so a client cannot alter the session contents undetected.
- The Spotify credentials (SPOTIPY_CLIENT_ID, SPOTIPY_CLIENT_SECRET, and SPOTIPY_REDIRECT_URI) are defined as module-level constants and passed to SpotifyOAuth to authenticate with the Spotify API.
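The credentials in the code above are written directly into the source file. A minimal sketch of reading them from environment variables instead is shown below; the helper name load_spotify_config is hypothetical and not part of Watchify, and the variable names simply reuse the constant names from the code.

```python
import os


def load_spotify_config(env=os.environ):
    """Hypothetical helper: read the three Spotify settings from the environment."""
    required = ("SPOTIPY_CLIENT_ID", "SPOTIPY_CLIENT_SECRET", "SPOTIPY_REDIRECT_URI")
    # Collect every missing name so the error message lists them all at once
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {name: env[name] for name in required}


# Usage with an explicit mapping, so the sketch runs without real credentials
config = load_spotify_config({
    "SPOTIPY_CLIENT_ID": "your-client-id",
    "SPOTIPY_CLIENT_SECRET": "your-client-secret",
    "SPOTIPY_REDIRECT_URI": "http://localhost:5000/callback",
})
print(config["SPOTIPY_REDIRECT_URI"])  # http://localhost:5000/callback
```

Keeping the values out of the source file means they are not published along with the notebook or repository.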

Helper Functions and Data Structures
1. get_user_top_genres:

- Accepts token (to authorize Spotify API access) and limit (number of artists to retrieve).
- The function initializes a new spotipy.Spotify object using the given token.
- Fetches the user's top artists via the API.
- Iterates through these artists, accumulating a list of all genres associated with these artists.
- Returns the combined list of genres.
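The genre accumulation step can be tried without calling Spotify. The sketch below uses a hand-written sample dictionary that only mimics the two fields the function reads ('items' and 'genres'); the artist names and the helper name collect_genres are made up for illustration.

```python
def collect_genres(results):
    """Flatten the 'genres' list of every artist into one list (duplicates kept)."""
    genres = []
    for artist in results["items"]:
        genres.extend(artist["genres"])
    return genres


# Hand-written sample shaped like the part of the response the code reads
sample = {
    "items": [
        {"name": "Artist A", "genres": ["pop", "art pop"]},
        {"name": "Artist B", "genres": ["pop", "reggaeton"]},
        {"name": "Artist C", "genres": []},
    ]
}

print(collect_genres(sample))  # ['pop', 'art pop', 'pop', 'reggaeton']
```

Duplicates are kept on purpose: the number of times a genre appears is what the counting step uses later.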

2. get_top_genres_and_counts:

- Accepts token and the number of top genres, top_n.
- Fetches the user's top genres using the aforementioned function.
- Utilizes pandas to count the occurrence of each genre.
- Extracts the top N genres based on their frequency.
- Returns the top N genre names together with the full genre-count Series, which covers every genre found and not only the top N.
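The counting step can be sketched with the standard library alone. This version swaps pandas for collections.Counter so that it runs without extra packages; it is a stand-in for illustration, not the code Watchify uses, and the order of tied genres may differ from pandas.

```python
from collections import Counter


def top_genres_and_counts(genres, top_n=5):
    """Return the top_n most frequent genres and the counts for all genres."""
    counts = Counter(genres)
    top = [genre for genre, _ in counts.most_common(top_n)]
    return top, counts


genres = ["pop", "art pop", "pop", "reggaeton", "pop", "reggaeton"]
top, counts = top_genres_and_counts(genres, top_n=2)
print(top)                # ['pop', 'reggaeton']
print(counts["art pop"])  # 1
```

Note that the counts object still holds 'art pop' even though it is not in the top two, which mirrors the second return value described above.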

3. get_recommendation:

- Given a list in which each genre is repeated according to its weight, it picks one element uniformly at random with random.choice, so a genre's chance of being selected is proportional to how many times it appears in the list.
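A short experiment makes the weighting visible. In the sketch below, the helper name empirical_shares, the seed, and the number of draws are arbitrary choices for illustration.

```python
import random
from collections import Counter


def empirical_shares(weighted_genres, draws=10_000, seed=0):
    """Draw repeatedly with random.choice and report each genre's share of draws."""
    rng = random.Random(seed)  # seeded so repeated runs give the same result
    counts = Counter(rng.choice(weighted_genres) for _ in range(draws))
    return {genre: n / draws for genre, n in counts.items()}


# 'action' appears three times and 'romance' once, so the weights are 3:1
weighted = ["action"] * 3 + ["romance"] * 1
shares = empirical_shares(weighted)
print(shares)  # 'action' lands close to 0.75 and 'romance' close to 0.25
```

The same effect can be had without repeating list entries by calling random.choices(["action", "romance"], weights=[3, 1]), which returns a list of picks.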

4. Genre Mapping Dictionaries:

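- Two dictionaries (movie_genre_mapping and tv_genre_mapping) map Spotify music genres to screen genres for movies and TV shows, respectively. The TV mapping uses capitalized genre names and replaces film-noir with Musical (jazz) and Music (blues); the other entries mirror the movie mapping.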
- They act as a reference for converting a user's musical preferences into film or series genre recommendations.
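The lookup is an exact match on the dictionary key, so a Spotify genre that is not listed is silently skipped. The sketch below shows this with a two-entry subset of movie_genre_mapping; the helper name split_mapped and the input genre 'dance pop' are illustrative assumptions.

```python
def split_mapped(spotify_genres, mapping):
    """Separate genres that have a mapping entry from those that do not."""
    mapped = {g: mapping[g] for g in spotify_genres if g in mapping}
    unmapped = [g for g in spotify_genres if g not in mapping]
    return mapped, unmapped


# Two entries copied from movie_genre_mapping
movie_subset = {
    "pop": ["action", "adventure", "romance"],
    "jazz": ["film-noir", "romance", "biography"],
}

mapped, unmapped = split_mapped(["pop", "dance pop", "jazz"], movie_subset)
print(mapped["jazz"])  # ['film-noir', 'romance', 'biography']
print(unmapped)        # ['dance pop']
```

If most of a user's top genres turn out to be unmapped, the weighted list may end up short or empty, so checking the unmapped list is a quick way to see whether the dictionaries need more entries.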

Flask Route Handlers
1. "/" (index):

- A basic route that serves the main homepage of the application.

2. "/login":

- Initiates the Spotify OAuth process.
- An authorization URL is constructed using SpotifyOAuth, which will redirect users to Spotify for authentication.
- The authorization URL is passed to a template (loginpage.html), which likely contains a login button or link.
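To show the shape of the authorization URL, the sketch below rebuilds it with the standard library, using the same query parameters that appear in the authorization link near the top of this document. It is an illustration only: in the app the URL comes from SpotifyOAuth.get_authorize_url(), which may include additional parameters, and build_authorize_url is a hypothetical name.

```python
from urllib.parse import urlencode

AUTHORIZE_ENDPOINT = "https://accounts.spotify.com/authorize"


def build_authorize_url(client_id, redirect_uri, scope):
    """Assemble an authorization URL with URL-encoded query parameters."""
    query = urlencode({
        "client_id": client_id,
        "response_type": "code",       # ask for an authorization code
        "redirect_uri": redirect_uri,  # must match the URI registered for the app
        "scope": scope,
    })
    return f"{AUTHORIZE_ENDPOINT}?{query}"


url = build_authorize_url(
    "YOUR_CLIENT_ID", "http://localhost:5000/callback", "user-top-read"
)
print(url)
```

The redirect URI is percent-encoded in the result, which is why it appears as http%3A%2F%2Flocalhost... in the link.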

3. "/callback":

- The endpoint Spotify redirects to after the user authorizes.
- The provided OAuth code is exchanged for an access token using SpotifyOAuth.
- The access token is then stored in the session to authenticate subsequent requests to Spotify.
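The code that Spotify returns arrives as a query parameter on the redirect URL. The sketch below extracts it with the standard library, outside Flask, and also handles the error parameter that OAuth 2.0 defines for a denied request; extract_code is a hypothetical helper, and in the app the same value is read with request.args['code'].

```python
from urllib.parse import parse_qs, urlparse


def extract_code(callback_url):
    """Return the authorization code from a redirect URL, or raise ValueError."""
    params = parse_qs(urlparse(callback_url).query)
    if "error" in params:
        # The user declined, or the authorization request was rejected
        raise ValueError("Authorization failed: " + params["error"][0])
    codes = params.get("code")
    if not codes:
        raise ValueError("No 'code' parameter in the callback URL")
    return codes[0]


print(extract_code("http://localhost:5000/callback?code=abc123"))  # abc123
```

Checking for a missing code before exchanging it avoids passing an empty value to get_access_token.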

4. "/recommend":

- Accepts only POST requests.
- Depending on the user's provided choice (movies or TV shows), this handler processes genre recommendations.
- The top music genres of the user are derived from Spotify based on the stored access token.
- The relevant genre mapping dictionary is consulted to compile a weighted list of movie or TV genres.
- A random recommendation is made from this weighted list, and then it's relayed to the user via a template (displayrecommendation.html).

Application Entry Point
- The if __name__ == '__main__': guard ensures the Flask development server starts only when this script is executed as the main program, not when it is imported. app.run(debug=True) starts the server with debugging enabled, which provides the interactive debugger and automatic reloading on code changes.
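The weighting logic inside the /recommend handler can be sketched as a pair of plain functions, separate from Flask. The function names, the sample counts, and the choice to return None when nothing matches are assumptions made for this illustration; the mapping is a two-entry subset of movie_genre_mapping.

```python
import random


def build_weighted_genres(top_genres, genre_counts, mapping):
    """Repeat each mapped screen genre once per occurrence of its Spotify genre."""
    weighted = []
    for spotify_genre in top_genres:
        # Genres without a mapping entry contribute nothing
        for screen_genre in mapping.get(spotify_genre, []):
            weighted.extend([screen_genre] * int(genre_counts[spotify_genre]))
    return weighted


def pick_recommendation(top_genres, genre_counts, mapping, rng=random):
    """Pick one screen genre at random, or None when no genre could be mapped."""
    weighted = build_weighted_genres(top_genres, genre_counts, mapping)
    if not weighted:
        return None  # random.choice would raise IndexError on an empty list
    return rng.choice(weighted)


movie_subset = {
    "pop": ["action", "adventure", "romance"],
    "jazz": ["film-noir", "romance", "biography"],
}
counts = {"pop": 3, "dance pop": 2, "jazz": 1}
weighted = build_weighted_genres(["pop", "dance pop", "jazz"], counts, movie_subset)
print(len(weighted))             # 12
print(weighted.count("romance"))  # 4
print(pick_recommendation(["dance pop"], counts, movie_subset))  # None
```

In this sample, 'romance' is reachable from both 'pop' and 'jazz', so it holds 4 of the 12 slots and is the most likely pick.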