# Watchify

A Flask Web App:

- Uses a Spotify user's top artists and tracks to provide a custom Movie/TV Show recommendation (as a downloadable graphic/animation), matched to the user's 5 most frequently listened-to genres.
- Prompts the user for feedback on the recommendation system, which is stored in a .csv file in Spotify Project (Google Drive).
- If a feedback .csv file already exists in Spotify Project, it is updated each time the code is run on Google Colab.
- Each user's data and their respective recommendation data are updated at the same time.
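
The append-or-create behaviour described above could look roughly like the following sketch. It uses only the standard library; the column names and the `append_feedback` / `read_feedback` helpers are illustrative assumptions, not the project's actual schema or code.

```python
import csv
import os

# Hypothetical column layout; the real feedback file may use different fields.
FIELDNAMES = ["user_id", "recommendation", "rating"]


def append_feedback(path, row):
    """Append one feedback row, writing the header only when the file is new."""
    is_new = not os.path.exists(path) or os.path.getsize(path) == 0
    with open(path, "a", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=FIELDNAMES)
        if is_new:
            writer.writeheader()
        writer.writerow(row)


def read_feedback(path):
    """Return all stored feedback rows as a list of dicts."""
    with open(path, newline="", encoding="utf-8") as f:
        return list(csv.DictReader(f))
```

Opening the file in append mode means existing feedback is never overwritten; the header check only matters the first time the file is created.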

To Use This Notebook:

- Run the code in Google Colab notebook (Spotify Project).
- You'll be prompted to click on a URL to authorize the app to access your Spotify data.
- After authorization, you'll be redirected to a localhost URL. The page itself may fail to load; only the URL in the address bar matters.
- Copy the value of the `code` parameter from that URL and paste it back into the Colab input prompt.
- Then, click either the "Movie" or "TV Show" button to get a recommendation!
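
Copying the code by hand is error-prone, so one option is to paste the whole redirected URL and parse it. The sketch below is a minimal illustration using only the standard library; `extract_auth_code` is a hypothetical helper and the code value in the example is made up.

```python
from urllib.parse import urlparse, parse_qs


def extract_auth_code(redirect_url):
    """Return the value of the `code` query parameter from the redirect URL."""
    query = parse_qs(urlparse(redirect_url).query)
    codes = query.get("code")
    if not codes:
        raise ValueError("No 'code' parameter found in the URL")
    return codes[0]


# Example with a made-up code value
print(extract_auth_code("http://localhost/?code=AQD_example123"))
```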

## Movie Genres Datasets, Implementation and Filtering Logic

Data Source: [Kaggle, IMDB](https://www.kaggle.com/datasets/rajugc/imdb-movies-dataset-based-on-genre)

- The datasets were updated 6 months ago, so they are reasonably up to date.


### 1.  Concatenating Movie Genre Data Using pandas

#### Code

In [None]:
import pandas as pd
from google.colab import files
import io

# Step 1: Upload the files
uploaded_files = files.upload()

all_data = []

# Step 2, 3, and 4: Read each uploaded CSV, add genre column, and concatenate
for filename in uploaded_files.keys():
    # Read the file content into a pandas DataFrame
    df = pd.read_csv(io.StringIO(uploaded_files[filename].decode('utf-8')))

    # Assume that the filename before '.csv' is the genre name
    genre_name = filename.split('.csv')[0]
    df['genre'] = genre_name

    all_data.append(df)

# Concatenate all dataframes
final_df = pd.concat(all_data, ignore_index=True)

# Step 5: Save the combined DataFrame into a new CSV file
final_df.to_csv('combined_movies.csv', index=False)

# Download the combined CSV
files.download('combined_movies.csv')


Saving action.csv to action.csv
Saving adventure.csv to adventure.csv
Saving animation.csv to animation.csv
Saving biography.csv to biography.csv
Saving crime.csv to crime.csv
Saving family.csv to family.csv
Saving fantasy.csv to fantasy.csv
Saving film-noir.csv to film-noir.csv
Saving history.csv to history.csv
Saving horror.csv to horror.csv
Saving mystery.csv to mystery.csv
Saving romance.csv to romance.csv
Saving scifi.csv to scifi.csv
Saving sports.csv to sports.csv
Saving thriller.csv to thriller.csv
Saving war.csv to war.csv



#### Code Breakdown

* Imports:

  The necessary libraries are imported. Pandas is for data handling, the files module from Google Colab enables file interactions, and the io module allows for stream operations.

* Uploading Files:

  The user is prompted to upload one or more CSV files using Google Colab's file upload utility.

* Data Aggregation Initialization:

  An empty list named all_data is initialized. This will be used to store the data from each uploaded file after processing.

* Processing Each Uploaded File:

  For each uploaded file:
  - The content of the file is read into a pandas DataFrame.
  - The genre is determined from the filename itself (it assumes the filename without the '.csv' extension is the genre name).
  - A new column named 'genre' is added to the DataFrame, and it's filled with the determined genre name.
  - This processed DataFrame is then added to the all_data list.

* Combining DataFrames:

  All the DataFrames in the all_data list are concatenated (combined) into a single DataFrame. This combined DataFrame includes all the data from the uploaded files, with an additional 'genre' column indicating the genre of each movie.

* Saving and Downloading:

  The final combined DataFrame is saved to a new CSV file named 'combined_movies.csv'.
  This combined CSV file is then automatically downloaded to the user's computer using Google Colab's file download utility.
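
The filename-to-genre step relies on `filename.split('.csv')[0]`, which works for the plain names shown in the upload log. A slightly more defensive variant is sketched below. It assumes, hypothetically, that a re-uploaded file may carry a " (1)"-style suffix; `genre_from_filename` is an illustrative helper, not part of the notebook.

```python
import os
import re


def genre_from_filename(filename):
    """Derive a genre label from an uploaded CSV's file name."""
    stem, _ext = os.path.splitext(os.path.basename(filename))
    # Drop a trailing " (n)" suffix such as "action (1)" (assumed duplicate marker)
    stem = re.sub(r"\s*\(\d+\)$", "", stem)
    return stem.strip().lower()


print(genre_from_filename("film-noir.csv"))   # film-noir
print(genre_from_filename("action (1).csv"))  # action
```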

### 2. Extracting Movie Genres

#### Code

In [None]:
import pandas as pd
from google.colab import drive  # required for drive.mount below

# Mount Google Drive
drive.mount('/content/drive')

# Read the CSV file into a DataFrame
df = pd.read_csv('/content/drive/MyDrive/Spotify Project/combined_movies.csv')

# Drop rows where the 'genre' column is NaN
df = df.dropna(subset=['genre'])

# Split the genres on commas, capitalize the first letter, and flatten the list
all_genres = [genre.strip().capitalize() for sublist in df['genre'].str.split(',') for genre in sublist]

# Get unique genres using a set
unique_genres = set(all_genres)

# Convert unique genres into a DataFrame
df_genres = pd.DataFrame(sorted(unique_genres), columns=['Genres'])
df_genres

Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).


|   | Genres    |
|---|-----------|
| 0 | Action    |
| 1 | Adventure |
| 2 | Animation |
| 3 | Biography |
| 4 | Crime     |
| 5 | Family    |
| 6 | Fantasy   |
| 7 | Film-noir |
| 8 | History   |
| 9 | Horror    |


#### Code Breakdown

1. Introduction
- Purpose: Analyze the genres in the combined movie dataset (IMDB data from Kaggle) built in the previous step.

2. Setting Up Environment
- Data Source: A CSV file located in Google Drive.
Import the necessary libraries:
- import pandas as pd
- from google.colab import drive (required by drive.mount in the next step)

3. Google Drive Integration
Mount Google Drive for direct access to files:
- drive.mount('/content/drive')

4. Data Loading
Read data from the CSV file into a DataFrame:
- File Path: /content/drive/MyDrive/Spotify Project/combined_movies.csv
- Command: df = pd.read_csv('/content/drive/MyDrive/Spotify Project/combined_movies.csv')

5. Data Cleaning
Challenge:
- Missing genre data for some movies.
Solution:
- Drop rows where the 'genre' column has NaN values.
- Command: df = df.dropna(subset=['genre'])

6. Data Transformation: Splitting and Capitalizing
Genres in the dataset are comma-separated.
- Convert this to a list where each genre is capitalized.
- For example: "action, drama" → ["Action", "Drama"]
- Command: all_genres = [genre.strip().capitalize() for sublist in df['genre'].str.split(',') for genre in sublist]

7. Data Deduplication
Objective: Identify unique genres in the dataset.
- Use a Python set to deduplicate the list.
- Command: unique_genres = set(all_genres)

8. Creating a New DataFrame for Genres
Convert the set of unique genres into a DataFrame for easier analysis.
- Command: df_genres = pd.DataFrame(sorted(unique_genres), columns=['Genres'])


We've successfully cleaned, transformed, and deduplicated the genres data.
The result is stored in df_genres, ready for further analysis.
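
For readers without pandas at hand, the split, capitalize, and deduplicate steps above can be sketched with plain Python. This is an illustration of the same idea rather than the notebook's code; `unique_genres` is a hypothetical helper and the input values are made up.

```python
def unique_genres(genre_cells):
    """Split comma-separated genre strings and return sorted unique names."""
    seen = set()
    for cell in genre_cells:
        if not cell:  # skip None and empty strings (the dropna step)
            continue
        for genre in cell.split(","):
            name = genre.strip().capitalize()
            if name:
                seen.add(name)
    return sorted(seen)


print(unique_genres(["action, drama", None, "Drama", "film-noir"]))
# ['Action', 'Drama', 'Film-noir']
```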

### 3. Fetch Spotify User's Top 5 Most Listened Genres Over the Past 30 days

#### Code

In [None]:
!pip install spotipy

Collecting spotipy
  Downloading spotipy-2.23.0-py3-none-any.whl (29 kB)
Collecting redis>=3.5.3 (from spotipy)
  Downloading redis-5.0.1-py3-none-any.whl (250 kB)
Installing collected packages: redis, spotipy
Successfully installed redis-5.0.1 spotipy-2.23.0


In [None]:
import spotipy
from spotipy.oauth2 import SpotifyOAuth

# Spotify API Credentials
SPOTIPY_CLIENT_ID = 'ecec60c9a316409a84a45c923f7473ee'
SPOTIPY_CLIENT_SECRET = 'YOUR_CLIENT_SECRET'  # placeholder: keep the real secret out of shared notebooks
SPOTIPY_REDIRECT_URI = 'http://localhost/'

# Create OAuth2 object
sp_oauth = SpotifyOAuth(client_id=SPOTIPY_CLIENT_ID,
                        client_secret=SPOTIPY_CLIENT_SECRET,
                        redirect_uri=SPOTIPY_REDIRECT_URI,
                        scope=["user-top-read"])

# Get the authorization URL
auth_url = sp_oauth.get_authorize_url()
print("Please go to the following URL to authorize:")
print(auth_url)

# After clicking the URL and authorizing, you'll be redirected to your specified redirect_uri.
# Extract the code parameter from that URL.
code = input("Enter the code from the URL (localhost/code='...'): ")
token = sp_oauth.get_access_token(code, as_dict=False)
sp = spotipy.Spotify(auth=token)

# Fetch user's top tracks from the past 30 days (short term)
top_tracks = sp.current_user_top_tracks(limit=50, time_range='short_term')

# Extract artists from the top tracks
artist_ids = [track['album']['artists'][0]['id'] for track in top_tracks['items']]

# Fetch artist details for these artists
artists = sp.artists(artist_ids)['artists']

# Count frequency of each genre
genre_count = {}
for artist in artists:
    for genre in artist['genres']:
        if genre in genre_count:
            genre_count[genre] += 1
        else:
            genre_count[genre] = 1

# Sort genres by frequency, get top 5, and capitalize the first letter of each genre
top_5_genres = [genre.capitalize() for genre in sorted(genre_count, key=genre_count.get, reverse=True)[:5]]

# Display the result
for genre in top_5_genres:
    print("-", genre)

Please go to the following URL to authorize:
https://accounts.spotify.com/authorize?client_id=ecec60c9a316409a84a45c923f7473ee&response_type=code&redirect_uri=http%3A%2F%2Flocalhost%2F&scope=user-top-read
Enter the code from the URL (localhost/code='...'): AQDbvgaTLTPvcQ0gpGXY6YP9brSAkl40qFLal5RbO0FhQDvulnDDMnk44VuxDBfJcH1nvOOtcN7AlTKJOLCDY9w40MUPFNVvB_4muEThltSZ4sl8thX3OuXyTmVun-UfmTzE-3-C70nSwlTdsbNAV3zTmU2vWvU5laFFTegEx0E_iaWl
- Pop
- Latin viral pop
- R&b en espanol
- Trap latino
- Urbano latino


#### Code Breakdown

1. Introduction
- Purpose: Extract and analyze a user's top tracks and associated genres using Spotify's API.
- Data Source: Spotify API.

2. Setting Up Environment
Import necessary libraries/modules:
- import spotipy
- from spotipy.oauth2 import SpotifyOAuth

3. API Credentials
Spotify API requires credentials for authentication:
- SPOTIPY_CLIENT_ID
- SPOTIPY_CLIENT_SECRET
- SPOTIPY_REDIRECT_URI

4. OAuth2 Authorization
Create an OAuth2 object for authentication:
- Command: sp_oauth = SpotifyOAuth(...)
Obtain the authorization URL and instruct the user to visit it:
Commands:
- auth_url = sp_oauth.get_authorize_url()
- print("Please go to the following URL to authorize:")

5. Token Generation
After authorization, capture the code from the redirected URL:
- Command: code = input("Enter the code from the URL...")

Use this code to get the access token:
- Command: token = sp_oauth.get_access_token(...)

6. Fetching User's Top Tracks
Using the authenticated API client (spotipy.Spotify), fetch user's top 50 tracks from the past month:
- Command: top_tracks = sp.current_user_top_tracks(...)

7. Extracting Artist IDs
From the top tracks, we extract the first artist ID listed on each track's album (the list is not deduplicated, so the same artist can appear more than once):
- Command: artist_ids = [track['album']['artists'][0]['id'] for track in top_tracks['items']]

8. Retrieving Artist Details
With the artist IDs, we retrieve detailed information about each artist:
- Command: artists = sp.artists(artist_ids)['artists']

9. Genre Analysis
Iterate through artist details to count the frequency of each genre:
Commands:
- genre_count = {}
- for artist in artists: ...
Determine the top 5 most frequent genres and capitalize them:
- Command: top_5_genres = [genre.capitalize() for genre in sorted(genre_count, key=genre_count.get, reverse=True)[:5]]

10. Display Results
Present the top 5 genres to the user:
- Command: for genre in top_5_genres: print("-", genre)


Successfully fetched and analyzed a user's top tracks and their associated genres from Spotify.
Showcased the top 5 genres for the user.
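
The manual dictionary counting in the genre analysis step can also be written with `collections.Counter`. The sketch below is one possible equivalent, run on made-up artist objects that only mimic the 'genres' field used above; `top_genres` is a hypothetical helper.

```python
from collections import Counter


def top_genres(artists, n=5):
    """Count genre occurrences across artists and return the n most common."""
    counts = Counter(
        genre for artist in artists for genre in artist.get("genres", [])
    )
    return [genre for genre, _count in counts.most_common(n)]


# Made-up artist objects shaped like the 'genres' field used above
sample_artists = [
    {"name": "A", "genres": ["pop", "trap latino"]},
    {"name": "B", "genres": ["pop", "urbano latino"]},
    {"name": "C", "genres": ["trap latino", "pop"]},
]
print(top_genres(sample_artists, n=2))  # ['pop', 'trap latino']
```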


### 4. Creating a Genre Map


#### Code

In [2]:
movie_genre_mapping = {
    'pop': ['Comedy', 'Romance'],
    'art pop': ['Fantasy', 'Drama'],
    'reggaeton': ['Action', 'Adventure'],
    'urbano latino': ['Action', 'Drama'],
    'trap latino': ['Crime', 'Action'],
    'rock': ['Action', 'Adventure'],
    'indie rock': ['Drama', 'Romance'],
    'classical': ['History', 'Biography'],
    'hip hop': ['Drama', 'Crime'],
    'jazz': ['Drama', 'Biography'],
    'country': ['Drama', 'Family'],
    'electronic': ['Sci-Fi', 'Mystery'],
    'metal': ['Horror', 'Thriller'],
    'folk': ['Drama', 'History'],
    'blues': ['Drama', 'Crime'],
    'r&b': ['Romance', 'Drama'],
    'soul': ['Drama', 'Romance'],
    'punk': ['Action', 'Thriller'],
    'disco': ['Comedy', 'Romance'],
    'house': ['Sci-Fi', 'Thriller'],
    'techno': ['Sci-Fi', 'Action'],
    'edm': ['Action', 'Sci-Fi'],
    'latin': ['Drama', 'Romance'],
    'reggae': ['Comedy', 'Adventure'],
    'funk': ['Comedy', 'Action'],
    'k-pop': ['Comedy', 'Romance'],
    'psychedelic': ['Fantasy', 'Adventure'],
    'world': ['History', 'Family'],
    'ambient': ['Drama', 'Sci-Fi'],
    'lo-fi beats': ['Drama', 'Romance'],
    'vaporwave': ['Sci-Fi', 'Fantasy'],
    'emo': ['Drama', 'Romance'],
    'hardcore': ['Action', 'Thriller'],
    'dubstep': ['Action', 'Sci-Fi'],
    'ska': ['Comedy', 'Adventure'],
    'swing': ['Comedy', 'Romance'],
    'trance': ['Sci-Fi', 'Adventure'],
    'grime': ['Crime', 'Action'],
    'bluegrass': ['Drama', 'Adventure'],
    'new wave': ['Sci-Fi', 'Romance'],
    'post-punk': ['Drama', 'Thriller'],
    'trip hop': ['Mystery', 'Drama'],
    'neosoul': ['Romance', 'Drama'],
    'afrobeat': ['Drama', 'Adventure'],
    'chillhop': ['Drama', 'Romance'],
    'synthwave': ['Sci-Fi', 'Action'],
    'latin viral pop': ['Comedy', 'Adventure'],
    'r&b en espanol': ['Romance', 'Drama']
}

# Convert genre_mapping to DataFrame
import pandas as pd
df = pd.DataFrame.from_dict(movie_genre_mapping, orient='index').reset_index()
df.columns = ['Spotify Genre', 'Movie Genre 1', 'Movie Genre 2']

# Capitalize the first letter of each entry in the specified columns
df['Spotify Genre'] = df['Spotify Genre'].str.capitalize()
df['Movie Genre 1'] = df['Movie Genre 1'].str.capitalize()
df['Movie Genre 2'] = df['Movie Genre 2'].str.capitalize()

# Display DataFrame
df

|   | Spotify Genre | Movie Genre 1 | Movie Genre 2 |
|---|---------------|---------------|---------------|
| 0 | Pop           | Comedy        | Romance       |
| 1 | Art pop       | Fantasy       | Drama         |
| 2 | Reggaeton     | Action        | Adventure     |
| 3 | Urbano latino | Action        | Drama         |
| 4 | Trap latino   | Crime         | Action        |
| 5 | Rock          | Action        | Adventure     |
| 6 | Indie rock    | Drama         | Romance       |
| 7 | Classical     | History       | Biography     |
| 8 | Hip hop       | Drama         | Crime         |
| 9 | Jazz          | Drama         | Biography     |


#### Mapping Relevance

In [None]:
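# top_5_genres may be capitalized while the mapping keys are lowercase, so compare in lowercase
unmapped_genres = [genre for genre in top_5_genres if genre.lower() not in movie_genre_mapping]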
print("Top genres without a movie mapping:", unmapped_genres)

Top genres without a movie mapping: []


#### Code Breakdown

1. Introduction
- Purpose: Map musical genres from Spotify to corresponding movie genres.
- Intention: Understand the thematic overlap between music preferences and potential film preferences.

2. The Mapping Dictionary
- The movie_genre_mapping dictionary holds the mapping.
- Each key represents a Spotify musical genre.
- Each value is a list containing two associated movie genres.

3. Data Transformation to DataFrame
Utilize the pandas library to structure and visualize our data:
- Command: df = pd.DataFrame.from_dict(movie_genre_mapping, orient='index').reset_index()
Adjust column names for clarity:
- Command: df.columns = ['Spotify Genre', 'Movie Genre 1', 'Movie Genre 2']

4. Data Cleaning
Capitalize the first letter of each genre for consistent display (note that str.capitalize also lowercases the remaining characters, so 'Sci-Fi' becomes 'Sci-fi'):
- Command: df['Spotify Genre'] = df['Spotify Genre'].str.capitalize()
- Command: df['Movie Genre 1'] = df['Movie Genre 1'].str.capitalize()
- Command: df['Movie Genre 2'] = df['Movie Genre 2'].str.capitalize()

5. Resulting DataFrame
- The finalized DataFrame df is displayed.
- It lists each Spotify musical genre alongside its two mapped movie genres.


The code successfully mapped and visualized relationships between musical genres and potential film preferences.
The mapping could serve as a foundation for recommending movies based on a user's musical taste.
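
Because the mapping keys are lowercase while displayed genre names may be capitalized, any coverage check has to normalize case first. The sketch below shows one way to split a user's genres into mapped and unmapped ones; `split_by_mapping` is a hypothetical helper and the dictionary is only a two-entry excerpt of the mapping.

```python
def split_by_mapping(top_genres, mapping):
    """Return (mapped, unmapped) genre lists, comparing names case-insensitively."""
    mapped, unmapped = [], []
    for genre in top_genres:
        key = genre.strip().lower()
        (mapped if key in mapping else unmapped).append(genre)
    return mapped, unmapped


# A two-entry excerpt of the mapping, for illustration only
excerpt = {"pop": ["Comedy", "Romance"], "trap latino": ["Crime", "Action"]}
print(split_by_mapping(["Pop", "Trap latino", "Corrido"], excerpt))
# (['Pop', 'Trap latino'], ['Corrido'])
```

A non-empty unmapped list is a hint that the mapping needs another entry before that user can get a recommendation for those genres.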

### 5. Matching Spotify Top 5 Genres to Movie Genres with Genre Mapping

#### Code

In [None]:
import spotipy
from spotipy.oauth2 import SpotifyOAuth
import pandas as pd

# Spotify API Credentials
SPOTIPY_CLIENT_ID = 'ecec60c9a316409a84a45c923f7473ee'
SPOTIPY_CLIENT_SECRET = 'YOUR_CLIENT_SECRET'  # placeholder: keep the real secret out of shared notebooks
SPOTIPY_REDIRECT_URI = 'http://localhost/'

# Create OAuth2 object
sp_oauth = SpotifyOAuth(client_id=SPOTIPY_CLIENT_ID,
                        client_secret=SPOTIPY_CLIENT_SECRET,
                        redirect_uri=SPOTIPY_REDIRECT_URI,
                        scope=["user-top-read"])

# Get the authorization URL
auth_url = sp_oauth.get_authorize_url()
print("Please go to the following URL to authorize:")
print(auth_url)

# Extract the code from the redirected URL
code = input("Enter the code from the URL (localhost/code='...'): ")
token = sp_oauth.get_access_token(code, as_dict=False)
sp = spotipy.Spotify(auth=token)

# Fetch user's top tracks from the past 30 days (short term)
top_tracks = sp.current_user_top_tracks(limit=50, time_range='short_term')

# Extract artists from the top tracks
artist_ids = [track['album']['artists'][0]['id'] for track in top_tracks['items']]

# Fetch artist details for these artists
artists = sp.artists(artist_ids)['artists']

# Count frequency of each genre
genre_count = {}
for artist in artists:
    for genre in artist['genres']:
        if genre in genre_count:
            genre_count[genre] += 1
        else:
            genre_count[genre] = 1

# Ensure that the genres from Spotify are in lowercase for the mapping.
top_5_genres = [genre.lower() for genre in sorted(genre_count, key=genre_count.get, reverse=True)[:5]]

# Movie Genre Mapping
movie_genre_mapping = {
    'pop': ['Comedy', 'Romance'],
    'art pop': ['Fantasy', 'Drama'],
    'reggaeton': ['Action', 'Adventure'],
    'urbano latino': ['Action', 'Drama'],
    'trap latino': ['Crime', 'Action'],
    'rock': ['Action', 'Adventure'],
    'indie rock': ['Drama', 'Romance'],
    'classical': ['History', 'Biography'],
    'hip hop': ['Drama', 'Crime'],
    'jazz': ['Drama', 'Biography'],
    'country': ['Drama', 'Family'],
    'electronic': ['Sci-Fi', 'Mystery'],
    'metal': ['Horror', 'Thriller'],
    'folk': ['Drama', 'History'],
    'blues': ['Drama', 'Crime'],
    'r&b': ['Romance', 'Drama'],
    'soul': ['Drama', 'Romance'],
    'punk': ['Action', 'Thriller'],
    'disco': ['Comedy', 'Romance'],
    'house': ['Sci-Fi', 'Thriller'],
    'techno': ['Sci-Fi', 'Action'],
    'edm': ['Action', 'Sci-Fi'],
    'latin': ['Drama', 'Romance'],
    'reggae': ['Comedy', 'Adventure'],
    'funk': ['Comedy', 'Action'],
    'k-pop': ['Comedy', 'Romance'],
    'psychedelic': ['Fantasy', 'Adventure'],
    'world': ['History', 'Family'],
    'ambient': ['Drama', 'Sci-Fi'],
    'lo-fi beats': ['Drama', 'Romance'],
    'vaporwave': ['Sci-Fi', 'Fantasy'],
    'emo': ['Drama', 'Romance'],
    'hardcore': ['Action', 'Thriller'],
    'dubstep': ['Action', 'Sci-Fi'],
    'ska': ['Comedy', 'Adventure'],
    'swing': ['Comedy', 'Romance'],
    'trance': ['Sci-Fi', 'Adventure'],
    'grime': ['Crime', 'Action'],
    'bluegrass': ['Drama', 'Adventure'],
    'new wave': ['Sci-Fi', 'Romance'],
    'post-punk': ['Drama', 'Thriller'],
    'trip hop': ['Mystery', 'Drama'],
    'neosoul': ['Romance', 'Drama'],
    'afrobeat': ['Drama', 'Adventure'],
    'chillhop': ['Drama', 'Romance'],
    'synthwave': ['Sci-Fi', 'Action'],
    'latin viral pop': ['Comedy', 'Adventure'],
    'r&b en espanol': ['Romance', 'Drama']
}

# Using the genre mapping to find corresponding movie genres
matching_genres = set()
for spotify_genre in top_5_genres:
    if spotify_genre.lower() in movie_genre_mapping:  # convert to lowercase to match the mapping keys
        for movie_genre in movie_genre_mapping[spotify_genre.lower()]:
            matching_genres.add(movie_genre.title())  # capitalize each word

print("Matching Genres between Spotify and Movies:", list(matching_genres))

Please go to the following URL to authorize:
https://accounts.spotify.com/authorize?client_id=ecec60c9a316409a84a45c923f7473ee&response_type=code&redirect_uri=http%3A%2F%2Flocalhost%2F&scope=user-top-read
Enter the code from the URL (localhost/code='...'): AQBpw81y6STqTkljC2TT4KcBZa43VFtYKkmAGYnY-chJ7QFX49o5kq1iS-BDmS75uGNmoNGKX7lg576IJIxDPo3CMKR5c-Q42nFWqZpb_GnCozEV_Tn7p6PJKb4mefcNv7DEpM3cS7GKdwjRdeMZ4OzIoQuZVhEEYHL9FX8WYQZv13O_
Matching Genres between Spotify and Movies: ['Thriller', 'Crime', 'Adventure', 'Action']


#### Code Breakdown

1. Introduction
- Objective: Retrieve a user's top music genres from Spotify and map these to corresponding movie genres.
- Tools: Spotipy library for Spotify API and pandas for data structuring.

2. Spotify API Setup
- SpotifyOAuth is required to access the user's data.
- OAuth2 is set up using Spotipy's SpotifyOAuth class.
- Users are sent to an authorization URL to grant permissions.

3. User Authorization
- User consent is required before any listening data is read.
- Once the user authorizes, they are redirected to a URL.
- The code is extracted from the redirected URL and used to obtain an access token.

4. Fetching User's Top Tracks
- Using the Spotipy client, fetch the user's top 50 tracks from the last 30 days.
- Extract artist IDs from these tracks to get more information about the artists.

5. Genre Analysis
- For each artist fetched, identify the musical genres associated with them.
- Tally the frequency of each genre to identify the user's top genres.

6. Movie Genre Mapping
- The movie_genre_mapping dictionary defines the mapping.
- Each Spotify music genre has two associated movie genres.

7. Matching Music to Movies
- Using the user's top Spotify genres, identify matching movie genres.
- Convert Spotify genres to lowercase to ensure a consistent match with the mapping dictionary.
- The result is the final list of movie genres that match the user's music preferences.

The code successfully retrieved the user's top musical genres from Spotify.
Additionally, the code mapped these musical genres to potential movie genre preferences.
This information will be useful for tailored movie recommendations based on music preferences.
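
The matching step collects movie genres in a set, so the printed order can vary between runs. If a stable order is preferred, a list that keeps the first-seen order is one alternative. The sketch below assumes a small excerpt of the mapping and a hypothetical helper name, `matching_movie_genres`.

```python
def matching_movie_genres(top_genres, mapping):
    """Collect movie genres for the given Spotify genres, without duplicates."""
    result = []
    for genre in top_genres:
        for movie_genre in mapping.get(genre.lower(), []):
            if movie_genre not in result:
                result.append(movie_genre)
    return result


# Excerpt of the mapping, for illustration only
excerpt = {
    "pop": ["Comedy", "Romance"],
    "urbano latino": ["Action", "Drama"],
    "trap latino": ["Crime", "Action"],
}
print(matching_movie_genres(["pop", "trap latino", "urbano latino", "unknown genre"], excerpt))
# ['Comedy', 'Romance', 'Crime', 'Action', 'Drama']
```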

### 6. Movie Recommendation

#### Code

In [None]:
from google.colab import drive
import spotipy
from spotipy.oauth2 import SpotifyOAuth
import pandas as pd
import random

# Mount Google Drive
drive.mount('/content/drive')

# Spotify API Credentials
SPOTIPY_CLIENT_ID = 'ecec60c9a316409a84a45c923f7473ee'
SPOTIPY_CLIENT_SECRET = 'YOUR_CLIENT_SECRET'  # placeholder: keep the real secret out of shared notebooks
SPOTIPY_REDIRECT_URI = 'http://localhost/'

# Create OAuth2 object
sp_oauth = SpotifyOAuth(client_id=SPOTIPY_CLIENT_ID,
                        client_secret=SPOTIPY_CLIENT_SECRET,
                        redirect_uri=SPOTIPY_REDIRECT_URI,
                        scope=["user-top-read"])

# Get the authorization URL
auth_url = sp_oauth.get_authorize_url()
print("Please go to the following URL to authorize:")
print(auth_url)

# Extract the code from the redirected URL
code = input("Enter the code from the URL (localhost/code='...'): ")
token = sp_oauth.get_access_token(code, as_dict=False)
sp = spotipy.Spotify(auth=token)

# Fetch user's top tracks from the past 30 days (short term)
top_tracks = sp.current_user_top_tracks(limit=50, time_range='short_term')

# Extract artists from the top tracks
artist_ids = [track['album']['artists'][0]['id'] for track in top_tracks['items']]

# Fetch artist details for these artists
artists = sp.artists(artist_ids)['artists']

# Count frequency of each genre
genre_count = {}
for artist in artists:
    for genre in artist['genres']:
        if genre in genre_count:
            genre_count[genre] += 1
        else:
            genre_count[genre] = 1

# Ensure that the genres from Spotify are in lowercase for the mapping.
top_5_genres = [genre.lower() for genre in sorted(genre_count, key=genre_count.get, reverse=True)[:5]]

# Movie Genre Mapping
movie_genre_mapping = {
    'pop': ['Comedy', 'Romance'],
    'art pop': ['Fantasy', 'Drama'],
    'reggaeton': ['Action', 'Adventure'],
    'urbano latino': ['Action', 'Drama'],
    'trap latino': ['Crime', 'Action'],
    'rock': ['Action', 'Adventure'],
    'indie rock': ['Drama', 'Romance'],
    'classical': ['History', 'Biography'],
    'hip hop': ['Drama', 'Crime'],
    'jazz': ['Drama', 'Biography'],
    'country': ['Drama', 'Family'],
    'electronic': ['Sci-Fi', 'Mystery'],
    'metal': ['Horror', 'Thriller'],
    'folk': ['Drama', 'History'],
    'blues': ['Drama', 'Crime'],
    'r&b': ['Romance', 'Drama'],
    'soul': ['Drama', 'Romance'],
    'punk': ['Action', 'Thriller'],
    'disco': ['Comedy', 'Romance'],
    'house': ['Sci-Fi', 'Thriller'],
    'techno': ['Sci-Fi', 'Action'],
    'edm': ['Action', 'Sci-Fi'],
    'latin': ['Drama', 'Romance'],
    'reggae': ['Comedy', 'Adventure'],
    'funk': ['Comedy', 'Action'],
    'k-pop': ['Comedy', 'Romance'],
    'psychedelic': ['Fantasy', 'Adventure'],
    'world': ['History', 'Family'],
    'ambient': ['Drama', 'Sci-Fi'],
    'lo-fi beats': ['Drama', 'Romance'],
    'vaporwave': ['Sci-Fi', 'Fantasy'],
    'emo': ['Drama', 'Romance'],
    'hardcore': ['Action', 'Thriller'],
    'dubstep': ['Action', 'Sci-Fi'],
    'ska': ['Comedy', 'Adventure'],
    'swing': ['Comedy', 'Romance'],
    'trance': ['Sci-Fi', 'Adventure'],
    'grime': ['Crime', 'Action'],
    'bluegrass': ['Drama', 'Adventure'],
    'new wave': ['Sci-Fi', 'Romance'],
    'post-punk': ['Drama', 'Thriller'],
    'trip hop': ['Mystery', 'Drama'],
    'neosoul': ['Romance', 'Drama'],
    'afrobeat': ['Drama', 'Adventure'],
    'chillhop': ['Drama', 'Romance'],
    'synthwave': ['Sci-Fi', 'Action'],
    'latin viral pop': ['Comedy', 'Adventure'],
    'r&b en espanol': ['Romance', 'Drama']
}

# Using the genre mapping to find corresponding movie genres
matching_genres_weighted = []
for spotify_genre in top_5_genres:
    if spotify_genre in movie_genre_mapping:
        for movie_genre in movie_genre_mapping[spotify_genre]:
            # Make sure the movie genres are in lowercase for the weighting and filtering.
            matching_genres_weighted.extend([movie_genre.lower()] * genre_count[spotify_genre])

# Read the CSV
df = pd.read_csv('/content/drive/MyDrive/Spotify Project/combined_movies.csv')

# Filter movies with rating 8 and above and votes greater than or equal to 20,000
df = df[(df['rating'] >= 8) & (df['votes'] >= 20000)]

# Filtering is now case-sensitive, matching the lowercase genres.
filtered_movies = df[df['genre'].isin(matching_genres_weighted)]

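if filtered_movies.empty:
    print("No movies found matching the criteria.")
else:
    # Draw only from genres that still have movies after filtering;
    # calling .sample() on an empty selection would raise a ValueError.
    available_genres = set(filtered_movies['genre'])
    candidate_genres = [g for g in matching_genres_weighted if g in available_genres]
    recommended_genre = random.choice(candidate_genres)
    recommended_movie = filtered_movies[filtered_movies['genre'] == recommended_genre].sample().iloc[0]
    print(f"Recommended Movie: {recommended_movie['movie_name']} (Genre: {recommended_movie['genre'].capitalize()}, Rating: {recommended_movie['rating']}, Release Year: {recommended_movie['year']})")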

Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).
Please go to the following URL to authorize:
https://accounts.spotify.com/authorize?client_id=ecec60c9a316409a84a45c923f7473ee&response_type=code&redirect_uri=http%3A%2F%2Flocalhost%2F&scope=user-top-read
Enter the code from the URL (localhost/code='...'): AQDbvgaTLTPvcQ0gpGXY6YP9brSAkl40qFLal5RbO0FhQDvulnDDMnk44VuxDBfJcH1nvOOtcN7AlTKJOLCDY9w40MUPFNVvB_4muEThltSZ4sl8thX3OuXyTmVun-UfmTzE-3-C70nSwlTdsbNAV3zTmU2vWvU5laFFTegEx0E_iaWl
Recommended Movie: The Lion King (Genre: Adventure, Rating: 8.5, Release Year: 1994)


#### Code Breakdown

1. Introduction
- Objective: Use a user's top Spotify music genres to recommend a movie from the CSV dataset stored in Google Drive.
- Tools: Spotipy for Spotify API, Google Colab for Google Drive integration, and pandas for data handling.

2. Google Drive Integration
- The first step is to mount Google Drive to access our movie dataset.

3. Setting up Spotify API
- The Spotify API is set up with the SpotifyOAuth class from the spotipy library.
- The user must authenticate and authorize the app before any data can be accessed.

4. Retrieving User's Top Tracks
- After obtaining the token, we fetch the user's top 50 tracks from the past 30 days.
- Extract artist IDs from these tracks to gather more details about the artists.

5. Genre Analysis
- Iterate over each artist to identify their musical genres.
- Count the frequency of each genre to pinpoint the user's preferred genres.
- The code cell above shows both the genre extraction and the frequency calculation.

6. Movie Genre Mapping
- The movie_genre_mapping dictionary links music genres to movie genres.
- Each Spotify genre corresponds to two movie genres.
- This mapping is essential for the movie recommendations.

7. Weighted Matching of Genres
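- Using the top Spotify genres, we identify matching movie genres.
- Each matched movie genre is added to the list once per occurrence of its Spotify genre (genre_count), so random.choice picks the movie genres of frequently heard music proportionally more often.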

8. Movie Recommendations
- Load the movie data from the CSV file stored in Google Drive.
- Filter movies by rating (8 or higher) and votes (at least 20,000).
- Using the weighted movie genres, recommend a movie that aligns with the user's music taste.
- The filtering and recommendation logic is the final part of the code cell above.

The code successfully integrated Spotify and Google Drive to recommend movies based on music genres.
It demonstrates how personalization can be enhanced by combining datasets from different sources.
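
The weighted matching can also be expressed with explicit weights instead of a repeated list. The sketch below is an alternative formulation, not the notebook's code: it sums `genre_count` per movie genre, ignores movie genres that have no movies available, and draws one genre with `random.choices`. The helper name `pick_genre`, the mapping excerpt, and the sample counts are all assumptions made for illustration.

```python
import random
from collections import Counter


def pick_genre(top_genres, genre_count, mapping, available, rng=random):
    """Pick one movie genre, weighted by how often its Spotify genres occur."""
    weights = Counter()
    for spotify_genre in top_genres:
        for movie_genre in mapping.get(spotify_genre, []):
            key = movie_genre.lower()
            if key in available:  # skip genres with no movies left after filtering
                weights[key] += genre_count[spotify_genre]
    if not weights:
        return None
    genres = list(weights)
    return rng.choices(genres, weights=[weights[g] for g in genres], k=1)[0]


# Made-up inputs for illustration
excerpt = {"pop": ["Comedy", "Romance"], "trap latino": ["Crime", "Action"]}
counts = {"pop": 7, "trap latino": 3}
available = {"romance", "crime", "action"}  # e.g. no comedy.csv was uploaded
print(pick_genre(["pop", "trap latino"], counts, excerpt, available, random.Random(0)))
```

Returning `None` when nothing matches leaves it to the caller to print a "no movies found" message rather than raising inside the helper.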

## TV Show Dataset, Implementation and Filtering Logic

Data Source: [Kaggle, IMDB](https://www.kaggle.com/datasets/payamamanat/imbd-dataset/data)

- The dataset was updated 16 days ago, making it up to date.


### 1. Extracting TV Show Genre Data Using pandas


#### Code

Install Spotipy Library to easily handle Spotify's API

In [None]:
!pip install spotipy


In [None]:
import pandas as pd

# Read the CSV file into a DataFrame
df = pd.read_csv('/content/drive/MyDrive/Spotify Project/tvshowdata.csv')

# Filter rows where the 'certificate' column contains "TV"
df = df[df['certificate'].str.contains("TV", na=False)]

# Drop rows where the 'genre' column is NaN
df = df.dropna(subset=['genre'])

# Split the genres on commas and flatten the list
all_genres = [genre.strip() for sublist in df['genre'].str.split(',') for genre in sublist]

# Get unique genres using a set
unique_genres = set(all_genres)

# Convert unique genres into a DataFrame
df_genres = pd.DataFrame(sorted(unique_genres), columns=['Genres'])
df_genres

|   | Genres      |
|---|-------------|
| 0 | Action      |
| 1 | Adventure   |
| 2 | Animation   |
| 3 | Biography   |
| 4 | Comedy      |
| 5 | Crime       |
| 6 | Documentary |
| 7 | Drama       |
| 8 | Family      |
| 9 | Fantasy     |


#### Code Breakdown

1. Import Necessary Libraries:
- We start by importing the pandas library, which is essential for data handling.

2. Load the Dataset:
- We read the CSV file located in the Google Drive path and store it in a DataFrame named df.

3. Filter TV Shows:
- Focus on rows where the 'certificate' column mentions "TV", filtering out other types of content.

4. Clean the Data:
- Handle missing data by dropping rows where the 'genre' column doesn't have a value.

5. Transform Genre Data:
- Given that genres are stored as comma-separated strings, we split each string and flatten the resulting lists to get a comprehensive list of all genres.

6. Data Deduplication:
- Identify unique genres in the dataset by leveraging the properties of a set, which inherently doesn't allow duplicate values.

7. Reformatting for Analysis:
- Transform the set of unique genres into a DataFrame (named df_genres) for further exploration or visualization.

8. Final Output:
- Display the df_genres DataFrame, which now lists all unique genres from the TV show dataset.
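
The same steps can also be sketched with pandas' own str.split and explode instead of a nested list comprehension. This is an illustrative alternative, not the notebook's code: the inline sample rows are made up, and only the 'certificate' and 'genre' column names come from the cell above.

```python
import pandas as pd

# Hypothetical sample rows standing in for tvshowdata.csv
sample = pd.DataFrame({
    "certificate": ["TV-MA", "PG-13", "TV-14", None],
    "genre": ["Crime, Drama", "Action", "Drama, Fantasy ", "Comedy"],
})

def unique_genres(df):
    """Return the sorted unique genres of rows whose certificate contains 'TV'."""
    tv_rows = df[df["certificate"].str.contains("TV", na=False)]
    genres = (
        tv_rows["genre"]
        .dropna()          # skip rows without a genre
        .str.split(",")    # "Crime, Drama" -> ["Crime", " Drama"]
        .explode()         # one genre per row
        .str.strip()       # remove surrounding whitespace
    )
    return sorted(genres.unique())

print(unique_genres(sample))
```

Only the two rows with a TV certificate contribute, so 'Action' and 'Comedy' are left out of the result.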

### 2. Fetch Spotify User's Top 5 Most Listened Genres Over the Past 30 days

#### Code

In [None]:
import spotipy
from spotipy.oauth2 import SpotifyOAuth

# Spotify API Credentials
SPOTIPY_CLIENT_ID = 'ecec60c9a316409a84a45c923f7473ee'
SPOTIPY_CLIENT_SECRET = '3129b9225476463d86ddc4074cfc8500'
SPOTIPY_REDIRECT_URI = 'http://localhost/'

# Create OAuth2 object
sp_oauth = SpotifyOAuth(client_id=SPOTIPY_CLIENT_ID,
                        client_secret=SPOTIPY_CLIENT_SECRET,
                        redirect_uri=SPOTIPY_REDIRECT_URI,
                        scope=["user-top-read"])

# Get the authorization URL
auth_url = sp_oauth.get_authorize_url()
print("Please go to the following URL to authorize:")
print(auth_url)

# After clicking the URL and authorizing, you'll be redirected to your specified redirect_uri.
# Extract the code parameter from that URL.
code = input("Enter the code from the URL (localhost/code='...'): ")
token = sp_oauth.get_access_token(code, as_dict=False)
sp = spotipy.Spotify(auth=token)

# Fetch user's top tracks from the past 30 days (short term)
top_tracks = sp.current_user_top_tracks(limit=50, time_range='short_term')

# Extract artists from the top tracks
artist_ids = [track['album']['artists'][0]['id'] for track in top_tracks['items']]

# Fetch artist details for these artists
artists = sp.artists(artist_ids)['artists']

# Count frequency of each genre
genre_count = {}
for artist in artists:
    for genre in artist['genres']:
        if genre in genre_count:
            genre_count[genre] += 1
        else:
            genre_count[genre] = 1

# Sort genres by frequency, get top 5, and capitalize the first letter of each genre
top_5_genres = [genre.capitalize() for genre in sorted(genre_count, key=genre_count.get, reverse=True)[:5]]

# Display the result
for genre in top_5_genres:
    print("-", genre)

Please go to the following URL to authorize:
https://accounts.spotify.com/authorize?client_id=ecec60c9a316409a84a45c923f7473ee&response_type=code&redirect_uri=http%3A%2F%2Flocalhost%2F&scope=user-top-read
Enter the code from the URL (localhost/code='...'): AQBpw81y6STqTkljC2TT4KcBZa43VFtYKkmAGYnY-chJ7QFX49o5kq1iS-BDmS75uGNmoNGKX7lg576IJIxDPo3CMKR5c-Q42nFWqZpb_GnCozEV_Tn7p6PJKb4mefcNv7DEpM3cS7GKdwjRdeMZ4OzIoQuZVhEEYHL9FX8WYQZv13O_
- Pop
- Latin viral pop
- R&b en espanol
- Trap latino
- Urbano latino


#### Code Breakdown

1. Introduction
- Purpose: Extract and analyze a user's top tracks and associated genres using Spotify's API.
- Data Source: Spotify API.

2. Setting Up the Environment:
- Import the necessary libraries/modules.
- Command: import spotipy
- Command: from spotipy.oauth2 import SpotifyOAuth

3. API Credentials:
- The Spotify API requires credentials for authentication: SPOTIPY_CLIENT_ID, SPOTIPY_CLIENT_SECRET, and SPOTIPY_REDIRECT_URI.

4. OAuth2 Authorization:
- Create an OAuth2 object for authentication.
- Command: sp_oauth = SpotifyOAuth(...)
- Obtain the authorization URL and instruct the user to visit it.
- Command: auth_url = sp_oauth.get_authorize_url()
- Command: print("Please go to the following URL to authorize:")

5. Token Generation:
- After authorization, capture the code from the redirected URL.
- Command: code = input("Enter the code from the URL...")
- Use this code to get the access token.
- Command: token = sp_oauth.get_access_token(...)

6. Fetching the User's Top Tracks:
- Using the authenticated API client (spotipy.Spotify), fetch the user's top 50 tracks for the short_term time range (roughly the past four weeks).
- Command: top_tracks = sp.current_user_top_tracks(...)
- From each top track, take the ID of the first artist listed on the track's album. The list is not deduplicated, so an artist with several top tracks appears once per track.
- Command: artist_ids = [track['album']['artists'][0]['id'] for track in top_tracks['items']]

7. Retrieving Artist Details:
- With the artist IDs, retrieve detailed information about each artist.
- Command: artists = sp.artists(artist_ids)['artists']

8. Genre Analysis:
- Iterate through the artist details to count the frequency of each genre.
- Command: genre_count = {}
- Command: for artist in artists: ...
- Determine the 5 most frequent genres and capitalize them.
- Command: top_5_genres = [genre.capitalize() for genre in sorted(genre_count, key=genre_count.get, reverse=True)[:5]]

9. Display Results:
- Present the top 5 genres to the user.
- Command: for genre in top_5_genres: print("-", genre)

Successfully fetched and analyzed a user's top tracks and their associated genres from Spotify. Showcased the top 5 genres for the user.
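
The copy-and-paste step is easy to get wrong, because the redirected address carries the code as a query parameter (http://localhost/?code=...). A minimal sketch, using only the standard library, pulls the code out of a pasted URL. The helper name extract_code and the sample URL are hypothetical and not part of spotipy or the notebook.

```python
from urllib.parse import urlparse, parse_qs

def extract_code(redirect_url):
    """Return the value of the 'code' query parameter, or None if it is absent."""
    query = parse_qs(urlparse(redirect_url).query)
    values = query.get("code")
    return values[0] if values else None

# Made-up URL in the shape of the localhost redirect
print(extract_code("http://localhost/?code=abc123&state=xyz"))
```

With a helper like this, the user could paste the whole redirected URL rather than picking the code out by hand.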

### 3. Creating a Genre Map


#### Code

In [None]:
tvshow_genre_mapping = {
    'pop': ['Comedy', 'Family'],
    'art pop': ['Drama', 'Fantasy'],
    'reggaeton': ['Reality-TV', 'Music'],
    'urbano latino': ['Comedy', 'Music'],
    'trap latino': ['Crime', 'Drama'],
    'rock': ['Adventure', 'Drama'],
    'indie rock': ['Drama', 'Romance'],
    'classical': ['Biography', 'History'],
    'hip hop': ['Documentary', 'Drama'],
    'jazz': ['Music', 'Documentary'],
    'country': ['Drama', 'Western'],
    'electronic': ['Sci-Fi', 'Drama'],
    'metal': ['Thriller', 'Horror'],
    'folk': ['Drama', 'History'],
    'blues': ['Documentary', 'Music'],
    'r&b': ['Drama', 'Romance'],
    'soul': ['Biography', 'Music'],
    'punk': ['Documentary', 'Music'],
    'disco': ['Comedy', 'Music'],
    'house': ['Reality-TV', 'Music'],
    'techno': ['Documentary', 'Music'],
    'edm': ['Reality-TV', 'Music'],
    'latin': ['Drama', 'Romance'],
    'reggae': ['Documentary', 'Music'],
    'funk': ['Comedy', 'Music'],
    'k-pop': ['Reality-TV', 'Music'],
    'psychedelic': ['Drama', 'Fantasy'],
    'world': ['Documentary', 'Travel'],
    'ambient': ['Documentary', 'Sci-Fi'],
    'lo-fi beats': ['Drama', 'Romance'],
    'vaporwave': ['Sci-Fi', 'Drama'],
    'emo': ['Drama', 'Music'],
    'hardcore': ['Documentary', 'Music'],
    'dubstep': ['Reality-TV', 'Sci-Fi'],
    'ska': ['Comedy', 'Music'],
    'swing': ['Musical', 'History'],
    'trance': ['Sci-Fi', 'Drama'],
    'grime': ['Documentary', 'Crime'],
    'bluegrass': ['Documentary', 'Music'],
    'new wave': ['Drama', 'Sci-Fi'],
    'post-punk': ['Documentary', 'Music'],
    'trip hop': ['Crime', 'Drama'],
    'neosoul': ['Drama', 'Romance'],
    'afrobeat': ['Documentary', 'Music'],
    'chillhop': ['Drama', 'Animation'],
    'synthwave': ['Sci-Fi', 'Drama'],
    'latin viral pop': ['Reality-TV', 'Comedy'],
    'r&b en espanol': ['Drama', 'Music']
}

# Convert genre_mapping to DataFrame
import pandas as pd
df = pd.DataFrame.from_dict(tvshow_genre_mapping, orient='index').reset_index()
df.columns = ['Spotify Genre', 'TV Show Genre 1', 'TV Show Genre 2']

# Capitalize the first letter of each entry in the specified columns
df['Spotify Genre'] = df['Spotify Genre'].str.capitalize()
df['TV Show Genre 1'] = df['TV Show Genre 1'].str.capitalize()
df['TV Show Genre 2'] = df['TV Show Genre 2'].str.capitalize()

# Display DataFrame
df

Unnamed: 0,Spotify Genre,TV Show Genre 1,TV Show Genre 2
0,Pop,Adventure,Action
1,Art pop,Animation,Fantasy
2,Reggaeton,Adventure,Action
3,Urbano latino,Comedy,Documentary
4,Trap latino,Crime,Thriller
5,Rock,Adventure,Action
6,Indie rock,Drama,Romance
7,Classical,Biography,History
8,Hip hop,Crime,Game-show
9,Jazz,Musical,Music


#### Code Breakdown

1. Initialization:
- Define a dictionary, tvshow_genre_mapping, mapping musical genres from Spotify to two TV show genres.
- Each key-value pair signifies a musical genre and its corresponding TV show genres.

2. DataFrame Conversion:
- Convert the tvshow_genre_mapping dictionary to a DataFrame called df.
- The keys (Spotify genres) form the index, which is reset to transform them into a column.

3. DataFrame Structuring:
- Rename columns to 'Spotify Genre', 'TV Show Genre 1', and 'TV Show Genre 2' to make the DataFrame more readable.

4. Text Formatting:
- Apply str.capitalize to the 'Spotify Genre', 'TV Show Genre 1', and 'TV Show Genre 2' columns so that every entry is formatted the same way: first character in upper case, the rest in lower case.

The code effectively displays the restructured and formatted DataFrame df for examination.
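
One side effect of capitalize is that it lower-cases everything after the first character, so a label such as 'Reality-TV' is shown as 'Reality-tv'. If the original spelling should survive, a small helper can change only the first character. The name capitalize_first is hypothetical and used here purely for illustration.

```python
def capitalize_first(text):
    """Upper-case the first character and leave the rest unchanged."""
    return text[:1].upper() + text[1:]

print("Reality-TV".capitalize())       # Reality-tv
print(capitalize_first("Reality-TV"))  # Reality-TV
print(capitalize_first("hip hop"))     # Hip hop
```

On a DataFrame column, the helper could be applied with df['TV Show Genre 1'].map(capitalize_first).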

#### Mapping Relevance

In [None]:
# The mapping keys are lowercase, so compare the lowercased genre names
unmapped_genres = [genre for genre in top_5_genres if genre.lower() not in tvshow_genre_mapping]
print("Top genres without a TV show mapping:", unmapped_genres)

Top genres without a TV show mapping: []


### 4. Matching Spotify Top 5 Genres to TV Show Genres with Genre Mapping

#### Code

In [None]:
import spotipy
from spotipy.oauth2 import SpotifyOAuth
import pandas as pd

# Spotify API Credentials
SPOTIPY_CLIENT_ID = 'ecec60c9a316409a84a45c923f7473ee'
SPOTIPY_CLIENT_SECRET = '3129b9225476463d86ddc4074cfc8500'
SPOTIPY_REDIRECT_URI = 'http://localhost/'

# Create OAuth2 object
sp_oauth = SpotifyOAuth(client_id=SPOTIPY_CLIENT_ID,
                        client_secret=SPOTIPY_CLIENT_SECRET,
                        redirect_uri=SPOTIPY_REDIRECT_URI,
                        scope=["user-top-read"])

# Get the authorization URL
auth_url = sp_oauth.get_authorize_url()
print("Please go to the following URL to authorize:")
print(auth_url)

# Extract the code from the redirected URL
code = input("Enter the code from the URL (localhost/code='...'): ")
token = sp_oauth.get_access_token(code, as_dict=False)
sp = spotipy.Spotify(auth=token)

# Fetch user's top artists
top_artists = sp.current_user_top_artists()

# Count frequency of each genre
genre_count = {}
for artist in top_artists['items']:
    for genre in artist['genres']:
        if genre in genre_count:
            genre_count[genre] += 1
        else:
            genre_count[genre] = 1

# Ensure that the genres from Spotify are in lowercase for the mapping.
top_5_genres = [genre.lower() for genre in sorted(genre_count, key=genre_count.get, reverse=True)[:5]]

# Genre mapping for TV Shows
tvshow_genre_mapping = {
    'pop': ['Comedy', 'Family'],
    'art pop': ['Drama', 'Fantasy'],
    'reggaeton': ['Reality-TV', 'Music'],
    'urbano latino': ['Comedy', 'Music'],
    'trap latino': ['Crime', 'Drama'],
    'rock': ['Adventure', 'Drama'],
    'indie rock': ['Drama', 'Romance'],
    'classical': ['Biography', 'History'],
    'hip hop': ['Documentary', 'Drama'],
    'jazz': ['Music', 'Documentary'],
    'country': ['Drama', 'Western'],
    'electronic': ['Sci-Fi', 'Drama'],
    'metal': ['Thriller', 'Horror'],
    'folk': ['Drama', 'History'],
    'blues': ['Documentary', 'Music'],
    'r&b': ['Drama', 'Romance'],
    'soul': ['Biography', 'Music'],
    'punk': ['Documentary', 'Music'],
    'disco': ['Comedy', 'Music'],
    'house': ['Reality-TV', 'Music'],
    'techno': ['Documentary', 'Music'],
    'edm': ['Reality-TV', 'Music'],
    'latin': ['Drama', 'Romance'],
    'reggae': ['Documentary', 'Music'],
    'funk': ['Comedy', 'Music'],
    'k-pop': ['Reality-TV', 'Music'],
    'psychedelic': ['Drama', 'Fantasy'],
    'world': ['Documentary', 'Travel'],
    'ambient': ['Documentary', 'Sci-Fi'],
    'lo-fi beats': ['Drama', 'Romance'],
    'vaporwave': ['Sci-Fi', 'Drama'],
    'emo': ['Drama', 'Music'],
    'hardcore': ['Documentary', 'Music'],
    'dubstep': ['Reality-TV', 'Sci-Fi'],
    'ska': ['Comedy', 'Music'],
    'swing': ['Musical', 'History'],
    'trance': ['Sci-Fi', 'Drama'],
    'grime': ['Documentary', 'Crime'],
    'bluegrass': ['Documentary', 'Music'],
    'new wave': ['Drama', 'Sci-Fi'],
    'post-punk': ['Documentary', 'Music'],
    'trip hop': ['Crime', 'Drama'],
    'neosoul': ['Drama', 'Romance'],
    'afrobeat': ['Documentary', 'Music'],
    'chillhop': ['Drama', 'Animation'],
    'synthwave': ['Sci-Fi', 'Drama'],
    'latin viral pop': ['Reality-TV', 'Comedy'],
    'r&b en espanol': ['Drama', 'Music']
}

# Using the genre mapping to find corresponding TV show genres
matching_genres = set()
for spotify_genre in top_5_genres:
    if spotify_genre in tvshow_genre_mapping:
        for tvshow_genre in tvshow_genre_mapping[spotify_genre]:
            matching_genres.add(tvshow_genre)

print("Matching Genres between Spotify and TV Shows:", list(matching_genres))

ModuleNotFoundError: ignored

#### Code Breakdown

1. Imports and Setup:
- Libraries like spotipy, SpotifyOAuth, and pandas are imported.
- Spotify API credentials (client ID, client secret, and redirect URI) are specified.

2. Authentication with Spotify:
- An OAuth2 object is initialized using the credentials, and the scope is set to "user-top-read", which grants read access to the user's top artists and tracks.
- An authorization URL is generated and printed for the user.
- The user inputs the code from the redirected URL to get an access token.

3. Fetch Data from Spotify:
- Using the access token, a Spotify connection is established.
- The user's top artists are fetched.

4. Genre Analysis:
- The code iterates through each artist's genres.
- The frequency of each genre is counted in a dictionary, genre_count.
- The top 5 genres are extracted, ensuring they're all lowercase.

5. Genre Mapping:
- A pre-defined dictionary, tvshow_genre_mapping, maps Spotify music genres to TV show genres.

6. Find Corresponding TV Genres:
- Using the top 5 Spotify genres, the matching TV show genres are looked up in the mapping and collected in a set.
- The results are displayed.
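
Steps 4 to 6 can be tried without a Spotify login by starting from a plain list of genre lists, one per artist. The sketch below uses collections.Counter in place of the manual dictionary counting; the sample artists and the two-entry mapping are made up, and the function names top_genres and match_genres are hypothetical.

```python
from collections import Counter

def top_genres(artist_genres, n=5):
    """Count genres across artists and return the n most frequent plus the counts."""
    counts = Counter(g.lower() for genres in artist_genres for g in genres)
    return [genre for genre, _ in counts.most_common(n)], counts

def match_genres(top, mapping):
    """Collect every TV genre mapped from the given Spotify genres."""
    matched = set()
    for genre in top:
        matched.update(mapping.get(genre, []))  # unmapped genres add nothing
    return matched

# Invented data: three artists, one of them with an unmapped genre
sample_artists = [["pop", "trap latino"], ["pop"], ["shoegaze"]]
sample_mapping = {"pop": ["Comedy", "Family"], "trap latino": ["Crime", "Drama"]}

top, counts = top_genres(sample_artists)
print(top[0], counts["pop"])
print(sorted(match_genres(top, sample_mapping)))
```

Using mapping.get with an empty default means a genre like 'shoegaze' is skipped silently, which mirrors the "if spotify_genre in tvshow_genre_mapping" check in the cell above.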

### 5. TV Show Recommendation

#### Code

In [None]:
import spotipy
from spotipy.oauth2 import SpotifyOAuth
import pandas as pd
import random

# Spotify API Credentials
SPOTIPY_CLIENT_ID = 'ecec60c9a316409a84a45c923f7473ee'
SPOTIPY_CLIENT_SECRET = '3129b9225476463d86ddc4074cfc8500'
SPOTIPY_REDIRECT_URI = 'http://localhost/'

# Create OAuth2 object
sp_oauth = SpotifyOAuth(client_id=SPOTIPY_CLIENT_ID,
                        client_secret=SPOTIPY_CLIENT_SECRET,
                        redirect_uri=SPOTIPY_REDIRECT_URI,
                        scope=["user-top-read"])

# Get the authorization URL
auth_url = sp_oauth.get_authorize_url()
print("Please go to the following URL to authorize:")
print(auth_url)

# Extract the code from the redirected URL
code = input("Enter the code from the URL (localhost/code='...'): ")
token = sp_oauth.get_access_token(code, as_dict=False)
sp = spotipy.Spotify(auth=token)

# Fetch user's top artists
top_artists = sp.current_user_top_artists()

# Count frequency of each genre
genre_count = {}
for artist in top_artists['items']:
    for genre in artist['genres']:
        if genre in genre_count:
            genre_count[genre] += 1
        else:
            genre_count[genre] = 1

# Ensure that the genres from Spotify are in lowercase for the mapping.
top_5_genres = [genre.lower() for genre in sorted(genre_count, key=genre_count.get, reverse=True)[:5]]

# Genre mapping for TV Shows
tvshow_genre_mapping = {
    'pop': ['Comedy', 'Family'],
    'art pop': ['Drama', 'Fantasy'],
    'reggaeton': ['Reality-TV', 'Music'],
    'urbano latino': ['Comedy', 'Music'],
    'trap latino': ['Crime', 'Drama'],
    'rock': ['Adventure', 'Drama'],
    'indie rock': ['Drama', 'Romance'],
    'classical': ['Biography', 'History'],
    'hip hop': ['Documentary', 'Drama'],
    'jazz': ['Music', 'Documentary'],
    'country': ['Drama', 'Western'],
    'electronic': ['Sci-Fi', 'Drama'],
    'metal': ['Thriller', 'Horror'],
    'folk': ['Drama', 'History'],
    'blues': ['Documentary', 'Music'],
    'r&b': ['Drama', 'Romance'],
    'soul': ['Biography', 'Music'],
    'punk': ['Documentary', 'Music'],
    'disco': ['Comedy', 'Music'],
    'house': ['Reality-TV', 'Music'],
    'techno': ['Documentary', 'Music'],
    'edm': ['Reality-TV', 'Music'],
    'latin': ['Drama', 'Romance'],
    'reggae': ['Documentary', 'Music'],
    'funk': ['Comedy', 'Music'],
    'k-pop': ['Reality-TV', 'Music'],
    'psychedelic': ['Drama', 'Fantasy'],
    'world': ['Documentary', 'Travel'],
    'ambient': ['Documentary', 'Sci-Fi'],
    'lo-fi beats': ['Drama', 'Romance'],
    'vaporwave': ['Sci-Fi', 'Drama'],
    'emo': ['Drama', 'Music'],
    'hardcore': ['Documentary', 'Music'],
    'dubstep': ['Reality-TV', 'Sci-Fi'],
    'ska': ['Comedy', 'Music'],
    'swing': ['Musical', 'History'],
    'trance': ['Sci-Fi', 'Drama'],
    'grime': ['Documentary', 'Crime'],
    'bluegrass': ['Documentary', 'Music'],
    'new wave': ['Drama', 'Sci-Fi'],
    'post-punk': ['Documentary', 'Music'],
    'trip hop': ['Crime', 'Drama'],
    'neosoul': ['Drama', 'Romance'],
    'afrobeat': ['Documentary', 'Music'],
    'chillhop': ['Drama', 'Animation'],
    'synthwave': ['Sci-Fi', 'Drama'],
    'latin viral pop': ['Reality-TV', 'Comedy'],
    'r&b en espanol': ['Drama', 'Music']
}

# Using the genre mapping to find corresponding TV show genres
matching_genres_weighted = []
for spotify_genre in top_5_genres:
    if spotify_genre in tvshow_genre_mapping:
        for tvshow_genre in tvshow_genre_mapping[spotify_genre]:
            # Add the TV show genre according to its frequency
            matching_genres_weighted.extend([tvshow_genre] * genre_count[spotify_genre])

# Read the CSV for TV Shows
df = pd.read_csv('/content/drive/MyDrive/Spotify Project/tvshowdata.csv')

# Make a copy of the original dataframe to avoid SettingWithCopyWarning
df_copy = df.copy()

# Convert "votes" to strings
df_copy['votes'] = df_copy['votes'].astype(str)

# Remove commas and convert to float
df_copy['votes'] = df_copy['votes'].str.replace(',', '').astype(float)

# Now, convert to integers, but only for non-NaN values
df_copy.loc[df_copy['votes'].notna(), 'votes'] = df_copy['votes'].dropna().astype(int)

# Convert 'genre' column to string type
df_copy['genre'] = df_copy['genre'].astype(str)

# Now proceed with the genre filtering
filtered_tvshows = df_copy[df_copy['genre'].str.split(', ').apply(lambda x: bool(set(x) & set(matching_genres_weighted)) if x != 'nan' else False)]

# Continue with the rating and votes filters
filtered_tvshows = filtered_tvshows[filtered_tvshows['rating'] >= 8]
filtered_tvshows = filtered_tvshows[filtered_tvshows['votes'] >= 20000]

# Recommend a TV show from the filtered shows based on the weighted genres
if filtered_tvshows.empty:
    print("No TV shows found matching the criteria.")
else:
    recommended_genre = random.choice(matching_genres_weighted)
    # The chosen genre may have no shows left after the rating and votes filters,
    # and sample() raises an error on an empty DataFrame
    genre_shows = filtered_tvshows[filtered_tvshows['genre'].str.contains(recommended_genre, regex=False)]
    if genre_shows.empty:
        print("No TV shows found for the genre:", recommended_genre)
    else:
        recommended_show = genre_shows.sample().iloc[0]
        # Extract the earliest year from the 'year' column and remove any preceding '-'
        earliest_year = str(recommended_show['year']).split('–')[0].strip().replace('-', '')
        print(f"Recommended TV Show: {recommended_show['title']} (Genre: {recommended_genre}, Rating: {recommended_show['rating']}, Release Year: {earliest_year.strip('()')})")



Please go to the following URL to authorize:
https://accounts.spotify.com/authorize?client_id=ecec60c9a316409a84a45c923f7473ee&response_type=code&redirect_uri=http%3A%2F%2Flocalhost%2F&scope=user-top-read
Enter the code from the URL (localhost/code='...'): AQBpw81y6STqTkljC2TT4KcBZa43VFtYKkmAGYnY-chJ7QFX49o5kq1iS-BDmS75uGNmoNGKX7lg576IJIxDPo3CMKR5c-Q42nFWqZpb_GnCozEV_Tn7p6PJKb4mefcNv7DEpM3cS7GKdwjRdeMZ4OzIoQuZVhEEYHL9FX8WYQZv13O_
Recommended TV Show: Black Sails (Genre: Adventure, Rating: 8.2, Release Year: 2014)


#### Code Breakdown

1. Imports and Setup:
- Libraries like spotipy, SpotifyOAuth, pandas, and random are imported.
- Spotify API credentials are provided.

2. Authentication with Spotify:
- An OAuth2 object is created using the Spotify API credentials and scope "user-top-read".
- The user is presented with an authorization URL to authorize access.
- Once authorized, the user provides the code from the redirected URL, which is used to retrieve an access token.

3. Fetching User's Top Artists from Spotify:
- Using the authenticated connection, the user's top artists are fetched.
- The frequency of each music genre from the top artists is counted.

4. Mapping Music Genres to TV Genres:
- A pre-defined dictionary, tvshow_genre_mapping, maps each Spotify music genre to TV show genres.
- Using the top 5 Spotify genres, the corresponding TV genres are identified with weighted frequency.

5. TV Show Data Filtering and Recommendation:
- TV show data is loaded from a CSV file.
- Data preprocessing: Commas in the 'votes' column are removed, and the column is converted from string to float, and then to integer.
- The 'genre' column is also converted to string.
- TV shows are kept only if they match at least one of the weighted genres from Spotify, have a rating of at least 8, and have a minimum of 20,000 votes.
- A random TV genre from the weighted list is chosen.

6. Output:
A random TV show that matches this genre is recommended to the user.
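
The 'votes' clean-up and the two thresholds from step 5 can also be sketched with pd.to_numeric, which turns missing or malformed entries into NaN instead of raising an error. This is an alternative to the astype chain in the cell above, not the notebook's own code. The sample rows and the function name filter_popular are invented; the column names and thresholds follow the notebook.

```python
import pandas as pd

# Made-up rows in the shape of tvshowdata.csv
shows = pd.DataFrame({
    "title": ["Show A", "Show B", "Show C"],
    "rating": [8.4, 7.9, 8.1],
    "votes": ["125,000", "40,000", None],
})

def filter_popular(df, min_rating=8, min_votes=20000):
    """Keep rows that meet both the rating and the vote-count threshold."""
    out = df.copy()
    # Strip thousands separators; missing or malformed values become NaN
    out["votes"] = pd.to_numeric(
        out["votes"].astype(str).str.replace(",", "", regex=False),
        errors="coerce",
    )
    # Comparisons with NaN are False, so rows without votes drop out
    return out[(out["rating"] >= min_rating) & (out["votes"] >= min_votes)]

popular = filter_popular(shows)
print(popular["title"].tolist())
```

Here 'Show B' fails the rating threshold and 'Show C' has no vote count, so only 'Show A' remains.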

# Custom Recommendation Based on User Choice (Movie/TV Show Recommendation)

#### Code

In [None]:
! pip install pandas



In [None]:
! pip install spotipy

Collecting spotipy
  Downloading spotipy-2.23.0-py3-none-any.whl (29 kB)
Collecting redis>=3.5.3 (from spotipy)
  Downloading redis-5.0.1-py3-none-any.whl (250 kB)
Installing collected packages: redis, spotipy
Successfully installed redis-5.0.1 spotipy-2.23.0


In [None]:
from google.colab import drive
drive.mount('/content/drive')

Mounted at /content/drive


In [None]:
import spotipy
from spotipy.oauth2 import SpotifyOAuth
import pandas as pd
import random
import ipywidgets as widgets
from IPython.display import display

# Spotify API Credentials
SPOTIPY_CLIENT_ID = 'ecec60c9a316409a84a45c923f7473ee'
SPOTIPY_CLIENT_SECRET = '3129b9225476463d86ddc4074cfc8500'
SPOTIPY_REDIRECT_URI = 'http://localhost/'

# Create OAuth2 object
sp_oauth = SpotifyOAuth(client_id=SPOTIPY_CLIENT_ID,
                        client_secret=SPOTIPY_CLIENT_SECRET,
                        redirect_uri=SPOTIPY_REDIRECT_URI,
                        scope=["user-top-read"])

# Get the authorization URL
auth_url = sp_oauth.get_authorize_url()
print("Please go to the following URL to authorize:")
print(auth_url)

# Extract the code from the redirected URL
code = input("Enter the code from the URL (localhost/code='...'): ")
token = sp_oauth.get_access_token(code, as_dict=False)
sp = spotipy.Spotify(auth=token)

# Fetch user's top tracks from the past 30 days (short term)
top_tracks = sp.current_user_top_tracks(limit=50, time_range='short_term')

# Extract artists from the top tracks
artist_ids = [track['album']['artists'][0]['id'] for track in top_tracks['items']]

# Fetch artist details for these artists
artists = sp.artists(artist_ids)['artists']

# Count frequency of each genre
genre_count = {}
for artist in artists:
    for genre in artist['genres']:
        if genre in genre_count:
            genre_count[genre] += 1
        else:
            genre_count[genre] = 1

# Ensure that the genres from Spotify are in lowercase for the mapping.
top_5_genres = [genre.lower() for genre in sorted(genre_count, key=genre_count.get, reverse=True)[:5]]

# Movie Genre Mapping
movie_genre_mapping = {
    'pop': ['Comedy', 'Romance'],
    'art pop': ['Fantasy', 'Drama'],
    'reggaeton': ['Action', 'Adventure'],
    'urbano latino': ['Action', 'Drama'],
    'trap latino': ['Crime', 'Action'],
    'rock': ['Action', 'Adventure'],
    'indie rock': ['Drama', 'Romance'],
    'classical': ['History', 'Biography'],
    'hip hop': ['Drama', 'Crime'],
    'jazz': ['Drama', 'Biography'],
    'country': ['Drama', 'Family'],
    'electronic': ['Sci-Fi', 'Mystery'],
    'metal': ['Horror', 'Thriller'],
    'folk': ['Drama', 'History'],
    'blues': ['Drama', 'Crime'],
    'r&b': ['Romance', 'Drama'],
    'soul': ['Drama', 'Romance'],
    'punk': ['Action', 'Thriller'],
    'disco': ['Comedy', 'Romance'],
    'house': ['Sci-Fi', 'Thriller'],
    'techno': ['Sci-Fi', 'Action'],
    'edm': ['Action', 'Sci-Fi'],
    'latin': ['Drama', 'Romance'],
    'reggae': ['Comedy', 'Adventure'],
    'funk': ['Comedy', 'Action'],
    'k-pop': ['Comedy', 'Romance'],
    'psychedelic': ['Fantasy', 'Adventure'],
    'world': ['History', 'Family'],
    'ambient': ['Drama', 'Sci-Fi'],
    'lo-fi beats': ['Drama', 'Romance'],
    'vaporwave': ['Sci-Fi', 'Fantasy'],
    'emo': ['Drama', 'Romance'],
    'hardcore': ['Action', 'Thriller'],
    'dubstep': ['Action', 'Sci-Fi'],
    'ska': ['Comedy', 'Adventure'],
    'swing': ['Comedy', 'Romance'],
    'trance': ['Sci-Fi', 'Adventure'],
    'grime': ['Crime', 'Action'],
    'bluegrass': ['Drama', 'Adventure'],
    'new wave': ['Sci-Fi', 'Romance'],
    'post-punk': ['Drama', 'Thriller'],
    'trip hop': ['Mystery', 'Drama'],
    'neosoul': ['Romance', 'Drama'],
    'afrobeat': ['Drama', 'Adventure'],
    'chillhop': ['Drama', 'Romance'],
    'synthwave': ['Sci-Fi', 'Action'],
    'latin viral pop': ['Comedy', 'Adventure'],
    'r&b en espanol': ['Romance', 'Drama']
}

# Genre mapping for TV Shows
tvshow_genre_mapping = {
    'pop': ['Comedy', 'Family'],
    'art pop': ['Drama', 'Fantasy'],
    'reggaeton': ['Reality-TV', 'Music'],
    'urbano latino': ['Comedy', 'Music'],
    'trap latino': ['Crime', 'Drama'],
    'rock': ['Adventure', 'Drama'],
    'indie rock': ['Drama', 'Romance'],
    'classical': ['Biography', 'History'],
    'hip hop': ['Documentary', 'Drama'],
    'jazz': ['Music', 'Documentary'],
    'country': ['Drama', 'Western'],
    'electronic': ['Sci-Fi', 'Drama'],
    'metal': ['Thriller', 'Horror'],
    'folk': ['Drama', 'History'],
    'blues': ['Documentary', 'Music'],
    'r&b': ['Drama', 'Romance'],
    'soul': ['Biography', 'Music'],
    'punk': ['Documentary', 'Music'],
    'disco': ['Comedy', 'Music'],
    'house': ['Reality-TV', 'Music'],
    'techno': ['Documentary', 'Music'],
    'edm': ['Reality-TV', 'Music'],
    'latin': ['Drama', 'Romance'],
    'reggae': ['Documentary', 'Music'],
    'funk': ['Comedy', 'Music'],
    'k-pop': ['Reality-TV', 'Music'],
    'psychedelic': ['Drama', 'Fantasy'],
    'world': ['Documentary', 'Travel'],
    'ambient': ['Documentary', 'Sci-Fi'],
    'lo-fi beats': ['Drama', 'Romance'],
    'vaporwave': ['Sci-Fi', 'Drama'],
    'emo': ['Drama', 'Music'],
    'hardcore': ['Documentary', 'Music'],
    'dubstep': ['Reality-TV', 'Sci-Fi'],
    'ska': ['Comedy', 'Music'],
    'swing': ['Musical', 'History'],
    'trance': ['Sci-Fi', 'Drama'],
    'grime': ['Documentary', 'Crime'],
    'bluegrass': ['Documentary', 'Music'],
    'new wave': ['Drama', 'Sci-Fi'],
    'post-punk': ['Documentary', 'Music'],
    'trip hop': ['Crime', 'Drama'],
    'neosoul': ['Drama', 'Romance'],
    'afrobeat': ['Documentary', 'Music'],
    'chillhop': ['Drama', 'Animation'],
    'synthwave': ['Sci-Fi', 'Drama'],
    'latin viral pop': ['Reality-TV', 'Comedy'],
    'r&b en espanol': ['Drama', 'Music']
}

def recommend(choice):
    # Identify the appropriate genre mapping based on choice
    if choice == "movie":
        genre_mapping = movie_genre_mapping
    elif choice == "tvshow":
        genre_mapping = tvshow_genre_mapping
    else:
        print("Invalid choice!")
        return

    # Loop through each Spotify genre and weight the mapped genres by frequency
    matching_genres_weighted = []
    for spotify_genre in top_5_genres:
        # Check if the Spotify genre exists in the mapping
        if spotify_genre in genre_mapping:
            for corresponding_genre in genre_mapping[spotify_genre]:
                matching_genres_weighted.extend([corresponding_genre] * genre_count[spotify_genre])

    # Stop early if none of the top genres is mapped: random.choice() fails on an empty list
    if not matching_genres_weighted:
        print("None of your top genres has a mapping yet.")
        return

    # Movie Filtering
    if choice == "movie":
        df = pd.read_csv('/content/drive/MyDrive/Spotify Project/combined_movies.csv')
        df = df[(df['rating'] >= 8) & (df['votes'] >= 20000)]
        # Filter movies using case-insensitive matching
        filtered_movies = df[df['genre'].str.lower().str.contains('|'.join([g.lower() for g in matching_genres_weighted]), na=False)]
        if filtered_movies.empty:
            print("No movies found matching the criteria.")
            return
        recommended_genre = random.choice(matching_genres_weighted)
        filtered_genre_movies = filtered_movies[filtered_movies['genre'].str.lower().str.contains(recommended_genre.lower(), na=False)]
        if filtered_genre_movies.empty:
            print("No movies found for the genre:", recommended_genre)
            return
        recommended_movie = filtered_genre_movies.sample().iloc[0]
        print(f"Recommended Movie: {recommended_movie['movie_name']} (Genre: {recommended_movie['genre'].capitalize()}, Rating: {recommended_movie['rating']}, Release Year: {recommended_movie['year']})")

    # TV Show Filtering
    elif choice == "tvshow":
        df = pd.read_csv('/content/drive/MyDrive/Spotify Project/tvshowdata.csv', on_bad_lines='warn')
        df_copy = df.copy()
        df_copy['votes'] = df_copy['votes'].astype(str)
        df_copy['votes'] = df_copy['votes'].str.replace(',', '').astype(float)
        df_copy.loc[df_copy['votes'].notna(), 'votes'] = df_copy['votes'].dropna().astype(int)
        df_copy['genre'] = df_copy['genre'].astype(str)

        # Filter TV shows using case-insensitive matching
        # (missing genres became the string 'nan', which matches no wanted genre)
        wanted_genres = set(g.lower() for g in matching_genres_weighted)
        filtered_tvshows = df_copy[df_copy['genre'].str.split(', ').apply(lambda x: bool(set(y.lower() for y in x) & wanted_genres))]
        filtered_tvshows = filtered_tvshows[filtered_tvshows['rating'] >= 8]
        filtered_tvshows = filtered_tvshows[filtered_tvshows['votes'] >= 20000]
        if filtered_tvshows.empty:
            print("No TV shows found matching the criteria.")
            return
        recommended_genre = random.choice(matching_genres_weighted)
        filtered_genre_tvshows = filtered_tvshows[filtered_tvshows['genre'].str.lower().str.contains(recommended_genre.lower(), na=False)]
        if filtered_genre_tvshows.empty:
            print("No TV shows found for the genre:", recommended_genre)
            return
        recommended_show = filtered_genre_tvshows.sample().iloc[0]
        # Extract the earliest year from the 'year' column and remove any preceding '-'
        earliest_year = str(recommended_show['year']).split('–')[0].strip().replace('-', '')
        print(f"Recommended TV Show: {recommended_show['title']} (Genre: {recommended_genre}, Rating: {recommended_show['rating']}, Release Year: {earliest_year.strip('()')})")

def on_button_click(button):
    choice = button.description.lower().replace(' ', '')
    recommend(choice)

movie_button = widgets.Button(description="Movie")
tv_button = widgets.Button(description="TV Show")

movie_button.on_click(on_button_click)
tv_button.on_click(on_button_click)

# Display the buttons
display(movie_button, tv_button)

Please go to the following URL to authorize:
https://accounts.spotify.com/authorize?client_id=ecec60c9a316409a84a45c923f7473ee&response_type=code&redirect_uri=http%3A%2F%2Flocalhost%2F&scope=user-top-read
Enter the code from the URL (localhost/code='...'): AQAm_Sd_PucH--MwH2p7yA7JOM89vws57HPFykwfoOgksAQ9FjTl9Bjyz4R-SaDsziakAmBoTQwNv_O6yYM6l26dmqhbLg0_sFUFXv461xgsGF3x7e5Vf7qzQU4KtbhUEssos4xNj1Bfhardio6yB-2pGeSdBT8TRX3BGJwcLItwy7HB


Button(description='Movie', style=ButtonStyle())

Button(description='TV Show', style=ButtonStyle())

Recommended Movie: Song of the Sea (Genre: Adventure, Rating: 8.0, Release Year: 2014)


#### Code Breakdown

The code above leverages the Spotify API to extract the user's top tracks (of the past 30 days), from which it deduces their favorite music genres. Based on these genres, it recommends movies or TV shows that presumably align with the user's taste.

Step-by-step breakdown of what the code does:

Setup and API Authentication:

- Imports necessary modules (like spotipy, pandas, etc.).
- Defines the Spotify API credentials (client ID, client secret, and redirect URI).
- It uses these credentials to set up OAuth2 authentication for accessing user's top tracks.
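
Because the client ID and secret are sensitive, one common alternative to writing them into the notebook is to read them from environment variables. The sketch below is an assumption about how that could look, not what the notebook does; load_credentials is a hypothetical helper and the demo values are placeholders. The variable names match the ones used above, which spotipy's documentation also describes as the names it looks up when the arguments are omitted.

```python
import os

REQUIRED = ("SPOTIPY_CLIENT_ID", "SPOTIPY_CLIENT_SECRET", "SPOTIPY_REDIRECT_URI")

def load_credentials(env=os.environ):
    """Return the three Spotify settings from env, or raise if any is missing."""
    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {name: env[name] for name in REQUIRED}

# Demonstration with a plain dict and placeholder values
demo_env = {
    "SPOTIPY_CLIENT_ID": "your-client-id",
    "SPOTIPY_CLIENT_SECRET": "your-client-secret",
    "SPOTIPY_REDIRECT_URI": "http://localhost/",
}
creds = load_credentials(demo_env)
print(sorted(creds))
```

The returned values could then be passed to SpotifyOAuth as client_id, client_secret, and redirect_uri.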

Fetching User's Top Tracks:

- The user is redirected to an authorization URL where they grant the app access to their top tracks.
- After granting access, they're redirected to a localhost URL that contains an authorization code.
- This code is then exchanged for an access token, which allows the app to make requests on behalf of the user.
- The user's top 50 tracks from the past 30 days are fetched.
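
The two URL-handling steps above can be sketched with the standard library alone. This is a minimal illustration, not the notebook's implementation (which relies on spotipy's OAuth2 support): `build_authorize_url` and `extract_code` are hypothetical helper names, and `YOUR_CLIENT_ID` and `EXAMPLE_CODE` are placeholders.

```python
from urllib.parse import urlencode, urlparse, parse_qs

AUTHORIZE_ENDPOINT = "https://accounts.spotify.com/authorize"


def build_authorize_url(client_id, redirect_uri, scope="user-top-read"):
    """Build the URL the user opens to grant access (hypothetical helper)."""
    params = {
        "client_id": client_id,
        "response_type": "code",
        "redirect_uri": redirect_uri,
        "scope": scope,
    }
    return f"{AUTHORIZE_ENDPOINT}?{urlencode(params)}"


def extract_code(redirect_url):
    """Pull the `code` query parameter out of the localhost redirect URL."""
    query = parse_qs(urlparse(redirect_url).query)
    if "code" not in query:
        raise ValueError("No authorization code found in the redirect URL")
    return query["code"][0]


# Placeholder values; substitute your own client ID and the URL you were sent to.
url = build_authorize_url("YOUR_CLIENT_ID", "http://localhost/")
code = extract_code("http://localhost/?code=EXAMPLE_CODE")
print(url)
print(code)
```

Parsing the redirect URL this way avoids copying the code by hand, which is where stray characters tend to slip in.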

Genre Extraction:

- From these top tracks, the code extracts the artist IDs.
- For each artist, it fetches their details, which include the genres they're associated with.
- It then creates a dictionary (genre_count) to tally up how many times each genre appears among the top artists.
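
The tallying step can be sketched as follows, assuming each artist record is a dictionary with a `genres` list (the shape Spotify artist objects use). The sample artists are made up, and `count_genres` is a hypothetical helper rather than the notebook's own function.

```python
from collections import Counter


def count_genres(artists):
    """Tally how often each genre appears across a list of artist records."""
    genre_count = Counter()
    for artist in artists:
        # Some artists have no genres listed, so default to an empty list.
        genre_count.update(artist.get("genres", []))
    return genre_count


# Made-up sample data standing in for the artist details fetched from Spotify.
sample_artists = [
    {"name": "Artist A", "genres": ["pop", "dance pop"]},
    {"name": "Artist B", "genres": ["pop", "rock"]},
    {"name": "Artist C", "genres": []},
]

genre_count = count_genres(sample_artists)
top_genres = [genre for genre, _ in genre_count.most_common(5)]
print(genre_count)
print(top_genres)
```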

Movie & TV Show Genre Mapping:

- Two dictionaries, movie_genre_mapping and tvshow_genre_mapping, map music genres to movie or TV show genres.
- For instance, if a user's top music genre is 'pop', the system might recommend movies in the 'Adventure' or 'Action' categories.

Recommendation Function:

- This function, recommend, first determines whether to use the movie or TV show genre mapping.
- For the user's top 5 Spotify genres, it looks up the corresponding movie or TV show genres and weights them based on frequency.
- It then selects a random genre from this weighted list and recommends a movie or TV show within that genre.
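
One way to implement the weighting described above is to add up the listen counts of every music genre that maps to a given screen genre, then draw from that distribution. The sketch below is an assumption about the approach, not a copy of the notebook's `recommend` function; the mapping entries, `build_weighted_genres`, and `pick_genre` are illustrative names and values.

```python
import random

# Hypothetical excerpt of a music-genre to movie-genre mapping.
movie_genre_mapping = {
    "pop": ["Adventure", "Action"],
    "rock": ["Action"],
}


def build_weighted_genres(genre_count, mapping, top_n=5):
    """Sum the counts of the top music genres onto their mapped screen genres."""
    top = sorted(genre_count.items(), key=lambda kv: kv[1], reverse=True)[:top_n]
    weights = {}
    for music_genre, count in top:
        # Music genres without a mapping contribute nothing.
        for screen_genre in mapping.get(music_genre, []):
            weights[screen_genre] = weights.get(screen_genre, 0) + count
    return weights


def pick_genre(weights, rng=random):
    """Draw one screen genre at random, proportionally to its weight."""
    if not weights:
        return None
    genres = list(weights)
    return rng.choices(genres, weights=[weights[g] for g in genres], k=1)[0]


weights = build_weighted_genres({"pop": 3, "rock": 1, "jazz": 2}, movie_genre_mapping)
recommended_genre = pick_genre(weights, random.Random(0))
print(weights, recommended_genre)
```

Passing a seeded `random.Random` makes the draw repeatable for testing; in the notebook the default module-level generator would give a different pick on each click.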

Data Loading:

- The movie and TV show data are read from CSV files. This is where the actual recommendations come from.
- Movies/TV shows with a rating of 8 or above and at least 20,000 votes are considered for recommendations.
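
The rating and vote thresholds can be applied with a boolean mask in pandas. This sketch assumes the columns are named `genre`, `rating`, and `votes` and already hold numeric values; the `votes` column name, the `filter_candidates` helper, and the placeholder titles are assumptions for illustration.

```python
import pandas as pd


def filter_candidates(df, genre, min_rating=8.0, min_votes=20_000):
    """Keep rows in the given genre that meet the rating and vote thresholds."""
    mask = (
        (df["genre"] == genre)
        & (df["rating"] >= min_rating)
        & (df["votes"] >= min_votes)
    )
    return df[mask]


# Placeholder rows standing in for the combined movie CSV.
movies = pd.DataFrame({
    "title": ["Film A", "Film B", "Film C"],
    "genre": ["Adventure", "Adventure", "Action"],
    "rating": [8.0, 7.9, 8.5],
    "votes": [25_000, 90_000, 10_000],
})

candidates = filter_candidates(movies, "Adventure")
if candidates.empty:
    print("No titles found for the genre: Adventure")
else:
    # sample() picks one random row; iloc[0] turns it into a Series.
    print(candidates.sample().iloc[0]["title"])
```

If the vote counts are stored as strings (for example with thousands separators), they would need converting to numbers before this comparison works.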

User Interaction:

- The user can click on either a "Movie" button or a "TV Show" button to get a recommendation of that type.
- Once clicked, the system will provide a recommendation based on the user's top music genres.



1. Libraries and Credentials

- spotipy: Used to interact with the Spotify API.
- pandas: A data manipulation and analysis library.
- random: Provides functions for random selection.
- ipywidgets: For creating interactive GUIs in Jupyter notebooks.
- IPython.display: Allows for the display of GUI elements in Jupyter.

2. Spotify API Credentials: Constants for API integration

- OAuth2: Initializes the Spotify OAuth2 authentication.
- Authorization URL: Directs the user to Spotify for permission.
- Code Extraction: Asks the user to paste the authorization code from the redirect URL to complete authentication.

3. Fetch Spotify Data

- Fetch User's Top Artists: The script fetches the top artists for the authenticated user.
- Genre Counting: It then counts the number of times each genre appears among the user's top artists.

4. Movie & TV Genre Mapping

- movie_genre_mapping: Maps Spotify music genres to corresponding movie genres.
- tv_genre_mapping: Does the same as above but for TV shows.

5. Recommendation Function (recommend)

- This function provides a recommendation based on the user's preference (movie or TV show) and the genres of their top Spotify artists.
- It first selects the right genre mapping dictionary based on the choice.
- Then, it matches the top Spotify genres to movie or TV show genres.
- The function then filters a dataset of movies or TV shows based on these genres and some additional criteria (like ratings and number of votes).
- Finally, it randomly selects a recommendation from this filtered list and prints it.

6. Interactive Buttons and Callbacks

- on_button_click: Function that gets triggered when a button (Movie or TV Show) is clicked.
- movie_button and tv_button: Interactive buttons for the user to choose between getting a movie or TV show recommendation.
- The on_click method of these buttons is used to bind the buttons to the on_button_click function.

7. Display the Buttons

- The display function from IPython.display shows the buttons in the Jupyter notebook so the user can interact with the recommendation system.


  
