<a href="https://colab.research.google.com/github/RecSys-lab/Popcorn/blob/main/examples/colab/load_movielens_movies.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# **🍿 Popcorn Framework in Google Colab**
### **Load MovieLens Movies (100K, 1M, 25M)**

🎬 Popcorn Framework: [link](https://github.com/RecSys-lab/Popcorn)

## **[Step 1] Clone Popcorn Movie Recommender Tool**

Clone the framework into the Colab workspace (`/content`) and prepare it for experiments.

⚠️ You might see a *"Restart Session"* warning during the first run in Google Colab due to library version mismatches. This is expected: accept the restart, re-run this cell, and continue.

In [1]:
# Clone the repo
!git clone https://github.com/RecSys-lab/Popcorn.git

# Install Popcorn in editable mode (-e)
%cd Popcorn
!pip install -e .

# Add the repository to the Python path
import sys
sys.path.append('/content/Popcorn')

# Go back to the root
%cd ..

fatal: destination path 'Popcorn' already exists and is not an empty directory.
/content/Popcorn
Obtaining file:///content/Popcorn
  Preparing metadata (setup.py) ... done
Installing collected packages: Popcorn
  Attempting uninstall: Popcorn
    Found existing installation: Popcorn 1.6.0
    Uninstalling Popcorn-1.6.0:
      Successfully uninstalled Popcorn-1.6.0
  Running setup.py develop for Popcorn
Successfully installed Popcorn-1.6.0
/content
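
The `fatal: destination path 'Popcorn' already exists` line in the output above shows up when the cell is re-run after the session restart. A minimal sketch of a re-runnable clone step follows, assuming `git` is available; `clone_if_missing` is a hypothetical helper written for this illustration and is not part of Popcorn.

```python
import subprocess
from pathlib import Path


def clone_if_missing(repo_url: str, dest: str, run=subprocess.run) -> bool:
    """Clone repo_url into dest unless dest already exists.

    Hypothetical helper for illustration only. `run` is injectable so the
    function can be exercised without network access.
    Returns True if a clone was attempted, False if it was skipped.
    """
    if Path(dest).exists():
        print(f"- '{dest}' already exists, skipping the clone.")
        return False
    # check=True raises CalledProcessError if git exits with an error
    run(["git", "clone", repo_url, dest], check=True)
    return True
```

In this notebook the call would be `clone_if_missing("https://github.com/RecSys-lab/Popcorn.git", "/content/Popcorn")`, after which the `pip install -e .` step can run as before.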


## 🚀 **[Step 2] Use the Framework**

### *1. Load Configurations and Imports*

In [5]:
import os
import json
import pandas as pd
from popcorn.utils import readConfigs

# Start the Framework
print("Welcome to 'Popcorn' 🍿! Starting the framework for your movie recommendation ...\n")

# Read the configuration file
configs = readConfigs("Popcorn/popcorn/config/config.yml")
# Stop early if the configuration file could not be read
if not configs:
    raise RuntimeError("Error reading the configuration file!")

# Override (optional)
configs["datasets"]["unimodal"]["movielens"]["version"] = "1m" # '100k' | '1m' | '25m'
configs["datasets"]["unimodal"]["movielens"]["download_path"] = "/content/MovieLens"

Welcome to 'Popcorn' 🍿! Starting the framework for your movie recommendation ...

- Reading the framework's configuration file ...
- Configuration file loaded successfully!
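
The two override lines index four levels into the configuration, so a mistyped version string would only surface later, during the download. The sketch below validates the value first. `override_movielens` and `ALLOWED_VERSIONS` are hypothetical names introduced here, and the stand-in dictionary only mirrors the keys used in the cell above, not the full `config.yml`.

```python
ALLOWED_VERSIONS = ("100k", "1m", "25m")  # taken from the comment in the cell above


def override_movielens(configs: dict, version: str, download_path: str) -> dict:
    """Hypothetical helper: validate and apply the two MovieLens overrides."""
    if version not in ALLOWED_VERSIONS:
        raise ValueError(
            f"Unsupported version '{version}', expected one of {ALLOWED_VERSIONS}"
        )
    section = configs["datasets"]["unimodal"]["movielens"]
    section["version"] = version
    section["download_path"] = download_path
    return configs


# Stand-in for the dictionary returned by readConfigs (structure assumed)
demo_configs = {"datasets": {"unimodal": {"movielens": {}}}}
override_movielens(demo_configs, "1m", "/content/MovieLens")
print(demo_configs["datasets"]["unimodal"]["movielens"])
```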


### *2. Download MovieLens Dataset Variants*

In [3]:
from popcorn.datasets.movielens.downloader import downloadMovieLens

# Variables
mlVersion = configs["datasets"]["unimodal"]["movielens"]["version"]
downloadPath = configs["datasets"]["unimodal"]["movielens"]["download_path"]

# Download MovieLens dataset
downloadMovieLens(mlVersion, downloadPath)


- Downloading the MovieLens-1m dataset ...
- Creating the download path '/content/MovieLens/ml-1m' ...
- Fetching data from 'https://files.grouplens.org/datasets/movielens/ml-1m.zip' ...
- Download completed and the dataset is saved as a 'zip' file!
- Extracting the dataset files inside '/content/MovieLens/ml-1m' ...
- Dataset extracted to '/content/MovieLens/ml-1m' successfully!
- Removing the zip file '/content/MovieLens/ml-1m/ml-1m.zip' ...
- Zip file removed successfully!


True
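
The log above describes four steps: create the download path, fetch the zip file, extract it, and remove the archive. The following standard-library sketch reproduces that sequence. It is not Popcorn's implementation; `fetch_and_extract` is a hypothetical name, and skipping when the target directory exists is an assumption based on the "already exists" message printed in the next step.

```python
import urllib.request
import zipfile
from pathlib import Path


def fetch_and_extract(url: str, target_dir: str) -> bool:
    """Download a zip archive into target_dir, extract it, delete the zip.

    Hypothetical helper for illustration. Returns False and does nothing
    if target_dir already exists, True otherwise.
    """
    target = Path(target_dir)
    if target.exists():
        print(f"- '{target}' already exists! Skipping the download ...")
        return False
    target.mkdir(parents=True)

    # Name the local file after the last URL segment, e.g. 'ml-1m.zip'
    zip_path = target / url.rsplit("/", 1)[-1]
    urllib.request.urlretrieve(url, str(zip_path))

    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(target)
    zip_path.unlink()  # remove the archive once its contents are on disk
    return True
```

With the URL from the log, the call would be `fetch_and_extract("https://files.grouplens.org/datasets/movielens/ml-1m.zip", "/content/MovieLens/ml-1m")`.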

### *3. Load MovieLens Dataset Movies*

In [4]:
from popcorn.datasets.movielens.loader import loadMovieLens
from popcorn.datasets.utils import printTextualDatasetStats

# Load MovieLens
itemsDF, usersDF, ratingsDF = loadMovieLens(configs)
if itemsDF is None:
    print("Error in loading the MovieLens dataset! Exiting ...")
else:
    print(f"\n- ItemsDF (shape: {itemsDF.shape}): \n{itemsDF.head()}")
    printTextualDatasetStats(ratingsDF)


- Downloading the MovieLens-1m dataset ...
- The download path '/content/MovieLens/ml-1m' already exists! Skipping the download ...

- Loading 'MovieLens-1m' data from '/content/MovieLens/ml-1m/ml-1m' ...
- Items (movies) have been loaded. Number of rows: 3,883
- Users have been loaded. Number of rows: 6,040
- Ratings have been loaded. Number of rows: 1,000,209

- ItemsDF (shape: (3883, 3)): 
  item_id                               title  \
0       1                    Toy Story (1995)   
1       2                      Jumanji (1995)   
2       3             Grumpier Old Men (1995)   
3       4            Waiting to Exhale (1995)   
4       5  Father of the Bride Part II (1995)   

                             genres  
0   [Animation, Children's, Comedy]  
1  [Adventure, Children's, Fantasy]  
2                 [Comedy, Romance]  
3                   [Comedy, Drama]  
4                          [Comedy]  
--------------------------
- The Dataset Overview:
-- Total Interactions: 100020
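
To see where the list-valued `genres` column comes from: in the MovieLens 1M release, each line of `movies.dat` has the form `MovieID::Title::Genres`, with genres separated by `|`. The sketch below parses one such line into the three fields shown in `itemsDF`. `parse_movie_line` is a hypothetical helper, not the loader Popcorn uses, and keeping `item_id` as a string is an assumption. The 100K and 25M releases use different file layouts.

```python
def parse_movie_line(line: str) -> dict:
    """Parse one 'MovieID::Title::Genres' record (MovieLens 1M layout)."""
    item_id, title, genres = line.rstrip("\n").split("::")
    return {
        "item_id": item_id,            # left as a string in this sketch
        "title": title,
        "genres": genres.split("|"),   # 'A|B|C' -> ['A', 'B', 'C']
    }


sample = "1::Toy Story (1995)::Animation|Children's|Comedy"
movie = parse_movie_line(sample)
print(movie["title"], movie["genres"])
```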

### *4. Filter Movies by a Given Genre*

In [7]:
from popcorn.datasets.movielens.helper_movies import filterMoviesByGenre

print("Filtering movies by given genres ...")

# Comedy
itemsDF_filtered = filterMoviesByGenre(itemsDF, genre="Comedy")
print(f"- Filtered ItemsDF: \n{itemsDF_filtered.head(3)}\n")

# Drama
itemsDF_filtered = filterMoviesByGenre(itemsDF, genre="Drama")
print(f"- Filtered ItemsDF: \n{itemsDF_filtered.head(3)}\n")

# N/A Genre
itemsDF_filtered = filterMoviesByGenre(itemsDF, genre="TestGenre")
print(f"- Filtered ItemsDF: \n{itemsDF_filtered.head(3)}\n")

Filtering movies by given genres ...
- Filtering 3883 movies by genre 'Comedy' ...
- Kept 1200 movies with genre 'Comedy'.
- Filtered ItemsDF: 
  item_id                     title                           genres
0       1          Toy Story (1995)  [Animation, Children's, Comedy]
2       3   Grumpier Old Men (1995)                [Comedy, Romance]
3       4  Waiting to Exhale (1995)                  [Comedy, Drama]

- Filtering 3883 movies by genre 'Drama' ...
- Kept 1603 movies with genre 'Drama'.
- Filtered ItemsDF: 
   item_id                           title                    genres
3        4        Waiting to Exhale (1995)           [Comedy, Drama]
10      11  American President, The (1995)  [Comedy, Drama, Romance]
13      14                    Nixon (1995)                   [Drama]

- [Error] Genre 'TestGenre' is not valid. Valid genres are: ['unknown', 'Action', 'Adventure', 'Animation', "Children's", 'Comedy', 'Crime', 'Documentary', 'Drama', 'Fantasy', 'Film-Noir', 'Horror'
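
The behaviour shown above (validate the genre, then keep rows whose genre list contains it) can be sketched in plain Python. This is an illustration rather than the code behind `filterMoviesByGenre`: `filter_by_genre` is a hypothetical name, the records are the five rows printed earlier, and returning an empty list for an invalid genre is an assumption.

```python
def filter_by_genre(movies: list, genre: str, valid_genres: set) -> list:
    """Keep the records whose 'genres' list contains `genre`."""
    if genre not in valid_genres:
        print(f"- [Error] Genre '{genre}' is not valid.")
        return []  # assumption: invalid genres yield an empty result
    return [movie for movie in movies if genre in movie["genres"]]


# The first five rows of itemsDF, as printed earlier in this notebook
movies = [
    {"item_id": "1", "title": "Toy Story (1995)", "genres": ["Animation", "Children's", "Comedy"]},
    {"item_id": "2", "title": "Jumanji (1995)", "genres": ["Adventure", "Children's", "Fantasy"]},
    {"item_id": "3", "title": "Grumpier Old Men (1995)", "genres": ["Comedy", "Romance"]},
    {"item_id": "4", "title": "Waiting to Exhale (1995)", "genres": ["Comedy", "Drama"]},
    {"item_id": "5", "title": "Father of the Bride Part II (1995)", "genres": ["Comedy"]},
]
# For the demo, the valid genres are simply those present in the sample
valid_genres = {genre for movie in movies for genre in movie["genres"]}

comedies = filter_by_genre(movies, "Comedy", valid_genres)
print([movie["title"] for movie in comedies])
```

Records in this shape can be obtained from a DataFrame with `itemsDF.to_dict("records")`.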

### *5. Filter Movies Containing the Main Genres*

In [8]:
from popcorn.datasets.movielens.helper_movies import filterMoviesWithMainGenres

# Filter movies containing the main genres
print("Filtering movies containing the main genres ...")
itemsDF_mainGenres = filterMoviesWithMainGenres(itemsDF)
print(
    f"- Main Genres ItemsDF (shape: {itemsDF_mainGenres.shape}): \n{itemsDF_mainGenres.head(3)}"
)

Filtering movies containing the main genres ...
- Filtering 3883 movies containing the main genres '['Action', 'Comedy', 'Drama', 'Horror']' ...
- Kept 3193 movies containing the main genres.
- Main Genres ItemsDF (shape: (3193, 3)): 
  item_id                     title                           genres
0       1          Toy Story (1995)  [Animation, Children's, Comedy]
2       3   Grumpier Old Men (1995)                [Comedy, Romance]
3       4  Waiting to Exhale (1995)                  [Comedy, Drama]
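
The log lists the main genres as `['Action', 'Comedy', 'Drama', 'Horror']`, and the kept count (3193 of 3883) suggests that a movie is kept when it has at least one of them. Under that assumption, the test reduces to a set-overlap check. `MAIN_GENRES` and `has_main_genre` are hypothetical names for this sketch, not Popcorn's API.

```python
MAIN_GENRES = {"Action", "Comedy", "Drama", "Horror"}  # as listed in the log above


def has_main_genre(genres, main_genres=MAIN_GENRES) -> bool:
    """Return True if at least one entry of `genres` is a main genre."""
    return not main_genres.isdisjoint(genres)


# Two rows from the earlier itemsDF output
toy_story = ["Animation", "Children's", "Comedy"]
jumanji = ["Adventure", "Children's", "Fantasy"]
print(has_main_genre(toy_story), has_main_genre(jumanji))
```

This matches the filtered output above, where row 1 (Jumanji) is absent while rows 0, 2, and 3 remain.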
