<a href="https://colab.research.google.com/github/RecSys-lab/popcorn_dataset/blob/main/examples/load_popcorn_visuals.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# **🍿 Popcorn Framework in Google Colab**
### **Load Popcorn Dataset - Frame/Shot Embedding Functions**

🤗 Dataset on Hugging Face: [link](https://huggingface.co/datasets/alitourani/Popcorn_Dataset)

🌐 Dataset Web-Page: [link](https://recsys-lab.github.io/popcorn_dataset/)

🎬 Popcorn Framework: [link](https://github.com/RecSys-lab/Popcorn)

## **[Step 1] Clone Popcorn Movie Recommender Tool**

Clone the framework into the Colab workspace (`/content`) and prepare it for experiments.

⚠️ You might see a *"Restart Session"* warning during the first run in Google Colab due to library version mismatches. This is expected! Accept the restart, re-run this cell, and continue!

In [1]:
# Clone the repo
!git clone https://github.com/RecSys-lab/Popcorn.git

# Install the required library
%cd Popcorn
!pip install -e .

# Add the repository to the Python path
import sys
sys.path.append('/content/Popcorn')

# Go back to the root
%cd ..

fatal: destination path 'Popcorn' already exists and is not an empty directory.
/content/Popcorn
Obtaining file:///content/Popcorn
  Preparing metadata (setup.py) ... done
Installing collected packages: Popcorn
  Attempting uninstall: Popcorn
    Found existing installation: Popcorn 1.6.0
    Uninstalling Popcorn-1.6.0:
      Successfully uninstalled Popcorn-1.6.0
  Running setup.py develop for Popcorn
Successfully installed Popcorn-1.6.0
/content
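
The `fatal: destination path 'Popcorn' already exists` message in the output above appears when the cell is re-run after a restart. One way to avoid it is to skip the clone when the folder is already present. The helper below is a minimal sketch, not part of the Popcorn framework: the name `clone_if_missing` is hypothetical, and it assumes `git` is available on the runtime.

```python
import os
import subprocess

# Same repository URL as in the cell above.
REPO_URL = "https://github.com/RecSys-lab/Popcorn.git"


def clone_if_missing(repo_url: str, dest: str) -> bool:
    """Clone repo_url into dest unless dest already exists.

    Hypothetical helper for illustration. Returns True if a clone was run,
    False if the destination folder was already there.
    """
    if os.path.isdir(dest):
        print(f"- '{dest}' already exists, skipping the clone.")
        return False
    # check=True raises CalledProcessError if git exits with an error.
    subprocess.run(["git", "clone", repo_url, dest], check=True)
    return True
```

In Colab, calling `clone_if_missing(REPO_URL, "/content/Popcorn")` before the `pip install -e .` step would make the cell safe to re-run.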


## 🚀 **[Step 2] Use the Framework**

### *1. Load Configurations and Imports*

In [2]:
from popcorn.utils import readConfigs
from popcorn.datasets.popcorn.utils import RAW_DATA_URL
from popcorn.datasets.popcorn.helper_embedding import generatePacketUrl, fetchAllPackets

# Start the Framework
print("Welcome to 'Popcorn' 🍿! Starting the framework for your movie recommendation ...\n")

# Read the configuration file
configs = readConfigs("Popcorn/popcorn/config/config.yml")
# Report an error if the configuration file could not be read
if not configs:
    print("Error reading the configuration file!")

Welcome to 'Popcorn' 🍿! Starting the framework for your movie recommendation ...

- Reading the framework's configuration file ...
- Configuration file loaded successfully!


### *2. Override the Configurations (Optional)*

In [4]:
# Which CNNs to use? ('incp3' | 'vgg19')
cnns = configs["datasets"]["multimodal"]["popcorn"]["cnns"]

# What is the Dataset name?
datasetName = configs["datasets"]["multimodal"]["popcorn"]["name"]

# Which Embedding Sources to Use? ('full_movies' | 'movie_shots' | 'movie_trailers')
embeddings = configs["datasets"]["multimodal"]["popcorn"]["embedding_sources"]

print(
    f"- Preparing to fetch the raw file of '{datasetName}' dataset from '{RAW_DATA_URL}' ..."
)

- Preparing to fetch the raw file of 'Popcorn-visual' dataset from 'https://huggingface.co/datasets/alitourani/Popcorn_Dataset/raw/main/' ...
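
The cell above reads three values from a nested configuration dictionary, and a missing key would raise a `KeyError`. The sketch below shows one defensive way to read such values and then override them in memory. `get_nested` and `sample_configs` are hypothetical stand-ins: the sample values are taken from the comments and outputs in this notebook, not from the actual `config.yml`.

```python
from typing import Any, Mapping, Sequence


def get_nested(config: Mapping[str, Any], path: Sequence[str], default: Any = None) -> Any:
    """Walk a nested mapping along path; return default if any key is missing."""
    node: Any = config
    for key in path:
        if not isinstance(node, Mapping) or key not in node:
            return default
        node = node[key]
    return node


# Illustrative stand-in for the parsed configuration (assumed values).
sample_configs = {
    "datasets": {
        "multimodal": {
            "popcorn": {
                "name": "Popcorn-visual",
                "cnns": ["incp3", "vgg19"],
                "embedding_sources": ["full_movies", "movie_shots", "movie_trailers"],
            }
        }
    }
}

popcorn_path = ["datasets", "multimodal", "popcorn"]
dataset_name = get_nested(sample_configs, popcorn_path + ["name"], default="unknown")
cnns_sketch = get_nested(sample_configs, popcorn_path + ["cnns"], default=[])

# Optional override: restrict the run to a single CNN without editing the file.
cnns_sketch = ["vgg19"]
```

Overriding the variables after reading them keeps `config.yml` untouched, which is convenient in a throwaway Colab session.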


### *3. Test Generating Sample Packet URLs*

In [5]:
print("\n[Func-1] Generating a sample packet URLs to embeddings ...")
givenMovieId, givenPacketId = 6, 1
givenCnn, givenEmbedding = cnns[0], embeddings[0]
packetUrl = generatePacketUrl(givenEmbedding, givenCnn, givenMovieId, givenPacketId)
print(
    f"- URL for packet '#{givenPacketId}' of movie '#{givenMovieId}' extracted by CNN '{givenCnn}' from source '{givenEmbedding}': {packetUrl}"
)

# Another sample
givenMovieId, givenPacketId = 150, 3
givenCnn, givenEmbedding = "vgg19", "movie_trailers"
packetUrl = generatePacketUrl(givenEmbedding, givenCnn, givenMovieId, givenPacketId)
print(
    f"- URL for packet '#{givenPacketId}' of movie '#{givenMovieId}' extracted by CNN '{givenCnn}' from source '{givenEmbedding}': {packetUrl}"
)


[Func-1] Generating a sample packet URLs to embeddings ...
- URL for packet '#1' of movie '#6' extracted by CNN 'incp3' from source 'full_movies': https://huggingface.co/datasets/alitourani/Popcorn_Dataset/raw/main/full_movies/incp3/0000000006/packet0001.json
- URL for packet '#3' of movie '#150' extracted by CNN 'vgg19' from source 'movie_trailers': https://huggingface.co/datasets/alitourani/Popcorn_Dataset/raw/main/movie_trailers/vgg19/0000000150/packet0003.json
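
The two URLs above suggest a regular pattern: base URL, embedding source, CNN name, a movie ID zero-padded to 10 digits, and a packet ID zero-padded to 4 digits. The sketch below reproduces that pattern as inferred from the printed output only. `build_packet_url` is a hypothetical name, and the framework's own `generatePacketUrl` may be implemented differently.

```python
# Base URL as printed by the configuration cell earlier in this notebook.
BASE_URL = "https://huggingface.co/datasets/alitourani/Popcorn_Dataset/raw/main/"


def build_packet_url(source: str, cnn: str, movie_id: int, packet_id: int,
                     base_url: str = BASE_URL) -> str:
    """Compose a packet URL following the pattern seen in the output above."""
    # :010d pads the movie ID to 10 digits, :04d pads the packet ID to 4 digits.
    return f"{base_url}{source}/{cnn}/{movie_id:010d}/packet{packet_id:04d}.json"


sample_url = build_packet_url("full_movies", "incp3", 6, 1)
```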


### *4. Test Fetching All Packets of a Movie*
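
Conceptually, fetching all packets of a movie can be pictured as a loop that requests packet 1, 2, 3, and so on until one is missing. The sketch below illustrates that idea offline and is not the framework's `fetchAllPackets`. It assumes that each packet holds a list of embedding records and that the first missing packet marks the end, neither of which is confirmed by this notebook. The names `fetch_all_packets_sketch`, `load_packet`, and `fake_packets` are hypothetical.

```python
from typing import Callable, Optional


def fetch_all_packets_sketch(load_packet: Callable[[int], Optional[list]],
                             max_packets: int = 1000) -> list:
    """Collect records from consecutive packets until one is missing.

    load_packet takes a 1-based packet ID and returns a list of records,
    or None when the packet does not exist (assumed stop condition).
    """
    collected: list = []
    for packet_id in range(1, max_packets + 1):
        packet = load_packet(packet_id)
        if packet is None:
            break  # no more packets for this movie
        collected.extend(packet)
    return collected


# Offline stand-in for the remote packets: two packets holding three records.
fake_packets = {1: [{"frame": 1}, {"frame": 2}], 2: [{"frame": 3}]}
embeddings_sketch = fetch_all_packets_sketch(fake_packets.get)
```

Passing the loader in as a function keeps the loop independent of the network, so the same logic can be exercised with local data before pointing it at real URLs.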

In [6]:
print("\n[Func-2] Fetching all packets of a movie ...")
givenMovieId = 6
givenCnn, givenEmbedding = "incp3", "movie_trailers"
fetchedEmbeddings = fetchAllPackets(givenEmbedding, givenCnn, givenMovieId)
print(
    f"- Number of embeddings from the fetched packets for movie '#{givenMovieId}': {len(fetchedEmbeddings)}"
)


[Func-2] Fetching all packets of a movie ...
- Fetching all packets of the movie #6 (movie_trailers, incp3) ...
- Loading JSON data from the given URL 'https://huggingface.co/datasets/alitourani/Popcorn_Dataset/raw/main/movie_trailers/incp3/0000000006/packet0001.json' ...
- JSON data loaded successfully!
- Fetched JSON data from the URL 'https://huggingface.co/datasets/alitourani/Popcorn_Dataset/raw/main/movie_trailers/incp3/0000000006/packet0001.json'!
- Loading JSON data from the given URL 'https://huggingface.co/datasets/alitourani/Popcorn_Dataset/raw/main/movie_trailers/incp3/0000000006/packet0002.json' ...
- JSON data loaded successfully!
- Fetched JSON data from the URL 'https://huggingface.co/datasets/alitourani/Popcorn_Dataset/raw/main/movie_trailers/incp3/0000000006/packet0002.json'!
- Loading JSON data from the given URL 'https://huggingface.co/datasets/alitourani/Popcorn_Dataset/raw/main/movie_trailers/incp3/0000000006/packet0003.json' ...
- JSON data loaded successfully!
-