<a href="https://colab.research.google.com/github/RecSys-lab/movifex_dataset/blob/main/examples/load_movifex_visuals.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# **MoViFex Framework - Process `MoViFex` Dataset Visual Features**

🎬 MoViFex Dataset: [link](https://huggingface.co/datasets/alitourani/MoViFex_Dataset/tree/main)

🎬 Framework: [link](https://github.com/RecSys-lab/MoViFex)


# [Step 1] - Load the Framework

Clone the framework into the Colab runtime (under `/content`) and install it so that it can be imported in the experiments below.

In [None]:
# Clone the repo
!git clone https://github.com/RecSys-lab/MoViFex.git

# Install the framework and its dependencies in editable mode
%cd MoViFex
!pip install -e .

# Add the repository to the Python path
import sys
sys.path.append('/content/MoViFex')

Cloning into 'MoViFex'...
remote: Enumerating objects: 689, done.
remote: Counting objects: 100% (265/265), done.
remote: Compressing objects: 100% (181/181), done.
remote: Total 689 (delta 133), reused 202 (delta 78), pack-reused 424 (from 1)
Receiving objects: 100% (689/689), 193.71 KiB | 2.31 MiB/s, done.
Resolving deltas: 100% (350/350), done.
/content/MoViFex
Obtaining file:///content/MoViFex
  Preparing metadata (setup.py) ... done
Collecting pytube>=15.0 (from MoViFex==1.0.0)
  Downloading pytube-15.0.0-py3-none-any.whl.metadata (5.0 kB)
Collecting scipy>=1.14.1 (from MoViFex==1.0.0)
  Downloading scipy-1.15.1-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (61 kB)
Downloading pytube-15.0.0-py3-none-any.whl (57 kB)

# [Step 2] - Use the Framework 🚀

Import the framework and define some variables to work with it.

In [None]:
import os
import json
import movifex

# Similar to the `config.yml` file in the framework - section `datasets/visual_dataset/movifex`
configs = {
    "name": "MoViFex-visual",
    "path_metadata": "https://huggingface.co/datasets/alitourani/MoViFex_Dataset/resolve/main/stats.json",
    "path_raw": "https://huggingface.co/datasets/alitourani/MoViFex_Dataset/raw/main/",
    "feature_sources": ["full_movies", "movie_shots", "movie_trailers"],
    "agg_feature_sources": ["full_movies_agg", "movie_shots_agg", "movie_trailers_agg"],
    "feature_models": ["incp3", "vgg19"],
    "aggregation_models": ["Max", "Mean"]
}

# Variables
datasetName = configs['name']
datasetRawFilesUrl = configs['path_raw']
featureModels = configs['feature_models']
featureSources = configs['feature_sources']
aggFeatureSources = configs['agg_feature_sources']

# Selection used in the tests below
givenMovieId = 6
givenModel = featureModels[0]  # 'incp3'
givenFeatureSource = featureSources[2]  # 'movie_trailers'
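
Picking the model and feature source by list position is compact but easy to get wrong if the configuration order changes. As an optional alternative, the sketch below selects them by name and fails early on a typo. The helper `select_by_name` is hypothetical and not part of the framework, and the lists are repeated here only to keep the snippet self-contained.

```python
def select_by_name(options, name):
    """Return `name` if it is one of `options`, otherwise raise a clear error."""
    if name not in options:
        raise ValueError(f"Unknown option '{name}'; expected one of {options}")
    return name


# Same values as in `configs` above, repeated so the snippet runs on its own
feature_models = ["incp3", "vgg19"]
feature_sources = ["full_movies", "movie_shots", "movie_trailers"]

given_model = select_by_name(feature_models, "incp3")
given_feature_source = select_by_name(feature_sources, "movie_trailers")
print(given_model, given_feature_source)
```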

**Test I. Generating a Packet Address**

- ⚙️ Function: `packetAddressGenerator`

In [None]:
from movifex.datasets.movifex.helper_visualfeats import packetAddressGenerator

print(f"- Generating a sample packet address file from '{datasetRawFilesUrl}' ...")
packetAddress = packetAddressGenerator(datasetRawFilesUrl, givenFeatureSource, givenModel, givenMovieId, 1)
print(f"- Generated address (str): {packetAddress}")

- Generating a sample packet address file from 'https://huggingface.co/datasets/alitourani/MoViFex_Dataset/raw/main/' ...
- Generated address (str): https://huggingface.co/datasets/alitourani/MoViFex_Dataset/raw/main/movie_trailers/incp3/0000000006/packet0001.json
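
The generated address suggests a layout of `<base>/<feature source>/<model>/<10-digit movie ID>/packet<4-digit index>.json`. The sketch below reproduces that layout with plain string formatting. It is inferred from the single address printed above, not taken from the framework's source, and `build_packet_address` is a hypothetical name.

```python
def build_packet_address(base_url, feature_source, model, movie_id, packet_index):
    """Compose a packet URL following the layout seen in the output above."""
    movie_dir = f"{int(movie_id):010d}"                   # e.g. 6 -> '0000000006'
    packet_file = f"packet{int(packet_index):04d}.json"   # e.g. 1 -> 'packet0001.json'
    return f"{base_url.rstrip('/')}/{feature_source}/{model}/{movie_dir}/{packet_file}"


address = build_packet_address(
    "https://huggingface.co/datasets/alitourani/MoViFex_Dataset/raw/main/",
    "movie_trailers",
    "incp3",
    6,
    1,
)
print(address)
```

Zero-padding with format specifiers keeps the directory and file names at a fixed width, which is what makes the addresses predictable enough to enumerate.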


**Test II. Fetching All Packets from the Generated Address**

- ⚙️ Function: `fetchAllPackets`
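
As a rough mental model, and not the framework's actual implementation, a fetcher like this could request `packet0001.json`, `packet0002.json`, and so on, stopping at the first address that fails to load. The sketch below follows that assumed stop rule. The names `fetch_json` and `fetch_all_packets_sketch`, the `max_packets` cap, and the `example.org` base URL are all hypothetical, and the demonstration uses an in-memory stand-in so it runs without network access.

```python
import json
import urllib.request


def fetch_json(url, timeout=30):
    """Return the parsed JSON at `url`, or None if it cannot be fetched or parsed."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return json.loads(response.read().decode("utf-8"))
    except (OSError, ValueError):
        # URLError/HTTPError and timeouts are OSError subclasses;
        # malformed JSON or a bad encoding raises a ValueError subclass.
        return None


def fetch_all_packets_sketch(base_url, feature_source, model, movie_id,
                             fetch=fetch_json, max_packets=1000):
    """Collect packets 1, 2, 3, ... until one is missing (assumed stop rule)."""
    packets = []
    for index in range(1, max_packets + 1):
        url = (f"{base_url.rstrip('/')}/{feature_source}/{model}/"
               f"{int(movie_id):010d}/packet{index:04d}.json")
        packet = fetch(url)
        if packet is None:
            break
        packets.append(packet)
    return packets


# Offline demonstration: a dict plays the role of the remote server
base = "https://example.org/raw/main/"
fake_server = {
    f"{base}movie_trailers/incp3/0000000006/packet{i:04d}.json": {"packet": i}
    for i in (1, 2, 3)
}
demo_packets = fetch_all_packets_sketch(
    base, "movie_trailers", "incp3", 6, fetch=fake_server.get
)
print(len(demo_packets))  # 3
```

Passing the fetch function as a parameter keeps the enumeration logic testable without touching the network; the real `fetchAllPackets` is called as shown in the next cell.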

In [None]:
from movifex.datasets.movifex.helper_visualfeats import fetchAllPackets

print(f"- Fetching all packets of the movie #{givenMovieId} (type: '{givenFeatureSource}', CNN: '{givenModel}') ...")
moviePackets = fetchAllPackets(datasetRawFilesUrl, givenFeatureSource, givenModel, givenMovieId)
print(f"- Number of fetched packets (list): {len(moviePackets)}")
print(f"- Sample packet items (list):\n {moviePackets[:2]}")

- Fetching all packets of the movie #6 (type: 'movie_trailers', CNN: 'incp3') ...
Generated packet address: https://huggingface.co/datasets/alitourani/MoViFex_Dataset/raw/main/movie_trailers/incp3/0000000006/packet0001.json
Fetched JSON from the address ...
Generated packet address: https://huggingface.co/datasets/alitourani/MoViFex_Dataset/raw/main/movie_trailers/incp3/0000000006/packet0002.json
Fetched JSON from the address ...
Generated packet address: https://huggingface.co/datasets/alitourani/MoViFex_Dataset/raw/main/movie_trailers/incp3/0000000006/packet0003.json
Fetched JSON from the address ...
Generated packet address: https://huggingface.co/datasets/alitourani/MoViFex_Dataset/raw/main/movie_trailers/incp3/0000000006/packet0004.json
Fetched JSON from the address ...
Generated packet address: https://huggingface.co/datasets/alitourani/MoViFex_Dataset/raw/main/movie_trailers/incp3/0000000006/packet0005.json
Fetched JSON from the address ...
Generated packet address: https://hugg