<a href="https://colab.research.google.com/github/RecSys-lab/Popcorn/blob/main/examples/colab/modality_data_fusion_all_mmtf.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# **🍿 Popcorn Framework in Google Colab**
### **Modality Fusion: Poison-RAG-Plus (Text) and MMTF-14K (Visual & Audio)**

🎬 Popcorn Framework: [link](https://github.com/RecSys-lab/Popcorn)

## **[Step 1] Clone Popcorn Movie Recommender Tool**

Clone the framework into the Colab workspace (`/content`) and prepare it for experiments.

⚠️ You might see a *"Restart Session"* warning during the first run in Google Colab due to library version mismatches. This is expected: accept the restart, re-run this cell, and continue.

In [1]:
# Clone the repo
!git clone https://github.com/RecSys-lab/Popcorn.git

# Install the framework in editable mode
%cd Popcorn
!pip install -e .

# Add the repository to the Python path
import sys
sys.path.append('/content/Popcorn')

# Go back to the root
%cd ..

fatal: destination path 'Popcorn' already exists and is not an empty directory.
/content/Popcorn
Obtaining file:///content/Popcorn
  Preparing metadata (setup.py) ... done
Installing collected packages: Popcorn
  Attempting uninstall: Popcorn
    Found existing installation: Popcorn 1.6.0
    Uninstalling Popcorn-1.6.0:
      Successfully uninstalled Popcorn-1.6.0
  Running setup.py develop for Popcorn
Successfully installed Popcorn-1.6.0
/content
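
The `fatal: destination path 'Popcorn' already exists` message above appears because the cell was re-run after the restart, so `git clone` found the directory from the first run. The message is harmless here. A minimal sketch of one way to make the step repeatable is shown below; the `clone_command` helper is hypothetical and not part of Popcorn.

```python
import os

REPO_URL = "https://github.com/RecSys-lab/Popcorn.git"

def clone_command(repo_url: str, target_dir: str) -> list:
    """Pick the git command to run: pull if the repo is already cloned, clone otherwise.

    Hypothetical helper for illustration; it only builds the command, it does not run it.
    """
    if os.path.isdir(os.path.join(target_dir, ".git")):
        return ["git", "-C", target_dir, "pull", "--ff-only"]
    return ["git", "clone", repo_url, target_dir]

cmd = clone_command(REPO_URL, "Popcorn")
print(" ".join(cmd))
# In a notebook, the chosen command could then be executed with:
#   import subprocess; subprocess.run(cmd, check=True)
```

Building the command separately from running it makes it easy to see which branch was taken before anything touches the workspace.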


## 🚀 **[Step 2] Use the Framework**

### *1. Load Configurations and Imports*

In [2]:
import os
import json
import pandas as pd
from popcorn.utils import readConfigs

# Start the Framework
print("Welcome to 'Popcorn' 🍿! Starting the framework for your movie recommendation ...\n")

# Read the configuration file
configs = readConfigs("Popcorn/popcorn/config/config.yml")
# Report an error if the configuration could not be read
if not configs:
    print("Error reading the configuration file!")

Welcome to 'Popcorn' 🍿! Starting the framework for your movie recommendation ...

- Reading the framework's configuration file ...
- Configuration file loaded successfully!
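
This notebook indexes `configs` like a nested dictionary (for example `configs["datasets"]["multimodal"]["mmtf"]["audio_variant"]` in the MMTF-14K step). The sketch below shows one defensive way to read such nested keys without raising `KeyError`. The `get_nested` helper and the toy dictionary are assumptions for illustration, and the real configuration file may contain different keys.

```python
def get_nested(cfg: dict, path: str, default=None):
    """Walk a dot-separated path through nested dicts; return `default` if any key is missing."""
    node = cfg
    for key in path.split("."):
        if not isinstance(node, dict) or key not in node:
            return default
        node = node[key]
    return node

# Toy stand-in for the object returned by readConfigs (assumed structure)
toy_configs = {
    "datasets": {
        "multimodal": {
            "mmtf": {"audio_variant": "ivec", "visual_variant": "cnn"}
        }
    }
}

print(get_nested(toy_configs, "datasets.multimodal.mmtf.audio_variant"))
print(get_nested(toy_configs, "datasets.multimodal.missing_key", default="n/a"))
```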


### *2. Load Poison-RAG-Plus (Text)*

In [3]:
from popcorn.datasets.poison_rag_plus.loader import loadPoisonRagPlus

# Load Dataset
textDF = loadPoisonRagPlus(configs)

if textDF is not None:
  print(f"- textDF shape: {textDF.shape}")


- Preparing the 'Poison-RAG-Plus' dataset with 'llama'-driven enriched embeddings ...
-- Loading data from 'llama_enriched_description_part1.csv.gz' ...
-- Loading data from 'llama_enriched_description_part2.csv.gz' ...
-- Loading data from 'llama_enriched_description_part3.csv.gz' ...
-- Loading data from 'llama_enriched_description_part4.csv.gz' ...
-- Loading data from 'llama_enriched_description_part5.csv.gz' ...
- Finished loading 4 parts of textual enriched data with 1,606 items!
- textDF shape: (1606, 2)
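
The log above shows the text embeddings arriving as several `.csv.gz` parts that end up in a single DataFrame. How `loadPoisonRagPlus` does this internally is not shown in this notebook. The sketch below is only an assumed pattern for reading multi-part gzip CSVs with pandas, using temporary toy files and a hypothetical `load_parts` helper.

```python
import os
import tempfile

import pandas as pd

def load_parts(paths):
    """Read each CSV part, stack the rows, and drop repeated item IDs (hypothetical helper)."""
    frames = [pd.read_csv(p, dtype={"item_id": str}) for p in paths]
    merged = pd.concat(frames, ignore_index=True)
    return merged.drop_duplicates(subset="item_id").reset_index(drop=True)

with tempfile.TemporaryDirectory() as tmp:
    paths = []
    # Two toy parts that share item "2", to show de-duplication
    for i, ids in enumerate([["1", "2"], ["2", "3"]], start=1):
        path = os.path.join(tmp, f"toy_part{i}.csv.gz")
        toy = pd.DataFrame({"item_id": ids, "text": ["[0.1, 0.2]"] * len(ids)})
        toy.to_csv(path, index=False)  # compression is inferred from the .gz suffix
        paths.append(path)
    toy_text = load_parts(paths)

print(toy_text.shape)
```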


### *3. Load MMTF-14K (Visual & Audio)*

In [4]:
from popcorn.datasets.mmtf14k.helper_audio import loadAudioFusedDF
from popcorn.datasets.mmtf14k.helper_visual import loadVisualFusedDF

# Configurations override (optional)
configs["datasets"]["multimodal"]["mmtf"]["audio_variant"] = "ivec" # 'blf' | 'ivec'
configs["datasets"]["multimodal"]["mmtf"]["visual_variant"] = "cnn" # 'avf' | 'cnn'

# Load Audio
audioDF = loadAudioFusedDF(configs)
if audioDF is not None:
  print(f"- audioDF shape: {audioDF.shape}")

# Load visual
visualDF = loadVisualFusedDF(configs)
if visualDF is not None:
  print(f"- visualDF shape: {visualDF.shape}")

- Fetching MMTF-14K audio data for variant 'ivec' ...
- Fetched 1,807 audio items using 'i-vector' features.
- audioDF shape: (1807, 2)
- Fetching MMTF-14K visual data for variant 'cnn' ...
- Fetched 1,807 visual items using 'cnn' features.
- visualDF shape: (1807, 2)


In [6]:
audioDF.head(5)

Unnamed: 0,item_id,audio
0,1500,"[-0.013718624, -0.0067672646, 0.010239853, 0.0..."
1,367,"[0.02349284, 0.013962358, 0.008394235, -0.0101..."
2,84152,"[-0.016936503, -0.020135608, -8.170488e-05, -0..."
3,3717,"[0.0849043, -0.0054050665, -0.008795045, 0.060..."
4,3238,"[0.011163599, 0.008414111, 0.0021271762, -0.03..."


In [7]:
visualDF.head(5)

Unnamed: 0,item_id,visual
0,1500,"[-2.58587, -0.500776, 1.97281, -3.69441, 0.985..."
1,367,"[-2.37085, -0.774639, 3.25064, -3.8578, -0.152..."
2,84152,"[-3.0163, -1.53324, 0.959031, -3.8849, 0.82468..."
3,3717,"[-2.22053, -0.883457, 2.99239, -4.54089, 0.519..."
4,3238,"[-2.67517, 0.0351334, 3.3415, -3.55823, 2.3788..."
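
Each row above pairs an `item_id` with one embedding vector. Before fusing, it can help to confirm that all vectors in a modality have the same length. The sketch below runs that check on a toy frame. The `to_vector` and `embedding_dims` helpers are hypothetical, and the check accepts stringified lists only as a precaution, since the storage format inside Popcorn's DataFrames is not confirmed here.

```python
import ast

import numpy as np
import pandas as pd

def to_vector(value) -> np.ndarray:
    """Convert a list, array, or stringified list into a 1-D float array."""
    if isinstance(value, str):
        value = ast.literal_eval(value)
    return np.asarray(value, dtype=float)

def embedding_dims(df: pd.DataFrame, column: str) -> set:
    """Return the set of distinct vector lengths found in `column`."""
    return {to_vector(v).shape[0] for v in df[column]}

# Toy data mixing the three input forms the helper accepts
toy_audio = pd.DataFrame({
    "item_id": ["1500", "367", "84152"],
    "audio": [[0.1, 0.2, 0.3], "[0.4, 0.5, 0.6]", np.array([0.7, 0.8, 0.9])],
})

dims = embedding_dims(toy_audio, "audio")
print(dims)
```

A result with more than one element would indicate vectors of inconsistent length, which would be worth resolving before any concatenation.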


### *4. Fuse Poison-RAG-Plus (Text) and MMTF-14K (Audio & Visual) Datasets*

In [10]:
from popcorn.modalities.fuse_all import createMultimodalDF

# Modalities
modalitiesDict = {
  "text": textDF,
  "audio": audioDF,
  "visual": visualDF,
}

# Fuse
fusedDF, keep = createMultimodalDF(modalitiesDict)
if fusedDF is None:
    print("\n- [Error] Failed to create fused DataFrame!")

- Creating multimodal DataFrame from unimodal DataFrames ...
- Found 958 common items across modalities ...
- Created a Fused DataFrame ((958, 4)) with modalities: ['text', 'audio', 'visual'] ...
  item_id                                               text  \
0     370  [0.49536133, -0.8955078, 0.54248047, -0.454345...   
1    1209  [-0.057403564, -1.1630859, 0.0069847107, -0.27...   
2     590  [0.65722656, -0.99365234, 0.12805176, -0.00423...   

                                               audio  \
0  [-0.044065714, 0.0016181811, 0.032373417, 0.00...   
1  [-0.0009179845, 0.07858936, -0.04275679, -0.06...   
2  [-0.026329517, 0.049430598, -0.0036344617, -0....   

                                              visual  
0  [-3.6854, -1.81988, 1.8939, -4.27242, 1.65253,...  
1  [-2.4743, -0.419608, 2.4254, -2.18165, 1.20525...  
2  [-2.44992, -1.15379, 1.75277, -2.00129, 0.8737...  
- Final fused DataFrame has 958 items after combining all modalities ...


In [11]:
fusedDF.head(10)

Unnamed: 0,item_id,text,audio,visual,all
0,370,"[0.49536133, -0.8955078, 0.54248047, -0.454345...","[-0.044065714, 0.0016181811, 0.032373417, 0.00...","[-3.6854, -1.81988, 1.8939, -4.27242, 1.65253,...","[0.49536133, -0.8955078, 0.54248047, -0.454345..."
1,1209,"[-0.057403564, -1.1630859, 0.0069847107, -0.27...","[-0.0009179845, 0.07858936, -0.04275679, -0.06...","[-2.4743, -0.419608, 2.4254, -2.18165, 1.20525...","[-0.057403564, -1.1630859, 0.0069847107, -0.27..."
2,590,"[0.65722656, -0.99365234, 0.12805176, -0.00423...","[-0.026329517, 0.049430598, -0.0036344617, -0....","[-2.44992, -1.15379, 1.75277, -2.00129, 0.8737...","[0.65722656, -0.99365234, 0.12805176, -0.00423..."
3,1022,"[0.15856934, 0.08807373, -0.35083008, -0.44750...","[-0.023921229, 0.017592588, -0.0010983088, -0....","[-2.08853, -0.511596, 2.10196, -4.10757, -0.51...","[0.15856934, 0.08807373, -0.35083008, -0.44750..."
4,2226,"[0.40844727, -0.80029297, 0.46777344, -0.04315...","[-0.048151948, 0.0027557786, -0.06864824, -0.0...","[-3.04087, -1.63742, 1.81317, -4.04121, 1.2823...","[0.40844727, -0.80029297, 0.46777344, -0.04315..."
5,3363,"[-0.09069824, -0.51464844, 0.2536621, 0.007320...","[0.0060977587, 0.016961595, -0.028441846, -0.0...","[-2.62246, -0.241943, 2.71982, -3.51812, 0.421...","[-0.09069824, -0.51464844, 0.2536621, 0.007320..."
6,93,"[0.43481445, -0.4255371, -0.07397461, 0.468261...","[0.013952541, -0.015251389, 0.016110118, 0.040...","[-2.63712, -1.35229, 2.50686, -3.80382, 1.9525...","[0.43481445, -0.4255371, -0.07397461, 0.468261..."
7,53519,"[0.42163086, -0.92578125, -0.028442383, -0.320...","[-0.015700083, 0.00041778272, 0.10138887, -0.0...","[-1.18339, -1.21295, 3.69286, -3.33757, 1.6623...","[0.42163086, -0.92578125, -0.028442383, -0.320..."
8,1077,"[0.35327148, -0.78515625, 0.8022461, 0.2814941...","[0.04517092, -0.0014995754, 0.001469646, -0.02...","[-1.8034, -1.24653, 2.36966, -2.25848, 0.91729...","[0.35327148, -0.78515625, 0.8022461, 0.2814941..."
9,48516,"[0.37817383, -1.1611328, 0.5546875, -0.0334777...","[-0.07465356, -0.0042447243, 0.03698278, -0.01...","[-2.86588, -0.470954, 1.56624, -3.97172, 1.181...","[0.37817383, -1.1611328, 0.5546875, -0.0334777..."
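
In the output above, each `all` vector starts with the same values as the `text` vector of that row, which is consistent with concatenating the modalities in dictionary order for the items common to every modality. The implementation of `createMultimodalDF` is not shown in this notebook, so the following is only a sketch of that assumed behavior on toy data. The `fuse_modalities` function is hypothetical.

```python
import numpy as np
import pandas as pd

def fuse_modalities(modalities: dict):
    """Inner-join unimodal frames on item_id and concatenate their vectors into 'all'.

    Illustrative assumption only; not Popcorn's actual implementation.
    """
    names = list(modalities)
    fused = None
    for name in names:
        part = modalities[name][["item_id", name]].copy()
        part["item_id"] = part["item_id"].astype(str)
        fused = part if fused is None else fused.merge(part, on="item_id", how="inner")
    # Concatenate the per-modality vectors row by row, in dictionary order
    fused["all"] = [
        np.concatenate([np.asarray(v, dtype=float) for v in vecs]).tolist()
        for vecs in zip(*(fused[n] for n in names))
    ]
    return fused, set(fused["item_id"])

toy = {
    "text": pd.DataFrame({"item_id": ["1", "2", "3"],
                          "text": [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]}),
    "audio": pd.DataFrame({"item_id": ["2", "3", "4"],
                           "audio": [[1.0], [2.0], [3.0]]}),
    "visual": pd.DataFrame({"item_id": ["3", "2", "5"],
                            "visual": [[7.0, 8.0, 9.0], [4.0, 5.0, 6.0], [0.0, 0.0, 0.0]]}),
}

toy_fused, toy_keep = fuse_modalities(toy)
print(toy_fused)
print(toy_keep)
```

Only items "2" and "3" appear in all three toy frames, so the inner join keeps just those two rows, mirroring how the real run keeps 958 items out of 1,606 text and 1,807 audio and visual items.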


In [12]:
# Kept items
print(f'- Kept items:\n{keep}')

- Kept items:
{'1748', '4239', '1653', '2596', '1882', '5447', '3039', '2858', '78499', '1358', '3481', '2688', '252', '2501', '2716', '3745', '2140', '4957', '90439', '1304', '58293', '6281', '1805', '327', '1633', '1140', '53000', '232', '1042', '27831', '112512', '1372', '2554', '1387', '1537', '4963', '4848', '911', '1394', '2686', '265', '1097', '1895', '1617', '707', '4090', '1339', '2015', '2482', '1104', '2710', '3668', '3257', '3210', '1284', '96079', '1136', '3893', '100', '187', '1185', '3994', '3061', '3996', '1056', '1972', '2243', '420', '44195', '4448', '335', '4022', '1278', '2761', '95167', '3744', '4447', '3534', '1370', '4466', '2366', '4499', '3115', '475', '1120', '2959', '1876', '474', '316', '2087', '3097', '3107', '2145', '47099', '4440', '109374', '2634', '81847', '1099', '3977', '842', '1291', '2826', '59315', '833', '6145', '231', '1363', '3911', '458', '266', '2570', '2841', '381', '647', '52973', '2144', '1240', '114180', '3175', '466', '918', '1059', '180'
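
The printed `keep` set holds item IDs as strings. A typical next step is to restrict an interaction table to these items so that every rated item has a fused embedding. The sketch below assumes a hypothetical ratings DataFrame with integer `item_id` values, which is why the IDs are cast to `str` before matching.

```python
import pandas as pd

# Toy stand-ins: a few kept IDs (strings, as printed above) and a small ratings table
keep_toy = {"370", "1209", "590"}
ratings = pd.DataFrame({
    "user_id": [1, 1, 2, 3],
    "item_id": [370, 9999, 1209, 590],  # integer IDs, as is common in rating files
    "rating": [4.0, 3.0, 5.0, 2.5],
})

# Cast to str so the comparison matches the string IDs in the keep set
mask = ratings["item_id"].astype(str).isin(keep_toy)
filtered = ratings[mask].reset_index(drop=True)
print(filtered)
```

Without the cast, `isin` would compare integers against strings and silently match nothing.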