<a href="https://colab.research.google.com/github/RecSys-lab/Popcorn/blob/main/examples/colab/load_poison_rag_plus.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# **🍿 Popcorn Framework in Google Colab**
### **Load LLM-Augmented MovieLens (Poison-RAG-Plus)**

🎬 Popcorn Framework: [link](https://github.com/RecSys-lab/Popcorn)

🎬 Poison-RAG-Plus Dataset: [link](https://github.com/yasdel/Poison-RAG-Plus/tree/main/AttackData/Embeddings_from_Augmentation_Attack_Data/ml-latest-small)

## **[Step 1] Clone Popcorn Movie Recommender Tool**

Clone the framework into the Colab workspace (`/content`) and install it for the experiments.

⚠️ You might see a *"Restart Session"* warning during the first run in Google Colab due to library version mismatches. This is expected! Accept the restart, re-run this cell, and continue!

In [1]:
# Clone the repo
!git clone https://github.com/RecSys-lab/Popcorn.git

# Install the framework in editable mode
%cd Popcorn
!pip install -e .

# Add the repository to the Python path
import sys
sys.path.append('/content/Popcorn')

# Go back to the root
%cd ..

fatal: destination path 'Popcorn' already exists and is not an empty directory.
/content/Popcorn
Obtaining file:///content/Popcorn
  Preparing metadata (setup.py) ... done
Installing collected packages: Popcorn
  Attempting uninstall: Popcorn
    Found existing installation: Popcorn 1.6.0
    Uninstalling Popcorn-1.6.0:
      Successfully uninstalled Popcorn-1.6.0
  Running setup.py develop for Popcorn
Successfully installed Popcorn-1.6.0
/content
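
The `fatal: destination path 'Popcorn' already exists` line in the output above appears when the cell is re-run after a restart. One way to avoid it is to skip the clone when the folder is already present. The sketch below is only an illustration: `clone_if_missing` is a hypothetical helper (not part of Popcorn), and the demo uses a stand-in runner so that nothing is downloaded.

```python
import os
import subprocess
import tempfile

REPO_URL = "https://github.com/RecSys-lab/Popcorn.git"

def clone_if_missing(repo_url, dest, run=subprocess.run):
    """Clone `repo_url` into `dest` unless the `dest` folder already exists.

    Returns True if a clone was started, False if it was skipped.
    """
    if os.path.isdir(dest):
        print(f"'{dest}' already exists, skipping clone.")
        return False
    run(["git", "clone", repo_url, dest], check=True)
    return True

# Demo with a stand-in runner that only records the command it receives
calls = []
record = lambda cmd, check: calls.append(cmd)
with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "Popcorn")
    first = clone_if_missing(REPO_URL, target, run=record)   # folder missing
    os.mkdir(target)                                         # simulate a finished clone
    second = clone_if_missing(REPO_URL, target, run=record)  # folder present
print(first, second, len(calls))
```

In Colab, calling `clone_if_missing(REPO_URL, "Popcorn")` with the default runner would replace the `!git clone` line, assuming `git` is available on the path.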


## 🚀 **[Step 2] Use the Framework**

### *1. Load Configurations and Imports*

In [2]:
import os
import json
import pandas as pd
from popcorn.utils import readConfigs

# Start the Framework
print("Welcome to 'Popcorn' 🍿! Starting the framework for your movie recommendation ...\n")

# Read the configuration file
configs = readConfigs("Popcorn/popcorn/config/config.yml")
# Report an error if the configuration could not be read
if not configs:
    print("Error reading the configuration file!")

Welcome to 'Popcorn' 🍿! Starting the framework for your movie recommendation ...

- Reading the framework's configuration file ...
- Configuration file loaded successfully!


### *2. Load Poison-RAG-Plus - LLM Enriched Llama*

In [3]:
from popcorn.datasets.poison_rag_plus.loader import loadPoisonRagPlus

# Override Config Variables (Optional)
configs["datasets"]["unimodal"]["poison_rag_plus"]["llm"] = "llama"
configs["datasets"]["unimodal"]["poison_rag_plus"]["augmented"] = True

# Run the Loader
itemsTextDF = loadPoisonRagPlus(configs)
if itemsTextDF is not None:
    print(f"\n- itemsTextDF (shape: {itemsTextDF.shape}): \n{itemsTextDF.head()}")


- Preparing the 'Poison-RAG-Plus' dataset with 'llama'-driven enriched embeddings ...
-- Loading data from 'llama_enriched_description_part1.csv.gz' ...
-- Loading data from 'llama_enriched_description_part2.csv.gz' ...
-- Loading data from 'llama_enriched_description_part3.csv.gz' ...
-- Loading data from 'llama_enriched_description_part4.csv.gz' ...
-- Loading data from 'llama_enriched_description_part5.csv.gz' ...
- Finished loading 4 parts of textual enriched data with 1,606 items!

- itemsTextDF (shape: (1606, 2)): 
  item_id                                               text
0    1516  [0.80371094, -0.3852539, 0.6904297, -0.4497070...
1    5952  [-0.051330566, -0.8935547, -0.4013672, -0.5205...
2     370  [0.49536133, -0.8955078, 0.54248047, -0.454345...
3     292  [0.43847656, -0.7661133, 0.17468262, -0.252929...
4    1209  [-0.057403564, -1.1630859, 0.0069847107, -0.27...
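
The `text` column in the output above holds one embedding vector per item. A minimal sketch of how such cells might be converted to plain float lists and compared is given below. It assumes each cell is either a string such as `"[0.1, -0.2]"` or an already parsed sequence of numbers, which the notebook output does not confirm. `to_vector` and `cosine_similarity` are hypothetical helpers, not part of Popcorn, and the vectors are toy values rather than dataset rows.

```python
import json
import math

def to_vector(cell):
    """Return an embedding cell as a list of floats.

    Assumption: the cell is a JSON-style string like "[0.1, -0.2]" or
    already a sequence of numbers (list, tuple, NumPy array).
    """
    if isinstance(cell, str):
        cell = json.loads(cell)
    return [float(x) for x in cell]

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention chosen here for zero vectors
    return dot / (norm_a * norm_b)

# Toy stand-ins for two cells of the `text` column (not real dataset values)
first_vec = to_vector("[1.0, 0.0, 1.0]")
second_vec = to_vector([1.0, 1.0, 0.0])
similarity = cosine_similarity(first_vec, second_vec)
print(f"cosine similarity: {similarity:.3f}")  # prints: cosine similarity: 0.500
```

With the real DataFrame, something like `itemsTextDF["text"].map(to_vector)` should apply the same conversion row by row, provided the assumption about the cell format holds.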


### *3. Load Poison-RAG-Plus - Original (Raw) Llama*

In [4]:
from popcorn.datasets.poison_rag_plus.loader import loadPoisonRagPlus

# Override Config Variables (Optional)
configs["datasets"]["unimodal"]["poison_rag_plus"]["llm"] = "llama"
configs["datasets"]["unimodal"]["poison_rag_plus"]["augmented"] = False

# Run the Loader
itemsTextDF = loadPoisonRagPlus(configs)
if itemsTextDF is not None:
    print(f"\n- itemsTextDF (shape: {itemsTextDF.shape}): \n{itemsTextDF.head()}")


- Preparing the 'Poison-RAG-Plus' dataset with 'llama'-driven original embeddings ...
-- Loading data from 'llama_originalraw_combined_all_part1.csv.gz' ...
-- Loading data from 'llama_originalraw_combined_all_part2.csv.gz' ...
-- Loading data from 'llama_originalraw_combined_all_part3.csv.gz' ...
-- Loading data from 'llama_originalraw_combined_all_part4.csv.gz' ...
-- Loading data from 'llama_originalraw_combined_all_part5.csv.gz' ...
- Finished loading 4 parts of textual original data with 1,606 items!

- itemsTextDF (shape: (1606, 2)): 
  item_id                                               text
0    1516  [0.45874023, -0.53515625, 0.46606445, -0.27514...
1    5952  [0.50927734, -0.23547363, -0.37670898, -0.8081...
2     370  [0.7084961, -0.63427734, 0.35327148, -1.042968...
3     292  [0.296875, -0.99658203, 0.2993164, -0.24438477...
4    1209  [-0.0099487305, -0.8227539, 0.17565918, -0.463...


### *4. Load Poison-RAG-Plus - LLM Enriched OpenAI*

In [5]:
from popcorn.datasets.poison_rag_plus.loader import loadPoisonRagPlus

# Override Config Variables (Optional)
configs["datasets"]["unimodal"]["poison_rag_plus"]["llm"] = "openai"
configs["datasets"]["unimodal"]["poison_rag_plus"]["augmented"] = True

# Run the Loader
itemsTextDF = loadPoisonRagPlus(configs)
if itemsTextDF is not None:
    print(f"\n- itemsTextDF (shape: {itemsTextDF.shape}): \n{itemsTextDF.head()}")


- Preparing the 'Poison-RAG-Plus' dataset with 'openai'-driven enriched embeddings ...
-- Loading data from 'openai_enriched_description_part1.csv.gz' ...
-- Loading data from 'openai_enriched_description_part2.csv.gz' ...
-- Loading data from 'openai_enriched_description_part3.csv.gz' ...
-- Loading data from 'openai_enriched_description_part4.csv.gz' ...
- Finished loading 3 parts of textual enriched data with 1,606 items!

- itemsTextDF (shape: (1606, 2)): 
  item_id                                               text
0    1516  [-0.009714896, -0.024003807, -0.0416483, -0.02...
1    5952  [0.0024696812, -0.03361401, -0.019164726, -0.0...
2     370  [-0.0020823667, -0.027629452, 0.006294715, -0....
3     292  [-0.011372974, -0.038963087, -0.024515806, -0....
4    1209  [0.007154904, -0.025495825, -0.011659123, -0.0...


### *5. Load Poison-RAG-Plus - Original (Raw) OpenAI*

In [6]:
from popcorn.datasets.poison_rag_plus.loader import loadPoisonRagPlus

# Override Config Variables (Optional)
configs["datasets"]["unimodal"]["poison_rag_plus"]["llm"] = "openai"
configs["datasets"]["unimodal"]["poison_rag_plus"]["augmented"] = False

# Run the Loader
itemsTextDF = loadPoisonRagPlus(configs)
if itemsTextDF is not None:
    print(f"\n- itemsTextDF (shape: {itemsTextDF.shape}): \n{itemsTextDF.head()}")


- Preparing the 'Poison-RAG-Plus' dataset with 'openai'-driven original embeddings ...
-- Loading data from 'openai_originalraw_combined_all_part1.csv.gz' ...
-- Loading data from 'openai_originalraw_combined_all_part2.csv.gz' ...
-- Loading data from 'openai_originalraw_combined_all_part3.csv.gz' ...
-- Loading data from 'openai_originalraw_combined_all_part4.csv.gz' ...
- Finished loading 3 parts of textual original data with 1,606 items!

- itemsTextDF (shape: (1606, 2)): 
  item_id                                               text
0    1516  [-0.0027484058, -0.03802286, -0.015009024, -0....
1    5952  [-0.0061330223, -0.043979675, -0.038198307, -0...
2     370  [0.0098179635, -0.040216014, -0.013026823, -0....
3     292  [-0.0026131403, -0.06225721, -0.003234927, -0....
4    1209  [0.01185473, -0.022268554, -0.0034647249, -0.0...


### *6. Load Poison-RAG-Plus - LLM Enriched SentenceTransformer*

In [7]:
from popcorn.datasets.poison_rag_plus.loader import loadPoisonRagPlus

# Override Config Variables (Optional)
configs["datasets"]["unimodal"]["poison_rag_plus"]["llm"] = "st"
configs["datasets"]["unimodal"]["poison_rag_plus"]["augmented"] = True

# Run the Loader
itemsTextDF = loadPoisonRagPlus(configs)
if itemsTextDF is not None:
    print(f"\n- itemsTextDF (shape: {itemsTextDF.shape}): \n{itemsTextDF.head()}")


- Preparing the 'Poison-RAG-Plus' dataset with 'st'-driven enriched embeddings ...
-- Loading data from 'st_enriched_description_part1.csv.gz' ...
-- Loading data from 'st_enriched_description_part2.csv.gz' ...
- Finished loading 1 parts of textual enriched data with 1,606 items!

- itemsTextDF (shape: (1606, 2)): 
  item_id                                               text
0    1516  [-0.08001819, 0.017956821, -0.019217214, -0.04...
1    5952  [-0.18816385, -0.0593223, -0.06749434, -0.0184...
2     370  [-0.044479717, 0.037046302, -0.06883856, -0.09...
3     292  [-0.10242986, -0.11134509, -0.008446949, 0.018...
4    1209  [0.04929607, -0.021340176, 0.11121611, 0.04772...


### *7. Load Poison-RAG-Plus - Original (Raw) SentenceTransformer*

In [8]:
from popcorn.datasets.poison_rag_plus.loader import loadPoisonRagPlus

# Override Config Variables (Optional)
configs["datasets"]["unimodal"]["poison_rag_plus"]["llm"] = "st"
configs["datasets"]["unimodal"]["poison_rag_plus"]["augmented"] = False

# Run the Loader
itemsTextDF = loadPoisonRagPlus(configs)
if itemsTextDF is not None:
    print(f"\n- itemsTextDF (shape: {itemsTextDF.shape}): \n{itemsTextDF.head()}")


- Preparing the 'Poison-RAG-Plus' dataset with 'st'-driven original embeddings ...
-- Loading data from 'st_originalraw_combined_all_part1.csv.gz' ...
-- Loading data from 'st_originalraw_combined_all_part2.csv.gz' ...
- Finished loading 1 parts of textual original data with 1,606 items!

- itemsTextDF (shape: (1606, 2)): 
  item_id                                               text
0    1516  [0.040781662, -0.107335605, 0.022758601, -0.05...
1    5952  [0.05160862, -0.051424377, 0.044959094, -0.008...
2     370  [-0.055119, 0.06066133, -0.04100383, -0.088877...
3     292  [-0.0750024, -0.060989928, -0.0090702865, -0.0...
4    1209  [-0.028676523, -0.056740984, -0.04152333, 0.01...
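
The cells above reuse `itemsTextDF` for every variant, so each load overwrites the previous one. If all six variants are needed side by side, one possible approach is to loop over the `llm` and `augmented` values and keep the results in a dictionary. The sketch below is an illustration only: `load_all_variants` is a hypothetical helper, and a stand-in loader replaces `loadPoisonRagPlus` so the block runs without the framework installed.

```python
import copy
from itertools import product

LLMS = ("llama", "openai", "st")  # the values used in the cells above
AUGMENTED = (True, False)         # True: enriched, False: original (raw)

def load_all_variants(configs, loader):
    """Call `loader` once per (llm, augmented) pair and collect the results."""
    variants = {}
    for llm, augmented in product(LLMS, AUGMENTED):
        # Work on a copy so the caller's configuration is left untouched
        cfg = copy.deepcopy(configs)
        section = cfg["datasets"]["unimodal"]["poison_rag_plus"]
        section["llm"] = llm
        section["augmented"] = augmented
        variants[(llm, augmented)] = loader(cfg)
    return variants

def fake_loader(cfg):
    """Stand-in for the real loader: returns a label instead of a DataFrame."""
    section = cfg["datasets"]["unimodal"]["poison_rag_plus"]
    kind = "enriched" if section["augmented"] else "original"
    return f"{section['llm']}-{kind}"

demo_configs = {"datasets": {"unimodal": {"poison_rag_plus": {}}}}
variants = load_all_variants(demo_configs, fake_loader)
for key in sorted(variants):
    print(key, "->", variants[key])
```

In the notebook, `load_all_variants(configs, loadPoisonRagPlus)` would be the corresponding call, assuming the loader depends only on the two overridden keys and returns `None` on failure as the cells above suggest.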
