
#### Introduction
This section generates random shopping lists from an existing food product dataset (food_fact). The code below draws product codes from the dataset and builds a fixed number of simulated shopping lists, each holding a random selection of items. Because every code is sampled from the dataset itself, each generated list contains only valid products and can serve as input for testing and analyzing recommendation algorithms.

The goal is a standardized way to produce diverse shopping lists for evaluating recommendation methods, such as the algorithm-based (Step 7) and GPT-based systems. The lists vary in length (2 to 10 items each, across 20 lists, in the code below) to mimic the variability of real-world shopping behavior.

Random sampling combined with a basic check on the input data gives every recommendation system the same set of test inputs to be compared on. The code below does not fix a random seed, so each run produces different lists; consistency across experiments comes from reusing the saved CSV file.
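
If run-to-run reproducibility is wanted, one option is to draw from a seeded generator. The sketch below is not the notebook's code: the function name `generate_shopping_lists`, its parameters, and the toy codes are hypothetical, and it assumes that de-duplicating and sorting the codes before sampling is acceptable.

```python
import random


def generate_shopping_lists(codes, n_lists=20, min_items=2, max_items=10, seed=42):
    """Return n_lists lists of product codes, each sampled without replacement."""
    # private generator, so the global random state is left untouched
    rng = random.Random(seed)
    # de-duplicate and sort so the seed alone determines the output
    unique_codes = sorted(set(codes))
    if len(unique_codes) < max_items:
        raise ValueError("Not enough distinct codes to fill the largest list.")
    return [
        rng.sample(unique_codes, rng.randint(min_items, max_items))
        for _ in range(n_lists)
    ]


# toy codes standing in for food_fact_df['code'] (assumption for illustration)
demo_codes = [f"{n:013d}" for n in range(100)]
lists_a = generate_shopping_lists(demo_codes, seed=7)
lists_b = generate_shopping_lists(demo_codes, seed=7)
print(lists_a == lists_b)  # True: same seed, same lists
```

Calling the function with a different `seed` yields a different but equally repeatable set of lists.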


In [3]:
from google.colab import drive
drive.mount('/content/drive')

Mounted at /content/drive


In [4]:
import pandas as pd
import random

# load the food fact dataset
food_fact_file = 'drive/MyDrive/Colab Notebooks/data/food_fact_withUnhealthScore_withNOVA_withCategory.csv'
food_fact_df = pd.read_csv(food_fact_file)

if 'code' not in food_fact_df.columns:
    raise ValueError("The 'code' column is missing in the food_fact file.")

# collect the available codes; drop missing values BEFORE casting to str,
# otherwise NaN becomes the literal string 'nan' and survives dropna()
available_codes = food_fact_df['code'].dropna().astype(str).tolist()
food_fact_df['code'] = food_fact_df['code'].astype(str)

# collect the generated shopping lists here
random_shopping_lists = []

# build the random shopping lists
for _ in range(20):  # create 20 lists
    num_items = random.randint(2, 10)  # each list contains 2-10 random codes (inclusive)
    random_codes = random.sample(available_codes, num_items)
    shopping_list = {
        'product_ids': random_codes
    }
    random_shopping_lists.append(shopping_list)

# convert to a DataFrame (one row per shopping list)
shopping_df = pd.DataFrame(random_shopping_lists)

# write to CSV
output_file = 'drive/MyDrive/Colab Notebooks/data/random_shopping_lists.csv'
shopping_df.to_csv(output_file, index=False)

print(f"Random shopping lists saved to: {output_file}")

Random shopping lists saved to: drive/MyDrive/Colab Notebooks/data/random_shopping_lists.csv
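
Because `product_ids` holds Python lists, `to_csv` writes each one as its string representation, and a later `read_csv` returns strings rather than lists. A minimal sketch of one way to restore them, using a toy frame and an in-memory buffer in place of the real file (both are assumptions for illustration):

```python
import ast
import io

import pandas as pd

# toy frame standing in for shopping_df
toy_df = pd.DataFrame({'product_ids': [['111', '222'], ['333', '444', '555']]})

# round-trip through CSV text in memory instead of a file on Drive
buffer = io.StringIO()
toy_df.to_csv(buffer, index=False)
buffer.seek(0)

restored_df = pd.read_csv(buffer)
# each cell is now a string such as "['111', '222']"; parse it back into a list
restored_df['product_ids'] = restored_df['product_ids'].apply(ast.literal_eval)
print(restored_df['product_ids'].tolist())
```

`ast.literal_eval` only evaluates Python literals, which makes it a safer choice than `eval` for this kind of parsing.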


