# Generate Odoo Dataset with AI Descriptions on Kaggle

This notebook runs the dataset generation task as a background job on Kaggle, where a committed session can run for up to 12 hours.

### Instructions:
1. In the Notebook settings (right sidebar), turn on **Internet**.
2. In the Notebook settings, set Accelerator to **GPU T4 x2**.
3. Click **Save Version** -> **Save & Run All (Commit)** to run this in the background.

In [None]:
# Step 1: Install Ollama and Python Dependencies
# Kaggle's base image may not include everything we need, so install Ollama and the Python packages explicitly
!curl -fsSL https://ollama.com/install.sh | sh
!pip install pandas openpyxl requests tqdm openai

In [None]:
# Step 2: Start Ollama Server in the background
import subprocess
import time

# Start Ollama server
process = subprocess.Popen(["ollama", "serve"], stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
time.sleep(10)  # Give it a moment to start
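
A fixed 10-second sleep can be too short on a slow start and wasteful on a fast one. As a minimal sketch, the wait could instead poll the server until it answers. This assumes Ollama listens on its default address `http://localhost:11434`; the helper names `ollama_is_up` and `wait_for_server` are hypothetical and not part of Ollama.

```python
import time
import urllib.error
import urllib.request


def ollama_is_up(base_url="http://localhost:11434"):
    """Return True if an HTTP server answers at base_url with status 200.

    Assumption: Ollama is listening on its default port, 11434.
    """
    try:
        with urllib.request.urlopen(base_url, timeout=2) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, and similar: the server is not ready yet.
        return False


def wait_for_server(probe=ollama_is_up, timeout=60.0, interval=1.0):
    """Call probe() until it returns True or timeout seconds have elapsed."""
    deadline = time.monotonic() + timeout
    while True:
        if probe():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

In the notebook this would replace the sleep, for example `if not wait_for_server(): raise RuntimeError("Ollama did not start")`, so a failed start stops the run early instead of surfacing later in the generation step.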

In [None]:
# Step 3: Pull the Llama 3 model (this downloads the weights; the Kaggle GPU is used later, at inference time)
!ollama pull llama3
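
Since a background run gives little feedback, it may help to confirm that the pull succeeded before starting a long job. The sketch below asks the running server for its local models through the `/api/tags` endpoint and checks for the expected name. It assumes the default address `http://localhost:11434` and that an untagged pull is stored as `llama3:latest`; `list_local_models` and `has_model` are hypothetical helper names.

```python
import json
import urllib.request


def list_local_models(base_url="http://localhost:11434"):
    """Fetch the names of locally available models from a running Ollama server."""
    with urllib.request.urlopen(f"{base_url}/api/tags", timeout=10) as response:
        payload = json.load(response)
    return [entry["name"] for entry in payload.get("models", [])]


def has_model(model_names, wanted):
    """True if wanted is listed; a name without a tag is treated as ':latest'."""
    if ":" not in wanted:
        wanted = f"{wanted}:latest"
    return wanted in model_names
```

A possible use after the pull is `assert has_model(list_local_models(), "llama3")`, which fails immediately if the download did not complete.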

In [None]:
# Step 4: Download Odoo 19.0 Source Code
# We clone it directly to the temporary working directory
!git clone --depth 1 --branch 19.0 https://github.com/odoo/odoo.git odoo-19.0

In [None]:
# Step 5: Download the Generation Script
# Replace the placeholder URL in the example below with the raw GitHub URL of your generate_odoo_dataset.py once you have pushed it.
# You could also upload the script as a dataset or paste it in, but downloading is the easiest to automate.
# Example: !wget https://raw.githubusercontent.com/YOUR_USER/YOUR_REPO/main/generate_odoo_dataset.py

# If you add the script as a Kaggle Utility Script or upload it, adjust the path accordingly.
# This template assumes you clone the repo that contains the script.

# OPTION A: Clone your repo (Best)
# !git clone https://github.com/YOUR_GITHUB_USERNAME/odoo_dataset_creation.git my_repo
# script_path = "my_repo/generate_odoo_dataset.py"

# OPTION B: Upload script manually to Kaggle input (Files -> Add Data -> Upload)
# script_path = "/kaggle/input/odoo-script/generate_odoo_dataset.py"

# PLACEHOLDER: This cell only prints a reminder. Paste the script here if you want the notebook to work standalone,
# or (more likely) replace this cell with the git clone command for your repo.

print("Please configure Step 5 to point to your script!")
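
Because the script can end up in different places depending on which option is used, one way to avoid a mismatch later is to look the script up instead of hard-coding a single path. This is a minimal sketch: the candidate paths are the ones named in Options A and B above, and `find_script` is a hypothetical helper, not part of the generation script.

```python
from pathlib import Path

# Candidate locations taken from Options A and B; adjust them to your setup.
CANDIDATES = [
    "my_repo/generate_odoo_dataset.py",
    "/kaggle/input/odoo-script/generate_odoo_dataset.py",
]


def find_script(candidates):
    """Return the first candidate path that exists as a file, else None."""
    for candidate in candidates:
        path = Path(candidate)
        if path.is_file():
            return str(path)
    return None
```

For example, `script_path = find_script(CANDIDATES)` followed by a check for `None` would stop the notebook here if neither option has been configured.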

In [None]:
# Step 6: Run the Generation Script

# Kaggle Output Directory
output_dir = "/kaggle/working"
odoo_source = "/kaggle/working/odoo-19.0"
output_file = f"{output_dir}/odoo_19_full_dataset_ai.xlsx"

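# Path to the generation script; it must match what you set up in Step 5.
# The default below assumes the repo was cloned into ./odoo_dataset_creation.
# If you used Option A as written, use "my_repo/generate_odoo_dataset.py" instead.
script_path = "./odoo_dataset_creation/generate_odoo_dataset.py"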

# Run the script (Full Run)
# Note: in a Kaggle background run, real-time logs are hard to follow, but the output file is still saved to /kaggle/working.
!python3 "$script_path" "$odoo_source" "$output_file" --ollama-model llama3
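
Since logs are hard to follow during a background run, an alternative is to launch the script from Python and write its output to a log file next to the dataset. The sketch below assumes the same positional arguments and `--ollama-model` flag as the command above; `build_command` and `run_with_log` are hypothetical helpers, and `sys.executable` stands in for `python3`.

```python
import subprocess
import sys


def build_command(script_path, odoo_source, output_file, model="llama3"):
    """Assemble the same arguments that the shell command in Step 6 passes."""
    return [sys.executable, script_path, odoo_source, output_file,
            "--ollama-model", model]


def run_with_log(command, log_path):
    """Run command, writing stdout and stderr to log_path; return the exit code."""
    with open(log_path, "w") as log_file:
        completed = subprocess.run(command, stdout=log_file,
                                   stderr=subprocess.STDOUT)
    return completed.returncode
```

A call such as `run_with_log(build_command(script_path, odoo_source, output_file), f"{output_dir}/generation.log")` would leave the log in `/kaggle/working` alongside the dataset, and a non-zero return value indicates that the script failed.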