# Generate Odoo Dataset with AI Descriptions (Kaggle Verified)

This notebook runs a background job that generates an Odoo 19.0 dataset with AI-enhanced descriptions, using a local LLM (Llama 3 via Ollama).

### **Prerequisites (Before Running)**
1. **Internet Access**: Enable "Internet" in the Notebook Settings (Right Panel).
2. **GPU Accelerator**: Select **GPU T4 x2** (or GPU P100) in the Notebook Settings. 
   * *Do NOT use TPU.*

### **How to Run Unattended (Background Mode)**
1. Click **"Save Version"** (Top Right).
2. Select **"Save & Run All (Commit)"**.
3. Click **Save**.


In [None]:
# Step 1: Install & Start Ollama (With Health Check)
# Everything runs in a single cell so the server and later commands share one environment.

# 1. Install Dependencies
!apt-get update && apt-get install -y zstd pciutils lshw

# 2. Install Ollama
!curl -fsSL https://ollama.com/install.sh | sh

# 3. Install Python Deps
!pip install pandas openpyxl requests tqdm openai

# 4. Start the server as a background process (output goes to ollama.log)
print("Starting Ollama Server...")
get_ipython().system_raw('ollama serve > ollama.log 2>&1 &')

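# 5. Health check (crucial): poll the server up to 20 times, 2 seconds apart
import time
import requests
print("Waiting for Ollama to start...")
for i in range(20):
    try:
        resp = requests.get("http://localhost:11434", timeout=5)
        if resp.status_code == 200:
            print("Ollama is UP and Running!")
            break
    except requests.exceptions.RequestException:
        # Server is not accepting connections yet; retry after the sleep below
        pass
    time.sleep(2)
else:
    # for/else: this branch runs only if the loop never reached `break`
    print("ERROR: Ollama failed to start. Printing logs:")
    !cat ollama.log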

print("Pulling Llama 3 (This takes 2-3 minutes)...")
!ollama pull llama3
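
Before moving on, it can help to confirm that the pull actually left a usable model on the server. The sketch below is an optional addition, not part of the original notebook. It assumes the Ollama server from Step 1 is listening on `http://localhost:11434` and that its `/api/tags` endpoint returns a JSON object with a `models` list whose entries carry a `name` field. The helper names (`model_names`, `has_model`, `check_model`) are hypothetical.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434"  # same address the health check polls


def model_names(tags_payload):
    """Extract model names from the parsed JSON body of GET /api/tags."""
    return [m.get("name", "") for m in tags_payload.get("models", [])]


def has_model(tags_payload, wanted="llama3"):
    """True if a local model is named `wanted` or is a tag of it (e.g. llama3:latest)."""
    return any(
        name == wanted or name.startswith(wanted + ":")
        for name in model_names(tags_payload)
    )


def check_model(wanted="llama3", base_url=OLLAMA_URL):
    """Ask the running server for its local models and look for `wanted`."""
    with urllib.request.urlopen(f"{base_url}/api/tags", timeout=10) as resp:
        payload = json.load(resp)
    return has_model(payload, wanted)
```

In the notebook, calling `check_model()` right after the pull should return `True`. If it returns `False`, the `ollama pull` output above is the first place to look.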

In [None]:
# Step 2: Clone Resources
# 1. Clone Odoo 19.0 Source Code
print("Cloning Odoo 19.0 source code...")
!git clone --depth 1 --branch 19.0 https://github.com/odoo/odoo.git odoo-19.0

# 2. Clone the Generation Script from your Repo
print("Cloning generation script...")
!rm -rf odoo_dataset_creation_technical # Cleanup if exists
!git clone https://github.com/midhlaj-nk/odoo_dataset_creation_technical.git odoo_dataset_creation_technical
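
A failed `git clone` only prints an error and lets the notebook continue, which is easy to miss in an unattended run. As a hedged sketch (not part of the original notebook), a small check can list the expected paths that are missing. It assumes the cell runs from the directory the clones were made in, and `missing_paths` is a hypothetical helper name.

```python
import os


def missing_paths(paths):
    """Return the subset of `paths` that does not exist on disk, in input order."""
    return [p for p in paths if not os.path.exists(p)]


# Paths Step 2 is expected to create, relative to the current working directory
EXPECTED = [
    "odoo-19.0",
    os.path.join("odoo_dataset_creation_technical", "generate_odoo_dataset.py"),
]
```

Running `missing_paths(EXPECTED)` after the clones should give an empty list. Any entries it returns point to the clone that needs to be retried.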

In [None]:
# Step 3: Run the Generation Script

import os

# Define Paths
working_dir = "/kaggle/working"
script_path = f"{working_dir}/odoo_dataset_creation_technical/generate_odoo_dataset.py"
odoo_source = f"{working_dir}/odoo-19.0"
output_file = f"{working_dir}/odoo_19_full_dataset_ai.xlsx"

# Reset the checkpoint to force a fresh run (optional: uncomment the next line)
# !rm -f /kaggle/working/*.checkpoint.csv

print("Starting Dataset Generation...")
print(f"Script: {script_path}")

# Run python script
!python3 "$script_path" "$odoo_source" "$output_file" --ollama-model llama3

print("DONE! Check the Output section of this notebook version to download your file.")
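
The final message prints even if the script exited early, so a quick look at the output file is a useful last step. The sketch below is an optional addition under that assumption. It only checks that the file exists and reports its size, and `summarize_output` is a hypothetical helper name.

```python
import os


def summarize_output(path):
    """Return (exists, size_in_bytes) for `path`; size is 0 when the file is missing."""
    if not os.path.isfile(path):
        return (False, 0)
    return (True, os.path.getsize(path))
```

In the notebook, `summarize_output(output_file)` should report `True` and a non-zero size after a successful run. For a closer look, `pandas.read_excel(output_file)` loads the workbook using the `openpyxl` package installed in Step 1. The columns it contains depend on the generation script and are not assumed here.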