# **Honegumi RAG Assistant: A Google Colab Tutorial**
**Agentic Code Generation for Bayesian Optimization**

### **Purpose of This Tutorial**
This tutorial demonstrates how to use [**Honegumi RAG Assistant**](https://github.com/hasan-sayeed/honegumi_rag_assistant), an intelligent agentic AI system that automatically generates high-quality, executable Python code for Bayesian optimization experiments. With just a natural language problem description, Honegumi RAG Assistant can:

- Interpret your optimization problem and extract parameters automatically
- Generate deterministic code skeletons using [Honegumi](https://honegumi.readthedocs.io)
- Retrieve relevant [Ax Platform](https://ax.dev/) documentation to enhance code generation
- Produce complete, ready-to-run Python code tailored to your specific requirements
- Optionally review and refine the generated code

By following this tutorial, you'll learn how to set up Honegumi RAG Assistant in Google Colab, describe your optimization problem, build a vector store for documentation retrieval, and execute the agentic pipeline to generate ready-to-run Bayesian optimization code.

> This automation offers a powerful **starting point for optimization engineers and researchers**, helping them **move faster**, explore ideas more effectively, and focus on science rather than boilerplate code.

## **Step 1. Install Required Packages**

Installs the `honegumi-rag-assistant` package. Output is suppressed for a cleaner notebook experience.

In [None]:
!pip install -q honegumi-rag-assistant
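
To confirm that the install succeeded before moving on, a check along these lines may help. It is a minimal sketch that uses only the standard library; `installed_version` is a hypothetical helper name introduced here, not part of the package.

```python
from importlib import metadata


def installed_version(package_name):
    """Return the installed version string, or None if the package is missing."""
    try:
        return metadata.version(package_name)
    except metadata.PackageNotFoundError:
        return None


version = installed_version("honegumi-rag-assistant")
print(version if version else "honegumi-rag-assistant is not installed")
```

If the cell reports that the package is missing, rerunning the install cell (and restarting the runtime if Colab asks for it) is usually the first thing to try.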

## **Step 2. Mount Google Drive**
Allows the notebook to save the vector store and generated code to your Google Drive.

In [None]:
from google.colab import drive
drive.mount('/content/drive')

Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).


## **Step 3. Set Your API Keys**
Honegumi RAG Assistant uses two API keys. Only the OpenAI key is required:

- `OPENAI_API_KEY` – for accessing GPT models (e.g., GPT-5, GPT-4o)
- `LANGCHAIN_API_KEY` – for logging execution traces to LangSmith (optional but recommended)

Choose one of the following methods:

### **Option A: Set Environment Variables Directly**
Replace the placeholders with your actual keys.

In [None]:
%env OPENAI_API_KEY=sk-...
%env LANGCHAIN_API_KEY=lsv2_...

### **Option B: Use the Colab Secrets Sidebar**
1. In the left Secrets tab (🔑), add two secrets with exact names:
   - `OPENAI_API_KEY`
   - `LANGCHAIN_API_KEY`

2. Then run the following to inject them into your environment:

In [None]:
from google.colab import userdata
import os

# Copy each secret added in the UI into the process environment
for key in ("OPENAI_API_KEY", "LANGCHAIN_API_KEY"):
    try:
        val = userdata.get(key)  # Look up the secret by name
    except userdata.SecretNotFoundError:
        continue  # Secret not defined (e.g., the optional LangSmith key)
    if val:
        os.environ[key] = val  # Inject into the process env

This makes your API keys available to Honegumi RAG Assistant without hardcoding them into the notebook.
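
Whichever option you choose, it can be worth verifying that the keys actually reached the environment before running the pipeline. The sketch below is one way to do that; `missing_keys` and `mask` are hypothetical helper names, and treating only `OPENAI_API_KEY` as required follows the list above.

```python
import os


def missing_keys(required=("OPENAI_API_KEY",), env=None):
    """Return the names of required keys that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in required if not env.get(name)]


def mask(value, visible=4):
    """Show only the first few characters of a secret for safe printing."""
    return value[:visible] + "..." if value else "<not set>"


# Print a masked summary so full keys never appear in the notebook output
for name in ("OPENAI_API_KEY", "LANGCHAIN_API_KEY"):
    print(f"{name}: {mask(os.environ.get(name, ''))}")

if missing_keys():
    print("Warning: OPENAI_API_KEY is not set; code generation will likely fail.")
```

Masking the values keeps the check safe to leave in a notebook that is later shared.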

## **Step 4. Build Vector Store (One-Time Setup)**

The vector store contains embeddings of Ax Platform documentation, enabling the assistant to retrieve relevant context when generating code.

**Note:** The build script defaults to Ax v0.4.3 (matching the Honegumi dependency), so the retrieved documentation matches the API version used in the generated code.

In [None]:
# Set the path where the vector store will be saved
VECTORSTORE_PATH = "/content/drive/MyDrive/honegumi_data/ax_docs_vectorstore"

# Build the vector store using the installed package
from honegumi_rag_assistant.build_vector_store import main
import sys

# Pass arguments to the build script
sys.argv = ['build_vector_store.py', '--output', VECTORSTORE_PATH]
main()
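
Because this is a one-time setup, you may want to skip the build when a vector store is already present on your Drive. The following is a minimal sketch of such a guard. It assumes the store is saved as a non-empty directory at the given path, which is an assumption about the on-disk layout rather than something the package documents here; `vectorstore_exists` and `build_if_missing` are hypothetical helper names.

```python
import os


def vectorstore_exists(path):
    """Return True if `path` is a directory that contains at least one entry."""
    return os.path.isdir(path) and len(os.listdir(path)) > 0


def build_if_missing(path, build):
    """Call `build(path)` only when no vector store is found at `path`.

    `build` is any callable that creates the store, for example a thin
    wrapper around the build script's `main()` used in the cell above.
    Returns True if a build was triggered, False if it was skipped.
    """
    if vectorstore_exists(path):
        print(f"Vector store found at {path}; skipping rebuild.")
        return False
    build(path)
    return True
```

In the notebook, `build` could be a small function that sets `sys.argv` as in the cell above and then calls `main()`.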

## **Step 5. Set Vector Store Path**

Configure environment variables and reload settings to ensure the assistant can find the vector store.

In [None]:
import os

# Set environment variables
os.environ['AX_DOCS_VECTORSTORE_PATH'] = "/content/drive/MyDrive/honegumi_data/ax_docs_vectorstore"
os.environ['OUTPUT_DIR'] = "/content/drive/MyDrive/honegumi_data/honegumi_outputs"

# IMPORTANT: Reload settings so they pick up the new environment variables,
# even if the package was already imported earlier in this session
try:
    from honegumi_rag_assistant.app_config import settings
    settings.reload_from_env()
    print(f"✓ Vector store path set to: {settings.retrieval_vectorstore_path}")
except ImportError:
    # Import failed, which usually means the package is not installed (see Step 1)
    print("Could not import honegumi_rag_assistant; run Step 1 first.")
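
The two paths above share a base folder, so they can also be derived from a single location. The sketch below shows one way to do that and creates the output folder up front. Whether the assistant creates that folder on its own is not stated here, so the `os.makedirs` call is a precaution; `configure_paths` is a hypothetical helper name.

```python
import os


def configure_paths(base_dir):
    """Set the assistant's path variables under one base folder.

    Creates the output folder if needed and returns both paths.
    """
    vectorstore = os.path.join(base_dir, "ax_docs_vectorstore")
    outputs = os.path.join(base_dir, "honegumi_outputs")
    os.makedirs(outputs, exist_ok=True)  # safe to call repeatedly
    os.environ["AX_DOCS_VECTORSTORE_PATH"] = vectorstore
    os.environ["OUTPUT_DIR"] = outputs
    return vectorstore, outputs
```

In Colab this would be called as `configure_paths("/content/drive/MyDrive/honegumi_data")`, followed by the settings reload shown above.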

## **Step 6. Run Honegumi RAG Assistant**

Now you can describe your optimization problem and let the assistant generate code. An example run is shown in the cell output below.

Use the Python API directly within the notebook:

In [None]:
from honegumi_rag_assistant.orchestrator import run_from_text

# Describe your optimization problem
problem = """
Optimize temperature (50-200°C) and pressure (1-10 bar) for maximum yield
in a chemical reaction.
"""

# Generate code (streaming enabled by default)
code = run_from_text(
    problem,
    output_dir="/content/drive/MyDrive/honegumi_data/honegumi_outputs",  # Set to None to skip saving
    debug=False,  # Set to True for detailed logging
    enable_review=False  # Set to True for code review (slower but more accurate)
)

print("\n" + "="*80)
print("GENERATED CODE")
print("="*80)
print(code)

Analyzing problem and selecting optimization parameters...
Generating code skeleton using Honegumi...
Planning retrieval strategy...
Skeleton is sufficient, skipping retrieval

GENERATED CODE
# Generated by Honegumi (https://arxiv.org/abs/2502.06815)
# %pip install ax-platform==0.4.3 matplotlib
import numpy as np
import pandas as pd
from ax.service.ax_client import AxClient, ObjectiveProperties
import matplotlib.pyplot as plt


obj1_name = "branin"


def branin(x1, x2):
    y = float(
        (x2 - 5.1 / (4 * np.pi**2) * x1**2 + 5.0 / np.pi * x1 - 6.0) ** 2
        + 10 * (1 - 1.0 / (8 * np.pi)) * np.cos(x1)
        + 10
    )

    return y


ax_client = AxClient()

ax_client.create_experiment(
    parameters=[
        {"name": "x1", "type": "range", "bounds": [-5.0, 10.0]},
        {"name": "x2", "type": "range", "bounds": [0.0, 10.0]},
    ],
    objectives={
        obj1_name: ObjectiveProperties(minimize=True),
    },
)


for i in range(19):

    parameterization, trial_index = ax_
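
Since `run_from_text` returns the generated script as a string, a quick syntax check before executing it can catch incomplete or malformed output early. This is a minimal sketch using the standard-library `ast` module; `check_syntax` is a hypothetical helper name, and the check only confirms that the text parses as Python, not that the optimization logic is correct.

```python
import ast


def check_syntax(source):
    """Return (True, None) if `source` parses as Python, else (False, message)."""
    try:
        ast.parse(source)
        return True, None
    except SyntaxError as err:
        return False, f"line {err.lineno}: {err.msg}"


# Small demonstration on two snippets: one complete, one cut off mid-loop
for snippet in ("x = 1\n", "for i in range(3):\n"):
    ok, message = check_syntax(snippet)
    print("valid" if ok else f"invalid ({message})")
```

In the notebook, calling `check_syntax(code)` on the string returned in Step 6 gives the same kind of result.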

## **Step 7. View Generated Code**

If you set `output_dir` in Step 6 (`--output-dir` when using the command line), navigate to your Google Drive folder to find:
- `honegumi_generated_<hash>.py` - The complete Python script for your optimization problem

You can also trace the assistant's reasoning in your [LangSmith dashboard](https://smith.langchain.com/).
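
To locate the most recent script without leaving the notebook, you could search the output folder for files matching the naming pattern above. The sketch below picks the newest match by modification time; `latest_generated_script` is a hypothetical helper name.

```python
import glob
import os


def latest_generated_script(output_dir):
    """Return the newest honegumi_generated_*.py in `output_dir`, or None."""
    pattern = os.path.join(output_dir, "honegumi_generated_*.py")
    matches = glob.glob(pattern)
    if not matches:
        return None
    # The most recently modified file is assumed to be the latest run
    return max(matches, key=os.path.getmtime)
```

Calling it with the `output_dir` used in Step 6 returns the path of the latest script, or `None` if nothing has been saved yet.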

## **Example Problems to Try**

Here are some example optimization problems you can try:

### **1. Chemical Process Optimization**
```
Optimize temperature (100-300°C), pressure (1-5 bar), and catalyst concentration (0.1-1.0 M)
to maximize conversion rate in a catalytic reaction.
```

### **2. Materials Design**
```
Optimize composition of a polymer blend: Component A (0-100%), Component B (0-100%),
and curing temperature (80-150°C) to maximize tensile strength while minimizing cost.
```

### **3. Machine Learning Hyperparameters**
```
Optimize neural network hyperparameters: learning rate (1e-5 to 1e-1),
batch size (16 to 256), and dropout rate (0.1 to 0.5) to maximize validation accuracy.
```

### **4. Pharmaceutical Formulation**
```
Optimize drug formulation: API concentration (5-20 mg/mL), pH (4-8),
and excipient ratio (0.5-2.0) to maximize bioavailability and minimize side effects.
```
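
To try several of these problems in one go, a small batch loop can help. The sketch below is one possible shape for it. `generate_all` is a hypothetical helper, and the generator is passed in as a callable so that the loop itself does not depend on the package. Recording failures per problem, rather than stopping the batch, is a design choice of this sketch.

```python
def generate_all(problems, generate):
    """Run `generate` on each problem description and collect the results.

    `problems` maps a short label to a natural-language description.
    `generate` is any callable that takes a description and returns code.
    Failures are recorded per problem instead of stopping the whole batch.
    """
    results = {}
    for label, description in problems.items():
        try:
            results[label] = generate(description.strip())
        except Exception as err:  # keep going; report the failure per problem
            results[label] = f"# generation failed: {err}"
    return results


problems = {
    "chemical": """
        Optimize temperature (100-300°C), pressure (1-5 bar), and catalyst
        concentration (0.1-1.0 M) to maximize conversion rate in a catalytic reaction.
    """,
    "ml": """
        Optimize neural network hyperparameters: learning rate (1e-5 to 1e-1),
        batch size (16 to 256), and dropout rate (0.1 to 0.5) to maximize
        validation accuracy.
    """,
}
```

In the notebook, `generate` could be a lambda that wraps `run_from_text` from Step 6, for example `lambda text: run_from_text(text, output_dir=None)`.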