## Overview

This notebook demonstrates how to use the [llama-prompt-ops](https://github.com/meta-llama/llama-prompt-ops) tool with the [HotpotQA example](https://github.com/meta-llama/llama-prompt-ops/tree/main/use-cases/hotpotqa) in a Databricks environment.

It covers:

**1. Tool Setup**

**2. Data Preparation**

**3. Config file with Databricks URL and model**

**4. Execute llama-prompt-ops migrate**




### 1. Tool Setup

In [0]:

# Install llama-prompt-ops
!pip install llama-prompt-ops
dbutils.library.restartPython()

Note: you may need to restart the kernel using %restart_python or dbutils.library.restartPython() to use updated packages.


### 2. Data Preparation

In [0]:
# Download the sample dataset
!curl -o hotpotqa_sample.json http://curtis.ml.cmu.edu/datasets/hotpot/hotpot_dev_distractor_v1.json



In [0]:
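# Take the first 100 samples as a subset
import json

# Use a context manager so the file is closed, and avoid shadowing the built-in input()
with open("hotpotqa_sample.json") as infile:
    records = json.load(infile)
print(len(records))

records_subset = records[:100]
with open("hotpotqa_sample_subset.json", "w") as outfile:
    json.dump(records_subset, outfile)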

7405
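
Before running the optimizer, it can help to confirm that the subset records carry the fields you expect. The following is a minimal sketch, not part of llama-prompt-ops: the helper name `summarize_records` is hypothetical, and the field list assumes the public HotpotQA distractor layout (`question`, `answer`, `context`, `supporting_facts`). Which of these fields the dataset adapter actually reads is not shown in this notebook, so adjust the list as needed.

```python
from collections import Counter

# Assumed field names, based on the public HotpotQA distractor format.
REQUIRED_FIELDS = ("question", "answer", "context", "supporting_facts")


def summarize_records(records, required=REQUIRED_FIELDS):
    """Return the record count and, per required field, how many records lack it."""
    missing = Counter()
    for record in records:
        for field in required:
            if field not in record:
                missing[field] += 1
    return {"count": len(records), "missing": dict(missing)}


# Tiny synthetic example: the second record lacks two of the assumed fields.
sample = [
    {"question": "q1", "answer": "a1", "context": [], "supporting_facts": []},
    {"question": "q2", "answer": "a2"},
]
summary = summarize_records(sample)
print(summary)
```

To check the real data, call the helper on the list loaded from `hotpotqa_sample_subset.json`; an empty `missing` dictionary means every record has all of the assumed fields.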


### 3. Create Config
- Create the configuration file (`hotpotqa_databricks.yaml`).
- Set your Databricks base URL and model name in it.
- Load the Databricks API token from a `.env` file, as in the cell below.

In [0]:
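import os
from dotenv import load_dotenv

# load_dotenv() copies DATABRICKS_API_TOKEN from .env into os.environ
load_dotenv()
# Fail early with a clear message instead of a TypeError when the token is missing
assert os.getenv("DATABRICKS_API_TOKEN"), "DATABRICKS_API_TOKEN is not set"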
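
Since the `migrate` command later reads the token from the environment variable named by `--api-key-env`, it can be useful to confirm which token was loaded without printing it in full. The sketch below is illustrative only: `require_env` and `mask_secret` are hypothetical helpers, not part of llama-prompt-ops or python-dotenv.

```python
import os


def require_env(name):
    """Return the value of an environment variable, or raise if it is unset or blank."""
    value = os.environ.get(name, "").strip()
    if not value:
        raise RuntimeError(f"Environment variable {name} is not set")
    return value


def mask_secret(value, visible=4):
    """Replace all but the last `visible` characters with asterisks."""
    if len(value) <= visible:
        return "*" * len(value)
    return "*" * (len(value) - visible) + value[-visible:]
```

After `load_dotenv()`, a call such as `print(mask_secret(require_env("DATABRICKS_API_TOKEN")))` shows only the last four characters, which is usually enough to tell tokens apart while keeping the secret out of the notebook output.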

### 4. Execute llama-prompt-ops migrate

In [0]:
!llama-prompt-ops migrate --config hotpotqa_databricks.yaml --api-key-env DATABRICKS_API_TOKEN --log-level CRITICAL

Loaded environment variables from .env
Loaded configuration from hotpotqa_databricks.yaml
Using model with DSPy: databricks/databricks-llama-4-maverick
Using the same model for task and proposer: databricks/databricks-llama-4-maverick
Using metric: HotpotQAMetric
Resolved relative dataset path to: /Workspace/Users/anusha_7777@yahoo.com/databricks-prompt-ops-example/hotpotqa_sample_subset.json
Using dataset adapter: HotpotQAAdapter
Auto-detected LlamaStrategy for model: databricks-llama-4-maverick
Using 'system_prompt' from config
Using output prefix from config: hotpotqa_databricks
Starting prompt optimization...
2025/05/26 20:21:00 INFO dspy.teleprompt.mipro_optimizer_v2: 
RUNNING WITH THE FOLLOWING LIGHT AUTO RUN SETTINGS:
num_trials: 7
minibatch: False
num_candidates: 5
valset size: 7

2025/05/26 20:21:00 INFO dspy.teleprompt.mipro_optimizer_v2: 
==> STEP 1: BOOTSTRAP FEWSHOT EXAMPLES <==
2025/05/26 20:21:00 INFO dspy.teleprompt.mipro_optimizer_v2: These will be