### Objective
This notebook gives an overview of how to use the framework for prompt tracking.

#### AI/ML engineer journey in prompt engineering
```
AI Engineer
     |
     v
Create Experiment ("prompts for Text2SQL")
     |
     v
+--------------------------+        +--------------+        +-------------------------+
| Run 1: LLM model         | -----> | Check result | -----> | F1 score (metric): 0.81 |
| name=gemini-1.5-flash,   |        +--------------+        +-------------------------+
| temperature=0.1          |
+--------------------------+

     |
     v

+--------------------------+        +--------------+        +-------------------------+
| Run 2: LLM model         | -----> | Check result | -----> | F1 score (metric): 0.85 |
| name=gemini-1.5-pro,     |        +--------------+        +-------------------------+
| temperature=0.2          |
+--------------------------+

     |
     v

More Runs (optional)
     |
     v
Compare & Analyze Results
```


In [1]:
# Import packages
from src import tree

#### Create the project

In [2]:
tree.set_project(name="TEXT2SQL",
                 created_by="member 1")

Project TEXT2SQL - (ID: 8391b544-247a-497b-acfb-43694fda3762) created successfully


#### Create Experiment

In [3]:
tree.set_experiment(name="text2sql_simple_prompting",
                    created_by="member 1",
                    description="Apply direct prompting for text2sql")

Experiment text2sql_simple_prompting - (ID: 33d83978-f23b-40fb-a544-7000eda2c0e3) created successfully


#### Start Runs
A run is one instance of the experiment with a specific set of hyperparameters; prompts, metrics, and artifacts are logged against it.

In [4]:
tree.start_run(name="run1", created_by="member 1")

Run run1 - (ID: 80b5224f-dc10-40be-8883-fc4dafbeb69d) created successfully


#### Prepare and log the prompt

In [5]:
prompt = """You are an expert in generating SQL queries from English text. Given below is text to convert to SQL \n
get me all rows with name XYZ"""

In [6]:
tree.log_prompts(name="direct_prompting", value=prompt)

'prompt : direct_prompting' logged (ID: 48d14691-eec4-4f6b-9d62-e4d2902ef512)


#### Call the LLM with hyperparameters and log the experiment metadata

In [7]:
llm_params = {
    "model": "gemini-1.5-flash",
    "temperature": 0.1,
    "random_state": 42
}

tree.log_hyperparameters(value = llm_params)

'hyperparameter : model' logged (ID: cbadcf65-781c-41f8-a07b-a33bbcaecaa5)
'hyperparameter : temperature' logged (ID: fd36a53c-942f-47cc-b116-32fa358900e4)
'hyperparameter : random_state' logged (ID: 323fdc63-507e-4908-96e2-0b9ee998daf3)


In [8]:
class LLM:
    """Dummy LLM; replace with a call to the actual model API."""

    def __init__(self, params):
        self.params = params

    def invoke(self, query):
        # Returns a fixed query regardless of the input
        return "SELECT * FROM your_table_name WHERE name = XYZ"



In [9]:
model = LLM(llm_params)
result = model.invoke(prompt)

In [10]:
# Evaluate (dummy value used here)
tree.log_metrics(name="F1 score", value=0.81)

# Log the result as an artifact
tree.log_artifacts(name="output", value=result, artifact_type="data")

'metric : F1 score' logged (ID: 84c55e65-9698-437f-a2e3-83805951df55)
'artifact : output' logged (ID: 06a921a8-2ddc-4bc8-80d3-aac320e9721f)
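
The F1 value logged above is a placeholder. One possible way to compute a real score, assuming a reference SQL query is available for each prompt, is a token-overlap F1 between the generated and the reference query. The helpers `sql_tokens` and `token_f1` and the reference query below are hypothetical and are not part of the framework; an execution-based comparison of query results would be a stricter alternative.

```python
from collections import Counter
import re


def sql_tokens(sql: str) -> list:
    """Lowercase a SQL string and split it into word and symbol tokens."""
    return re.findall(r"\w+|[^\w\s]", sql.lower())


def token_f1(predicted: str, reference: str) -> float:
    """Token-overlap F1 between a predicted and a reference SQL string."""
    pred = Counter(sql_tokens(predicted))
    ref = Counter(sql_tokens(reference))
    overlap = sum((pred & ref).values())  # shared tokens, counted with multiplicity
    if overlap == 0:
        return 0.0
    precision = overlap / sum(pred.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)


# Hypothetical reference query for the prompt used in this notebook
reference_sql = "SELECT * FROM your_table_name WHERE name = 'XYZ'"
generated_sql = "SELECT * FROM your_table_name WHERE name = XYZ"
score = token_f1(generated_sql, reference_sql)
```

The resulting `score` could then be passed to `tree.log_metrics(name="F1 score", value=score)` in place of the dummy value.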


In [11]:
tree.stop_run()

#### Run again with a different model, hyperparameters, and prompt

In [12]:
tree.start_run(name="run2", created_by="member 1")  # create a new run

Run run2 - (ID: 85926484-86f1-4a6c-bb37-5a904cb1c011) created successfully


In [13]:
llm_params = {
    "model": "gemini-1.5-pro",
    "temperature": 0.2,
    "random_state": 42
}

tree.log_hyperparameters(value = llm_params)

'hyperparameter : model' logged (ID: c0dc671f-8d7b-416b-b7d6-20ae82782f2e)
'hyperparameter : temperature' logged (ID: dab15d95-92b5-4b4b-81c3-31e9148001e9)
'hyperparameter : random_state' logged (ID: ed2c0f9e-996a-4917-91cd-bbe16bac8eae)


In [14]:
prompt = """Given below is a text to be converted to SQL. You can do this because you are a data engineer \n
get me all rows with name XYZ"""

tree.log_prompts(name="direct_prompting", value=prompt)

model = LLM(llm_params)
result = model.invoke(prompt)

# Evaluate (dummy value used here)
tree.log_metrics(name="F1 score", value=0.85)

# Log the result as an artifact
tree.log_artifacts(name="output", value="SELECT * FROM your_table_name WHERE name = 'XYZ'", artifact_type="data")

tree.stop_run()

'prompt : direct_prompting' logged (ID: 52df9319-3cd1-4666-9eee-cfc14de02105)
'metric : F1 score' logged (ID: c95b61cf-c699-4921-ab67-6e86420d3d5c)
'artifact : output' logged (ID: df9b4372-d300-416f-b30c-4671bb1aa412)
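
Runs 1 and 2 repeat the same sequence of calls, so additional runs could be driven by a loop. The sketch below assumes a `tracker` object that exposes the same methods used with `tree` in this notebook (`start_run`, `log_hyperparameters`, `log_prompts`, `log_metrics`, `stop_run`). The names `run_sweep`, `RecordingTracker`, and `evaluate` are hypothetical; `RecordingTracker` is only a stand-in that records calls so the sketch runs without the framework.

```python
class RecordingTracker:
    """Hypothetical stand-in for `tree` that only records the calls it receives."""

    def __init__(self):
        self.calls = []

    def __getattr__(self, method_name):
        # Return a function that records (method name, keyword arguments)
        def record(**kwargs):
            self.calls.append((method_name, kwargs))
        return record


def run_sweep(tracker, configs, prompt, evaluate, created_by):
    """Start one run per config, log its metadata, and always stop the run."""
    scores = {}
    for index, params in enumerate(configs, start=1):
        run_name = f"run{index}"
        tracker.start_run(name=run_name, created_by=created_by)
        try:
            tracker.log_hyperparameters(value=params)
            tracker.log_prompts(name="direct_prompting", value=prompt)
            score = evaluate(params, prompt)
            tracker.log_metrics(name="F1 score", value=score)
            scores[run_name] = score
        finally:
            tracker.stop_run()  # close the run even if a step above fails
    return scores


configs = [
    {"model": "gemini-1.5-flash", "temperature": 0.1, "random_state": 42},
    {"model": "gemini-1.5-pro", "temperature": 0.2, "random_state": 42},
]
# Dummy evaluation that mirrors the placeholder scores used in this notebook
dummy_scores = {"gemini-1.5-flash": 0.81, "gemini-1.5-pro": 0.85}
tracker = RecordingTracker()
scores = run_sweep(
    tracker,
    configs,
    prompt="get me all rows with name XYZ",
    evaluate=lambda params, prompt: dummy_scores[params["model"]],
    created_by="member 1",
)
```

Passing `tree` itself as `tracker` should behave the same way if its methods accept these keyword arguments as in the cells above, though that is not verified here. Unlike the two manual runs, this sketch reuses one prompt for every configuration.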


#### Explore the experiment and compare runs

In [15]:
tree.view_experiments()

[Interactive widget output: an "Experiment Dashboard" with an experiment dropdown (text2sql_simple_prompting), a metrics summary, a run filter, and a status line reporting 2 runs displayed.]
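
The dashboard rendered by `tree.view_experiments()` is interactive. For a quick non-interactive comparison, the values logged in this notebook can also be tabulated in plain Python. The sketch below copies the hyperparameters and scores of run1 and run2 by hand; it does not read them from the framework, whose query API is not shown in this notebook.

```python
# Values copied by hand from run1 and run2 in this notebook
runs = [
    {"run": "run1", "model": "gemini-1.5-flash", "temperature": 0.1, "f1": 0.81},
    {"run": "run2", "model": "gemini-1.5-pro", "temperature": 0.2, "f1": 0.85},
]

# Rank runs from best to worst F1 score
ranked = sorted(runs, key=lambda r: r["f1"], reverse=True)
best = ranked[0]

# Print a small fixed-width comparison table
print(f"{'run':<6}{'model':<20}{'temperature':<13}{'F1':<6}")
for r in ranked:
    print(f"{r['run']:<6}{r['model']:<20}{r['temperature']:<13}{r['f1']:<6}")

# Difference between the best and the worst run
f1_gain = round(ranked[0]["f1"] - ranked[-1]["f1"], 2)
```

Because the scores here are dummy values, the ranking only illustrates the comparison step, not an actual difference between the two models.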

#### Look at individual runs

In [16]:
tree.view_runs(experiment_name="text2sql_simple_prompting")

[Interactive widget output: "ML Run Explorer" view.]

#### Export the tree and track the file in Git

Experiment metadata can be tracked in Git like any other code or file.

In [17]:
tree.export_tree(filename="prompt_text2sql.json")
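
Because the export is a JSON file, its structure can be checked before committing it. The schema written by `export_tree` is not shown in this notebook, so the sketch below only reports the top-level type and keys. The helper `describe_json` and the sample content are hypothetical; the sample file exists only so the sketch runs without the real export.

```python
import json
import os
import tempfile


def describe_json(path):
    """Return the top-level type of a JSON file and, for objects, its sorted keys."""
    with open(path, encoding="utf-8") as handle:
        data = json.load(handle)
    keys = sorted(data) if isinstance(data, dict) else []
    return type(data).__name__, keys


# Made-up content, used only so that this sketch runs without the real export
sample = {"project": "TEXT2SQL", "experiments": []}
with tempfile.TemporaryDirectory() as folder:
    sample_path = os.path.join(folder, "sample_export.json")
    with open(sample_path, "w", encoding="utf-8") as handle:
        json.dump(sample, handle, indent=2)
    kind, keys = describe_json(sample_path)
```

On the real export, `describe_json("prompt_text2sql.json")` would show what the framework actually wrote.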