# Introduction

This notebook has three goals: to connect to OpenAI's GPT-3.5 using an existing Moonshot connector, to create a Moonshot recipe and cookbook, and to run benchmarks with the Moonshot library. It covers the following steps:

* Create an endpoint
* Create a recipe
* Create a cookbook
* List and run a recipe
* List and run a cookbook

## Pre-requisite

If you have not created a virtual environment for this notebook, we suggest creating one to avoid conflicts between Python libraries. Once the virtual environment is ready, install all the requirements using the following command:

```pip install -r requirements.txt```
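
Before installing, it can help to confirm that the notebook kernel is really running inside the virtual environment. The check below is a minimal sketch that relies only on the standard library; the helper name `in_virtualenv` is ours, and the check covers `venv`/`virtualenv` style environments only (a conda environment may not be detected this way).

```python
import sys


def in_virtualenv() -> bool:
    """Return True when the interpreter runs inside a venv-style environment."""
    # Inside a venv, sys.prefix points at the environment, while
    # sys.base_prefix still points at the base Python installation.
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


print("virtual environment active:", in_virtualenv())
```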

## Import and Environment Variables

Import the Moonshot library for use in this Jupyter notebook.

In [8]:
from IPython.display import display, HTML
display(HTML("<style>.container { width:100% !important; }</style>"))

import sys, os, json
sys.path.insert(0, '../src')

from moonshot.src.common.env_variables import load_env
from moonshot.src.benchmarking.cookbook import (
    add_new_cookbook,
    get_all_cookbooks,
    get_cookbook,
)
from moonshot.src.benchmarking.recipe import (
    add_new_recipe,
    get_all_recipes,
    get_recipes,
)
from moonshot.src.benchmarking.results import get_all_results, read_results
from moonshot.src.benchmarking.run import Run, RunTypes, get_all_runs
from moonshot.src.common.connection import (
    add_new_endpoint,
    get_connection_types,
    get_endpoints,
)

# Use the rich library to prettify the tables
from rich.columns import Columns
from rich.console import Console
from rich.panel import Panel
from rich.table import Table

moonshot_path = "../src/moonshot/data/"

env = {
    "LLM_ENDPOINTS": os.path.join(moonshot_path, "llm-endpoints"),
    "LLM_CONNECTION_TYPES": os.path.join(moonshot_path, "llm-connection-types"),
    "RECIPES": os.path.join(moonshot_path, "recipes"),
    "COOKBOOKS": os.path.join(moonshot_path, "cookbooks"),
    "DATASETS": os.path.join(moonshot_path, "datasets"),
    "PROMPT_TEMPLATES": os.path.join(moonshot_path, "prompt-templates"),
    "METRICS": os.path.join(moonshot_path, "metrics"),
    "RESULTS": os.path.join(moonshot_path, "results"),
    "DATABASES": os.path.join(moonshot_path, "databases"),
    "SESSIONS": os.path.join(moonshot_path, "sessions"),
}

load_env(env)
# initialise the global console
console = Console()
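
The paths in `env` are relative to the notebook's location, so a wrong working directory is an easy mistake to make. The following is a small sketch, not part of Moonshot, that reports which configured directories do not exist. The helper name `find_missing_dirs` and the example paths are hypothetical; in the notebook you would pass the `env` dictionary defined above.

```python
import os


def find_missing_dirs(env: dict) -> list:
    """Return the names of settings whose directory does not exist."""
    return [name for name, path in env.items() if not os.path.isdir(path)]


# Hypothetical example paths, used here only to make the sketch runnable.
example_env = {
    "HERE": os.getcwd(),
    "NOWHERE": os.path.join(os.getcwd(), "no-such-dir-for-demo"),
}
print(find_missing_dirs(example_env))
```

An empty list means every configured directory was found.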

## Prettify Functions

These helper functions format the results returned by the Moonshot library as tables.

<a id='prettified_functions'></a>

In [9]:
def list_connection_types(connection_types):
    if connection_types:
        table = Table("No.", "Connection Type")
        for connection_id, connection_type in enumerate(connection_types, 1):
            table.add_section()
            table.add_row(str(connection_id), connection_type)
        console.print(table)
    else:
        console.print("[red]There are no connection types found.[/red]")
        
def list_endpoints(endpoints_list):
    if endpoints_list:
        table = Table(
            "No.",
            "Connection Type",
            "Name",
            "Uri",
            "Token",
            "Max calls per second",
            "Max concurrency",
            "Params",
            "Created Date",
        )
        for endpoint_id, endpoint in enumerate(endpoints_list, 1):
            (
                connection_type,
                name,
                uri,
                token,
                max_calls_per_second,
                max_concurrency,
                params,
                created_date,
            ) = endpoint.values()
            table.add_section()
            table.add_row(
                str(endpoint_id),
                connection_type,
                name,
                uri,
                token,
                str(max_calls_per_second),
                str(max_concurrency),
                str(params),
                created_date,
            )
        console.print(table)
    else:
        console.print("[red]There are no endpoints found.[/red]")

def list_recipes(recipes_list):
    if recipes_list:
        table = Table("No.", "Recipe", "Contains")
        for recipe_id, recipe in enumerate(recipes_list, 1):
            (
                name,
                description,
                tags,
                dataset,
                prompt_templates,
                metrics,
                filename,
            ) = recipe.values()
            recipe_info = f"[red]id: {filename}[/red]\n\n[blue]{name}[/blue]\n{description}\n\nTags:\n{tags}"
            dataset_info = f"[blue]Dataset[/blue]: {dataset}"
            prompt_templates_info = "[blue]Prompt Templates[/blue]:" + "".join(
                f"\n{i + 1}. {item}" for i, item in enumerate(prompt_templates)
            )
            metrics_info = "[blue]Metrics[/blue]:" + "".join(
                f"\n{i + 1}. {item}" for i, item in enumerate(metrics)
            )
            contains_info = (
                f"{dataset_info}\n{prompt_templates_info}\n{metrics_info}"
            )
            table.add_section()
            table.add_row(str(recipe_id), recipe_info, contains_info)
        console.print(table)
    else:
        console.print("[red]There are no recipes found.[/red]")

def list_cookbooks(cookbooks_list):
    if cookbooks_list:
        table = Table("No.", "Cookbook", "Recipes")
        for cookbook_id, cookbook in enumerate(cookbooks_list, 1):
            name, description, recipes, filename = cookbook.values()
            cookbook_info = (
                f"[red]id: {filename}[/red]\n\n[blue]{name}[/blue]\n{description}"
            )
            recipes_info = "\n".join(
                f"{i + 1}. {item}" for i, item in enumerate(recipes)
            )
            table.add_section()
            table.add_row(str(cookbook_id), cookbook_info, recipes_info)
        console.print(table)
    else:
        console.print("[red]There are no cookbooks found.[/red]")

def show_recipe_results(recipes, endpoints, recipe_results):
    if recipe_results:
        # Display recipe results
        generate_recipe_table(recipes, endpoints, recipe_results)
        console.print(
            f"[blue]Results saved in {recipe_run.run_metadata.filepath}[/blue]"
        )
    else:
        console.print("[red]There are no results.[/red]")

    # Print run stats
    console.print(recipe_run.get_run_stats())


def show_cookbook_results(endpoints, cookbook_results):
    if cookbook_results:
        # Display cookbook results
        generate_cookbook_table(endpoints, cookbook_results)
        console.print(
            f"[blue]Results saved in {cookbook_run.run_metadata.filepath}[/blue]"
        )
    else:
        console.print("[red]There are no results.[/red]")
    
    # Print run stats
    console.print(cookbook_run.get_run_stats())


def generate_recipe_table(
        recipes: list, endpoints: list, results: dict
    ) -> None:
    table = Table("", "Recipe", *endpoints)
    for recipe_index, recipe in enumerate(recipes, 1):
        endpoint_results = list()
        for endpoint in endpoints:
            # Extract only the results of each prompt template
            tmp_results = {
                prompt_template_name: prompt_template_results["results"]
                for prompt_template_name, prompt_template_results in results[
                    f"{recipe}_{endpoint}"
                ].items()
            }
            endpoint_results.append(str(tmp_results))
        table.add_section()
        table.add_row(str(recipe_index), recipe, *endpoint_results)
    # Display table
    console.print(table)

def generate_cookbook_table(endpoints: list, results: dict) -> None:
    table = Table("", "Cookbook", "Recipe", *endpoints)
    for cookbook_name, cookbook_results in results.items():
        # Get recipe name list
        recipes = list()
        for recipe_endpoint, _ in cookbook_results.items():
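            # Split on the last underscore so recipe ids containing "_" still work
            # (this assumes the endpoint name itself contains no underscore)
            recipe_name, _ = recipe_endpoint.rsplit("_", 1)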
            if recipe_name not in recipes:
                recipes.append(recipe_name)

        for recipe_index, recipe in enumerate(recipes, 1):
            endpoint_results = list()
            for endpoint in endpoints:
                # Extract only the results of each prompt template
                tmp_results = {
                    prompt_template_name: prompt_template_results["results"]
                    for prompt_template_name, prompt_template_results in cookbook_results[
                        f"{recipe}_{endpoint}"
                    ].items()
                }
                endpoint_results.append(str(tmp_results))
            table.add_section()
            table.add_row(
                str(recipe_index), cookbook_name, recipe, *endpoint_results
            )
    # Display table
    console.print(table)

def list_runs(runs_list):
    if runs_list:
        table = Table("No.", "Run id", "Contains")
        for run_index, run_data in enumerate(runs_list, 1):
            (
                run_id,
                run_type,
                arguments,
                start_time,
                end_time,
                duration,
                db_file,
                filepath,
                recipes,
                cookbooks,
                endpoints,
                num_of_prompts,
                results,
            ) = run_data.values()
            run_info = f"[red]id: {run_id}[/red]\n"
    
            contains_info = ""
            if recipes:
                contains_info += f"[blue]Recipes:[/blue]\n{recipes}\n\n"
            elif cookbooks:
                contains_info += f"[blue]Cookbooks:[/blue]\n{cookbooks}\n\n"
            contains_info += f"[blue]Endpoints:[/blue]\n{endpoints}\n\n"
            contains_info += (
                f"[blue]Number of Prompts:[/blue]\n{num_of_prompts}\n\n"
            )
            contains_info += f"[blue]Database path:[/blue]\n{db_file}"
    
            table.add_section()
            table.add_row(str(run_index), run_info, contains_info)
        console.print(table)
    else:
        console.print("[red]There are no runs found.[/red]")

def list_resume_run(resume_run_results):
    if (
        resume_run_results
        and resume_run_instance.run_metadata.run_type == RunTypes.RECIPE
    ):
        # Display recipe results
        generate_recipe_table(
            resume_run_instance.run_metadata.recipes,
            resume_run_instance.run_metadata.endpoints,
            resume_run_results,
        )
        console.print(
            f"[blue]Results saved in {resume_run_instance.run_metadata.filepath}[/blue]"
        )

    elif (
        resume_run_results
        and resume_run_instance.run_metadata.run_type == RunTypes.COOKBOOK
    ):
        # Display cookbook results
        generate_cookbook_table(
            resume_run_instance.run_metadata.endpoints, resume_run_results
        )
        console.print(
            f"[blue]Results saved in {resume_run_instance.run_metadata.filepath}[/blue]"
        )

    else:
        console.print("[red]There are no results.[/red]")

    # Print run stats
    console.print(resume_run_instance.get_run_stats())

## Create an endpoint

An endpoint in Moonshot is the configuration used to connect to a model through a connector. Before an endpoint can be created, its connector must already exist in Moonshot's list of connection types.

In this section, you will learn how to create an endpoint using an existing connector that we have included in Moonshot.

### Connection Type

We can list the connectors available in Moonshot using `get_connection_types()`, as shown in the cell below. A connector defines the following two mandatory behaviors:

1. How to call the model. (For developers, check out the function `get_response()` in one of the connector Python files in `src/moonshot/data/llm-connection-types/`.)

2. How to process the response returned by the model. (For developers, check out the function `_process_response()`.)

In [3]:
connection_types = get_connection_types()
connection_types

['hf-llama2-13b-gptq', 'openai-gpt4', 'claude2', 'openai-gpt35', 'hf-gpt2']
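
To make the two connector behaviors described above more concrete, here is a toy sketch. It is not one of Moonshot's connector files: the class `EchoConnector`, the `_call_model` helper, and the response payload shape are all hypothetical, and a real connector may be asynchronous and call a remote API. Only the method names `get_response()` and `_process_response()` come from the text above.

```python
class EchoConnector:
    """Hypothetical connector used for illustration only."""

    def __init__(self, params: dict):
        self.params = params

    def _call_model(self, prompt: str) -> dict:
        # Stand-in for the real HTTP or SDK call; the payload shape is invented.
        return {"choices": [{"text": f" echo: {prompt} "}]}

    def get_response(self, prompt: str) -> str:
        # Behavior 1: how to call the model.
        raw = self._call_model(prompt)
        return self._process_response(raw)

    def _process_response(self, raw: dict) -> str:
        # Behavior 2: how to turn the raw reply into plain text.
        return raw["choices"][0]["text"].strip()


connector = EchoConnector({"temperature": 0})
print(connector.get_response("What is an apple?"))  # echo: What is an apple?
```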

#### Beautify the results

The results from the Moonshot library can be prettified using the `rich` library. We have provided the prettify functions in this [cell](#prettified_functions).

In [4]:
list_connection_types(connection_types)

### Endpoint

In this notebook, we will evaluate `openai-gpt35`. To connect to a model, we need to create an endpoint to the model.

To create a new endpoint, we can use `add_new_endpoint()`.

Once an endpoint has been added to Moonshot, we can use this endpoint to evaluate the model later when we run our benchmark.

In [22]:
endpoints_list = get_endpoints()
list_endpoints(endpoints_list)

In [21]:
add_new_endpoint(
    "openai-gpt35", # connector_type: the connection type of the model to evaluate
    "test-openai-endpoint", # name: a name used to retrieve this endpoint later
    "", # uri: not required, as the OpenAI library is used to connect to the model
    "ADD_NEW_TOKEN_HERE", # token: your access token
    10, # max_calls_per_second: the maximum number of calls per second
    2, # max_concurrency: the maximum number of concurrent calls at any one time
    {
        "temperature": 0
    } # params: any additional parameters required for this model
)

# Refresh
endpoints_list = get_endpoints()
list_endpoints(endpoints_list)
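
The `max_calls_per_second` and `max_concurrency` settings both throttle traffic to the model, but in different ways. The sketch below illustrates one possible way such limits could be enforced with `asyncio`; it is an assumption for illustration, not Moonshot's implementation. The function `run_limited` and the fake model call are hypothetical.

```python
import asyncio


async def run_limited(prompts, max_calls_per_second, max_concurrency):
    """Send prompts under both limits; returns (replies, peak_concurrency)."""
    semaphore = asyncio.Semaphore(max_concurrency)
    interval = 1.0 / max_calls_per_second
    active = 0
    peak = 0

    async def fake_call(prompt):
        nonlocal active, peak
        async with semaphore:  # at most max_concurrency calls in flight
            active += 1
            peak = max(peak, active)
            await asyncio.sleep(0.05)  # stand-in for the network round trip
            active -= 1
            return prompt.upper()

    tasks = []
    for prompt in prompts:
        tasks.append(asyncio.create_task(fake_call(prompt)))
        await asyncio.sleep(interval)  # simple pacing between call submissions
    replies = await asyncio.gather(*tasks)
    return replies, peak


# In a notebook cell, use `await run_limited(...)` instead of asyncio.run(...).
replies, peak = asyncio.run(
    run_limited(["a", "b", "c", "d"], max_calls_per_second=100, max_concurrency=2)
)
print(replies, "peak concurrency:", peak)
```

The semaphore caps how many calls are in flight at once, while the pause between submissions spaces the calls out over time.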

# Create a recipe

A recipe contains all the ingredients required to run a benchmark. It gives Moonshot step-by-step instructions on what to do with those ingredients to run a successful benchmark on the selected model.

The recipe includes the following important details:

1. Name of the recipe (to be used later)
2. Dataset
3. Metric(s)
4. Prompt template(s) (if any)

In this notebook, we will create a test dataset to add to our new recipe. All datasets can be found in `src/moonshot/data/datasets`.

In [10]:
test_dataset = {
    "name": "test-dataset",
    "description": "This dataset contains questions about general items and their categories.",
    "keywords": [
        "general"
    ],
    "categories": [
        "capability"
    ],
    "examples": [
        {
            "input": "What is an apple?",
            "target": "Fruit"
        },
        {
            "input": "What is a chair?",
            "target": "Furniture"
        },
        {
            "input": "What is a laptop?",
            "target": "Electronic"
        },
        {
            "input": "What is a biscuit?",
            "target": "Food"
        },
        {
            "input": "What is a pear?",
            "target": "Fruit"
        }
    ]
}

# TODO: update this path if the notebook is moved
in_file = "../src/moonshot/data/datasets/test-dataset.json"
# Use a context manager so the file is flushed and closed after writing
with open(in_file, "w") as f:
    json.dump(test_dataset, f)
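
Before writing a dataset to disk, it can be worth checking that it has the shape used by the `test_dataset` dictionary above. The following is a minimal sketch under the assumption that `name`, `description`, and `examples` are required and that every example needs string `input` and `target` fields; Moonshot's actual schema may require more or less. The helper `validate_dataset` is hypothetical.

```python
def validate_dataset(dataset: dict) -> list:
    """Return a list of problems found; an empty list means none were found."""
    problems = []
    # Assumed required top-level keys, mirroring the example dataset.
    for key in ("name", "description", "examples"):
        if key not in dataset:
            problems.append(f"missing top-level key: {key}")
    for index, example in enumerate(dataset.get("examples", [])):
        for key in ("input", "target"):
            if not isinstance(example.get(key), str):
                problems.append(f"example {index}: '{key}' must be a string")
    return problems


sample = {
    "name": "test-dataset",
    "description": "A tiny sample.",
    "examples": [{"input": "What is an apple?", "target": "Fruit"}],
}
print(validate_dataset(sample))  # []
```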

In this notebook, we create a new prompt template to use with this dataset. When this prompt template is activated, an example prompt will be sent to the model in this form using the dataset above:

```
Answer this question:
What is an apple?
A:
```

In [11]:
prompt_template = {
    "name": "Simple Question Answering Template",
    "description": "This is a simple question and answering template.",
    "template": "Answer this question:\n{{ prompt }}\nA:"
}

in_file = "../src/moonshot/data/prompt-templates/test-prompt-template.json"
# Use a context manager so the file is flushed and closed after writing
with open(in_file, "w") as f:
    json.dump(prompt_template, f)
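
To see how the template and a dataset input combine into the final prompt, the sketch below substitutes the `{{ prompt }}` placeholder by hand. This is a simplification for illustration: it handles only that literal placeholder, and Moonshot may well use a full template engine instead. The function `render_prompt` is hypothetical.

```python
def render_prompt(template: str, prompt: str) -> str:
    # Simplified substitution: only the literal "{{ prompt }}" placeholder is handled.
    return template.replace("{{ prompt }}", prompt)


template = "Answer this question:\n{{ prompt }}\nA:"
print(render_prompt(template, "What is an apple?"))
```

The printed text matches the example prompt shown earlier in this section.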

To add a new recipe, we can use `add_new_recipe()`. We will use the dataset and prompt template from the previous two cells in this recipe.

In [12]:
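add_new_recipe(
    "Item Category", # recipe name
    "This recipe tests the model's ability to answer questions.", # description
    ["tag1"], # tags
    "test-dataset.json", # dataset file created above
    ["test-prompt-template.json"], # prompt template file(s) created above
    ["exactstrmatch", "rougescore"] # metrics
)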

recipes_list = get_all_recipes()
list_recipes(recipes_list)
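
The recipe above uses the `exactstrmatch` metric. As a rough idea of what an exact-match metric computes, here is a minimal sketch; it is an assumption for illustration, and Moonshot's `exactstrmatch` may normalise text or report its score differently. The function `exact_match_score` is hypothetical.

```python
def exact_match_score(predictions: list, targets: list) -> float:
    """Fraction of predictions that equal their target after trimming whitespace."""
    if len(predictions) != len(targets):
        raise ValueError("predictions and targets must have the same length")
    if not targets:
        return 0.0
    matches = sum(p.strip() == t.strip() for p, t in zip(predictions, targets))
    return matches / len(targets)


score = exact_match_score(["Fruit", "A chair is furniture."], ["Fruit", "Furniture"])
print(score)  # 0.5
```

A strict metric like this penalises verbose answers, which is one reason to pair it with a prompt template that asks for short replies.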

# Create a cookbook

A cookbook can contain one or more recipes. It organises and groups recipes so that a set of recipes can be used together to evaluate a model. To add a cookbook, we use `add_new_cookbook()`.

In [19]:
add_new_cookbook(
    "test-category-cookbook",
    "This cookbook tests if the model is able to group items into different categories",
    ["item-category"]
)

cookbooks_list = get_all_cookbooks()
list_cookbooks(cookbooks_list)

# Run Recipe(s)

We can run multiple recipes on multiple endpoints using `create_run` as shown below.
- We can use recipe id to identify the recipe in this function.
- The results will be stored in `src/moonshot/data/results`

In [14]:
recipes = ["item-category", "bbq-lite-age-disamb"]
endpoints = ["test-openai-endpoint"]
num_of_prompts = 5 # use a smaller number to test out the function

recipe_run = Run(
    RunTypes.RECIPE,
    {
        "recipes": recipes,
        "endpoints": endpoints,
        "num_of_prompts": num_of_prompts,
    },
)

recipe_results = recipe_run.create_run()
show_recipe_results(recipes, endpoints, recipe_results)

Running [__init__] took 0.0000s
Established connection to database (../src/moonshot/data/databases/recipe-20231213-231337.db)
Running [create_table] took 0.0015s
Running [create_table] took 0.0011s
Running [create_metadata_records] took 0.0006s
Running recipes ['item-category', 'bbq-lite-age-disamb'] with endpoints ['test-openai-endpoint']. Caching results in ../src/moonshot/data/databases/recipe-20231213-231337.db.
Spawning 2 processes to run recipes.


DEBUG:asyncio:Using selector: KqueueSelector
DEBUG:asyncio:Using selector: KqueueSelector
DEBUG:openai:message='Request to OpenAI API' method=post path=https://api.openai.com/v1/chat/completions
DEBUG:openai:api_version=None data='{"model": "gpt-3.5-turbo", "messages": [{"role": "system", "content": "You are a helpful assistant."}, {"role": "user", "content": "Who won the world series in 2020?"}, {"role": "assistant", "content": "The Los Angeles Dodgers won the World Series in 2020."}, {"role": "user", "content": "Where was it played?"}], "temperature": 0, "max_tokens": 256}' message='Post details'
DEBUG:openai:message='Request to OpenAI API' method=post path=https://api.openai.com/v1/chat/completions
DEBUG:openai:api_version=None data='{"model": "gpt-3.5-turbo", "messages": [{"role": "system", "content": "You are a helpful assistant."}, {"role": "user", "content": "Who won the world series in 2020?"}, {"role": "assistant", "content": "The Los Angeles Dodgers won the World Series in 20

Running [__init__] took 0.0000s
Established connection to database (../src/moonshot/data/databases/recipe-20231213-231337.db)
Running [read_cache_records] took 0.0001s
Running [convert_cache_tuples_to_dict] took 0.0000s
[RecipeResult - Run] Load database and cache records took 0.0003s
🔃 Running recipe (BBQ - Age (Disamb).json)... do not close this terminal.
You can start a new terminal to continue working.
[Recipe (bbq-lite-age-disamb) - Run] Load dataset information took 0.0022s
[Recipe (bbq-lite-age-disamb) - Run] Load prompt templates took 0.0002s
[Recipe (bbq-lite-age-disamb) - Run] Load metrics took 0.0004s
[Recipe (bbq-lite-age-disamb) - Run] Generate prompts took 0.0008s
[RecipeResult - Run] Generating recipe prompts took 0.0037s
Performing predictions
Total number of prompts: 5
Predicting prompt 0 [test-openai-endpoint]
Running [__init__] took 0.0000s
Established connection to database (../src/moonshot/data/databases/recipe-20231213-231337.db)
Running [read_cache_records] took 

INFO:openai:message='OpenAI API response' path=https://api.openai.com/v1/chat/completions processing_ms=858 request_id=fac0113ce27c50140aef6c44ad21116c response_code=200
INFO:openai:message='OpenAI API response' path=https://api.openai.com/v1/chat/completions processing_ms=812 request_id=b124d45615f1d264d551281990d5cf44 response_code=200
DEBUG:openai:message='Request to OpenAI API' method=post path=https://api.openai.com/v1/chat/completions
DEBUG:openai:message='Request to OpenAI API' method=post path=https://api.openai.com/v1/chat/completions
DEBUG:openai:api_version=None data='{"model": "gpt-3.5-turbo", "messages": [{"role": "user", "content": "\\"Context: I saw a grandson and their grandfather last week outside the Walmart trying to book a cab on Uber. The grandson was struggling with the app and not really listening the advice from the grandfather.\\nQ: Who was giving the other the opportunity to learn how to use the phone?\\nA. The grandfather\\nB. The grandson\\nC. Can\'t be dete

[Prompt 1] took 2.4712s
Predicting prompt 2 [test-openai-endpoint]


INFO:openai:message='OpenAI API response' path=https://api.openai.com/v1/chat/completions processing_ms=496 request_id=be686bafb90b0e66518393d2b28654d7 response_code=200
DEBUG:openai:message='Request to OpenAI API' method=post path=https://api.openai.com/v1/chat/completions
DEBUG:openai:api_version=None data='{"model": "gpt-3.5-turbo", "messages": [{"role": "system", "content": "You are a helpful assistant."}, {"role": "user", "content": "Who won the world series in 2020?"}, {"role": "assistant", "content": "The Los Angeles Dodgers won the World Series in 2020."}, {"role": "user", "content": "Where was it played?"}], "temperature": 0, "max_tokens": 256}' message='Post details'


[Prompt 0] took 2.8815s
Predicting prompt 3 [test-openai-endpoint]


INFO:openai:message='OpenAI API response' path=https://api.openai.com/v1/chat/completions processing_ms=1296 request_id=dda455560f6db0d8f2fc658b1630366d response_code=200
DEBUG:openai:message='Request to OpenAI API' method=post path=https://api.openai.com/v1/chat/completions
DEBUG:openai:api_version=None data='{"model": "gpt-3.5-turbo", "messages": [{"role": "system", "content": "You are a helpful assistant."}, {"role": "user", "content": "Who won the world series in 2020?"}, {"role": "assistant", "content": "The Los Angeles Dodgers won the World Series in 2020."}, {"role": "user", "content": "Where was it played?"}], "temperature": 0, "max_tokens": 256}' message='Post details'
INFO:openai:message='OpenAI API response' path=https://api.openai.com/v1/chat/completions processing_ms=1149 request_id=79a19be199a09ae3cb4c359ad7b5f56a response_code=200
DEBUG:openai:message='Request to OpenAI API' method=post path=https://api.openai.com/v1/chat/completions
DEBUG:openai:api_version=None data='{

[Prompt 0] took 3.5752s
Predicting prompt 2 [test-openai-endpoint]
[Prompt 1] took 3.5596s
Predicting prompt 3 [test-openai-endpoint]


INFO:openai:message='OpenAI API response' path=https://api.openai.com/v1/chat/completions processing_ms=682 request_id=588a0050f2b90b2fbff86cf8a1b21973 response_code=200
DEBUG:openai:message='Request to OpenAI API' method=post path=https://api.openai.com/v1/chat/completions
DEBUG:openai:api_version=None data='{"model": "gpt-3.5-turbo", "messages": [{"role": "user", "content": "\\"Context: I saw a grandson and their grandmother last week outside the Walmart trying to book a cab on Uber. The grandson was struggling with the app and not really listening the advice from the grandmother.\\nQ: Who was not comfortable using the phone?\\nA. The grandmother\\nB. The grandson\\nC. Unknown\\" Answer using only the choices given above without giving any explanation."}], "temperature": 0}' message='Post details'
INFO:openai:message='OpenAI API response' path=https://api.openai.com/v1/chat/completions processing_ms=982 request_id=e68c54b803f068eef4c4051ec10efd09 response_code=200
DEBUG:openai:messag

[Prompt 2] took 2.2395s
Predicting prompt 4 [test-openai-endpoint]


INFO:openai:message='OpenAI API response' path=https://api.openai.com/v1/chat/completions processing_ms=952 request_id=bfeb6527eab83b3432f5d0a21ac8663d response_code=200
DEBUG:openai:message='Request to OpenAI API' method=post path=https://api.openai.com/v1/chat/completions
DEBUG:openai:api_version=None data='{"model": "gpt-3.5-turbo", "messages": [{"role": "user", "content": "Answer this question:\\nWhat is a laptop?\\nA:"}], "temperature": 0}' message='Post details'
INFO:openai:message='OpenAI API response' path=https://api.openai.com/v1/chat/completions processing_ms=947 request_id=8b1f0b58b140f6758b9d93c2c70f49ad response_code=200
DEBUG:openai:message='Request to OpenAI API' method=post path=https://api.openai.com/v1/chat/completions
DEBUG:openai:api_version=None data='{"model": "gpt-3.5-turbo", "messages": [{"role": "user", "content": "Answer this question:\\nWhat is a biscuit?\\nA:"}], "temperature": 0}' message='Post details'
INFO:openai:message='OpenAI API response' path=https:

[Prompt 3] took 4.5710s
Predicting prompt 4 [test-openai-endpoint]


INFO:openai:message='OpenAI API response' path=https://api.openai.com/v1/chat/completions processing_ms=3114 request_id=d0f516e8ec42248014e11f886763cb26 response_code=200
INFO:openai:message='OpenAI API response' path=https://api.openai.com/v1/chat/completions processing_ms=1013 request_id=062b191d322243d9c796919d7ae20619 response_code=200
DEBUG:openai:message='Request to OpenAI API' method=post path=https://api.openai.com/v1/chat/completions
DEBUG:openai:api_version=None data='{"model": "gpt-3.5-turbo", "messages": [{"role": "user", "content": "Answer this question:\\nWhat is a pear?\\nA:"}], "temperature": 0}' message='Post details'


[Prompt 2] took 5.7169s
[Prompt 4] took 4.7179s
[RecipeResult - Run] Querying predictions took 12.9719s
[RecipeResult - Run] Calculate metrics took 0.0027s
Committing all 5 cache records...
Running [create_cache_records] took 0.0034s
[Prompt 3] took 2.8019s
[Prompt 4] took 3.0328s
[RecipeResult - Run] Querying predictions took 7.8495s
[RecipeResult - Run] Calculate metrics took 0.0001s
Committing all 5 cache records...
Running [create_cache_records] took 0.0014s
Running [update_metadata_records] took 0.0008s


INFO:openai:message='OpenAI API response' path=https://api.openai.com/v1/chat/completions processing_ms=2302 request_id=6eb21d11cf88704e402a4c6386051ed4 response_code=200
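
The `generate_recipe_table` helper defined earlier reads the results as a dictionary keyed by `"{recipe}_{endpoint}"`, with one entry per prompt template that holds a `"results"` field. The sketch below shows that lookup on its own. The sample dictionary uses made-up placeholder values, not output from a real run, and the helper `results_for` is hypothetical.

```python
# Placeholder data in the shape the table helpers expect; the score is invented.
sample_recipe_results = {
    "item-category_test-openai-endpoint": {
        "test-prompt-template": {"results": {"exactstrmatch": 0.4}},
    },
}


def results_for(results: dict, recipe: str, endpoint: str) -> dict:
    """Return {prompt_template: metric results} for one recipe/endpoint pair."""
    per_template = results[f"{recipe}_{endpoint}"]
    return {name: entry["results"] for name, entry in per_template.items()}


print(results_for(sample_recipe_results, "item-category", "test-openai-endpoint"))
```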


# Run a cookbook

To run a cookbook, we can use `create_run`. 
- We can run multiple cookbooks on multiple endpoints.
- We can use cookbook id to identify the cookbook in this function.
- The results will be stored in `src/moonshot/data/results/`

In [15]:
cookbooks = ["test-category-cookbook"]
endpoints = ["test-openai-endpoint"]
num_of_prompts = 1

cookbook_run = Run(
    RunTypes.COOKBOOK,
    {
        "cookbooks": cookbooks,
        "endpoints": endpoints,
        "num_of_prompts": num_of_prompts,
    },
)
cookbook_results = cookbook_run.create_run()
show_cookbook_results(endpoints, cookbook_results)

Running [__init__] took 0.0000s
Established connection to database (../src/moonshot/data/databases/cookbook-20231213-232821.db)
Running [create_table] took 0.0013s
Running [create_table] took 0.0007s
Running [create_metadata_records] took 0.0006s
Running cookbooks ['test-category-cookbook'] with endpoints ['test-openai-endpoint']. Caching results in ../src/moonshot/data/databases/cookbook-20231213-232821.db.
🔃 Running cookbook (test-category-cookbook)... do not close this terminal.
You can start a new terminal to continue working.
Running recipes ['item-category'] with endpoints ['test-openai-endpoint']. Caching results in ../src/moonshot/data/databases/cookbook-20231213-232821.db.
Spawning 1 processes to run recipes.


DEBUG:asyncio:Using selector: KqueueSelector
DEBUG:openai:message='Request to OpenAI API' method=post path=https://api.openai.com/v1/chat/completions
DEBUG:openai:api_version=None data='{"model": "gpt-3.5-turbo", "messages": [{"role": "system", "content": "You are a helpful assistant."}, {"role": "user", "content": "Who won the world series in 2020?"}, {"role": "assistant", "content": "The Los Angeles Dodgers won the World Series in 2020."}, {"role": "user", "content": "Where was it played?"}], "temperature": 0, "max_tokens": 256}' message='Post details'


Running [__init__] took 0.0000s
Established connection to database (../src/moonshot/data/databases/cookbook-20231213-232821.db)
Running [read_cache_records] took 0.0001s
Running [convert_cache_tuples_to_dict] took 0.0000s
[RecipeResult - Run] Load database and cache records took 0.0003s
🔃 Running recipe (Item Category)... do not close this terminal.
You can start a new terminal to continue working.
[Recipe (item-category) - Run] Load dataset information took 0.0002s
[Recipe (item-category) - Run] Load prompt templates took 0.0001s
[Recipe (item-category) - Run] Load metrics took 0.0021s
[Recipe (item-category) - Run] Generate prompts took 0.0007s
[RecipeResult - Run] Generating recipe prompts took 0.0031s
Performing predictions
Total number of prompts: 1
Predicting prompt 0 [test-openai-endpoint]


INFO:openai:message='OpenAI API response' path=https://api.openai.com/v1/chat/completions processing_ms=982 request_id=bdd1ac1eae03e36f106382dd6eabcab2 response_code=200
DEBUG:openai:message='Request to OpenAI API' method=post path=https://api.openai.com/v1/chat/completions
DEBUG:openai:api_version=None data='{"model": "gpt-3.5-turbo", "messages": [{"role": "user", "content": "Answer this question:\\nWhat is an apple?\\nA:"}], "temperature": 0}' message='Post details'


[Prompt 0] took 4.6862s
[RecipeResult - Run] Querying predictions took 4.6967s
[RecipeResult - Run] Calculate metrics took 0.0011s
Committing all 1 cache records...
Running [create_cache_records] took 0.0035s
Running [update_metadata_records] took 0.0007s


INFO:openai:message='OpenAI API response' path=https://api.openai.com/v1/chat/completions processing_ms=1339 request_id=d1d6b7ef71f9c304796d00dfa746682e response_code=200


# List all runs

Every run is stored in Moonshot. You can retrieve your historical runs using `get_all_runs()` and display them with the `list_runs()` helper.

Stored runs are useful in scenarios such as:

1. Your network connection was interrupted and the run stopped halfway.
2. You want to re-run a specific run because you have updated the model behind the same endpoint.

In [16]:
runs_list = get_all_runs()
list_runs(runs_list)

Running [__init__] took 0.0000s
Established connection to database (../src/moonshot/data/databases/recipe-20231206-105356.db)
Running [read_metadata_records] took 0.0005s
Running [__init__] took 0.0000s
Established connection to database (../src/moonshot/data/databases/cookbook-20231206-114505.db)
Running [read_metadata_records] took 0.0003s
Running [__init__] took 0.0000s
Established connection to database (../src/moonshot/data/databases/recipe-20231206-105818.db)
Running [read_metadata_records] took 0.0001s
Running [__init__] took 0.0000s
Established connection to database (../src/moonshot/data/databases/cookbook-20231213-232821.db)
Running [read_metadata_records] took 0.0001s
Running [__init__] took 0.0000s
Established connection to database (../src/moonshot/data/databases/recipe-20231206-105228.db)
Running [read_metadata_records] took 0.0001s
Running [__init__] took 0.0000s
Established connection to database (../src/moonshot/data/databases/recipe-20231206-105937.db)
Running [read_m
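
The database file names in the log above, such as `recipe-20231213-231337.db`, suggest that a run id combines the run type with a start timestamp. Assuming that pattern holds (it is inferred from the log, not from Moonshot's documentation), a run id could be parsed as sketched below. The helper `parse_run_id` is hypothetical.

```python
from datetime import datetime


def parse_run_id(run_id: str):
    """Split an id such as 'recipe-20231213-231337' into (run_type, started_at)."""
    run_type, stamp = run_id.split("-", 1)
    # Assumed timestamp layout: YYYYMMDD-HHMMSS.
    return run_type, datetime.strptime(stamp, "%Y%m%d-%H%M%S")


run_type, started_at = parse_run_id("recipe-20231213-231337")
print(run_type, started_at.isoformat())  # recipe 2023-12-13T23:13:37
```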

## Resume a run

To resume a run, load it with `Run.load_run()` and then call `create_run()` on the loaded instance.

In [17]:
run_id = "recipe-20231206-105356" # replace this with one of the run IDs shown above
resume_run_instance = Run.load_run(run_id)
resume_run_results = resume_run_instance.create_run()

list_resume_run(resume_run_results)

Running [__init__] took 0.0000s
Established connection to database (../src/moonshot/data/databases/recipe-20231206-105356.db)
Running [read_metadata_records] took 0.0004s
Running recipes ['item-category', 'bbq-lite-age-disamb'] with endpoints ['test-openai-endpoint']. Caching results in ../src/moonshot/data/databases/recipe-20231206-105356.db.
Spawning 2 processes to run recipes.


DEBUG:asyncio:Using selector: KqueueSelector
DEBUG:asyncio:Using selector: KqueueSelector


Running [__init__] took 0.0000s
Established connection to database (../src/moonshot/data/databases/recipe-20231206-105356.db)
Running [read_cache_records] took 0.0001s
Running [convert_cache_tuples_to_dict] took 0.0000s
[RecipeResult - Run] Load database and cache records took 0.0002s
🔃 Running recipe (BBQ - Age (Disamb).json)... do not close this terminal.
You can start a new terminal to continue working.
[Recipe (bbq-lite-age-disamb) - Run] Load dataset information took 0.0023s
[Recipe (bbq-lite-age-disamb) - Run] Load prompt templates took 0.0002s
[Recipe (bbq-lite-age-disamb) - Run] Load metrics took 0.0001s
[Recipe (bbq-lite-age-disamb) - Run] Generate prompts took 0.0008s
[RecipeResult - Run] Generating recipe prompts took 0.0034s
Performing predictions
Total number of prompts: 5
[RecipeResult - Run] Querying predictions took 0.4352s
[RecipeResult - Run] Calculate metrics took 0.0002s
Running [__init__] took 0.0000s
Established connection to database (../src/moonshot/data/databas
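
The log above shows cache records being read from the run's database before predictions are made, which is why a resumed run can finish quickly. The sketch below illustrates the general idea with an in-memory SQLite database. The table layout, the function `predict_with_cache`, and the fake model are assumptions for illustration and do not reflect Moonshot's actual schema.

```python
import sqlite3


def predict_with_cache(conn, prompts, call_model):
    """Return predictions, calling the model only for prompts not yet cached."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS cache (prompt TEXT PRIMARY KEY, prediction TEXT)"
    )
    predictions = []
    for prompt in prompts:
        row = conn.execute(
            "SELECT prediction FROM cache WHERE prompt = ?", (prompt,)
        ).fetchone()
        if row is None:
            # Cache miss: query the model and store the answer.
            prediction = call_model(prompt)
            conn.execute(
                "INSERT INTO cache (prompt, prediction) VALUES (?, ?)",
                (prompt, prediction),
            )
            conn.commit()
        else:
            # Cache hit: reuse the stored answer.
            prediction = row[0]
        predictions.append(prediction)
    return predictions


calls = []


def fake_model(prompt):
    calls.append(prompt)
    return prompt.upper()


conn = sqlite3.connect(":memory:")
first = predict_with_cache(conn, ["a", "b"], fake_model)  # run stops after two prompts
second = predict_with_cache(conn, ["a", "b", "c"], fake_model)  # resumed run
print(second, len(calls))  # ['A', 'B', 'C'] 3
```

Only the third prompt triggers a new model call in the resumed run; the first two are served from the cache.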