### DSPy: Framework for algorithmically optimizing LM prompts and weights
---

DSPy is a framework for algorithmically optimizing LM prompts and weights. Without DSPy, a generative AI engineering team building a complex system typically has to: (1) break the problem down into steps, (2) prompt the language model until each step works well on its own, (3) tweak the steps until they work well together, (4) generate data or synthetic examples to tune each step, and (5) use that data to fine-tune smaller language models to reduce cost.

The pain point is that any change to the pipeline, the language model, or the prompts may require reworking all of the data and prompts. This is time-consuming, repetitive, and mechanical.

#### DSPy streamlines the generative AI application development lifecycle in two ways:

1. It separates the flow of the program (`modules`) from the parameters (LM prompts and weights) of each step.

1. It introduces `optimizers`: LM-driven algorithms that tune the prompts and/or weights of your LM calls, given a `metric` that you want to maximize.

**Note**: DSPy optimizers can "compile" the same program into different instructions, few-shot prompts, and/or weight updates (finetunes) for each LM. In this approach, language models and their prompts fade into the background as optimizable pieces of a larger system that can learn from data. The result is less prompting, higher scores, and a more systematic approach to solving hard tasks with LMs.
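To make the idea of tuning parameters against a metric concrete, here is a minimal sketch in plain Python. It does not use DSPy: `StepParams`, `toy_lm`, `program`, and `pick_best_instruction` are hypothetical names, and `toy_lm` is a deterministic stand-in for a real LM call. The point is only that the program flow stays fixed while a search picks the instruction that scores best on the metric.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class StepParams:
    """Tunable parameters of one LM step (hypothetical container)."""
    instruction: str


def toy_lm(prompt: str) -> str:
    """Stand-in for an LM: echoes the question, upper-cased only if asked to."""
    question = prompt.split("Question: ", 1)[1]
    return question.upper() if "upper case" in prompt else question


def program(question: str, params: StepParams) -> str:
    """The program flow. It never changes while the parameters are tuned."""
    return toy_lm(f"{params.instruction}\nQuestion: {question}")


def exact_match(expected: str, predicted: str) -> float:
    """Metric: 1.0 when the prediction equals the expected answer."""
    return 1.0 if expected == predicted else 0.0


def pick_best_instruction(
    candidates: List[str],
    trainset: List[Tuple[str, str]],
    metric: Callable[[str, str], float],
) -> Tuple[StepParams, float]:
    """Score every candidate instruction on the trainset and keep the best."""
    scored = []
    for instruction in candidates:
        params = StepParams(instruction)
        total = sum(metric(gold, program(q, params)) for q, gold in trainset)
        scored.append((total / len(trainset), params))
    best_score, best_params = max(scored, key=lambda pair: pair[0])
    return best_params, best_score


trainset = [("hello", "HELLO"), ("dspy", "DSPY")]
candidates = ["Repeat the question.", "Repeat the question in upper case."]
best_params, best_score = pick_best_instruction(candidates, trainset, exact_match)
print(best_params.instruction, best_score)
```

Real DSPy optimizers search a much richer space (instructions, few-shot examples, weights), but the contract is similar: a program, some examples, and a metric go in, and tuned parameters come out.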

#### Step 1: Install DSPy

In [49]:
# Install the latest version of DSPy directly from the GitHub repository
!pip install -Uq git+https://github.com/stanfordnlp/dspy.git


#### Step 2: Configure your DSPy environment
---

To configure the DSPy environment, import `dspy` and configure the language model of your choice. DSPy integrates with a wide range of models that can be used across the steps of the GenAI application development lifecycle. Some supported models are `Anthropic Models`, `AWSAnthropic`, `AWSMeta` and more.

In [None]:
import os
import dspy
import logging

# Set a logger
logging.basicConfig(format='[%(asctime)s] p%(process)s {%(filename)s:%(lineno)d} %(levelname)s - %(message)s', level=logging.INFO)
logger = logging.getLogger(__name__)

In [None]:
# Define the AWS region and the Bedrock model IDs
AWS_REGION: str = "us-west-2"
BEDROCK_HAIKU_MODELID: str = "anthropic.claude-3-haiku-20240307-v1:0"
SONNET_3_5_MODELID: str = "anthropic.claude-3-5-sonnet-20240620-v1:0"
TITAN_TEXT_EMBED_MODELID: str = "amazon.titan-embed-text-v2:0"

In [None]:
def configure_dspy_with_bedrock_model(model_id: str, region: str):
    """
    Configure DSPy with the specified Bedrock model ID.

    Args:
    model_id (str): The Bedrock model ID to use.
    region (str): The AWS region to use. It is not forwarded to `dspy.LM` here, so the region is resolved from your AWS configuration.

    Returns:
    None
    """
    try:
        # Create a DSPy language model with the specified model ID
        bedrock_lm = dspy.LM(model_id)
        # Configure DSPy to use this language model
        dspy.configure(lm=bedrock_lm)
        logger.info(f"DSPy configured with model: {model_id}")
    except Exception as e:
        logger.error(f"Error configuring DSPy with model {model_id}: {e}")
        raise

In [None]:
configure_dspy_with_bedrock_model(BEDROCK_HAIKU_MODELID, AWS_REGION)

[2024-11-04 09:55:30,923] p68133 {1091664385.py:17} INFO - DSPy configured with model: anthropic.claude-3-haiku-20240307-v1:0


### More about DSPy
---

DSPy, developed by Stanford NLP, is an open-source library for building applications with language models by programming them rather than hand-writing prompts. It is built around four core concepts: Signatures, Modules, Optimizers, and the Compiler.

1. **Signatures**: Declarative specs of the input/output behavior of a module. This cleanly separates what we want the module to do from how it does it. You describe the fields (the descriptions are used to build the prompt), and the field names carry semantic meaning, as explained below.

1. **Modules**: The core part of the program, which manages the flow logic. DSPy provides built-in modules such as `Predict`, `ChainOfThought`, and `ReAct`. You can also create your own and compose multiple modules.

1. **Optimizers**: The framework provides several optimizers (e.g. `LabeledFewShot`, `BootstrapFewShotWithRandomSearch`) that tune the prompt (for example, by adding few-shot examples chosen by random selection) and, for some optimizers, the LM weights. They evaluate candidates against the metric you want to optimize.

1. **Compiler**: Optimizes the instructions of a module and selects relevant, efficient examples for the task. The compiled program can be saved to disk and reloaded, similar to a checkpoint.

_**Note**: DSPy aims to address the challenges of programming with language models by providing composable building blocks that represent individual units of work. DSPy doesn’t eliminate language prompts altogether; instead, it builds the prompts based on the signatures, hints, and target models. DSPy helps us focus on writing the core logic along with signature field annotations and hints, rather than constructing lengthy prompts from scratch._

#### Exploring some basic DSPy `Module(s)`
---
A DSPy module is a building block for programs that use Language Models (LMs). Modules are designed to abstract specific prompting techniques and can be combined to create larger programs.

In [None]:
# Predict module: it takes a DSPy signature (a structured input/output schema) and gives you a
# callable function with the specified behaviour

qa = dspy.Predict('question: str -> response: str')
qa(question="What is the capital of France?").response

# In this example, haiku is used since it is the configured LM. For all other predictions made, 
# the LM configured in the current environment will be used.

'The capital of France is Paris.'
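A string signature such as `'question: str -> response: str'` can be read as "these input fields produce these output fields". The sketch below shows one way such a string could be split into typed fields. It is not DSPy's parser: `parse_signature` is a hypothetical helper, and defaulting untyped fields to `str` is an assumption of this sketch.

```python
from typing import Dict, Tuple


def parse_signature(signature: str) -> Tuple[Dict[str, str], Dict[str, str]]:
    """Split 'a: type, b -> c: type' into input and output field maps.

    Fields without an explicit type default to 'str' in this sketch.
    """

    def parse_fields(part: str) -> Dict[str, str]:
        fields = {}
        for item in part.split(","):
            # 'question: str' -> name='question', type_name='str'
            name, _, type_name = item.partition(":")
            fields[name.strip()] = type_name.strip() or "str"
        return fields

    inputs_part, outputs_part = signature.split("->")
    return parse_fields(inputs_part), parse_fields(outputs_part)


inputs, outputs = parse_signature("question: str -> response: str")
print(inputs, outputs)
```

The field names matter because they end up in the prompt, so a name like `question` already tells the LM something about the role of the value.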

In this example, the `qa` module passed the signature, choice of LM, and inputs into an `Adapter`, which is a layer that handles the structuring of the inputs and parsing structured outputs to fit the signature.

In [None]:
# Inspect the `n` last prompts sent by DSPy
dspy.inspect_history(n=1)

[2024-11-04T09:26:05.417913]

System message:

Your input fields are:
1. `question` (str)

Your output fields are:
1. `response` (str)

All interactions will be structured in the following way, with the appropriate values filled in.

[[ ## question ## ]]
{question}

[[ ## response ## ]]
{response}

[[ ## completed ## ]]

In adhering to this structure, your objective is: 
        Given the fields `question`, produce the fields `response`.


User message:

[[ ## question ## ]]
What is the capital of France?

Respond with the corresponding output fields, starting with the field `[[ ## response ## ]]`, and then ending with the marker for `[[ ## completed ## ]]`.


Response:

[[ ## response ## ]]
The capital of France is Paris.

[[ ## completed ## ]]
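The `[[ ## field ## ]]` markers in the prompt above are what lets an adapter turn named inputs into text and turn the LM's text back into named outputs. The following is a minimal sketch of that round trip, assuming only the marker format visible in the history. `format_fields` and `parse_fields` are hypothetical helpers, not DSPy's adapter implementation.

```python
import re
from typing import Dict, List


def format_fields(values: Dict[str, str]) -> str:
    """Render each field as a '[[ ## name ## ]]' header followed by its value."""
    return "\n\n".join(f"[[ ## {name} ## ]]\n{value}" for name, value in values.items())


def parse_fields(completion: str, expected: List[str]) -> Dict[str, str]:
    """Extract the expected fields from a completion that uses the same markers.

    Raises KeyError if an expected field is missing from the completion.
    """
    # re.split with a capture group returns [prefix, name1, body1, name2, body2, ...]
    parts = re.split(r"\[\[ ## (\w+) ## \]\]", completion)
    found = {name: body.strip() for name, body in zip(parts[1::2], parts[2::2])}
    return {name: found[name] for name in expected}


prompt = format_fields({"question": "What is the capital of France?"})
completion = "[[ ## response ## ]]\nThe capital of France is Paris.\n\n[[ ## completed ## ]]"
parsed = parse_fields(completion, ["response"])
print(prompt)
print(parsed)
```

Keeping formatting and parsing in one layer means the module itself only deals with named fields, never with raw prompt text.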







In [None]:
cot = dspy.ChainOfThought('question -> response')
cot(question="Explain the concept of refraction?")

Prediction(
    reasoning='Refraction is the bending of light as it passes from one medium to another with a different refractive index. This occurs because the speed of light changes when it moves from one medium to another. \n\nWhen light travels from a medium with a lower refractive index (e.g., air) to a medium with a higher refractive index (e.g., water or glass), the light bends towards the normal, which is an imaginary line perpendicular to the surface at the point of incidence. Conversely, when light travels from a medium with a higher refractive index to a medium with a lower refractive index, the light bends away from the normal.\n\nThe amount of bending, or refraction, is determined by the refractive index of the two media and the angle at which the light strikes the surface. This phenomenon is responsible for many optical effects, such as the apparent bending of a straw in a glass of water, the magnification of objects seen through a lens, and the twinkling of stars.',
    

Now, you can use these simple concepts to combine signatures and modules with ordinary Python control flow to get work done with LLMs. However, to build a comprehensive and complex generative AI application, it is important to optimize for LLM performance and cost, which can be done by integrating `DSPy Optimizers`.

#### Evaluations using DSPy

1. To measure the quality of the DSPy system, the following components are needed:

    1. Input values (for example, some input `question`s)
    1. A `metric` that can score the quality of an output from the system
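The two ingredients above combine into a simple evaluation loop: run the program on each input, score each output with the metric, and average. The sketch below is plain Python with a stub program and an exact-match metric. `evaluate_program`, `stub_program`, and `exact_match` are hypothetical names for illustration, and reporting the mean as a percentage is an assumption of this sketch.

```python
from typing import Callable, List, Tuple


def exact_match(expected: str, predicted: str) -> float:
    """Metric: 1.0 when the prediction equals the expected answer."""
    return 1.0 if expected.strip().lower() == predicted.strip().lower() else 0.0


def evaluate_program(
    program: Callable[[str], str],
    devset: List[Tuple[str, str]],
    metric: Callable[[str, str], float],
) -> float:
    """Return the mean metric over the devset, as a percentage."""
    scores = [metric(gold, program(question)) for question, gold in devset]
    return round(100 * sum(scores) / len(scores), 2)


# Stub program: a lookup table standing in for an LM-backed module.
answers = {"capital of France?": "Paris", "capital of Italy?": "Rome"}


def stub_program(question: str) -> str:
    return answers.get(question, "unknown")


devset = [
    ("capital of France?", "Paris"),
    ("capital of Italy?", "Rome"),
    ("capital of Spain?", "Madrid"),
]
score = evaluate_program(stub_program, devset, exact_match)
print(score)
```

Exact match is too strict for long-form answers, which is why a softer metric is preferable for question answering with long responses.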

In [None]:
import ujson

# Download the 500 question-answer pairs from the RAG-QA Arena "Tech" dataset
!wget https://huggingface.co/dspy/cache/resolve/main/ragqa_arena_tech_500.json

with open('ragqa_arena_tech_500.json') as f:
    data = ujson.load(f)

# Inspect one of the datapoints
data[0]




{'question': 'how to transfer whatsapp voice message to computer?',
 'response': 'To transfer voice notes from WhatsApp on your device to your computer, you have the option to select the "Share" feature within the app and send the files via Email, Gmail, Bluetooth, or other available services.  \nYou can also move the files onto your phone\'s SD card, connect your phone to your computer via a USB cable, then find and transfer the files via File Explorer on your PC. \nAlternatively, you can choose to attach all the desired voice notes to an email and, from your phone, send them to your own email address.  \nUpon receiving the email on your computer, you can then download the voice note attachments.'}

In [None]:
# Create a list of examples. `dspy.Example` is the datatype that carries training or test datapoints in DSPy.
data = [dspy.Example(**d).with_inputs('question') for d in data]

example = data[10]
print(example)

Example({'question': 'what is the offical name of the third on-screen button?', 'response': 'This is a function commonly referred to as \'Recents\' in the context of Android devices and is officially named "Overview".  \nIt is also known by the term "Task Switcher or Recent Tasks".  \nThe Overview function displays a collection of thumbnails representing apps and Chrome tabs that have been recently accessed; touching a thumbnail will open the respective app, and swiping a thumbnail left or right will remove it from the list.  \nFor devices running on Android 4.4 or lower, the icon for the Overview function has a distinct appearance.'}) (input_keys={'question'})
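The `input_keys={'question'}` part of the output above records which fields are inputs; the remaining fields act as labels. The toy class below sketches that bookkeeping in plain Python. `ToyExample` is a hypothetical stand-in and not the `dspy.Example` class, whose methods return richer objects than the plain dictionaries used here.

```python
from typing import Any, Dict


class ToyExample:
    """Minimal stand-in for a labelled datapoint (not the dspy.Example class)."""

    def __init__(self, **fields: Any) -> None:
        self._fields: Dict[str, Any] = dict(fields)
        self._input_keys: set = set()

    def with_inputs(self, *keys: str) -> "ToyExample":
        # Return a copy so the original example is left untouched.
        copy = ToyExample(**self._fields)
        copy._input_keys = set(keys)
        return copy

    def inputs(self) -> Dict[str, Any]:
        """Fields that are passed to the program."""
        return {k: v for k, v in self._fields.items() if k in self._input_keys}

    def labels(self) -> Dict[str, Any]:
        """Fields that are only used for scoring."""
        return {k: v for k, v in self._fields.items() if k not in self._input_keys}


example = ToyExample(question="what is 2 + 2?", response="4").with_inputs("question")
print(example.inputs())
print(example.labels())
```

This split is what keeps the gold `response` out of the program's inputs while leaving it available to the metric.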


Divide the data into training and validation sets:

1. Once the data is split into training and validation sets, both are provided to the DSPy optimizers.

1. The optimizers learn from the training examples and check their progress using the validation examples. A suitable size for each of the training and validation sets is `30-300` examples.

1. For prompt optimizers, it is suggested to pass more validation examples than training examples.
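If the order of the source file is not already random, it can help to shuffle with a fixed seed before slicing, so that the splits are reproducible and not biased by file order. The notebook itself slices in file order; the shuffle, the seed, and the `split_examples` helper below are assumptions added for illustration.

```python
import random
from typing import List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def split_examples(
    examples: Sequence[T], train_size: int, val_size: int, seed: int = 0
) -> Tuple[List[T], List[T], List[T]]:
    """Shuffle deterministically, then slice into train, validation, and the rest."""
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)  # local RNG, global random state untouched
    train = shuffled[:train_size]
    val = shuffled[train_size:train_size + val_size]
    rest = shuffled[train_size + val_size:]
    return train, val, rest


examples = list(range(500))  # placeholder for 500 datapoints
train, val, rest = split_examples(examples, train_size=50, val_size=100)
print(len(train), len(val), len(rest))
```

The sizes here follow the guidance above, with more validation examples than training examples.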

In [None]:
trainset, valset, devset, testset = data[:50], data[50:150], data[150:300], data[300:500]

len(trainset), len(valset), len(devset), len(testset)

(50, 100, 150, 200)

#### Evaluation in DSPy
---

Several metrics can be used for question-answering tasks, so which one suits a specific task? If the questions have long answers, we might want to check how well the answer covers all of the key elements. For this, we can use the `SemanticF1` metric from DSPy.

In [None]:
from dspy.evaluate import SemanticF1

# Instantiate the metric.
metric = SemanticF1()

# Produce a prediction from our `cot` module, using the `example` above as input.
pred = cot(**example.inputs())

# Compute the metric score for the prediction.
score = metric(example, pred)

print(f"Question: \t {example.question}\n")
print(f"Gold Response: \t {example.response}\n")
print(f"Predicted Response: \t {pred.response}\n")
print(f"Semantic F1 Score: {score:.2f}")

Question: 	 what is the offical name of the third on-screen button?

Gold Response: 	 This is a function commonly referred to as 'Recents' in the context of Android devices and is officially named "Overview".  
It is also known by the term "Task Switcher or Recent Tasks".  
The Overview function displays a collection of thumbnails representing apps and Chrome tabs that have been recently accessed; touching a thumbnail will open the respective app, and swiping a thumbnail left or right will remove it from the list.  
For devices running on Android 4.4 or lower, the icon for the Overview function has a distinct appearance.

Predicted Response: 	 The official name of the third on-screen button is the Menu button.

Semantic F1 Score: 0.00
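An F1 score is the harmonic mean of precision and recall, so it is only high when both are high. For a semantic variant, the precision and recall are judged by an LM rather than counted from token overlap. The sketch below shows just the combination step. `f1_score` is a hypothetical helper, and treating this as the way `SemanticF1` combines its two judged values is an assumption.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall, defined as 0.0 when both are 0."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# A response that covers all key points but is half off-topic:
print(f1_score(precision=0.5, recall=1.0))
# A response that shares nothing with the ground truth:
print(f1_score(precision=0.0, recall=0.0))
```

This is consistent with the score of 0.00 above: the predicted "Menu button" answer shares no key idea with the gold response, so both precision and recall are zero.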


In [None]:
dspy.inspect_history(n=1)

[2024-11-04T09:43:44.453997]

System message:

Your input fields are:
1. `question` (str)
2. `ground_truth` (str)
3. `system_response` (str)

Your output fields are:
1. `reasoning` (str)
2. `recall` (float): fraction (out of 1.0) of ground truth covered by the system response
3. `precision` (float): fraction (out of 1.0) of system response covered by the ground truth

All interactions will be structured in the following way, with the appropriate values filled in.

[[ ## question ## ]]
{question}

[[ ## ground_truth ## ]]
{ground_truth}

[[ ## system_response ## ]]
{system_response}

[[ ## reasoning ## ]]
{reasoning}

[[ ## recall ## ]]
{recall}        # note: the value you produce must be a single float value

[[ ## precision ## ]]
{precision}        # note: the value you produce must be a single float value

[[ ## completed ## ]]

In adhering to this structure, your objective is: 
        Compare a system's response to the ground truth to compute its recall and p

### DSPy: Basic RAG Example
---

At this point, we have set up the DSPy LM (in this case, Claude 3 Haiku on Amazon Bedrock), loaded some data (`question-answer` pairs), and loaded a metric for evaluation (the Semantic F1 score). Now we will set up a basic RAG system using DSPy.

In [None]:
import os
import requests
from typing import List

urls: List = [
    'https://huggingface.co/dspy/cache/resolve/main/ragqa_arena_tech_500.json',
    'https://huggingface.co/datasets/colbertv2/lotte_passages/resolve/main/technology/test_collection.jsonl',
    'https://huggingface.co/dspy/cache/resolve/main/index.pt'
]

for url in urls:
    filename = os.path.basename(url)
    remote_size = int(requests.head(url, allow_redirects=True).headers.get('Content-Length', 0))
    local_size = os.path.getsize(filename) if os.path.exists(filename) else 0

    if local_size != remote_size:
        logger.info(f"Downloading '{filename}'...")
        with requests.get(url, stream=True) as r:
            r.raise_for_status()  # fail early on HTTP errors instead of saving an error page
            with open(filename, 'wb') as f:
                for chunk in r.iter_content(chunk_size=8192):
                    f.write(chunk)

[2024-11-04 09:48:13,545] p68133 {3040005063.py:17} INFO - Downloading 'test_collection.jsonl'...
[2024-11-04 09:48:49,487] p68133 {3040005063.py:17} INFO - Downloading 'index.pt'...


In [None]:
# Now let's set up the data and other objects (LM, metric and evaluation module)
configure_dspy_with_bedrock_model(SONNET_3_5_MODELID, AWS_REGION)

[2024-11-04 09:55:39,629] p68133 {1091664385.py:17} INFO - DSPy configured with model: anthropic.claude-3-5-sonnet-20240620-v1:0


In [None]:
with open('ragqa_arena_tech_500.json') as f:
    data = [dspy.Example(**d).with_inputs('question') for d in ujson.load(f)]
    trainset, valset, devset, testset = data[:50], data[50:150], data[150:300], data[300:500]

metric = SemanticF1()
evaluate = dspy.Evaluate(devset=devset, metric=metric, num_threads=24, display_progress=True, display_table=3)

In [53]:
import torch
import functools
from litellm import embedding as Embed

with open("test_collection.jsonl") as f:
    corpus = [ujson.loads(line) for line in f]

max_characters = 4000 # >98th percentile of document lengths

index = torch.load('index.pt', weights_only=True)
embedding_dim = index.shape[1]

@functools.lru_cache(maxsize=None)
def search(query, k=5):
    query_embedding = torch.tensor(Embed(input=query, model=TITAN_TEXT_EMBED_MODELID).data[0]['embedding'])
    
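    # Ensure the query embedding has the same dimension as the index rows.
    # Caveat: padding or truncating only makes the shapes line up. If the index
    # was built with a different embedding model than the one used for the query,
    # the resulting similarity scores are not meaningful.
    if query_embedding.shape[0] != embedding_dim:
        # If dimensions don't match, pad or truncate the query embedding
        if query_embedding.shape[0] < embedding_dim:
            query_embedding = torch.nn.functional.pad(query_embedding, (0, embedding_dim - query_embedding.shape[0]))
        else:
            query_embedding = query_embedding[:embedding_dim]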
    
    # Reshape query_embedding to match index dimensions
    query_embedding = query_embedding.reshape(-1, 1)
    
    # Perform the search
    topk_scores, topk_indices = torch.matmul(index, query_embedding).squeeze().topk(k)
    topK = [dict(score=score.item(), **corpus[idx]) for idx, score in zip(topk_indices, topk_scores)]
    return [doc['text'][:max_characters] for doc in topK]
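The core of the `search` function above is a dot product between every row of the index and the query embedding, followed by a top-k selection. The sketch below reproduces that step in plain Python on tiny hand-written vectors, so it can be followed without `torch` or an embedding model. `top_k` and the example vectors are hypothetical.

```python
from typing import List, Sequence, Tuple


def top_k(
    index: Sequence[Sequence[float]], query: Sequence[float], k: int
) -> List[Tuple[int, float]]:
    """Return (row, score) pairs for the k rows with the highest dot product."""
    scores = [sum(d * q for d, q in zip(doc, query)) for doc in index]
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return [(i, scores[i]) for i in ranked[:k]]


# Three 2-dimensional "document embeddings" and one query embedding.
index = [[1.0, 0.0], [0.0, 1.0], [0.7, 0.7]]
query = [1.0, 0.2]
results = top_k(index, query, k=2)
print(results)
```

The ranking is only meaningful when documents and queries are embedded by the same model into the same space, which is worth checking whenever retrieved passages look unrelated to the question.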

In [54]:
class RAG(dspy.Module):
    def __init__(self, num_docs=5):
        super().__init__()
        self.num_docs = num_docs
        self.respond = dspy.ChainOfThought('context, question -> response')

    def forward(self, question):
        context = search(question, k=self.num_docs)
        return self.respond(context=context, question=question)

In [55]:
rag = RAG()
rag(question="what are high memory and low memory on linux?")


Prediction(
    reasoning="The given context does not contain any information about high memory and low memory on Linux. The context includes various unrelated topics such as email clients, file operations, and software installation/uninstallation. To answer this question, I'll need to provide a general explanation based on common knowledge about Linux memory management.",
    response='I apologize, but the provided context doesn\'t contain any information about high memory and low memory on Linux. However, I can provide a general explanation:\n\nIn Linux, "high memory" and "low memory" refer to different regions of physical memory:\n\n1. Low memory: This is the first 16 MB (on 32-bit systems) or 896 MB (on 64-bit systems) of physical RAM. It\'s directly accessible by the kernel and can be used for any purpose.\n\n2. High memory: This is the memory above the low memory threshold. On 32-bit systems, it requires special mapping techniques to be accessed by the kernel.\n\nThe distinction 

In [56]:
dspy.inspect_history()

[2024-11-04T10:06:28.053935]

System message:

Your input fields are:
1. `context` (str)
2. `question` (str)

Your output fields are:
1. `reasoning` (str)
2. `response` (str)

All interactions will be structured in the following way, with the appropriate values filled in.

[[ ## context ## ]]
{context}

[[ ## question ## ]]
{question}

[[ ## reasoning ## ]]
{reasoning}

[[ ## response ## ]]
{response}

[[ ## completed ## ]]

In adhering to this structure, your objective is: 
        Given the fields `context`, `question`, produce the fields `response`.


User message:

[[ ## context ## ]]
[1] «No it doesnt support RTL, anyway i dont know you would like to migrate to geary , But i test it and its very poor. You can use evolution or i suggest you stay in thunderbird.»
[2] «move ~/.config/transmission to external drive. create symbolic link on new system pointing to external drive. keep in mind that if you unplug drive Transmission would be misbehaving (its 

In [58]:
configure_dspy_with_bedrock_model(BEDROCK_HAIKU_MODELID, AWS_REGION)
evaluate(RAG())

[2024-11-04 10:08:26,604] p68133 {1091664385.py:17} INFO - DSPy configured with model: anthropic.claude-3-haiku-20240307-v1:0

| | question | example_response | reasoning | pred_response | SemanticF1 |
|---|---|---|---|---|---|
| 0 | why is mercurial considered to be easier than git? | Mercurial's syntax is considered more familiar, especially for those accustomed to SVN, and is well documented. It focuses on interface aspects, which initially makes learning... | Based on the provided context, there is no information about why Mercurial is considered easier than Git. The context discusses topics such as steganography, TrueCrypt,... | I apologize, but I do not have enough information in the provided context to determine why Mercurial is considered easier than Git. The given context... | |
| 1 | open finder window from current terminal location? | If you type 'open .' in Terminal, it will open the current directory in a Finder window. Alternatively, you can execute the command open `pwd`... | To open a Finder window from the current terminal location, you can use the `open` command in the terminal. The `open` command allows you to... | open . | ✔️ [1.000] |
| 2 | how to import secret gpg key (copied from one machine to another)? | It is advised that it is necessary to add `--import` to the command line to import the private key and that according to the man... | To import a secret GPG key that has been copied from one machine to another, you can follow these steps: 1. Copy the secret key... | To import a secret GPG key from one machine to another: 1. Copy the secret key file (e.g., `secring.gpg` or `.sec`) to the new machine.... | ✔️ [1.000] |


59.17

#### Use `DSPy Optimizer` to improve the RAG prompt
---

As shown above, the unoptimized RAG program scores `59.17`% on the development set. To improve this, we can optimize the prompts in our RAG pipeline.

We now use DSPy's MIPRO (v2) optimizer. The run below costs around $1.50 (for the `medium` auto setting) and may take 20-30 minutes, depending on the number of threads.


`MIPROv2 (Multiprompt Instruction PRoposal Optimizer Version 2)` is a prompt optimizer capable of optimizing both instructions and few-shot examples jointly. It does this by bootstrapping few-shot example candidates, proposing instructions grounded in different dynamics of the task, and finding an optimized combination of these options using Bayesian Optimization. It can be used to optimize few-shot examples and instructions jointly, or instructions alone for 0-shot optimization.
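The first stage described above, bootstrapping few-shot example candidates, can be sketched in plain Python: run the program on training examples and keep the runs that pass the metric as demonstrations, stopping once enough have been collected. This is an illustration only. `bootstrap_demos`, the stub program, and the threshold are hypothetical, and the instruction proposal and Bayesian search stages are not shown.

```python
from typing import Callable, Dict, List, Tuple


def bootstrap_demos(
    program: Callable[[str], str],
    trainset: List[Tuple[str, str]],
    metric: Callable[[str, str], float],
    max_demos: int,
    threshold: float = 1.0,
) -> Tuple[List[Dict[str, str]], int]:
    """Collect up to max_demos successful runs; also report how many were tried."""
    demos: List[Dict[str, str]] = []
    attempts = 0
    for question, gold in trainset:
        if len(demos) >= max_demos:
            break
        attempts += 1
        prediction = program(question)
        # Keep only runs that the metric accepts, so demos show good behaviour.
        if metric(gold, prediction) >= threshold:
            demos.append({"question": question, "response": prediction})
    return demos, attempts


def exact_match(expected: str, predicted: str) -> float:
    return 1.0 if expected == predicted else 0.0


# Stub program: knows some answers, standing in for an LM-backed module.
known = {"2 + 2?": "4", "3 + 3?": "6"}


def stub_program(question: str) -> str:
    return known.get(question, "unknown")


trainset = [("2 + 2?", "4"), ("5 + 5?", "10"), ("3 + 3?", "6"), ("1 + 1?", "2")]
demos, attempts = bootstrap_demos(stub_program, trainset, exact_match, max_demos=2)
print(len(demos), attempts)
```

Repeating this over different shuffles of the training set gives several candidate demo sets, which a later search stage can combine with candidate instructions.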

In [59]:
tp = dspy.MIPROv2(metric=metric, auto="medium", num_threads=24)  # use fewer threads if your rate limit is small

optimized_rag = tp.compile(RAG(), trainset=trainset, valset=valset,
                           max_bootstrapped_demos=2, max_labeled_demos=2,
                           requires_permission_to_run=False)


RUNNING WITH THE FOLLOWING MEDIUM AUTO RUN SETTINGS:
num_trials: 25
minibatch: True
num_candidates: 19
valset size: 100


==> STEP 1: BOOTSTRAP FEWSHOT EXAMPLES <==
These will be used as few-shot example candidates for our program and for creating instructions.

Bootstrapping N=19 sets of demonstrations...
Bootstrapping set 1/19
Bootstrapping set 2/19
Bootstrapping set 3/19



Bootstrapped 2 full traces after 2 examples for up to 1 rounds, amounting to 2 attempts.
Bootstrapping set 4/19



Bootstrapped 1 full traces after 2 examples for up to 1 rounds, amounting to 2 attempts.
Bootstrapping set 5/19



Bootstrapped 1 full traces after 1 examples for up to 1 rounds, amounting to 1 attempts.
Bootstrapping set 6/19



Bootstrapped 2 full traces after 2 examples for up to 1 rounds, amounting to 2 attempts.
Bootstrapping set 7/19



Bootstrapped 2 full traces after 2 examples for up to 1 rounds, amounting to 2 attempts.
Bootstrapping set 8/19



Bootstrapped 2 full traces after 3 examples for up to 1 rounds, amounting to 3 attempts.
Bootstrapping set 9/19



Bootstrapped 1 full traces after 1 examples for up to 1 rounds, amounting to 1 attempts.
Bootstrapping set 10/19



Bootstrapped 2 full traces after 2 examples for up to 1 rounds, amounting to 2 attempts.
Bootstrapping set 11/19



Bootstrapped 2 full traces after 4 examples for up to 1 rounds, amounting to 4 attempts.
Bootstrapping set 12/19



Bootstrapped 1 full traces after 1 examples for up to 1 rounds, amounting to 1 attempts.
Bootstrapping set 13/19



Bootstrapped 1 full traces after 1 examples for up to 1 rounds, amounting to 1 attempts.
Bootstrapping set 14/19



Bootstrapped 2 full traces after 3 examples for up to 1 rounds, amounting to 3 attempts.
Bootstrapping set 15/19



Bootstrapped 2 full traces after 2 examples for up to 1 rounds, amounting to 2 attempts.
Bootstrapping set 16/19



Bootstrapped 2 full traces after 2 examples for up to 1 rounds, amounting to 2 attempts.
Bootstrapping set 17/19



Bootstrapped 1 full traces after 1 examples for up to 1 rounds, amounting to 1 attempts.
Bootstrapping set 18/19



Bootstrapped 1 full traces after 1 examples for up to 1 rounds, amounting to 1 attempts.
Bootstrapping set 19/19



Bootstrapped 2 full traces after 4 examples for up to 1 rounds, amounting to 4 attempts.

==> STEP 2: PROPOSE INSTRUCTION CANDIDATES <==
We will use the few-shot examples from the previous step, a generated dataset summary, a summary of the program code, and a randomly selected prompting tip to propose instructions.




Proposing instructions...




Proposed Instructions for Predictor 0:

0: Given the fields `context`, `question`, produce the fields `response`.

1: Given a technical question and relevant context information, provide a step-by-step reasoning process to arrive at a clear and accurate response that addresses the user's query.

2: Given a context containing technical information and a question about a specific task, generate a detailed, step-by-step response that directly addresses the user's question, even if the context does not contain information directly relevant to the question.

3: You are an AI assistant tasked with providing critical technical support to a user who is facing a serious computer issue. The user has provided you with relevant context about the problem, as well as a specific question they need answered. Your response could mean the difference between the user being able to resolve the issue or potentially losing important data or functionality on their system. 

Given the fields `context`, `quest


Default program score: 61.92

==> STEP 3: FINDING OPTIMAL PROMPT PARAMETERS <==
We will evaluate the program over a series of trials with different combinations of instructions and few-shot examples to find the optimal combination using Bayesian Optimization.

== Minibatch Trial 1 / 25 ==



Score: 89.96 on minibatch of size 25 with parameters ['Predictor 1: Instruction 12', 'Predictor 1: Few-Shot Set 7'].
Minibatch scores so far: [89.96]
Full eval scores so far: [61.92]
Best full score so far: 61.92


== Minibatch Trial 2 / 25 ==



Score: 77.26 on minibatch of size 25 with parameters ['Predictor 1: Instruction 10', 'Predictor 1: Few-Shot Set 7'].
Minibatch scores so far: [89.96, 77.26]
Full eval scores so far: [61.92]
Best full score so far: 61.92


== Minibatch Trial 3 / 25 ==



Score: 83.82 on minibatch of size 25 with parameters ['Predictor 1: Instruction 7', 'Predictor 1: Few-Shot Set 18'].
Minibatch scores so far: [89.96, 77.26, 83.82]
Full eval scores so far: [61.92]
Best full score so far: 61.92


== Minibatch Trial 4 / 25 ==



Score: 80.63 on minibatch of size 25 with parameters ['Predictor 1: Instruction 15', 'Predictor 1: Few-Shot Set 2'].
Minibatch scores so far: [89.96, 77.26, 83.82, 80.63]
Full eval scores so far: [61.92]
Best full score so far: 61.92


== Minibatch Trial 5 / 25 ==



Score: 83.86 on minibatch of size 25 with parameters ['Predictor 1: Instruction 8', 'Predictor 1: Few-Shot Set 18'].
Minibatch scores so far: [89.96, 77.26, 83.82, 80.63, 83.86]
Full eval scores so far: [61.92]
Best full score so far: 61.92


== Minibatch Trial 6 / 25 ==



Score: 81.46 on minibatch of size 25 with parameters ['Predictor 1: Instruction 7', 'Predictor 1: Few-Shot Set 1'].
Minibatch scores so far: [89.96, 77.26, 83.82, 80.63, 83.86, 81.46]
Full eval scores so far: [61.92]
Best full score so far: 61.92


== Minibatch Trial 7 / 25 ==



Score: 81.12 on minibatch of size 25 with parameters ['Predictor 1: Instruction 7', 'Predictor 1: Few-Shot Set 12'].
Minibatch scores so far: [89.96, 77.26, 83.82, 80.63, 83.86, 81.46, 81.12]
Full eval scores so far: [61.92]
Best full score so far: 61.92


== Minibatch Trial 8 / 25 ==



Score: 79.92 on minibatch of size 25 with parameters ['Predictor 1: Instruction 11', 'Predictor 1: Few-Shot Set 13'].
Minibatch scores so far: [89.96, 77.26, 83.82, 80.63, 83.86, 81.46, 81.12, 79.92]
Full eval scores so far: [61.92]
Best full score so far: 61.92


== Minibatch Trial 9 / 25 ==



Score: 89.71 on minibatch of size 25 with parameters ['Predictor 1: Instruction 5', 'Predictor 1: Few-Shot Set 4'].
Minibatch scores so far: [89.96, 77.26, 83.82, 80.63, 83.86, 81.46, 81.12, 79.92, 89.71]
Full eval scores so far: [61.92]
Best full score so far: 61.92


== Minibatch Trial 10 / 25 ==



Score: 70.36 on minibatch of size 25 with parameters ['Predictor 1: Instruction 14', 'Predictor 1: Few-Shot Set 1'].
Minibatch scores so far: [89.96, 77.26, 83.82, 80.63, 83.86, 81.46, 81.12, 79.92, 89.71, 70.36]
Full eval scores so far: [61.92]
Best full score so far: 61.92


===== Full Eval 1 =====
Doing full eval on next top averaging program (Avg Score: 89.96) from minibatch trials...



New best full eval score! Score: 85.86
Full eval scores so far: [61.92, 85.86]
Best full score so far: 85.86


== Minibatch Trial 11 / 25 ==



Score: 82.19 on minibatch of size 25 with parameters ['Predictor 1: Instruction 12', 'Predictor 1: Few-Shot Set 3'].
Minibatch scores so far: [89.96, 77.26, 83.82, 80.63, 83.86, 81.46, 81.12, 79.92, 89.71, 70.36, 82.19]
Full eval scores so far: [61.92, 85.86]
Best full score so far: 85.86


== Minibatch Trial 12 / 25 ==

Score: 88.41 on minibatch of size 25 with parameters ['Predictor 1: Instruction 12', 'Predictor 1: Few-Shot Set 7'].
Minibatch scores so far: [89.96, 77.26, 83.82, 80.63, 83.86, 81.46, 81.12, 79.92, 89.71, 70.36, 82.19, 88.41]
Full eval scores so far: [61.92, 85.86]
Best full score so far: 85.86


== Minibatch Trial 13 / 25 ==

Score: 83.17 on minibatch of size 25 with parameters ['Predictor 1: Instruction 5', 'Predictor 1: Few-Shot Set 4'].
Minibatch scores so far: [89.96, 77.26, 83.82, 80.63, 83.86, 81.46, 81.12, 79.92, 89.71, 70.36, 82.19, 88.41, 83.17]
Full eval scores so far: [61.92, 85.86]
Best full score so far: 85.86


== Minibatch Trial 14 / 25 ==

Score: 86.08 on minibatch of size 25 with parameters ['Predictor 1: Instruction 3', 'Predictor 1: Few-Shot Set 17'].
Minibatch scores so far: [89.96, 77.26, 83.82, 80.63, 83.86, 81.46, 81.12, 79.92, 89.71, 70.36, 82.19, 88.41, 83.17, 86.08]
Full eval scores so far: [61.92, 85.86]
Best full score so far: 85.86


== Minibatch Trial 15 / 25 ==

Score: 85.3 on minibatch of size 25 with parameters ['Predictor 1: Instruction 4', 'Predictor 1: Few-Shot Set 4'].
Minibatch scores so far: [89.96, 77.26, 83.82, 80.63, 83.86, 81.46, 81.12, 79.92, 89.71, 70.36, 82.19, 88.41, 83.17, 86.08, 85.3]
Full eval scores so far: [61.92, 85.86]
Best full score so far: 85.86


== Minibatch Trial 16 / 25 ==

Score: 91.27 on minibatch of size 25 with parameters ['Predictor 1: Instruction 6', 'Predictor 1: Few-Shot Set 6'].
Minibatch scores so far: [89.96, 77.26, 83.82, 80.63, 83.86, 81.46, 81.12, 79.92, 89.71, 70.36, 82.19, 88.41, 83.17, 86.08, 85.3, 91.27]
Full eval scores so far: [61.92, 85.86]
Best full score so far: 85.86


== Minibatch Trial 17 / 25 ==

Score: 80.07 on minibatch of size 25 with parameters ['Predictor 1: Instruction 6', 'Predictor 1: Few-Shot Set 6'].
Minibatch scores so far: [89.96, 77.26, 83.82, 80.63, 83.86, 81.46, 81.12, 79.92, 89.71, 70.36, 82.19, 88.41, 83.17, 86.08, 85.3, 91.27, 80.07]
Full eval scores so far: [61.92, 85.86]
Best full score so far: 85.86


== Minibatch Trial 18 / 25 ==

Score: 77.82 on minibatch of size 25 with parameters ['Predictor 1: Instruction 1', 'Predictor 1: Few-Shot Set 6'].
Minibatch scores so far: [89.96, 77.26, 83.82, 80.63, 83.86, 81.46, 81.12, 79.92, 89.71, 70.36, 82.19, 88.41, 83.17, 86.08, 85.3, 91.27, 80.07, 77.82]
Full eval scores so far: [61.92, 85.86]
Best full score so far: 85.86


== Minibatch Trial 19 / 25 ==

Score: 78.89 on minibatch of size 25 with parameters ['Predictor 1: Instruction 13', 'Predictor 1: Few-Shot Set 14'].
Minibatch scores so far: [89.96, 77.26, 83.82, 80.63, 83.86, 81.46, 81.12, 79.92, 89.71, 70.36, 82.19, 88.41, 83.17, 86.08, 85.3, 91.27, 80.07, 77.82, 78.89]
Full eval scores so far: [61.92, 85.86]
Best full score so far: 85.86


== Minibatch Trial 20 / 25 ==

Score: 87.24 on minibatch of size 25 with parameters ['Predictor 1: Instruction 6', 'Predictor 1: Few-Shot Set 2'].
Minibatch scores so far: [89.96, 77.26, 83.82, 80.63, 83.86, 81.46, 81.12, 79.92, 89.71, 70.36, 82.19, 88.41, 83.17, 86.08, 85.3, 91.27, 80.07, 77.82, 78.89, 87.24]
Full eval scores so far: [61.92, 85.86]
Best full score so far: 85.86


===== Full Eval 2 =====
Doing full eval on next top averaging program (Avg Score: 87.24) from minibatch trials...

Full eval scores so far: [61.92, 85.86, 85.03]
Best full score so far: 85.86


== Minibatch Trial 21 / 25 ==

Score: 79.9 on minibatch of size 25 with parameters ['Predictor 1: Instruction 16', 'Predictor 1: Few-Shot Set 15'].
Minibatch scores so far: [89.96, 77.26, 83.82, 80.63, 83.86, 81.46, 81.12, 79.92, 89.71, 70.36, 82.19, 88.41, 83.17, 86.08, 85.3, 91.27, 80.07, 77.82, 78.89, 87.24, 79.9]
Full eval scores so far: [61.92, 85.86, 85.03]
Best full score so far: 85.86


== Minibatch Trial 22 / 25 ==

Score: 50.27 on minibatch of size 25 with parameters ['Predictor 1: Instruction 5', 'Predictor 1: Few-Shot Set 8'].
Minibatch scores so far: [89.96, 77.26, 83.82, 80.63, 83.86, 81.46, 81.12, 79.92, 89.71, 70.36, 82.19, 88.41, 83.17, 86.08, 85.3, 91.27, 80.07, 77.82, 78.89, 87.24, 79.9, 50.27]
Full eval scores so far: [61.92, 85.86, 85.03]
Best full score so far: 85.86


== Minibatch Trial 23 / 25 ==

Score: 80.23 on minibatch of size 25 with parameters ['Predictor 1: Instruction 5', 'Predictor 1: Few-Shot Set 9'].
Minibatch scores so far: [89.96, 77.26, 83.82, 80.63, 83.86, 81.46, 81.12, 79.92, 89.71, 70.36, 82.19, 88.41, 83.17, 86.08, 85.3, 91.27, 80.07, 77.82, 78.89, 87.24, 79.9, 50.27, 80.23]
Full eval scores so far: [61.92, 85.86, 85.03]
Best full score so far: 85.86


== Minibatch Trial 24 / 25 ==

Score: 77.32 on minibatch of size 25 with parameters ['Predictor 1: Instruction 2', 'Predictor 1: Few-Shot Set 0'].
Minibatch scores so far: [89.96, 77.26, 83.82, 80.63, 83.86, 81.46, 81.12, 79.92, 89.71, 70.36, 82.19, 88.41, 83.17, 86.08, 85.3, 91.27, 80.07, 77.82, 78.89, 87.24, 79.9, 50.27, 80.23, 77.32]
Full eval scores so far: [61.92, 85.86, 85.03]
Best full score so far: 85.86


== Minibatch Trial 25 / 25 ==

Score: 81.95 on minibatch of size 25 with parameters ['Predictor 1: Instruction 0', 'Predictor 1: Few-Shot Set 5'].
Minibatch scores so far: [89.96, 77.26, 83.82, 80.63, 83.86, 81.46, 81.12, 79.92, 89.71, 70.36, 82.19, 88.41, 83.17, 86.08, 85.3, 91.27, 80.07, 77.82, 78.89, 87.24, 79.9, 50.27, 80.23, 77.32, 81.95]
Full eval scores so far: [61.92, 85.86, 85.03]
Best full score so far: 85.86


===== Full Eval 3 =====
Doing full eval on next top averaging program (Avg Score: 86.44) from minibatch trials...

Full eval scores so far: [61.92, 85.86, 85.03, 83.56]
Best full score so far: 85.86


Returning best identified program with score 85.86!
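
The log above shows that minibatch scores are noisy estimates: trials 16 and 17 both used Instruction 6 with Few-Shot Set 6, yet scored 91.27 and 80.07. The following is a minimal sketch, not part of the original notebook, that summarizes the logged scores with the Python standard library. The score lists are copied from the log; the helper name `summarize` is hypothetical.

```python
from statistics import mean, stdev

# Minibatch scores copied from the optimizer log above (25 trials of 25 examples each).
minibatch_scores = [
    89.96, 77.26, 83.82, 80.63, 83.86, 81.46, 81.12, 79.92, 89.71, 70.36,
    82.19, 88.41, 83.17, 86.08, 85.3, 91.27, 80.07, 77.82, 78.89, 87.24,
    79.9, 50.27, 80.23, 77.32, 81.95,
]
# Full-evaluation scores copied from the same log (100 examples each).
full_eval_scores = [61.92, 85.86, 85.03, 83.56]


def summarize(scores):
    """Return basic descriptive statistics for a list of scores (hypothetical helper)."""
    return {
        "n": len(scores),
        "mean": round(mean(scores), 2),
        "stdev": round(stdev(scores), 2),
        "min": min(scores),
        "max": max(scores),
    }


minibatch_summary = summarize(minibatch_scores)

# Trials 16 and 17 (list indices 15 and 16) evaluated the same parameter combination.
same_config_gap = round(minibatch_scores[15] - minibatch_scores[16], 2)

print("Minibatch summary:", minibatch_summary)
print("Gap between two runs of the same configuration:", same_config_gap)
print("Best full eval score:", max(full_eval_scores))
```

A spread of this size between two minibatches for the same configuration is presumably why the optimizer re-scores the top-averaging candidates on the full set before it returns a program.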


#### Compare the baseline RAG with the optimized RAG

In [61]:
baseline = rag(question="cmd+tab does not work on hidden or minimized windows")
logger.info(baseline.response)

[2024-11-04 10:22:28,057] p68133 {2536420804.py:2} INFO - To switch between hidden or minimized windows, you could try the following alternative methods:

1. Use the task manager or window switcher provided by your operating system to view and select the desired window.
2. Utilize keyboard shortcuts specific to your operating system for managing windows, such as Alt+Tab or Windows+Tab on Windows, or Mission Control on macOS.
3. Explore third-party window management tools that may provide more advanced features for handling hidden or minimized windows.

The specific steps may vary depending on your operating system and the applications you are using. I recommend researching the available window management options and keyboard shortcuts for your specific setup.


In [62]:
pred = optimized_rag(question="cmd+tab does not work on hidden or minimized windows")
logger.info(pred.response)

[2024-11-04 10:22:34,289] p68133 {3450493969.py:2} INFO - The cmd+tab keyboard shortcut is used on macOS to switch between open applications. However, this shortcut typically only works for 

In [63]:
# Now evaluate the optimized RAG approach
evaluate(optimized_rag)

| # | question | example_response | reasoning | pred_response | SemanticF1 |
|---|---|---|---|---|---|
| 0 | why is mercurial considered to be easier than git? | Mercurial's syntax is considered more familiar, especially for those accustomed to SVN, and is well documented. It focuses on interface aspects, which initially makes learning... | The provided context does not contain any information about the relative ease of using Mercurial versus Git. The context discusses topics such as steganography, TrueCrypt,... | I do not have enough information in the provided context to determine why Mercurial is considered easier to use than Git. The context does not... | ✔️ [0.800] |
| 1 | open finder window from current terminal location? | If you type 'open .' in Terminal, it will open the current directory in a Finder window. Alternatively, you can execute the command open `pwd`... | The provided context does not contain any information about opening a Finder window from the current terminal location. The context discusses topics such as changing... | To open a Finder window from the current terminal location on macOS, you can use the following command: `open .` This will open... | ✔️ [1.000] |
| 2 | how to import secret gpg key (copied from one machine to another)? | It is advised that it is necessary to add `--import` to the command line to import the private key and that according to the man... | The provided context does not contain any information about importing a secret GPG key from one machine to another. The context discusses various technical topics... | To import a secret GPG key that has been copied from one machine to another, you can follow these steps: 1. Copy the secret key... | ✔️ [0.889] |


84.28

As the evaluation above shows, the DSPy optimizer improved the SemanticF1 score from 59% to 84%.
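
To make the size of that gain concrete, here is a small sketch, not part of the original notebook, that computes the absolute and relative improvement. It assumes the baseline was exactly 59.0 (the text gives only the rounded figure) and that the first full eval score in the optimizer log (61.92) belongs to the unoptimized program. The function name `improvement` is hypothetical.

```python
def improvement(before: float, after: float) -> tuple[float, float]:
    """Return (absolute gain in points, relative gain in percent)."""
    absolute = after - before
    relative = 100.0 * absolute / before
    return round(absolute, 2), round(relative, 1)


# Evaluation on the 150-example set: rounded baseline from the text vs. the 84.28 reported above.
dev_gain = improvement(59.0, 84.28)

# Optimizer's 100-example full evals: first score (assumed baseline) vs. best score.
val_gain = improvement(61.92, 85.86)

print("Evaluation set gain (points, relative %):", dev_gain)
print("Optimizer full-eval gain (points, relative %):", val_gain)
```

Reporting both the point gain and the relative gain avoids ambiguity when a sentence says that a score "improved by N%".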