# Dynamic few-shot prompting

This notebook covers the following steps:

1. Load synthetic data and embeddings (generated with `model_name`).
2. Define a function for dynamic few-shot prompting (i.e., dynamically select few-shot examples based on input similarity).
3. Generate a response using the `gpt-3.5-turbo` model.
4. Compare the responses with and without dynamic few-shot prompting.

In [None]:
import numpy as np

## Cosine similarity

The **cosine similarity** between two vectors A and B is calculated as:

$$
\text{cosine\_similarity}(A,B) = \frac{A \cdot B}{\lVert A \rVert \lVert B \rVert}
$$

Where:

- $A \cdot B$ is the dot product of vectors $A$ and $B$.
- $\lVert A \rVert$ and $\lVert B \rVert$ are the Euclidean norms of vectors $A$ and $B$.
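
As a quick check by hand, take $A = (1, 2, 3)$ and $B = (4, 5, 6)$, the same toy vectors used in the test cell further down. The intermediate quantities are:

$$
A \cdot B = 1 \cdot 4 + 2 \cdot 5 + 3 \cdot 6 = 32, \qquad \lVert A \rVert = \sqrt{1 + 4 + 9} = \sqrt{14}, \qquad \lVert B \rVert = \sqrt{16 + 25 + 36} = \sqrt{77}
$$

so that

$$
\text{cosine\_similarity}(A,B) = \frac{32}{\sqrt{14}\,\sqrt{77}} = \frac{32}{\sqrt{1078}} \approx 0.9746
$$

A value close to 1 means the two vectors point in nearly the same direction.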

In [None]:
def cosine_similarity(vec1: np.ndarray, vec2: np.ndarray) -> np.float64:
    """Compute the cosine similarity between two vectors."""
    dot_product = np.dot(vec1, vec2)
    norm_vec1 = np.linalg.norm(vec1)
    norm_vec2 = np.linalg.norm(vec2)
    return dot_product / (norm_vec1 * norm_vec2)

In [None]:
v1 = np.array([1, 2, 3])
v2 = np.array([4, 5, 6])
cos_sim = cosine_similarity(v1, v2)
print(f"Cosine Similarity: {cos_sim}")
print(f"type: {type(cos_sim)}")

In [None]:
input_text_1 = "These are not the droids you are looking for"
input_text_2 = "This is an example to test the function"
input_text_3 = "This sentence is used as example to test the function"
input_embedding_1 = get_embedding(text=input_text_1, model=emb_model_name)
input_embedding_2 = get_embedding(text=input_text_2, model=emb_model_name)
input_embedding_3 = get_embedding(text=input_text_3, model=emb_model_name)

print(f"1 vs. 2 = {cosine_similarity(input_embedding_1, input_embedding_2)}")
print(f"1 vs. 3 = {cosine_similarity(input_embedding_1, input_embedding_3)}")
print(f"2 vs. 3 = {cosine_similarity(input_embedding_2, input_embedding_3)}")

## Select closest examples
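
The selection step in this section scores each stored embedding against the input one at a time. As a rough illustration of the same idea in vectorized form, the sketch below stacks the embeddings into a matrix and ranks all rows at once. It is not the notebook's implementation: `top_k_by_cosine`, `toy_embeddings`, and `toy_examples` are hypothetical names, and the 2-D vectors are made-up stand-ins for real embeddings.

```python
import numpy as np


def top_k_by_cosine(query: np.ndarray, matrix: np.ndarray, k: int = 3) -> np.ndarray:
    """Return the indices of the k rows of `matrix` most similar to `query`, best first."""
    # Scale every row and the query to unit length, so a dot product equals cosine similarity.
    matrix_unit = matrix / np.linalg.norm(matrix, axis=1, keepdims=True)
    query_unit = query / np.linalg.norm(query)
    similarities = matrix_unit @ query_unit
    # argsort is ascending: keep the last k indices and reverse them.
    return np.argsort(similarities)[-k:][::-1]


# Hypothetical toy data (assumption): four 2-D "embeddings" and their labels.
toy_embeddings = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [-1.0, 0.0]])
toy_examples = ["a", "b", "c", "d"]
query = np.array([1.0, 0.1])

top_indices = top_k_by_cosine(query, toy_embeddings, k=2)
selected = [toy_examples[i] for i in top_indices]
print(selected)
```

For a few hundred examples the per-example loop is perfectly adequate. The matrix form mainly matters once the example pool grows, and it assumes no embedding is the zero vector.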

In [None]:
def select_examples(
    input_text: str,
    examples: list,
    example_embeddings: list,
    emb_model_name: str,
    num_examples: int = 3,
) -> list:
    """Select the few-shot examples most similar to the input, by cosine similarity."""
    input_embedding = get_embedding(text=input_text, model=emb_model_name)

    # TODO: replace this linear scan with a lookup in an embedding database.
    similarities = [cosine_similarity(input_embedding, np.array(embedding)) for embedding in example_embeddings]
    # argsort is ascending, so take the last `num_examples` indices and reverse them (most similar first).
    selected_indices = np.argsort(similarities)[-num_examples:][::-1]
    return [examples[i] for i in selected_indices]

In [None]:
input_text = "You're so dumb, no wonder we can't have a conversation!"
input_embedding = get_embedding(text=input_text, model=emb_model_name)

In [None]:
select_examples(
    input_text=input_text,
    examples=syn_data_gpt_subset,
    example_embeddings=embeddings_gpt,
    emb_model_name=emb_model_name,
    num_examples=3)