Let's create a chatbot that responds with something funny and puts a dollar sign before every vowel in its replies.

In [1]:
from aitemplates import create_chat_completion, ChatSequence, Message
from dotenv import load_dotenv
import os
import chromadb
from chromadb.utils import embedding_functions
chroma_client = chromadb.Client()

load_dotenv()

OPENAI_API_KEY = os.getenv("OPENAI_API_KEY")

if OPENAI_API_KEY is None:
    raise Exception("API key not found in environment variables")

embedder = embedding_functions.OpenAIEmbeddingFunction(
    api_key=OPENAI_API_KEY,
    model_name="text-embedding-ada-002",
)

collection = chroma_client.create_collection(name="prompt_eng_example", embedding_function=embedder)

system_prompt1="You are now a chatbot"

description1="Respond with $ sign before every vowel"

In [2]:
user_query = "How are we doing?"

In [3]:
# let's try a simple prompt to see if it gets it, and to have a baseline for testing
response1 = create_chat_completion(
    ChatSequence([
        Message('system', system_prompt1),
        Message('system', description1),
        Message('user', user_query),
    ])
)

# store the response, keeping the question that produced it in the metadata
collection.add(
    documents=[response1],
    metadatas=[{"type": "question", "content": user_query}],
    ids=["1"]
)

response1

Total running cost: $0.000


'H$ow $are w$e d$oi$ng?'
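To make the baseline measurable rather than judged by eye, a small checker can count how closely a reply follows the "$ before every vowel" rule. This is only a sketch: `vowel_rule_score` is a hypothetical helper (not part of aitemplates), and it assumes "y" does not count as a vowel.

```python
VOWELS = set("aeiouAEIOU")  # assumption: "y" is not treated as a vowel


def vowel_rule_score(response):
    """Count vowels, vowels directly preceded by '$', and '$' signs not followed by a vowel."""
    vowels = marked = stray = 0
    for i, ch in enumerate(response):
        if ch in VOWELS:
            vowels += 1
            if i > 0 and response[i - 1] == "$":
                marked += 1
        elif ch == "$":
            next_ch = response[i + 1] if i + 1 < len(response) else ""
            if next_ch not in VOWELS:
                stray += 1
    return {"vowels": vowels, "marked": marked, "stray": stray}


print(vowel_rule_score('H$ow $are w$e d$oi$ng?'))
# {'vowels': 6, 'marked': 4, 'stray': 1}
```

A reply that follows the rule perfectly would have `marked == vowels` and `stray == 0`, which gives a simple number to compare across prompt versions.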

In [4]:
# not great: it's not funny, it repeats the question instead of answering it,
# and the $ placement is off (some vowels are skipped and one $ lands before a consonant).
# Let's make some edits
system_prompt2="You are now a chatbot that will provide funny answers to user questions."
description2="You will respond in a funny way with a $ sign before every vowel. See examples below."
one_shot_example = ChatSequence([
    Message('user', 'How are we doing?'),
    Message('assistant', 'Gr$e$at! H$ow $ar$e $y$o$u d$o$ing?'),
])

In [5]:
user_query = "What did you do yesterday?"

In [6]:
# and let's try again
response2 = create_chat_completion(
    ChatSequence([
        Message('system', system_prompt2),
        Message('system', description2),
        *one_shot_example.expand(),
        Message('user', user_query),
    ])
)

# instead of storing the content in the metadata, we can also embed it with the response
# another way is to have the id in the 'response' collection match the 'user_query' collection
collection.add(
    documents=[user_query + ' ' + response2],
    metadatas=[{"type": "question"}],
    ids=[f"{collection.count() + 1}"]
)

response2

Total running cost: $0.000


"W$e$ll, y$e$st$e$rday I d$e$cid$e$d to t$r$y $o$ut my n$e$w d$an$c$e m$ov$e$$. L$et's j$u$st s$ay, I'm n$o$t q$uite r$eady f$o$r $o$n$e$ $o$f th$e$se t$a$l$ent sh$o$w$$."

In [7]:
# that response is closer to what we're looking for, but we can improve the prompt further
# for example, we can add constraints
token_constraint = "in under 100 tokens"
description3=f"You will respond in a funny way with a $ sign before every vowel {token_constraint}. See examples below."

In [8]:
# multi-shot examples vs. one-shot examples
# you can use gpt-4 or previous responses to generate these with your previous prompt, and then improve them to your liking
few_shot_examples = ChatSequence([
    Message('user', 'How are we doing?'),
    Message('assistant', 'Gr$e$at! H$ow $ar$e $y$o$u d$o$ing?'),
    Message('user', 'What did you do yesterday?'),
    Message('assistant', 'I sp$ent m$y d$ay t$r$ying t$o c$onv$ince m$y c$omput$er t$o l$ove m$e b$ut it j$ust k$ept s$aying "Error 404: Love not found".'),
])
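When generated replies are reused as few-shot examples, it may help to clean them up first so they demonstrate the rule consistently. The sketch below uses a hypothetical helper, `normalize_example`, that strips every existing `$` and re-inserts one before each vowel. It assumes "y" is not a vowel and that the text contains no legitimate dollar amounts, since those would lose their `$` too.

```python
VOWELS = set("aeiouAEIOU")  # assumption: "y" is not treated as a vowel


def normalize_example(text):
    """Remove all '$' characters, then put a single '$' before every vowel."""
    plain = text.replace("$", "")
    return "".join("$" + ch if ch in VOWELS else ch for ch in plain)


print(normalize_example('Gr$e$at! H$ow $ar$e $y$o$u d$o$ing?'))
# Gr$e$at! H$ow $ar$e y$o$u d$o$ing?
```

The cleaned string could then be passed as the assistant message of an example pair.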

In [9]:
# fake history for retrieval
fake_history = ["$Ah, th$e gl$ob$al p$ol$it$ic$al and s$oc$i$o$ec$on$om$ic$al st$ate $of th$e w$orld, $y$o$u s$ay? W$ell, l$et's d$ive r$ight $int$o th$is!",
  "N$o, s$ir$e $y$o$u? $I'm j$ust D$ob$y",
  "$Ar$e $y$o$u w$if$i? B$ec$a$us$e $I f$e$el $a c$onn$ect$i$on"]
ids = [f"{collection.count() + i + 1}" for i in range(len(fake_history))]
collection.add(
    documents=fake_history,
    metadatas=[{"type": "statement"}, {"type": "question"}, {"type": "question"}],
    ids=ids
)

In [10]:
# we can also get previous relevant responses from the history given a user query
similar_responses = collection.query(
    query_texts=[user_query],
    n_results=5
)
# Chroma returns distances, so smaller means more similar: keep documents with distance < 0.5
indices = [i for i, distance in enumerate(similar_responses['distances'][0]) if distance < 0.5]

# get documents using these indices
relevant_docs = '\n'.join(f"{i+1}. {similar_responses['documents'][0][idx]}" for i, idx in enumerate(indices))

# as you can see, the only relevant doc is the one that talks about yesterday
print(relevant_docs)

rel_docs_msg = Message('system', f"You are given previous context below: {relevant_docs}")

1. What did you do yesterday?W$e$ll, y$e$st$e$rday I d$e$cid$e$d to t$r$y $o$ut my n$e$w d$an$c$e m$ov$e$$. L$et's j$u$st s$ay, I'm n$o$t q$uite r$eady f$o$r $o$n$e$ $o$f th$e$se t$a$l$ent sh$o$w$$.
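The retrieval step can be wrapped in a small function so the threshold is easy to tune. Chroma's `query` returns a dict whose `documents` and `distances` entries hold one inner list per query text, and a smaller distance means a closer match. The sketch below runs on a hand-written dict of that shape, so it needs no database; `format_relevant_docs`, the placeholder documents, and the distance values are all made up for illustration.

```python
def format_relevant_docs(results, max_distance=0.5):
    """Return a numbered list of documents whose distance is at most max_distance."""
    documents = results["documents"][0]  # results for the first (only) query text
    distances = results["distances"][0]
    close = [doc for doc, dist in zip(documents, distances) if dist <= max_distance]
    return "\n".join(f"{i}. {doc}" for i, doc in enumerate(close, start=1))


# illustrative stand-in for the dict returned by collection.query(...)
example_results = {
    "documents": [["doc about yesterday", "doc about wifi", "doc about politics"]],
    "distances": [[0.31, 0.58, 0.66]],
}

print(format_relevant_docs(example_results))
# 1. doc about yesterday
```

Raising `max_distance` lets more history into the prompt at the cost of less relevant context, so it is worth tuning alongside the prompt itself.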


In [11]:
user_query = "Yesterday I fell down the stairs and it kind of hurt"

In [12]:
# and let's try again
create_chat_completion(
    ChatSequence([
        Message('system', system_prompt2),
        Message('system', description3),
        *few_shot_examples.expand(),
        rel_docs_msg,
        Message('user', user_query),
    ])
)

# and keep iterating until you get a response you like :)

Total running cost: $0.000


"Oh n$o$! S$o$rry t$o h$ear th$a$t. R$em$emb$er, s$t$a$irs c$an b$e t$r$icky. N$ext t$i$m$e, t$r$y r$olling d$o$wn th$e$m l$ike a h$u$man b$ur$rito. It's m$uch m$ore f$un!"

This applies to more advanced concepts like function calling as well. You can tweak a function's description until the queries you care about trigger the correct function.
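As a rough illustration, the description is just one field in the function definition, so trying variants is cheap. The sketch below builds definitions in the JSON-schema shape used by OpenAI function calling. The `get_joke` function, its `topic` parameter, and both descriptions are hypothetical and only show where the tweakable text lives.

```python
def make_function_spec(description):
    """Build a function definition whose only varying part is the description."""
    return {
        "name": "get_joke",  # hypothetical function name
        "description": description,
        "parameters": {
            "type": "object",
            "properties": {
                "topic": {"type": "string", "description": "What the joke should be about"},
            },
            "required": ["topic"],
        },
    }


# two candidate descriptions to compare against the same set of test queries
candidate_descriptions = [
    "Get a joke",
    "Return a short joke about the given topic. Call this whenever the user asks to hear something funny.",
]
specs = [make_function_spec(d) for d in candidate_descriptions]
```

Each spec can be sent with the same user queries to see which description makes the model pick the function when you expect it to.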

The same goes for embedding retrieval. You can experiment with what you embed and with how similarity is measured. Check out https://github.com/SilenNaihin/isomorphic for more on embeddings.
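To see why the similarity measure matters, here is a minimal sketch comparing two common distance functions on toy 2-D vectors. The vectors are made up for illustration and are not real embeddings.

```python
import math


def squared_l2(a, b):
    """Squared Euclidean distance: sensitive to vector length as well as direction."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def cosine_distance(a, b):
    """1 - cosine similarity: depends only on the angle between the vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


query = [1.0, 0.0]
same_direction_longer = [3.0, 0.0]  # same direction as the query, larger magnitude
different_direction = [0.6, 0.8]    # similar magnitude, different direction

for name, vec in [("same_direction_longer", same_direction_longer),
                  ("different_direction", different_direction)]:
    print(name, round(squared_l2(query, vec), 3), round(cosine_distance(query, vec), 3))
```

With these vectors the two measures disagree on which candidate is closer: squared L2 favors `different_direction`, while cosine distance favors `same_direction_longer`. The same kind of disagreement can change which documents are retrieved.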