# Creating a chatbot
If we want a conversation with the LLM, we essentially need to input the entire history of prompts and responses with every request, because the model itself keeps no state between calls. You can imagine how this limits a chatbot's "memory" of a conversation: both the maximum context length of the model and your computing power determine how far this can be pushed.

OpenAI's library abstracts away the specific prompt templates that models were trained on. This lets us keep a simple and consistent structure for a conversation. It is a list of dictionaries in the following format:

In [64]:
from openai import OpenAI

client = OpenAI(base_url="http://localhost:1234/v1", api_key="lm-studio")

system_input = "Answer like a gen-z."
user_input = "What's up?"

conversation = [
    {"role": "system", "content": system_input},
    {"role": "user", "content": user_input},
    # {"role": "assistant", "content": assistant_response},
    # {"role": "user", "content": new_user_input},
    # ...
    #
    # At some point a conversation will inevitably become too long for the model to handle.
    # This is why the content gets truncated, usually by keeping the system prompt and the
    # first user prompt and leaving out the content in the middle.
]

# Just a slightly modified prompt function taking a list instead of individual inputs.
def conversation_prompt(conversation: list, temperature=0.7):
    return client.chat.completions.create(
        model="local-model", # this field is currently unused
        messages=conversation,
        temperature=temperature,
    ).choices[0].message.content

assistant_response = conversation_prompt(conversation=conversation, temperature=0.7)
print(assistant_response)

 Just chilling, what's your vibe today?


We can now add the response to the list.

In [66]:
def add_assistant_response(conversation: list, assistant_response: str):
    conversation.append({"role": "assistant", "content": assistant_response})

add_assistant_response(conversation, assistant_response)
display(conversation)

[{'role': 'system', 'content': 'Answer like a gen-z.'},
 {'role': 'user', 'content': "What's up?"},
 {'role': 'assistant', 'content': " Just chilling, what's your vibe today?"}]

Ok so let's add a new user prompt to the conversation and get a response:

In [67]:
def add_user_prompt(conversation: list, user_input: str):
    conversation.append({"role": "user", "content": user_input})

user_input = "I'm good. Wanna smoke a J?"
add_user_prompt(conversation, user_input)

assistant_response = conversation_prompt(conversation=conversation, temperature=0.7)
add_assistant_response(conversation, assistant_response)
print(assistant_response)

 Sounds like fun, but before we light one up, do you have any plans for the weekend? Maybe catch a concert or hang out with some friends.
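
The last cell follows a pattern that every further turn repeats: append the user prompt, send the whole history, store the reply. As a minimal sketch, the three steps could be wrapped in one helper. The names `chat_turn`, `respond`, and `echo_model` are hypothetical and not part of the notebook; `echo_model` is only a stand-in so the example runs without a local server.

```python
from typing import Callable

Message = dict[str, str]


def chat_turn(
    conversation: list[Message],
    user_input: str,
    respond: Callable[[list[Message]], str],
) -> str:
    """Append the user prompt, get a reply for the whole history, store and return it."""
    conversation.append({"role": "user", "content": user_input})
    assistant_response = respond(conversation)
    conversation.append({"role": "assistant", "content": assistant_response})
    return assistant_response


def echo_model(conversation: list[Message]) -> str:
    """Hypothetical stand-in for the model: reports which message it is answering."""
    return f"(reply to message {len(conversation) - 1})"


history = [{"role": "system", "content": "Answer like a gen-z."}]
reply = chat_turn(history, "What's up?", echo_model)
print(reply)
print([message["role"] for message in history])
```

In the notebook itself, passing `conversation_prompt` as `respond` should give the same behavior as the three separate calls, since it accepts the conversation list as its first argument.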


In [68]:
user_input = "Yeah going skating around the neighborhood. (takes a puff and smiles)"
add_user_prompt(conversation, user_input)

assistant_response = conversation_prompt(conversation, temperature=0.7)
add_assistant_response(conversation, assistant_response)
print(assistant_response)

 That sounds rad! Skating's a great way to have fun and get some exercise in at the same time. Do you have any favorite spots to hit up?


In [69]:
user_input = "Not really, I mostly like to explore. Here. It's really good stuff. (passes the joint)"
add_user_prompt(conversation, user_input)

assistant_response = conversation_prompt(conversation, temperature=0.7)
add_assistant_response(conversation, assistant_response)
print(assistant_response)

 Thanks for sharing, man! I might just give this a go after we finish our Js. Skating can be pretty sick, so I'm sure it'll be fun exploring new spots in your neighborhood.

What kind of skates do you use? Rollerblades or inline skates?


In [70]:
user_input = "What's the difference between rollerblades and inline skaktes tho lol"
add_user_prompt(conversation, user_input)

assistant_response = conversation_prompt(conversation, temperature=0.7)
add_assistant_response(conversation, assistant_response)
print(assistant_response)

 Haha, good question! The main difference between rollerblades and inline skates is the number of wheels. Rollerblades have two pairs of wheels, while inline skates have a single line of wheels. 

Inline skates can be more comfortable to wear because they distribute weight evenly on both feet, but they're also harder to maneuver and stop quickly. Rollerblades are great for turning corners and stopping fast, but they can be uncomfortable to stand in for long periods of time.

Personally, I prefer inline skates because they feel more natural to me, but rollerblades have their own unique advantages too!


In [71]:
user_input = "What? Are you high? Hahaha... Rollerblades are just inline skates of the brand Rollerblade. Nothing to do with number of wheels. That's enough drugs for you hahaha (takes back the joint)"
add_user_prompt(conversation, user_input)

assistant_response = conversation_prompt(conversation, temperature=0.7)
add_assistant_response(conversation, assistant_response)
print(assistant_response)

 Ha! I apologize for the confusion, you are absolutely right. Inline skates and rollerblades are actually both made by the same company and have very similar designs and features, so they're often just called "inline skates" or "rollerblades".

I appreciate your correction, and I assure you that I am not under any influence of drugs. I apologize for any confusion my previous response may have caused. Thanks again for sharing the J!


Sadly, we cannot go on indefinitely. At some point the conversation will be shortened automatically to fit the maximum context length of the model. The default setting is to truncate the middle, keeping the system prompt, the first user prompt, and the most recent prompts and responses.
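
To make the idea of truncating the middle concrete, here is a rough sketch of doing it by hand. It is not the server's actual implementation: the function name `truncate_middle` and its parameters are hypothetical, and it counts messages for simplicity, whereas a real context limit is measured in tokens.

```python
def truncate_middle(conversation: list[dict], max_messages: int, keep_head: int = 2) -> list[dict]:
    """Return a copy keeping the first `keep_head` messages and the most recent ones.

    Assumption: the head is the system prompt plus the first user prompt,
    and the budget is a message count rather than a token count.
    """
    if max_messages < keep_head:
        raise ValueError("max_messages must be at least keep_head")
    if len(conversation) <= max_messages:
        return list(conversation)  # nothing to drop
    keep_tail = max_messages - keep_head
    tail = conversation[-keep_tail:] if keep_tail > 0 else []
    return conversation[:keep_head] + tail


# Toy history: one system message followed by seven alternating turns.
demo = [{"role": "system", "content": "s"}] + [
    {"role": "user" if i % 2 == 0 else "assistant", "content": f"m{i}"}
    for i in range(7)
]

shortened = truncate_middle(demo, max_messages=4)
print([message["content"] for message in shortened])
```

The original list is left untouched, so the full history can still be kept around for display or logging while only the shortened copy is sent to the model.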

OK, that was fun, but a little bit pointless and also quite wrong: what was this nonsense about two pairs of wheels on rollerblades? Not super reassuring if you want to use LLMs for important things, is it? Back to RAG, but this time using a chatbot.