# Context is everything!

![context-window](context-window.png)

- Both the prompt and the response have to fit into the model's context window:

  - GPT-3 / GPT-3.5: ~4k tokens
  - GPT-4: ~8k or ~32k tokens
  - Claude: ~100k tokens
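
Because the prompt and the response share one window, every token spent on the prompt is a token the model cannot use for its answer. A minimal sketch of that budget arithmetic, using the approximate limits listed here. The dictionary keys, the helper name `response_budget`, and the example prompt size are illustrative assumptions, not part of any library:

```python
# Approximate context window sizes in tokens (rounded, as in the list above).
# The keys are illustrative labels, not official model identifiers.
CONTEXT_WINDOWS = {
    "gpt-3.5": 4_000,
    "gpt-4": 8_000,
    "gpt-4-32k": 32_000,
    "claude": 100_000,
}


def response_budget(model: str, prompt_tokens: int) -> int:
    """Return how many tokens remain for the response (hypothetical helper)."""
    window = CONTEXT_WINDOWS[model]
    if prompt_tokens >= window:
        raise ValueError(
            f"prompt ({prompt_tokens} tokens) does not fit into a {window}-token window"
        )
    return window - prompt_tokens


# A 3,500-token prompt leaves only about 500 tokens for the answer on a ~4k model.
print(response_budget("gpt-3.5", 3_500))  # 500
```

The same prompt on a ~100k window would leave room for a far longer answer, which is why the window size matters as much as the prompt itself.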

## Every single token in the context window matters!


In [38]:
# Setup
from dotenv import load_dotenv
from langchain.chat_models import ChatOpenAI
from langchain.schema import AIMessage, HumanMessage, SystemMessage
from langchain.callbacks.streaming_stdout import StreamingStdOutCallbackHandler

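# Load environment variables (such as the OpenAI API key) from a local .env file
load_dotenv()

# temperature=0 keeps the output as repeatable as possible;
# streaming=True with the stdout handler prints tokens as they arrive
gpt35 = ChatOpenAI(
    temperature=0, callbacks=[StreamingStdOutCallbackHandler()], streaming=True
)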

gpt35([HumanMessage(content="Hello, how are you?")])

As an AI language model, I don't have feelings, but I'm functioning well. How can I assist you today?

AIMessage(content="As an AI language model, I don't have feelings, but I'm functioning well. How can I assist you today?", additional_kwargs={}, example=False)

# How you ask is important


In [39]:
gpt35([HumanMessage(content="What is the capital of France?")])

The capital of France is Paris.

AIMessage(content='The capital of France is Paris.', additional_kwargs={}, example=False)

In [40]:
gpt35([HumanMessage(content="What is France's capital?")])

Paris

AIMessage(content='Paris', additional_kwargs={}, example=False)

# Unrelated tokens affect the response


In [42]:
gpt35(
    [
        HumanMessage(
            content="Giraffes are amazing. Write a tweet about the world running out of ice cream."
        )
    ]
)

Oh no! The world is running out of ice cream! We need to act fast and find a solution before it's too late. #SaveTheIceCream 🍦🌍

AIMessage(content="Oh no! The world is running out of ice cream! We need to act fast and find a solution before it's too late. #SaveTheIceCream 🍦🌍", additional_kwargs={}, example=False)

In [43]:
gpt35(
    [
        HumanMessage(
            content="Lions are amazing. Write a tweet about the world running out of ice cream."
        )
    ]
)

"Uh oh, the world is running out of ice cream! Looks like we'll have to settle for lion-sized scoops of disappointment. #NoMoreIceCream #LionSizedScoops"

AIMessage(content='"Uh oh, the world is running out of ice cream! Looks like we\'ll have to settle for lion-sized scoops of disappointment. #NoMoreIceCream #LionSizedScoops"', additional_kwargs={}, example=False)

In [45]:
# Even tokens in the response matter (chain of thought)

from langchain.llms import OpenAI

gpt3 = OpenAI(
    temperature=0, callbacks=[StreamingStdOutCallbackHandler()], streaming=True
)

prompt = """I visited the bookstore and purchased 15 books. 
I lent 3 books to my friend and donated 4 to the local library. 
Later, I bought 7 more books and sold 2. How many books do I have now?
Let's think step by step.
"""

gpt3(prompt)


Step 1: I purchased 15 books.

15 books

Step 2: I lent 3 books to my friend and donated 4 to the local library.

15 - 3 - 4 = 8 books

Step 3: I bought 7 more books and sold 2.

8 + 7 - 2 = 13 books

Therefore, I have 13 books now.

'\nStep 1: I purchased 15 books.\n\n15 books\n\nStep 2: I lent 3 books to my friend and donated 4 to the local library.\n\n15 - 3 - 4 = 8 books\n\nStep 3: I bought 7 more books and sold 2.\n\n8 + 7 - 2 = 13 books\n\nTherefore, I have 13 books now.'
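
The intermediate totals the model wrote down can be checked by replaying the same arithmetic. A small sketch, following the model's own reading of the puzzle (lent books are counted as gone); the variable name is illustrative:

```python
# Replay the steps from the prompt and compare with the model's running totals.
books = 15        # Step 1: purchased at the bookstore
books -= 3 + 4    # Step 2: lent 3, donated 4
assert books == 8
books += 7 - 2    # Step 3: bought 7 more, sold 2
assert books == 13
print(books)  # 13
```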

In [46]:
# Infinite chat history is an illusion

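# The conversation is in Czech (written without diacritics); English translations are in the comments.
messages = [
    HumanMessage(content="Dam si jeden cheeseburger."),  # "I'll have one cheeseburger."
]

# Repeat the same exchange 150 times so the history outgrows the context window.
for _ in range(150):
    messages.append(AIMessage(content="A dal?"))  # "Anything else?"
    messages.append(
        # "Nothing else, that's all. Just the cheeseburger."
        HumanMessage(content="Dal uz nic, to je vsechno. Jenom ten cheeseburger.")
    )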

gpt35(messages)

InvalidRequestError: This model's maximum context length is 4097 tokens. However, your messages resulted in 4514 tokens. Please reduce the length of the messages.
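
One common way around this error is to send only the most recent messages that fit into a token budget. The sketch below is an illustration under loud assumptions: `estimate_tokens` is a crude stand-in (about four characters per token plus a fixed per-message overhead) rather than a real tokenizer, messages are plain strings instead of LangChain message objects, and `trim_history` is a hypothetical helper, not a LangChain API.

```python
from typing import List


def estimate_tokens(text: str) -> int:
    """Crude estimate: ~4 characters per token plus 4 tokens of per-message overhead.

    This is an assumption for illustration; a real tokenizer gives exact counts.
    """
    return len(text) // 4 + 4


def trim_history(messages: List[str], max_tokens: int) -> List[str]:
    """Keep the newest messages whose estimated total fits into max_tokens."""
    kept: List[str] = []
    used = 0
    # Walk backwards so the most recent messages are considered first.
    for message in reversed(messages):
        cost = estimate_tokens(message)
        if used + cost > max_tokens:
            break
        kept.append(message)
        used += cost
    kept.reverse()  # restore chronological order
    return kept


# Rebuild the same conversation as plain strings.
history = ["Dam si jeden cheeseburger."]
for _ in range(150):
    history.append("A dal?")
    history.append("Dal uz nic, to je vsechno. Jenom ten cheeseburger.")

# Leave room for the response: budget 3,000 of the ~4k tokens for the history.
trimmed = trim_history(history, max_tokens=3_000)
print(len(history), len(trimmed))
```

The oldest messages are dropped silently, so the model no longer sees them. Anything it must remember, such as the original order, has to be restated or summarised in the messages that remain.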