This video was used as inspiration for the notebook: https://www.youtube.com/watch?v=_4K20tOsXK8

To generate responses from an Ollama model we can use either the 'chat' or the 'generate' endpoint. Both return a model response, but which one to use depends on our use case.
If we're just asking the model a question as a one-off request and not holding a conversation, we use the 'generate' endpoint.
If we need to go back and forth with the model and manage memory and context, we use the 'chat' endpoint.
Either endpoint can handle both cases, so the choice comes down to convenience and efficiency.
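
To make the difference concrete, here is a minimal sketch of the request bodies the two endpoints expect, assuming the field names of the Ollama REST API (`model`, `prompt`, `messages`, `stream`). The helper names `generate_payload` and `chat_payload` are made up for illustration and are not part of the `ollama` package.

```python
import json


def generate_payload(model, prompt, stream=False):
    """Body for POST /api/generate: a single prompt string, no history."""
    return {"model": model, "prompt": prompt, "stream": stream}


def chat_payload(model, messages, stream=False):
    """Body for POST /api/chat: the whole conversation as role/content messages."""
    return {"model": model, "messages": list(messages), "stream": stream}


# A one-off question only needs a prompt.
one_off = generate_payload("llama3.1", "why is the sky blue")

# A conversation carries every earlier turn along with the new question.
conversation = chat_payload(
    "llama3.1",
    [
        {"role": "user", "content": "why is the sky blue"},
        {"role": "assistant", "content": "(previous answer goes here)"},
        {"role": "user", "content": "can it be another?"},
    ],
)

print(json.dumps(one_off, indent=2))
print(json.dumps(conversation, indent=2))
```

The 'generate' body stays small because there is nothing to remember, while the 'chat' body grows with every turn of the conversation.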

The following are examples of how to use the Ollama API in Python.

### Example 1: Simplest example of using Ollama in Python

In [8]:
import ollama

output = ollama.generate(model='llama3.1')  # No prompt: this only loads the model and resets its keep_alive to the default, which is 5 minutes.

print(output)

model='llama3.1' created_at='2026-02-06T09:12:34.805888Z' done=True done_reason='load' total_duration=None load_duration=None prompt_eval_count=None prompt_eval_duration=None eval_count=None eval_duration=None response='' thinking=None context=None logprobs=None


### Example 2: Send a prompt to the model

In [9]:
import ollama

prompt = "why is the sky blue"
output = ollama.generate(model='llama3.1', prompt=prompt) #This returns an object

print(output)

model='llama3.1' created_at='2026-02-06T09:12:41.810423Z' done=True done_reason='stop' total_duration=4826450200 load_duration=128798900 prompt_eval_count=15 prompt_eval_duration=13034400 eval_count=437 eval_duration=4429564800 response="The sky appears blue to us because of a phenomenon called **scattering**, which occurs when sunlight interacts with the tiny molecules of gases in the Earth's atmosphere.\n\nHere's what happens:\n\n1. **Sunlight enters the atmosphere**: When the sun shines, it emits a wide range of electromagnetic radiation, including visible light.\n2. **Short wavelengths are scattered**: As sunlight travels through the atmosphere, it encounters tiny molecules of gases such as nitrogen (N2) and oxygen (O2). These molecules scatter the shorter-wavelength colors, like blue and violet, more than the longer-wavelength colors, like red and orange.\n3. **Blue light is dispersed**: Because blue light has a shorter wavelength (around 450-495 nanometers), it is scattered in al

### Example 3: Streaming the response of a model

In [6]:
import ollama

prompt = "why is the sky blue"
output = ollama.generate(model='llama3.1', prompt=prompt, stream=True) #This returns a generator

# Print each chunk as it is generated; end='' keeps the output on the same line and flush=True shows it immediately
for chunk in output:
    print(chunk['response'], end='', flush=True)

    # Add the stats of the response once the final chunk arrives
    if chunk["done"]:
        # Durations in Ollama are measured in nanoseconds, so divide by 1e9 to get seconds
        tokens_per_second = chunk["eval_count"] / (chunk["eval_duration"] / 1_000_000_000)
        print("\nstats: {:.2f} tokens per second".format(tokens_per_second))

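
The tokens-per-second arithmetic can be checked without a running model. The sketch below applies it to the `eval_count` and `eval_duration` values printed in the Example 2 output; the helper name `tokens_per_second` and the `None` handling are assumptions added for illustration.

```python
NANOSECONDS_PER_SECOND = 1_000_000_000


def tokens_per_second(eval_count, eval_duration_ns):
    """Generation speed from Ollama's eval_count and eval_duration (nanoseconds)."""
    if not eval_count or not eval_duration_ns:
        return None  # e.g. a load-only response, where both fields are None
    return eval_count / (eval_duration_ns / NANOSECONDS_PER_SECOND)


# Values copied from the Example 2 output.
rate = tokens_per_second(eval_count=437, eval_duration_ns=4429564800)
print(f"stats: {rate:.2f} tokens per second")

# The load-only response from Example 1 has no eval stats.
print(tokens_per_second(None, None))
```

Guarding against missing values avoids a `TypeError` when the response carries no generation stats, as in Example 1.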

### Example 4: Make a second generate call with context

To make a second generate call that remembers the first call, save the value of `context` from the final chunk of the first response and pass it as the `context` argument of the second call.

In [13]:
import ollama
model = 'llama3.1'

prompt = "why is the sky blue"

output = ollama.generate(model=model, prompt=prompt, stream=True)

for chunk in output:
    if chunk["done"]:
        print("First Generate Complete")
        context = chunk["context"]  # the final chunk carries the conversation context

output2 = ollama.generate(model=model, prompt="can it be another?", context=context)
# print(f"output with context\n\n{output2['response']}\n\noutput complete\n")


# For comparison, the same prompt without context:
# output3 = ollama.generate(model=model, prompt="can it be another?")
# print(f"output without context\n\n{output3['response']}\n\noutput complete")

First Generate Complete


In [14]:
print(f"output with context\n\n{output2['response']}\n\noutput complete\n")

output with context

The apparent color of the sky can vary depending on several factors. Here are some reasons why the sky might appear to be other colors:

**Why does the sky sometimes appear red?**

1. **Sunrise and sunset:** During these times, the sun's light has to travel through more of the Earth's atmosphere, scattering off more particles and molecules. This scatters shorter wavelengths like blue and violet, leaving mainly longer wavelengths (red and orange) to reach our eyes.
2. **Dust and pollution:** Tiny particles in the air can scatter light in a way that favors longer wavelengths, making the sky appear reddish or hazy.
3. **Volcanic eruptions or wildfires:** Large amounts of particulate matter in the air can cause a reddish or orange hue.

**Why does the sky sometimes appear green?**

1. **Heavy rain or mist:** When the atmosphere is filled with water droplets, they can scatter light in a way that favors green wavelengths.
2. **Severe thunderstorms:** The intense updrafts
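
The same pattern extends to any number of follow-up prompts. Below is a minimal sketch that threads `context` through a sequence of calls. The `generate` callable is passed in as a parameter, and `fake_generate` is a hypothetical stand-in that only mimics the `response` and `context` fields, so the sketch runs without an Ollama server.

```python
def ask_in_sequence(generate, model, prompts):
    """Run prompts in order, feeding each response's context into the next call."""
    context = None
    answers = []
    for prompt in prompts:
        kwargs = {"model": model, "prompt": prompt}
        if context is not None:
            kwargs["context"] = context
        output = generate(**kwargs)
        answers.append(output["response"])
        context = output["context"]  # carries the conversation so far
    return answers


def fake_generate(model, prompt, context=None):
    """Hypothetical stand-in for ollama.generate; not the real API."""
    seen = list(context or [])
    return {
        "response": f"answer to {prompt!r} with {len(seen)} prior turn(s)",
        "context": seen + [len(seen) + 1],
    }


answers = ask_in_sequence(
    fake_generate, "llama3.1", ["why is the sky blue", "can it be another?"]
)
for answer in answers:
    print(answer)
```

With a running server, passing `ollama.generate` in place of `fake_generate` should behave the same way, assuming non-streaming responses that expose `response` and `context` as in the cells above. As far as I am aware, the Ollama API reference marks `context` as deprecated, so the 'chat' endpoint is likely the safer choice for longer conversations.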

### Example 5: Using the chat endpoint

In [23]:
import ollama

model = 'llama3.1'

# Define a list of messages. Each message includes a role and the content.
msgs = [
  {"role": "system", "content": "The user will give you a concept. Explain it to a 5 year old, using descriptive imagery and interesting and fun stories."},
  {"role": "user", "content": "Quantum Physics" }
]

output = ollama.chat(model=model, messages=msgs)

print(output)  # Prints the full response object; output['message']['content'] holds just the reply text

model='llama3.1' created_at='2026-02-02T17:15:42.2520352Z' done=True done_reason='stop' total_duration=36723725500 load_duration=5330259100 prompt_eval_count=44 prompt_eval_duration=177123900 eval_count=500 eval_duration=30806843800 message=Message(role='assistant', content='Let me tell you about Quantum Physics in a way that\'s SUPER cool!\n\nImagine you have a toy box full of different colored balls. Each ball represents a tiny particle like an electron or a photon. Now, imagine these particles are playing hide-and-seek with each other.\n\nIn the normal world, we know where our toys are when we put them away. But in the Quantum World, it\'s like they have a magic trick! They can be in many places at the same time!\n\nImagine you close your eyes, reach into the toy box, and grab a ball. When you open your eyes, you expect to see which color ball is in your hand. But with Quantum Physics, it\'s like the ball says, "Hey, I\'m red... or maybe blue... or perhaps purple!" And when you look
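
To go back and forth with the model, each reply has to be appended to the message list before the next question is sent. Here is a minimal sketch of that bookkeeping. The `chat` callable is passed in as a parameter, and `fake_chat` is a hypothetical stand-in that only mimics the `message` field, so the sketch runs without an Ollama server.

```python
def chat_turn(chat, model, history, user_text):
    """Send one user message and record both sides of the exchange in history."""
    history.append({"role": "user", "content": user_text})
    output = chat(model=model, messages=history)
    reply = output["message"]["content"]
    history.append({"role": "assistant", "content": reply})
    return reply


def fake_chat(model, messages):
    """Hypothetical stand-in for ollama.chat; not the real API."""
    text = f"reply #{len(messages) // 2} (saw {len(messages)} messages)"
    return {"message": {"role": "assistant", "content": text}}


history = [{"role": "system", "content": "Explain concepts to a 5 year old."}]
print(chat_turn(fake_chat, "llama3.1", history, "Quantum Physics"))
print(chat_turn(fake_chat, "llama3.1", history, "Can you give another example?"))
print([m["role"] for m in history])
```

Because the whole `history` list is sent on every call, the model sees the earlier turns each time. With a running server, passing `ollama.chat` in place of `fake_chat` should work the same way, assuming the response exposes `message` and `content` as in the cell above.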