# How to call a model


```shell
curl -N \
  -X POST "http://localhost:11434/api/chat" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "llama3.2",
    "messages": [
      {"role": "system", "content": "You are a helpful assistant."},
      {"role": "user", "content": "Hello! Can you tell me a joke?"}
    ],
    "stream": true
  }'
```

## Basic API calls using HTTP


The example below calls an Ollama model with Python's `requests` library. Ollama runs locally on your machine and exposes a simple HTTP API that requires no API key.


In [1]:
# Adjust the host to your setup, e.g. "localhost" as in the curl example above
OLLAMA_API = "http://ollama:11434/api/chat"
MODEL = "llama3.2"

headers = {"Content-Type": "application/json"}

In [2]:
import requests
import json

data = {
    "model": MODEL,
    "messages": [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Hello! Can you tell me a joke?"},
    ],
    "stream": False,
}

try:
    response = requests.post(url=OLLAMA_API, headers=headers, data=json.dumps(data))
    response.raise_for_status()

    response_data = response.json()

    content = response_data["message"]["content"]
    print(f"Model's response:\n\n{content}", end="\n\n")

    print("-" * 32)
    print("\nFull response:")
    print(json.dumps(response_data, indent=2))

except requests.exceptions.ConnectionError as e:
    print("\nMake sure Ollama is running\n\n")
    print(f"An error occurred: {e}")

Model's response:

Here's one:

What do you call a fake noodle?

(Wait for it...)

An impasta!

I hope that made you smile! Do you want to hear another one?

--------------------------------

Full response:
{
  "model": "llama3.2",
  "created_at": "2025-12-18T22:48:18.044558383Z",
  "message": {
    "role": "assistant",
    "content": "Here's one:\n\nWhat do you call a fake noodle?\n\n(Wait for it...)\n\nAn impasta!\n\nI hope that made you smile! Do you want to hear another one?"
  },
  "done": true,
  "done_reason": "stop",
  "total_duration": 4251103918,
  "load_duration": 90055916,
  "prompt_eval_count": 40,
  "prompt_eval_duration": 438240084,
  "eval_count": 38,
  "eval_duration": 3711330627
}
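
The timing fields in the full response can be turned into a rough generation speed. The sketch below assumes the `*_duration` values are reported in nanoseconds, as the Ollama API documentation describes; `tokens_per_second` is a hypothetical helper name, and the sample values are copied from the response above.

```python
def tokens_per_second(response_data: dict) -> float:
    """Estimate generation speed from a non-streaming Ollama chat response.

    Assumes eval_duration is given in nanoseconds.
    """
    eval_count = response_data["eval_count"]  # number of generated tokens
    eval_duration_ns = response_data["eval_duration"]  # time spent generating them
    return eval_count / eval_duration_ns * 1e9


# Values copied from the full response shown above
sample = {
    "total_duration": 4251103918,
    "eval_count": 38,
    "eval_duration": 3711330627,
}

print(f"{tokens_per_second(sample):.1f} tokens/s")
print(f"{sample['total_duration'] / 1e9:.2f} s total")
```

In the notebook, `tokens_per_second(response_data)` could be called directly on the parsed response from the previous cell.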


In [3]:
import requests
import json
import sys

data = {
    "model": MODEL,
    "messages": [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Hello! Can you tell me a joke?"},
    ],
    "stream": True,
}

try:
    with requests.post(
        url=OLLAMA_API, headers=headers, data=json.dumps(data), stream=True
    ) as response:
        response.raise_for_status()

        print("Model's response:")

        for line in response.iter_lines():
            # iter_lines() can yield empty keep-alive lines; skip them
            if not line:
                continue

            chunk = json.loads(line)

            content = chunk["message"]["content"]
            print(content, end="")
            sys.stdout.flush()

except requests.exceptions.ConnectionError as e:
    # Chain the original error so the underlying cause stays in the traceback
    raise ConnectionError("Make sure Ollama is running") from e

Model's response:
Here's one for you:

What do you call a fake noodle?

(Wait for it...)

An impasta!

Hope that made you smile! Do you want to hear another one?
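
With `"stream": True`, the response body arrives as one JSON object per line, and each object carries a fragment of the message. When the full text is needed after streaming, the fragments can be collected rather than only printed. This is a minimal sketch: `collect_stream` is a hypothetical helper, and `sample_lines` is made-up data shaped like the chunks the loop above parses, so the example runs without a server.

```python
import json
from typing import Iterable


def collect_stream(lines: Iterable[bytes]) -> str:
    """Join the content fragments of a newline-delimited JSON chat stream."""
    parts = []
    for line in lines:
        if not line:  # skip empty keep-alive lines
            continue
        chunk = json.loads(line)
        parts.append(chunk.get("message", {}).get("content", ""))
        if chunk.get("done"):  # the last chunk is marked as done
            break
    return "".join(parts)


# Illustrative lines only, not captured from a real server
sample_lines = [
    b'{"message": {"role": "assistant", "content": "An "}, "done": false}',
    b"",
    b'{"message": {"role": "assistant", "content": "impasta!"}, "done": false}',
    b'{"message": {"role": "assistant", "content": ""}, "done": true}',
]

print(collect_stream(sample_lines))
```

In the streaming cell, the same function could be given `response.iter_lines()` in place of `sample_lines`.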

## Pydantic AI

In [4]:
from pydantic_ai import Agent

agent = Agent(
    model=f"ollama:{MODEL}",
    system_prompt="You are a helpful assistant.",
)

result = await agent.run("Hello! Can you tell me a joke?")

print(result)


AgentRunResult(output="Here's one:\n\nWhat do you call a fake noodle?\n\n(Wait for it...)\n\nAn impasta!\n\nHope that made you laugh! Do you want to hear another one?")
