Demo code for the Open Responses API using Hugging Face Inference Providers and local models via Ollama.
Video Tutorial: Open Responses API Overview
Install uv if you don't already have it:

```bash
# macOS/Linux
curl -LsSf https://astral.sh/uv/install.sh | sh

# Or with Homebrew
brew install uv
```

Set up the project:

```bash
cd demo
uv sync
source .venv/bin/activate  # On Windows: .venv\Scripts\activate
```

Create your environment file and edit it with your settings:

```bash
cp .env.example .env
```
```bash
# Required for HF demo
HF_TOKEN=hf_your_token_here

# Optional: Ollama settings (defaults shown)
OLLAMA_HOST=http://localhost:11434
OLLAMA_MODEL=llama3.2
```

Get your HF token at https://huggingface.co/settings/tokens (enable the "Make calls to Inference Providers" permission).
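For reference, here is a minimal sketch of how the demos might read these settings; it assumes they use the python-dotenv package, so check the scripts for the actual loading code:

```python
import os

from dotenv import load_dotenv  # assumes python-dotenv is installed

load_dotenv()  # reads key=value pairs from .env into the environment

HF_TOKEN = os.getenv("HF_TOKEN")  # required for the HF demo
OLLAMA_HOST = os.getenv("OLLAMA_HOST", "http://localhost:11434")
OLLAMA_MODEL = os.getenv("OLLAMA_MODEL", "llama3.2")

if HF_TOKEN is None:
    raise SystemExit("HF_TOKEN is not set; add it to .env")
```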
Hugging Face hosted models:

```bash
python open_responses_demo.py
```

Local models with Ollama:

```bash
python ollama_demo.py
```

Or run with uv directly:

```bash
uv run python open_responses_demo.py
uv run python ollama_demo.py
```

`open_responses_demo.py` uses the Open Responses API with Hugging Face's hosted models:
| Demo | Description |
|---|---|
| Basic Call | Simple request/response |
| Streaming | Event-based streaming with semantic events |
| Tool Calling | Let the model call functions |
| Reasoning | View raw reasoning traces |
| Multi-turn | Continue conversations |
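As a rough sketch of what the basic and streaming demos do, assuming (per the HF Responses API docs linked at the end) that the router exposes the Responses endpoint at https://router.huggingface.co/v1 and that the openai SDK is used as the client:

```python
import os

from openai import OpenAI

client = OpenAI(
    base_url="https://router.huggingface.co/v1",  # HF Inference Providers router
    api_key=os.getenv("HF_TOKEN"),
)

# Basic call: simple request/response
response = client.responses.create(
    model="moonshotai/Kimi-K2-Instruct-0905",
    input="Say hello in one sentence.",
)
print(response.output_text)

# Streaming: iterate over semantic events instead of raw chunks
stream = client.responses.create(
    model="moonshotai/Kimi-K2-Instruct-0905",
    input="Count to five.",
    stream=True,
)
for event in stream:
    if event.type == "response.output_text.delta":
        print(event.delta, end="", flush=True)
```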
`ollama_demo.py` tests the Open Responses API with locally running models via Ollama:
| Demo | Description |
|---|---|
| Basic Call | Simple request/response |
| Streaming | Event-based streaming with semantic events |
| Tool Calling | Let the model call functions |
| Reasoning | View raw reasoning traces |
| Multi-turn | Continue conversations |
Note: If Ollama doesn't yet support the Open Responses API, the script shows the API format for when support is added. Ollama is part of the Open Responses initiative.
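A hedged sketch of the request format the script targets; it assumes Ollama serves an OpenAI-compatible endpoint at `OLLAMA_HOST/v1`, and whether `responses.create` actually succeeds depends on Ollama's Open Responses support (see the note above):

```python
import os

from openai import OpenAI

client = OpenAI(
    base_url=os.getenv("OLLAMA_HOST", "http://localhost:11434") + "/v1",
    api_key="ollama",  # Ollama ignores the key, but the SDK requires one
)

response = client.responses.create(
    model=os.getenv("OLLAMA_MODEL", "llama3.2"),
    input="Why is the sky blue?",
)
print(response.output_text)
```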
`ollama_tools_reasoning.py` is a focused demo of tool calling and reasoning with gpt-oss:20b:
| Demo | Description |
|---|---|
| Basic Tool Call | Single tool calling example |
| Multiple Tools | Model chooses from several tools |
| Tool Loop | Complete request → tool → response cycle |
| Reasoning Basic | Step-by-step reasoning |
| Reasoning Streaming | Stream reasoning tokens live |
| Tools + Reasoning | Combine both capabilities |
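To illustrate the Tool Loop row above, here is a minimal sketch of a complete request → tool → response cycle in Responses API format. The `get_weather` tool and its stub implementation are hypothetical, and the endpoint is assumed to be Ollama's OpenAI-compatible one:

```python
import json

from openai import OpenAI

client = OpenAI(base_url="http://localhost:11434/v1", api_key="ollama")

# Hypothetical tool the model can call (Responses API function-tool format)
tools = [{
    "type": "function",
    "name": "get_weather",
    "description": "Get the current weather for a city",
    "parameters": {
        "type": "object",
        "properties": {"city": {"type": "string"}},
        "required": ["city"],
    },
}]

def get_weather(city: str) -> str:
    return f"Sunny and 22°C in {city}"  # stub; a real tool would hit an API

input_items = [{"role": "user", "content": "What's the weather in Paris?"}]
response = client.responses.create(model="gpt-oss:20b", input=input_items, tools=tools)

# Execute any function calls and feed the results back for the final answer
for item in response.output:
    if item.type == "function_call":
        result = get_weather(**json.loads(item.arguments))
        input_items += [item, {
            "type": "function_call_output",
            "call_id": item.call_id,
            "output": result,
        }]

final = client.responses.create(model="gpt-oss:20b", input=input_items, tools=tools)
print(final.output_text)
```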
Download from https://ollama.com/download or:

```bash
# macOS
brew install ollama

# Linux
curl -fsSL https://ollama.com/install.sh | sh
```

Pull a model:

```bash
ollama pull llama3.2
```

Other good options:

```bash
ollama pull mistral
ollama pull qwen2.5
ollama pull llama3.3
```

Make sure Ollama is running. If not, run:
```bash
ollama serve
```

Then run the demo:

```bash
python ollama_demo.py
```

Specify a provider using `model:provider` syntax:

```python
# Default routing
model = "moonshotai/Kimi-K2-Instruct-0905"

# Specific provider
model = "moonshotai/Kimi-K2-Instruct-0905:groq"
model = "Qwen/Qwen2.5-72B-Instruct:together"
model = "meta-llama/Llama-3.3-70B-Instruct:fireworks"
```

Browse available models: https://huggingface.co/inference/models
To use a different local model:

```bash
# Set in .env or use defaults
OLLAMA_MODEL=llama3.2
OLLAMA_MODEL=mistral
OLLAMA_MODEL=qwen2.5:14b
```

List installed models with `ollama list`.
```
demo/
├── .env.example              # Environment template
├── .env                      # Your config (gitignored)
├── open_responses_demo.py    # HF hosted demo (Open Responses API)
├── ollama_demo.py            # Ollama demo (Open Responses API)
├── ollama_tools_reasoning.py # Ollama tools & reasoning deep dive
├── pyproject.toml            # UV project config
├── requirements.txt          # Pip fallback
└── README.md
```
- Open Responses Spec: https://openresponses.org
- HF Responses API Docs: https://huggingface.co/docs/inference-providers/guides/responses-api
- HF Blog Post: https://huggingface.co/blog/open-responses
- Ollama: https://ollama.com
- GitHub: https://github.com/openresponses/openresponses