A dead-simple Rust wrapper around Ollama. Auto-installs Ollama, auto-pulls models, and lets you prompt a local LLM in two lines of code.
let mut model = OddOnkey::new("mistral").await?;
let answer = model.prompt("What is the capital of France?").await?;No config files. No API keys. Just add the crate and go.
- Features
- Quick Start
- Optional Features
- Usage
- Architecture
- Examples
- Minimum Supported Rust Version
- Contributing
- License
| Zero setup | Automatically installs Ollama and pulls the requested model if needed. |
| Conversation history | Multi-turn chat with full context, out of the box. |
| Streaming | Token-by-token output via a standard Stream implementation. |
| Embeddings | Single or batch embedding vectors in one call. |
| Generation options | Temperature, top-p, top-k, context size, repeat penalty, seed, and more. |
| Progress bar (opt-in) | Visual feedback during model download and server start. |
| Per-prompt report (opt-in) | Duration, estimated tokens, throughput, request/response sizes. |
| Docker mode (opt-in) | Run Ollama in a container — zero local install outside Docker. |
| Hexagonal architecture | Swap the Ollama backend for any LLM by implementing one trait. |
Add to your Cargo.toml:
[dependencies]
oddonkey = "0.2"
tokio = { version = "1", features = ["full"] }Then:
use oddonkey::OddOnkey;
#[tokio::main]
async fn main() {
let mut model = OddOnkey::new("mistral").await.unwrap();
model.add_preprompt("You are a helpful assistant.");
let answer = model.prompt("What is 2+2?").await.unwrap();
println!("{answer}");
}That's it — Ollama is installed and the model is pulled automatically on first run.
Enable in Cargo.toml:
oddonkey = { version = "0.2", features = ["progress", "report"] }| Feature | Description |
|---|---|
progress |
Shows an indicatif progress bar during model pull and a spinner while the server starts. |
report |
Enables the PromptReport struct (also togglable at runtime via .enable_report(true)). |
docker |
Run Ollama inside a Docker container — zero local install outside Docker. Requires Docker on the host. |
For fine-grained control, use the builder:
let mut model = OddOnkey::builder("mistral")
.base_url("http://localhost:11434") // custom Ollama URL
.progress(true) // show progress bar
.report(true) // collect per-prompt stats
.build()
.await?;With the docker feature enabled, Ollama runs entirely inside a Docker container — nothing is installed on the host except Docker itself.
oddonkey = { version = "0.2", features = ["docker"] }let mut model = OddOnkey::builder("mistral")
.docker(true) // run Ollama in Docker
.docker_gpu(true) // optional: GPU passthrough (NVIDIA Container Toolkit)
.docker_port(11434) // optional: custom host port
.docker_cleanup(true) // optional: remove container + data on drop
.progress(true)
.build()
.await?;
// Same API as always
let answer = model.prompt("Hello!").await?;
// When `model` is dropped, the container and its volume are destroyed automatically.The container (oddonkey-ollama) persists pulled models across restarts by default. Enable docker_cleanup(true) for zero-trace disposable runs.
You can also manage the container directly:
use oddonkey::DockerManager;
let mgr = DockerManager::new().gpu(true);
mgr.stop()?; // stop the container (models persist)
mgr.destroy()?; // stop + remove container and volumemodel.add_preprompt("You are a friendly pirate.");
// or replace all pre-prompts:
model.set_preprompt("You are a concise assistant.");use oddonkey::GenerationOptions;
model.set_options(
GenerationOptions::default()
.temperature(0.3)
.num_ctx(8192)
.top_p(0.9)
.top_k(40)
.repeat_penalty(1.1)
.seed(42)
);use tokio_stream::StreamExt;
let mut stream = model.prompt_stream("Tell me a joke").await?;
let mut full = String::new();
while let Some(tok) = stream.next().await {
let tok = tok?;
print!("{tok}");
full.push_str(&tok);
}
// Save the exchange in history for follow-up context
model.push_assistant_message("Tell me a joke", &full);// Single text
let vec = model.embed("Rust is awesome").await?;
// Batch
let vecs = model.embed_batch(&["hello", "world"]).await?;Enable with .report(true) on the builder or .enable_report(true) at runtime:
let answer = model.prompt("Explain borrow checking.").await?;
if let Some(report) = model.last_report() {
println!("{report}");
}Output:
── report ──────────────────────────────────
model : mistral
duration : 1423 ms
prompt tokens : ~12 (est.)
completion tkns : ~87 (est.)
tokens/sec : 61.1
request size : 245 bytes
response size : 534 bytes
────────────────────────────────────────────
OddOnkey uses a hexagonal (ports & adapters) architecture. The core logic knows nothing about HTTP or Docker — it depends only on the LlmProvider trait.
src/
├── lib.rs # public re-exports
├── core/
│ ├── oddonkey.rs # OddOnkey struct (backend-agnostic)
│ └── builder.rs # OddOnkeyBuilder
├── domain/ # pure value objects (no I/O)
│ ├── error.rs # OddOnkeyError
│ ├── message.rs # ChatMessage
│ ├── options.rs # GenerationOptions
│ └── report.rs # PromptReport
├── ports/
│ └── llm_provider.rs # LlmProvider trait
└── adapters/
├── ollama/ # Ollama HTTP adapter
│ ├── client.rs # LlmProvider implementation
│ ├── installer.rs # auto-install & server start
│ ├── pull.rs # model pull with optional progress
│ ├── stream.rs # TokenStream
│ └── types.rs # Ollama JSON DTOs
└── docker/ # Docker adapter (feature-gated)
└── manager.rs # container lifecycle management
To add a new backend (e.g. llama.cpp, vLLM, a remote API), implement the LlmProvider trait — no changes to core/ required.
Run the bundled examples:
# Interactive pirate chat
cargo run --example chat
# Streaming token-by-token
cargo run --example stream
# Embeddings + cosine similarity
cargo run --example embeddings
# Per-prompt timing report
cargo run --example report --features report
# Use a specific model
cargo run --example chat -- llama3OddOnkey targets Rust 1.75+ (edition 2021).
Contributions are welcome! Please open an issue or submit a pull request on GitHub.
- Fork the repo
- Create a feature branch (
git checkout -b feat/my-feature) - Commit your changes (
git commit -m "feat: add my feature") - Push to the branch (
git push origin feat/my-feature) - Open a Pull Request