🫏 OddOnkey

A dead-simple Rust wrapper around Ollama. Auto-installs Ollama, auto-pulls models, and lets you prompt a local LLM in two lines of code.

let mut model = OddOnkey::new("mistral").await?;
let answer = model.prompt("What is the capital of France?").await?;

No config files. No API keys. Just add the crate and go.

Features

Feature	Description
Zero setup	Automatically installs Ollama and pulls the requested model if needed.
Conversation history	Multi-turn chat with full context, out of the box.
Streaming	Token-by-token output via a standard `Stream` implementation.
Embeddings	Single or batch embedding vectors in one call.
Generation options	Temperature, top-p, top-k, context size, repeat penalty, seed, and more.
Progress bar (opt-in)	Visual feedback during model download and server start.
Per-prompt report (opt-in)	Duration, estimated tokens, throughput, request/response sizes.
Docker mode (opt-in)	Run Ollama in a container; nothing is installed on the host except Docker itself.
Hexagonal architecture	Swap the Ollama backend for any LLM by implementing one trait.

Quick Start

Add to your Cargo.toml:

[dependencies]
oddonkey = "0.2"
tokio = { version = "1", features = ["full"] }

Then:

use oddonkey::OddOnkey;

#[tokio::main]
async fn main() {
    let mut model = OddOnkey::new("mistral").await.unwrap();
    model.add_preprompt("You are a helpful assistant.");
    let answer = model.prompt("What is 2+2?").await.unwrap();
    println!("{answer}");
}

That's it — Ollama is installed and the model is pulled automatically on first run.

Optional Features

Enable in Cargo.toml:

oddonkey = { version = "0.2", features = ["progress", "report"] }

Feature	Description
`progress`	Shows an `indicatif` progress bar during model pull and a spinner while the server starts.
`report`	Enables the `PromptReport` struct (also togglable at runtime via `.enable_report(true)`).
`docker`	Runs Ollama inside a Docker container instead of installing it locally. Requires Docker on the host.

Usage

Builder Pattern

For fine-grained control, use the builder:

let mut model = OddOnkey::builder("mistral")
    .base_url("http://localhost:11434") // custom Ollama URL
    .progress(true)                     // show progress bar
    .report(true)                       // collect per-prompt stats
    .build()
    .await?;

Docker Mode (zero local install)

With the `docker` feature enabled, Ollama runs entirely inside a Docker container; nothing is installed on the host except Docker itself.

oddonkey = { version = "0.2", features = ["docker"] }

let mut model = OddOnkey::builder("mistral")
    .docker(true)           // run Ollama in Docker
    .docker_gpu(true)       // optional: GPU passthrough (NVIDIA Container Toolkit)
    .docker_port(11434)     // optional: custom host port
    .docker_cleanup(true)   // optional: remove container + data on drop
    .progress(true)
    .build()
    .await?;

// Same API as always
let answer = model.prompt("Hello!").await?;
// Because `docker_cleanup(true)` is set, dropping `model` destroys the container and its volume.

The container (`oddonkey-ollama`) persists pulled models across restarts by default. Enable `docker_cleanup(true)` for disposable runs that remove the container and its data when the model is dropped.
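To give a rough idea of what Docker mode has to arrange, the sketch below builds the arguments for an equivalent `docker run` command. It is not OddOnkey's actual implementation: the helper `docker_run_args` and the volume name `oddonkey-ollama-data` are hypothetical, while the `ollama/ollama` image, the `/root/.ollama` data directory, the internal port 11434, and the `--gpus=all` flag follow the public Ollama and Docker documentation.

```rust
/// Build the argument list for a `docker run` invocation that starts Ollama.
/// The volume name is a hypothetical choice for this sketch.
fn docker_run_args(host_port: u16, gpu: bool) -> Vec<String> {
    let mut args: Vec<String> = vec![
        "run".into(),
        "-d".into(),
        "--name".into(),
        "oddonkey-ollama".into(),
        // Named volume so pulled models survive container restarts.
        "-v".into(),
        "oddonkey-ollama-data:/root/.ollama".into(),
        // Map the chosen host port to Ollama's default port inside the container.
        "-p".into(),
        format!("{host_port}:11434"),
    ];
    if gpu {
        // GPU passthrough; requires the NVIDIA Container Toolkit on the host.
        args.push("--gpus=all".into());
    }
    // The image name must come after all options.
    args.push("ollama/ollama".into());
    args
}

fn main() {
    let args = docker_run_args(11434, true);
    // Print the command instead of running it, so the sketch has no side effects.
    println!("docker {}", args.join(" "));
}
```

Keeping the models in a named volume rather than in the container's own filesystem is what would let them persist across container restarts.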

You can also manage the container directly:

use oddonkey::DockerManager;

let mgr = DockerManager::new().gpu(true);
mgr.stop()?;    // stop the container (models persist)
mgr.destroy()?; // stop + remove container and volume

System Pre-prompts

model.add_preprompt("You are a friendly pirate.");
// or replace all pre-prompts:
model.set_preprompt("You are a concise assistant.");
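To make the difference between the two calls concrete, here is a minimal sketch of how a chat history with system pre-prompts might be stored. The `History` and `Role` types are hypothetical and are not OddOnkey's internals; the sketch only assumes that `add_preprompt` appends a system message while `set_preprompt` replaces all existing ones, as the comments above describe.

```rust
/// Hypothetical message roles, following the usual chat-API convention.
#[derive(Debug, Clone, PartialEq)]
enum Role {
    System,
    User,
    Assistant,
}

/// Hypothetical history container (not OddOnkey's real type).
#[derive(Debug, Default)]
struct History {
    messages: Vec<(Role, String)>,
}

impl History {
    /// Append one more system message, keeping system messages ahead of the conversation.
    fn add_preprompt(&mut self, text: &str) {
        let idx = self
            .messages
            .iter()
            .take_while(|(role, _)| *role == Role::System)
            .count();
        self.messages.insert(idx, (Role::System, text.to_string()));
    }

    /// Drop every system message, then install a single new one at the front.
    fn set_preprompt(&mut self, text: &str) {
        self.messages.retain(|(role, _)| *role != Role::System);
        self.messages.insert(0, (Role::System, text.to_string()));
    }

    /// Record one user/assistant exchange.
    fn push_exchange(&mut self, user: &str, assistant: &str) {
        self.messages.push((Role::User, user.to_string()));
        self.messages.push((Role::Assistant, assistant.to_string()));
    }
}

fn main() {
    let mut history = History::default();
    history.add_preprompt("You are a friendly pirate.");
    history.add_preprompt("Answer briefly.");
    history.push_exchange("Hi", "Ahoy!");
    // Replaces both pirate pre-prompts but keeps the conversation turns.
    history.set_preprompt("You are a concise assistant.");
    for (role, text) in &history.messages {
        println!("{role:?}: {text}");
    }
}
```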

Generation Options

use oddonkey::GenerationOptions;

model.set_options(
    GenerationOptions::default()
        .temperature(0.3)
        .num_ctx(8192)
        .top_p(0.9)
        .top_k(40)
        .repeat_penalty(1.1)
        .seed(42)
);

Streaming

use tokio_stream::StreamExt; // requires the `tokio-stream` crate in your Cargo.toml

let mut stream = model.prompt_stream("Tell me a joke").await?;
let mut full = String::new();
while let Some(tok) = stream.next().await {
    let tok = tok?;
    print!("{tok}");
    full.push_str(&tok);
}
// Save the exchange in history for follow-up context
model.push_assistant_message("Tell me a joke", &full);

Embeddings

// Single text
let vec = model.embed("Rust is awesome").await?;

// Batch
let vecs = model.embed_batch(&["hello", "world"]).await?;
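A common next step with embeddings is comparing two of them. The sketch below computes cosine similarity with the standard library only. It assumes the vectors are slices of `f32`, which may not match the exact type OddOnkey returns, and `cosine_similarity` is a helper written for this example rather than part of the crate.

```rust
/// Cosine similarity between two equal-length vectors.
/// Returns `None` when the lengths differ or either vector has zero norm.
fn cosine_similarity(a: &[f32], b: &[f32]) -> Option<f32> {
    if a.len() != b.len() {
        return None;
    }
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm_a == 0.0 || norm_b == 0.0 {
        return None;
    }
    Some(dot / (norm_a * norm_b))
}

fn main() {
    // Stand-in vectors; real embeddings would come from `embed` or `embed_batch`.
    let hello = [1.0_f32, 0.0, 1.0];
    let world = [1.0_f32, 1.0, 0.0];
    match cosine_similarity(&hello, &world) {
        Some(score) => println!("similarity = {score:.3}"),
        None => println!("vectors are not comparable"),
    }
}
```

Values close to 1 indicate vectors pointing in nearly the same direction, and values near 0 indicate unrelated directions.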

Per-prompt Report

Enable with `.report(true)` on the builder or `.enable_report(true)` at runtime:

let answer = model.prompt("Explain borrow checking.").await?;
if let Some(report) = model.last_report() {
    println!("{report}");
}

Output:

── report ──────────────────────────────────
model           : mistral
duration        : 1423 ms
prompt tokens   : ~12 (est.)
completion tkns : ~87 (est.)
tokens/sec      : 61.1
request size    : 245 bytes
response size   : 534 bytes
────────────────────────────────────────────
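The throughput figure in the sample report is consistent with dividing the completion token count by the wall-clock duration. The sketch below reproduces that arithmetic; `tokens_per_sec` is a hypothetical helper, and the formula is inferred from the sample numbers rather than taken from OddOnkey's source.

```rust
/// Tokens per second from a completion-token count and a duration in milliseconds.
fn tokens_per_sec(completion_tokens: u64, duration_ms: u64) -> f64 {
    if duration_ms == 0 {
        // Avoid dividing by zero for degenerate measurements.
        return 0.0;
    }
    completion_tokens as f64 / (duration_ms as f64 / 1000.0)
}

fn main() {
    // Values taken from the sample report: ~87 completion tokens in 1423 ms.
    let tps = tokens_per_sec(87, 1423);
    println!("tokens/sec : {tps:.1}"); // prints "tokens/sec : 61.1"
}
```

Because the token counts are estimates, the resulting throughput should be read as an approximation too.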

Architecture

OddOnkey uses a hexagonal (ports and adapters) architecture. The core logic knows nothing about HTTP or Docker; it depends only on the `LlmProvider` trait.

src/
├── lib.rs                  # public re-exports
├── core/
│   ├── oddonkey.rs          # OddOnkey struct (backend-agnostic)
│   └── builder.rs           # OddOnkeyBuilder
├── domain/                 # pure value objects (no I/O)
│   ├── error.rs             # OddOnkeyError
│   ├── message.rs           # ChatMessage
│   ├── options.rs           # GenerationOptions
│   └── report.rs            # PromptReport
├── ports/
│   └── llm_provider.rs      # LlmProvider trait
└── adapters/
    ├── ollama/              # Ollama HTTP adapter
    │   ├── client.rs         # LlmProvider implementation
    │   ├── installer.rs      # auto-install & server start
    │   ├── pull.rs           # model pull with optional progress
    │   ├── stream.rs         # TokenStream
    │   └── types.rs          # Ollama JSON DTOs
    └── docker/              # Docker adapter (feature-gated)
        └── manager.rs        # container lifecycle management

To add a new backend (e.g. llama.cpp, vLLM, a remote API), implement the `LlmProvider` trait; no changes to `core/` are required.
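The ports-and-adapters idea can be sketched in a few lines. The `TextProvider` trait, `Assistant` struct, and `EchoProvider` adapter below are hypothetical and deliberately synchronous; the real `LlmProvider` trait is probably async and has a different signature, so treat this as an illustration of the pattern rather than the crate's API.

```rust
/// Hypothetical, simplified port. It only illustrates the shape of the pattern.
trait TextProvider {
    fn complete(&self, prompt: &str) -> Result<String, String>;
}

/// Core type that depends only on the port, never on a concrete backend.
struct Assistant<P: TextProvider> {
    provider: P,
    turns: Vec<(String, String)>,
}

impl<P: TextProvider> Assistant<P> {
    fn new(provider: P) -> Self {
        Self { provider, turns: Vec::new() }
    }

    /// Forward the prompt to whichever adapter was plugged in and record the turn.
    fn prompt(&mut self, text: &str) -> Result<String, String> {
        let answer = self.provider.complete(text)?;
        self.turns.push((text.to_string(), answer.clone()));
        Ok(answer)
    }
}

/// Toy adapter standing in for an HTTP, llama.cpp, or remote-API backend.
struct EchoProvider;

impl TextProvider for EchoProvider {
    fn complete(&self, prompt: &str) -> Result<String, String> {
        Ok(format!("echo: {prompt}"))
    }
}

fn main() {
    let mut assistant = Assistant::new(EchoProvider);
    match assistant.prompt("Hello!") {
        Ok(answer) => println!("{answer}"),
        Err(e) => eprintln!("provider error: {e}"),
    }
    println!("turns recorded: {}", assistant.turns.len());
}
```

Swapping `EchoProvider` for another type that implements the trait changes the backend without touching `Assistant`, which is the property the layout above is designed to preserve.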

Examples

Run the bundled examples:

# Interactive pirate chat
cargo run --example chat

# Streaming token-by-token
cargo run --example stream

# Embeddings + cosine similarity
cargo run --example embeddings

# Per-prompt timing report
cargo run --example report --features report

# Use a specific model
cargo run --example chat -- llama3

Minimum Supported Rust Version

OddOnkey targets Rust 1.75+ (edition 2021).

Contributing

Contributions are welcome! Please open an issue or submit a pull request on GitHub.

1. Fork the repo
2. Create a feature branch (`git checkout -b feat/my-feature`)
3. Commit your changes (`git commit -m "feat: add my feature"`)
4. Push to the branch (`git push origin feat/my-feature`)
5. Open a Pull Request

License

MIT
