Ollive Inference Chatbot

A full-stack LLM chatbot with a lightweight inference logging SDK, near-real-time ingestion API, and PostgreSQL storage for messages and inference metadata.

Features

Requirement	Implementation
Multi-turn chatbot	Conversation history (last 20 messages) sent to the model
Simple UI	React app — list, resume, cancel conversations
Inference SDK	`@ollive/inference-sdk` wraps LLM calls, captures metadata, POSTs to ingest
Ingestion pipeline	`POST /api/ingest` — Zod validation, persistence
Database	PostgreSQL — `conversations`, `messages`, `inference_logs`
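
The SDK row above describes wrapping an LLM call, capturing metadata, and forwarding it to the ingest endpoint. The sketch below shows one way such a wrapper could look. It is not the actual `@ollive/inference-sdk` code: `withInferenceLogging`, `InferenceEvent`, `EventSink`, and every field name are hypothetical, and the sink is passed in as a function so the example runs without a network. In a real setup the sink would presumably POST the event to `/api/ingest`.

```typescript
// Hypothetical event shape; the real SDK's fields are not documented in this README.
export interface InferenceEvent {
  provider: string;
  model: string;
  status: "success" | "error";
  latencyMs: number;
  errorMessage?: string;
}

// Destination for events. Assumed to be an HTTP POST to /api/ingest in practice.
export type EventSink = (event: InferenceEvent) => Promise<void>;

export async function withInferenceLogging<T>(
  meta: { provider: string; model: string },
  call: () => Promise<T>,
  sink: EventSink,
): Promise<T> {
  const startedAt = Date.now();

  // Deliver one event; a failing sink is warned about and never retried,
  // so a logging outage cannot break the chat request itself.
  const emit = async (event: InferenceEvent): Promise<void> => {
    try {
      await sink(event);
    } catch (err) {
      console.warn("inference log ingest failed", err);
    }
  };

  try {
    const result = await call();
    await emit({ ...meta, status: "success", latencyMs: Date.now() - startedAt });
    return result;
  } catch (err) {
    await emit({
      ...meta,
      status: "error",
      latencyMs: Date.now() - startedAt,
      errorMessage: err instanceof Error ? err.message : String(err),
    });
    throw err; // the caller still sees the original provider error
  }
}
```

Passing the sink as a parameter keeps the wrapper independent of the transport, which matches the "separable in production" idea: the same wrapper could write to an in-process handler locally and to a remote ingest service elsewhere.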

Bonus

Multi-provider: Google Gemini (default), OpenAI, Anthropic
Streaming responses (SSE)
Latency / throughput / errors dashboard (live panel)
Docker Compose one-command setup
PII redaction in log previews (email, phone, SSN, cards, API keys)
Event-style decoupling: SDK → HTTP ingest (same process locally; separable in production)
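
As an illustration of the PII redaction item above, here is a minimal regex-based sketch. The patterns, the `[LABEL]` placeholder format, and the `redactPii` name are assumptions made for this example; the project's actual rules are not shown in this README and are likely more thorough.

```typescript
// Hypothetical patterns, applied in order. More specific formats run first
// so that a broader pattern (such as PHONE) cannot partially consume them.
const PII_PATTERNS: ReadonlyArray<{ label: string; pattern: RegExp }> = [
  { label: "EMAIL", pattern: /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/g },
  // Assumes keys that start with "sk-", as in the .env examples.
  { label: "API_KEY", pattern: /\bsk-[A-Za-z0-9_-]{16,}\b/g },
  { label: "SSN", pattern: /\b\d{3}-\d{2}-\d{4}\b/g },
  // 13 to 16 digits, optionally separated by single spaces or hyphens.
  { label: "CARD", pattern: /\b\d(?:[ -]?\d){12,15}\b/g },
  // North American style numbers with an optional country code.
  { label: "PHONE", pattern: /(?:\+?\d{1,2}[ .-]?)?\(?\d{3}\)?[ .-]?\d{3}[ .-]?\d{4}\b/g },
];

// Replace every match of every pattern with a bracketed label.
export function redactPii(text: string): string {
  return PII_PATTERNS.reduce(
    (acc, { label, pattern }) => acc.replace(pattern, `[${label}]`),
    text,
  );
}
```

For example, `redactPii("Mail jane@example.com")` returns `"Mail [EMAIL]"`. Regex redaction of this kind is best-effort: it can miss unusual formats and can over-match digit runs, so it reduces exposure in previews rather than guaranteeing it.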

Quick start (Docker)

cp .env.example .env
# Add at least one key:
# OPENAI_API_KEY=sk-...
# ANTHROPIC_API_KEY=sk-ant-...
# GEMINI_API_KEY=your-gemini-api-key-here

docker compose up --build

UI: http://localhost:5173
API: http://localhost:3001/health

Local development

Prerequisites: Node 20+, PostgreSQL 16 (or use Docker for Postgres only).

npm install
cp .env.example .env
# Start Postgres (or: docker compose up postgres -d)

npm run db:migrate
npm run dev

Web: http://localhost:5173 (proxies /api → :3001)
API: http://localhost:3001

Project structure

├── packages/inference-sdk/   # Logging wrapper (publishable SDK)
├── apps/api/                 # Chat API + ingestion + dashboard
├── apps/web/                 # React UI
├── docker-compose.yml
├── ARCHITECTURE.md           # Design notes
└── README.md

API overview

Endpoint	Description
`GET /api/conversations`	List conversations
`POST /api/conversations`	Create conversation
`GET /api/conversations/:id`	Resume — messages + metadata
`POST /api/conversations/:id/cancel`	Cancel conversation
`POST /api/chat/message`	Send message (`stream: true` for SSE)
`POST /api/chat/cancel-stream`	Abort in-flight stream
`GET /api/chat/providers`	Available providers/models
`POST /api/ingest`	Inference log ingestion (SDK target)
`GET /api/dashboard/metrics`	Latency, throughput, errors

Schema design

conversations — session container; status (active | cancelled), optional provider/model defaults.

messages — append-only chat history; FK to conversation with ON DELETE CASCADE.

inference_logs — one row per inference attempt; previews only (not full payloads) to limit storage and PII exposure. Indexed by conversation_id, provider, status for dashboard queries.
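
To make the dashboard use of `inference_logs` concrete, the sketch below groups rows by provider and derives a request count, error rate, and average latency. It works on in-memory objects purely for illustration; the real dashboard presumably runs an equivalent grouped query in PostgreSQL. The `InferenceLogRow` fields (`provider`, `status`, `latencyMs`) and the function name are assumptions, not the actual column names.

```typescript
// Hypothetical row shape; the actual inference_logs columns may differ.
interface InferenceLogRow {
  provider: string;
  status: "success" | "error";
  latencyMs: number;
}

interface ProviderStats {
  count: number;
  errorRate: number;     // fraction of rows with status "error", 0..1
  avgLatencyMs: number;
}

export function summarizeByProvider(
  rows: readonly InferenceLogRow[],
): Map<string, ProviderStats> {
  // First pass: accumulate raw totals per provider.
  const totals = new Map<string, { count: number; errors: number; latencySum: number }>();
  for (const row of rows) {
    const t = totals.get(row.provider) ?? { count: 0, errors: 0, latencySum: 0 };
    t.count += 1;
    if (row.status === "error") t.errors += 1;
    t.latencySum += row.latencyMs;
    totals.set(row.provider, t);
  }

  // Second pass: turn totals into rates and averages.
  const stats = new Map<string, ProviderStats>();
  totals.forEach((t, provider) => {
    stats.set(provider, {
      count: t.count,
      errorRate: t.errors / t.count,
      avgLatencyMs: t.latencySum / t.count,
    });
  });
  return stats;
}
```

Grouping on `provider` and filtering on `status` are the access patterns that the indexes described above are meant to serve.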

Tradeoffs

Previews capped at 500 chars in the SDK; full messages live in messages only.
Ingestion is a synchronous HTTP call that returns 202 Accepted; a failed ingest request is logged as a warning and not retried (see ARCHITECTURE.md).
Context window fixed at 20 messages — simple and predictable; not token-aware.
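
The preview cap and the fixed context window are simple enough to sketch in a few lines. The constants mirror the numbers stated above (500 characters, 20 messages); the `ChatMessage` shape and the function names are assumptions for illustration only.

```typescript
// Hypothetical message shape; the real schema may carry more fields.
interface ChatMessage {
  role: "user" | "assistant" | "system";
  content: string;
}

const MAX_CONTEXT_MESSAGES = 20; // fixed window, counted in messages, not tokens
const MAX_PREVIEW_CHARS = 500;   // cap applied to log previews

// Keep only the most recent `limit` messages for the model call.
export function trimHistory(
  messages: readonly ChatMessage[],
  limit: number = MAX_CONTEXT_MESSAGES,
): ChatMessage[] {
  // slice(-0) would return everything, so handle non-positive limits explicitly.
  return limit <= 0 ? [] : messages.slice(-limit);
}

// Truncate text for storage as a preview; the full text stays in `messages`.
export function toPreview(text: string, maxChars: number = MAX_PREVIEW_CHARS): string {
  return text.slice(0, maxChars);
}
```

Because the window counts messages rather than tokens, twenty very long messages can still exceed a model's context limit, which is the gap that token-aware trimming would close.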

Environment variables

See .env.example.

What we'd improve with more time

Retry queue (Redis/SQS) for failed ingest events
Token-based context trimming
Auth + multi-tenant conversation isolation
Grafana dashboards from inference_logs
Separate ingest worker service and read replicas
Kubernetes manifests (Helm) for self-hosted deploy

Demo

Screenshots from a local run (Gemini gemini-2.0-flash, inference logging + live metrics).

Chat UI — multi-turn conversation

Multi-turn chat with provider/model selection and streaming support.

Model selection (multi-provider)

Switch between Gemini models (gemini-2.0-flash, gemini-2.5-flash, etc.) from the header.

Streaming response

Token-by-token streaming while the assistant generates a reply.

Inference metrics dashboard

Live 24h panel: latency, throughput, error breakdown, and per-provider stats (fed by the ingestion pipeline).

Run it yourself

docker compose up postgres -d
npm run dev

Open http://localhost:5173 — see Quick start for full setup.

Architecture notes: ARCHITECTURE.md
