Add Reference Architecture Diagrams & Component Overview #1194

@Classic298

Summary

Add a conceptual architecture overview and reference architecture diagrams showing how Open WebUI's components relate to each other. Diagrams should use component roles (not specific product names) to stay tool-agnostic.

Gap Analysis — What Exists vs What's Missing

What the docs already cover thoroughly

  • Quick Start — excellent procedural docs for the 99% hobbyist case. Covers Docker, Compose, Podman, pip, Helm, Kubernetes, WSL, etc.
  • Scaling guide — thorough step-by-step progression (SQLite → PostgreSQL, add Redis, external vector DB, shared storage, observability). Already includes an ASCII architecture diagram for the fully-scaled production deployment.
  • Features overview — comprehensive capability listing with links to every subsection.
  • Extensibility docs — extremely well-documented. The hierarchy is clear:
    • Extensibility overview has an "Architecture at a Glance" table mapping layers (in-process Python, external HTTP, pipeline workers) to their use cases and trade-offs.
    • Plugin overview explains Tools vs Functions vs Pipelines with clear metaphors and a TL;DR.
    • Tools page includes a Tooling Taxonomy table (native features, workspace tools, MCP, OpenAPI, Open Terminal), full documentation of both tool calling modes (Default/Legacy vs Native/Agentic), and an extensive built-in system tools reference with parameter/output schemas, knowledge tool availability matrices, and per-model granular category toggles.
    • Functions pages individually document Pipes, Filters, and Actions with code skeletons, use cases, and architecture explanations.
    • Development pages cover events (__event_emitter__, __event_call__), valves, rich UI embedding, and reserved arguments.
  • Hardening, logging, and development docs round out the advanced topics.

What's actually missing

1. "What am I deploying?" conceptual overview — this is the primary gap.

The docs go from docker run (quick-start) straight to production scaling (advanced topics). There's no page that explains the logical components inside a running Open WebUI instance and how they interact at a high level.

A user who runs docker run successfully still doesn't understand what's inside the container or how it works architecturally: the web UI server, the inference connection layer, the RAG pipeline (content extraction → chunking → embedding → vector storage → retrieval), the database, file storage, the extensibility layer. The scaling guide mentions these components when telling you to swap them out (e.g., "switch from pypdf to Tika", "switch from SentenceTransformers to external embeddings"), but it assumes you already know what they are and why they exist.

This is the gap that tripped up the sysadmin in the Discord conversation. He could follow the commands but couldn't reason about the system — he didn't have a mental model of the data flow. That's why he was confused about whether his bottleneck was Tika or the LLM.
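To illustrate the kind of mental model a component overview would give, here is a minimal sketch of the data flow with per-stage timing. Every function is a toy stand-in written for this example (word-count "embeddings", an in-memory list as the vector store, a stub `generate` in place of the model call); none of it is Open WebUI's actual code. The point is only that each stage can be measured separately, which is what separates "extraction is slow" from "the LLM is slow".

```python
import math
import time
from collections import Counter


def extract(raw: bytes) -> str:
    # Stand-in for content extraction (the role pypdf, Tika, or Docling plays).
    return raw.decode("utf-8")


def chunk(text: str, size: int = 8) -> list[str]:
    # Split the text into fixed-size groups of words.
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]


def embed(text: str) -> Counter:
    # Toy "embedding": lowercase word counts instead of a real vector model.
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(store: list[tuple[str, Counter]], query: str) -> str:
    # Return the stored chunk most similar to the query.
    q = embed(query)
    return max(store, key=lambda item: cosine(q, item[1]))[0]


def generate(context: str, query: str) -> str:
    # Stub for the inference call; a real deployment would contact a model here.
    return f"Q: {query} | context: {context}"


def run(raw: bytes, query: str) -> tuple[str, dict[str, float]]:
    timings: dict[str, float] = {}

    def timed(stage, fn, *args):
        start = time.perf_counter()
        result = fn(*args)
        timings[stage] = time.perf_counter() - start
        return result

    text = timed("extract", extract, raw)
    chunks = timed("chunk", chunk, text)
    store = timed("embed+store", lambda cs: [(c, embed(c)) for c in cs], chunks)
    context = timed("retrieve", retrieve, store, query)
    answer = timed("generate", generate, context, query)
    return answer, timings
```

Calling `run(document_bytes, "some question")` returns the stub answer plus a dict of seconds spent per stage, in pipeline order.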

2. Reference architecture diagrams for common deployment patterns.

The scaling guide has one diagram, but it's the fully-scaled production setup (load balancer → multiple pods → PostgreSQL + PGVector → Redis → shared storage). There are no diagrams for intermediate or common patterns. Most users' deployments sit somewhere between "single docker run" and "full enterprise HA" — and there's no visual representation of those middle-ground architectures.

3. Need-to-extension-type mapping (minor gap).

The extensibility docs are thorough in explaining what each extension type is and how to build one. The Tools page has a Tooling Taxonomy that distinguishes between types of tools. The Functions page explains the three function types with use cases. The extensibility overview has an Architecture at a Glance table.

What's slightly missing is a single, prominent "I want to do X → use Y" decision guide that a non-developer can scan. The existing tables map extension types to their properties, but don't map user problems to extension types. This is the "~90% of feature requests can be solved by a tool/filter/action" observation — people know what they want to accomplish but don't know which extension mechanism to reach for. This is a minor gap since the information is effectively there across multiple pages, but consolidating it into one quick-reference would reduce unnecessary feature requests.

What to Add

Part 1: Component Overview Page (primary addition)

A new page that explains the architecture of a running Open WebUI instance. Written for someone who successfully ran docker run and now wants to understand what they deployed. Should cover:

  • Frontend — the web interface (SvelteKit)
  • Backend — the Python/FastAPI server handling all logic
  • Inference connections — how Open WebUI talks to model providers (Ollama, OpenAI-compatible APIs, Anthropic, vLLM, etc.) — this is a connection pattern, not a single component
  • RAG pipeline — the document ingestion and retrieval chain: content extraction (default: pypdf; alternatives: Tika, Docling) → chunking → embedding (default: SentenceTransformers; alternatives: Ollama, OpenAI) → vector storage (default: ChromaDB; alternatives: PGVector, Milvus, Qdrant, etc.) → retrieval at query time
  • Database — where chats, users, settings live (default: SQLite; production: PostgreSQL)
  • File storage — where uploaded files are persisted (default: local filesystem; scaled: NFS, S3, GCS, Azure Blob)
  • Extensibility layer — brief mention of Tools, Functions, Pipes, Filters, Actions, MCP, Pipelines with links to the existing thorough docs

The page should have a single overview diagram with 1-2 paragraphs per component explaining what it does, why it exists, and what the default vs production alternatives are. It should link forward to the scaling guide for production hardening.
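
As a rough illustration of how the page could be organized, the default-versus-alternative pairs listed above fit a small role-keyed table. The sketch below only restates the names from the list; the `Component` type and `describe` helper are hypothetical and exist only for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Component:
    role: str                      # what the component does, independent of product
    default: str                   # what a plain install uses, per the list above
    alternatives: tuple[str, ...]  # swap-in options named in the list above


COMPONENTS = (
    Component("content extraction", "pypdf", ("Tika", "Docling")),
    Component("embedding", "SentenceTransformers", ("Ollama", "OpenAI")),
    Component("vector storage", "ChromaDB", ("PGVector", "Milvus", "Qdrant")),
    Component("database", "SQLite", ("PostgreSQL",)),
    Component("file storage", "local filesystem", ("NFS", "S3", "GCS", "Azure Blob")),
)


def describe(components=COMPONENTS) -> list[str]:
    # One line per role, suitable for a caption under the overview diagram.
    return [
        f"{c.role}: {c.default} (alternatives: {', '.join(c.alternatives)})"
        for c in components
    ]
```

Keying the table by role rather than product matches the goal of keeping the diagrams tool-agnostic: the product names become data that can change without redrawing anything.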

Part 2: Reference Architecture Diagrams

5-6 simplified architecture diagrams using component roles (not specific tools):

  1. Minimal (cloud API) — Open WebUI → cloud API provider. No local inference, no RAG.
  2. Local with Ollama — Open WebUI ↔ Ollama on same machine or network. No RAG.
  3. Local with RAG — Open WebUI ↔ Inference Server ↔ Document Parser ↔ Embedding Engine ↔ Vector Store. The "I want the LLM to answer questions about my documents" setup.
  4. Hybrid — Open WebUI in cloud, inference on-prem (or vice versa). Common for enterprises where GPU hardware is separate.
  5. Multi-model — Open WebUI ↔ multiple inference backends. For teams using different models for different tasks.
  6. Full production — link to the existing scaling guide diagram (load balancer → multiple pods → PostgreSQL + Redis + shared storage + external vector DB).

Each diagram should show component boxes labeled by role, arrows showing data flow, relevant default ports, and clear system boundaries. Brief text under each diagram should explain when and why you'd pick that pattern.
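
One possible way to keep the diagrams consistent is to describe each pattern as a list of role-to-role edges and render the boxes and arrows from that. The sketch below is an assumption-heavy illustration: the pattern keys, the role labels, and the hub-and-spoke edges for the RAG pattern are choices made for this example, not a statement of how Open WebUI wires its components.

```python
# Each pattern is a list of (source role, destination role) edges.
# Role names and edges are illustrative assumptions, not a definitive topology.
PATTERNS = {
    "minimal": [("Web UI", "Cloud inference API")],
    "local": [("Web UI", "Local inference server")],
    "local-rag": [
        ("Web UI", "Inference server"),
        ("Web UI", "Document parser"),
        ("Web UI", "Embedding engine"),
        ("Web UI", "Vector store"),
    ],
    "multi-model": [
        ("Web UI", "Inference backend A"),
        ("Web UI", "Inference backend B"),
    ],
}


def render(name: str) -> str:
    # Produce a plain box-and-arrow listing for one pattern.
    return "\n".join(f"[{src}] --> [{dst}]" for src, dst in PATTERNS[name])


def roles(name: str) -> list[str]:
    # Every distinct role that appears in a pattern, sorted for stable output.
    return sorted({role for edge in PATTERNS[name] for role in edge})
```

For example, `print(render("local-rag"))` emits one arrow per edge, which is enough to check that every pattern uses the same role vocabulary before drawing the final diagrams.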

Part 3: Extensibility Quick-Reference (minor addition)

A small addition to the existing extensibility overview page — a "What do you want to do?" quick-reference that maps user needs to extension types. Something like:

| I want to... | Use a... | Example |
| --- | --- | --- |
| Give the LLM access to live external data | Tool | Weather API, stock prices, CRM queries |
| Modify every message before it reaches the LLM | Filter (inlet) | PII redaction, prompt injection, context enrichment |
| Modify every response before it reaches the user | Filter (outlet/stream) | Formatting, compliance filtering, logging |
| Add a custom button to the chat UI | Action | "Summarize", "Translate", graph visualization |
| Add a new model provider | Pipe | Anthropic, Vertex AI, custom API |
| Connect an existing service with an API spec | OpenAPI / MCP | Any REST service, MCP-compatible tools |
| Run heavy/GPU processing separately | Pipeline | Cross-encoder reranking, custom ML models |

This doesn't replace the existing thorough documentation — it's a quick-scan entry point that helps users identify which docs to read. The information is already spread across the plugin overview, tools page, functions page, and extensibility overview, but having it in one scannable table on the extensibility landing page would help route people faster.
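
To make one row of the table concrete, here is a minimal sketch of the "Filter (inlet)" case using PII redaction as the example. It assumes a `Filter` class with `inlet` and `outlet` methods that receive and return the request body, and it assumes the body carries a `messages` list of role/content dicts; both the exact signatures and the body layout should be checked against the existing Functions documentation. The email pattern is deliberately simple and illustrative only.

```python
import re
from typing import Optional

# Simplistic email pattern for illustration; real PII detection needs more care.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")


class Filter:
    def inlet(self, body: dict, __user__: Optional[dict] = None) -> dict:
        # Runs before the request reaches the model: redact emails in user messages.
        for message in body.get("messages", []):
            content = message.get("content")
            if message.get("role") == "user" and isinstance(content, str):
                message["content"] = EMAIL.sub("[redacted email]", content)
        return body

    def outlet(self, body: dict, __user__: Optional[dict] = None) -> dict:
        # Runs after the model responds: this sketch passes the body through unchanged.
        return body
```

Under these assumptions, a user message such as "Mail alice@example.com please" would reach the model as "Mail [redacted email] please", while the response path is left untouched.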

Suggested Location

  • Component overview + reference architectures: new page at getting-started/architecture or getting-started/how-it-works, linked from quick-start's "After You Install" section
  • Extensibility quick-reference: small addition to the existing features/extensibility/ overview page, near the top

Notes

  • Keep diagrams simple — box-and-arrow, not detailed network topology
  • Label components by role, not product name
  • These are reference patterns, not prescriptive
  • Component overview should link forward to scaling guide for production hardening
  • Consider linking from quick-start for users who want more context before diving in
