Your AI Assistant, Unlimited.
A supercharged fork of Vellum Assistant: runs on 15+ LLM providers, works for free with Ollama, Groq, and OpenRouter, and ships with Firecrawl web scraping and agentic workflows built in.
| Area | Summary |
|---|---|
| Memory | Learns what matters and forgets what doesn't. Structured memory items — identity, preferences, projects, events — extracted with source attribution and deduplication. Hybrid retrieval (dense + sparse) ranks results semantically and lexically, with staleness windows per memory type. Per-user and per-channel isolation. Embeddings run locally by default. |
| Identity | Becomes its own. Behavior lives in SOUL.md, and during onboarding the assistant observes how you communicate and writes its own personality files. A per-user journal captures its reflections on past interactions. NOW.md acts as an ephemeral scratchpad for current focus and active threads. |
| Proactivity | Reaches out when something matters, without being asked. Every hour it checks in with itself: re-reads its notes, notices what's unfinished or due soon, and sends a message if needed. Notifications are routed to the right channel and won't interrupt you if you're already talking. |
| Security | Fail-closed by design. Actor identity is resolved once (guardian, trusted, or unknown) and enforced everywhere. Untrusted actors cannot read or write memory, trigger tools, or escalate. Credentials live in a separate process and never reach the model. Every tool runs in a sandbox. |
| Multi-Provider | 15+ LLM providers out of the box. Anthropic, OpenAI, Google Gemini, Groq, Mistral, Cohere, Together AI, Fireworks, Perplexity, DeepSeek, xAI Grok, Ollama, LM Studio, OpenRouter, and more. Swap models without changing anything else. |
| Free Tiers | Run CLUD for $0. Ollama and LM Studio run models fully locally. Groq's free tier covers most daily usage. OpenRouter's `:free` model suffix unlocks dozens of free cloud models. |
| Web Scraping | Firecrawl integration for intelligent web scraping. Extract clean content from any URL, crawl entire sites, and feed the results directly into your agentic workflows. |
| Agentic Workflows | Multi-step tasks, autonomously. CLUD can plan, execute tool chains, browse the web, read files, write code, and loop until the job is done — without hand-holding. |
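
The Memory row above describes hybrid retrieval (dense + sparse) with a staleness window per memory type. As a rough illustration of that idea, and not CLUD's actual implementation, here is a minimal sketch. The `MemoryItem` shape, the window lengths, the 0.6/0.4 weighting, and the toy word-overlap score are all assumptions made for this example.

```python
from dataclasses import dataclass
import math

# Hypothetical staleness windows in days, one per memory type (illustrative values).
STALENESS_DAYS = {"identity": 3650, "preferences": 365, "projects": 90, "events": 30}


@dataclass
class MemoryItem:
    kind: str               # one of the memory types named in the table
    text: str
    embedding: list[float]  # dense vector; assumed to be computed elsewhere
    age_days: float


def cosine(a, b):
    """Dense similarity: cosine of the angle between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def lexical_overlap(query, text):
    """Sparse similarity stand-in: fraction of query words found in the text."""
    q = set(query.lower().split())
    t = set(text.lower().split())
    return len(q & t) / len(q) if q else 0.0


def hybrid_rank(query, query_embedding, items, dense_weight=0.6):
    """Drop stale items, then rank the rest by a weighted dense + sparse score."""
    scored = []
    for item in items:
        # Items of an unknown type get a window of 0 days and are skipped.
        if item.age_days > STALENESS_DAYS.get(item.kind, 0):
            continue
        score = (dense_weight * cosine(query_embedding, item.embedding)
                 + (1 - dense_weight) * lexical_overlap(query, item.text))
        scored.append((score, item))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [item for _, item in scored]
```

A production system would likely swap the word-overlap score for something like BM25 and take embeddings from a real model. The point here is only how the two signals and the staleness filter combine.
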
| Provider | Free Tier | Local | Best Models | Notes |
|---|---|---|---|---|
| Ollama | ✅ Fully free | ✅ Yes | Llama 3, Mistral, Phi-3, Gemma | No API key needed |
| LM Studio | ✅ Fully free | ✅ Yes | Any GGUF model | Desktop app required |
| Groq | ✅ Generous free tier | ❌ No | Llama 3.3, Gemma 2, Mixtral | Fastest inference available |
| OpenRouter | ✅ `:free` models | ❌ No | 50+ free models | Single key, many providers |
| Anthropic | ❌ Paid | ❌ No | Claude 3.5 Sonnet, Claude 3 Opus | Best reasoning |
| OpenAI | ❌ Paid | ❌ No | GPT-4o, o1, o3 | Widest ecosystem |
| Google Gemini | ✅ Free tier | ❌ No | Gemini 2.0 Flash, Gemini 1.5 Pro | Huge context window |
| Mistral | ❌ Paid | ❌ No | Mistral Large, Codestral | Best for code |
| Cohere | ✅ Trial credits | ❌ No | Command R+ | RAG-optimized |
| Together AI | ✅ Trial credits | ❌ No | 100+ open models | Best for open-source models |
| Fireworks AI | ✅ Trial credits | ❌ No | Llama 3, Mixtral | Fast serverless inference |
| Perplexity | ❌ Paid | ❌ No | Sonar, Sonar Pro | Search-augmented |
| DeepSeek | ✅ Low cost | ❌ No | DeepSeek-V3, DeepSeek-R1 | Cost-effective reasoning |
| xAI Grok | ❌ Paid | ❌ No | Grok-2 | Real-time X/Twitter data |
See assets/PROVIDERS.md for the full provider reference with setup links and context windows.
1. Clone CLUD
git clone https://github.com/clud-ai/clud.git
cd clud
./setup.sh
2. Choose your provider
- Free & local: install Ollama or LM Studio, no API key needed
- Free cloud: get a Groq API key (free tier) or an OpenRouter key (use `:free` models)
- Paid cloud: Anthropic, OpenAI, Google, or any other supported provider
3. Hatch your assistant
clud hatch
Give it a name, a personality, and the keys to your work.
See CLUD.md for a detailed getting-started guide and provider setup walkthrough.
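
The provider choice in step 2 boils down to a preference order: local first, then free cloud tiers, then paid. The sketch below shows one way that decision could be expressed. It is illustrative only: the `choose_provider` helper, the ordering, and the environment variable names are assumptions for this example, not documented CLUD settings.

```python
import os

# Hypothetical preference order: local, then free cloud tiers, then paid.
# The environment variable names are assumed, not taken from CLUD's docs.
PREFERENCE = [
    ("ollama", None),                      # local, no API key needed
    ("groq", "GROQ_API_KEY"),              # free tier
    ("openrouter", "OPENROUTER_API_KEY"),  # free models via the :free suffix
    ("anthropic", "ANTHROPIC_API_KEY"),    # paid
    ("openai", "OPENAI_API_KEY"),          # paid
]


def choose_provider(env, local_available):
    """Return the first usable provider name, or None if nothing is configured."""
    for name, key_var in PREFERENCE:
        if key_var is None:
            # Local providers only need a running local runtime.
            if local_available:
                return name
        elif env.get(key_var):
            return name
    return None


if __name__ == "__main__":
    # Assumes no local runtime was detected; detection itself is out of scope here.
    print(choose_provider(dict(os.environ), local_available=False))
```

Returning `None` when nothing is configured keeps the "no provider yet" case explicit, so it never silently falls through to a paid default.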
Install and common commands
The CLI works, but the desktop app is our primary focus. The CLI is intended for advanced users, contributors, and non-macOS environments.
Install
bun install -g clud
clud hatch
Install from source
git clone https://github.com/clud-ai/clud.git
cd clud
./setup.sh
clud hatch
Common commands
clud wake # start services
clud sleep # stop services, keep data
clud client # interact through the terminal
clud ps # view running assistants
clud terminal # open a shell into a managed assistant container
clud upgrade # upgrade to latest version
All commands target the default assistant. If you have multiple, pass the assistant ID as the second argument.
| Area | Summary |
|---|---|
| Trust engine | Decides who can do what, and defaults to no. Fail-closed trust system that resolves actor identity once (guardian, trusted, or unknown) and enforces it everywhere. Untrusted actors cannot read or write memory, trigger tools, or escalate. Your credentials live in a separate process and never reach the model. |
| Skills | Add new capabilities through sandboxed plugins. Manifest-driven plugins (SKILL.md + TOOLS.json) that inject tools and prompt sections at runtime. Skills can be bundled, installed from a catalog, or added from the workspace. |
| Channels | One assistant, everywhere you need it. Use it from the macOS app, Telegram, or Slack, with shared memory across all of them. More channels coming soon. |
| Multi-provider support | Swap models without changing anything else. Supports 15+ providers including Anthropic Claude, OpenAI, Google Gemini, Groq, Mistral, Cohere, DeepSeek, xAI Grok, Ollama, LM Studio, and OpenRouter. Embeddings follow the same pattern: local ONNX by default, with automatic fallback to cloud providers. |
| Web Scraping | Firecrawl powers intelligent web extraction. Scrape, crawl, and parse web content into clean markdown, ready for the model to reason over. Configurable depth, domain filters, and rate limits. |
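
The Trust engine row describes a fail-closed model: identity is resolved once into guardian, trusted, or unknown, and anything not explicitly allowed is denied. A minimal sketch of that pattern follows. It is not CLUD's actual code; the action names and the split between guardian and trusted permissions are assumptions made for illustration.

```python
from enum import Enum


class Trust(Enum):
    GUARDIAN = "guardian"
    TRUSTED = "trusted"
    UNKNOWN = "unknown"


# Hypothetical permission table. Any action not listed for a level is denied.
ALLOWED = {
    Trust.GUARDIAN: {"read_memory", "write_memory", "run_tool", "grant_trust"},
    Trust.TRUSTED: {"read_memory", "write_memory", "run_tool"},
    Trust.UNKNOWN: set(),
}


def resolve_trust(actor_id, guardians, trusted):
    """Resolve an actor to a trust level once; unrecognized actors are UNKNOWN."""
    if actor_id in guardians:
        return Trust.GUARDIAN
    if actor_id in trusted:
        return Trust.TRUSTED
    return Trust.UNKNOWN


def is_allowed(trust, action):
    """Fail closed: only explicitly listed (level, action) pairs pass."""
    return action in ALLOWED.get(trust, set())
```

The key property is that the default answer is no: an actor missing from both sets, or an action missing from the table, is denied without any special-case code.
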
| Section | What's covered |
|---|---|
| CLUD Guide | What's new vs Vellum Assistant, free providers, setup walkthrough, CLUD philosophy |
| Providers Reference | Full provider table with free tiers, context windows, and setup links |
| Glossary | Shared vocabulary — concepts, roles, and terms used across the docs and codebase |
| Architecture | Platform domains, repo structure, runtime · clients · gateway |
| Contributing | Development setup, PR workflow, code standards |
| Security | Sandbox, credentials, trust rules, permission modes |
We welcome contributions from everyone.
- Development: The contributing guide will help you get started.
- Make sure to check out our Code of Conduct.
MIT — see License. Integration logos from Simple Icons, licensed CC0 1.0.
CLUD is an open-source fork of Vellum Assistant, built to be provider-agnostic and free-tier friendly. Free to use and modify under MIT.
Built with 💚 — forked from Vellum Assistant, supercharged for everyone

