Weave together the intelligence of multiple open-source language models — Mistral, Llama, Phi, and more — into a single, cohesive desktop coding environment. No cloud dependencies, no data leaving your machine, no subscription walls. CodeLoom transforms your local workstation into a private, offline AI mesh that assists you through every stage of software creation.
Unlike monolithic assistants tied to one provider, CodeLoom treats AI models as interchangeable threads. Pull in a model for quick autocompletion, switch to a larger model for deep architectural reasoning, and run them all side‑by‑side without uploading a single line of code to an external server.
Most local coding assistants lock you into one model family or force you to choose between speed and depth. CodeLoom breaks that trade‑off by orchestrating a “loom” of models — each thread specialized for a different cognitive task.
Think of it as a chef’s kitchen: you wouldn’t use a cleaver to fillet a fish or a paring knife to chop bones. Why use the same AI model for syntax suggestions and system architecture? CodeLoom lets you assign the right tool to each step of development.
- Weave responses from three models simultaneously and compare their reasoning.
- Chain commands across models: one suggests code, another reviews it, a third refractors it.
- Stay entirely offline — every inference runs on your GPU/CPU via local backends (llama.cpp, ExLlama, MLX).
Spin up to five concurrent model sessions in one workspace. Each thread maintains its own context window, conversation history, and parameter set. Watch as models debate solutions or collaborate on a single file — you decide which thread becomes the final answer.
Manually or automatically merge context from one model thread into another. Example: have a small model generate a quick first draft, then pass that draft into a larger model for deep expansion — without copying/pasting.
No telemetry, no cloud relays, no user accounts. All models are downloaded from verified Hugging Face repositories and executed locally. Your source code never touches an external network.
Import models from Hugging Face, Ollama, or local GGUF/GPTQ files. CodeLoom automatically optimizes quantization (4-bit, 8-bit, FP16) based on your VRAM and RAM.
Create reusable prompt templates that distribute instructions across multiple models. For example: a “Code Review” loom sends code to Model A for security analysis, Model B for performance profiling, and Model C for style consistency — all in one click.
Pull in entire project directories as context. CodeLoom indexes your codebase (up to 100k tokens) and splits it across model threads intelligently, so each model sees a relevant slice of your project.
Share a loom session over your local network. Team members connect via a lightweight WebSocket — no internet required. Edits and model outputs sync instantly, as if you’re pairing with an AI swarm.
Windows (x64), macOS (Apple Silicon + Intel), and Linux (x64/ARM). Install once, run offline, update only when you choose.
Interface available in 12 languages: English, Spanish, French, German, Mandarin, Japanese, Korean, Portuguese, Arabic, Hindi, Russian, Italian. The underlying models respond in whatever language you prompt them.
Since no servers are involved, your loom never goes down. Power outages? Use a laptop battery. No internet? Works in tunnels, airplanes, or remote cabins.
CodeLoom creates a virtual loom — a container that holds one or more active model instances. You start by selecting a warp (the main model for heavy lifting) and a weft (auxiliary models for quick context or validation). As you code, the loom dynamically decides which thread to call based on the task:
- Fast weft: tiny 1–3B parameter models respond in <200ms for autocomplete and linting.
- Warp thread: 7–70B models handle complex refactoring, explanation, and design decisions.
- Cross-thread shuttle: passes relevant tokens between threads without full context duplication.
You remain in control of every thread: pause, kill, merge, or redirect them at any time.
CodeLoom covers:
- Offline AI coding assistant
- Local LLM desktop application
- Multi-model code generation
- Privacy-focused development tool
- Cross-platform local AI
- Collaborative local LLM editor
- Open-source model orchestrator
- Threaded AI workspace
- No-cloud code assistant
| Component | Minimum | Recommended |
|---|---|---|
| RAM | 16 GB | 32 GB |
| VRAM | 6 GB | 12 GB+ |
| Storage | 10 GB | 50 GB (for multiple models) |
| OS | Windows 10, macOS 12+, Linux Kernel 5.x | Same, plus Vulkan/ROCm support |
- Download the appropriate asset from the release page.
- Extract the archive to a folder of your choice.
- Run the executable (
codeloomon Linux/macOS,Codeloom.exeon Windows). - Select models from the built-in model browser (fetches from Hugging Face) or point to local files.
- Start weaving — no internet, no registration, no license key.
CodeLoom exposes a loom_config.yaml file in its data directory. You can adjust:
- Default model assignments per task type (code completion, explanation, debugging)
- Token budgets per thread
- Temperature, top_k, top_p per model
- Context sharing rules between threads
- UI theme (dark, light, high contrast)
- Remote loom access (LAN only)
Advanced users can write custom loom scripts in Lua — tiny programs that define how models interact, when to switch threads, and how to merge outputs.
| Scenario | How CodeLoom Helps |
|---|---|
| Solo developer building a web app | Use a tiny model for real-time autocomplete, a medium model for debugging, a large model for architecture. All offline. |
| Open-source contributor reviewing PRs | Create a “PR review” loom that feeds code to security, style, and logic models simultaneously. |
| Researcher prototyping ML pipelines | Keep full codebase context in a large model, while a small model handles token-by-token autocomplete. |
| Team in a secure facility | No data leaves the building. Run a shared loom over LAN with no external dependencies. |
| Traveling developer | Works 100% offline on a laptop. Load 1–2 small models to conserve battery. |
- Q1 2026: Multi-user loom sessions with role-based access (viewer, editor, admin).
- Q2 2026: Custom fine-tuning lite — train small adapters on your codebase without leaving the app.
- Q3 2026: Voice loom — dictate prompts and hear model responses (on-device speech-to-text).
- Q4 2026: Plugin API for external tool integration (Git, Docker, CI/CD).
Q: Do I need a powerful GPU?
A: Minimum 6 GB VRAM for 7B models at 4‑bit. For 1B–3B models, CPU-only works on modern machines.
Q: Can I use CodeLoom without any model downloads?
A: Yes — bring your own GGUF/GPTQ files. The app includes a local file picker.
Q: Is there a cloud version?
A: No. CodeLoom is intentionally offline-only. No plan to offer a cloud tier.
Q: How do updates work?
A: Manual download only. The app checks a known URL for new version strings (no data sent).
CodeLoom is released under the MIT License. You are free to use, modify, and distribute the application for any purpose, including commercial use.
MIT License
Copyright (c) 2026 CodeLoom Contributors
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions...
[full text omitted for brevity – see link above]
The models you choose to run within CodeLoom are subject to their own licenses and usage terms. CodeLoom itself is a neutral orchestrator — it does not endorse, filter, or guarantee the output of any third-party model. Users are responsible for compliance with model-specific terms (e.g., Llama 3 license, Mistral research-only limitations). The developers of CodeLoom assume no liability for damages arising from the use of this software or the models it orchestrates.
Weave smarter, not harder. CodeLoom — your local AI loom for the modern developer.