npm version · npm downloads · MIT License · Node.js >= 18 · Bun compatible · TypeScript 5

🧠 VisionCode CLI

AI-powered coding agent for your terminal and Telegram — analyze, modify, and plan your codebase with full approval control.

VisionCode is a developer-first CLI tool that brings the power of LLMs directly into your codebase workflow. Ask questions about your code, let an AI agent create, modify, and delete files with a full approval workflow, or generate step-by-step implementation plans — all from your terminal or a Telegram bot.


✨ Why VisionCode?

Most AI coding tools lock you into a specific editor or a web UI. VisionCode works where you already are — your terminal — and gives you full control over every change the AI proposes.

  • 🛡️ Zero-trust mutations — Every file creation, modification, and deletion is staged in memory. Nothing touches your disk until you explicitly approve it.
  • 🔍 Diff-before-apply — Review rich, colorized diffs of every proposed change before accepting.
  • 🤖 Multi-mode AI — Choose between Agent, Ask, or Plan mode depending on your task.
  • 💬 Telegram integration — Run the same AI agent from a Telegram bot with inline approval buttons.
  • 🌐 Any LLM via OpenRouter — Use GPT-4, Claude, Gemini, Llama, Mistral, or any model available on OpenRouter.
  • 🔌 Web-aware tools — Optionally enable web search, URL scraping, and crawling via Firecrawl.
  • 📝 Beautiful terminal output — Markdown rendered natively in your terminal with syntax highlighting.

📦 Installation

Quick install (recommended)

npm install -g visioncode-cli

Then launch:

visioncode wakeup

Using npx (no install)

npx visioncode-cli wakeup

From source (for contributors)

git clone https://github.com/shubham-vaishnav-13/visioncode.git
cd visioncode
bun install
bun run dev

See CONTRIBUTING.md for the full development guide.


⚙️ Configuration

Create a .env file in the project root (or wherever you run visioncode):

# Required
OPENROUTER_API_KEY=sk-or-v1-your-api-key-here
OPENROUTER_DEFAULT_MODEL=openrouter/free

# Optional — Telegram bot
TELEGRAM_BOT_TOKEN=your-telegram-bot-token
TELEGRAM_OWNER_ID=your-telegram-user-id

# Optional — Web tools (search, crawl, scrape)
FIRECRAWL_API_KEY=your-firecrawl-api-key

| Variable | Required | Description |
| --- | --- | --- |
| OPENROUTER_API_KEY | Yes | Your OpenRouter API key |
| OPENROUTER_DEFAULT_MODEL | Yes | Model ID (e.g. openrouter/free, anthropic/claude-sonnet-4, google/gemini-2.5-pro) |
| TELEGRAM_BOT_TOKEN | No | BotFather token for Telegram mode |
| TELEGRAM_OWNER_ID | No | Your Telegram numeric user ID (only the owner can use the bot) |
| FIRECRAWL_API_KEY | No | Firecrawl API key for web search & scraping tools |

Tip: Copy .env.example to .env and fill in your keys.
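
To make the required/optional split concrete, here is a minimal sketch of how these variables could be validated at startup. The `loadConfig` function and the `VisionCodeConfig` shape are hypothetical and only illustrate the rules listed above; VisionCode's own loading code may look different.

```typescript
// Hypothetical helper for illustration; not part of VisionCode's API.
type Env = Record<string, string | undefined>;

interface VisionCodeConfig {
  openRouterApiKey: string;
  defaultModel: string;
  telegram?: { botToken: string; ownerId: number };
  firecrawlApiKey?: string;
}

function loadConfig(env: Env): VisionCodeConfig {
  // The two OpenRouter variables are the only required ones.
  const apiKey = env["OPENROUTER_API_KEY"];
  const model = env["OPENROUTER_DEFAULT_MODEL"];
  const missing: string[] = [];
  if (!apiKey) missing.push("OPENROUTER_API_KEY");
  if (!model) missing.push("OPENROUTER_DEFAULT_MODEL");
  if (!apiKey || !model) {
    throw new Error(`Missing required variables: ${missing.join(", ")}`);
  }

  const config: VisionCodeConfig = {
    openRouterApiKey: apiKey,
    defaultModel: model,
  };

  // Telegram mode needs both values, and the owner ID must be numeric.
  const botToken = env["TELEGRAM_BOT_TOKEN"];
  const ownerRaw = env["TELEGRAM_OWNER_ID"];
  if (botToken && ownerRaw) {
    const ownerId = Number(ownerRaw);
    if (!Number.isInteger(ownerId)) {
      throw new Error("TELEGRAM_OWNER_ID must be a numeric user ID");
    }
    config.telegram = { botToken, ownerId };
  }

  // Web tools stay disabled unless a Firecrawl key is present.
  const firecrawl = env["FIRECRAWL_API_KEY"];
  if (firecrawl) {
    config.firecrawlApiKey = firecrawl;
  }
  return config;
}
```

In a Node.js or Bun process you would typically call this as `loadConfig(process.env)` after the .env file has been loaded.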


🚀 Usage

# Launch the interactive mode picker
visioncode wakeup

# Show help
visioncode --help

# Show version
visioncode --version

Mode Picker

When you run visioncode wakeup, you'll see an ASCII banner followed by a mode selector:

┌  Which mode you want to proceed with?
│  ○ CLI
│  ○ Telegram
│  ○ Exit
└

Choosing CLI opens the sub-mode menu:

┌  Choose CLI sub-mode
│  ○ Agent Mode
│  ○ Plan Mode
│  ○ Ask Mode
│  ○ ← Back to Main menu
└

🤖 Modes

Agent Mode

The full-power AI coding assistant. Describe a task in natural language — the agent reads your codebase, reasons about it, and proposes file changes.

What would you have in mind today for me?
> Add input validation to the user signup endpoint

The agent has access to these tools:

| Tool | Description |
| --- | --- |
| read_file | Read file contents |
| create_file | Stage new file creation |
| modify_file | Stage file modification (full replacement) |
| delete_file | Stage file deletion |
| create_folder | Stage folder creation |
| list_files | List directory contents |
| search_files | Glob pattern + content search |
| analyze_codebase | Structure summary (file counts, types) |
| queue_shell | Queue shell commands for approval |
| list_skills / read_skill | Access SKILL.md definitions from configured directories |
| web_search | Search the web (requires Firecrawl) |
| web_crawl | Scrape a URL into markdown (requires Firecrawl) |
| fetch_url | HTTP GET any URL |

🛡️ All mutations are staged — nothing touches your disk until you approve.

After the agent finishes, you enter the approval flow:

┌  Apply staged changes?
│  ○ Approve and apply all
│  ○ Review one by one        ← view diffs per file
│  ○ Cancel
└

Ask Mode

A read-only Q&A mode. Ask any question about your codebase — the agent reads files and searches for answers, but cannot modify anything.

What do you want to ask the codebase?
> How does the authentication middleware work?

The answer is rendered as Markdown in your terminal. You can optionally save it to a .md file in the ask-mode/ directory.

Plan Mode

Describe a high-level goal and the AI generates a structured, step-by-step implementation plan — with complexity ratings. You select which steps to execute, and the agent carries them out one by one.

What is your goal?
> Migrate the database layer from Prisma to Drizzle ORM

The plan includes:

  • 📊 Research summary — what the AI learned about your codebase
  • 📋 Numbered steps — with title, description, hints, and complexity (low / medium / high)
  • Step selector — toggle which steps to execute
  • 🔧 Auto-execution — each selected step runs through the agent with full approval flow
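
As a rough illustration of the plan structure described above, the sketch below models a plan and the step selection. Only the title, description, hints, and complexity fields come from this README; the `id` and `researchSummary` names and the `selectSteps` helper are assumptions, and the real interfaces in modes/plan/types.ts may differ.

```typescript
// Hypothetical shapes for illustration; see modes/plan/types.ts for the real ones.
type Complexity = "low" | "medium" | "high";

interface PlanStep {
  id: number; // assumed: the step number shown in the selector
  title: string;
  description: string;
  hints: string[];
  complexity: Complexity;
}

interface Plan {
  researchSummary: string; // assumed field name for the research summary
  steps: PlanStep[];
}

// Return the selected steps in plan order, regardless of the order
// in which the user toggled them.
function selectSteps(plan: Plan, selectedIds: number[]): PlanStep[] {
  const wanted = new Set(selectedIds);
  return plan.steps.filter((step) => wanted.has(step.id));
}
```

Keeping the selected steps in plan order matters because later steps usually build on earlier ones, and each one then goes through the same approval flow as Agent mode.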

💬 Telegram Bot

Run VisionCode as a Telegram bot — same AI capabilities, but from your phone.

Setup

  1. Create a bot with @BotFather and get the token
  2. Get your numeric Telegram user ID (use @userinfobot)
  3. Add both to your .env:
    TELEGRAM_BOT_TOKEN=123456:ABC-DEF1234ghIkl-zyx57W2v1u123ew11
    TELEGRAM_OWNER_ID=123456789
  4. Start VisionCode and select Telegram mode

Bot Commands

| Command | Description |
| --- | --- |
| `/start` | Show welcome message and available commands |
| `/ask <question>` | Ask a question about the codebase (read-only) |
| `/agent <task>` | Run the full agent with file mutation capabilities |
| `/plan <goal>` | Generate and execute a step-by-step plan |

Telegram Approval Flow

When the agent proposes file changes, you get inline buttons:

  • 📋 Show Diff — view the unified diff of all changes
  • ✅ Accept All — apply all changes to disk
  • ❌ Reject All — discard everything

For Plan mode, you get an interactive step selector with toggle buttons, Select All / Deselect All, and a Proceed button.

Security: Only the TELEGRAM_OWNER_ID can interact with the bot. All other users are silently ignored.
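
The owner-only rule can be pictured as a small guard that wraps every handler. The sketch below is library-independent and uses hypothetical names (`IncomingUpdate`, `isOwner`, `ownerOnly`); it is not the code in telegram/auth.ts, only an illustration of the "silently ignore everyone else" behavior.

```typescript
// Hypothetical, framework-free sketch of an owner-only guard.
interface IncomingUpdate {
  from?: { id: number }; // sender, if the update has one
}

function isOwner(update: IncomingUpdate, ownerId: number): boolean {
  return update.from !== undefined && update.from.id === ownerId;
}

// Wrap a handler so that it only runs for the configured owner.
// Updates from anyone else are dropped without a reply.
function ownerOnly(
  ownerId: number,
  handler: (update: IncomingUpdate) => void,
): (update: IncomingUpdate) => void {
  return (update) => {
    if (!isOwner(update, ownerId)) return;
    handler(update);
  };
}
```

Dropping the update without replying means a stranger who finds the bot gets no confirmation that it is live.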


🏗️ Project Structure

visioncode/
├── index.ts                         # CLI entry point (Commander.js)
│
├── ai/
│   ├── ai.config.ts                 # OpenRouter model setup
│   └── index.ts                     # AI module exports
│
├── modes/
│   ├── cli.ts                       # CLI sub-mode selector
│   ├── agent/
│   │   ├── orchestrator.ts          # Agent loop & tool orchestration
│   │   ├── agent-tools.ts           # Tool definitions (Vercel AI SDK)
│   │   ├── tool-executor.ts         # File/shell operations with in-memory overlay
│   │   ├── action-tracker.ts        # Append-only action log
│   │   ├── approval.ts              # Interactive terminal approval flow
│   │   ├── diff-view.ts             # Unified diff generation
│   │   └── types.ts                 # Shared types & default config
│   ├── ask/
│   │   ├── orchestrator.ts          # Ask mode loop (read-only agent)
│   │   └── agent-tools.ts           # Read-only tool subset
│   └── plan/
│       ├── orchestrator.ts          # Plan generation → step execution
│       ├── planner.ts               # AI-powered plan generator (structured output)
│       ├── selection.ts             # Interactive step selector UI
│       └── types.ts                 # Plan & PlanStep interfaces
│
├── telegram/
│   ├── index.ts                     # Telegraf bot startup
│   ├── handlers.ts                  # Command & callback handlers
│   ├── agent-run.ts                 # Telegram agent/ask/plan runners
│   ├── approval-session.ts          # Inline approval buttons & diff
│   ├── plan-session.ts              # Interactive plan step toggle
│   ├── auth.ts                      # Owner-only guard
│   ├── constants.ts                 # Welcome message
│   └── text.ts                      # Text utilities (clip, replyMd)
│
├── feature/
│   └── web-tools.ts                 # Firecrawl-powered web search/crawl/fetch
│
├── tui/
│   ├── wakeup.ts                    # ASCII banner & mode picker
│   ├── terminal-md.ts               # Markdown → terminal renderer
│   └── loader.ts                    # Braille spinner animation
│
├── scripts/
│   └── add-shebang.js               # Post-build shebang injector
│
├── tsconfig.json                    # Dev config (Bun, bundler resolution)
├── tsconfig.build.json              # Build config (Node.js, NodeNext)
└── package.json

🧩 Architecture

In-Memory Overlay

The agent never writes directly to disk. All file mutations (create, modify, delete) are staged in an in-memory Map<string, string> overlay. The overlay acts as a virtual filesystem layer — reads fall through to disk, but writes are captured.
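
A minimal sketch of the overlay idea follows. The class and method names are hypothetical, the disk reader is injected so the example stays self-contained, and using `null` to mark a staged deletion is an assumption (this README only mentions a Map<string, string>).

```typescript
// Hypothetical sketch; the real logic lives in modes/agent/tool-executor.ts.
type ReadDisk = (path: string) => string | undefined;

class Overlay {
  // Staged contents by path. null marks a staged deletion (an assumption).
  private readonly staged = new Map<string, string | null>();
  private readonly readDisk: ReadDisk;

  constructor(readDisk: ReadDisk) {
    this.readDisk = readDisk;
  }

  // Writes are captured in memory and never reach the disk.
  write(path: string, content: string): void {
    this.staged.set(path, content);
  }

  remove(path: string): void {
    this.staged.set(path, null);
  }

  // Reads prefer staged content and fall through to disk otherwise.
  read(path: string): string | undefined {
    if (this.staged.has(path)) {
      const content = this.staged.get(path);
      return content === null ? undefined : content;
    }
    return this.readDisk(path);
  }

  // Everything waiting for approval, as [path, content-or-null] pairs.
  pending(): Array<[string, string | null]> {
    return Array.from(this.staged.entries());
  }
}
```

Because reads consult the staged map first, the agent sees its own earlier edits within a run even though nothing has been written yet.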

Action Tracker

An append-only log records every operation the agent performs. Each entry tracks:

  • Type: file_create, file_modify, file_delete, folder_create, tool_execute, code_analysis
  • Status: pending → approved / rejected → executed
  • Before/After content for diffs
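
The entry shape and status lifecycle might be modeled roughly as below. The type and status names come from the list above; the class, its methods, and the rule that only approved entries can become executed are assumptions made for this sketch.

```typescript
// Hypothetical sketch; the real tracker is modes/agent/action-tracker.ts.
type ActionType =
  | "file_create" | "file_modify" | "file_delete"
  | "folder_create" | "tool_execute" | "code_analysis";
type ActionStatus = "pending" | "approved" | "rejected" | "executed";

interface TrackedAction {
  id: number;
  type: ActionType;
  path: string;
  status: ActionStatus;
  before: string | null; // content before the change, for diffs
  after: string | null;  // content after the change, for diffs
}

// Assumed transition rules: only approved entries can be executed.
const ALLOWED: Record<ActionStatus, ActionStatus[]> = {
  pending: ["approved", "rejected"],
  approved: ["executed"],
  rejected: [],
  executed: [],
};

class ActionTracker {
  private readonly log: TrackedAction[] = [];

  // Entries are only ever appended, never removed or reordered.
  record(type: ActionType, path: string,
         before: string | null = null, after: string | null = null): TrackedAction {
    const action: TrackedAction = {
      id: this.log.length + 1, type, path, status: "pending", before, after,
    };
    this.log.push(action);
    return action;
  }

  setStatus(id: number, next: ActionStatus): void {
    const action = this.log.find((a) => a.id === id);
    if (!action) throw new Error(`Unknown action ${id}`);
    if (ALLOWED[action.status].indexOf(next) === -1) {
      throw new Error(`Cannot go from ${action.status} to ${next}`);
    }
    action.status = next;
  }

  entries(): readonly TrackedAction[] {
    return this.log.slice();
  }
}
```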

Approval Flow

Pending mutations are grouped by file path. For each group, you can:

  1. View the unified diff
  2. Accept or reject the change
  3. Or approve/reject everything in bulk

Only after approval does applyApprovedFromTracker() write to disk.
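
The grouping and apply steps can be sketched as follows. This is not the body of applyApprovedFromTracker(); `groupByPath` and `applyApprovedSketch` are hypothetical stand-ins, and the writer callback is injected so that the example never touches a real filesystem.

```typescript
// Hypothetical sketch of "group by path, then apply what was approved".
interface PendingMutation {
  path: string;
  kind: "create" | "modify" | "delete";
  after: string | null; // final content, or null for a deletion
}

type WriteFn = (path: string, content: string | null) => void;

function groupByPath(mutations: PendingMutation[]): Map<string, PendingMutation[]> {
  const groups = new Map<string, PendingMutation[]>();
  for (const m of mutations) {
    const existing = groups.get(m.path);
    if (existing) existing.push(m);
    else groups.set(m.path, [m]);
  }
  return groups;
}

// Write only the approved paths. Since modifications are full replacements,
// the last staged mutation for a path determines its final state.
function applyApprovedSketch(
  groups: Map<string, PendingMutation[]>,
  approved: Set<string>,
  write: WriteFn,
): string[] {
  const applied: string[] = [];
  groups.forEach((items, path) => {
    if (!approved.has(path)) return;
    const last = items[items.length - 1];
    if (last === undefined) return;
    write(path, last.after);
    applied.push(path);
  });
  return applied;
}
```

Grouping first means a file that the agent touched several times is reviewed and applied as a single change.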

AI Pipeline

User prompt
    ↓
ToolLoopAgent (Vercel AI SDK)
    ↓
LLM (via OpenRouter) ←→ Tool calls (read_file, create_file, etc.)
    ↓
Staged mutations in overlay
    ↓
Approval flow (terminal or Telegram)
    ↓
Apply to disk

🔗 Tech Stack

| Technology | Purpose |
| --- | --- |
| Vercel AI SDK | ToolLoopAgent, structured output, tool definitions |
| OpenRouter | LLM provider gateway (GPT-4, Claude, Gemini, etc.) |
| Commander.js | CLI argument parsing |
| @clack/prompts | Beautiful terminal prompts (select, text, confirm) |
| Telegraf | Telegram bot framework |
| Firecrawl | Web search & scraping |
| marked + marked-terminal | Markdown → terminal rendering |
| diff | Unified diff generation |
| figlet | ASCII art banner |
| chalk | Terminal colors |
| Bun | Development runtime |
| TypeScript | Type-safe codebase |

📄 License

MIT


🤝 Contributing

We welcome contributions! See CONTRIBUTING.md for:

  • Local development setup
  • Import conventions
  • Build system details
  • Publishing to npm
  • Architecture notes

Built with ❤️ by Shubham Vaishnav
