MD Docs Agent

AI-Powered Markdown Documentation Management System

MD Docs Agent is a documentation management system that combines semantic search, AI-assisted code exploration, and several interaction interfaces. It manages your markdown documentation with vector-based search and can fetch additional context from source code when the docs alone are not enough.

Core Features

Documentation Management

Markdown Documentation Storage: Central repository for markdown documentation in the md-docs/ directory
Vector-Based Semantic Search: Natural language queries with AI-powered query expansion and filtering
Smart Chunking: Intelligent content splitting that preserves code blocks and tables
File Operations: Create, read, edit, rename, and delete documentation files
Auto-Update Embeddings: Automatically update search index after file changes

Source Code Exploration

Code Search: Powerful ripgrep/grep integration for pattern matching across source code
AI-Powered Exploration: LLM-generated summaries of files and folders
Folder Visualization: Directory tree structure with file sizes
File Pattern Matching: Find files using glob patterns

Interfaces & Integration

Telegram Bot: Interactive chat-based documentation assistant
MCP Servers: 2 Model Context Protocol servers for Claude Desktop and other MCP clients
CLI Tools: 11 standalone command-line tools covering all operations
Programmatic API: TypeScript API for custom integrations
Docker Support: Production-ready containerized deployment

Use Cases

Documentation Management

Technical Documentation Repository: Central storage for API docs, guides, tutorials
Knowledge Base: Company wikis, internal documentation, process guides
Semantic Search: Find relevant documentation using natural language queries
Documentation Editing: Create, edit, rename, delete documentation files programmatically

Code Exploration

Context Enhancement: Fetch source code context to supplement documentation answers
Code Search: Find implementations, patterns, and examples in your codebase
Project Discovery: AI-powered exploration of unfamiliar codebases
Architecture Visualization: Generate folder structure views of projects

Integrated Workflows

AI Documentation Assistant: Telegram bot that answers questions using docs + code
Claude Desktop Integration: MCP servers for seamless Claude integration
Custom Tools: Build your own documentation workflows using the API

Quick Start

1. Install Dependencies

npm install

2. Configure Environment

Create a .env file with your OpenAI API key:

cp .env.example .env

Edit .env and add your API key:

OPENAI_API_KEY=your_openai_api_key_here

3. Add Your Documentation

Place your markdown documentation files in the md-docs/ directory:

mkdir -p md-docs
# Add your .md files to md-docs/

Or create files programmatically:

npm run docs:create -- my-guide.md "# My Guide\n\nContent here..."

4. Generate Embeddings

Process your markdown files to create vector embeddings for semantic search:

npm run embed

This creates a SQLite vector database at ./vector.db.

5. Search Documentation

Search your documentation using natural language queries:

npm run docs:search -- "How to configure authentication"

6. Start Using the Tools

Explore your documentation and code:

# Search your codebase
npm run code:search -- "class.*Service" src --file-pattern="*.ts"

# Explore a folder with AI summaries
npm run code:explore -- src/tools 1

# Visualize directory structure
npm run code:tree -- src 3 --show-sizes

Usage

Embedding Generation

npm run embed

Reads all .md files from md-docs/
Intelligently chunks content using semantic boundaries (paragraphs, sentences, code blocks, tables)
Preserves code blocks and markdown tables intact for better context
Generates embeddings using OpenAI API
Stores in SQLite vector database with rich metadata (source, content type, size)
Metadata enables filtering by code/table presence and source tracking
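
The chunking step above can be sketched as follows. This is only an illustration under stated assumptions, not the project's actual code: `splitIntoBlocks` and `chunkMarkdown` are hypothetical names, only fenced code blocks are treated as atomic (tables and chunk overlap are omitted for brevity), and `chunkSize` stands in for `CHUNK_SIZE`.

```typescript
// Hypothetical sketch: split markdown into chunks without breaking fenced code blocks.
const FENCE = "`".repeat(3); // the three-backtick fence marker

interface Chunk {
  content: string;
  hasCode: boolean; // metadata flag, useful for filtering by code presence
}

// Split text into blocks: each fenced code block is one block,
// and the prose between fences is split on blank lines.
function splitIntoBlocks(markdown: string): string[] {
  const blocks: string[] = [];
  let current: string[] = [];
  let inFence = false;

  const flush = (): void => {
    const text = current.join("\n").trim();
    if (text.length > 0) blocks.push(text);
    current = [];
  };

  for (const line of markdown.split("\n")) {
    const isFence = line.trimStart().startsWith(FENCE);
    if (isFence && !inFence) {
      flush(); // close the prose block before the fence opens
      inFence = true;
      current.push(line);
    } else if (isFence && inFence) {
      current.push(line);
      inFence = false;
      flush(); // emit the whole code block as one unit
    } else if (!inFence && line.trim() === "") {
      flush(); // a blank line ends a paragraph
    } else {
      current.push(line);
    }
  }
  flush();
  return blocks;
}

// Greedily pack blocks into chunks of at most chunkSize characters.
// A single block larger than chunkSize becomes its own oversized chunk.
function chunkMarkdown(markdown: string, chunkSize = 1000): Chunk[] {
  const chunks: Chunk[] = [];
  let buffer = "";

  const emit = (): void => {
    if (buffer) chunks.push({ content: buffer, hasCode: buffer.includes(FENCE) });
    buffer = "";
  };

  for (const block of splitIntoBlocks(markdown)) {
    const candidate = buffer ? buffer + "\n\n" + block : block;
    if (candidate.length > chunkSize && buffer) {
      emit();
      buffer = block;
    } else {
      buffer = candidate;
    }
  }
  emit();
  return chunks;
}
```

Keeping a code block whole, even when it exceeds the size limit, trades uniform chunk sizes for context that stays readable in search results.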

Semantic Search

npm run search -- "your query"

Options:

--limit=N - Return top N results (default: 5)

Example:

npm run search -- "user authentication setup" --limit=10
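
Conceptually, a semantic search ranks stored chunk embeddings by their similarity to the query embedding and returns the top `--limit` results. The sketch below shows that idea in memory using cosine similarity. It is an assumption-laden illustration: the project stores vectors in SQLite, the similarity metric it actually uses is not stated here, and `cosineSimilarity`, `topK`, and the `StoredChunk` shape are hypothetical names.

```typescript
// Hypothetical in-memory sketch of top-N vector search.
interface StoredChunk {
  source: string;
  content: string;
  embedding: number[];
}

interface SearchHit {
  source: string;
  content: string;
  score: number;
}

// Cosine similarity: dot(a, b) / (|a| * |b|); returns 0 for zero vectors.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("Embedding dimensions differ");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  a.forEach((x, i) => {
    const y = b[i] ?? 0;
    dot += x * y;
    normA += x * x;
    normB += y * y;
  });
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Score every chunk against the query and keep the best `limit` hits.
function topK(queryEmbedding: number[], chunks: StoredChunk[], limit = 5): SearchHit[] {
  return chunks
    .map((c) => ({
      source: c.source,
      content: c.content,
      score: cosineSimilarity(queryEmbedding, c.embedding),
    }))
    .sort((x, y) => y.score - x.score)
    .slice(0, limit);
}
```

A real vector database avoids scoring every row on each query, but the ranking contract is the same.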

MCP Server

Run as an MCP server for integration with Claude Desktop or other MCP clients:

npm run mcp

The MCP server implements an intelligent three-stage RAG pipeline:

Query Expansion: LLM generates 3-5 variations of your query with related terms
Vector Search: Searches database with all query variations (up to 20 results)
LLM Filtering: Analyzes and returns only the 3-5 most relevant results
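
The three stages above might be wired together roughly like this. The sketch injects each stage as a function so it runs without network access. `RagDeps`, `ragSearch`, and the `Hit` shape are hypothetical names, and two details are assumptions rather than documented behaviour: the original query is searched alongside its variations, and duplicate chunks keep their best score.

```typescript
// Hypothetical sketch of a three-stage RAG search with injected dependencies.
interface Hit {
  id: string;
  content: string;
  score: number;
}

interface RagDeps {
  expandQuery(query: string): Promise<string[]>; // LLM call in practice
  vectorSearch(query: string, limit: number): Promise<Hit[]>; // vector DB lookup
  filterRelevant(query: string, hits: Hit[]): Promise<Hit[]>; // LLM call in practice
}

async function ragSearch(query: string, deps: RagDeps, maxCandidates = 20): Promise<Hit[]> {
  // Stage 1: query expansion (assumption: the original query is kept too).
  const variations = [query, ...(await deps.expandQuery(query))];

  // Stage 2: search with every variation, keeping the best score per chunk.
  const best = new Map<string, Hit>();
  for (const variation of variations) {
    for (const hit of await deps.vectorSearch(variation, maxCandidates)) {
      const previous = best.get(hit.id);
      if (!previous || hit.score > previous.score) best.set(hit.id, hit);
    }
  }
  const candidates = Array.from(best.values())
    .sort((a, b) => b.score - a.score)
    .slice(0, maxCandidates);

  // Stage 3: let the filter pick the most relevant candidates.
  return deps.filterRelevant(query, candidates);
}
```

Passing the stages in as dependencies also makes the pipeline easy to test with stubs in place of the LLM and the database.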

Configure in Claude Desktop

Edit your Claude Desktop config file:

macOS: ~/Library/Application Support/Claude/claude_desktop_config.json
Windows: %APPDATA%\Claude\claude_desktop_config.json

Add this configuration (use absolute paths):

{
  "mcpServers": {
    "docs-rag": {
      "command": "npx",
      "args": [
        "tsx",
        "/absolute/path/to/md-docs-agent/src/mcp/mcp-server.ts"
      ],
      "env": {
        "OPENAI_API_KEY": "your_openai_api_key_here",
        "VECTOR_DB_PATH": "/absolute/path/to/md-docs-agent/vector.db",
        "OPENAI_EMBEDDING_MODEL": "text-embedding-3-small",
        "OPENAI_MODEL": "gpt-4o-mini"
      }
    }
  }
}

Restart Claude Desktop, then ask: "Can you search the docs for authentication setup?"

Telegram Bot

Run the interactive Telegram bot to provide documentation assistance:

npm run bot

Features:

Natural language question answering
Enhanced context retrieval (reads full files for top results)
Source file references in responses
Telegram-compatible Markdown formatting
Returns "No information in documentation" when no relevant content is found

Setup:

Create a Telegram bot via @BotFather:
- Send /newbot to BotFather
- Follow the instructions to create your bot
- Copy the bot token

Add the bot token to your .env file:

TELEGRAM_BOT_TOKEN=your_telegram_bot_token_here

Start the bot:
```
npm run bot
```
Open Telegram and start a conversation with your bot!

Bot Commands:

/start - Welcome message and introduction
/help - Usage instructions and example questions
Any text message - Search documentation and get answers

How it works:

User sends a question via Telegram
Bot searches the vector database using semantic search
For top 3 results, bot fetches full file content
LLM generates comprehensive answer with full context
Response is formatted in Telegram-compatible HTML and sent back
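
The flow above can be sketched as a small function that turns search results into an answer. This is a simplified illustration, not the bot's real code: `buildContext` and `answerQuestion` are hypothetical names, the file reader and the LLM call are passed in as plain functions, and taking the top 3 distinct source files (rather than the top 3 raw hits) is an assumption.

```typescript
// Hypothetical sketch of the bot's answer flow.
interface SearchResult {
  source: string; // path of the markdown file the chunk came from
  score: number;
}

const NO_INFO = "No information in documentation";

// Collect the full content of the top N distinct source files.
function buildContext(
  results: SearchResult[],
  readFile: (filePath: string) => string,
  topN = 3
): string | null {
  const sources: string[] = [];
  const ranked = results.slice().sort((a, b) => b.score - a.score);
  for (const result of ranked) {
    if (!sources.includes(result.source)) sources.push(result.source);
    if (sources.length === topN) break;
  }
  if (sources.length === 0) return null;
  return sources.map((s) => `--- ${s} ---\n${readFile(s)}`).join("\n\n");
}

// Build the prompt and delegate to the LLM, or fall back when nothing was found.
async function answerQuestion(
  question: string,
  results: SearchResult[],
  readFile: (filePath: string) => string,
  generate: (prompt: string) => Promise<string>
): Promise<string> {
  const context = buildContext(results, readFile);
  if (context === null) return NO_INFO;
  return generate(
    `Answer using only this documentation:\n\n${context}\n\nQuestion: ${question}`
  );
}
```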

Architecture: The bot uses a multi-MCP architecture:

docs-rag MCP: Semantic search with query expansion
tools MCP: Code search, file operations, and folder exploration

File reading is done directly via fetchFile() for better performance.

Configuration is managed via mcp-servers.json.

CLI Tools

The project includes 11 reusable tools organized by purpose:

Documentation Management Tools

# Semantic search in documentation
npm run search -- "authentication" --limit=5

# Read file content
npm run tool:fetch-file -- path/to/file.md

# Create a new documentation file
npm run tool:create-file -- new-guide.md "# New Guide\n\nContent here..."

# Edit existing file (replace, insert, append, prepend, delete)
npm run tool:edit-file -- guide.md replace --search="old text" --replace="new text"

# Rename or move a file
npm run tool:rename-file -- old-name.md new-name.md

# Delete a file (with backup)
npm run tool:delete-file -- obsolete.md

# Update embeddings after file changes
npm run tool:update-embeddings -- path/to/changed-file.md

Code Exploration Tools

# Search code with ripgrep/grep
npm run tool:search -- "class.*Auth" src --file-pattern="*.ts"

# Find files by glob pattern
npm run tool:find-files -- src "*.ts" 10

# AI-powered folder exploration (generates summaries)
npm run tool:explore -- src/tools 1

# Directory tree visualization
npm run tool:tree -- src 3 --show-sizes
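
To illustrate what matching files against a pattern such as `*.ts` involves, here is a minimal sketch that converts a simple glob into a regular expression and applies it to file names. It is not the tool's implementation, which may well rely on a library: `globToRegExp` and `findMatching` are hypothetical names, and only `*` and `?` are supported.

```typescript
// Hypothetical sketch: match file names against a simple glob pattern.
function globToRegExp(glob: string): RegExp {
  // Escape regex metacharacters, leaving * and ? for the next step.
  const escaped = glob.replace(/[.+^${}()|[\]\\]/g, "\\$&");
  // * matches any run of non-separator characters, ? matches exactly one.
  const pattern = escaped.replace(/\*/g, "[^/]*").replace(/\?/g, "[^/]");
  return new RegExp(`^${pattern}$`);
}

// Return up to maxResults paths whose file name matches the glob.
function findMatching(paths: string[], glob: string, maxResults = 10): string[] {
  const matcher = globToRegExp(glob);
  return paths
    .filter((p) => matcher.test(p.split("/").pop() ?? ""))
    .slice(0, maxResults);
}
```

Anchoring the expression with `^` and `$` is what keeps `*.ts` from also matching `index.tsx`.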

Programmatic Usage

import {
  // Documentation Management
  fetchFile, createFile, editFile, renameFile, deleteFile,
  semanticSearch, updateEmbeddings,
  // Code Exploration
  searchCode, exploreFolder, getFolderStructure, findFiles
} from './src/tools';

// === Documentation Management ===

// Read a file
const file = fetchFile({ filePath: 'guide.md' });
console.log(file.content);

// Create a new file
const created = createFile({
  filePath: 'new-guide.md',
  content: '# New Guide\n\nContent...'
});

// Edit file (replace, insert, append, prepend, delete)
const edited = editFile({
  filePath: 'guide.md',
  operation: 'replace',
  search: 'old text',
  replace: 'new text'
});

// Rename/move a file
const renamed = renameFile({ oldPath: 'old.md', newPath: 'new.md' });

// Delete a file (creates backup)
const deleted = deleteFile({ filePath: 'obsolete.md' });

// Semantic search
const docs = await semanticSearch({ query: 'authentication', limit: 5 });
console.log(docs.results);

// Update embeddings after changes
const updated = await updateEmbeddings({ filePath: 'guide.md' });

// === Code Exploration ===

// Search code patterns
const code = searchCode({
  pattern: 'class.*Auth',
  path: 'src',
  filePattern: '*.ts'
});

// Find files by pattern
const files = findFiles({ path: 'src', pattern: '*.ts', maxResults: 10 });

// AI-powered folder exploration
const exploration = await exploreFolder({
  folderPath: 'src/tools',
  maxDepth: 1
});

// Get folder structure
const tree = getFolderStructure({
  path: 'src',
  maxDepth: 3,
  showSizes: true
});
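
A common workflow is to edit a file and then refresh its embeddings so that search results stay current. The sketch below combines the two calls behind a small interface so it can run with stubs. The `DocTools` interface and `replaceAndReindex` are hypothetical, and the `success` field on the edit result is an assumption about the return shape, which is not documented here.

```typescript
// Hypothetical sketch: edit a document, then re-index it only if the edit succeeded.
interface DocTools {
  editFile(args: {
    filePath: string;
    operation: "replace";
    search: string;
    replace: string;
  }): { success: boolean }; // assumed result shape
  updateEmbeddings(args: { filePath: string }): Promise<unknown>;
}

async function replaceAndReindex(
  tools: DocTools,
  filePath: string,
  search: string,
  replace: string
): Promise<boolean> {
  const result = tools.editFile({ filePath, operation: "replace", search, replace });
  if (!result.success) return false; // nothing changed, so skip the embedding call
  await tools.updateEmbeddings({ filePath });
  return true;
}
```

With the real tools, the object passed as `tools` would simply be `{ editFile, updateEmbeddings }` from the import above, provided their signatures match these assumptions.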

Testing

Run integration tests:

npm test

Docker Deployment

Deploy the Telegram bot using Docker:

Quick Start

# 1. Clone repository
git clone <repository-url>
cd md-docs-agent

# 2. Configure environment
cp docker/.env.example .env
nano .env  # Add your API keys

# 3. Start bot
docker-compose -f docker/docker-compose.yml up -d

Management

# View logs
docker-compose -f docker/docker-compose.yml logs -f

# Stop bot
docker-compose -f docker/docker-compose.yml down

# Restart after changes
docker-compose -f docker/docker-compose.yml restart

# Update documentation
cp new-docs/* md-docs/
docker-compose -f docker/docker-compose.yml run --rm telegram-bot npx tsx src/embed.ts
docker-compose -f docker/docker-compose.yml restart

See DEPLOY.md and docker/README.md for detailed deployment instructions.

Configuration

Configure the tool via a .env file:

| Variable | Default | Description |
| --- | --- | --- |
| `OPENAI_API_KEY` | - | Your OpenAI API key (required) |
| `OPENAI_EMBEDDING_MODEL` | `text-embedding-3-small` | OpenAI embedding model |
| `OPENAI_MODEL` | `gpt-4o-mini` | LLM for query expansion and filtering |
| `TELEGRAM_BOT_TOKEN` | - | Telegram bot token from @BotFather (required for bot) |
| `TELEGRAM_ALLOWED_CHAT_IDS` | - | Comma-separated list of allowed chat IDs (optional access control) |
| `VECTOR_DB_PATH` | `./vector.db` | Path to SQLite database |
| `CHUNK_SIZE` | `1000` | Text chunk size in characters |
| `CHUNK_OVERLAP` | `200` | Overlap between chunks in characters |
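
For custom integrations, the variables above could be read into a typed object along these lines. This is a sketch, not the project's loader: `loadConfig` and `AppConfig` are hypothetical names, the validation rules (for example, overlap smaller than chunk size) are assumptions, and an empty chat ID list is assumed to mean "no restriction".

```typescript
// Hypothetical sketch: load and validate configuration with the documented defaults.
interface AppConfig {
  openaiApiKey: string;
  embeddingModel: string;
  model: string;
  telegramBotToken?: string;
  allowedChatIds: number[]; // empty = no access restriction (assumption)
  vectorDbPath: string;
  chunkSize: number;
  chunkOverlap: number;
}

type Env = Record<string, string | undefined>;

function parseNonNegativeInt(value: string | undefined, fallback: number, name: string): number {
  if (value === undefined || value.trim() === "") return fallback;
  const parsed = Number(value);
  if (!Number.isInteger(parsed) || parsed < 0) {
    throw new Error(`${name} must be a non-negative integer`);
  }
  return parsed;
}

function loadConfig(env: Env): AppConfig {
  const openaiApiKey = env["OPENAI_API_KEY"];
  if (!openaiApiKey) throw new Error("OPENAI_API_KEY is required");

  const chunkSize = parseNonNegativeInt(env["CHUNK_SIZE"], 1000, "CHUNK_SIZE");
  const chunkOverlap = parseNonNegativeInt(env["CHUNK_OVERLAP"], 200, "CHUNK_OVERLAP");
  if (chunkOverlap >= chunkSize) {
    throw new Error("CHUNK_OVERLAP must be smaller than CHUNK_SIZE");
  }

  // Chat IDs are integers and can be negative (group chats).
  const allowedChatIds = (env["TELEGRAM_ALLOWED_CHAT_IDS"] ?? "")
    .split(",")
    .map((id) => id.trim())
    .filter((id) => id.length > 0)
    .map((id) => Number(id))
    .filter((id) => Number.isInteger(id));

  return {
    openaiApiKey,
    embeddingModel: env["OPENAI_EMBEDDING_MODEL"] ?? "text-embedding-3-small",
    model: env["OPENAI_MODEL"] ?? "gpt-4o-mini",
    telegramBotToken: env["TELEGRAM_BOT_TOKEN"],
    allowedChatIds,
    vectorDbPath: env["VECTOR_DB_PATH"] ?? "./vector.db",
    chunkSize,
    chunkOverlap,
  };
}
```

In a Node.js process this would typically be called as `loadConfig(process.env)` after the .env file has been loaded.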

Architecture

The system is built around two core capabilities: documentation management and code exploration, accessible through multiple interfaces.

System Overview

┌─────────────────────────────────────────────────────────────┐
│                    User Interfaces                          │
│  Telegram Bot │ MCP Servers │ CLI Tools │ Programmatic API │
└─────────────────────────────────────────────────────────────┘
                              │
┌─────────────────────────────┴─────────────────────────────┐
│                     Core Capabilities                       │
├─────────────────────────────┬─────────────────────────────┤
│  Documentation Management   │   Code Exploration          │
│  • Semantic Search (RAG)    │   • Pattern Search (grep)   │
│  • File Operations          │   • AI Exploration (LLM)    │
│  • Vector Embeddings        │   • Folder Visualization    │
│  • Auto-Update Index        │   • File Discovery          │
└─────────────────────────────┴─────────────────────────────┘
                              │
┌─────────────────────────────┴─────────────────────────────┐
│                      Storage Layer                          │
│    Vector DB (SQLite)  │  md-docs/  │  Source Code        │
└─────────────────────────────────────────────────────────────┘

Layered Architecture

Tools Layer (src/tools/) - Reusable, testable utilities
- Documentation: fetch, create, edit, rename, delete, semantic search, update embeddings
- Code Exploration: code search, find files, explore folder, folder structure
- Each tool works standalone via CLI or programmatically
Core Layer (src/core/) - Business logic
- Search orchestration with query expansion and LLM filtering
- Answer generation with context assembly
- Prompt template management
MCP Layer (src/mcp/) - Protocol servers for external integrations
- mcp-server: RAG search with query expansion and filtering
- tools-mcp-server: All 11 tools exposed via MCP protocol
- mcp-client: Multi-server connection manager
Interface Layer - Multiple ways to interact
- Telegram Bot: Chat-based documentation assistant
- MCP Servers: Integration with Claude Desktop and other MCP clients
- CLI Tools: Command-line interface for all operations
- Programmatic API: TypeScript/JavaScript library
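
One way to picture how a single tools layer can serve the CLI, MCP, and programmatic interfaces is a registry that every interface dispatches through. The sketch below is purely illustrative and does not reflect the actual layout of src/tools/: `Tool`, `ToolRegistry`, and the example tool are hypothetical.

```typescript
// Hypothetical sketch: one registry of tools shared by several interfaces.
type ToolInput = Record<string, unknown>;

interface Tool {
  name: string;
  description: string;
  run(input: ToolInput): unknown | Promise<unknown>;
}

class ToolRegistry {
  private readonly tools = new Map<string, Tool>();

  register(tool: Tool): void {
    if (this.tools.has(tool.name)) {
      throw new Error(`Tool already registered: ${tool.name}`);
    }
    this.tools.set(tool.name, tool);
  }

  // Names in sorted order, e.g. for a CLI help listing or an MCP tool list.
  list(): string[] {
    return Array.from(this.tools.keys()).sort();
  }

  // Every interface (CLI, MCP server, bot) would call tools through here.
  async call(name: string, input: ToolInput): Promise<unknown> {
    const tool = this.tools.get(name);
    if (!tool) throw new Error(`Unknown tool: ${name}`);
    return tool.run(input);
  }
}

// Example registration with a trivial stand-in tool.
const registry = new ToolRegistry();
registry.register({
  name: "echo",
  description: "Returns its input unchanged (placeholder for a real tool)",
  run: (input) => input,
});
```

Because each interface only needs a tool name and an input object, adding a new tool does not require touching the interface code.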

Key Benefits

Dual Purpose: Both documentation management and code exploration in one system
Modular: Each component has a single, well-defined responsibility
Flexible: Multiple interfaces share the same core capabilities
Reusable: Tools work standalone, via MCP, or programmatically
Testable: Clean separation of concerns with dependency injection
Extensible: Easy to add new tools, interfaces, or integrations

Technology Stack

TypeScript - Type-safe JavaScript
OpenAI API - Text embeddings and LLM operations
SQLite Vector - Vector database with vec0 extension
better-sqlite3 - SQLite driver
MCP SDK - Model Context Protocol
Telegraf - Telegram bot framework
Docker - Containerization and deployment

License

ISC
