
Wiki-Map — Enterprise Knowledge Compilation & Retrieval


Enterprise Knowledge Compilation & Intelligent Retrieval Platform
Transform raw documents into structured, searchable knowledge through AI-powered distillation, graph mapping, and dual-engine retrieval.

Features · Quick Start · Architecture · Configuration · Structure · Development

Wiki-Map Dashboard

  Features

Dual-Engine Architecture

  Wiki Compilation Branch

The Karpathy-inspired compilation pipeline transforms raw knowledge into distilled, structured Wiki articles.

| Stage | Process |
|---|---|
| 1 | Raw Import |
| 2 | AI Denoising |
| 3 | Statistical Analysis |
| 4 | Domain-level Wiki Compilation |
| 5 | Schema Index Generation |
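The five stages above can be sketched as a simple function chain. This is an illustrative stand-in only: the stage functions, the `Document` fields, and the scoring heuristics are assumptions for demonstration, not the actual implementation in `packages/`.

```python
# Hypothetical sketch of the five-stage compilation pipeline.
from dataclasses import dataclass, field

@dataclass
class Document:
    text: str
    quality: float = 0.0
    stats: dict = field(default_factory=dict)
    wiki: str = ""
    schema: dict = field(default_factory=dict)

def raw_import(text):            # 1. Raw Import
    return Document(text=text)

def ai_denoise(doc):             # 2. AI Denoising (an LLM call in reality)
    doc.quality = 0.9 if len(doc.text) > 20 else 0.2
    return doc

def analyze(doc):                # 3. Statistical Analysis
    doc.stats = {"tokens": len(doc.text.split())}
    return doc

def compile_wiki(doc):           # 4. Domain-level Wiki Compilation
    doc.wiki = "# Wiki\n\n" + doc.text
    return doc

def index_schema(doc):           # 5. Schema Index Generation
    doc.schema = {"sections": ["Wiki"], "tokens": doc.stats["tokens"]}
    return doc

PIPELINE = (raw_import, ai_denoise, analyze, compile_wiki, index_schema)

def run(text):
    result = text
    for stage in PIPELINE:
        result = stage(result)
    return result

doc = run("Solar inverters convert DC output from panels into grid-ready AC power.")
print(doc.schema["tokens"])  # -> 11
```

The real pipeline replaces each stub with an LLM agent or statistical pass, but the shape (each stage enriching a document record) is the same.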

  Skills Retrieval Branch

The RAG-powered retrieval pipeline provides structured knowledge access through taxonomy and graph navigation.

| Stage | Process |
|---|---|
| 1 | Knowledge Upload |
| 2 | AI Denoising |
| 3 | 4-Level Taxonomy Classification |
| 4 | Knowledge Graph Mapping |
| 5 | Knowledge Catalog |
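As a rough illustration of what stage 3 produces, a 4-level taxonomy label might look like the following. The level names and the keyword stand-in for the LLM classifier are assumptions, shown only to convey the output shape.

```python
# Illustrative 4-level taxonomy classification (level names assumed).
TAXONOMY_LEVELS = ["domain", "category", "subcategory", "topic"]

def classify(text: str) -> dict:
    # A real deployment would call the LLM classifier; this keyword
    # stand-in just demonstrates the output shape.
    if "inverter" in text.lower():
        path = ["Energy", "Solar", "Inverters", "Grid Integration"]
    else:
        path = ["General", "Uncategorized", "Uncategorized", "Uncategorized"]
    return dict(zip(TAXONOMY_LEVELS, path))

labels = classify("Field notes on inverter commissioning")
print(labels["domain"], "/", labels["topic"])  # -> Energy / Grid Integration
```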

Core Capabilities

- **AI Knowledge Distillation**: LLM-driven content analysis, quality scoring, and noise filtering with human-in-the-loop review
- **Knowledge Graph**: Entity extraction, bidirectional indexing, relationship normalization, and de-duplication
- **Hybrid Search**: Semantic vectors (Milvus) + BM25 keyword scoring with cross-encoder reranking
- **Industry Templates**: Pre-built configurations for Energy, Finance, Healthcare, IT, and Manufacturing
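A minimal sketch of how the two hybrid-search signals can be fused, assuming min-max normalization and a fixed blend weight; the platform's actual Milvus + BM25 + cross-encoder pipeline may fuse scores differently.

```python
# Hedged sketch of hybrid score fusion (weights illustrative).
def normalize(scores):
    lo, hi = min(scores), max(scores)
    return [0.0 if hi == lo else (s - lo) / (hi - lo) for s in scores]

def hybrid_rank(docs, vec_scores, bm25_scores, alpha=0.6):
    # alpha weights semantic similarity vs. BM25 keyword match.
    fused = [alpha * v + (1 - alpha) * b
             for v, b in zip(normalize(vec_scores), normalize(bm25_scores))]
    ranked = sorted(zip(docs, fused), key=lambda pair: pair[1], reverse=True)
    return [d for d, _ in ranked]

docs = ["doc_a", "doc_b", "doc_c"]
print(hybrid_rank(docs, vec_scores=[0.82, 0.55, 0.91],
                  bm25_scores=[4.1, 9.3, 2.0]))
# -> ['doc_c', 'doc_a', 'doc_b']
```

In the full pipeline the fused top-k would then be passed to a cross-encoder for reranking before being returned.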

  Tech Stack

| Layer | Technology | Purpose |
|---|---|---|
| Frontend | React 19, Vite 6, Tailwind CSS, TypeScript | Dark-mode SPA with Linear-inspired design system |
| Backend | FastAPI, Python 3.11+, Pydantic v2 | Async API with structured logging |
| AI/Agent | LangGraph, LangChain, OpenAI / Anthropic | Multi-agent knowledge processing pipeline |
| Vector DB | Milvus 2.4 | High-performance similarity search |
| Graph DB | Neo4j 5 (Community) | Entity-relationship knowledge graph |
| Relational DB | PostgreSQL 16 | Metadata, projects, audit logs |
| Cache | Redis 7 | Session cache, rate limiting |
| Object Storage | MinIO | Document archival (PDF, DOCX, images) |

  Quick Start

Prerequisites

  • Python 3.11+
  • Node.js 20+
  • Docker & Docker Compose

1. Clone & Setup

```bash
git clone https://github.com/Cliff-AI-Lab/wikimap.git
cd wikimap

# Copy environment config
cp .env.example .env
# Edit .env with your API keys and database credentials
```

2. Start Infrastructure

```bash
# Start Neo4j, Milvus, Redis (local development)
docker compose up -d

# For full containerized deployment (includes PostgreSQL + API)
docker compose --profile full up -d
```
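The `full` profile can be modeled in `docker-compose.yml` roughly as below. This is a hypothetical excerpt: the service names and images are assumptions based on the tech stack table, and the real file may differ.

```yaml
# Hypothetical excerpt: profile-gated services.
services:
  neo4j:
    image: neo4j:5          # always started
  api:
    build: .
    profiles: ["full"]      # only started with `--profile full`
  postgres:
    image: postgres:16
    profiles: ["full"]
```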

3. Start Backend

```bash
# Create virtual environment
python -m venv .venv
source .venv/bin/activate  # Windows: .venv\Scripts\activate

# Install dependencies
pip install -e ".[dev]"

# Run API server
uvicorn api.main:app --reload --port 8000
```

4. Start Frontend

```bash
cd frontend
npm install
npm run dev
```

Open http://localhost:5173 to access the platform.

Mock Mode (No API Keys Needed)

Set LLM_PROVIDER=mock and EMBEDDING_PROVIDER=mock in .env to run the full platform offline — perfect for development and evaluation.

  Architecture

```
┌──────────────────────────────────────────────────────┐
│              Frontend (React 19 + Vite)               │
│   ProjectList → Dashboard → 8 Workflow Pages          │
├──────────────────────────────────────────────────────┤
│              API Layer (FastAPI v14.0)                 │
│   projects · knowledge · qa · wiki · analysis · ...   │
├──────────────────────────────────────────────────────┤
│           Business Logic (Retrieval + Distillation)   │
│   IntentRouter → SkillsRouter → Branch Activation     │
├──────────────────────────────────────────────────────┤
│              Agent Layer (LLM Agents)                  │
│   Librarian · Judge · Refiner · SkillsRouter          │
├──────────────────────────────────────────────────────┤
│           Storage Layer                                │
│   PostgreSQL · Milvus · Neo4j · Redis · MinIO         │
└──────────────────────────────────────────────────────┘
```
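The IntentRouter → SkillsRouter → branch-activation flow in the diagram can be caricatured as follows. The routing rules here are invented purely for illustration; the real routers are LLM agents, not keyword matchers.

```python
# Conceptual sketch of intent routing between the two engines
# (routing heuristics are illustrative assumptions).
def intent_router(query: str) -> str:
    q = query.lower()
    if any(marker in q for marker in ("how do i", "steps", "procedure")):
        return "skills"   # Skills Retrieval Branch: taxonomy + graph
    return "wiki"         # Wiki Compilation Branch: compiled articles

def activate_branch(query: str) -> str:
    branch = intent_router(query)
    if branch == "skills":
        return f"[skills] taxonomy + graph lookup for: {query}"
    return f"[wiki] schema-index lookup for: {query}"

print(activate_branch("What is the refinery maintenance policy?"))
# -> [wiki] schema-index lookup for: What is the refinery maintenance policy?
```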

  Workflow

| Step | Stage | Description |
|---|---|---|
| 1 | Import | Upload documents: TXT, MD, DOCX, PDF, images, videos |
| 2 | Denoise | AI quality scoring with human review for edge cases |
| 3 | Analyze | Statistical insights and knowledge distillation metrics |
| 4 | Compile | Domain-level Wiki synthesis with cross-reference linking |
| 5 | Index | Schema generation for LLM-readable knowledge access |
| 6 | Search & QA | Dual-engine retrieval combining both branches |

  Configuration

Key environment variables (see .env.example for full list):

| Variable | Description | Default |
|---|---|---|
| LLM_PROVIDER | LLM backend (openai / anthropic / mock) | openai |
| LLM_MODEL | Model name | gpt-4o-mini |
| EMBEDDING_PROVIDER | Embedding backend (openai / mock) | openai |
| AUTH_REQUIRED | Enable authentication | false |
| FEISHU_MOCK_MODE | Mock Feishu/Lark connector | true |
| DINGTALK_MOCK_MODE | Mock DingTalk connector | true |
| WECOM_MOCK_MODE | Mock WeCom connector | true |

Tip: The platform supports any OpenAI-compatible API endpoint. Set OPENAI_BASE_URL to use providers like Azure OpenAI, DeepSeek, Zhipu GLM, or local models via Ollama/vLLM.
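For example, a hypothetical `.env` pointing at a DeepSeek-compatible endpoint might look like this (values illustrative; check your provider's documentation for the exact base URL and model name):

```
OPENAI_BASE_URL=https://api.deepseek.com/v1
LLM_PROVIDER=openai
LLM_MODEL=deepseek-chat
```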

  Project Structure

```
wikimap/
├── api/                    # FastAPI application
│   ├── main.py             # App entry, lifespan, middleware
│   ├── deps.py             # Dependency injection & store init
│   ├── middleware/         # Authentication middleware
│   ├── routers/            # API endpoints (10 modules)
│   └── schemas/            # Pydantic request/response models
├── packages/               # Core business logic
│   ├── agents/             # LLM agents — Librarian, Judge, Refiner
│   ├── common/             # Config, logging, shared utilities
│   ├── retrieval/          # Search pipeline — IntentRouter, scoring
│   ├── storage/            # DB adapters — PG, Milvus, Neo4j, Redis
│   └── templates/          # Industry templates (5 verticals)
├── frontend/               # React 19 SPA
│   ├── src/components/     # UI components — layout, icons, settings
│   ├── src/pages/          # 14 route pages
│   └── DESIGN.md           # Design system spec
├── tests/                  # Unit & integration tests
├── docker-compose.yml      # Infrastructure (Neo4j, Milvus, Redis, PG)
├── Dockerfile              # Multi-stage production build
└── pyproject.toml          # Python project configuration
```

  Development

```bash
# Run tests
pytest

# Lint & format
ruff check . --fix
ruff format .

# Type check
mypy api/ packages/

# Build frontend for production
cd frontend && npm run build
```

  License

MIT

知识图鉴 · Wiki-Map

RuidongAI · X (Twitter)
