
Wiki-Map — Enterprise Knowledge Compilation & Retrieval


Enterprise Knowledge Compilation & Intelligent Retrieval Platform
Transform raw documents into structured, searchable knowledge through AI-powered distillation, graph mapping, and dual-engine retrieval.

Features · Quick Start · Architecture · Configuration · Structure · Development

Wiki-Map Dashboard

  Features

Dual-Engine Architecture

  Wiki Compilation Branch

The Karpathy-inspired compilation pipeline transforms raw knowledge into distilled, structured Wiki articles.

| Stage | Process |
|---|---|
| 1 | Raw Import |
| 2 | AI Denoising |
| 3 | Statistical Analysis |
| 4 | Domain-level Wiki Compilation |
| 5 | Schema Index Generation |
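The five stages above can be sketched as a simple function chain. This is an illustrative stand-in only: the stage functions, the `Document` fields, and the scoring heuristics are assumptions for demonstration, not the actual implementation in `packages/`.

```python
# Hypothetical sketch of the five-stage compilation pipeline.
from dataclasses import dataclass, field

@dataclass
class Document:
    text: str
    quality: float = 0.0
    stats: dict = field(default_factory=dict)
    wiki: str = ""
    schema: dict = field(default_factory=dict)

def raw_import(text):            # 1. Raw Import
    return Document(text=text)

def ai_denoise(doc):             # 2. AI Denoising (an LLM call in reality)
    doc.quality = 0.9 if len(doc.text) > 20 else 0.2
    return doc

def analyze(doc):                # 3. Statistical Analysis
    doc.stats = {"tokens": len(doc.text.split())}
    return doc

def compile_wiki(doc):           # 4. Domain-level Wiki Compilation
    doc.wiki = "# Wiki\n\n" + doc.text
    return doc

def index_schema(doc):           # 5. Schema Index Generation
    doc.schema = {"sections": ["Wiki"], "tokens": doc.stats["tokens"]}
    return doc

PIPELINE = (raw_import, ai_denoise, analyze, compile_wiki, index_schema)

def run(text):
    result = text
    for stage in PIPELINE:
        result = stage(result)
    return result

doc = run("Solar inverters convert DC output from panels into grid-ready AC power.")
print(doc.schema["tokens"])  # -> 11
```

The real pipeline replaces each stub with an LLM agent or statistical pass, but the shape (each stage enriching a document record) is the same.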

  Skills Retrieval Branch

The RAG-powered retrieval pipeline provides structured knowledge access through taxonomy and graph navigation.

| Stage | Process |
|---|---|
| 1 | Knowledge Upload |
| 2 | AI Denoising |
| 3 | 4-Level Taxonomy Classification |
| 4 | Knowledge Graph Mapping |
| 5 | Knowledge Catalog |
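As a rough illustration of what stage 3 produces, a 4-level taxonomy label might look like the following. The level names and the keyword stand-in for the LLM classifier are assumptions, shown only to convey the output shape.

```python
# Illustrative 4-level taxonomy classification (level names assumed).
TAXONOMY_LEVELS = ["domain", "category", "subcategory", "topic"]

def classify(text: str) -> dict:
    # A real deployment would call the LLM classifier; this keyword
    # stand-in just demonstrates the output shape.
    if "inverter" in text.lower():
        path = ["Energy", "Solar", "Inverters", "Grid Integration"]
    else:
        path = ["General", "Uncategorized", "Uncategorized", "Uncategorized"]
    return dict(zip(TAXONOMY_LEVELS, path))

labels = classify("Field notes on inverter commissioning")
print(labels["domain"], "/", labels["topic"])  # -> Energy / Grid Integration
```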

Core Capabilities

- **AI Knowledge Distillation**: LLM-driven content analysis, quality scoring, and noise filtering with human-in-the-loop review
- **Knowledge Graph**: Entity extraction, bidirectional indexing, relationship normalization, and de-duplication
- **Hybrid Search**: Semantic vectors (Milvus) + BM25 keyword scoring with cross-encoder reranking
- **Industry Templates**: Pre-built configurations for Energy, Finance, Healthcare, IT, and Manufacturing
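A minimal sketch of how the two hybrid-search signals can be fused, assuming min-max normalization and a fixed blend weight; the platform's actual Milvus + BM25 + cross-encoder pipeline may fuse scores differently.

```python
# Hedged sketch of hybrid score fusion (weights illustrative).
def normalize(scores):
    lo, hi = min(scores), max(scores)
    return [0.0 if hi == lo else (s - lo) / (hi - lo) for s in scores]

def hybrid_rank(docs, vec_scores, bm25_scores, alpha=0.6):
    # alpha weights semantic similarity vs. BM25 keyword match.
    fused = [alpha * v + (1 - alpha) * b
             for v, b in zip(normalize(vec_scores), normalize(bm25_scores))]
    ranked = sorted(zip(docs, fused), key=lambda pair: pair[1], reverse=True)
    return [d for d, _ in ranked]

docs = ["doc_a", "doc_b", "doc_c"]
print(hybrid_rank(docs, vec_scores=[0.82, 0.55, 0.91],
                  bm25_scores=[4.1, 9.3, 2.0]))
# -> ['doc_c', 'doc_a', 'doc_b']
```

In the full pipeline the fused top-k would then be passed to a cross-encoder for reranking before being returned.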

  Tech Stack

| Layer | Technology | Purpose |
|---|---|---|
| Frontend | React 19, Vite 6, Tailwind CSS, TypeScript | Dark-mode SPA with Linear-inspired design system |
| Backend | FastAPI, Python 3.11+, Pydantic v2 | Async API with structured logging |
| AI/Agent | LangGraph, LangChain, OpenAI / Anthropic | Multi-agent knowledge processing pipeline |
| Vector DB | Milvus 2.4 | High-performance similarity search |
| Graph DB | Neo4j 5 (Community) | Entity-relationship knowledge graph |
| Relational DB | PostgreSQL 16 | Metadata, projects, audit logs |
| Cache | Redis 7 | Session cache, rate limiting |
| Object Storage | MinIO | Document archival (PDF, DOCX, images) |

  Quick Start

Prerequisites

  • Python 3.11+
  • Node.js 20+
  • Docker & Docker Compose

1. Clone & Setup

```bash
git clone https://github.com/Cliff-AI-Lab/wikimap.git
cd wikimap

# Copy environment config
cp .env.example .env
# Edit .env with your API keys and database credentials
```

2. Start Infrastructure

```bash
# Start Neo4j, Milvus, Redis (local development)
docker compose up -d

# For full containerized deployment (includes PostgreSQL + API)
docker compose --profile full up -d
```
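The `full` profile can be modeled in `docker-compose.yml` roughly as below. This is a hypothetical excerpt: the service names and images are assumptions based on the tech stack table, and the real file may differ.

```yaml
# Hypothetical excerpt: profile-gated services.
services:
  neo4j:
    image: neo4j:5          # always started
  api:
    build: .
    profiles: ["full"]      # only started with `--profile full`
  postgres:
    image: postgres:16
    profiles: ["full"]
```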

3. Start Backend

```bash
# Create virtual environment
python -m venv .venv
source .venv/bin/activate  # Windows: .venv\Scripts\activate

# Install dependencies
pip install -e ".[dev]"

# Run API server
uvicorn api.main:app --reload --port 8000
```

4. Start Frontend

```bash
cd frontend
npm install
npm run dev
```

Open http://localhost:5173 to access the platform.

Mock Mode (No API Keys Needed)

Set LLM_PROVIDER=mock and EMBEDDING_PROVIDER=mock in .env to run the full platform offline — perfect for development and evaluation.

  Architecture

```
┌──────────────────────────────────────────────────────┐
│              Frontend (React 19 + Vite)               │
│   ProjectList → Dashboard → 8 Workflow Pages          │
├──────────────────────────────────────────────────────┤
│              API Layer (FastAPI v14.0)                 │
│   projects · knowledge · qa · wiki · analysis · ...   │
├──────────────────────────────────────────────────────┤
│           Business Logic (Retrieval + Distillation)   │
│   IntentRouter → SkillsRouter → Branch Activation     │
├──────────────────────────────────────────────────────┤
│              Agent Layer (LLM Agents)                  │
│   Librarian · Judge · Refiner · SkillsRouter          │
├──────────────────────────────────────────────────────┤
│           Storage Layer                                │
│   PostgreSQL · Milvus · Neo4j · Redis · MinIO         │
└──────────────────────────────────────────────────────┘
```
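The IntentRouter → SkillsRouter → branch-activation flow in the diagram can be caricatured as follows. The routing rules here are invented purely for illustration; the real routers are LLM agents, not keyword matchers.

```python
# Conceptual sketch of intent routing between the two engines
# (routing heuristics are illustrative assumptions).
def intent_router(query: str) -> str:
    q = query.lower()
    if any(marker in q for marker in ("how do i", "steps", "procedure")):
        return "skills"   # Skills Retrieval Branch: taxonomy + graph
    return "wiki"         # Wiki Compilation Branch: compiled articles

def activate_branch(query: str) -> str:
    branch = intent_router(query)
    if branch == "skills":
        return f"[skills] taxonomy + graph lookup for: {query}"
    return f"[wiki] schema-index lookup for: {query}"

print(activate_branch("What is the refinery maintenance policy?"))
# -> [wiki] schema-index lookup for: What is the refinery maintenance policy?
```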

  Workflow

| Step | Stage | Description |
|---|---|---|
| 1 | Import | Upload documents: TXT, MD, DOCX, PDF, images, videos |
| 2 | Denoise | AI quality scoring with human review for edge cases |
| 3 | Analyze | Statistical insights and knowledge distillation metrics |
| 4 | Compile | Domain-level Wiki synthesis with cross-reference linking |
| 5 | Index | Schema generation for LLM-readable knowledge access |
| 6 | Search & QA | Dual-engine retrieval combining both branches |

  Configuration

Key environment variables (see .env.example for full list):

| Variable | Description | Default |
|---|---|---|
| LLM_PROVIDER | LLM backend (openai / anthropic / mock) | openai |
| LLM_MODEL | Model name | gpt-4o-mini |
| EMBEDDING_PROVIDER | Embedding backend (openai / mock) | openai |
| AUTH_REQUIRED | Enable authentication | false |
| FEISHU_MOCK_MODE | Mock Feishu/Lark connector | true |
| DINGTALK_MOCK_MODE | Mock DingTalk connector | true |
| WECOM_MOCK_MODE | Mock WeCom connector | true |

Tip: The platform supports any OpenAI-compatible API endpoint. Set OPENAI_BASE_URL to use providers like Azure OpenAI, DeepSeek, Zhipu GLM, or local models via Ollama/vLLM.
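For example, a hypothetical `.env` pointing at a DeepSeek-compatible endpoint might look like this (values illustrative; check your provider's documentation for the exact base URL and model name):

```
OPENAI_BASE_URL=https://api.deepseek.com/v1
LLM_PROVIDER=openai
LLM_MODEL=deepseek-chat
```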

  Project Structure

```
wikimap/
├── api/                    # FastAPI application
│   ├── main.py             # App entry, lifespan, middleware
│   ├── deps.py             # Dependency injection & store init
│   ├── middleware/         # Authentication middleware
│   ├── routers/            # API endpoints (10 modules)
│   └── schemas/            # Pydantic request/response models
├── packages/               # Core business logic
│   ├── agents/             # LLM agents — Librarian, Judge, Refiner
│   ├── common/             # Config, logging, shared utilities
│   ├── retrieval/          # Search pipeline — IntentRouter, scoring
│   ├── storage/            # DB adapters — PG, Milvus, Neo4j, Redis
│   └── templates/          # Industry templates (5 verticals)
├── frontend/               # React 19 SPA
│   ├── src/components/     # UI components — layout, icons, settings
│   ├── src/pages/          # 14 route pages
│   └── DESIGN.md           # Design system spec
├── tests/                  # Unit & integration tests
├── docker-compose.yml      # Infrastructure (Neo4j, Milvus, Redis, PG)
├── Dockerfile              # Multi-stage production build
└── pyproject.toml          # Python project configuration
```

  Development

```bash
# Run tests
pytest

# Lint & format
ruff check . --fix
ruff format .

# Type check
mypy api/ packages/

# Build frontend for production
cd frontend && npm run build
```

  License

MIT

知识图鉴 · Wiki-Map

RuidongAI · X (Twitter)
