A lightweight Java library that implements a four-step pipeline for building and visualising a knowledge base from multimedia study materials.
The current version supports three ingestion modes:
- Direct `rawContent` input for tests and local demos
- OpenAI-compatible OCR/transcription for image/audio/video files
- Local document file reading for `txt`, `md`, `json`, `csv`, `xml`, `yaml`, and similar text files

| Step | What it does |
|---|---|
| 1 – Gather | Register study materials from photo, video, audio, and document sources, recognize their text content, then extract notions from chunked text with optional user guidance prompts. |
| 2 – Build | Store extracted notions in a knowledge base (in-memory or JSON file). |
| 3 – Infer Relations | Call the LLM again to infer typed, directed relationships between notions. |
| 4 – Visualise | Render a formatted text knowledge map showing all notions (names + comments) and their relationships, plus a focused sub-map via BFS traversal. |
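
Step 4 describes the focused sub-map as a BFS traversal. The following is a minimal sketch of that idea, not the library's implementation: `SubMapSketch`, `addRelation`, and `subMap` are hypothetical names, and treating relations as traversable in both directions is an assumption.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch: collects notion ids reachable within maxDepth hops of a start notion. */
final class SubMapSketch {

    /** Adjacency list keyed by notion id; each relation is stored in both directions (assumption). */
    private final Map<String, List<String>> neighbours = new LinkedHashMap<>();

    void addRelation(String fromId, String toId) {
        neighbours.computeIfAbsent(fromId, k -> new ArrayList<>()).add(toId);
        neighbours.computeIfAbsent(toId, k -> new ArrayList<>()).add(fromId);
    }

    /** Breadth-first search from startId, expanding at most maxDepth levels. */
    Set<String> subMap(String startId, int maxDepth) {
        Set<String> visited = new LinkedHashSet<>();
        Deque<String> frontier = new ArrayDeque<>();
        visited.add(startId);
        frontier.add(startId);
        for (int depth = 0; depth < maxDepth && !frontier.isEmpty(); depth++) {
            Deque<String> next = new ArrayDeque<>();
            for (String id : frontier) {
                for (String other : neighbours.getOrDefault(id, Collections.<String>emptyList())) {
                    if (visited.add(other)) { // add() returns false for already-visited ids
                        next.add(other);
                    }
                }
            }
            frontier = next;
        }
        return visited;
    }

    public static void main(String[] args) {
        SubMapSketch graph = new SubMapSketch();
        graph.addRelation("n1", "n2");
        graph.addRelation("n2", "n3");
        graph.addRelation("n3", "n4");
        System.out.println(graph.subMap("n1", 2)); // prints [n1, n2, n3]
    }
}
```

Processing the graph level by level keeps the depth limit explicit: `n4` is three hops from `n1`, so it is excluded at depth 2.
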

```text
knowledge-base/
├── src/
│   ├── main/java/com/knowledge/
│   │   ├── KnowledgeBaseApplication.java  # Runnable end-to-end demo
│   │   ├── model/
│   │   │   ├── MediaType.java  # PHOTO | VIDEO | AUDIO | DOCUMENT
│   │   │   ├── LlmServiceConfig.java  # OpenAI-compatible service config
│   │   │   ├── InferenceConfig.java  # Prompt + inference behavior config
│   │   │   ├── StudyMaterial.java  # A registered media file
│   │   │   ├── Notion.java  # A concept in the knowledge base
│   │   │   └── NotionRelation.java  # Typed directed edge between notions
│   │   └── service/
│   │       ├── MediaProcessingService.java  # Interface – Step 1
│   │       ├── MediaTextExtractionService.java  # Media -> text abstraction
│   │       ├── TextKnowledgeExtractionService.java  # Recognized text -> notions
│   │       ├── KnowledgeBaseService.java  # Interface – Step 2
│   │       ├── NotionRelationInferenceService.java  # Interface – Step 3
│   │       ├── KnowledgeMapService.java  # Interface – Step 4
│   │       └── impl/
│   │           ├── FallbackMediaTextExtractionService.java
│   │           ├── FallbackTextKnowledgeExtractionService.java
│   │           ├── HeuristicTextKnowledgeExtractionService.java
│   │           ├── MediaProcessingServiceImpl.java
│   │           ├── KnowledgeBaseServiceImpl.java
│   │           ├── KnowledgeMapServiceImpl.java
│   │           ├── OpenAiCompatibleMediaTextExtractionService.java
│   │           ├── OpenAiCompatibleTextKnowledgeExtractionService.java
│   │           ├── OpenAiCompatibleNotionRelationInferenceService.java
│   │           ├── NoopNotionRelationInferenceService.java
│   │           └── RawContentMediaTextExtractionService.java
│   └── test/java/com/knowledge/
│       └── KnowledgeBaseTest.java  # unit + integration tests
└── pom.xml
```

- Java 8 or later
- Maven 3.6+, or the included Maven Wrapper (no prior Maven install needed)

```shell
# Using the Maven Wrapper (recommended – no system Maven required)
./mvnw clean test      # Linux / macOS
mvnw.cmd clean test    # Windows

# Or with a system Maven installation
mvn clean test
```

Run the end-to-end demo:

```shell
./mvnw compile exec:java -Dexec.mainClass=com.knowledge.KnowledgeBaseApplication
```

Run the web backend:

```shell
# Windows PowerShell
$env:KNOWLEDGE_BASE_WEB_PORT="8088"
& ./mvnw.cmd compile exec:java "-Dexec.mainClass=com.knowledge.web.KnowledgeMapWebApplication"

# Linux / macOS
KNOWLEDGE_BASE_WEB_PORT=8088 ./mvnw compile exec:java -Dexec.mainClass=com.knowledge.web.KnowledgeMapWebApplication
```

API endpoints (initial implementation):
- `GET /api/graph` - graph nodes/edges and summary
- `GET /api/notions/{id}` - notion detail
- `GET /api/notes?notionId=...` - notes list by notion
- `POST /api/notes` - create note (`text`/`imageDataUrl`)
- `DELETE /api/notes/{noteId}` - delete note
- `GET /api/path/{sessionId}` - knowledge path
- `POST /api/path/{sessionId}` - append notion to path
- `POST /api/materials/import` - multipart material import (`title`, `mediaType`, `file`, optional `extractionHint`)

Data behind `/api/notes` and `/api/path` is persisted in H2 (`kb_web_note` and `kb_web_path`), so it survives server restarts as long as the same DB file is used.
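
As a small illustration of calling the notes endpoint listed above, the sketch below builds the URL for `GET /api/notes?notionId=...` with the notion id URL-encoded. `NotesUrlSketch` and `notesUrl` are hypothetical names; the base URL reuses the `8088` port from the run commands, and the sketch only builds the URL, it does not send a request.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

/** Hypothetical helper: builds the notes-list URL for a given notion id. */
final class NotesUrlSketch {

    static String notesUrl(String baseUrl, String notionId) {
        try {
            // Form-style encoding keeps ids with spaces or '&' safe inside the query string.
            return baseUrl + "/api/notes?notionId=" + URLEncoder.encode(notionId, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is a charset every Java runtime must support.
            throw new IllegalStateException("UTF-8 not available", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(notesUrl("http://localhost:8088", "mat_01_notion_1"));
        // prints http://localhost:8088/api/notes?notionId=mat_01_notion_1
    }
}
```
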

Start the frontend dev server:

```shell
cd web-ui
npm install
npm run dev
```

By default, the Vite dev server runs at `http://localhost:5173` and proxies `/api` to `http://localhost:8088`.
The import dialog accepts an optional extraction prompt, such as "This is study material about junior high school physics.", to steer LLM-based knowledge extraction after OCR/transcription.

Start the backend and frontend together:

```powershell
./scripts/start-fullstack.ps1
```

By default, the startup script first seeds demo notions/relations into the target DB, so the graph is not empty on first launch.
Skip seeding when needed:

```powershell
./scripts/start-fullstack.ps1 -SkipSeedData
```

Stop both processes:

```powershell
./scripts/stop-fullstack.ps1
```

Run smoke E2E checks (backend APIs + frontend proxy):

```powershell
./scripts/e2e-smoke.ps1
```

Expected output (trimmed):

```text
=== Registered 3 study materials ===
StudyMaterial{id='mat_01', title='Distributed Systems Lecture Slides', type=Photo}
StudyMaterial{id='mat_02', title='Dubbo Architecture Overview', type=Video}
StudyMaterial{id='mat_03', title='Microservices Podcast', type=Audio}

=== Extracted 13 notions into knowledge base ===

================================================================
KNOWLEDGE MAP
================================================================
NOTIONS (13)
[mat_01_notion_1] Service Registry
Comments:
- A registry keeps track of all available service instances
...
RELATIONSHIPS (7)
Consumer --[depends on]--> Registry
Registry --[is a]--> Service Registry
...
================================================================
```

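
In raw-content mode, notions are extracted one per non-blank line, and the ids in the output above follow a `<materialId>_notion_<n>` pattern. The sketch below shows one way this could work; `RawContentNotionSketch` and `extract` are hypothetical names, and numbering only the non-blank lines from 1 is an assumption rather than confirmed library behavior.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical sketch: derives notion ids and names from a material's raw content. */
final class RawContentNotionSketch {

    /** Returns id -> name in input order, one entry per non-blank line. */
    static Map<String, String> extract(String materialId, String rawContent) {
        Map<String, String> notions = new LinkedHashMap<>();
        if (rawContent == null) {
            return notions;
        }
        int index = 1;
        for (String line : rawContent.split("\\r?\\n")) {
            String name = line.trim();
            if (name.isEmpty()) {
                continue; // blank lines do not consume an index (assumption)
            }
            notions.put(materialId + "_notion_" + index++, name);
        }
        return notions;
    }

    public static void main(String[] args) {
        System.out.println(extract("m1", "Service Registry\n\nLoad Balancing\nFault Tolerance"));
        // prints {m1_notion_1=Service Registry, m1_notion_2=Load Balancing, m1_notion_3=Fault Tolerance}
    }
}
```
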

```java
MediaProcessingService media = new MediaProcessingServiceImpl();

// Register a photo, video, audio, or document file
media.registerMaterial(new StudyMaterial(
        "m1", "Lecture Slides", MediaType.PHOTO, "/slides.jpg",
        "Service Registry\nLoad Balancing\nFault Tolerance"));

// Extract notions (one per non-blank line in the raw content)
List<Notion> notions = media.extractAllNotions();
```

To use OpenAI-compatible OCR/transcription with a `rawContent` fallback:

```java
LlmServiceConfig config = new LlmServiceConfig(
        LlmProvider.OPENAI_COMPATIBLE,
        "https://api.openai.com/v1",                 // base URL
        System.getenv("KNOWLEDGE_BASE_LLM_API_KEY"), // API key
        "gpt-4.1",                                   // vision model
        "gpt-4o-mini-transcribe",                    // transcription model
        "gpt-4.1-mini",                              // knowledge extraction model
        "2024-10-21",                                // API version
        null,
        null,
        null,
        15000,
        120000,
        "ffmpeg",                                    // FFmpeg command
        300);                                        // media segment length in seconds
InferenceConfig inferenceConfig = InferenceConfig.fromRuntimeConfig();

// Try remote OCR/transcription first, then fall back to the material's rawContent
MediaTextExtractionService extractor = new FallbackMediaTextExtractionService(
        new OpenAiCompatibleMediaTextExtractionService(
                config,
                inferenceConfig,
                new FfmpegMediaPreprocessingService(config)),
        new RawContentMediaTextExtractionService());
MediaProcessingService media = new MediaProcessingServiceImpl(extractor);
```

If you run `KnowledgeBaseApplication`, it automatically enables LLM-based media parsing when these environment variables are present:

```text
KNOWLEDGE_BASE_LLM_BASE_URL=https://api.openai.com/v1
KNOWLEDGE_BASE_LLM_API_KEY=...
KNOWLEDGE_BASE_LLM_PROVIDER=OPENAI_COMPATIBLE
KNOWLEDGE_BASE_LLM_VISION_MODEL=gpt-4.1
KNOWLEDGE_BASE_LLM_TRANSCRIPTION_MODEL=gpt-4o-mini-transcribe
KNOWLEDGE_BASE_LLM_KNOWLEDGE_EXTRACTION_MODEL=gpt-4.1-mini
KNOWLEDGE_BASE_LLM_API_VERSION=2024-10-21
KNOWLEDGE_BASE_LLM_FFMPEG_COMMAND=ffmpeg
KNOWLEDGE_BASE_LLM_MEDIA_SEGMENT_SECONDS=300
```

GitHub Models (Copilot ecosystem) example:

```text
KNOWLEDGE_BASE_LLM_PROVIDER=GITHUB_MODELS
KNOWLEDGE_BASE_LLM_BASE_URL=https://models.inference.ai.azure.com
KNOWLEDGE_BASE_LLM_API_KEY=<github_token>
KNOWLEDGE_BASE_LLM_VISION_MODEL=gpt-4.1
KNOWLEDGE_BASE_LLM_TRANSCRIPTION_MODEL=gpt-4o-mini-transcribe
KNOWLEDGE_BASE_LLM_KNOWLEDGE_EXTRACTION_MODEL=gpt-4.1-mini
KNOWLEDGE_BASE_LLM_FFMPEG_COMMAND=ffmpeg
KNOWLEDGE_BASE_LLM_MEDIA_SEGMENT_SECONDS=300
```

Inference behavior/prompt env vars (optional, independent from LLM connectivity):

```text
KNOWLEDGE_BASE_INFERENCE_IMAGE_PROMPT=Extract all readable text...
KNOWLEDGE_BASE_INFERENCE_MEDIA_PROMPT=Transcribe this study material...
KNOWLEDGE_BASE_INFERENCE_KNOWLEDGE_PROMPT=Extract study knowledge points as JSON...
KNOWLEDGE_BASE_INFERENCE_RELATION_PROMPT=Infer directed semantic relations as JSON...
KNOWLEDGE_BASE_INFERENCE_RELATION_HINTS=Prefer high-confidence relations.
KNOWLEDGE_BASE_INFERENCE_RELATION_MAX_COUNT=20
```

Notes:
- Images are sent to a vision-capable model for OCR-like text extraction.
- Audio files are segmented by FFmpeg before transcription for long-file handling.
- Video files are converted to audio by FFmpeg, then segmented and transcribed.
- OCR/transcription output is then sent to a dedicated text-analysis model for knowledge extraction.
- Supported provider values are `OPENAI_COMPATIBLE`, `GITHUB_MODELS`, `AZURE_OPENAI`, `BAILIAN_COMPATIBLE`, `ARK_COMPATIBLE`.
- If FFmpeg or remote parsing fails, the extractor falls back to `rawContent`.
- If the selected GitHub Models account/model does not expose audio transcription endpoints, image extraction still works while audio/video fall back to `rawContent`.

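
The fallback behavior described in the notes above can be pictured as a primary extractor wrapped by a secondary one. This is a minimal sketch under stated assumptions: `FallbackSketch`, `TextExtractor`, and `withFallback` are hypothetical stand-ins, not the signatures of `FallbackMediaTextExtractionService`, and failure is modeled as an unchecked exception.

```java
/** Hypothetical sketch of a primary-then-fallback text extractor. */
final class FallbackSketch {

    /** Stand-in for a media-to-text extractor (not the library's interface). */
    interface TextExtractor {
        String extract(String filePath, String rawContent);
    }

    /** Returns an extractor that tries primary first and uses fallback if primary throws. */
    static TextExtractor withFallback(final TextExtractor primary, final TextExtractor fallback) {
        return (filePath, rawContent) -> {
            try {
                return primary.extract(filePath, rawContent);
            } catch (RuntimeException e) {
                System.err.println("Primary extraction failed, falling back: " + e.getMessage());
                return fallback.extract(filePath, rawContent);
            }
        };
    }

    public static void main(String[] args) {
        TextExtractor remote = (path, raw) -> {
            throw new IllegalStateException("ffmpeg not found"); // simulated failure
        };
        TextExtractor rawOnly = (path, raw) -> raw;
        System.out.println(withFallback(remote, rawOnly).extract("/lecture.mp4", "Service Registry"));
        // prints Service Registry
    }
}
```

Keeping the fallback as a wrapper means callers hold a single extractor and do not need to know which path produced the text.
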
The application now supports loading runtime settings from two independent JSON files:
- `llm-config.json` for model connectivity (provider, endpoint, key, model names, timeout)
- `inference-config.json` for prompts and inference behavior

Default lookup order:

For LLM connectivity:
1. `KNOWLEDGE_BASE_LLM_CONFIG_FILE` (if provided)
2. `./llm-config.json` in the current working directory
3. LLM environment variables (fallback)

For inference behavior:
1. `KNOWLEDGE_BASE_INFERENCE_CONFIG_FILE` (if provided)
2. `./inference-config.json` in the current working directory
3. inference environment variables (fallback)
4. built-in defaults

Generated file examples: `llm-config.json`, `inference-config.json`

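
The LLM connectivity lookup order above can be sketched as a simple chain of checks. `ConfigSourceSketch`, `Source`, and `resolveLlmSource` are hypothetical names used only for illustration; how the application reacts to an explicit path that does not exist is not stated in this document, so the sketch does not check it.

```java
import java.io.File;
import java.util.Map;

/** Hypothetical sketch of the documented lookup order for LLM connectivity settings. */
final class ConfigSourceSketch {

    enum Source { EXPLICIT_FILE, WORKING_DIR_FILE, ENVIRONMENT }

    static Source resolveLlmSource(Map<String, String> env, File workingDir) {
        // 1. An explicit path in KNOWLEDGE_BASE_LLM_CONFIG_FILE wins when provided.
        String explicit = env.get("KNOWLEDGE_BASE_LLM_CONFIG_FILE");
        if (explicit != null && !explicit.trim().isEmpty()) {
            return Source.EXPLICIT_FILE;
        }
        // 2. Otherwise use ./llm-config.json in the working directory if it exists.
        if (new File(workingDir, "llm-config.json").isFile()) {
            return Source.WORKING_DIR_FILE;
        }
        // 3. Otherwise fall back to the LLM environment variables.
        return Source.ENVIRONMENT;
    }

    public static void main(String[] args) {
        // The result depends on the local environment and working directory.
        System.out.println(resolveLlmSource(System.getenv(), new File(".")));
    }
}
```
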
Run with default JSON file:

```shell
./mvnw compile exec:java -Dexec.mainClass=com.knowledge.KnowledgeBaseApplication
```

Run with explicit JSON path:

```shell
KNOWLEDGE_BASE_LLM_CONFIG_FILE=/path/to/llm-config.json ./mvnw compile exec:java -Dexec.mainClass=com.knowledge.KnowledgeBaseApplication
```

Run with explicit inference JSON path:

```shell
KNOWLEDGE_BASE_INFERENCE_CONFIG_FILE=/path/to/inference-config.json ./mvnw compile exec:java -Dexec.mainClass=com.knowledge.KnowledgeBaseApplication
```

Build the knowledge base (Step 2):

```java
KnowledgeBaseService kb = new KnowledgeBaseServiceImpl();
kb.addNotions(notions);                                          // add extracted notions
kb.addCommentToNotion("m1_notion_1", "Central service lookup"); // enrich with comments
kb.addReferenceToNotion("m1_notion_1", new NotionReference(     // add external resources
        "https://en.wikipedia.org/wiki/Service_discovery",
        "Service Discovery",
        NotionReferenceType.WIKI,
        "manual"));
kb.addRelation(new NotionRelation(                               // link notions
        "m1_notion_1", "m1_notion_2",
        RelationType.RELATED_TO, "Both are infrastructure concerns"));

// Optional JSON persistence
KnowledgeBaseService persistentKb = new KnowledgeBaseServiceImpl(
        new JsonFileKnowledgeBaseStore("./knowledge-base-data.json"));
```

`KnowledgeBaseApplication` enables JSON persistence by default and writes to:
- `./knowledge-base-data.json` (default)
- or the path from env var `KNOWLEDGE_BASE_STORE_FILE`

`knowledge-base-data.json` file format:

```json
{
  "version": 1,
  "updatedAt": 1710000000000,
  "notions": [
    {
      "id": "mat_01_notion_1",
      "name": "Service Registry",
      "description": "Extracted from Photo material: Distributed Systems Lecture Slides",
      "comments": [
        "A registry keeps track of all available service instances"
      ],
      "sourceMaterialIds": [
        "mat_01"
      ],
      "references": [
        {
          "url": "https://en.wikipedia.org/wiki/Service_discovery",
          "title": "Service Discovery",
          "type": "WIKI",
          "source": "manual",
          "addedAt": 1710000000000
        },
        {
          "url": "https://dubbo.apache.org/",
          "title": "Dubbo Official",
          "type": "DOC",
          "source": "manual",
          "addedAt": 1710000005000
        }
      ]
    }
  ],
  "relations": [
    {
      "fromNotionId": "mat_01_notion_1",
      "toNotionId": "mat_02_notion_3",
      "relationType": "RELATED_TO",
      "description": "Consumer relies on registry"
    }
  ]
}
```

Notes:
- `version` is reserved for future schema upgrades (current value is `1`).
- `updatedAt` and `addedAt` are Unix epoch milliseconds.
- `type` in references must be one of: `WIKI`, `DOC`, `BLOG`, `VIDEO`, `OTHER`.
- `relationType` must be one of the supported `RelationType` enum values.

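
When editing the store file by hand, it may help to check reference types and read the millisecond timestamps. The sketch below is illustrative only: `StoreFormatSketch` and its methods are hypothetical, and treating the type check as case-sensitive is an assumption.

```java
import java.time.Instant;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

/** Hypothetical helpers for inspecting values found in knowledge-base-data.json. */
final class StoreFormatSketch {

    /** Reference types listed in the notes above. */
    static final Set<String> REFERENCE_TYPES = Collections.unmodifiableSet(
            new HashSet<>(Arrays.asList("WIKI", "DOC", "BLOG", "VIDEO", "OTHER")));

    /** Case-sensitive membership check (assumption: the store uses upper-case names only). */
    static boolean isValidReferenceType(String type) {
        return type != null && REFERENCE_TYPES.contains(type);
    }

    /** Converts an updatedAt/addedAt value (Unix epoch milliseconds) to an Instant. */
    static Instant toInstant(long epochMillis) {
        return Instant.ofEpochMilli(epochMillis);
    }

    public static void main(String[] args) {
        System.out.println(isValidReferenceType("WIKI")); // prints true
        System.out.println(isValidReferenceType("PDF"));  // prints false
        System.out.println(toInstant(1710000000000L));    // prints 2024-03-09T16:00:00Z
    }
}
```
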
```java
KnowledgeMapService map = new KnowledgeMapServiceImpl(kb);
System.out.println(map.drawMap());                     // full knowledge map
System.out.println(map.drawSubMap("m1_notion_1", 2));  // sub-map (BFS depth 2)
```

| `RelationType` | Meaning |
|---|---|
| `RELATED_TO` | General association |
| `IS_A` | Subtype / classification |
| `HAS_A` | Composition |
| `DEPENDS_ON` | Dependency |
| `PART_OF` | Membership |
| `LEADS_TO` | Causal or sequential link |
| `CONTRASTS_WITH` | Opposition or contrast |

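
The sample knowledge map prints edges such as `Consumer --[depends on]--> Registry`. One way such labels could be derived from the enum names is sketched below; `RelationLabelSketch`, `label`, and `renderEdge` are hypothetical, the enum is redeclared locally for self-containment, and the lower-casing rule is inferred from the sample output rather than confirmed.

```java
import java.util.Locale;

/** Hypothetical sketch: renders a relation edge in the style of the sample knowledge map. */
final class RelationLabelSketch {

    /** Local copy of the relation types from the table above. */
    enum RelationType { RELATED_TO, IS_A, HAS_A, DEPENDS_ON, PART_OF, LEADS_TO, CONTRASTS_WITH }

    /** DEPENDS_ON -> "depends on" (assumed rule: lower-case, underscores to spaces). */
    static String label(RelationType type) {
        return type.name().toLowerCase(Locale.ROOT).replace('_', ' ');
    }

    static String renderEdge(String from, RelationType type, String to) {
        return from + " --[" + label(type) + "]--> " + to;
    }

    public static void main(String[] args) {
        System.out.println(renderEdge("Consumer", RelationType.DEPENDS_ON, "Registry"));
        // prints Consumer --[depends on]--> Registry
    }
}
```
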
If you have cloned the parent `dubboleaning` repository and want to host `knowledge-base` as a standalone Git repository, run:

```shell
# From the root of the dubboleaning repo:
git subtree split --prefix=knowledge-base --branch knowledge-base-standalone

# Then push that branch to a fresh empty repo:
git push <new-remote-url> knowledge-base-standalone:main
```

Apache License 2.0 – see LICENSE for details.