92 changes: 92 additions & 0 deletions CHANGELOG.md
@@ -0,0 +1,92 @@
# Changelog

## [1.3.0]

### Added
- **Admin configuration UI** with role-based access for DB, LLM, and GraphRAG settings
- Separate pages for DB config, LLM provider config, and GraphRAG config
- Graph admin role restriction via `ConfigScopeToggle`
- `apiToken` auth option added to GraphDB config with conditional UI
- **Per-graph chatbot LLM override** (`chat_service` in `llm_config`) with inheritance from `completion_service`
- Missing keys fall back to `completion_service` automatically
- Graph admins can configure per graph via the UI
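  The key-level inheritance described above can be sketched as a simple overlay (hypothetical helper name; the actual resolution lives in the config getters):

  ```python
  def resolve_chat_service(llm_config: dict) -> dict:
      # Start from completion_service, then overlay only the keys that are
      # explicitly set in chat_service, so missing keys inherit automatically.
      base = dict(llm_config.get("completion_service", {}))
      override = llm_config.get("chat_service") or {}
      base.update({k: v for k, v in override.items() if v is not None})
      return base
  ```

  With this shape, a graph admin can override just `llm_model` while keeping the completion service's credentials and `token_limit`.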
- **Secret masking** in configuration API responses
- GET responses return masked values; backend substitutes on save/test
- Credentials never reach the frontend
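  A masking pass over GET responses might look like this (mask string and secret key names are assumptions; the real endpoint substitutes stored values back on save/test):

  ```python
  MASK = "********"
  _SECRET_KEYS = {"api_key", "password", "apiToken"}  # assumed key names

  def mask_secrets(config):
      # Recursively replace secret values before a config dict leaves the
      # backend; real values stay server-side and are swapped back in when
      # the client saves or runs a connection test.
      if isinstance(config, dict):
          return {
              k: (MASK if k in _SECRET_KEYS and v else mask_secrets(v))
              for k, v in config.items()
          }
      return config
  ```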
- **Session idle timeout** (1 hour) that auto-clears the session on inactivity
- Session data moved from `localStorage` to `sessionStorage`; theme stays in `localStorage`
- Timer pauses during long-running operations (ingest, rebuild)
- **Auth guard** on all UI routes
- `RequireAuth` wrapper redirects unauthenticated users to login
- SPA routing with `serve -s` and catch-all route
- **GraphRAG config UI fields**
- Search parameters: `top_k`, `num_hops`, `num_seen_min`, `community_level`, `doc_only`
- Advanced ingestion settings: `load_batch_size`, `upsert_delay`, `default_concurrency`
- All chunker settings (`chunk_size`, `overlap_size`, `method`, `threshold`, `pattern`) shown and saved regardless of selected chunker
- **Multimodal inherit checkbox** in LLM config UI
- "Use same model as completion service" option in both single and multi-provider modes
- Amber warning when inheriting: "Ensure your completion model supports vision input"
- **`get_embedding_config()`** getter in `common/config.py` for parity with other service getters
- **Greeting detection** in agent router
- Regex-based pattern matching for common greetings, farewells, and thanks
- Responds directly without invoking query generation or search
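  A regex-based router short-circuit could be sketched as follows (illustrative patterns only; the agent's actual regex list may differ):

  ```python
  import re

  # Hypothetical pattern set covering greetings, farewells, and thanks.
  _GREETING_RE = re.compile(
      r"^\s*(hi|hello|hey|good (morning|afternoon|evening)|"
      r"thanks|thank you|bye|goodbye|see you)[\s!.?]*$",
      re.IGNORECASE,
  )

  def is_greeting(message: str) -> bool:
      # When this returns True, the router replies directly and skips
      # query generation and search entirely.
      return bool(_GREETING_RE.match(message))
  ```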
- **Centralized LLM token usage tracking**
- All LLM call sites (15+) migrated to `invoke_with_parser` / `ainvoke_with_parser`
- Supports both structured (JSON) and plain text LLM responses
- **JSON parsing fallback** for LLM responses
- Handles responses wrapped in preamble text or markdown code fences
- Entity extraction uses a 3-tier fallback: direct parse, code fence extraction, regex extraction
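  The 3-tier fallback can be sketched like this (hypothetical function name):

  ```python
  import json
  import re

  def parse_llm_json(text: str):
      """Parse JSON from an LLM response: direct, fenced, then regex."""
      # Tier 1: the response is already clean JSON.
      try:
          return json.loads(text)
      except json.JSONDecodeError:
          pass
      # Tier 2: JSON wrapped in a markdown code fence.
      fence = re.search(r"```(?:json)?\s*(.*?)```", text, re.DOTALL)
      if fence:
          try:
              return json.loads(fence.group(1))
          except json.JSONDecodeError:
              pass
      # Tier 3: first-to-last brace span anywhere in the text.
      brace = re.search(r"\{.*\}", text, re.DOTALL)
      if brace:
          return json.loads(brace.group(0))
      raise ValueError("no JSON found in LLM response")
  ```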
- **Cypher/GSQL output validation** before query execution
- Checks for required query keywords before wrapping in `INTERPRET OPENCYPHER QUERY`
- Invalid output raises an error and retries instead of executing garbage queries
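  A minimal validation gate, assuming a keyword allowlist like the one below (the real check may require different keywords per query language):

  ```python
  # Keywords any runnable OpenCypher query should contain before it is
  # wrapped in INTERPRET OPENCYPHER QUERY and sent to the database.
  REQUIRED_KEYWORDS = ("MATCH", "RETURN")

  def validate_cypher(llm_output: str) -> str:
      # Raise instead of executing garbage; the caller retries generation.
      upper = llm_output.upper()
      missing = [kw for kw in REQUIRED_KEYWORDS if kw not in upper]
      if missing:
          raise ValueError(f"LLM output missing Cypher keywords: {missing}")
      return llm_output
  ```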
- **Retriever scoring** for all retriever types when `combine=False`
- Scoring logic lifted from `CommunityRetriever` into `BaseRetriever`
- Similarity, Hybrid, and Sibling retrievers now score and rank context chunks
- **User-customized prompts** persisted under `configs/` across container restarts
- **Unit tests** for LLM invocation and JSON parsing (13 test cases)

### Changed
- **All config consumers use `get_xxx_config(graphname)` getters** instead of direct `llm_config` access
- `root.py`, `report-service/root.py`, `ecc/main.py`, `ui.py` migrated
- Test connection and save endpoints use `_build_test_config()` overlay pattern
- `_unmask_auth` resolves credentials via getters for correct per-graph resolution
- **Multimodal service inherits completion model directly** when not explicitly configured
- Removed hardcoded `DEFAULT_MULTIMODAL_MODELS` that silently substituted different models
- **LLM config UI improvements**
- Red asterisk markers on mandatory model name fields
- Shared `LLM_PROVIDERS` constant replaces duplicate provider lists
- State synced when toggling between single/multi-provider modes
- Reordered sections: Completion → Chatbot → Multimodal → Embedding
- Config file writes are now atomic with file locking to prevent race conditions
- `_config_file_lock` prevents concurrent overwrites
- In-memory config updates use atomic dict replacement instead of clear-and-update
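  A sketch of the lock-plus-atomic-replace pattern (the `_config_file_lock` name mirrors the changelog; the write path and serialization are assumptions):

  ```python
  import json
  import os
  import tempfile
  import threading

  _config_file_lock = threading.Lock()

  def save_config(path: str, config: dict) -> None:
      # Serialize writers, then write to a temp file and atomically replace
      # the target so readers never observe a half-written file.
      with _config_file_lock:
          fd, tmp = tempfile.mkstemp(dir=os.path.dirname(path) or ".")
          try:
              with os.fdopen(fd, "w") as f:
                  json.dump(config, f, indent=2)
              os.replace(tmp, path)  # atomic rename on POSIX and Windows
          except BaseException:
              os.unlink(tmp)
              raise
  ```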
- Chat history messages display instantly without typewriter animation
- History messages tagged with `response_type: "history"` to skip CSS animation
- Chatbot model selection uses `chat_service` config with `completion_service` fallback
- Community summarization prompt loaded at call time instead of import time
- README config documentation updated for clarity and consistency
- Parameter descriptions focus on purpose, not implementation details
- `token_limit`, `default_concurrency`, and other parameters reworded
- `multimodal_service` defaults corrected to show inheritance from `completion_service`
- `default_concurrency` replaces `tg_concurrency` in `graphrag_config`
- Configurable per graph
- Wired up `default_mem_threshold` and `default_thread_limit` in database connection proxy

### Fixed
- **Bedrock multimodal connection test** — 1x1 test PNG rejected by Bedrock image validation; replaced with 20x20 PNG
- **Provider-aware image format** in multimodal test and `image_data_extractor`
- GenAI/VertexAI require `image_url` format; Bedrock/Anthropic use `type:"image"` with source block
- **report-service/root.py** — `llm_config` used but never imported (NameError on health endpoint)
- **Null service values** stripped before config reload (null = inherit, key should be absent)
- Login page shows proper error messages based on HTTP status
- 401/403: "Invalid credentials"; other errors: "Server error (N)"; network failure: "Unable to connect"
- SPA routing fixed with catch-all route to login page
- Rebuild dialog button no longer flickers between status labels
- Polling stops once rebuild completes; final status message preserved
- Idle timer pauses during long-running operations (ingest, rebuild)
- Uses pause/resume instead of repeated signal activity calls
- Bedrock model names no longer trigger token calculator warnings
- Provider prefix and version suffix stripped before tiktoken lookup
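  The stripping step might look like this (normalization rules are assumptions inferred from Bedrock naming such as `anthropic.claude-3-sonnet-20240229-v1:0`):

  ```python
  import re

  def normalize_for_tiktoken(model_name: str) -> str:
      # Drop the Bedrock provider prefix (e.g. "anthropic.") and the
      # trailing version suffix (e.g. "-v1:0") so the token-count lookup
      # sees a bare model family name instead of warning on every call.
      name = model_name.split(".")[-1]
      return re.sub(r"-v\d+(:\d+)?$", "", name)
  ```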
- Config reload no longer clears in-memory state during concurrent requests
- Startup validation restored for `llm_service` and `llm_model`
- `HTTPException` properly re-raised in config and DB test endpoints
31 changes: 18 additions & 13 deletions README.md
@@ -469,14 +469,19 @@ Copy the below code into `configs/server_config.json`. You shouldn’t need to c
 | `chat_history_api` | string | `"http://chat-history:8002"` | URL of the chat history service. No change needed when using the provided Docker Compose file. |
 | `chunker` | string | `"semantic"` | Default document chunker. Options: `semantic`, `character`, `regex`, `markdown`, `html`, `recursive`. |
 | `extractor` | string | `"llm"` | Entity extraction method. Options: `llm`, `graphrag`. |
-| `chunker_config` | object | `{}` | Chunker-specific settings. For `character`/`markdown`/`recursive`: `chunk_size`, `overlap_size`. For `semantic`: `method`, `threshold`. For `regex`: `pattern`. |
-| `top_k` | int | `5` | Number of top similar results to retrieve during search. |
-| `num_hops` | int | `2` | Number of graph hops to traverse when expanding retrieved results. |
-| `num_seen_min` | int | `2` | Minimum occurrence threshold for a node to be included in search results. |
-| `community_level` | int | `2` | Community hierarchy level used for community search. |
-| `chunk_only` | bool | `true` | If true, hybrid search only retrieves document chunks (not entities). |
-| `doc_only` | bool | `false` | If true, hybrid search retrieves whole documents instead of chunks. |
-| `with_chunk` | bool | `true` | If true, community search also includes document chunks in results. |
+| `chunker_config` | object | `{}` | Chunker-specific settings (see sub-parameters below). All settings are saved regardless of which chunker is selected as default. |
+| ↳ `chunk_size` | int | `2048` | Maximum number of characters per chunk. Used by `character`, `markdown`, `html`, and `recursive` chunkers. Larger values produce fewer, bigger chunks; smaller values produce more, finer-grained chunks. |
+| ↳ `overlap_size` | int | 1/8 of `chunk_size` | Number of overlapping characters between consecutive chunks. Used by `character`, `markdown`, `html`, and `recursive` chunkers. More overlap preserves cross-chunk context but increases total chunk count. Set to `0` for no overlap. |
+| ↳ `method` | string | `"percentile"` | Breakpoint detection method for the `semantic` chunker. Options: `percentile`, `standard_deviation`, `interquartile`, `gradient`. Controls how the chunker decides where to split based on embedding similarity. |
+| ↳ `threshold` | float | `0.95` | Similarity threshold for the `semantic` chunker. Higher values produce more splits (smaller chunks); lower values produce fewer splits (larger chunks). |
+| ↳ `pattern` | string | `""` | Regular expression pattern for the `regex` chunker. The document is split at each match of this pattern. |
+| `top_k` | int | `5` | Number of initial seed results to retrieve per search. Also caps the final scored results. Increasing `top_k` increases the overall context size sent to the LLM. |
+| `num_hops` | int | `2` | Number of graph hops to traverse from seed nodes during hybrid search. More hops expand the result set with related context. |
+| `num_seen_min` | int | `2` | Minimum occurrence count for a node to be included during hybrid search traversal. Higher values filter out loosely connected nodes, reducing context size. |
+| `community_level` | int | `2` | Community hierarchy level for community search. Higher levels retrieve broader, higher-order community summaries. |
+| `chunk_only` | bool | `true` | If true, hybrid search only retrieves document chunks, excluding entity data. |
+| `doc_only` | bool | `false` | If true, hybrid search retrieves whole documents instead of chunks. Significantly increases context size. |
+| `with_chunk` | bool | `true` | If true, community search also includes document chunks alongside community summaries. Increases context size. |
 | `doc_process_switch` | bool | `true` | Enable/disable document processing during knowledge graph build. |
 | `entity_extraction_switch` | bool | same as `doc_process_switch` | Enable/disable entity extraction during knowledge graph build. |
 | `community_detection_switch` | bool | same as `entity_extraction_switch` | Enable/disable community detection during knowledge graph build. |
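For orientation, a minimal fragment using the sub-parameters above might look like this (illustrative values; assumes the section key is `graphrag_config`, as referenced in the changelog above):

```json
{
  "graphrag_config": {
    "chunker": "semantic",
    "chunker_config": {
      "chunk_size": 2048,
      "overlap_size": 256,
      "method": "percentile",
      "threshold": 0.95,
      "pattern": ""
    },
    "top_k": 5,
    "num_hops": 2,
    "doc_only": false
  }
}
```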
@@ -552,7 +557,7 @@ In the `llm_config` section of `configs/server_config.json` file, copy JSON conf
 | Parameter | Type | Default | Description |
 | --- | --- | --- | --- |
 | `authentication_configuration` | object | — | Shared authentication credentials for all services. Service-level values take precedence. |
-| `token_limit` | int | — | Maximum token count for retrieved context. Inherited by all services if not set at service level. `0` or omitted means unlimited. |
+| `token_limit` | int | — | Hard cap on token count for retrieved context sent to the LLM. Context exceeding this limit is truncated. Inherited by all services if not set at service level. `0` or omitted means unlimited. |
 
 **`completion_service` parameters:**
 
@@ -564,7 +569,7 @@ In the `llm_config` section of `configs/server_config.json` file, copy JSON conf
 | `model_kwargs` | object | No | `{}` | Additional model parameters (e.g., `{"temperature": 0}`). |
 | `prompt_path` | string | No | `"./common/prompts/openai_gpt4/"` | Path to prompt template files. |
 | `base_url` | string | No | — | Custom API endpoint URL. |
-| `token_limit` | int | No | inherited from top-level | Max token count for retrieved context sent to the LLM. `0` or omitted means unlimited. |
+| `token_limit` | int | No | inherited from top-level | Hard cap on token count for retrieved context sent to the LLM. Context exceeding this limit is truncated. `0` or omitted means unlimited. |
 
 **`embedding_service` parameters:**
 
@@ -587,16 +592,16 @@ Chatbot LLM override. If not configured, inherits from `completion_service`. Con
 | `model_kwargs` | object | No | inherited from completion | Additional model parameters (e.g., `{"temperature": 0}`). |
 | `prompt_path` | string | No | inherited from completion | Path to prompt template files. |
 | `base_url` | string | No | inherited from completion | Custom API endpoint URL. |
-| `token_limit` | int | No | inherited from completion | Max token count for retrieved context sent to the chatbot LLM. `0` or omitted means unlimited. |
+| `token_limit` | int | No | inherited from completion | Hard cap on token count for retrieved context sent to the chatbot LLM. Context exceeding this limit is truncated. `0` or omitted means unlimited. |
 
 **`multimodal_service` parameters (optional):**
 
-Vision model for image processing during document ingestion. If not configured, inherits from `completion_service` with a default vision model derived per provider.
+Vision model for image processing during document ingestion. If not configured, inherits from `completion_service` — ensure the completion model supports vision input.
 
 | Parameter | Type | Required | Default | Description |
 | --- | --- | --- | --- | --- |
 | `llm_service` | string | No | inherited from completion | Multimodal LLM provider. |
-| `llm_model` | string | No | auto-derived per provider | Vision model name (e.g., `gpt-4o`). |
+| `llm_model` | string | No | inherited from completion | Vision model name (e.g., `gpt-4o`). |
 | `authentication_configuration` | object | No | inherited from completion | Service-specific auth credentials. Overrides top-level values. |
 | `model_kwargs` | object | No | inherited from completion | Additional model parameters. |
 | `prompt_path` | string | No | inherited from completion | Path to prompt template files. |
8 changes: 4 additions & 4 deletions common/chunkers/character_chunker.py
@@ -1,12 +1,12 @@
 from common.chunkers.base_chunker import BaseChunker
 
-_DEFAULT_FALLBACK_SIZE = 4096
+_DEFAULT_CHUNK_SIZE = 2048
 
 
 class CharacterChunker(BaseChunker):
-    def __init__(self, chunk_size=0, overlap_size=0):
-        self.chunk_size = chunk_size if chunk_size > 0 else _DEFAULT_FALLBACK_SIZE
-        self.overlap_size = overlap_size
+    def __init__(self, chunk_size=0, overlap_size=-1):
+        self.chunk_size = chunk_size if chunk_size > 0 else _DEFAULT_CHUNK_SIZE
+        self.overlap_size = overlap_size if overlap_size >= 0 else self.chunk_size // 8
 
     def chunk(self, input_string):
         if self.chunk_size <= self.overlap_size:
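The default resolution shared by these chunker changes can be sketched standalone (hypothetical helper; each chunker inlines this logic in its `__init__`):

```python
def resolve_chunker_defaults(chunk_size: int = 0, overlap_size: int = -1):
    # chunk_size falls back to 2048 when unset (<= 0); overlap_size falls
    # back to chunk_size // 8 when negative, matching the new -1 sentinel.
    size = chunk_size if chunk_size > 0 else 2048
    overlap = overlap_size if overlap_size >= 0 else size // 8
    return size, overlap
```

So the out-of-the-box behavior is 2048-character chunks with 256 characters of overlap, while passing `overlap_size=0` still disables overlap explicitly.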
12 changes: 6 additions & 6 deletions common/chunkers/html_chunker.py
@@ -20,7 +20,7 @@
 from langchain.text_splitter import RecursiveCharacterTextSplitter
 
 
-_DEFAULT_FALLBACK_SIZE = 4096
+_DEFAULT_CHUNK_SIZE = 2048
 
 
 class HTMLChunker(BaseChunker):
@@ -30,7 +30,7 @@ class HTMLChunker(BaseChunker):
     - Automatically detects which headers (h1-h6) are present in the HTML
     - Uses only the headers that exist in the document for optimal chunking
     - If custom headers are provided, uses those instead of auto-detection
-    - Supports chunk_size / chunk_overlap: when chunk_size > 0, oversized
+    - Supports chunk_size / overlap_size: when chunk_size > 0, oversized
       header-based chunks are further split with RecursiveCharacterTextSplitter
     - When chunk_size is 0 (default), a fallback of 4096 is used so that
       headerless HTML documents are still split into reasonable chunks
@@ -39,11 +39,11 @@ class HTMLChunker(BaseChunker):
     def __init__(
         self,
         chunk_size: int = 0,
-        chunk_overlap: int = 0,
+        overlap_size: int = -1,
         headers: Optional[List[Tuple[str, str]]] = None,
     ):
-        self.chunk_size = chunk_size if chunk_size > 0 else _DEFAULT_FALLBACK_SIZE
-        self.chunk_overlap = chunk_overlap
+        self.chunk_size = chunk_size if chunk_size > 0 else _DEFAULT_CHUNK_SIZE
+        self.overlap_size = overlap_size if overlap_size >= 0 else self.chunk_size // 8
         self.headers = headers
 
     def _detect_headers(self, html_content: str) -> List[Tuple[str, str]]:
@@ -96,7 +96,7 @@ def chunk(self, input_string: str) -> List[str]:
         recursive_splitter = RecursiveCharacterTextSplitter(
             separators=TEXT_SEPARATORS,
             chunk_size=self.chunk_size,
-            chunk_overlap=self.chunk_overlap,
+            chunk_overlap=self.overlap_size,
         )
         final_chunks = []
         for chunk in initial_chunks:
12 changes: 6 additions & 6 deletions common/chunkers/markdown_chunker.py
@@ -20,18 +20,18 @@
 # When chunk_size is not configured, cap any heading-section that exceeds this
 # so that form-based PDFs (tables/bold but no # headings) are not left as a
 # single multi-thousand-character chunk.
-_DEFAULT_FALLBACK_SIZE = 4096
+_DEFAULT_CHUNK_SIZE = 2048
 
 
 class MarkdownChunker(BaseChunker):
 
     def __init__(
         self,
         chunk_size: int = 0,
-        chunk_overlap: int = 0
+        overlap_size: int = -1
     ):
-        self.chunk_size = chunk_size if chunk_size > 0 else _DEFAULT_FALLBACK_SIZE
-        self.chunk_overlap = chunk_overlap
+        self.chunk_size = chunk_size if chunk_size > 0 else _DEFAULT_CHUNK_SIZE
+        self.overlap_size = overlap_size if overlap_size >= 0 else self.chunk_size // 8
 
     def chunk(self, input_string):
         md_splitter = ExperimentalMarkdownSyntaxTextSplitter()
@@ -46,7 +46,7 @@ def chunk(self, input_string):
         recursive_splitter = RecursiveCharacterTextSplitter(
             separators=TEXT_SEPARATORS,
             chunk_size=self.chunk_size,
-            chunk_overlap=self.chunk_overlap,
+            chunk_overlap=self.overlap_size,
         )
         md_chunks = []
         for chunk in initial_chunks:
8 changes: 4 additions & 4 deletions common/chunkers/recursive_chunker.py
@@ -16,13 +16,13 @@
 from common.chunkers.separators import TEXT_SEPARATORS
 from langchain.text_splitter import RecursiveCharacterTextSplitter
 
-_DEFAULT_FALLBACK_SIZE = 4096
+_DEFAULT_CHUNK_SIZE = 2048
 
 
 class RecursiveChunker(BaseChunker):
-    def __init__(self, chunk_size=0, overlap_size=0):
-        self.chunk_size = chunk_size if chunk_size > 0 else _DEFAULT_FALLBACK_SIZE
-        self.overlap_size = overlap_size
+    def __init__(self, chunk_size=0, overlap_size=-1):
+        self.chunk_size = chunk_size if chunk_size > 0 else _DEFAULT_CHUNK_SIZE
+        self.overlap_size = overlap_size if overlap_size >= 0 else self.chunk_size // 8
 
     def chunk(self, input_string):
         text_splitter = RecursiveCharacterTextSplitter(