92 changes: 92 additions & 0 deletions CHANGELOG.md
@@ -0,0 +1,92 @@
# Changelog

## [1.3.0]

### Added
- **Admin configuration UI** with role-based access for DB, LLM, and GraphRAG settings
- Separate pages for DB config, LLM provider config, and GraphRAG config
- Graph admin role restriction via `ConfigScopeToggle`
- `apiToken` auth option added to GraphDB config with conditional UI
- **Per-graph chatbot LLM override** (`chat_service` in `llm_config`) with inheritance from `completion_service`
- Missing keys fall back to `completion_service` automatically
- Graph admins can configure per graph via the UI
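  The key-level inheritance described above can be sketched as a simple overlay (hypothetical helper name; the actual resolution lives in the config getters):

  ```python
  def resolve_chat_service(llm_config: dict) -> dict:
      # Start from completion_service, then overlay only the keys that are
      # explicitly set in chat_service, so missing keys inherit automatically.
      base = dict(llm_config.get("completion_service", {}))
      override = llm_config.get("chat_service") or {}
      base.update({k: v for k, v in override.items() if v is not None})
      return base
  ```

  With this shape, a graph admin can override just `llm_model` while keeping the completion service's credentials and `token_limit`.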
- **Secret masking** in configuration API responses
- GET responses return masked values; backend substitutes on save/test
- Credentials never reach the frontend
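  A masking pass over GET responses might look like this (mask string and secret key names are assumptions; the real endpoint substitutes stored values back on save/test):

  ```python
  MASK = "********"
  _SECRET_KEYS = {"api_key", "password", "apiToken"}  # assumed key names

  def mask_secrets(config):
      # Recursively replace secret values before a config dict leaves the
      # backend; real values stay server-side and are swapped back in when
      # the client saves or runs a connection test.
      if isinstance(config, dict):
          return {
              k: (MASK if k in _SECRET_KEYS and v else mask_secrets(v))
              for k, v in config.items()
          }
      return config
  ```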
- **Session idle timeout** (1 hour) that auto-clears the session on inactivity
- Session data moved from `localStorage` to `sessionStorage`; theme stays in `localStorage`
- Timer pauses during long-running operations (ingest, rebuild)
- **Auth guard** on all UI routes
- `RequireAuth` wrapper redirects unauthenticated users to login
- SPA routing with `serve -s` and catch-all route
- **GraphRAG config UI fields**
- Search parameters: `top_k`, `num_hops`, `num_seen_min`, `community_level`, `doc_only`
- Advanced ingestion settings: `load_batch_size`, `upsert_delay`, `default_concurrency`
- All chunker settings (`chunk_size`, `overlap_size`, `method`, `threshold`, `pattern`) shown and saved regardless of selected chunker
- **Multimodal inherit checkbox** in LLM config UI
- "Use same model as completion service" option in both single and multi-provider modes
- Amber warning when inheriting: "Ensure your completion model supports vision input"
- **`get_embedding_config()`** getter in `common/config.py` for parity with other service getters
- **Greeting detection** in agent router
- Regex-based pattern matching for common greetings, farewells, and thanks
- Responds directly without invoking query generation or search
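  A regex-based router short-circuit could be sketched as follows (illustrative patterns only; the agent's actual regex list may differ):

  ```python
  import re

  # Hypothetical pattern set covering greetings, farewells, and thanks.
  _GREETING_RE = re.compile(
      r"^\s*(hi|hello|hey|good (morning|afternoon|evening)|"
      r"thanks|thank you|bye|goodbye|see you)[\s!.?]*$",
      re.IGNORECASE,
  )

  def is_greeting(message: str) -> bool:
      # When this returns True, the router replies directly and skips
      # query generation and search entirely.
      return bool(_GREETING_RE.match(message))
  ```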
- **Centralized LLM token usage tracking**
- All LLM call sites (15+) migrated to `invoke_with_parser` / `ainvoke_with_parser`
- Supports both structured (JSON) and plain text LLM responses
- **JSON parsing fallback** for LLM responses
- Handles responses wrapped in preamble text or markdown code fences
- Entity extraction uses a 3-tier fallback: direct parse, code fence extraction, regex extraction
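  The 3-tier fallback can be sketched like this (hypothetical function name):

  ```python
  import json
  import re

  def parse_llm_json(text: str):
      """Parse JSON from an LLM response: direct, fenced, then regex."""
      # Tier 1: the response is already clean JSON.
      try:
          return json.loads(text)
      except json.JSONDecodeError:
          pass
      # Tier 2: JSON wrapped in a markdown code fence.
      fence = re.search(r"```(?:json)?\s*(.*?)```", text, re.DOTALL)
      if fence:
          try:
              return json.loads(fence.group(1))
          except json.JSONDecodeError:
              pass
      # Tier 3: first-to-last brace span anywhere in the text.
      brace = re.search(r"\{.*\}", text, re.DOTALL)
      if brace:
          return json.loads(brace.group(0))
      raise ValueError("no JSON found in LLM response")
  ```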
- **Cypher/GSQL output validation** before query execution
- Checks for required query keywords before wrapping in `INTERPRET OPENCYPHER QUERY`
- Invalid output raises an error and retries instead of executing garbage queries
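  A minimal validation gate, assuming a keyword allowlist like the one below (the real check may require different keywords per query language):

  ```python
  # Keywords any runnable OpenCypher query should contain before it is
  # wrapped in INTERPRET OPENCYPHER QUERY and sent to the database.
  REQUIRED_KEYWORDS = ("MATCH", "RETURN")

  def validate_cypher(llm_output: str) -> str:
      # Raise instead of executing garbage; the caller retries generation.
      upper = llm_output.upper()
      missing = [kw for kw in REQUIRED_KEYWORDS if kw not in upper]
      if missing:
          raise ValueError(f"LLM output missing Cypher keywords: {missing}")
      return llm_output
  ```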
- **Retriever scoring** for all retriever types when `combine=False`
- Scoring logic lifted from `CommunityRetriever` into `BaseRetriever`
- Similarity, Hybrid, and Sibling retrievers now score and rank context chunks
- **User-customized prompts** persisted under `configs/` across container restarts
- **Unit tests** for LLM invocation and JSON parsing (13 test cases)

### Changed
- **All config consumers use `get_xxx_config(graphname)` getters** instead of direct `llm_config` access
- `root.py`, `report-service/root.py`, `ecc/main.py`, `ui.py` migrated
- Test connection and save endpoints use `_build_test_config()` overlay pattern
- `_unmask_auth` resolves credentials via getters for correct per-graph resolution
- **Multimodal service inherits completion model directly** when not explicitly configured
- Removed hardcoded `DEFAULT_MULTIMODAL_MODELS` that silently substituted different models
- **LLM config UI improvements**
- Red asterisk markers on mandatory model name fields
- Shared `LLM_PROVIDERS` constant replaces duplicate provider lists
- State synced when toggling between single/multi-provider modes
- Reordered sections: Completion → Chatbot → Multimodal → Embedding
- Config file writes are now atomic with file locking to prevent race conditions
- `_config_file_lock` prevents concurrent overwrites
- In-memory config updates use atomic dict replacement instead of clear-and-update
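  A sketch of the lock-plus-atomic-replace pattern (the `_config_file_lock` name mirrors the changelog; the write path and serialization are assumptions):

  ```python
  import json
  import os
  import tempfile
  import threading

  _config_file_lock = threading.Lock()

  def save_config(path: str, config: dict) -> None:
      # Serialize writers, then write to a temp file and atomically replace
      # the target so readers never observe a half-written file.
      with _config_file_lock:
          fd, tmp = tempfile.mkstemp(dir=os.path.dirname(path) or ".")
          try:
              with os.fdopen(fd, "w") as f:
                  json.dump(config, f, indent=2)
              os.replace(tmp, path)  # atomic rename on POSIX and Windows
          except BaseException:
              os.unlink(tmp)
              raise
  ```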
- Chat history messages display instantly without typewriter animation
- History messages tagged with `response_type: "history"` to skip CSS animation
- Chatbot model selection uses `chat_service` config with `completion_service` fallback
- Community summarization prompt loaded at call time instead of import time
- README config documentation updated for clarity and consistency
- Parameter descriptions focus on purpose, not implementation details
- `token_limit`, `default_concurrency`, and other parameters reworded
- `multimodal_service` defaults corrected to show inheritance from `completion_service`
- `default_concurrency` replaces `tg_concurrency` in `graphrag_config`
- Configurable per graph
- Wired up `default_mem_threshold` and `default_thread_limit` in database connection proxy

### Fixed
- **Bedrock multimodal connection test** — 1x1 test PNG rejected by Bedrock image validation; replaced with 20x20 PNG
- **Provider-aware image format** in multimodal test and `image_data_extractor`
- GenAI/VertexAI require `image_url` format; Bedrock/Anthropic use `type:"image"` with source block
- **report-service/root.py** — `llm_config` used but never imported (NameError on health endpoint)
- **Null service values** stripped before config reload (null = inherit, key should be absent)
- Login page shows proper error messages based on HTTP status
- 401/403: "Invalid credentials"; other errors: "Server error (N)"; network failure: "Unable to connect"
- SPA routing fixed with catch-all route to login page
- Rebuild dialog button no longer flickers between status labels
- Polling stops once rebuild completes; final status message preserved
- Idle timer pauses during long-running operations (ingest, rebuild)
- Uses pause/resume instead of repeated signal activity calls
- Bedrock model names no longer trigger token calculator warnings
- Provider prefix and version suffix stripped before tiktoken lookup
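  The stripping step might look like this (normalization rules are assumptions inferred from Bedrock naming such as `anthropic.claude-3-sonnet-20240229-v1:0`):

  ```python
  import re

  def normalize_for_tiktoken(model_name: str) -> str:
      # Drop the Bedrock provider prefix (e.g. "anthropic.") and the
      # trailing version suffix (e.g. "-v1:0") so the token-count lookup
      # sees a bare model family name instead of warning on every call.
      name = model_name.split(".")[-1]
      return re.sub(r"-v\d+(:\d+)?$", "", name)
  ```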
- Config reload no longer clears in-memory state during concurrent requests
- Startup validation restored for `llm_service` and `llm_model`
- `HTTPException` properly re-raised in config and DB test endpoints
31 changes: 18 additions & 13 deletions README.md
@@ -469,14 +469,19 @@ Copy the below code into `configs/server_config.json`. You shouldn’t need to c
 | `chat_history_api` | string | `"http://chat-history:8002"` | URL of the chat history service. No change needed when using the provided Docker Compose file. |
 | `chunker` | string | `"semantic"` | Default document chunker. Options: `semantic`, `character`, `regex`, `markdown`, `html`, `recursive`. |
 | `extractor` | string | `"llm"` | Entity extraction method. Options: `llm`, `graphrag`. |
-| `chunker_config` | object | `{}` | Chunker-specific settings. For `character`/`markdown`/`recursive`: `chunk_size`, `overlap_size`. For `semantic`: `method`, `threshold`. For `regex`: `pattern`. |
-| `top_k` | int | `5` | Number of top similar results to retrieve during search. |
-| `num_hops` | int | `2` | Number of graph hops to traverse when expanding retrieved results. |
-| `num_seen_min` | int | `2` | Minimum occurrence threshold for a node to be included in search results. |
-| `community_level` | int | `2` | Community hierarchy level used for community search. |
-| `chunk_only` | bool | `true` | If true, hybrid search only retrieves document chunks (not entities). |
-| `doc_only` | bool | `false` | If true, hybrid search retrieves whole documents instead of chunks. |
-| `with_chunk` | bool | `true` | If true, community search also includes document chunks in results. |
+| `chunker_config` | object | `{}` | Chunker-specific settings (see sub-parameters below). All settings are saved regardless of which chunker is selected as default. |
+| ↳ `chunk_size` | int | `2048` | Maximum number of characters per chunk. Used by `character`, `markdown`, `html`, and `recursive` chunkers. Larger values produce fewer, bigger chunks; smaller values produce more, finer-grained chunks. |
+| ↳ `overlap_size` | int | 1/8 of `chunk_size` | Number of overlapping characters between consecutive chunks. Used by `character`, `markdown`, `html`, and `recursive` chunkers. More overlap preserves cross-chunk context but increases total chunk count. Set to `0` for no overlap. |
+| ↳ `method` | string | `"percentile"` | Breakpoint detection method for the `semantic` chunker. Options: `percentile`, `standard_deviation`, `interquartile`, `gradient`. Controls how the chunker decides where to split based on embedding similarity. |
+| ↳ `threshold` | float | `0.95` | Similarity threshold for the `semantic` chunker. Higher values produce more splits (smaller chunks); lower values produce fewer splits (larger chunks). |
+| ↳ `pattern` | string | `""` | Regular expression pattern for the `regex` chunker. The document is split at each match of this pattern. |
+| `top_k` | int | `5` | Number of initial seed results to retrieve per search. Also caps the final scored results. Increasing `top_k` increases the overall context size sent to the LLM. |
+| `num_hops` | int | `2` | Number of graph hops to traverse from seed nodes during hybrid search. More hops expand the result set with related context. |
+| `num_seen_min` | int | `2` | Minimum occurrence count for a node to be included during hybrid search traversal. Higher values filter out loosely connected nodes, reducing context size. |
+| `community_level` | int | `2` | Community hierarchy level for community search. Higher levels retrieve broader, higher-order community summaries. |
+| `chunk_only` | bool | `true` | If true, hybrid search only retrieves document chunks, excluding entity data. |
+| `doc_only` | bool | `false` | If true, hybrid search retrieves whole documents instead of chunks. Significantly increases context size. |
+| `with_chunk` | bool | `true` | If true, community search also includes document chunks alongside community summaries. Increases context size. |
 | `doc_process_switch` | bool | `true` | Enable/disable document processing during knowledge graph build. |
 | `entity_extraction_switch` | bool | same as `doc_process_switch` | Enable/disable entity extraction during knowledge graph build. |
 | `community_detection_switch` | bool | same as `entity_extraction_switch` | Enable/disable community detection during knowledge graph build. |
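For orientation, a minimal fragment using the sub-parameters above might look like this (illustrative values; assumes the section key is `graphrag_config`, as referenced in the changelog above):

```json
{
  "graphrag_config": {
    "chunker": "semantic",
    "chunker_config": {
      "chunk_size": 2048,
      "overlap_size": 256,
      "method": "percentile",
      "threshold": 0.95,
      "pattern": ""
    },
    "top_k": 5,
    "num_hops": 2,
    "doc_only": false
  }
}
```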
@@ -552,7 +557,7 @@ In the `llm_config` section of `configs/server_config.json` file, copy JSON conf
 | Parameter | Type | Default | Description |
 | --- | --- | --- | --- |
 | `authentication_configuration` | object | — | Shared authentication credentials for all services. Service-level values take precedence. |
-| `token_limit` | int | — | Maximum token count for retrieved context. Inherited by all services if not set at service level. `0` or omitted means unlimited. |
+| `token_limit` | int | — | Hard cap on token count for retrieved context sent to the LLM. Context exceeding this limit is truncated. Inherited by all services if not set at service level. `0` or omitted means unlimited. |
 
 **`completion_service` parameters:**
 
@@ -564,7 +569,7 @@ In the `llm_config` section of `configs/server_config.json` file, copy JSON conf
 | `model_kwargs` | object | No | `{}` | Additional model parameters (e.g., `{"temperature": 0}`). |
 | `prompt_path` | string | No | `"./common/prompts/openai_gpt4/"` | Path to prompt template files. |
 | `base_url` | string | No | — | Custom API endpoint URL. |
-| `token_limit` | int | No | inherited from top-level | Max token count for retrieved context sent to the LLM. `0` or omitted means unlimited. |
+| `token_limit` | int | No | inherited from top-level | Hard cap on token count for retrieved context sent to the LLM. Context exceeding this limit is truncated. `0` or omitted means unlimited. |
 
 **`embedding_service` parameters:**
 
@@ -587,16 +592,16 @@ Chatbot LLM override. If not configured, inherits from `completion_service`. Con
 | `model_kwargs` | object | No | inherited from completion | Additional model parameters (e.g., `{"temperature": 0}`). |
 | `prompt_path` | string | No | inherited from completion | Path to prompt template files. |
 | `base_url` | string | No | inherited from completion | Custom API endpoint URL. |
-| `token_limit` | int | No | inherited from completion | Max token count for retrieved context sent to the chatbot LLM. `0` or omitted means unlimited. |
+| `token_limit` | int | No | inherited from completion | Hard cap on token count for retrieved context sent to the chatbot LLM. Context exceeding this limit is truncated. `0` or omitted means unlimited. |
 
 **`multimodal_service` parameters (optional):**
 
-Vision model for image processing during document ingestion. If not configured, inherits from `completion_service` with a default vision model derived per provider.
+Vision model for image processing during document ingestion. If not configured, inherits from `completion_service` — ensure the completion model supports vision input.
 
 | Parameter | Type | Required | Default | Description |
 | --- | --- | --- | --- | --- |
 | `llm_service` | string | No | inherited from completion | Multimodal LLM provider. |
-| `llm_model` | string | No | auto-derived per provider | Vision model name (e.g., `gpt-4o`). |
+| `llm_model` | string | No | inherited from completion | Vision model name (e.g., `gpt-4o`). |
 | `authentication_configuration` | object | No | inherited from completion | Service-specific auth credentials. Overrides top-level values. |
 | `model_kwargs` | object | No | inherited from completion | Additional model parameters. |
 | `prompt_path` | string | No | inherited from completion | Path to prompt template files. |
8 changes: 4 additions & 4 deletions common/chunkers/character_chunker.py
@@ -1,12 +1,12 @@
 from common.chunkers.base_chunker import BaseChunker
 
-_DEFAULT_FALLBACK_SIZE = 4096
+_DEFAULT_CHUNK_SIZE = 2048
 
 
 class CharacterChunker(BaseChunker):
-    def __init__(self, chunk_size=0, overlap_size=0):
-        self.chunk_size = chunk_size if chunk_size > 0 else _DEFAULT_FALLBACK_SIZE
-        self.overlap_size = overlap_size
+    def __init__(self, chunk_size=0, overlap_size=-1):
+        self.chunk_size = chunk_size if chunk_size > 0 else _DEFAULT_CHUNK_SIZE
+        self.overlap_size = overlap_size if overlap_size >= 0 else self.chunk_size // 8
 
     def chunk(self, input_string):
         if self.chunk_size <= self.overlap_size:
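The default resolution shared by these chunker changes can be sketched standalone (hypothetical helper; each chunker inlines this logic in its `__init__`):

```python
def resolve_chunker_defaults(chunk_size: int = 0, overlap_size: int = -1):
    # chunk_size falls back to 2048 when unset (<= 0); overlap_size falls
    # back to chunk_size // 8 when negative, matching the new -1 sentinel.
    size = chunk_size if chunk_size > 0 else 2048
    overlap = overlap_size if overlap_size >= 0 else size // 8
    return size, overlap
```

So the out-of-the-box behavior is 2048-character chunks with 256 characters of overlap, while passing `overlap_size=0` still disables overlap explicitly.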
12 changes: 6 additions & 6 deletions common/chunkers/html_chunker.py
@@ -20,7 +20,7 @@
 from langchain.text_splitter import RecursiveCharacterTextSplitter
 
 
-_DEFAULT_FALLBACK_SIZE = 4096
+_DEFAULT_CHUNK_SIZE = 2048
 
 
 class HTMLChunker(BaseChunker):
@@ -30,7 +30,7 @@ class HTMLChunker(BaseChunker):
     - Automatically detects which headers (h1-h6) are present in the HTML
     - Uses only the headers that exist in the document for optimal chunking
     - If custom headers are provided, uses those instead of auto-detection
-    - Supports chunk_size / chunk_overlap: when chunk_size > 0, oversized
+    - Supports chunk_size / overlap_size: when chunk_size > 0, oversized
       header-based chunks are further split with RecursiveCharacterTextSplitter
     - When chunk_size is 0 (default), a fallback of 4096 is used so that
       headerless HTML documents are still split into reasonable chunks
@@ -39,11 +39,11 @@ class HTMLChunker(BaseChunker):
     def __init__(
         self,
         chunk_size: int = 0,
-        chunk_overlap: int = 0,
+        overlap_size: int = -1,
         headers: Optional[List[Tuple[str, str]]] = None,
     ):
-        self.chunk_size = chunk_size if chunk_size > 0 else _DEFAULT_FALLBACK_SIZE
-        self.chunk_overlap = chunk_overlap
+        self.chunk_size = chunk_size if chunk_size > 0 else _DEFAULT_CHUNK_SIZE
+        self.overlap_size = overlap_size if overlap_size >= 0 else self.chunk_size // 8
         self.headers = headers
 
     def _detect_headers(self, html_content: str) -> List[Tuple[str, str]]:
@@ -96,7 +96,7 @@ def chunk(self, input_string: str) -> List[str]:
         recursive_splitter = RecursiveCharacterTextSplitter(
             separators=TEXT_SEPARATORS,
             chunk_size=self.chunk_size,
-            chunk_overlap=self.chunk_overlap,
+            chunk_overlap=self.overlap_size,
         )
         final_chunks = []
         for chunk in initial_chunks:
12 changes: 6 additions & 6 deletions common/chunkers/markdown_chunker.py
@@ -20,18 +20,18 @@
 # When chunk_size is not configured, cap any heading-section that exceeds this
 # so that form-based PDFs (tables/bold but no # headings) are not left as a
 # single multi-thousand-character chunk.
-_DEFAULT_FALLBACK_SIZE = 4096
+_DEFAULT_CHUNK_SIZE = 2048
 
 
 class MarkdownChunker(BaseChunker):
 
     def __init__(
         self,
         chunk_size: int = 0,
-        chunk_overlap: int = 0
+        overlap_size: int = -1
     ):
-        self.chunk_size = chunk_size if chunk_size > 0 else _DEFAULT_FALLBACK_SIZE
-        self.chunk_overlap = chunk_overlap
+        self.chunk_size = chunk_size if chunk_size > 0 else _DEFAULT_CHUNK_SIZE
+        self.overlap_size = overlap_size if overlap_size >= 0 else self.chunk_size // 8
 
     def chunk(self, input_string):
         md_splitter = ExperimentalMarkdownSyntaxTextSplitter()
@@ -46,7 +46,7 @@ def chunk(self, input_string):
         recursive_splitter = RecursiveCharacterTextSplitter(
             separators=TEXT_SEPARATORS,
             chunk_size=self.chunk_size,
-            chunk_overlap=self.chunk_overlap,
+            chunk_overlap=self.overlap_size,
         )
         md_chunks = []
         for chunk in initial_chunks:
8 changes: 4 additions & 4 deletions common/chunkers/recursive_chunker.py
@@ -16,13 +16,13 @@
 from common.chunkers.separators import TEXT_SEPARATORS
 from langchain.text_splitter import RecursiveCharacterTextSplitter
 
-_DEFAULT_FALLBACK_SIZE = 4096
+_DEFAULT_CHUNK_SIZE = 2048
 
 
 class RecursiveChunker(BaseChunker):
-    def __init__(self, chunk_size=0, overlap_size=0):
-        self.chunk_size = chunk_size if chunk_size > 0 else _DEFAULT_FALLBACK_SIZE
-        self.overlap_size = overlap_size
+    def __init__(self, chunk_size=0, overlap_size=-1):
+        self.chunk_size = chunk_size if chunk_size > 0 else _DEFAULT_CHUNK_SIZE
+        self.overlap_size = overlap_size if overlap_size >= 0 else self.chunk_size // 8
 
     def chunk(self, input_string):
         text_splitter = RecursiveCharacterTextSplitter(