
fix(release): v0.2.2 — Critical Hotfix for config_override Initialization #4

Merged
yashdesai023 merged 8 commits into main from v0.2.2-fix on Mar 1, 2026

Conversation

@yashdesai023
Collaborator

Summary

This PR resolves 7 critical bugs that affected all users initializing VDBpipe via config_override (e.g., Google Colab users). It is a pure hotfix — no breaking changes.
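For context, a minimal sketch of the kind of configuration that triggered these bugs. The key names (`embedding.model_name`, `llm.provider`) are assumptions based on this PR's fix list, not the authoritative VDBpipe schema:

```python
# Hypothetical config_override for a no-LLM Colab run.
# Key names are illustrative assumptions, not the actual schema.
config_override = {
    "vector_db": {"provider": "faiss"},
    "embedding": {"model_name": "all-MiniLM-L6-v2"},
    "llm": {"provider": None},  # no LLM: engines fall back to non-LLM output
}
```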


What's Fixed

| Bug | Fix |
| --- | --- |
| `'NoneType' object has no attribute 'tokenize'` crash | `_safe_reinit` now unconditionally reinitializes the `Embedder` from the `embedding.model_name` key |
| LLM (Sarvam/Google/Cohere) not initialized via `config_override` | Added all missing LLM providers to `_safe_reinit` |
| Graph always 0 nodes after ingestion | Added `_regex_graph_extract()` as a no-LLM fallback using regex NLP |
| Corrupted PDF crash (`FzErrorFormat`) | `_load_pdf` now loads pages by index with per-page try/except |
| Engines 2/3 returning "LLM not configured" | All engines now return readable, useful fallback content without an LLM |
| Engine 3 returning irrelevant graph output | GraphRAG now filters edges by query keywords, with vector fallback |
| `generate_response()` signature mismatch | All engine calls now correctly pass the `retrieved_context` argument |
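The no-LLM graph fallback could look roughly like this. This is a guess at the approach behind `_regex_graph_extract()`, not the actual implementation: treat capitalized word runs as entities and link entities that co-occur in the same sentence.

```python
import re
from itertools import combinations

def regex_graph_extract(text):
    """Sketch of a regex-NLP graph fallback (assumed behaviour):
    capitalized word runs become nodes; entities co-occurring
    in one sentence become edges."""
    nodes, edges = set(), set()
    for sentence in re.split(r"(?<=[.!?])\s+", text):
        entities = re.findall(r"\b[A-Z][a-zA-Z]+(?:\s+[A-Z][a-zA-Z]+)*", sentence)
        nodes.update(entities)
        for a, b in combinations(sorted(set(entities)), 2):
            edges.add((a, b))
    return nodes, edges
```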

Testing

  • Tested locally with faiss + all-MiniLM-L6-v2 (no LLM)
  • Tested in Google Colab with provider: null config
  • Tested with Sarvam LLM (sarvam-m)
  • All 4 engines verified working (Engines 2/3 via fallback, Engines 1/4 with LLM)
  • Corrupted PDF (economy-ketan-sir.pdf) ingested successfully with skipped bad pages

yashdesai023 merged commit e2f87aa into main on Mar 1, 2026
2 checks passed
yashdesai023 added a commit that referenced this pull request on Mar 3, 2026
…hunking, PPTX Loader, Backend VDBpipe Upgrade, 39-test Suite, TUI Diagnostics

BREAKING CHANGES: None — fully backwards compatible.

Architecture:
- Refactor VDBpipe to pure composition (remove TextPipeline inheritance, delete _safe_reinit)
- Replace TextPipeline with VDBpipe in backend routers (ingest, chat, retrieve)

Semantic OmniRouter (#3):
- Embedding cosine-similarity intent routing with threshold=0.35
- Pre-computed intent prototype embeddings per engine at startup
- Keyword fallback when embedder unavailable
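A minimal sketch of the cosine-similarity routing described above. Engine names and the prototype format are illustrative; only the 0.35 threshold and the idea of a non-embedding fallback default come from this changelog:

```python
import numpy as np

THRESHOLD = 0.35  # similarity floor from the changelog

def cosine(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def route(query_vec, prototypes, default="vector"):
    """Pick the engine whose pre-computed intent prototype embedding
    is most similar to the query; fall back to `default` when no
    prototype clears the threshold."""
    best_engine, best_score = default, THRESHOLD
    for engine, proto in prototypes.items():
        score = cosine(query_vec, proto)
        if score > best_score:
            best_engine, best_score = engine, score
    return best_engine
```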

Persistence (#4):
- Auto-save graph + page_index as JSON after every ingest()
- Auto-load on VDBpipe.__init__() — survives restarts

Streaming (#15):
- BaseLLMProvider.stream_response() with safe default wrapper
- OpenAILLMProvider real SSE streaming (requests stream=True)
- VDBpipe.stream_query() generator
- POST /pipelines/chat/stream SSE endpoint (StreamingResponse)
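The "safe default wrapper" for `BaseLLMProvider.stream_response()` presumably reduces to something like this sketch (method signatures are assumed from the `retrieved_context` fix in this PR):

```python
class BaseLLMProvider:
    """Sketch only; the real base class carries more methods and config."""

    def generate_response(self, prompt, retrieved_context=""):
        raise NotImplementedError

    def stream_response(self, prompt, retrieved_context=""):
        # Safe default: providers without native streaming still "stream"
        # by yielding the full response as a single chunk.
        yield self.generate_response(prompt, retrieved_context)
```

An OpenAI-style provider would override `stream_response` to yield real SSE chunks; every other provider keeps working unchanged through the default.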

Data Loading (#13):
- Add PPTX support via python-pptx (_load_pptx)
- Register .pptx in DataLoader.supported_ext

Chunking (#14):
- Add chunk_text_sentences() sentence-boundary sliding-window chunker
- Configurable max_tokens and overlap_sentences
- Old chunk_text() kept for compatibility
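A sentence-boundary sliding-window chunker with these parameters could look like the sketch below (counting whitespace-separated words as tokens; the real `chunk_text_sentences()` may differ):

```python
import re

def chunk_text_sentences(text, max_tokens=100, overlap_sentences=1):
    """Sketch of a sentence-boundary sliding-window chunker: accumulate
    whole sentences until max_tokens is reached, then start the next
    chunk from the last overlap_sentences sentences."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks, window = [], []
    for sent in sentences:
        window.append(sent)
        if sum(len(s.split()) for s in window) >= max_tokens:
            chunks.append(" ".join(window))
            window = window[-overlap_sentences:] if overlap_sentences else []
    if window:
        chunks.append(" ".join(window))
    return chunks
```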

Tests (#12):
- Expand from 4 to 39 tests across 12 test classes
- All tests mocked — no GPU/API keys required

TUI (#16, #17, #18):
- System Doctor: 6 real execSync runtime checks
- SetupWizard: setStep(8) on write error (fix silent failure)
- SetupWizard: validateAndSave() with per-provider API key validation

Bug Fixes:
- File isolation: uploads go to data/<user_id>/<uuid>_filename
- Cache eviction on config update in backend

Deps: add python-pptx>=0.6.23 to setup.py install_requires
TUI: bump to v0.1.4, smarter postinstall.cjs (python -m pip)
