Release v0.212.091 · microsoft/simplechat

Note

README will be updated with latest features, changes, updates, upgrade guidelines, videos, and more over the coming days.

New Features

1. Audio & Video Processing

Audio processing pipeline
- Integrated Azure Speech transcriptions into document ingestion.
- Splits transcripts into ~400-word chunks for downstream indexing.
Video Indexer settings UI
- Added input fields in Admin Settings for Video Indexer endpoint, key and locale.
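
The ~400-word transcript chunking in the audio pipeline can be pictured with a minimal sketch like the one below. The function name chunk_words and the whitespace-based word split are assumptions made for illustration; the release notes do not say how the project counts words.

```python
def chunk_words(text: str, max_words: int = 400) -> list[str]:
    """Split text into consecutive chunks of at most max_words words."""
    if max_words <= 0:
        raise ValueError("max_words must be positive")
    # Assumption: a "word" is any whitespace-delimited token.
    words = text.split()
    return [
        " ".join(words[i:i + max_words])
        for i in range(0, len(words), max_words)
    ]
```

Whitespace splitting does not segment text written without spaces (e.g. Japanese), so a real implementation needs a different strategy for such input.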

2. Multi-Model Support

Users may choose from multiple OpenAI deployments at runtime.
Model list is dynamically populated based on Admin settings (including APIM).

3. Advanced Chunking Logic

PDF & PPTX: page-based chunks via Document Intelligence.
DOC/DOCX: ~400-word chunks via Document Intelligence.
Images (jpg/jpeg/png/bmp/tiff/tif/heif): single-chunk OCR.
Plain Text (.txt): ~400-word chunks.
HTML: hierarchical H1–H5 splits with table rebuilding, 600–1200-word sizing.
Markdown (.md): header-based splitting, table & code-block integrity, 600–1200-word sizing.
JSON: RecursiveJsonSplitter w/ convert_lists=True, max_chunk_size=600.
Tabular (CSV/XLSX/XLS): pandas-driven row chunks (≤800 chars + header), sheets as separate files, formulas stripped.
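
As a rough illustration of the tabular rule above (row chunks of at most 800 characters, each carrying the header), here is a sketch under stated assumptions. It uses the standard-library csv module instead of pandas to stay dependency-free, and the function names are hypothetical. Sheet splitting and formula stripping are not covered.

```python
import csv
import io


def _format_row(row: list[str]) -> str:
    """Serialize one row back to a single CSV line (no trailing newline)."""
    buf = io.StringIO()
    csv.writer(buf, lineterminator="").writerow(row)
    return buf.getvalue()


def chunk_csv_rows(csv_text: str, max_chars: int = 800) -> list[str]:
    """Group rows into chunks whose row text is at most max_chars long,
    repeating the header line at the top of every chunk."""
    reader = csv.reader(io.StringIO(csv_text))
    try:
        header = _format_row(next(reader))
    except StopIteration:
        return []  # empty input: nothing to chunk

    chunks: list[str] = []
    current: list[str] = []
    size = 0
    for row in reader:
        if not row:
            continue  # skip blank lines
        line = _format_row(row)
        added = len(line) + (1 if current else 0)  # +1 for the joining newline
        if current and size + added > max_chars:
            chunks.append("\n".join([header, *current]))
            current, size, added = [], 0, len(line)
        # A single row longer than max_chars still becomes its own chunk.
        current.append(line)
        size += added
    if current:
        chunks.append("\n".join([header, *current]))
    return chunks
```

Repeating the header in each chunk keeps every chunk interpretable on its own once it is indexed separately from its neighbors.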

4. Group Workspace Consolidation

Unified all group document logic into functions_documents.js.
Removed functions_group_documents.js duplication.

5. Bulk File Uploads

Support for uploading up to 10 files in a single operation, with parallel ingestion and processing.
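
One way to picture the 10-file limit combined with parallel ingestion is the sketch below. It assumes a thread pool and a caller-supplied ingest_one function; both are illustrative choices, and the release notes do not describe the actual concurrency mechanism.

```python
from concurrent.futures import ThreadPoolExecutor

MAX_FILES_PER_UPLOAD = 10  # limit stated in the release notes


def ingest_files(paths, ingest_one, max_workers=4):
    """Run ingest_one(path) for every path in parallel.

    Results are returned in the same order as the input paths.
    """
    if len(paths) > MAX_FILES_PER_UPLOAD:
        raise ValueError(
            f"at most {MAX_FILES_PER_UPLOAD} files per upload, got {len(paths)}"
        )
    if not paths:
        return []
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(ingest_one, paths))
```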

6. GPT-Driven Metadata Extraction

Admins can select a GPT model to power metadata parsing.
All new documents are processed through the chosen model for entity, keyword, and summary extraction.

7. Advanced Document Classification

Admin-configurable classification fields, each with custom color-coded labels.
Classification metadata persisted per document for filtering and display.

8. Contextual Classification Propagation

When a classified document is referenced in chat, its tags are automatically applied to the conversation as contextual metadata.

9. Chat UI Enhancements

Left-docked conversation menu for persistent navigation.
Editable conversation titles inline (left & right panes stay in sync).
Streamlined new chat flow: click-to-start or type-to-auto-create.
User-defined prompts surfaced inline within the message input.

10. Semantic Reranking & Extractive Answers

Switched to semantic queries (query_type="semantic") on both user and group indexes.
Enabled extractive highlights (query_caption="extractive") to surface the most relevant snippet in each hit.
Enabled extractive answers (query_answer="extractive") so the engine returns a concise, context-rich response directly from the index.
Automatically falls back to full-text search (query_type="full", search_mode="all") whenever no literal match is found, ensuring precise retrieval of references or other exact phrases.
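
The query parameters above fit together roughly as in the following sketch. It assumes client is an azure-search-documents SearchClient (or any object with a compatible search method), that the fallback triggers when the semantic query returns no hits, and that a semantic configuration named "default" exists. The function name, the top value, and the fallback condition are illustrative assumptions, not the project's confirmed logic.

```python
def search_with_fallback(client, query, top=12,
                         semantic_configuration_name="default"):
    """Try a semantic query first; fall back to full Lucene syntax."""
    results = list(client.search(
        search_text=query,
        query_type="semantic",
        semantic_configuration_name=semantic_configuration_name,
        query_caption="extractive",  # highlight the most relevant snippet
        query_answer="extractive",   # ask the engine for a direct answer
        top=top,
    ))
    if results:
        return results
    # Assumed fallback condition: the semantic query produced no hits.
    return list(client.search(
        search_text=query,
        query_type="full",
        search_mode="all",  # every term must match
        top=top,
    ))
```

Materializing the results as a list keeps the sketch simple; code that needs the extractive answers would read them from the paged result object before converting it.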

Bug Fixes

A. AI Search Index Migration

Automatically add any missing fields (e.g. author, chunk_keywords, document_classification, page_number, start_time, video_ocr_chunk_text) on every Admin page load.
Fixed SDK usage (Collection attribute) to update index schema without full-index replacement.
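
An in-place schema update of this kind can be sketched as follows. The sketch assumes index_client behaves like the Azure SDK's SearchIndexClient (get_index and create_or_update_index) and that required_fields is a list of field objects with a name attribute. The function name is hypothetical and this is not the project's actual migration code.

```python
def add_missing_fields(index_client, index_name, required_fields):
    """Append required fields that the index lacks and push the new schema.

    Returns the names of the fields that were added (empty if none).
    """
    index = index_client.get_index(index_name)
    existing = {field.name for field in index.fields}
    missing = [f for f in required_fields if f.name not in existing]
    if missing:
        index.fields.extend(missing)
        # Updates the existing index definition instead of recreating it.
        index_client.create_or_update_index(index)
    return [f.name for f in missing]
```

Because the function is a no-op when nothing is missing, calling it on every Admin page load stays cheap after the first successful run.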

B. User & Group Management

Resolved the 401 error returned by user search when adding a new user to a group by:
- Implementing SerializableTokenCache in MSAL tied to Flask session.
- Ensuring _save_cache() is called after acquire_token_by_authorization_code.
- Refactoring get_valid_access_token() to use acquire_token_silent().
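
The three steps above follow a common MSAL pattern, sketched here under stated assumptions. cache stands for an msal.SerializableTokenCache, app for an MSAL client application built with that cache, and session for Flask's session (any dict-like object works). The session key and the function signatures are hypothetical and may differ from the project's own _save_cache() and get_valid_access_token().

```python
TOKEN_CACHE_KEY = "token_cache"  # hypothetical session key


def load_cache(session, cache):
    """Restore a previously serialized token cache from the session."""
    serialized = session.get(TOKEN_CACHE_KEY)
    if serialized:
        cache.deserialize(serialized)
    return cache


def save_cache(session, cache):
    """Write the cache back to the session only if MSAL changed it."""
    if cache.has_state_changed:
        session[TOKEN_CACHE_KEY] = cache.serialize()


def get_valid_access_token(app, session, cache, scopes):
    """Return a cached or silently refreshed token, or None if the user
    has to sign in again."""
    accounts = app.get_accounts()
    if not accounts:
        return None
    result = app.acquire_token_silent(scopes, account=accounts[0])
    save_cache(session, cache)  # a silent refresh may have rotated tokens
    if result and "access_token" in result:
        return result["access_token"]
    return None
```

The same save_cache call would follow acquire_token_by_authorization_code in the sign-in callback, so that the tokens obtained there are still available on later requests.
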
Restored metadata extraction & classification buttons in Group Workspace.
Fixed new role language in Admin settings and published an OpenAPI spec for /api/.

C. Conversation Flow & UI

Auto-create a new conversation on first user input, prompt selection or file upload.
Persist the custom logo across reboots by storing it as Base64 in Cosmos (max 100 px height, ≤ 500 KB).
Prevent uploaded files from overflowing the chat window (CSS update).
Sync conversation title in left pane without manual refresh.
Restore missing loadConversations() in chat-input-actions.js.
Fix feedback button behavior and ensure prompt selection sends full content.
Include original search_query & user_message in AI Search telemetry.
Ensure existing documents no longer appear “Not Available” by populating percent_complete.
Support Unicode (e.g. Japanese) in text-file chunking.

D. Miscellaneous Fixes

Fixed the file upload error (loadConversations is not defined).
When classification is disabled, it no longer displays in the documents list or title.
Select prompt/upload file now always creates a conversation if none exists.
Fix new categories error by seeding missing nested settings with defaults on startup.
Fixed too many results being returned with the legacy chunk size by reducing top_n from 20 to 12. The ability to control top_n via the chat UI will be added in a future release.

Breaking Changes & Migration Notes

The index schema must be re-migrated via Admin Settings (an admin initiates this from the app settings page).
