Conversation
This reverts commit 3c4a140.
When using the continue function, add_generation_prompt was set to False, bypassing any template logic that depends on it. This caused missing tokens or headers that some models inject at the start of an assistant turn. Gemma 4 was particularly sensitive to this, producing garbled output due to the missing thought channel header (<|channel>thought\n<channel|>). Fix by popping the last incomplete assistant message and re-rendering the prompt with add_generation_prompt=True, then appending the partial content afterward. This ensures the prompt is structurally identical to normal generation regardless of the model's template. GPT-OSS and Seed-OSS thinking block handling is preserved via the existing fake-message approach, as those models manage thinking content differently.
Contributor
There was a problem hiding this comment.
Pull request overview
This PR merges development-branch updates across the desktop/Electron portable experience, chat/tool UI behavior, model/mmproj discovery, token display accounting, and documentation.
Changes:
- Adds Electron-specific settings such as model directory browsing, spellcheck toggling, update checking, and preload packaging.
- Refines chat/tool rendering, web search snippets, thinking/tool visibility, and token count display.
- Updates mmproj discovery/loading, dependencies, portable workflows, and OpenAI API docs.
Reviewed changes
Copilot reviewed 33 out of 33 changed files in this pull request and generated 3 comments.
Show a summary per file
| File | Description |
|---|---|
user_data/tools/web_search.py |
Returns snippets in web search tool results. |
user_data/tools/fetch_webpage.py |
Fetches page content without link extraction. |
server.py |
Persists Electron model directory settings. |
requirements/full/requirements.txt |
Updates selected dependency and wheel versions. |
README.md |
Updates installation wording. |
modules/web_search.py |
Adds snippets to search results and simplifies content fetching. |
modules/utils.py |
Adds mmproj helpers and expands mmproj discovery. |
modules/ui.py |
Includes Electron-only settings in saved interface state. |
modules/ui_session.py |
Adds Electron model directory UI and portable update checker. |
modules/ui_model_menu.py |
Updates mmproj dropdown text and hides Jinja controls in chat mode. |
modules/ui_chat.py |
Reorganizes chat controls and adds token display refresh. |
modules/text_generation.py |
Tracks prompt/completion token counts for HF generation. |
modules/tensorrt_llm.py |
Tracks token counts for TensorRT-LLM streaming. |
modules/shared.py |
Adds Electron detection and spellcheck setting. |
modules/models_settings.py |
Auto-detects sibling mmproj files for llama.cpp models. |
modules/llama_cpp_server.py |
Tracks completion tokens and resolves mmproj paths from model folders. |
modules/html_generator.py |
Adds structured web search result rendering and tool-call spinner markup. |
modules/exllamav3.py |
Tracks ExLlamaV3 completion token counts. |
modules/chat.py |
Updates thinking continuation handling, token display, and active chat tracking. |
js/main.js |
Adjusts chat-tab character menu placement and Electron spellcheck toggle. |
js/global_scope_js.js |
Preserves open/closed thinking block state during morphdom updates. |
docs/12 - OpenAI API.md |
Reorders and expands API examples, including tool calling. |
docs/01 - Chat Tab.md |
Renames dummy message/reply documentation. |
desktop/textgen.bat |
Enables UTF-8 mode for Windows launcher. |
desktop/preload.js |
Exposes Electron directory picker bridge. |
desktop/main.js |
Adds preload, spellcheck context menu, external-link handling, and directory IPC. |
css/main.css |
Adds styles for settings buttons, spinner, and web search cards. |
.github/workflows/build-portable-release.yml |
Includes preload script in portable build packaging. |
.github/workflows/build-portable-release-vulkan.yml |
Includes preload script in Vulkan portable packaging. |
.github/workflows/build-portable-release-rocm.yml |
Includes preload script in ROCm portable packaging. |
.github/workflows/build-portable-release-ik.yml |
Includes preload script in IK portable packaging. |
.github/workflows/build-portable-release-ik-cuda.yml |
Includes preload script in IK CUDA portable packaging. |
.github/workflows/build-portable-release-cuda.yml |
Includes preload script in CUDA portable packaging. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Comment on lines
+164
to
+167
| def apply_model_dir(value): | ||
| shared.args.model_dir = value | ||
| if Path(value).is_dir(): | ||
| shared.user_config = shared.load_user_config() |
Comment on lines
407
to
411
| output = input_ids[0] | ||
| shared.model.last_prompt_token_count = input_ids.shape[-1] | ||
| shared.model.last_completion_token_count = 0 | ||
| if state['auto_max_new_tokens']: | ||
| generate_params['max_new_tokens'] = state['truncation_length'] - input_ids.shape[-1] |
Comment on lines
+143
to
+148
| title = html.escape(r['title']) | ||
| url = html.escape(r['url']) | ||
| snippet = html.escape(r.get('snippet', '')) | ||
| cards.append( | ||
| f'<div class="web-search-result">' | ||
| f'<a class="web-search-title" href="{url}" target="_blank" rel="noopener noreferrer">{title}</a>' |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
No description provided.