
feat: add streaming output of LLM#43

Merged
Sskift merged 3 commits into dev from feature/streaming
Dec 15, 2025

Conversation

@Dynamite2003
Owner

No description provided.

Copilot AI review requested due to automatic review settings December 15, 2025 07:20
@Sskift Sskift merged commit 98ba09a into dev Dec 15, 2025
7 checks passed

Copilot AI left a comment


Pull request overview

This PR adds streaming support for LLM responses using Server-Sent Events (SSE), enabling real-time token-by-token output in both intelligent search and paper Q&A features. It also introduces conversation categorization to separate search and reading conversation histories.

Key changes:

  • Implements SSE streaming infrastructure in both frontend and backend
  • Adds conversation category field ("search" or "reading") to separate different types of conversations
  • Updates UI components to display streaming responses with appropriate loading states
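The SSE mechanism the PR relies on can be illustrated with a minimal parser sketch. This is not the PR's implementation (the actual consumer lives in frontend/src/lib/sse.ts); the event payloads below are hypothetical, assuming the conventional `data:`-prefixed wire format with events separated by a blank line:

```python
def parse_sse(raw: str) -> list[str]:
    """Split a Server-Sent Events stream into event payloads.

    Events are separated by a blank line; each `data:` line contributes
    one line of the event payload (multi-line data is joined with '\n').
    """
    events = []
    for block in raw.split("\n\n"):
        data_lines = [
            line[len("data:"):].lstrip()
            for line in block.split("\n")
            if line.startswith("data:")
        ]
        if data_lines:
            events.append("\n".join(data_lines))
    return events

# Illustrative stream, shaped like token-by-token LLM output:
stream = 'data: {"delta": "Hel"}\n\ndata: {"delta": "lo"}\n\ndata: [DONE]\n\n'
print(parse_sse(stream))  # ['{"delta": "Hel"}', '{"delta": "lo"}', '[DONE]']
```

A real consumer must additionally handle events split across network chunks, which is why the frontend utility reads from `response.body.getReader()` and buffers between reads.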

Reviewed changes

Copilot reviewed 20 out of 20 changed files in this pull request and generated 19 comments.

Summary per file:
frontend/src/lib/sse.ts New SSE consumer utility for parsing Server-Sent Events streams
frontend/src/lib/url.ts URL normalization with loopback hostname replacement for Docker environments
frontend/src/lib/api/academic.ts Streaming API client for intelligent search with SSE support
frontend/src/lib/api-client.ts Streaming API client for paper Q&A with SSE support
frontend/src/components/layout/dashboard-shell.tsx Updated settings navigation to profile page with account tab
frontend/src/components/ai-agent/AgentChatPanel.tsx Integrated streaming support with abort controller for paper Q&A
frontend/src/components/academic/IntelligentConversation.tsx Added loading animation for empty assistant messages during streaming
frontend/src/components/academic/ConversationHistory.tsx Added category filter and URL building helper for conversation requests
frontend/src/app/profile/page.tsx Added URL search params support for tab navigation
frontend/src/app/academic/page.tsx Integrated streaming intelligent search with real-time token display
backend/migrations/versions/20251114_conversation_category.py Database migration adding category column to conversations table
backend/app/services/ai/llm_client.py Added streaming methods for LLM with SSE event parsing
backend/app/services/academic/intelligent_service.py Streaming support with JsonStringFieldStreamer for incremental JSON parsing
backend/app/schemas/conversation.py Added category field to conversation schemas
backend/app/schemas/academic.py Added stream flag to IntelligentSearchRequest
backend/app/models/conversation.py Added category column to Conversation model
backend/app/db/conversation_repository.py Added category filtering in list_conversations
backend/app/api/v1/endpoints/papers.py Streaming endpoint for paper Q&A with SSE response
backend/app/api/v1/endpoints/conversations.py Added category filtering across conversation endpoints
backend/app/api/v1/endpoints/academic.py Streaming endpoint for intelligent search with SSE response
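The JsonStringFieldStreamer mentioned for intelligent_service.py is not shown on this page; the general technique — emitting the characters of one JSON string field as the document arrives chunk by chunk, without waiting for the closing brace — can be sketched as follows. The class name and field names here are hypothetical, not the PR's actual API:

```python
class StringFieldStreamer:
    """Incrementally emit the value of one JSON string field.

    Feed raw chunks of a JSON document; new characters of the target
    field's value are returned as soon as they arrive. Escape sequences
    and a marker split across chunks are not handled, for brevity.
    """

    def __init__(self, field: str):
        self.marker = f'"{field}": "'
        self.buffer = ""
        self.inside = False   # have we passed the opening quote?
        self.done = False     # have we hit the closing quote?

    def feed(self, chunk: str) -> str:
        if self.done:
            return ""
        self.buffer += chunk
        if not self.inside:
            idx = self.buffer.find(self.marker)
            if idx == -1:
                return ""
            self.inside = True
            self.buffer = self.buffer[idx + len(self.marker):]
        end = self.buffer.find('"')
        if end == -1:
            out, self.buffer = self.buffer, ""
            return out
        self.done = True
        return self.buffer[:end]

streamer = StringFieldStreamer("answer")
chunks = ['{"answer": "Str', 'eaming wor', 'ks", "sources": []}']
print("".join(streamer.feed(c) for c in chunks))  # Streaming works
```

A production version must also cope with the marker itself straddling a chunk boundary and with JSON escape sequences (`\"`, `\n`, `\uXXXX`) inside the value.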

Comment thread: frontend/src/lib/sse.ts
): Promise<void> {
const reader = response.body?.getReader();
if (!reader) {
throw new Error("当前环境不支持流式传输");

Copilot AI Dec 15, 2025


The error message "当前环境不支持流式传输" ("streaming is not supported in the current environment") is in Chinese, which is inconsistent with the English error messages used elsewhere in the codebase. Consider using an English message, or making error messages configurable for internationalization.

Suggested change
throw new Error("当前环境不支持流式传输");
throw new Error("Streaming is not supported in the current environment");

Comment on lines +268 to +271
async for chunk in response.content.iter_chunked(1024):
if not chunk:
continue
buffer += chunk.decode("utf-8")

Copilot AI Dec 15, 2025


The code uses 'iter_chunked(1024)' to read from the response, then decodes each chunk as UTF-8 immediately. A multi-byte UTF-8 character that spans a chunk boundary will raise a decode error. Consider using an incremental decoder (e.g. `codecs.getincrementaldecoder("utf-8")`) or accumulating bytes until a complete sequence is available before decoding.
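The chunk-boundary problem can be handled in Python (the language of the code under review) with an incremental decoder; a minimal sketch, with an illustrative helper name:

```python
import codecs

def decode_stream(chunks):
    """Decode byte chunks without breaking multi-byte UTF-8 characters.

    The incremental decoder buffers any trailing partial sequence until
    the next chunk supplies the remaining bytes.
    """
    decoder = codecs.getincrementaldecoder("utf-8")()
    text = "".join(decoder.decode(chunk) for chunk in chunks)
    return text + decoder.decode(b"", final=True)

# "流" encodes to three bytes; split it mid-character across two chunks.
payload = "流式".encode("utf-8")
chunks = [payload[:2], payload[2:]]
print(decode_stream(chunks))  # 流式
```

By contrast, `payload[:2].decode("utf-8")` raises `UnicodeDecodeError`, which is exactly the failure mode possible with `iter_chunked` plus an immediate per-chunk decode.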

Comment on lines +61 to +75
const performRequest = async (path: string, init?: RequestInit) => {
const absoluteUrl = buildBackendUrl(path);
try {
return await fetch(absoluteUrl, init);
} catch (error) {
if (typeof window !== "undefined") {
try {
return await fetch(path, init);
} catch {
throw error;
}
}
throw error;
}
};

Copilot AI Dec 15, 2025


The 'performRequest' function implements a fallback mechanism where if the absolute URL fetch fails, it tries again with the original path. This fallback logic appears to be duplicative and may mask genuine network errors. The fallback should only occur for specific error types (e.g., network errors in SSR context) rather than catching all errors.
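The narrowing suggested here — retry only on failures the fallback can actually fix — can be sketched in Python with a stand-in `fetch` callable (all names illustrative, not the PR's code):

```python
def perform_request(fetch, absolute_url: str, path: str):
    """Retry with the relative path only on network-level failures.

    Application-level errors (bad status, parse failures) propagate
    unchanged instead of being masked by the fallback.
    """
    try:
        return fetch(absolute_url)
    except ConnectionError:  # the only class of error the fallback can fix
        return fetch(path)

calls = []
def fake_fetch(url):
    calls.append(url)
    if url.startswith("http://backend"):
        raise ConnectionError("backend unreachable")
    return f"ok:{url}"

print(perform_request(fake_fetch, "http://backend/api", "/api"))  # ok:/api
```

In the TypeScript original the analogous narrowing would test for a network `TypeError` from `fetch` rather than catching every thrown value.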

return None




Copilot AI Dec 15, 2025


There are two consecutive blank lines here which is inconsistent with the single blank line spacing used elsewhere in the file. Consider removing one blank line for consistency.


request: PaperQARequest,
options?: PaperQAStreamOptions,
): Promise<PaperQAResponse> {
const response = await fetch(`${BACKEND_URL}/api/v1/papers/qa`, {

Copilot AI Dec 15, 2025


The function uses 'BACKEND_URL' directly instead of using 'buildBackendUrl' helper which is now imported at line 2. For consistency with other recent changes in this PR (e.g., ConversationHistory.tsx), consider using the 'buildBackendUrl' helper to construct the URL.

Suggested change
const response = await fetch(`${BACKEND_URL}/api/v1/papers/qa`, {
const response = await fetch(buildBackendUrl("/api/v1/papers/qa"), {

Comment on lines +17 to +25
op.add_column(
"conversations",
sa.Column(
"category",
sa.String(length=32),
nullable=False,
server_default="search",
),
)

Copilot AI Dec 15, 2025


The new 'category' column is 'nullable=False' with 'server_default="search"', so the default is backfilled onto every existing row in one step, which can lock large tables during the migration. Document whether this is intended, or whether a staged migration (add the column as nullable, backfill, then set NOT NULL) would be safer for production deployments with existing data.

Suggested change
op.add_column(
"conversations",
sa.Column(
"category",
sa.String(length=32),
nullable=False,
server_default="search",
),
)
# Step 1: Add the column as nullable with a server default
op.add_column(
"conversations",
sa.Column(
"category",
sa.String(length=32),
nullable=True,
server_default="search",
),
)
# Step 2: Backfill existing rows
op.execute("UPDATE conversations SET category = 'search' WHERE category IS NULL")
# Step 3: Alter the column to be non-nullable
op.alter_column("conversations", "category", nullable=False)
# Step 4: Create the index

Comment on lines +323 to 325
if (!resultApplied) {
applyResponse(response);
}

Copilot AI Dec 15, 2025


The 'resultApplied' flag is used to prevent duplicate application of the response, but the logic is complex with the flag being checked in both the 'onEvent' callback and after the await. This could be simplified by ensuring the response is only applied once in a single location, or by better documenting why this pattern is necessary to prevent race conditions.

Suggested change
if (!resultApplied) {
applyResponse(response);
}
// Response is only applied in the onEvent callback above.

return await self._legacy_search(request, query_text, history, stream_callback=stream_callback)

logger.info("[AI-Search] 收到查询: '%s'", query_text[:120])


Copilot AI Dec 15, 2025


The logic disables function calling when streaming is enabled ('use_function_calling = self.enable_function_calling and not stream_callback'). This decision should be documented with a comment explaining why function calling and streaming are mutually exclusive, as this constraint may not be obvious to future maintainers.

Suggested change
# Function calling is disabled while streaming: deciding whether to invoke a
# function requires the model's complete output, whereas streaming delivers
# partial output incrementally, so the two modes are incompatible in the
# current implementation.

import ConversationHistory from "@/components/academic/ConversationHistory";
import FloatingFilters from "@/components/academic/FloatingFilters";
import { intelligentSearch, searchPapers, type SearchMode } from "@/lib/api/academic";
import { intelligentSearch, searchPapers, streamIntelligentSearch, type SearchMode } from "@/lib/api/academic";

Copilot AI Dec 15, 2025


Unused import 'intelligentSearch'.

Suggested change
import { intelligentSearch, searchPapers, streamIntelligentSearch, type SearchMode } from "@/lib/api/academic";
import { searchPapers, streamIntelligentSearch, type SearchMode } from "@/lib/api/academic";

@@ -1,5 +1,5 @@
"""Conversation API endpoints."""
from typing import List
from typing import List, Literal, Optional

Copilot AI Dec 15, 2025


Import of 'List' is not used.

Suggested change
from typing import List, Literal, Optional
from typing import Literal, Optional
