feat(platform): add TXT file analysis tool and improve sub-agent thread linking #271

Merged: larryro merged 6 commits into main from feature/216-improve-txt-recognition on Jan 23, 2026

@larryro larryro commented Jan 23, 2026

Summary

  • Add a new TXT file analysis tool that enables the agent to analyze and extract content from plain text files
  • Improve sub-agent thread linking by adding a get_parent_thread_id function to properly track thread hierarchies
  • Update tool response helpers to support thread ID propagation for better conversation context

Test plan

  • Verify TXT file analysis works correctly with various text file formats
  • Test sub-agent thread linking maintains proper parent-child relationships
  • Confirm existing file analysis tools (PDF, DOCX, PPTX) still work as expected

🤖 Generated with Claude Code

Summary by CodeRabbit

  • New Features

    • Added support for analyzing text files (.txt, .log) with automatic encoding detection and chunked processing for large files
    • Text files can now be attached and processed through the Document Agent alongside documents and images
  • Improvements

    • Enhanced error logging and user-facing error messages for tool operations
    • Improved thread management for approval workflows

…ad linking

Add new txt tool for parsing and analyzing plain text files with:
- Multi-encoding support (UTF-8, UTF-16, GBK, Big5, Shift-JIS, etc.)
- Chunked processing for large files (up to 10MB)
- AI-powered content analysis via fast model
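
The multi-encoding support listed above could be approached roughly as follows. This is a minimal sketch, not the code from this PR: `decodeTextFile`, `detectBom`, `tryDecode`, and the candidate order are hypothetical names and choices, and it assumes a runtime whose `TextDecoder` knows the legacy labels (unsupported labels are simply skipped).

```typescript
// Hypothetical sketch: none of these names come from the PR itself.
interface DecodedText {
  text: string;
  encoding: string;
  lossy: boolean; // true when no candidate decoded cleanly
}

// Assumed candidate order for files without a byte-order mark.
const LEGACY_CANDIDATES = ['gbk', 'big5', 'shift_jis'];

// A byte-order mark is the most reliable signal when present.
function detectBom(bytes: Uint8Array): string | null {
  if (bytes.length >= 3 && bytes[0] === 0xef && bytes[1] === 0xbb && bytes[2] === 0xbf) return 'utf-8';
  if (bytes.length >= 2 && bytes[0] === 0xff && bytes[1] === 0xfe) return 'utf-16le';
  if (bytes.length >= 2 && bytes[0] === 0xfe && bytes[1] === 0xff) return 'utf-16be';
  return null;
}

function tryDecode(bytes: Uint8Array, encoding: string): string | null {
  try {
    // fatal: true makes decode() throw on malformed input instead of emitting U+FFFD.
    return new TextDecoder(encoding, { fatal: true }).decode(bytes);
  } catch {
    // Either the bytes are malformed for this encoding or the label is unsupported.
    return null;
  }
}

function decodeTextFile(bytes: Uint8Array): DecodedText {
  const bom = detectBom(bytes);
  const candidates = [...(bom ? [bom] : []), 'utf-8', ...LEGACY_CANDIDATES];
  for (const encoding of candidates) {
    const text = tryDecode(bytes, encoding);
    if (text !== null) return { text, encoding, lossy: false };
  }
  // Last resort: lossy UTF-8, where malformed sequences become U+FFFD.
  return { text: new TextDecoder('utf-8').decode(bytes), encoding: 'utf-8', lossy: true };
}
```

Strict decoding only proves that the bytes are valid in an encoding, not that the encoding is the right one. Legacy multi-byte encodings overlap heavily, so the first clean decode in this sketch may still be a wrong guess.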

Refactor approval thread linking to use database-backed parent thread
lookup instead of context-passed parentThreadId. Sub-threads now store
their parent reference in the thread summary, ensuring approvals from
sub-agents always link to the main chat thread.

Also includes:
- Text file attachment handling in message formatting
- Extended agent response validators
- Improved error message extraction in tool responses

@coderabbitai coderabbitai Bot commented Jan 23, 2026

📝 Walkthrough

This PR adds text file analysis capabilities and restructures thread approval resolution across multiple agent tools. It introduces a new analyze_text.ts utility for processing large TXT files with chunking, encoding detection, and concurrency control. A corresponding txt_tool.ts exposes this functionality in the agent tool registry and document agent. Multiple approval-related tools are updated to use a new database-backed getApprovalThreadId() helper for consistent thread linkage. Attachment processing is enhanced to classify and handle text files separately from documents. New types (SubThreadSummary, context validators) are added, and documentation is updated across the document agent to reflect TXT file parsing capabilities.
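
To illustrate the chunking and concurrency control mentioned in the walkthrough, here is a minimal sketch under stated assumptions. `splitIntoChunks` and `mapWithConcurrency` are hypothetical helpers, not functions from `analyze_text.ts`, and the newline-preferring split is just one plausible chunking rule.

```typescript
// Hypothetical helpers: illustrative only, not taken from analyze_text.ts.

// Split text into chunks of at most maxChars, cutting after a newline when one
// falls inside the current window so lines are not broken in the middle.
function splitIntoChunks(text: string, maxChars: number): string[] {
  if (maxChars <= 0) throw new RangeError('maxChars must be positive');
  const chunks: string[] = [];
  let start = 0;
  while (start < text.length) {
    let end = Math.min(start + maxChars, text.length);
    if (end < text.length) {
      const lastNewline = text.lastIndexOf('\n', end - 1);
      if (lastNewline >= start) end = lastNewline + 1;
    }
    chunks.push(text.slice(start, end));
    start = end;
  }
  return chunks;
}

// Run fn over all items with at most `limit` calls in flight at once.
// Results keep the input order regardless of completion order.
async function mapWithConcurrency<T, R>(
  items: readonly T[],
  limit: number,
  fn: (item: T, index: number) => Promise<R>,
): Promise<R[]> {
  const results = new Array<R>(items.length);
  let next = 0;
  async function worker(): Promise<void> {
    while (next < items.length) {
      const index = next++; // claimed synchronously, so no two workers share an index
      results[index] = await fn(items[index] as T, index);
    }
  }
  const workerCount = Math.max(1, Math.min(limit, items.length));
  await Promise.all(Array.from({ length: workerCount }, () => worker()));
  return results;
}
```

A caller would typically chain the two, for example `mapWithConcurrency(splitIntoChunks(text, 8000), 3, analyzeChunk)`, where `analyzeChunk` stands in for whatever per-chunk model call is used and both numbers are placeholders.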

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~30 minutes

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 5

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
services/platform/convex/lib/agent_response/generate_response.ts (1)

231-241: Non-streaming ignores beforeGenerate hook outputs (systemContextMessages/promptContent).

The non-streaming path (sub-agents) bypasses beforeGenerate overrides and always uses structuredContext.contextText + taskDescription directly, while the streaming path respects these customizations. If sub-agents need to support beforeGenerate hooks for prompt/system context injection, map systemContextMessages and promptContent (from the hook result) into the messages and prompt parameters respectively, similar to the streaming path at lines 180-181.

Note: taskDescription is a required string field and cannot be undefined.

🤖 Fix all issues with AI agents
In `@services/platform/convex/agent_tools/files/txt_tool.ts`:
- Around line 11-12: The code is casting serialized fileId values to
Id<'_storage'>; instead change analyzeTextContent to accept fileId as a plain
string (remove the Id<'_storage'> import and cast), update all call sites that
pass fileId (including the usages around lines 95-97) to pass and treat it as
string, and adjust any parameter/typing in analyzeTextContent and its callers so
serialized IDs remain string across boundaries.

In `@services/platform/convex/lib/agent_chat/start_agent_chat.ts`:
- Around line 183-192: The isTextFile function currently checks fileType with
strict equality and misses MIME parameters like "text/plain; charset=utf-8";
update isTextFile (and its use of FileAttachment.fileType) to treat any fileType
that startsWith('text/plain') or splits on ';' and compares the media type
portion so MIME parameters are ignored, keeping the existing extension checks
(.txt, .log) intact.
- Around line 28-38: formatFileSize currently caps at MB causing huge MB values
for large files; update the function (formatFileSize) to support larger units
(GB, TB) or call the repo's shared human-readable size formatter if one exists.
Replace the fixed-branch logic with a loop over units ["B","KB","MB","GB","TB"]
dividing by 1024 until the value is <1024 and format with one decimal for values
<10 and integer otherwise, or import and use the shared formatter utility to
ensure consistent output across the codebase.
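
The two `start_agent_chat.ts` suggestions above can be sketched as follows. This is an illustration under assumptions rather than the merged code: the `FileAttachment` shape (in particular `fileName`) is hypothetical, and the follow-up commit reports using `startsWith` instead of splitting on `;`.

```typescript
// Hypothetical attachment shape: only fileType is named in the review.
interface FileAttachment {
  fileName: string;
  fileType: string;
}

// Treat "text/plain; charset=utf-8" the same as "text/plain" by comparing
// only the media type portion, then fall back to the extension checks.
function isTextFile(file: FileAttachment): boolean {
  const mediaType = (file.fileType.split(';')[0] ?? '').trim().toLowerCase();
  if (mediaType === 'text/plain') return true;
  const name = file.fileName.toLowerCase();
  return name.endsWith('.txt') || name.endsWith('.log');
}

const SIZE_UNITS = ['B', 'KB', 'MB', 'GB', 'TB'] as const;

// Divide by 1024 until the value drops below 1024 or the largest unit is reached.
// Values below 10 (other than bytes) get one decimal; everything else is rounded.
function formatFileSize(bytes: number): string {
  let value = Math.max(0, bytes);
  let unit = 0;
  while (value >= 1024 && unit < SIZE_UNITS.length - 1) {
    value /= 1024;
    unit++;
  }
  const formatted = unit > 0 && value < 10 ? value.toFixed(1) : Math.round(value).toString();
  return `${formatted} ${SIZE_UNITS[unit]}`;
}
```

One known rough edge of this sketch: a value such as 1023.6 KB rounds up and prints as "1024 KB" rather than rolling over to MB.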

In `@services/platform/convex/lib/attachments/process_attachments.ts`:
- Around line 226-230: The size display logic currently computes sizeKB =
Math.round(txt.fileSize / 1024) which yields "0 KB" for very small files; update
the logic that sets sizeKB/sizeDisplay so sub‑KB files are shown in bytes (e.g.
"123 B") or at minimum "1 KB". Specifically, modify the block that defines
sizeKB and sizeDisplay (variables sizeKB, sizeDisplay, and usage of
txt.fileSize) to: if txt.fileSize < 1024 show `${txt.fileSize} B`, otherwise
compute KB (using Math.round or Math.max(1, ...)) and format as KB/MB as before;
ensure any downstream consumers of sizeDisplay continue to receive the new
string format.
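
The two options the comment offers for sub-KB files behave differently, which a small sketch makes concrete. Both function names are hypothetical, and the KB-only formatting is a simplification: the original KB/MB branching in `process_attachments.ts` is not shown in this conversation.

```typescript
// Hypothetical helpers contrasting the two suggested fixes for "0 KB".

// Option A: show exact bytes below 1 KB.
function sizeDisplayWithBytes(fileSize: number): string {
  if (fileSize < 1024) return `${fileSize} B`;
  return `${Math.round(fileSize / 1024)} KB`;
}

// Option B: never show less than 1 KB.
function sizeDisplayMinOneKB(fileSize: number): string {
  return `${Math.max(1, Math.round(fileSize / 1024))} KB`;
}
```

For a 123-byte file, option A yields "123 B" and option B yields "1 KB". The two agree from 1024 bytes upward.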

In `@services/platform/convex/threads/get_parent_thread_id.ts`:
- Around line 33-35: The JSON.parse result in get_parent_thread_id.ts is blindly
cast to SubThreadSummary and may contain invalid shapes; update the try block to
validate that the parsed value is a plain object and that parsed.parentThreadId
is a string (or undefined) before returning it, otherwise return null;
specifically check the parsed value (from JSON.parse(thread.summary)) is
non-null and typeof parsed === 'object' and typeof parsed.parentThreadId ===
'string' (or use a safe guard for undefined) so that the function only returns a
valid parentThreadId string or null, leaving other behavior (error handling)
unchanged.
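
The validation described for `get_parent_thread_id.ts` might look like the sketch below. It is an assumption-laden illustration: `parseParentThreadId` is a hypothetical name, `SubThreadSummary` is reduced to the one field discussed here, and returning null on a parse error is assumed rather than confirmed by the review.

```typescript
// Hypothetical subset of the SubThreadSummary type mentioned in the review.
interface SubThreadSummary {
  parentThreadId?: string;
}

// Return the parent thread ID only when the summary is valid JSON, parses to
// an object, and carries a string parentThreadId. Everything else yields null.
function parseParentThreadId(summary: string | undefined): string | null {
  if (!summary) return null;
  try {
    const parsed: unknown = JSON.parse(summary);
    if (parsed === null || typeof parsed !== 'object') return null;
    const { parentThreadId } = parsed as Partial<SubThreadSummary>;
    return typeof parentThreadId === 'string' ? parentThreadId : null;
  } catch {
    // Assumed behavior: treat an unparseable summary as "no parent".
    return null;
  }
}
```

Typing the parse result as `unknown` forces the shape checks to happen before any property access, which is the point of the review comment.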

Changed AnalyzeTextParams.fileId from Id<'_storage'> to string to keep
serialized IDs as strings across boundaries. The cast is now internal
to analyzeTextContent at the storage.get() call.

Added support for GB and TB units to prevent large files from displaying
huge MB values.

Changed strict equality to startsWith to handle MIME parameters like
'text/plain; charset=utf-8'.

Use Math.max(1, ...) to prevent "0 KB" display for very small files.

Use Partial<SubThreadSummary> and typeof check to validate parentThreadId
is actually a string before returning it.

@larryro larryro merged commit e3c9434 into main Jan 23, 2026
2 checks passed
@larryro larryro deleted the feature/216-improve-txt-recognition branch January 23, 2026 08:48