feat(platform): add TXT file analysis tool and improve sub-agent thread linking by larryro · Pull Request #271 · tale-project/tale

larryro · 2026-01-23T08:30:19Z

Summary

- Add a new TXT file analysis tool that enables the agent to analyze and extract content from plain text files
- Improve sub-agent thread linking by adding a get_parent_thread_id function to properly track thread hierarchies
- Update tool response helpers to support thread ID propagation for better conversation context

Test plan

- Verify TXT file analysis works correctly with various text file formats
- Test that sub-agent thread linking maintains proper parent-child relationships
- Confirm existing file analysis tools (PDF, DOCX, PPTX) still work as expected

🤖 Generated with Claude Code

Summary by CodeRabbit

New Features
- Added support for analyzing text files (.txt, .log) with automatic encoding detection and chunked processing for large files
- Text files can now be attached and processed through the Document Agent alongside documents and images
Improvements
- Enhanced error logging and user-facing error messages for tool operations
- Improved thread management for approval workflows

…ad linking

Add new txt tool for parsing and analyzing plain text files with:
- Multi-encoding support (UTF-8, UTF-16, GBK, Big5, Shift-JIS, etc.)
- Chunked processing for large files (up to 10MB)
- AI-powered content analysis via fast model

Refactor approval thread linking to use database-backed parent thread lookup instead of context-passed parentThreadId. Sub-threads now store their parent reference in the thread summary, ensuring approvals from sub-agents always link to the main chat thread.

Also includes:
- Text file attachment handling in message formatting
- Extended agent response validators
- Improved error message extraction in tool responses
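
For readers unfamiliar with multi-encoding decoding, here is a minimal sketch of one way it could work, assuming the standard `TextDecoder` API in strict (`fatal`) mode. The helper name `decodeWithFallback`, the candidate list, and the try-in-order strategy are illustrative assumptions, not the code in this PR.

```typescript
// Hypothetical helper: the name, candidate list, and ordering are assumptions.
// Legacy encodings are ambiguous, so the first candidate that decodes
// without error wins; this is a heuristic, not true detection.
const CANDIDATE_ENCODINGS = ['utf-8', 'gbk', 'big5', 'shift_jis'];

interface DecodedText {
  text: string;
  encoding: string;
}

function decodeWithFallback(bytes: Uint8Array): DecodedText {
  // UTF-16LE is only accepted when a byte order mark (FF FE) is present,
  // because almost any even-length byte sequence decodes as UTF-16.
  if (bytes.length >= 2 && bytes[0] === 0xff && bytes[1] === 0xfe) {
    return { text: new TextDecoder('utf-16le').decode(bytes), encoding: 'utf-16le' };
  }

  for (const encoding of CANDIDATE_ENCODINGS) {
    try {
      // fatal: true makes decode() throw on invalid byte sequences
      // instead of silently inserting replacement characters.
      const decoder = new TextDecoder(encoding, { fatal: true });
      return { text: decoder.decode(bytes), encoding };
    } catch {
      // Either the runtime does not support this label (RangeError)
      // or the bytes are invalid for it (TypeError). Try the next one.
    }
  }

  // Last resort: lossy UTF-8 decoding with replacement characters.
  return { text: new TextDecoder('utf-8').decode(bytes), encoding: 'utf-8 (lossy)' };
}
```

Trying UTF-8 first is a deliberate choice in this sketch: valid UTF-8 is rarely valid by accident, whereas the legacy East Asian encodings accept many of the same byte sequences.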

coderabbitai · 2026-01-23T08:36:36Z

📝 Walkthrough

This PR adds text file analysis capabilities and restructures thread approval resolution across multiple agent tools. It introduces a new analyze_text.ts utility for processing large TXT files with chunking, encoding detection, and concurrency control. A corresponding txt_tool.ts exposes this functionality in the agent tool registry and document agent. Multiple approval-related tools are updated to use a new database-backed getApprovalThreadId() helper for consistent thread linkage. Attachment processing is enhanced to classify and handle text files separately from documents. New types (SubThreadSummary, context validators) are added, and documentation is updated across the document agent to reflect TXT file parsing capabilities.
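
To illustrate the "chunking" and "concurrency control" mentioned in the walkthrough, here is a rough sketch of the general pattern. The function names `splitIntoChunks` and `mapWithConcurrency`, the fixed-size splitting, and the worker-pool approach are assumptions for illustration; the actual analyze_text.ts may be structured differently.

```typescript
// Hypothetical helpers; names and strategy are illustrative assumptions.

// Split text into fixed-size pieces. Sizes are in UTF-16 code units, so a
// chunk boundary can fall inside a surrogate pair; a production version
// would likely prefer line or paragraph boundaries.
function splitIntoChunks(text: string, chunkSize: number): string[] {
  if (chunkSize <= 0) throw new Error('chunkSize must be positive');
  const chunks: string[] = [];
  for (let i = 0; i < text.length; i += chunkSize) {
    chunks.push(text.slice(i, i + chunkSize));
  }
  return chunks;
}

// Run `worker` over all items with at most `limit` calls in flight,
// returning results in the same order as the input.
async function mapWithConcurrency<T, R>(
  items: T[],
  limit: number,
  worker: (item: T, index: number) => Promise<R>,
): Promise<R[]> {
  const results = new Array<R>(items.length);
  let next = 0;

  // Each runner repeatedly claims the next unprocessed index until none remain.
  async function runner(): Promise<void> {
    while (next < items.length) {
      const index = next++;
      results[index] = await worker(items[index] as T, index);
    }
  }

  const runnerCount = Math.min(Math.max(1, limit), items.length);
  await Promise.all(Array.from({ length: runnerCount }, () => runner()));
  return results;
}
```

In a setup like this, each chunk would be sent to the model by the `worker` callback, and the per-chunk results would be merged afterwards.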

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~30 minutes

Possibly related PRs

refactor(chat-agent): introduce sub-agents for high-context tools #76: Introduces the document_assistant sub-agent; this PR extends it with TXT file handling and registers the new txt tool in the document agent configuration.
improve coding agent #19: Modifies agent attachment and file-handling surfaces; this PR adds text file classification and processing logic to the attachment pipeline.
fix(agent): prevent race condition in concurrent sub-thread creation #81: Implements sub-thread summary storage and parent thread ID resolution; this PR builds on that foundation by adding the getApprovalThreadId() lookup used across multiple tools.

coderabbitai

Actionable comments posted: 5

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

services/platform/convex/lib/agent_response/generate_response.ts (1)

231-241: Non-streaming ignores beforeGenerate hook outputs (systemContextMessages/promptContent).

The non-streaming path (sub-agents) bypasses beforeGenerate overrides and always uses structuredContext.contextText + taskDescription directly, while the streaming path respects these customizations. If sub-agents need to support beforeGenerate hooks for prompt/system context injection, map systemContextMessages and promptContent (from the hook result) into the messages and prompt parameters respectively, similar to the streaming path at lines 180-181.

Note: taskDescription is a required string field and cannot be undefined.

🤖 Fix all issues with AI agents

In `@services/platform/convex/agent_tools/files/txt_tool.ts`:
- Around line 11-12: The code is casting serialized fileId values to
Id<'_storage'>; instead change analyzeTextContent to accept fileId as a plain
string (remove the Id<'_storage'> import and cast), update all call sites that
pass fileId (including the usages around lines 95-97) to pass and treat it as
string, and adjust any parameter/typing in analyzeTextContent and its callers so
serialized IDs remain string across boundaries.

In `@services/platform/convex/lib/agent_chat/start_agent_chat.ts`:
- Around line 183-192: The isTextFile function currently checks fileType with
strict equality and misses MIME parameters like "text/plain; charset=utf-8";
update isTextFile (and its use of FileAttachment.fileType) to treat any fileType
that startsWith('text/plain') or splits on ';' and compares the media type
portion so MIME parameters are ignored, keeping the existing extension checks
(.txt, .log) intact.
- Around line 28-38: formatFileSize currently caps at MB causing huge MB values
for large files; update the function (formatFileSize) to support larger units
(GB, TB) or call the repo's shared human-readable size formatter if one exists.
Replace the fixed-branch logic with a loop over units ["B","KB","MB","GB","TB"]
dividing by 1024 until the value is <1024 and format with one decimal for values
<10 and integer otherwise, or import and use the shared formatter utility to
ensure consistent output across the codebase.
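
The unit loop suggested for formatFileSize could look roughly like the sketch below. The handling of negative or non-finite input is an assumption, since the comment does not specify it, and the repo's real function may format differently.

```typescript
// Sketch of the suggested loop; edge-case handling is an assumption.
const SIZE_UNITS = ['B', 'KB', 'MB', 'GB', 'TB'] as const;

function formatFileSize(bytes: number): string {
  // Assumption: treat invalid input as an empty file.
  if (!Number.isFinite(bytes) || bytes < 0) return '0 B';

  let value = bytes;
  let unit = 0;
  // Divide by 1024 until the value fits the current unit or units run out.
  while (value >= 1024 && unit < SIZE_UNITS.length - 1) {
    value /= 1024;
    unit++;
  }

  // One decimal for small non-byte values, whole numbers otherwise.
  const formatted = unit > 0 && value < 10 ? value.toFixed(1) : String(Math.round(value));
  return `${formatted} ${SIZE_UNITS[unit]}`;
}
```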

In `@services/platform/convex/lib/attachments/process_attachments.ts`:
- Around line 226-230: The size display logic currently computes sizeKB =
Math.round(txt.fileSize / 1024) which yields "0 KB" for very small files; update
the logic that sets sizeKB/sizeDisplay so sub‑KB files are shown in bytes (e.g.
"123 B") or at minimum "1 KB". Specifically, modify the block that defines
sizeKB and sizeDisplay (variables sizeKB, sizeDisplay, and usage of
txt.fileSize) to: if txt.fileSize < 1024 show `${txt.fileSize} B`, otherwise
compute KB (using Math.round or Math.max(1, ...)) and format as KB/MB as before;
ensure any downstream consumers of sizeDisplay continue to receive the new
string format.
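
A small sketch of the size display change described above. The function name `formatTextFileSize` and the KB-to-MB threshold are assumptions, because the existing "KB/MB as before" formatting is not shown in this conversation.

```typescript
// Hypothetical helper; the MB branch is an assumed stand-in for the
// existing formatting that the review comment refers to.
function formatTextFileSize(fileSize: number): string {
  // Sub-KB files are shown in bytes so they never render as "0 KB".
  if (fileSize < 1024) return `${fileSize} B`;

  // Math.max(1, ...) is kept as a guard even though fileSize >= 1024 here.
  const sizeKB = Math.max(1, Math.round(fileSize / 1024));
  return sizeKB >= 1024 ? `${(sizeKB / 1024).toFixed(1)} MB` : `${sizeKB} KB`;
}
```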

In `@services/platform/convex/threads/get_parent_thread_id.ts`:
- Around line 33-35: The JSON.parse result in get_parent_thread_id.ts is blindly
cast to SubThreadSummary and may contain invalid shapes; update the try block to
validate that the parsed value is a plain object and that parsed.parentThreadId
is a string (or undefined) before returning it, otherwise return null;
specifically check the parsed value (from JSON.parse(thread.summary)) is
non-null and typeof parsed === 'object' and typeof parsed.parentThreadId ===
'string' (or use a safe guard for undefined) so that the function only returns a
valid parentThreadId string or null, leaving other behavior (error handling)
unchanged.
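
The validation requested above might be sketched as follows. The `SubThreadSummary` shape shown here is an assumption reduced to the one field under discussion, and `parseParentThreadId` is a hypothetical standalone helper rather than the function in get_parent_thread_id.ts.

```typescript
// Assumed minimal shape; the real SubThreadSummary likely has more fields.
interface SubThreadSummary {
  parentThreadId?: string;
}

// Hypothetical helper: returns a valid parentThreadId string or null.
function parseParentThreadId(summary: string | undefined): string | null {
  if (!summary) return null;
  try {
    const parsed: unknown = JSON.parse(summary);
    // JSON.parse can return null, arrays, or primitives, so check the shape first.
    if (parsed === null || typeof parsed !== 'object' || Array.isArray(parsed)) {
      return null;
    }
    const { parentThreadId } = parsed as Partial<SubThreadSummary>;
    return typeof parentThreadId === 'string' ? parentThreadId : null;
  } catch {
    // Malformed JSON is treated the same as a missing parent reference.
    return null;
  }
}
```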

coderabbitai Bot requested changes Jan 23, 2026

larryro added 5 commits January 23, 2026 16:42

fix(review): use string type for fileId in analyzeTextContent interface

11f32da

Changed AnalyzeTextParams.fileId from Id<'_storage'> to string to keep serialized IDs as strings across boundaries. The cast is now internal to analyzeTextContent at the storage.get() call.

fix(review): extend formatFileSize to support GB and TB units

4d8b39b

Added support for GB and TB units to prevent large files from displaying huge MB values.

fix(review): handle text/plain MIME parameters in isTextFile

6478ef8

Changed strict equality to startsWith to handle MIME parameters like 'text/plain; charset=utf-8'.
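
For illustration, here is a sketch of a MIME check that tolerates parameters. It uses the split-on-`;` alternative from the review comment rather than the `startsWith` approach this commit adopted, since splitting also avoids matching unrelated types that merely begin with `text/plain`. The `FileAttachment` fields shown are assumptions.

```typescript
// Assumed minimal shape of the attachment type used by isTextFile.
interface FileAttachment {
  fileName: string;
  fileType: string;
}

function isTextFile(file: FileAttachment): boolean {
  // Drop MIME parameters such as "; charset=utf-8" before comparing.
  const mediaType = (file.fileType.split(';')[0] ?? '').trim().toLowerCase();
  if (mediaType === 'text/plain') return true;

  // Fall back to the extension checks the review asked to keep.
  const name = file.fileName.toLowerCase();
  return name.endsWith('.txt') || name.endsWith('.log');
}
```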

fix(review): ensure minimum 1 KB display for small files

e12733b

Use Math.max(1, ...) to prevent "0 KB" display for very small files.

fix(review): add runtime validation for parsed summary shape

57d6634

Use Partial<SubThreadSummary> and typeof check to validate parentThreadId is actually a string before returning it.

coderabbitai Bot approved these changes Jan 23, 2026

larryro merged commit e3c9434 into main Jan 23, 2026
2 checks passed

larryro deleted the feature/216-improve-txt-recognition branch January 23, 2026 08:48

coderabbitai Bot mentioned this pull request Mar 16, 2026

refactor: rename txt to text, document to file, and broaden usage tracking #800

Merged

5 tasks

yannickmonney pushed a commit that referenced this pull request Apr 8, 2026

feat(platform): add TXT file analysis tool and improve sub-agent thre…

5d08009

…ad linking (#271)

larryro merged 6 commits into main from feature/216-improve-txt-recognition
