feat(sdk/ts): Add multimodal helpers by santoshkumarradha · Pull Request #321 · Agent-Field/agentfield

santoshkumarradha · 2026-04-01T14:09:06Z

Implements #91 — Image, Audio, File input helpers and MultimodalResponse output handler. Closes #91

…tputs

github-actions · 2026-04-01T14:10:12Z

Performance

SDK	Memory	Δ	Latency	Δ	Tests	Status
TS	465 B	+33%	1.36 µs	-32%	✓	✗

⚠ Regression detected:

TypeScript memory: 350 B → 465 B (+33%)

santoshkumarradha

Code Review — PR #321: Multimodal Helpers

Overall Assessment: ✅ Good implementation

The PR adds 942 lines across two well-structured files that closely follow the Python reference implementation. The API surface matches the issue spec.

Strengths

Clean API design — Image.fromFile(), Audio.fromUrl(), File.fromBuffer() factory methods match the issue requirements exactly
Private constructors — Forces use of factory methods, preventing invalid state
MIME type inference — Automatically determines content type from file extensions
Comprehensive response handler — MultimodalResponse with hasAudio(), hasImage(), save() methods
No any types — Proper TypeScript typing throughout
JSDoc comments on all public methods

Issues to Address

Image.fromUrl and Audio.fromUrl are async but don't await anything — The fromUrl methods return Promise<Image> but just construct synchronously. Should either be sync or actually fetch/validate the URL.
Missing export from index — Neither file is re-exported from the SDK's main entry point. Users can't import { Image } from '@agentfield/sdk' without an index update.
fetch dependency — Audio.fromUrl uses global fetch() but this isn't available in Node.js < 18. Should document the requirement or add a polyfill guard.
No input validation — fromFile doesn't check if the file exists before reading. A clear error message like "File not found: /path" would be better than a raw ENOENT.
MultimodalResponse.save() creates directories implicitly — Uses fs.mkdir with recursive: true which is fine, but should be documented.

Minor Suggestions

Consider making fromUrl sync since it doesn't do I/O (just stores the URL)
Add .svg to IMAGE_MIME_TYPES (image/svg+xml)
The Text class could extend or implement a common ContentPart interface for type discrimination

Verdict

Good first implementation. Address items 1-2 (async consistency + index export) before merge. Items 3-5 are nice-to-haves for a follow-up.

santoshkumarradha

PR #321 Review: MultimodalResponse.ts Implementation

Summary

PR #321 adds a new TypeScript SDK file (sdk/typescript/src/ai/MultimodalResponse.ts, 537 lines) implementing multimodal response handling. Analysis based on PR diff and comparison against Issue #91 requirements reveals 2 missing functions and field name inconsistencies that need clarification before merge.

Overall Assessment: ~80% aligned with requirements. MEDIUM severity blockers identified.

Missing Functionality (Critical)

Function	Required	Status	Action Needed
toMultimodalResponse	Yes	NOT FOUND	Clarify if alias for createMultimodalResponse
parseMultimodalResponse	Yes	NOT FOUND	Clarify if alias for createMultimodalResponse

Recommendation: Confirm with author if these are aliases or need separate implementation.

Field Name Inconsistencies (Medium)

Interface	Python SDK (Expected)	PR #321 (Actual)	Impact
ImageOutput.base64	base64	b64Json	Naming mismatch
ImageOutput.mimeType	mimeType	(missing)	Field missing
AudioOutput.base64	base64	data	Naming mismatch
AudioOutput.mimeType	mimeType	format	Naming mismatch

Recommendation: Decide whether to align with Python SDK conventions (base64, mimeType) or keep TypeScript-native naming (b64Json, format).

Convention Violations

Note: Direct file analysis blocked - files not in this repo. Based on PR diff review.

Underscore prefix on private fields - _text, _audio, _images, _files, _rawResponse, _costUsd, _usage violates TS SDK convention
Missing rate limiting - No withRateLimitRetry pattern in MultimodalResponse.ts
Duplicated base64 decoding - getImageBytes, getAudioBytes, getFileBytes repeat similar logic
Inconsistent naming - hasImage vs other patterns
Missing usage/cost extraction - createMultimodalResponse factory doesn't extract usage/cost
No error context - Generic error messages lack debugging context

Recommendations

Add or clarify toMultimodalResponse and parseMultimodalResponse functions
Standardize field names to match Python SDK or document why TS diverges
Consider refactoring base64 decoding into a shared utility to reduce duplication
Add rate limiting using existing withRateLimitRetry pattern from codebase
Add error context to improve debuggability

Data Sources

PR #321 diff (task t-83e0)
Issue #91 requirements analysis (task t-04ff)
Gap analysis (task t-d6d0)
TS SDK conventions reference (task t-6102)

Status: Ready for author response on missing functions and naming conventions.

santoshkumarradha · 2026-04-02T01:50:32Z

Ignore tests for https://github.com/Agent-Field/SWE-AF updates

feat(sdk/ts): add multimodal helpers for image, audio, file inputs/ou…

004ba87

…tputs

santoshkumarradha requested review from a team and AbirAbbas as code owners April 1, 2026 14:09

santoshkumarradha commented Apr 1, 2026

View reviewed changes

santoshkumarradha closed this Apr 2, 2026

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat(sdk/ts): Add multimodal helpers#321

feat(sdk/ts): Add multimodal helpers#321
santoshkumarradha wants to merge 1 commit intomainfrom
feat/ts-multimodal-helpers

santoshkumarradha commented Apr 1, 2026

Uh oh!

github-actions bot commented Apr 1, 2026

Uh oh!

santoshkumarradha left a comment

Uh oh!

santoshkumarradha left a comment

Uh oh!

santoshkumarradha commented Apr 2, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

santoshkumarradha commented Apr 1, 2026

Uh oh!

github-actions bot commented Apr 1, 2026

Performance

Uh oh!

santoshkumarradha left a comment

Choose a reason for hiding this comment

Code Review — PR #321: Multimodal Helpers

Overall Assessment: ✅ Good implementation

Strengths

Issues to Address

Minor Suggestions

Verdict

Uh oh!

santoshkumarradha left a comment

Choose a reason for hiding this comment

PR #321 Review: MultimodalResponse.ts Implementation

Summary

Missing Functionality (Critical)

Field Name Inconsistencies (Medium)

Convention Violations

Recommendations

Data Sources

Uh oh!

santoshkumarradha commented Apr 2, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant