
feat(ai): experimental_transcribe({maxDownloadSizeInBytes})#9481

Closed
gr2m wants to merge 5 commits into main from transcribe-maxDownloadSizeInBytes

Conversation

@gr2m
Collaborator

@gr2m gr2m commented Oct 13, 2025

Background

Summary

Manual Verification

Checklist

  • Tests have been added / updated (for bug fixes / features)
  • Documentation has been added / updated (for bug fixes / features)
  • A patch changeset for relevant packages has been added (for bug fixes / features - run pnpm changeset in the project root)
  • Formatting issues have been fixed (run pnpm prettier-fix in the project root)
  • I have reviewed this pull request (self-review)

Future Work

Related Issues

@gr2m gr2m force-pushed the transcribe-maxDownloadSizeInBytes branch from 2a2539c to 3bc4ce1 Compare October 13, 2025 21:51
@lgrammel
Collaborator

Could affect generate/stream text/object url downloads

@gr2m gr2m added the backport label (Admins only: add this label to a pull request in order to backport it to the prior version) Oct 14, 2025
@gr2m gr2m marked this pull request as ready for review October 20, 2025 22:05
@gr2m
Collaborator Author

gr2m commented Oct 20, 2025

> Could affect generate/stream text/object url downloads

I confirmed that this applies, for example, to the snippet below, which currently offers no way to set a maximum download size.

```ts
import { generateText } from 'ai';
import { openai } from '@ai-sdk/openai';

await generateText({
  model: openai('gpt-4o'),
  messages: [
    {
      role: 'user',
      content: [
        { type: 'text', text: 'Describe the image in detail.' },
        {
          type: 'image',
          image:
            'https://github.com/vercel/ai/blob/main/examples/ai-core/data/comic-cat.png?raw=true',

          // OpenAI specific option - image detail:
          providerOptions: {
            openai: { imageDetail: 'low' },
          },
        },
      ],
    },
  ],
});
```

Should we add `maxDownloadSizeInBytes` to all the methods exported by `ai` that might do downloads?

@lgrammel
Collaborator

lgrammel commented Oct 21, 2025

> Should we add `maxDownloadSizeInBytes` to all the methods exported by `ai` that might do downloads?

Adding `maxDownloadSizeInBytes` to all methods breaks encapsulation of the download configuration and leads to unspecified behavior when a custom download function is in use.

Instead, I'd prefer to expose our download function somehow, with the max download size as an option (though there will be a naming conflict with the upcoming download for the files API that we need to resolve). Users could then plug in custom download functions (e.g. our download function configured with a different limit), and the concepts stay clearly separated.
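A minimal sketch of that idea: a factory that returns a download function with a configured size limit, which callers pass to the API call instead of a per-method size option. The `DownloadFunction` shape and error messages here are illustrative, not the SDK's actual types:

```ts
// Hypothetical sketch, not the SDK's actual API: a factory that bakes a
// size limit into a reusable download function.
type DownloadFunction = (options: {
  url: URL;
}) => Promise<{ data: Uint8Array; mediaType: string | undefined }>;

function createDownload({ maxBytes }: { maxBytes: number }): DownloadFunction {
  return async ({ url }) => {
    const response = await fetch(url);

    // Reject early when the server announces a too-large body.
    const lengthHeader = response.headers.get('content-length');
    if (lengthHeader !== null && Number(lengthHeader) > maxBytes) {
      throw new Error(
        `refusing download of ${lengthHeader} bytes (limit: ${maxBytes})`,
      );
    }

    const data = new Uint8Array(await response.arrayBuffer());
    if (data.byteLength > maxBytes) {
      throw new Error(
        `download of ${data.byteLength} bytes exceeds limit of ${maxBytes}`,
      );
    }

    return {
      data,
      mediaType: response.headers.get('content-type') ?? undefined,
    };
  };
}
```

A caller could then pass something like `download: createDownload({ maxBytes: 50 * 1024 ** 2 })` to an individual call, keeping size-limit configuration out of each method's signature.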

@gr2m
Collaborator Author

gr2m commented Feb 11, 2026

closing in favor of #12445

@gr2m gr2m closed this Feb 11, 2026
gr2m added a commit that referenced this pull request Feb 12, 2026
- Replace unbounded `arrayBuffer()`/`blob()` calls in `download()` and
`downloadBlob()` with streaming reads that enforce a **2 GiB default
size limit**
- Add `abortSignal` passthrough from callers (`transcribe`,
`generateVideo`) to `fetch()`
- Check `Content-Length` header for early rejection before reading body
- Track bytes incrementally via `ReadableStream.getReader()`, abort with
`DownloadError` when limit exceeded
- Expose configurable `download` parameter on `transcribe()` and
`experimental_generateVideo()` (instead of adding a new
`maxDownloadSize` argument) — keeps download config separate from API
function signatures
- Export `createDownload({ maxBytes })` factory from `ai` for custom
size limits

closes #9481 / addresses
#9481 (comment)

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: vercel[bot] <35613825+vercel[bot]@users.noreply.github.com>
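The streaming size check described in the bullets above can be sketched roughly as follows. `readWithLimit` and this `DownloadError` are stand-in names for illustration; the actual implementation lives in the SDK's `download()` helper:

```ts
// Illustrative sketch of a streaming, size-limited body read: track bytes
// incrementally via ReadableStream.getReader() and abort when the limit
// is exceeded, instead of buffering an unbounded body with arrayBuffer().
class DownloadError extends Error {}

const TWO_GIB = 2 * 1024 ** 3; // default limit from the commit message

async function readWithLimit(
  body: ReadableStream<Uint8Array>,
  maxBytes: number = TWO_GIB,
): Promise<Uint8Array> {
  const reader = body.getReader();
  const chunks: Uint8Array[] = [];
  let received = 0;
  try {
    while (true) {
      const { done, value } = await reader.read();
      if (done) break;
      received += value.length;
      if (received > maxBytes) {
        // Stop reading and discard the rest of the stream.
        await reader.cancel();
        throw new DownloadError(`download exceeded ${maxBytes} bytes`);
      }
      chunks.push(value);
    }
  } finally {
    reader.releaseLock();
  }

  // Concatenate the collected chunks into one buffer.
  const out = new Uint8Array(received);
  let offset = 0;
  for (const chunk of chunks) {
    out.set(chunk, offset);
    offset += chunk.length;
  }
  return out;
}
```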
gr2m added a commit that referenced this pull request Feb 12, 2026 (same commit message as above)
