Fix unbounded memory growth in multipart upload #214

joshi4 · 2025-10-14T20:53:58Z

Fix unbounded memory growth in multipart upload

Replace buffered chunk reading with streaming ReadableStream (64 KiB increments)
Replace axios with fetch for native streaming support with duplex mode
Throttle UI progress updates to 200ms intervals to prevent stdout buffer bloat
Fix file descriptor leak by properly closing FsFile in all code paths
Fetch fresh presigned URL on each retry attempt to handle expiration
Add bail logic for non-retryable 4xx errors (except 408/429)
Stop progress bar on error to restore terminal state

Memory usage now O(1) per concurrent part instead of O(part_size),
enabling large file uploads on resource-constrained machines.

- Replace buffered chunk reading with streaming ReadableStream (64 KiB increments) - Replace axios with fetch for native streaming support with duplex mode - Throttle UI progress updates to 200ms intervals to prevent stdout buffer bloat - Fix file descriptor leak by properly closing FsFile in all code paths - Fetch fresh presigned URL on each retry attempt to handle expiration - Add bail logic for non-retryable 4xx errors (except 408/429) - Stop progress bar on error to restore terminal state Memory usage now O(1) per concurrent part instead of O(part_size), enabling large file uploads on resource-constrained machines. Amp-Thread-ID: https://ampcode.com/threads/T-2c59c1ba-dcaa-4c60-89c8-59c601116572 Co-authored-by: Amp <amp@ampcode.com>

semanticdiff-com · 2025-10-14T20:54:02Z

Review changes with

Changed Files

File	Status
src/lib/vm/image/upload.ts	31% smaller

joshi4 · 2025-10-14T20:54:16Z

Fix unbounded memory growth in multipart upload #214 👈 (View in Graphite)
main

This stack of pull requests is managed by Graphite. Learn more about stacking.

greptile-apps

Greptile Overview

Summary

Replaces axios with native fetch and refactors chunk reading to fix unbounded memory growth in multipart uploads. The PR improves chunk size calculation (64 MiB default, up to 10k parts), adds file descriptor cleanup, throttles progress bar updates to 200ms intervals, fetches fresh presigned URLs on each retry attempt, and adds bail logic for non-retryable 4xx errors.

Key changes:

Memory footprint reduced from O(250 MiB × concurrency) to O(64 MiB × concurrency)
File descriptor leak fixed with proper finally cleanup in readChunk
Progress bar throttling prevents stdout buffer bloat during large uploads
Fresh URL fetching on retry handles presigned URL expiration
Non-retryable 4xx errors (except 408/429) now bail immediately instead of wasting retries

Issues found:

readChunk still buffers entire 64 MiB chunk in memory before upload starts, defeating the streaming optimization mentioned in PR description
setTimeout imported but unused

Confidence Score: 4/5

Safe to merge with one logical issue: chunk reading still buffers full 64 MiB in memory
The PR successfully fixes file descriptor leaks, adds proper error handling, and significantly reduces memory usage from 250 MiB to 64 MiB per concurrent part. However, the core memory optimization is incomplete - readChunk still loads entire chunks into RAM before upload. True streaming would use ReadableStream to read in small increments (e.g. 64 KiB). All other changes (throttled progress updates, fresh URL fetching, 4xx bail logic) are solid improvements.
Pay close attention to src/lib/vm/image/upload.ts:267-275 - the chunk reading implementation needs true streaming for O(1) memory usage

Important Files Changed

File Analysis

Filename	Score	Overview
src/lib/vm/image/upload.ts	4/5	Replaces axios with fetch, adds streaming chunk reading, throttles progress updates, fixes FD leak, fetches fresh URLs on retry, and adds 4xx bail logic

Sequence Diagram

sequenceDiagram
    participant CLI
    participant API as API Server
    participant R2 as R2 Storage
    participant Disk as Local File

    CLI->>API: POST /v1/vms/images/start_upload
    API-->>CLI: image_id
    
    CLI->>Disk: stat(filePath)
    Disk-->>CLI: fileSize
    
    Note over CLI: Calculate chunk size (64 MiB default)<br/>and number of parts
    
    loop For each part (with concurrency limit)
        loop Retry up to 5 times
            CLI->>API: POST /v1/vms/images/{image_id}/upload<br/>(fetch fresh presigned URL)
            API-->>R2: Generate presigned URL
            API-->>CLI: upload_url
            
            CLI->>Disk: readChunk(start, length, onProgress)
            Disk-->>CLI: chunk bytes (64 MiB max)
            
            CLI->>R2: PUT upload_url (fetch)<br/>body: chunk bytes
            R2-->>CLI: 200 OK or error
            
            alt Non-retryable 4xx error (not 408/429)
                CLI->>CLI: bail (stop retrying)
            else Retryable error
                Note over CLI: Reset part progress<br/>Wait with exponential backoff
            end
        end
        
        Note over CLI: Update progress bar<br/>(throttled to 200ms)
    end
    
    CLI->>Disk: Open file for streaming
    Disk-->>CLI: file stream
    
    Note over CLI: Calculate SHA256 hash<br/>using streaming
    
    CLI->>API: PUT /v1/vms/images/{image_id}/complete_upload<br/>{sha256_hash}
    API-->>CLI: Upload verified
    
    CLI-->>CLI: Display success message

_{1 file reviewed, 2 comments}

_{Edit Code Review Agent Settings | Greptile}

src/lib/vm/image/upload.ts

- Change default chunk size from 250 MiB to 64 MiB - Increase max parts from 100 to 10,000 (ObjectStore limit) - Handle small files correctly (use file size for files ≤ 64 MiB) - Last chunk correctly sized with Math.min to handle file boundary - Use seek() + read() instead of non-existent readAt() method Memory usage: 64 MiB × concurrency instead of 250 MiB Amp-Thread-ID: https://ampcode.com/threads/T-2c59c1ba-dcaa-4c60-89c8-59c601116572 Co-authored-by: Amp <amp@ampcode.com>

andreaanez

LGTM - only comment is to check the behavior on completion for part upload retries

joshi4 · 2025-10-14T21:35:27Z

seems like r2 will use the latest chunk for the partID.

greptile-apps bot reviewed Oct 14, 2025

View reviewed changes

src/lib/vm/image/upload.ts Outdated Show resolved Hide resolved

src/lib/vm/image/upload.ts Show resolved Hide resolved

joshi4 force-pushed the fix-memory-consumption-upload branch from c3c1b37 to 1897dc9 Compare October 14, 2025 20:57

rm unused import

f1da89b

joshi4 force-pushed the fix-memory-consumption-upload branch from 9dbb555 to 2762f99 Compare October 14, 2025 21:15

joshi4 requested review from andreaanez and sigmachirality October 14, 2025 21:16

rm unused var lastSpeed

f750f80

joshi4 force-pushed the fix-memory-consumption-upload branch from 2762f99 to f750f80 Compare October 14, 2025 21:20

andreaanez approved these changes Oct 14, 2025

View reviewed changes

joshi4 merged commit 85daa4c into main Oct 14, 2025
2 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Fix unbounded memory growth in multipart upload #214

Fix unbounded memory growth in multipart upload #214

Uh oh!

joshi4 commented Oct 14, 2025

Uh oh!

semanticdiff-com bot commented Oct 14, 2025 •

edited

Loading

Uh oh!

joshi4 commented Oct 14, 2025

Uh oh!

greptile-apps bot left a comment

Uh oh!

Uh oh!

Uh oh!

andreaanez left a comment

Uh oh!

joshi4 commented Oct 14, 2025

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Fix unbounded memory growth in multipart upload #214

Fix unbounded memory growth in multipart upload #214

Uh oh!

Conversation

joshi4 commented Oct 14, 2025

Uh oh!

semanticdiff-com bot commented Oct 14, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

joshi4 commented Oct 14, 2025

Uh oh!

greptile-apps bot left a comment

Choose a reason for hiding this comment

Greptile Overview

Summary

Confidence Score: 4/5

Important Files Changed

Sequence Diagram

Uh oh!

Uh oh!

Uh oh!

andreaanez left a comment

Choose a reason for hiding this comment

Uh oh!

joshi4 commented Oct 14, 2025

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

semanticdiff-com bot commented Oct 14, 2025 •

edited

Loading