refactor: mpu redesign#54
Conversation
📝 Walkthrough## Walkthrough
The changes update authentication type handling across configuration, main entry, workflow, and backend logic, switching from `"none"` to `"default"`. Multipart upload tracking in the Swift backend is refactored to use per-session files instead of a centralized index. Worker retry logic is adjusted to exclude client errors from retries.
## Changes
| File(s) | Change Summary |
|---------------------------------------|------------------------------------------------------------------------------------------------------------------|
| .github/workflows/tests.yml | Changed `AUTH_TYPE` in test job from `"none"` to `"default"`. |
| src/config/types.ts | Updated `auth_type` enum and default from `["service_account", "none"]` to `["service_account", "default"]`. |
| src/main.ts | Removed startup warning for `auth_type` set to `"none"`. |
| src/backends/mod.ts | Refactored signature verification: now only runs for `"default"` auth type; adjusted access checks accordingly. |
| src/backends/swift/objects.ts | Refactored multipart upload tracking to use per-session files instead of a central index; updated related logic. |
| src/backends/worker.ts | Changed retry logic to only retry on server errors (status ≥ 500), not on client errors (status 4xx). |
| tests/swift/basic/multipartupload_test.ts | Removed creation of empty multipart uploads index file in tests. |
## Sequence Diagram(s)
```mermaid
sequenceDiagram
participant Client
participant Server
participant AuthModule
Client->>Server: Request (with possible signature)
alt auth_type == "default"
Server->>AuthModule: verifyV4Signature()
AuthModule-->>Server: Verification result
Server->>Server: Check service account access
else auth_type != "default"
Server->>Server: Check service account access only
end
Server-->>Client: ResponsesequenceDiagram
participant Client
participant SwiftBackend
Client->>SwiftBackend: Initiate Multipart Upload
SwiftBackend->>SwiftBackend: Write session metadata to uploadId.json
Client->>SwiftBackend: Upload Part
SwiftBackend->>SwiftBackend: Update parts metadata in uploadId.json
Client->>SwiftBackend: Complete Multipart Upload
SwiftBackend->>SwiftBackend: Delete uploadId.json and finalize upload
Client->>SwiftBackend: Abort Multipart Upload
SwiftBackend->>SwiftBackend: Delete uploadId.json
Client->>SwiftBackend: List Multipart Uploads
SwiftBackend->>SwiftBackend: Read all uploadId.json files and aggregate
Possibly related PRs
Suggested labels
Suggested reviewers
Learnt from: destifo Learnt from: destifo Learnt from: destifo Learnt from: destifo Learnt from: destifo Learnt from: destifo |
There was a problem hiding this comment.
Actionable comments posted: 3
🔭 Outside diff range comments (3)
src/backends/swift/objects.ts (3)
756-781: Inconsistent implementation in completeMultipartUpload.The function attempts to fetch and update data at
reqUrl(the object URL) instead of the session file created duringcreateMultipartUpload. This doesn't align with the new per-session file approach.The code should be updated to:
- Read from the session file at
${MULTIPART_UPLOADS_PATH}/${uploadId}.json- Delete the session file after successful completion
- Not attempt to filter uploads array as each session has its own file
-// Fetch the existing index file to remove the upload metadata +// Fetch and delete the session file after completion try { - const fetchIndexFunc = async () => { - return await fetch(reqUrl, { + const sessionPath = `${MULTIPART_UPLOADS_PATH}/${uploadId}.json`; + const sessionUrl = `${swiftUrl}/${bucket}/${sessionPath}`; + + // Delete the session file + const deleteSessionFunc = async () => { + return await fetch(sessionUrl, { - method: "GET", + method: "DELETE", headers: headers, }); }; - const indexResponse = await retryWithExponentialBackoff(fetchIndexFunc); + const deleteResponse = await retryWithExponentialBackoff(deleteSessionFunc); - if (isOk(indexResponse) && unwrapOk(indexResponse).ok) { - const successResponse = unwrapOk(indexResponse); - const indexData = await successResponse.json(); - // Filter out the completed upload - const updatedUploads = indexData.uploads.filter(( - upload: { uploadId: string }, - ) => upload.uploadId !== uploadId); - - // Update the index file - const updateIndexFunc = async () => { - return await fetch(reqUrl, { - method: "PUT", - headers: headers, - body: JSON.stringify(updatedUploads), - }); - }; - await retryWithExponentialBackoff(updateIndexFunc); - logger.info(`Removed upload ${uploadId} from multipart uploads index`); + if (isOk(deleteResponse) && unwrapOk(deleteResponse).status === 204) { + logger.info(`Deleted multipart upload session file for ${uploadId}`); } else { logger.warn( - `Failed to fetch multipart uploads index: ${ - !isOk(indexResponse) - ? unwrapErr(indexResponse).message - : unwrapOk(indexResponse).statusText + `Failed to delete multipart upload session file: ${ + !isOk(deleteResponse) + ? unwrapErr(deleteResponse).message + : unwrapOk(deleteResponse).statusText }`, ); }
1174-1375: listMultipartUploads still uses centralized index approach.The
listMultipartUploadsfunction still expects a centralized index file at${MULTIPART_UPLOADS_PATH}/index.json, which is inconsistent with the new per-session file approach implemented increateMultipartUpload.This function needs to be updated to:
- List all files in the
${MULTIPART_UPLOADS_PATH}/directory- Read individual session files to build the uploads list
- Apply filtering and pagination on the aggregated data
Would you like me to help implement a solution that lists and aggregates individual session files for the multipart uploads listing?
1031-1172: abortMultipartUpload still uses centralized index approach.Similar to
listMultipartUploads, this function attempts to update a centralized index file which no longer exists with the new per-session approach.Update to simply delete the session file:
- // First, update the multipart uploads index to remove this upload - const multipartIndexPath = `${MULTIPART_UPLOADS_PATH}/index.json`; - const multipartIndexUrl = `${swiftUrl}/${bucket}/${multipartIndexPath}`; + // Delete the session file for this upload + const sessionPath = `${MULTIPART_UPLOADS_PATH}/${uploadId}.json`; + const sessionUrl = `${swiftUrl}/${bucket}/${sessionPath}`; - // Try to fetch the existing index file - let existingUploads = []; try { - const fetchIndexFunc = async () => { - return await fetch(multipartIndexUrl, { - method: "GET", - headers: headers, - }); - }; - // ... (remove all the index update logic) + const deleteSessionFunc = async () => { + return await fetch(sessionUrl, { + method: "DELETE", + headers: headers, + }); + }; + + const deleteResponse = await retryWithExponentialBackoff(deleteSessionFunc); + + if (isOk(deleteResponse) && unwrapOk(deleteResponse).status === 204) { + logger.info(`Deleted multipart upload session file for ${uploadId}`); + } else { + logger.warn(`Failed to delete session file for upload ${uploadId}`); + }
🧹 Nitpick comments (1)
src/backends/worker.ts (1)
157-158: Good change to exclude client errors from retries, but consider handling rate limits.Excluding 4xx errors from retries is correct as they typically indicate client-side issues. However, some client errors like 429 (Too Many Requests) might benefit from retries with appropriate backoff.
Consider implementing more sophisticated error categorization:
-// FIXME: filter out which errors can be ignored and which should be retried -while (!isOk(res) || unwrapOk(res).status >= 500) { +// Determine if error is retryable +const shouldRetry = (result: Result<Response, Error>): boolean => { + if (!isOk(result)) return true; // Network/connection errors + + const status = unwrapOk(result).status; + // Retry server errors and rate limits + return status >= 500 || status === 429 || status === 503; +}; + +while (shouldRetry(res)) {Would you like me to create an issue to track implementing a comprehensive retry strategy that handles rate limiting, circuit breaking, and jitter?
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (6)
.github/workflows/tests.yml(1 hunks)src/backends/mod.ts(1 hunks)src/backends/swift/objects.ts(6 hunks)src/backends/worker.ts(1 hunks)src/config/types.ts(1 hunks)src/main.ts(0 hunks)
💤 Files with no reviewable changes (1)
- src/main.ts
🧰 Additional context used
🧠 Learnings (4)
.github/workflows/tests.yml (2)
Learnt from: destifo
PR: expnt/herald#41
File: .github/workflows/tests.yml:118-118
Timestamp: 2025-06-16T14:41:56.939Z
Learning: In GitHub Actions CI with start-server-and-test, deno serve command works fine without explicit --host=0.0.0.0 binding when the test URL targets http://0.0.0.0:8000/. The CI environment allows the test runner to reach the server even with default deno serve binding behavior.
Learnt from: destifo
PR: expnt/herald#41
File: .github/workflows/tests.yml:118-118
Timestamp: 2025-06-16T14:41:42.937Z
Learning: In GitHub Actions CI environment, `deno serve` without explicit `--host=0.0.0.0` flag works fine for `start-server-and-test` targeting `http://0.0.0.0:8000/` in the Herald project.
src/backends/worker.ts (3)
Learnt from: destifo
PR: expnt/herald#11
File: src/workers/mod.ts:0-0
Timestamp: 2025-01-24T12:45:29.893Z
Learning: In Deno applications, web worker initialization can fail due to module loading issues (missing files, syntax errors, broken imports), resource constraints (memory limits, process limits), or permission issues (Deno security restrictions, file system access). Consider adding retry mechanisms in production environments.
Learnt from: destifo
PR: expnt/herald#10
File: src/backends/s3/buckets.ts:0-0
Timestamp: 2025-01-28T17:36:22.149Z
Learning: The `routeQueryParamedRequest` and `headBucket` functions in `src/backends/s3/buckets.ts` already implement retry logic for replica failover when the primary request fails, supporting both S3 and Swift replicas.
Learnt from: destifo
PR: expnt/herald#5
File: src/backends/tasks.ts:0-0
Timestamp: 2025-01-08T06:46:12.523Z
Learning: In the Herald project, when managing worker availability for task processing, workers are removed from the workers map while processing a task and added back when the task is completed or fails. This prevents other tasks from using a busy worker.
src/backends/mod.ts (4)
Learnt from: destifo
PR: expnt/herald#3
File: src/main.ts:68-71
Timestamp: 2025-01-14T23:52:40.127Z
Learning: In the herald project, token verification errors in the auth module are handled by throwing HTTPException, which is then caught by the global error handler in main.ts. Additional try-catch blocks around the verification calls are not needed.
Learnt from: destifo
PR: expnt/herald#41
File: src/backends/mirror.ts:166-174
Timestamp: 2025-06-16T14:42:52.174Z
Learning: In src/backends/mirror.ts, when creating replica requests that reuse originalRequest.headers, the authentication headers (Authorization, x-amz-date, etc.) are automatically flushed/replaced down the request pipeline by functions like s3.putObject, so there's no signature mismatch issue with forwarding the original headers.
Learnt from: destifo
PR: expnt/herald#10
File: src/backends/mirror.ts:170-172
Timestamp: 2025-01-28T17:41:43.238Z
Learning: In the mirror operations (src/backends/mirror.ts), bucket existence is guaranteed by design as mirror tasks are only created for buckets that exist in the bucket store, which is initialized during startup. Therefore, non-null assertions when finding buckets in the store are safe.
Learnt from: destifo
PR: expnt/herald#10
File: src/backends/s3/buckets.ts:0-0
Timestamp: 2025-01-28T17:36:22.149Z
Learning: The `routeQueryParamedRequest` and `headBucket` functions in `src/backends/s3/buckets.ts` already implement retry logic for replica failover when the primary request fails, supporting both S3 and Swift replicas.
src/backends/swift/objects.ts (4)
Learnt from: destifo
PR: expnt/herald#49
File: src/backends/task_store.ts:170-228
Timestamp: 2025-06-26T12:48:25.255Z
Learning: In src/backends/task_store.ts, the TaskStore class creates Request objects with localhost URLs that are not actually used for HTTP requests. Instead, these Request objects serve as structured data containers that get passed to backend storage implementations (s3PutObject/swiftPutObject) which extract the necessary information (bucket, key, body, headers) and make the actual requests to the configured storage backends.
Learnt from: destifo
PR: expnt/herald#10
File: src/backends/s3/buckets.ts:0-0
Timestamp: 2025-01-28T17:36:22.149Z
Learning: The `routeQueryParamedRequest` and `headBucket` functions in `src/backends/s3/buckets.ts` already implement retry logic for replica failover when the primary request fails, supporting both S3 and Swift replicas.
Learnt from: destifo
PR: expnt/herald#41
File: src/backends/swift/mod.ts:318-323
Timestamp: 2025-06-16T14:40:12.564Z
Learning: In conversion functions that return `Result<Response, Error>`, the Result represents whether the conversion process succeeded, not the HTTP semantics. A function like `convertSwiftUploadPartToS3Response` should return `createOk(errorResponse)` when it successfully converts to a proper S3 error response, and `createErr(conversionError)` only when the conversion process itself fails.
Learnt from: destifo
PR: expnt/herald#35
File: src/backends/swift/mod.ts:0-0
Timestamp: 2025-06-11T20:21:26.693Z
Learning: For the Swift backend's PUT Object path (`convertSwiftPutObjectToS3Response`), Swift can return only 404 (container missing), 411 (length required), and 422 (checksum mismatch). It will not surface a 408 timeout here, so mapping all handled error cases to HTTP 400 in the S3 response is correct.
🧬 Code Graph Analysis (1)
src/backends/mod.ts (3)
src/utils/signer.ts (1)
verifyV4Signature(335-402)src/config/mod.ts (1)
globalConfig(23-23)src/auth/mod.ts (1)
hasBucketAccess(225-238)
⏰ Context from checks skipped due to timeout of 90000ms (1)
- GitHub Check: test-full
🔇 Additional comments (2)
.github/workflows/tests.yml (1)
100-100: LGTM!The AUTH_TYPE environment variable update correctly aligns with the authentication type changes in
src/config/types.ts.src/backends/swift/objects.ts (1)
590-619: Good refactoring to per-session multipart upload files.The change from a centralized index to individual session files improves scalability and reduces contention. Each upload session now has its own metadata file at
${MULTIPART_UPLOADS_PATH}/${uploadId}.json.
There was a problem hiding this comment.
Actionable comments posted: 2
🧹 Nitpick comments (4)
src/backends/swift/objects.ts (4)
590-643: Consider extracting XML error response generation to reduce duplication.The XML error response is constructed twice with identical content. Consider extracting this to a helper function for better maintainability.
+function createInternalServerErrorXmlResponse(): Response { + const xmlError = `<?xml version="1.0" encoding="UTF-8"?>\n<Error>\n <Code>InternalError</Code>\n <Message>Failed to save multipart upload metadata</Message>\n <RequestId>dummy-request-id</RequestId>\n <HostId>dummy-host-id</HostId>\n</Error>`; + return new Response(xmlError, { + status: 500, + headers: { + "Content-Type": "application/xml", + "x-amz-request-id": "dummy-request-id", + "x-amz-id-2": "dummy-host-id", + }, + }); +} if (!isOk(putResponse)) { const errRes = unwrapErr(putResponse); const errMessage = `Failed to save multipart upload session: ${errRes.message}`; logger.warn(errMessage); reportToSentry(errMessage); - const xmlError = `<?xml version="1.0" encoding="UTF-8"?>\n<Error>\n <Code>InternalError</Code>\n <Message>Failed to save multipart upload metadata</Message>\n <RequestId>dummy-request-id</RequestId>\n <HostId>dummy-host-id</HostId>\n</Error>`; - return createOk( - new Response(xmlError, { - status: 500, - headers: { - "Content-Type": "application/xml", - "x-amz-request-id": "dummy-request-id", - "x-amz-id-2": "dummy-host-id", - }, - }), - ); + return createOk(createInternalServerErrorXmlResponse()); }
1205-1214: Improve type safety for the parts array.The
partsproperty usesArray<unknown>which provides minimal type safety. Based on the part metadata structure used inuploadPart, this can be more specific.+interface PartMetadata { + partNumber: string; + eTag: string; + size: number; + lastModified: string; +} interface MultipartUploadSession { uploadId: string; bucket: string; objectKey: string; initiated: string; initiator: { ID: string; DisplayName: string }; owner: { ID: string; DisplayName: string }; storageClass: string; - parts?: Array<unknown>; + parts?: PartMetadata[]; }
1267-1270: Apply optional chaining for cleaner code.The static analysis tool correctly identified an opportunity to use optional chaining.
uploadJsonFiles = (listData || []) .filter((item: { name: string }) => - item.name && item.name.startsWith(multipartIndexPrefix) && - item.name.endsWith(".json") + item.name?.startsWith(multipartIndexPrefix) && + item.name?.endsWith(".json") ) .map((item: { name: string }) => item.name);
590-1434: Consider the architectural trade-offs of this multipart upload redesign.This refactor introduces significant architectural changes with important trade-offs:
Benefits:
- Better isolation of upload sessions
- No single point of contention for the index file
- Easier cleanup of individual upload sessions
Concerns:
- Performance:
listMultipartUploadsnow requires N+1 HTTP requests vs. 1- Race Conditions: Concurrent
uploadPartcalls for the same upload may conflict when updating session files- Complexity: Increased code complexity and error handling surface area
Recommendations:
- Consider implementing optimistic locking or compare-and-swap semantics for session file updates
- Evaluate adding a caching layer for frequently accessed session metadata
- Monitor performance impact in production, especially for high-concurrency scenarios
- Consider batching session updates or using atomic operations where possible
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
src/backends/swift/objects.ts(12 hunks)
🧰 Additional context used
🧠 Learnings (2)
📓 Common learnings
Learnt from: destifo
PR: expnt/herald#10
File: src/backends/s3/buckets.ts:0-0
Timestamp: 2025-01-28T17:36:22.149Z
Learning: The `routeQueryParamedRequest` and `headBucket` functions in `src/backends/s3/buckets.ts` already implement retry logic for replica failover when the primary request fails, supporting both S3 and Swift replicas.
src/backends/swift/objects.ts (4)
Learnt from: destifo
PR: expnt/herald#10
File: src/backends/s3/buckets.ts:0-0
Timestamp: 2025-01-28T17:36:22.149Z
Learning: The `routeQueryParamedRequest` and `headBucket` functions in `src/backends/s3/buckets.ts` already implement retry logic for replica failover when the primary request fails, supporting both S3 and Swift replicas.
Learnt from: destifo
PR: expnt/herald#41
File: src/backends/swift/mod.ts:318-323
Timestamp: 2025-06-16T14:40:12.564Z
Learning: In conversion functions that return `Result<Response, Error>`, the Result represents whether the conversion process succeeded, not the HTTP semantics. A function like `convertSwiftUploadPartToS3Response` should return `createOk(errorResponse)` when it successfully converts to a proper S3 error response, and `createErr(conversionError)` only when the conversion process itself fails.
Learnt from: destifo
PR: expnt/herald#35
File: src/backends/swift/mod.ts:0-0
Timestamp: 2025-06-11T20:21:26.693Z
Learning: For the Swift backend's PUT Object path (`convertSwiftPutObjectToS3Response`), Swift can return only 404 (container missing), 411 (length required), and 422 (checksum mismatch). It will not surface a 408 timeout here, so mapping all handled error cases to HTTP 400 in the S3 response is correct.
Learnt from: destifo
PR: expnt/herald#49
File: src/backends/task_store.ts:170-228
Timestamp: 2025-06-26T12:48:25.255Z
Learning: In src/backends/task_store.ts, the TaskStore class creates Request objects with localhost URLs that are not actually used for HTTP requests. Instead, these Request objects serve as structured data containers that get passed to backend storage implementations (s3PutObject/swiftPutObject) which extract the necessary information (bucket, key, body, headers) and make the actual requests to the configured storage backends.
🪛 Biome (1.9.4)
src/backends/swift/objects.ts
[error] 1268-1268: Change to an optional chain.
Unsafe fix: Change to an optional chain.
(lint/complexity/useOptionalChain)
⏰ Context from checks skipped due to timeout of 90000ms (1)
- GitHub Check: test-full
🔇 Additional comments (1)
src/backends/swift/objects.ts (1)
876-880: Good addition of upload ID validation.The validation ensures uploadId is present, improving robustness and providing clear error messages for malformed requests.
Migration notes
Summary by CodeRabbit
Refactor
Chores