
refactor: worker mgmt #16

Merged
destifo merged 26 commits into main from refactor/worker-mgmt
Mar 12, 2025

Conversation

@destifo
Contributor

@destifo destifo commented Feb 11, 2025

  • Listen to each bucket's task queue from within a dedicated worker.
  • Fixes a bug where task order was shuffled on re-enqueue: a task that was dequeued while no worker was available to process it was put back in the queue, out of order.
  • Each primary bucket now has its own KV store and a worker listening to it.
  • Auth in Herald can now be configured as none, i.e. no auth scheme is applied apart from signature verification.
  • Swift Keystone authentication is now done periodically rather than on every request. Every 55 minutes, the token and storage URL for each Swift config are cached, and incoming requests use the cached token for auth. With this change, requests to the Swift servers take about half the time. This also fixes a bug where the PUT object request body was consumed while calling the Keystone service.
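
For reviewers, a minimal sketch of the periodic Keystone caching described in the last bullet. The names `KeystoneCache`, `CachedAuth`, `FetchAuth`, and `startPeriodicRefresh` are hypothetical and are not the actual `KeystoneTokenStore` API; the real Keystone call would be asynchronous, and it is replaced here by an injected synchronous stand-in so the sketch stays self-contained.

```typescript
// Hypothetical sketch: cache Keystone auth per Swift config and refresh it on a timer.
const REFRESH_INTERVAL_MS = 55 * 60 * 1000; // 55 minutes, as stated in the PR description

interface CachedAuth {
  token: string;
  storageUrl: string;
}

// Stand-in for the real (async) Keystone request; an assumption for illustration.
type FetchAuth = (configName: string) => CachedAuth;

class KeystoneCache {
  private readonly entries = new Map<string, CachedAuth>();
  private readonly configNames: string[];
  private readonly fetchAuth: FetchAuth;

  constructor(configNames: string[], fetchAuth: FetchAuth) {
    this.configNames = configNames;
    this.fetchAuth = fetchAuth;
  }

  // Re-fetch the token and storage URL for every Swift config.
  refreshAll(): void {
    for (const name of this.configNames) {
      this.entries.set(name, this.fetchAuth(name));
    }
  }

  // Request handlers read from the cache instead of calling Keystone.
  get(configName: string): CachedAuth | undefined {
    return this.entries.get(configName);
  }
}

// Fill the cache once, then refresh it on a fixed interval.
function startPeriodicRefresh(target: KeystoneCache): ReturnType<typeof setInterval> {
  target.refreshAll();
  return setInterval(() => target.refreshAll(), REFRESH_INTERVAL_MS);
}

let keystoneCalls = 0;
const cache = new KeystoneCache(["swift-a", "swift-b"], (name) => {
  keystoneCalls++;
  return { token: `token-for-${name}`, storageUrl: `https://swift.example/${name}` };
});
cache.refreshAll();
```

Because handlers only read from the map, the request body is never touched by an auth round trip, which is the property the last bullet relies on.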

Migration notes


  • The change comes with new or modified tests
  • Hard-to-understand functions have explanatory comments
  • End-user documentation is updated to reflect the change

Summary by CodeRabbit

  • New Features

    • Introduced enhanced authentication options with dynamic token management and periodic context refresh.
    • Updated default storage bucket configuration for improved data organization.
  • Refactor

    • Streamlined bucket and task management workflows to boost performance and stability.
    • Improved error handling and retry logic across cloud storage operations.
    • Added new methods for managing bucket configurations and authentication tokens.
    • Enhanced the Bucket class with JSON deserialization capabilities and additional properties.
  • Chores

    • Updated dependency configurations and import paths to maintain consistency.
    • Adjusted test configurations to align with production behavior.
    • Added a new configuration file for Deno project imports.

@destifo destifo added the enhancement and bug fix labels Feb 11, 2025
@destifo destifo self-assigned this Feb 11, 2025
@coderabbitai

coderabbitai Bot commented Feb 11, 2025

📝 Walkthrough


This pull request introduces widespread refactoring and enhancements across configuration, backend, and test modules. Key changes include the addition of new ignore patterns for database files, unifying bucket configuration via a new Bucket type, and overhauling task queue and worker management with simplified functions and context updates. Authentication logic now incorporates a new getAuthType function and a Keystone token store for Swift backends. Additionally, dependency import paths have been standardized and updated, with corresponding adjustments reflected in test files and configuration manifests.

Changes

| File(s) | Change Summary |
| --- | --- |
| .gitignore | Added new entries: *.db, *.db-shm, and *.db-wal to ignore database files. |
| src/backends/mirror.ts, src/backends/types.ts, src/backends/worker.ts | Refactored bucket configuration handling; updated function signatures to use the unified Bucket type; added task conversion and enhanced worker messaging (e.g., convertMessageToTask, onUpdateContext, onStart). |
| src/backends/task_store.ts | Enhanced TaskStore to support multiple task queues via new parameters, updated serialization/deserialization methods, and added helper method #getDbName. |
| src/backends/tasks.ts, src/workers/mod.ts | Replaced the taskHandler function with startWorkers and refreshWorkersContext; updated worker registration to refresh context periodically. |
| src/constants/message.ts, src/constants/time.ts | Introduced new constants: TASK_QUEUE_DB ("taskQueue.db") and TASK_TIMEOUT (240000 ms). |
| src/auth/mod.ts, src/backends/mod.ts | Added and integrated getAuthType to conditionally adjust bucket access logic based on authentication type. |
| src/backends/s3/buckets.ts, src/backends/s3/objects.ts | Removed explicit type casting; refined retry logic based on replica presence; improved error logging in S3 operations. |
| src/backends/swift/buckets.ts, src/backends/swift/objects.ts, src/backends/swift/auth.ts | Updated Swift backend functions by replacing token retrieval calls with context-based auth metadata retrieval; refined retry logic and logging for bucket/object operations. |
| src/buckets/mod.ts | Extended the Bucket class with a new static method fromJSON and added getters for bucketName and isReplica, enhancing JSON deserialization and configuration access. |
| src/config/types.ts, src/main.ts | Added an auth_type property to the configuration schema; updated main initialization to incorporate a keystoneStore and conditional service account verification based on auth type. |
| src/utils/url.ts | Added a new retries parameter (default 3) to forwardRequestWithTimeouts and adjusted retryWithExponentialBackoff usage accordingly. |
| .ghjk/deno.jsonc, .ghjk/lock.json, import_map.json | Updated dependency configuration: standardized import paths (trailing slashes), removed ESM-specific aws-sdk imports, and revised package versions in lock file. |
| tools/compose/compose.s3.yml | Modified MINIO_DEFAULT_BUCKETS environment variable from "test" to "task-store". |
| Test files (e.g., paths starting with tests/mirror/, tests/s3/, tests/swift/, tests/iac/, tests/utils/ and utils/file.ts) | Updated import statements to replace aws-sdk ESM imports with standard versions and adjusted std/path imports to include trailing slashes. |
| src/backends/swift/keystone_token_store.ts | Introduced new KeystoneTokenStore class and an exported initKeystoneStore function for managing Swift authentication tokens with periodic refresh. |
| src/config/loader.ts, src/types/mod.ts, src/utils/log.ts | Minor updates: removed redundant commented-out logger initialization and added keystoneStore to HeraldContext; standardized import paths. |

Suggested reviewers

  • zifeo
  • hailatGH
  • edraobdu


📜 Recent review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 0b37a43 and f09946a.

📒 Files selected for processing (1)
  • .github/workflows/tests.yml (1 hunks)
🔇 Additional comments (1)
.github/workflows/tests.yml (1)

59-59: Updated Deno Version: Ensure Compatibility of v2.2.3
The update to use Deno v2.2.3 instead of the previous version looks good. Please verify that all Deno-dependent steps in the workflow (such as running tests or other scripts) are fully compatible with this version to avoid any unforeseen issues.



@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 3

🔭 Outside diff range comments (1)
src/backends/mirror.ts (1)

39-54: ⚠️ Potential issue

Close KV connection after use.

The KV connection opened with Deno.openKv should be closed after use to prevent resource leaks.

 export async function enqueueMirrorTask(ctx: HeraldContext, task: MirrorTask) {
   const bucket = getBucketFromTask(task);
   const kv = await Deno.openKv(`${bucket}_${TASK_QUEUE_DB}`);
+  try {
     const lockedStorages = ctx.taskStore.lockedStorages;
     const nonce = crypto.randomUUID(); // Unique identifier for the task
     task.nonce = nonce;
     logger.debug(
       `Enqueing task: ${task.command} for primary: ${task.mainBucketConfig.typ} to replica: ${task.backupBucketConfig.typ}`,
     );

     // Atomic transaction to add the task to the queue
     const storageKey = getStorageKey(task.backupBucketConfig.config);
     const currentCount = lockedStorages.get(storageKey) || 0;
     lockedStorages.set(storageKey, currentCount + 1);

     await kv.enqueue(task);
+  } finally {
+    kv.close();
+  }
   logger.debug(
     `Task enqueued: ${task.command} for primary: ${task.mainBucketConfig.typ} to replica: ${task.backupBucketConfig.typ}`,
   );
 }
🧹 Nitpick comments (3)
src/backends/worker.ts (1)

31-45: Consider additional handling for timed-out tasks.
Currently, a timed-out task logs an error, but there's no logic to re-queue or rollback partial updates. Reviewing whether tasks should automatically re-queue or be marked as failed might help robust error handling.

src/backends/tasks.ts (1)

53-53: Check error handling for future worker failures.
After startWorkers(ctx), consider what happens if a worker fails unexpectedly. You may want to watch for worker termination to gracefully restart.

src/backends/task_store.ts (1)

69-84: Be mindful of potential memory usage while serializing.
Looping over the entire key space and using JSON.stringify can be expensive if there are many tasks. Consider chunking or streaming if queues become large.
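
One possible shape for the chunking suggested above, as a sketch only. `QueuedTask` and `serializeInChunks` are hypothetical names, and the real TaskStore serialization format is not shown in this review, so the task fields here are assumptions.

```typescript
// Hypothetical sketch: serialize queued tasks in fixed-size chunks
// instead of building one large JSON string for the whole queue.
interface QueuedTask {
  nonce: string;
  command: string;
}

function* serializeInChunks(
  tasks: Iterable<QueuedTask>,
  chunkSize: number,
): Generator<string> {
  if (chunkSize < 1) throw new RangeError("chunkSize must be at least 1");
  let chunk: QueuedTask[] = [];
  for (const task of tasks) {
    chunk.push(task);
    if (chunk.length === chunkSize) {
      yield JSON.stringify(chunk); // emit a full chunk and start a new one
      chunk = [];
    }
  }
  if (chunk.length > 0) yield JSON.stringify(chunk); // emit the remainder
}

const sampleTasks: QueuedTask[] = [
  { nonce: "a", command: "putObject" },
  { nonce: "b", command: "putObject" },
  { nonce: "c", command: "deleteObject" },
  { nonce: "d", command: "copyObject" },
  { nonce: "e", command: "putObject" },
];
const chunks = Array.from(serializeInChunks(sampleTasks, 2));
```

Each yielded string can be uploaded or written separately, so peak memory is bounded by the chunk size rather than the queue length, and task order is preserved across chunks.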

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 000aa14 and 7f0e558.

📒 Files selected for processing (10)
  • .gitignore (1 hunks)
  • src/backends/mirror.ts (4 hunks)
  • src/backends/s3/mod.ts (1 hunks)
  • src/backends/task_store.ts (8 hunks)
  • src/backends/tasks.ts (2 hunks)
  • src/backends/types.ts (1 hunks)
  • src/backends/worker.ts (1 hunks)
  • src/constants/message.ts (1 hunks)
  • src/constants/time.ts (1 hunks)
  • src/workers/mod.ts (1 hunks)
✅ Files skipped from review due to trivial changes (3)
  • src/workers/mod.ts
  • .gitignore
  • src/constants/message.ts
🧰 Additional context used
🧠 Learnings (1)
src/backends/tasks.ts (1)
Learnt from: destifo
PR: expnt/herald#5
File: src/backends/tasks.ts:0-0
Timestamp: 2025-01-08T06:46:12.523Z
Learning: In the Herald project, when managing worker availability for task processing, workers are removed from the workers map while processing a task and added back when the task is completed or fails. This prevents other tasks from using a busy worker.
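
The learning above can be sketched as follows. `WorkerPool` and its methods are hypothetical names used for illustration; the actual workers map in Herald is not reproduced here.

```typescript
// Hypothetical sketch of the "remove while busy, add back when done" pattern.
class WorkerPool<W> {
  private readonly idle = new Map<string, W>();

  add(name: string, worker: W): void {
    this.idle.set(name, worker);
  }

  // Take the worker out of the map so no other task can pick it up.
  acquire(name: string): W | undefined {
    const worker = this.idle.get(name);
    if (worker !== undefined) this.idle.delete(name);
    return worker;
  }

  // Put the worker back once its task has completed or failed.
  release(name: string, worker: W): void {
    this.idle.set(name, worker);
  }
}

const pool = new WorkerPool<string>();
pool.add("bucket-a", "worker-a"); // a string stands in for a real Worker
```

A caller would typically wrap task processing in try/finally so that `release` runs on both the success and failure paths.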
🔇 Additional comments (12)
src/backends/worker.ts (3)

7-8: Confirm the constants exist and match the intended usage.
These newly imported constants (TASK_QUEUE_DB, TASK_TIMEOUT) need to be defined or exported properly in their respective files. Make sure they're the correct references to avoid runtime errors.


15-15: Interface rename looks correct.
Renaming from a mirror-specific interface to a more generic StartMessage provides clarity on what the worker expects to receive. Good practice for code maintainability.


26-29: Check for resource handling in KV usage.
A Deno.openKv(dbName) call is persistent. Consider confirming whether you need to close or dispose the KV store after the worker finishes or fails tasks, to avoid potential resource leaks.

src/backends/tasks.ts (2)

10-10: Using the bucket name for the worker is a good approach.
Naming the worker after the bucket makes logs and debugging simpler. Ensure consistency with how self.name is referenced in worker.ts.


45-48: Initialize workers with context carefully.
Posting a cloned version of the context is reasonable, but confirm that all required properties serialize properly. If the context is large or contains functions, you may encounter serialization issues.

src/backends/task_store.ts (3)

35-43: Multi-bucket KV initialization looks good.
Creating a map of [bucketName, Deno.Kv] effectively segments tasks per bucket. This design scales well for multi-bucket usage.
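
To illustrate why per-bucket segmentation keeps task order stable, here is an in-memory sketch. It deliberately uses plain arrays in place of Deno KV, and `BucketQueues` and `Task` are hypothetical names, so it models the design rather than the actual TaskStore.

```typescript
// Hypothetical in-memory stand-in for the per-bucket KV queues.
interface Task {
  bucket: string;
  command: string;
}

class BucketQueues {
  private readonly queues = new Map<string, Task[]>();

  constructor(bucketNames: string[]) {
    for (const name of bucketNames) this.queues.set(name, []);
  }

  // Tasks are appended to the queue of the bucket they belong to.
  enqueue(task: Task): void {
    const queue = this.queues.get(task.bucket);
    if (queue === undefined) throw new Error(`Unknown bucket: ${task.bucket}`);
    queue.push(task);
  }

  // Each bucket's single consumer takes tasks in FIFO order.
  dequeue(bucket: string): Task | undefined {
    const queue = this.queues.get(bucket);
    return queue === undefined ? undefined : queue.shift();
  }
}

const queues = new BucketQueues(["photos", "logs"]);
queues.enqueue({ bucket: "photos", command: "put-1" });
queues.enqueue({ bucket: "logs", command: "put-2" });
queues.enqueue({ bucket: "photos", command: "delete-3" });
```

With one consumer per bucket, a task never has to be handed back to a shared queue because no worker was free, which is the re-enqueue path the PR description identifies as the source of reordering.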


187-189: Bucket-based path is well-structured.
Placing each queue under <bucket>/queue.json is straightforward and keeps remote storage organized by bucket.


301-304: Initialization of the task store is consistent.
Passing Object.keys(config.buckets) aligns with the new multi-bucket logic in TaskStore. Good job keeping it consistent with the rest of the store’s structure.

src/constants/time.ts (1)

5-5: Verify the task timeout duration.

A 4-minute timeout seems quite long. This could potentially lead to resource exhaustion if many tasks hit the maximum timeout duration simultaneously.

Could you provide the rationale for choosing 240 seconds? Consider:

  1. What's the average task completion time in production?
  2. What's the 95th percentile task completion time?
  3. Would a shorter timeout with retry logic be more appropriate?
src/backends/types.ts (1)

22-29: LGTM! Good addition to support bucket-specific task queues.

The addition of the bucket field to MirrorTask interface helps track which bucket a task belongs to, supporting the PR objective of preventing task order shuffling.

src/backends/s3/mod.ts (1)

72-72: Verify empty string handling for x-amz-copy-source header.

The condition has been simplified from !== undefined to a truthy check. While cleaner, this changes how empty strings are handled. Please verify that empty strings for x-amz-copy-source should be treated the same as undefined.

❌ Verification inconclusive

🏁 Script executed:

#!/bin/bash
# Search for tests or handling of empty x-amz-copy-source header
rg -l "x-amz-copy-source.*\"\"" .

Length of output: 33


Manual Verification Required: Confirm Empty String Handling for x-amz-copy-source Header

The current changes simplify the condition to a truthy check, meaning an empty string (a falsy value) will now be treated the same as undefined. Our initial search for tests or explicit handling of empty strings provided no results. As a result, we couldn't verify whether the empty string scenario is covered by tests or if the behavior change is intentional. Please manually verify that an empty string for x-amz-copy-source is acceptable or if it requires special handling.
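
A small sketch of which inputs the two conditions disagree on, assuming the old condition was literally `value !== undefined` and the new one is a plain truthiness test. `oldCheck`, `newCheck`, and the sample values are illustrative only.

```typescript
// Compare the old and new conditions for each possible header value.
type HeaderValue = string | null | undefined;

const oldCheck = (value: HeaderValue): boolean => value !== undefined;
const newCheck = (value: HeaderValue): boolean => Boolean(value);

const cases: Array<[string, HeaderValue]> = [
  ["undefined", undefined],
  ["null", null],
  ["empty string", ""],
  ["non-empty value", "/source-bucket/key"],
];

// Labels of the inputs on which the two checks give different answers.
const differing = cases
  .filter(([, value]) => oldCheck(value) !== newCheck(value))
  .map(([label]) => label);
```

Here `differing` contains "null" and "empty string": both passed the old comparison and both fail the truthy check, so those are the two cases worth confirming manually.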

src/backends/mirror.ts (1)

25-29: LGTM! Clean implementation of bucket extraction.

The getBucketFromTask function cleanly handles both S3 and Swift configurations.

Comment thread src/backends/worker.ts
Comment thread src/backends/task_store.ts
Comment thread src/backends/mirror.ts Outdated
@destifo destifo requested review from edraobdu and zifeo February 12, 2025 09:45

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 3

🔭 Outside diff range comments (2)
src/backends/mirror.ts (1)

30-46: 💡 Verification agent

🧩 Analysis chain

Verify task queue persistence.

The task queue is now stored in a bucket-specific KV store. Ensure that:

  1. The KV store is properly closed after use
  2. The task order is preserved during re-enqueuing

🏁 Script executed:

#!/bin/bash
# Search for KV store cleanup patterns
rg -A 5 'openKv\(' 

Length of output: 1637


I'll now verify if there is any explicit cleanup of the KV store (i.e. a call to a close() or similar method) after its use, and I'll also search for evidence that the enqueued tasks maintain their order during any potential re-enqueuing.

I'll generate a script to search for any instance of .close( in the repository so we can check whether the KV stores are being closed.


🏁 Script executed:

#!/bin/bash
rg -A 5 '\.close\('

Length of output: 457


KV Store Not Closed and Unverified Task Order Preservation

  • The code in src/backends/mirror.ts opens a KV store via Deno.openKv(...) but never explicitly calls a closure method such as kv.close(). This could lead to resource leakage if the function is called repeatedly.
  • There is no visible mechanism or safeguard ensuring that the task order is maintained during re-enqueuing; the reliance on kv.enqueue(task) implies that ordering depends solely on the underlying KV implementation, which has not been explicitly verified here.
src/backends/swift/buckets.ts (1)

135-137: 🛠️ Refactor suggestion

Add retry count to createBucket operation.

The createBucket function's retry logic should also be updated to use the conditional retry count like other operations.

   const response = await retryWithExponentialBackoff(
     fetchFunc,
+    bucketConfig.hasReplicas() || bucketConfig.isReplica ? 1 : 3,
   );

Also applies to: 191-193

♻️ Duplicate comments (1)
src/backends/worker.ts (1)

58-60: ⚠️ Potential issue

Fix undefined variable reference.

The variable name is used in both the log message and database name construction but is never declared. This will cause a runtime error.

Apply this diff to fix the reference:

- logger.info(`Worker started listening to tasks for bucket: ${name}`);
- const dbName = `${name}_${TASK_QUEUE_DB}`;
+ logger.info(`Worker started listening to tasks for bucket: ${self.name}`);
+ const dbName = `${self.name}_${TASK_QUEUE_DB}`;
🧹 Nitpick comments (9)
src/backends/worker.ts (1)

67-69: Consider documenting the timeout duration.

The timeout duration is already defined in TASK_TIMEOUT, but its value (240000 ms = 4 minutes) is unexplained. Consider adding a comment explaining the rationale for this specific duration.

src/buckets/mod.ts (3)

25-26: Replace 'any' type with proper interface.

The use of // deno-lint-ignore no-explicit-any can be avoided by defining a proper type for the replica JSON structure.

Apply this diff to improve type safety:

+interface BucketJSON {
+  _name: string;
+  _config: S3Config | SwiftConfig;
+  _replicas?: BucketJSON[];
+  _typ: string;
+  _backend: string;
+}

 public static fromJSON(json: {
   _name: string;
   _config: S3Config | SwiftConfig;
   _replicas?: object[];
   _typ: string;
   _backend: string;
 }): Bucket {
   const replicas = json._replicas
-    // deno-lint-ignore no-explicit-any
-    ? json._replicas.map((replicaJson: any) => Bucket.fromJSON(replicaJson))
+    ? json._replicas.map((replicaJson: BucketJSON) => Bucket.fromJSON(replicaJson))
     : [];

45-49: Consider using type guard for safer type casting.

The type casting in bucketName getter could be made safer using type guards.

Apply this diff to improve type safety:

+private isS3Config(config: S3Config | SwiftConfig): config is S3Config {
+  return this.typ === "S3BucketConfig" || this.typ === "ReplicaS3Config";
+}

 get bucketName() {
-  return this.typ === "S3BucketConfig" || this.typ === "ReplicaS3Config"
-    ? (this.config as S3Config).bucket
-    : (this.config as SwiftConfig).container;
+  return this.isS3Config(this.config)
+    ? this.config.bucket
+    : this.config.container;
 }

71-73: Use strict equality comparison.

The isReplica getter uses loose equality (==) which is generally discouraged in TypeScript/JavaScript.

Apply this diff:

 get isReplica() {
-  return this.typ == "ReplicaS3Config" || this.typ == "ReplicaSwiftConfig";
+  return this.typ === "ReplicaS3Config" || this.typ === "ReplicaSwiftConfig";
 }
src/backends/s3/objects.ts (2)

169-169: Fix typo in success messages.

The word "Successful" is misspelled as "Successfull" in multiple log messages.

Apply this diff:

- logger.info(`Delete Object Successfull: ${response.statusText}`);
+ logger.info(`Delete Object Successful: ${response.statusText}`);
- logger.info(`Copy Object Successfull: ${response.statusText}`);
+ logger.info(`Copy Object Successful: ${response.statusText}`);
- logger.info(`Head Object Successfull: ${response.statusText}`);
+ logger.info(`Head Object Successful: ${response.statusText}`);

Also applies to: 208-208, 265-265


165-167: Remove duplicate error logging in deleteObject.

The error message is logged twice with slightly different variable names (errMesage vs direct string).

Apply this diff:

 if (response.status != 204) {
-  const errMesage = `Delete Object Failed: ${response.statusText}`;
-  logger.warn(`Delete Object Failed: ${response.statusText}`);
-  reportToSentry(errMesage);
+  const errMessage = `Delete Object Failed: ${response.statusText}`;
+  logger.warn(errMessage);
+  reportToSentry(errMessage);
 }
src/backends/mirror.ts (1)

139-143: Improve error handling for S3 URL generation.

The getDownloadS3Url function should handle potential errors when extracting request information.

-    const getObjectUrl = getDownloadS3Url(
-      originalRequest,
-      primary.config as S3Config,
-    );
+    const reqMeta = s3Utils.extractRequestInfo(originalRequest);
+    if (!reqMeta.bucket || !reqMeta.objectKey) {
+      const errMessage = "Invalid request: missing bucket or object key";
+      logger.error(errMessage);
+      reportToSentry(errMessage);
+      return;
+    }
+    const getObjectUrl = getDownloadS3Url(originalRequest, primary.config as S3Config);
src/config/types.ts (1)

124-124: Consider enhancing the auth_type schema with environment-specific validation.

While the schema correctly defines the auth types with a secure default, it could benefit from additional validation to prevent insecure configurations in production.

Consider adding a custom validator:

-  auth_type: z.enum(["service_account", "none"]).default("service_account"),
+  auth_type: z.enum(["service_account", "none"])
+    .default("service_account")
+    .superRefine((val, ctx) => {
+      if (val === "none" && process.env.NODE_ENV === "production") {
+        ctx.addIssue({
+          code: z.ZodIssueCode.custom,
+          message: "Auth type 'none' is not allowed in production",
+        });
+      }
+    }),
src/auth/mod.ts (1)

232-234: Add documentation and logging to the getAuthType function.

The function should include JSDoc documentation and logging for better maintainability and observability.

Consider adding:

+/**
+ * Returns the configured authentication type.
+ * @returns {"service_account" | "none"} The current authentication type
+ */
 export function getAuthType() {
+  logger.debug(`Using authentication type: ${envVarsConfig.auth_type}`);
   return envVarsConfig.auth_type;
 }
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 7f0e558 and cf84b97.

📒 Files selected for processing (13)
  • src/auth/mod.ts (1 hunks)
  • src/backends/mirror.ts (12 hunks)
  • src/backends/mod.ts (2 hunks)
  • src/backends/s3/buckets.ts (5 hunks)
  • src/backends/s3/objects.ts (7 hunks)
  • src/backends/swift/buckets.ts (16 hunks)
  • src/backends/swift/objects.ts (15 hunks)
  • src/backends/types.ts (2 hunks)
  • src/backends/worker.ts (1 hunks)
  • src/buckets/mod.ts (3 hunks)
  • src/config/types.ts (1 hunks)
  • src/main.ts (2 hunks)
  • src/utils/url.ts (2 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • src/backends/types.ts
🧰 Additional context used
🧠 Learnings (2)
src/backends/s3/buckets.ts (1)
Learnt from: destifo
PR: expnt/herald#10
File: src/backends/s3/buckets.ts:0-0
Timestamp: 2025-01-28T17:36:22.149Z
Learning: The `routeQueryParamedRequest` and `headBucket` functions in `src/backends/s3/buckets.ts` already implement retry logic for replica failover when the primary request fails, supporting both S3 and Swift replicas.
src/backends/swift/buckets.ts (1)
Learnt from: destifo
PR: expnt/herald#10
File: src/backends/s3/buckets.ts:0-0
Timestamp: 2025-01-28T17:36:22.149Z
Learning: The `routeQueryParamedRequest` and `headBucket` functions in `src/backends/s3/buckets.ts` already implement retry logic for replica failover when the primary request fails, supporting both S3 and Swift replicas.
🔇 Additional comments (9)
src/backends/worker.ts (1)

71-74: LGTM! Robust timeout handling implementation.

The use of Promise.race with a timeout promise is a good pattern for handling long-running tasks.
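
For reference, a minimal sketch of the Promise.race timeout pattern. `withTimeout` is a hypothetical helper and is not the code in src/backends/worker.ts; only the TASK_TIMEOUT value is taken from this PR's change summary.

```typescript
const TASK_TIMEOUT = 240000; // ms, value listed in this PR's change summary

// Hypothetical helper: reject if `work` has not settled within `ms` milliseconds.
function withTimeout<T>(work: Promise<T>, ms: number = TASK_TIMEOUT): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<never>((_, reject) => {
    timer = setTimeout(() => reject(new Error(`Task timed out after ${ms} ms`)), ms);
  });
  // Whichever promise settles first wins; the timer is cleared either way
  // so a finished task does not keep the process alive.
  return Promise.race([work, timeout]).finally(() => clearTimeout(timer));
}
```

Note that losing the race does not cancel `work`: the underlying operation keeps running unless it is aborted separately, which is why the earlier comment about re-queuing or marking timed-out tasks as failed still applies.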

src/backends/s3/buckets.ts (2)

102-102: LGTM! Improved retry logic for replicas.

The retry count adjustment (1 for replicas, 3 for primary) aligns well with the existing replica failover mechanism.

Also applies to: 153-153


107-109: LGTM! Enhanced error logging for replica failover.

The addition of detailed logging for primary bucket failures and replica attempts improves observability.

Also applies to: 157-160

src/backends/s3/objects.ts (1)

23-23: LGTM! Consistent retry logic implementation.

The retry count adjustment (1 for replicas, 3 for primary) is consistently implemented across all object operations.

Also applies to: 69-69, 236-236
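
The retry-count rule can be sketched as below. The condition mirrors the `hasReplicas() || isReplica ? 1 : 3` expression suggested earlier in this review; `BucketLike`, `retriesFor`, `backoffDelays`, and the base delay and factor are assumptions for illustration, since the actual parameters of retryWithExponentialBackoff are not shown here.

```typescript
// Minimal shape needed for the rule; the real Bucket class has more members.
interface BucketLike {
  isReplica: boolean;
  hasReplicas(): boolean;
}

// One attempt when failover to a replica is possible (or the bucket is itself
// a replica); three attempts when the primary is the only copy.
function retriesFor(bucket: BucketLike): number {
  return bucket.hasReplicas() || bucket.isReplica ? 1 : 3;
}

// Delays before each retry under exponential backoff (assumed base and factor).
function backoffDelays(retries: number, baseMs = 100, factor = 2): number[] {
  return Array.from({ length: retries }, (_, attempt) => baseMs * factor ** attempt);
}

const standalonePrimary: BucketLike = { isReplica: false, hasReplicas: () => false };
const primaryWithReplica: BucketLike = { isReplica: false, hasReplicas: () => true };
const replica: BucketLike = { isReplica: true, hasReplicas: () => false };
```

Keeping the count low when a replica exists avoids spending the whole backoff budget on the primary before failover is attempted.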

src/utils/url.ts (2)

59-63: LGTM! Enhanced retry flexibility.

The addition of the retries parameter with a default value of 3 improves the function's configurability while maintaining backward compatibility.


127-132: LGTM! Consistent retry configuration.

The retry configuration is correctly passed through to the underlying retry mechanism.

src/backends/mirror.ts (1)

18-20: LGTM! Simplified bucket extraction.

The function has been simplified to directly return the bucket name from the main configuration.

src/backends/swift/objects.ts (1)

36-43: LGTM! Improved error handling.

The error handling for authentication token retrieval has been enhanced with proper early returns.

src/backends/swift/buckets.ts (1)

45-53: LGTM! Consistent error handling pattern.

The error handling for authentication token retrieval follows the same improved pattern as in objects.ts.

Also applies to: 63-65

Comment thread src/backends/swift/objects.ts
Comment thread src/backends/mod.ts
Comment thread src/main.ts
@destifo destifo requested a review from hailatGH February 13, 2025 11:50
hailatGH previously approved these changes Feb 20, 2025
zifeo previously approved these changes Mar 11, 2025
@zifeo
Contributor

zifeo commented Mar 11, 2025

@destifo you have a conflict

@destifo destifo dismissed stale reviews from zifeo and hailatGH via e2a34fb March 11, 2025 21:35

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 1

🔭 Outside diff range comments (1)
.ghjk/lock.json (1)

51-55: 🛠️ Refactor suggestion

❓ Verification inconclusive

Consistent Update for zstd_aa in Secondary Entry

The zstd_aa dependency is similarly updated to "v1.4.8," here. Again, check that the version string format (including the trailing comma) is intentional and uniformly processed across the system.


Attention: Verify Trailing Comma in Dependency Version

The zstd_aa dependency in .ghjk/lock.json (lines 51–55) now shows its version as "v1.4.8," (with a trailing comma). Please verify whether this trailing comma is intentional and consistently handled by the system across all dependency entries. If the inclusion of the comma is not by design, consider removing it to maintain uniformity.

🧹 Nitpick comments (1)
.ghjk/lock.json (1)

26-29: Updated Version for zstd_aa Dependency

The version for zstd_aa has been updated to "v1.4.8,". Please double-check that the trailing comma in the version string is deliberate and that version parsing routines (if any) will handle this format correctly.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 9959eee and e2a34fb.

⛔ Files ignored due to path filters (2)
  • .ghjk/deno.lock is excluded by !**/*.lock
  • deno.lock is excluded by !**/*.lock
📒 Files selected for processing (5)
  • .ghjk/deno.jsonc (1 hunks)
  • .ghjk/lock.json (4 hunks)
  • import_map.json (1 hunks)
  • src/backends/swift/mod.ts (1 hunks)
  • tools/compose/compose.s3.yml (1 hunks)
✅ Files skipped from review due to trivial changes (3)
  • tools/compose/compose.s3.yml
  • import_map.json
  • .ghjk/deno.jsonc
🔇 Additional comments (5)
src/backends/swift/mod.ts (1)

110-110: Simplified conditional check looks good.

The conditional check for the "x-amz-copy-source" header now relies on truthiness rather than an explicit comparison against undefined. Headers.get() returns either the value or null (never undefined), so if the value comes from Headers.get(), the old comparison was true even when the header was absent; the truthy check instead treats a missing header, and an empty one, as not set. The code is also more concise.

.ghjk/lock.json (4)

3-3: Renamed Key: moduleEntries → sys_entries

The key has been renamed as intended by the PR objectives. Please ensure that any downstream references (in code, documentation, or configuration files) are updated accordingly.


45-49:

❓ Verification inconclusive

Consistent Update for tar_aa in Secondary Entry

Another occurrence of the tar_aa dependency shows the version now being "1.34". This consistency is good; however, please verify that no references were missed and that all parts of the build process support this change.


Below is a shell script you can run to search for all occurrences of the tar_aa dependency in the repository—especially to check that every instance (including in the lock file) correctly reflects version "1.34". This will help ensure that no references were missed and that all parts of the build process support the change.

Please run the following script:

#!/bin/bash
# Search for all references to 'tar_aa@' in the codebase to verify version consistency.
echo "Searching for 'tar_aa@' references across the repository:"
rg "tar_aa@" -n .

# Additionally, check the specific lock file for consistency.
echo "Checking '.ghjk/lock.json' for 'tar_aa' references:"
rg "tar_aa" -n .ghjk/lock.json

Once you verify the output, please ensure that:

  • Every reference to tar_aa shows version "1.34".
  • There are no lingering outdated references elsewhere in the repository.

Let me know if any discrepancies appear so we can address them accordingly.


20-24:

❓ Verification inconclusive

Updated Version for tar_aa Dependency

The version for tar_aa has been updated to "1.34", consistent with the change noted in the PR summary. Verify that this lower version is compatible with all dependent components and that related tests/documentation are revised if necessary.


Below is a shell script that verifies the tar_aa dependency version in the lock file. Please run it and share the output so we can confirm that the update to version "1.34" (as shown in lines 20‑24) is indeed present and correct.

#!/bin/bash
# Verify that the dependency update for tar_aa is applied in .ghjk/lock.json

FILE=".ghjk/lock.json"

if [[ -f "$FILE" ]]; then
    echo "Displaying lines 20-24 from ${FILE}:"
    sed -n '20,24p' "$FILE"
    
    echo "Searching for tar_aa dependency update to version \"1.34\":"
    grep -R '"version": "1.34"' "$FILE"
else
    echo "Error: ${FILE} not found."
fi

Once we have the output from this script, we can finalize the rewritten review comment.


68-72:

❓ Verification inconclusive

Additional Entry: Updated Version for zstd_aa

Similarly, an additional entry for zstd_aa shows the version update to "v1.4.8,". It would be beneficial to verify that this formatting is consistent across all dependency entries and that no unintentional artifacts (like an extra comma) are introduced.


The shell script below helps verify that the version string for zstd_aa (and similar dependency entries) consistently uses the intended formatting, without unintentional trailing artifacts such as an extra comma. Please run it to gather the necessary verification details. Once you have the results, we can confirm whether the formatting is consistent across dependency entries and whether the extra comma in "v1.4.8," needs to be removed or intentionally retained.

Thank you.


🏁 Script executed:

#!/bin/bash
# Check for version entries with a trailing comma in the .ghjk/lock.json file.
echo "== Checking for version entries with a trailing comma =="
rg -n '"version":\s*"v[0-9]+\.[0-9]+\.[0-9]+,' .ghjk/lock.json

# Check for all entries related to zstd_aa in the file to verify formatting consistency.
echo ""
echo "== Checking for zstd_aa-related entries =="
rg -n 'zstd_aa' .ghjk/lock.json

Length of output: 952


Review: Verify Trailing Comma in zstd_aa Version Update

  • The .ghjk/lock.json file shows that the version string "v1.4.8," (with a trailing comma) is present consistently (e.g., lines 26, 51, and 68).
  • Please verify that this trailing comma is intentional and that no unintentional formatting artifacts have been introduced across other dependency entries.
  • If the comma is not required, update the version strings across the file accordingly.

Comment thread .ghjk/lock.json

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 1

🧹 Nitpick comments (8)
src/backends/swift/objects.ts (1)

346-350: Replica Fallback in getObjectMeta
The fallback on error is helpful for fault tolerance. Consider verifying that multiple attempts won’t cause performance overhead in high-volume scenarios.

src/workers/mod.ts (2)

1-4: Check the import structure and usage.

The addition of refreshWorkersContext is appropriate for ensuring the worker environment stays up-to-date. Confirm that initializeTaskHandler and refreshWorkersContext do not introduce redundant or conflicting behavior when run concurrently.


16-20: Consider potential memory and performance implications of frequent context refresh.

Refreshing the entire context every 5 minutes is fine for many use-cases, but if HeraldContext grows large or sensitive, consider refining which data is refreshed. You may also want to track the overhead of frequent serialization (JSON.stringify) calls.

src/backends/tasks.ts (1)

45-52: Clarify worker startup messaging.

As far as the logic goes, broadcasting { type: "Start" } with the entire serialized context is a straightforward approach. However, if ctx becomes large or contains sensitive tokens, consider limiting it to the data workers actually need.
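
One way to limit the payload, sketched under assumptions: `FullContext`, `WorkerStartMessage`, and `buildStartMessage` are hypothetical, and the real HeraldContext has different fields. Only the `{ type: "Start" }` message shape comes from the comment above.

```typescript
// Hypothetical shapes; the real HeraldContext has different fields.
interface FullContext {
  buckets: string[];
  logLevel: string;
  authTokens: Record<string, string>; // sensitive, should stay on the main thread
}

interface WorkerStartMessage {
  type: "Start";
  ctx: string; // serialized subset of the context
}

// Serialize only what one bucket's worker needs instead of the whole context.
function buildStartMessage(ctx: FullContext, bucketName: string): WorkerStartMessage {
  const subset = {
    buckets: ctx.buckets.filter((name) => name === bucketName),
    logLevel: ctx.logLevel,
  };
  return { type: "Start", ctx: JSON.stringify(subset) };
}

const message = buildStartMessage(
  { buckets: ["a", "b"], logLevel: "info", authTokens: { a: "secret" } },
  "a",
);
console.log(message.ctx); // {"buckets":["a"],"logLevel":"info"}
```

Building the subset explicitly, rather than deleting fields from a copy, means a field added to the context later is not sent to workers by accident.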

src/backends/swift/keystone_token_store.ts (3)

15-32: Log duplicate Swift configs when skipping.

You silently skip registering duplicate Swift configurations without logging. This can lead to confusion if certain configurations are left out. It may be helpful to add a log statement or comment clarifying that a duplicate config was found and skipped.

for (const config of swiftConfigs) {
  const configKey = KeystoneTokenStore.#getConfigKey(config);
  if (configAuthMetas.has(configKey)) {
+   console.debug(`Skipping duplicate Swift config for key: ${configKey}`);
    continue;
  }
  ...
}

47-58: Handle individual refresh errors gracefully.

In refreshTokens(), if fetching a token fails for one of the configurations, the entire method continues silently. Consider logging or storing errors on a per-configuration basis so that partial refresh failures don’t remain undetected.


75-98: Encapsulate token refresh scheduling in a configurable approach.

The current 55-minute hard-coded interval might be better suited as a configuration parameter, allowing more flexibility if token refresh intervals change in the future or differ between environments.
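
A minimal sketch of such a parameter, assuming a hypothetical `tokenRefreshMinutes` option (not an existing Herald setting) that falls back to the current 55 minutes:

```typescript
// Default matches the 55-minute interval mentioned in the review.
const DEFAULT_REFRESH_MINUTES = 55;

// Hypothetical config shape; `tokenRefreshMinutes` is an assumed option name.
interface RefreshConfig {
  tokenRefreshMinutes?: number;
}

// Returns the refresh interval in milliseconds, falling back to the default
// when the value is missing or not a positive finite number.
function refreshIntervalMs(config: RefreshConfig): number {
  const minutes = config.tokenRefreshMinutes;
  if (minutes === undefined || !isFinite(minutes) || minutes <= 0) {
    return DEFAULT_REFRESH_MINUTES * 60 * 1000;
  }
  return minutes * 60 * 1000;
}

console.log(refreshIntervalMs({})); // 3300000
console.log(refreshIntervalMs({ tokenRefreshMinutes: 30 })); // 1800000
console.log(refreshIntervalMs({ tokenRefreshMinutes: -5 })); // 3300000
```

The result could then be passed to whatever timer schedules the refresh, keeping the interval comfortably below the token lifetime of the Keystone deployment in use.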

src/backends/worker.ts (1)

100-117: Consider retry or backoff for failed tasks.

When a task times out or fails, the current implementation only logs an error. For resilience, you may want to retry or re-queue the task, possibly with a backoff. This ensures that transient errors do not permanently drop tasks.
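
A sketch of the bookkeeping such a retry would need, under assumptions: `RetryPolicy`, `backoffDelayMs`, and `shouldRetry` are hypothetical names and none of them come from src/backends/worker.ts.

```typescript
// Hypothetical retry policy for a failed mirror task.
interface RetryPolicy {
  maxAttempts: number;
  baseDelayMs: number;
  maxDelayMs: number;
}

// Delay before the given retry attempt (1-based): base * 2^(attempt - 1), capped.
function backoffDelayMs(attempt: number, policy: RetryPolicy): number {
  const delay = policy.baseDelayMs * Math.pow(2, attempt - 1);
  return Math.min(delay, policy.maxDelayMs);
}

// Decide whether a failed task gets another attempt or is given up on.
function shouldRetry(attempt: number, policy: RetryPolicy): boolean {
  return attempt < policy.maxAttempts;
}

const policy: RetryPolicy = { maxAttempts: 4, baseDelayMs: 500, maxDelayMs: 3000 };
for (let attempt = 1; attempt <= policy.maxAttempts; attempt++) {
  console.log(attempt, backoffDelayMs(attempt, policy), shouldRetry(attempt, policy));
}
// 1 500 true
// 2 1000 true
// 3 2000 true
// 4 3000 false
```

Because this PR is concerned with keeping task order stable, retrying the task in place before dequeuing the next one is likely a better fit than re-queuing it at the tail of the bucket's queue.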

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 413aaa0 and 9abe7db.

⛔ Files ignored due to path filters (1)
  • deno.lock is excluded by !**/*.lock
📒 Files selected for processing (19)
  • import_map.json (1 hunks)
  • src/backends/swift/auth.ts (0 hunks)
  • src/backends/swift/buckets.ts (17 hunks)
  • src/backends/swift/keystone_token_store.ts (1 hunks)
  • src/backends/swift/objects.ts (16 hunks)
  • src/backends/tasks.ts (2 hunks)
  • src/backends/worker.ts (1 hunks)
  • src/config/loader.ts (0 hunks)
  • src/main.ts (4 hunks)
  • src/types/mod.ts (1 hunks)
  • src/utils/log.ts (1 hunks)
  • src/workers/mod.ts (2 hunks)
  • tests/iac/simple_bucket_test/iac_test.ts (1 hunks)
  • tests/s3/iac/simple_bucket_test/iac_test.ts (1 hunks)
  • tests/s3/sdk/upload_test.ts (2 hunks)
  • tests/swift/basic/object_test.ts (1 hunks)
  • tests/swift/iac/simple_bucket_test/iac_test.ts (1 hunks)
  • tests/utils/mod.ts (1 hunks)
  • utils/file.ts (1 hunks)
💤 Files with no reviewable changes (2)
  • src/backends/swift/auth.ts
  • src/config/loader.ts
✅ Files skipped from review due to trivial changes (5)
  • utils/file.ts
  • tests/iac/simple_bucket_test/iac_test.ts
  • tests/swift/iac/simple_bucket_test/iac_test.ts
  • src/utils/log.ts
  • tests/s3/iac/simple_bucket_test/iac_test.ts
🚧 Files skipped from review as they are similar to previous changes (5)
  • src/main.ts
  • tests/swift/basic/object_test.ts
  • tests/s3/sdk/upload_test.ts
  • src/backends/swift/buckets.ts
  • tests/utils/mod.ts
🧰 Additional context used
🧠 Learnings (2)
src/workers/mod.ts (1)
Learnt from: destifo
PR: expnt/herald#5
File: src/backends/tasks.ts:0-0
Timestamp: 2025-03-12T21:08:17.871Z
Learning: In the Herald project, when managing worker availability for task processing, workers are removed from the workers map while processing a task and added back when the task is completed or fails. This prevents other tasks from using a busy worker.
src/backends/swift/objects.ts (1)
Learnt from: destifo
PR: expnt/herald#10
File: src/backends/s3/buckets.ts:0-0
Timestamp: 2025-03-12T21:08:17.871Z
Learning: The `routeQueryParamedRequest` and `headBucket` functions in `src/backends/s3/buckets.ts` already implement retry logic for replica failover when the primary request fails, supporting both S3 and Swift replicas.
🔇 Additional comments (21)
import_map.json (3)

29-29: Ensure Consistent Import Mapping for std/path/
The updated mapping now includes a trailing slash ("std/path/"), which aligns with the new guidelines and ensures consistent module resolution across the project.


33-33: Standardize Import Path for std/fmt/
Adding the trailing slash to "std/fmt/" provides uniformity with other standard library paths, improving maintainability and reducing ambiguity in how modules are resolved.


39-39: Adhere to Consistent Format for std/testing/
The trailing slash in "std/testing/" brings consistency to the import mappings for standard modules, which is beneficial for readability and future maintenance.

src/backends/swift/objects.ts (14)

3-3: Adoption of getSwiftRequestHeaders
No issues found. This updated import clarifies the usage of Swift-specific headers, removing the overhead of an additional token retrieval step.


10-10: Refined Import
Using SwiftConfig from shared config types is consistent with the overall design.


27-27: Potential Stream Handling Concern
Switching from getBodyFromReq(req) to req.body may alter how data is streamed or read. Please confirm that large or chunked payloads are still handled as expected.


37-39: Neatly Retrieving Swift Credentials
This direct approach using ctx.keystoneStore.getConfigAuthMeta(config) simplifies the flow and removes unnecessary steps.


94-96: Consistent Auth Retrieval
Using keystoneStore.getConfigAuthMeta(config) again ensures uniform Swift auth handling across methods.


110-110: Verify Reduced Retries
The condition sets fewer retries (1) if the bucket is a replica or has replicas. Double-check that this aligns with your reliability expectations.


113-117: Replica Fallback for GET
The fallback logic is appropriate. Ensure that the bucket name, logged at line 115, does not reveal sensitive information.


163-165: Consistent Swift Auth Retrieval
Mirroring the earlier approach. No issues found here.


233-235: Uniform Auth Pattern
Same pattern of retrieving Swift configuration, promoting consistency across the codebase.


394-396: Swift Auth Replacement
This continues the uniform token retrieval approach, improving maintainability.


409-409: Duplicate Retry Logic
Same note as line 110 about verifying the reduced retry count for replicas.


413-416: Replica Fallback for HEAD
Consistent fallback logic when the primary bucket fails, with informative logging.


467-469: Auth Meta Retrieval
Matches the repeated approach for extracting storageUrl and authToken. Good for consistency.


501-501: Mirroring Setup
Ensuring bucketConfig is passed to prepareMirrorRequests keeps the mirror logic consistent.

src/types/mod.ts (1)

6-6: Confirm token security scope and handling.

By embedding the keystoneStore into HeraldContext, be sure that sensitive info within the token store is safely managed, particularly if the entire context is later sent to workers or external modules. Consider whether only partial data (e.g., short-lived tokens) is necessary to minimize exposure of secrets.

Would you like me to generate a shell script to search for usages of keystoneStore and confirm if secure handling has been applied consistently?

src/workers/mod.ts (1)

13-13: Minor log message adjustment.

Renaming "Registering Worker: Task Handler" from plural to singular clarifies the action being performed.

src/backends/tasks.ts (1)

65-65: Straightforward worker initialization.

Switching from a direct task handler loop to calling startWorkers is coherent with the new approach of delegating tasks to multiple workers. Keep track of error handling if any worker fails on startup.

src/backends/worker.ts (1)

94-94: Undefined variable reference.
This issue was previously flagged. The variable name is never defined or assigned. This will trigger a reference error.

- logger.info(`Worker started listening to tasks for bucket: ${name}`);
+ logger.info(`Worker started listening to tasks for bucket: ${self.name}`);

Comment thread src/backends/tasks.ts

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 0

🔭 Outside diff range comments (1)
src/backends/mirror.ts (1)

429-434: ⚠️ Potential issue

Fix parameter order in mirrorPutObject call.

The parameters passed to mirrorPutObject are in the wrong order, which could cause runtime errors.

-      await mirrorPutObject(
-        ctx,
-        originalRequest,
-        mainBucketConfig,
-        backupBucketConfig,
-      );
+      await mirrorPutObject(
+        ctx,
+        mainBucketConfig,
+        backupBucketConfig,
+        originalRequest,
+      );
♻️ Duplicate comments (1)
src/backends/swift/objects.ts (1)

109-112: ⚠️ Potential issue

Add retry count to deleteObject operation.

The deleteObject function should also be updated to use the conditional retry count like other operations such as getObject, listObjects, etc.

  const response = await retryWithExponentialBackoff(
    fetchFunc,
+    bucketConfig.hasReplicas() || bucketConfig.isReplica ? 1 : 3,
  );

This also applies to other similar operations like putObject (lines 51-53), copyObject (lines 483-485), and completeMultipartUpload (lines 624-626).
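
Since the same expression recurs across operations, it could be factored into one helper. A minimal sketch, assuming a hypothetical `retryCountFor` function and a `ReplicaAware` interface that mirrors only the two members the suggested expression reads:

```typescript
// Minimal stand-in for the members the expression reads; the real Bucket class has more.
interface ReplicaAware {
  isReplica: boolean;
  hasReplicas(): boolean;
}

// One attempt when a failover path exists (or the bucket is itself a replica),
// three attempts otherwise. `||` binds tighter than `?:`, so this reads as
// (hasReplicas() || isReplica) ? 1 : 3.
function retryCountFor(bucket: ReplicaAware): number {
  return bucket.hasReplicas() || bucket.isReplica ? 1 : 3;
}

const standalone: ReplicaAware = { isReplica: false, hasReplicas: () => false };
const primary: ReplicaAware = { isReplica: false, hasReplicas: () => true };
const replica: ReplicaAware = { isReplica: true, hasReplicas: () => false };

console.log(retryCountFor(standalone)); // 3
console.log(retryCountFor(primary)); // 1
console.log(retryCountFor(replica)); // 1
```

Each call site would then pass `retryCountFor(bucketConfig)` as the second argument, so a later change to the retry policy touches one place.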

🧹 Nitpick comments (6)
src/backends/swift/objects.ts (1)

28-28: Inconsistent request body handling.

The putObject function now directly uses req.body instead of getBodyFromReq(req), but this change is not applied consistently throughout the file.

Other methods like deleteObject, listObjects and getObject still use getBodyFromReq(req). For consistency, consider using the same approach across all methods.

  return await fetch(reqUrl, {
    method: "PUT",
    headers: headers,
-    body: body,
+    body: getBodyFromReq(req),
  });

Or consistently update all other methods to use direct body access.

Also applies to: 47-49

src/backends/s3/objects.ts (4)

169-169: Fix spelling in log message.

The word "Successfull" has a spelling error and should be "Successful".

-  logger.info(`Delete Object Successfull: ${response.statusText}`);
+  logger.info(`Delete Object Successful: ${response.statusText}`);

208-208: Fix spelling in log message.

The word "Successfull" has a spelling error and should be "Successful".

-  logger.info(`Copy Object Successfull: ${response.statusText}`);
+  logger.info(`Copy Object Successful: ${response.statusText}`);

265-265: Fix spelling in log message.

The word "Successfull" has a spelling error and should be "Successful".

-  logger.info(`Head Object Successfull: ${response.statusText}`);
+  logger.info(`Head Object Successful: ${response.statusText}`);

328-328: Use consistent bucket parameter passing.

This line still uses type casting while other similar calls have been updated to use the simplified approach.

-        bucketConfig as S3BucketConfig,
+        bucketConfig,
src/backends/mirror.ts (1)

282-289: Add null check for bucket lookup.

The code assumes the bucket will always be found in the bucketStore, but it's safer to add null checking.

  const primaryBucket = bucketStore.buckets.find((bucket) =>
    bucket.bucketName === replica.bucketName
  )!;
+  if (!primaryBucket) {
+    logger.error(`Primary bucket ${replica.bucketName} not found in bucket store`);
+    return;
+  }

This pattern should be applied to all similar non-null assertions throughout the file (lines 323-330, 364-366, 386-388).

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 9abe7db and cc2c0be.

📒 Files selected for processing (6)
  • src/backends/mirror.ts (12 hunks)
  • src/backends/s3/buckets.ts (5 hunks)
  • src/backends/s3/objects.ts (7 hunks)
  • src/backends/swift/objects.ts (20 hunks)
  • src/backends/types.ts (2 hunks)
  • src/utils/url.ts (2 hunks)
🚧 Files skipped from review as they are similar to previous changes (3)
  • src/backends/s3/buckets.ts
  • src/utils/url.ts
  • src/backends/types.ts
🧰 Additional context used
🧠 Learnings (2)
src/backends/swift/objects.ts (1)
Learnt from: destifo
PR: expnt/herald#10
File: src/backends/s3/buckets.ts:0-0
Timestamp: 2025-03-12T21:08:17.871Z
Learning: The `routeQueryParamedRequest` and `headBucket` functions in `src/backends/s3/buckets.ts` already implement retry logic for replica failover when the primary request fails, supporting both S3 and Swift replicas.
src/backends/mirror.ts (1)
Learnt from: destifo
PR: expnt/herald#10
File: src/backends/mirror.ts:170-172
Timestamp: 2025-03-12T21:08:17.871Z
Learning: In the mirror operations (src/backends/mirror.ts), bucket existence is guaranteed by design as mirror tasks are only created for buckets that exist in the bucket store, which is initialized during startup. Therefore, non-null assertions when finding buckets in the store are safe.
🔇 Additional comments (7)
src/backends/swift/objects.ts (2)

38-40: Authentication mechanism improvement.

The authentication method has been updated to use ctx.keystoneStore.getConfigAuthMeta(config), which is more efficient by retrieving both storage URL and auth token in a single operation.


115-118: Enhanced logging for replica failovers.

The improved logging provides more context when operations fail on the primary bucket and attempt to access replicas, making debugging and monitoring easier.

src/backends/s3/objects.ts (2)

23-23: Improved retry logic based on bucket configuration.

The conditional retry logic that adjusts based on bucket replica configuration is a good optimization, ensuring fewer retries when replicas are available.

Also applies to: 69-69, 236-236


27-30: Enhanced logging for replica failovers.

The addition of detailed warnings when operations fail on primary buckets provides better diagnostics and makes troubleshooting easier.

Also applies to: 73-76, 240-243

src/backends/mirror.ts (3)

31-33: Enhanced task queue with bucket-specific KV stores.

The implementation now uses bucket-specific key-value stores for task queues, which improves isolation and prevents task shuffling across different buckets.
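
The isolation property can be illustrated with an in-memory stand-in. This is a sketch only: `MirrorTask` and `BucketQueues` are hypothetical, and the actual implementation uses persistent KV stores rather than arrays.

```typescript
// In-memory stand-in for one KV-backed queue per primary bucket.
interface MirrorTask {
  bucket: string;
  command: string;
}

class BucketQueues {
  private queues = new Map<string, MirrorTask[]>();

  // Append the task to the queue that belongs to its bucket.
  enqueue(task: MirrorTask): void {
    const queue = this.queues.get(task.bucket) ?? [];
    queue.push(task);
    this.queues.set(task.bucket, queue);
  }

  // Each bucket's worker dequeues only from its own queue, in FIFO order.
  dequeue(bucket: string): MirrorTask | undefined {
    return this.queues.get(bucket)?.shift();
  }
}

const queues = new BucketQueues();
queues.enqueue({ bucket: "a", command: "putObject" });
queues.enqueue({ bucket: "b", command: "putObject" });
queues.enqueue({ bucket: "a", command: "deleteObject" });

console.log(queues.dequeue("a")?.command); // putObject
console.log(queues.dequeue("a")?.command); // deleteObject
console.log(queues.dequeue("a")); // undefined
```

Because a task for bucket "b" never sits between two tasks for bucket "a", a busy or missing worker for one bucket cannot force a re-enqueue that reorders another bucket's tasks.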


19-19: Simplified getBucketFromTask implementation.

The function now directly returns the bucketName from mainBucketConfig, which is a cleaner approach that avoids type-specific logic.


59-59: Streamlined replica handling.

Directly iterating over bucketConfig.replicas instead of filtering from a global config is more efficient and maintainable.


@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 0

🔭 Outside diff range comments (3)
src/backends/swift/objects.ts (3)

483-485: 🛠️ Refactor suggestion

Add retry count to copyObject operation

The copyObject function's retry logic should be updated to use the conditional retry count pattern for consistency with other operations.

  const response = await retryWithExponentialBackoff(
    fetchFunc,
+   bucketConfig.hasReplicas() || bucketConfig.isReplica ? 1 : 3,
  );

624-626: 🛠️ Refactor suggestion

Add retry count to completeMultipartUpload operation

The completeMultipartUpload function should also implement the conditional retry logic for consistency:

  const response = await retryWithExponentialBackoff(
    fetchFunc,
+   bucketConfig.hasReplicas() || bucketConfig.isReplica ? 1 : 3,
  );

701-703: 🛠️ Refactor suggestion

Add retry count to uploadPart operation

The uploadPart function should also implement the conditional retry logic:

  const response = await retryWithExponentialBackoff(
    fetchFunc,
+   bucketConfig.hasReplicas() || bucketConfig.isReplica ? 1 : 3,
  );
🧹 Nitpick comments (1)
src/backends/swift/objects.ts (1)

28-28: Consider consistent request body handling

You've updated putObject to use req.body directly instead of getBodyFromReq(req), but this approach isn't consistently applied across other methods that still use getBodyFromReq(req).

For consistency, consider applying this change to other methods as well:

-body: getBodyFromReq(req),
+body: req.body,
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between cc2c0be and 1b09a5a.

📒 Files selected for processing (1)
  • src/backends/swift/objects.ts (22 hunks)
🧰 Additional context used
🧠 Learnings (1)
src/backends/swift/objects.ts (1)
Learnt from: destifo
PR: expnt/herald#10
File: src/backends/s3/buckets.ts:0-0
Timestamp: 2025-03-12T21:08:17.871Z
Learning: The `routeQueryParamedRequest` and `headBucket` functions in `src/backends/s3/buckets.ts` already implement retry logic for replica failover when the primary request fails, supporting both S3 and Swift replicas.
🔇 Additional comments (7)
src/backends/swift/objects.ts (7)

3-10: Import refinement looks good

Simplifying imports by removing getAuthTokenWithTimeouts and SwiftBucketConfig reflects the new authentication approach using the Keystone token store.


38-40: Authentication improvement looks good

Using ctx.keystoneStore.getConfigAuthMeta(config) instead of separate authentication calls aligns with the PR objective to improve Swift authentication by caching tokens periodically.


111-111: Retry logic improvement looks good

Adjusting retry count based on bucket configuration is a smart optimization - using fewer retries when replicas are available.


115-118: Enhanced logging looks good

Adding more context to error logs when primary bucket operations fail will improve troubleshooting and debugging capabilities.


178-180: Add retry count to deleteObject operation

The deleteObject function's retry logic should also be updated to use the conditional retry count like other operations.

  const response = await retryWithExponentialBackoff(
    fetchFunc,
+   bucketConfig.hasReplicas() || bucketConfig.isReplica ? 1 : 3,
  );

553-553: Logging formatting looks good

The updated log message provides better clarity on the outcome of the operation.


656-664: Code simplification looks good

The simplified variable naming and return statement improves readability.


@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 1

🧹 Nitpick comments (2)
src/backends/mirror.ts (2)

31-45: Consider concurrency safeguards.
When multiple tasks are enqueued concurrently, lockedStorages.set(storageKey, currentCount + 1); might cause race conditions as it isn’t protected by any synchronization. You may want to introduce an atomic transaction, lock, or other concurrency solutions to ensure consistent increments.
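
Whether a race is possible depends on where the counter lives. The sketch below assumes `lockedStorages` is a plain in-process Map (an assumption, not confirmed by the diff); `acquire` and `release` are hypothetical helper names. Within one JavaScript thread, a read-modify-write with no `await` between the read and the write cannot be interleaved by another task.

```typescript
// Assumes the counter lives in a plain in-process Map.
const lockedStorages = new Map<string, number>();

// Read and write in one synchronous step: there is no await between get() and
// set(), so no other task on the same thread can interleave with the update.
function acquire(storageKey: string): number {
  const next = (lockedStorages.get(storageKey) ?? 0) + 1;
  lockedStorages.set(storageKey, next);
  return next;
}

// Decrement without going below zero, and drop the entry once it is unused.
function release(storageKey: string): number {
  const next = Math.max((lockedStorages.get(storageKey) ?? 0) - 1, 0);
  if (next === 0) lockedStorages.delete(storageKey);
  else lockedStorages.set(storageKey, next);
  return next;
}

console.log(acquire("s3:bucket-a")); // 1
console.log(acquire("s3:bucket-a")); // 2
console.log(release("s3:bucket-a")); // 1
console.log(release("s3:bucket-a")); // 0
console.log(lockedStorages.has("s3:bucket-a")); // false
```

If the count is instead kept in a store shared between workers or processes, this guarantee no longer holds and an atomic transaction in that store would be needed.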


198-217: Avoid forcing Swift configs into S3 shape.
Creating an artificial S3Config object from a Swift configuration can become error-prone. An adapter pattern or specialized function might cleanly separate Swift from S3 logic, improving readability and reducing potential errors.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 1b09a5a and 0b37a43.

📒 Files selected for processing (2)
  • src/backends/mirror.ts (13 hunks)
  • src/backends/s3/objects.ts (8 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • src/backends/s3/objects.ts
🧰 Additional context used
🧠 Learnings (1)
src/backends/mirror.ts (1)
Learnt from: destifo
PR: expnt/herald#10
File: src/backends/mirror.ts:170-172
Timestamp: 2025-03-12T21:08:17.871Z
Learning: In the mirror operations (src/backends/mirror.ts), bucket existence is guaranteed by design as mirror tasks are only created for buckets that exist in the bucket store, which is initialized during startup. Therefore, non-null assertions when finding buckets in the store are safe.
⏰ Context from checks skipped due to timeout of 90000ms (1)
  • GitHub Check: build-docker
🔇 Additional comments (9)
src/backends/mirror.ts (9)

2-14: No issues with updated imports.
These additions and modifications are consistent with the new bucket-based architecture.


19-19: Helper function is clear and concise.
Returning task.mainBucketConfig.bucketName directly is straightforward, with no identified issues.


54-59: Signature adjustments look good.
Transitioning to the Bucket type and iterating over bucketConfig.replicas is straightforward and aligns well with the new design.


132-151: Refactoring to Bucket parameters seems fine.
The updated logic to handle both S3 and Swift via Bucket is coherent. Validation checks for primary.typ are appropriately placed before casting.


285-306: Deletion mirroring changes are correct.
The function has consistent usage of Bucket and properly checks the replica type. No major issues found.


326-347: Copy operation adjustments are consistent.
Switching to the unified Bucket type, along with properly handling headers and calling the correct backend, looks fine.


362-379: Bucket creation mirror logic looks good.
The approach for both S3 and Swift remains consistent with the rest of the mirror strategy. No concerns.


385-408: Deletion of buckets is handled correctly.
Properly detects whether the replica is S3 or Swift, creating and modifying the request as needed.


413-417: Multipart completion reuse is a nice approach.
Delegating completion to mirrorPutObject keeps the logic simple and cohesive.

Comment thread src/backends/mirror.ts
@destifo destifo merged commit 86cabca into main Mar 12, 2025
@destifo destifo deleted the refactor/worker-mgmt branch March 12, 2025 22:19
@coderabbitai coderabbitai Bot mentioned this pull request Apr 29, 2025
3 tasks
@coderabbitai coderabbitai Bot mentioned this pull request Jun 16, 2025
3 tasks
@coderabbitai coderabbitai Bot mentioned this pull request Feb 2, 2026

Labels

bug fix enhancement New feature or request

3 participants