📝 Walkthrough

This pull request introduces widespread refactoring and enhancements across configuration, backend, and test modules. Key changes include the addition of new ignore patterns for database files, unifying bucket configuration via a new
Actionable comments posted: 3
🔭 Outside diff range comments (1)
src/backends/mirror.ts (1)
39-54: ⚠️ Potential issue: Close KV connection after use.
The KV connection opened with `Deno.openKv` should be closed after use to prevent resource leaks.

```diff
 export async function enqueueMirrorTask(ctx: HeraldContext, task: MirrorTask) {
   const bucket = getBucketFromTask(task);
   const kv = await Deno.openKv(`${bucket}_${TASK_QUEUE_DB}`);
+  try {
   const lockedStorages = ctx.taskStore.lockedStorages;
   const nonce = crypto.randomUUID(); // Unique identifier for the task
   task.nonce = nonce;
   logger.debug(
     `Enqueing task: ${task.command} for primary: ${task.mainBucketConfig.typ} to replica: ${task.backupBucketConfig.typ}`,
   );
   // Atomic transaction to add the task to the queue
   const storageKey = getStorageKey(task.backupBucketConfig.config);
   const currentCount = lockedStorages.get(storageKey) || 0;
   lockedStorages.set(storageKey, currentCount + 1);
   await kv.enqueue(task);
+  } finally {
+    kv.close();
+  }
   logger.debug(
     `Task enqueued: ${task.command} for primary: ${task.mainBucketConfig.typ} to replica: ${task.backupBucketConfig.typ}`,
   );
 }
```
🧹 Nitpick comments (3)
src/backends/worker.ts (1)
31-45: Consider additional handling for timed-out tasks.
Currently, a timed-out task logs an error, but there's no logic to re-queue or roll back partial updates. Reviewing whether tasks should automatically re-queue or be marked as failed might make error handling more robust.

src/backends/tasks.ts (1)
53-53: Check error handling for future worker failures.
After `startWorkers(ctx)`, consider what happens if a worker fails unexpectedly. You may want to watch for worker termination to gracefully restart.

src/backends/task_store.ts (1)
69-84: Be mindful of potential memory usage while serializing.
Looping over the entire key space and using `JSON.stringify` can be expensive if there are many tasks. Consider chunking or streaming if queues become large.
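The chunking idea can be sketched as follows; this is a minimal illustration (the generator name, chunk size, and the idea of writing each chunk as its own part are assumptions, not Herald's actual code):

```typescript
// Serialize a large task collection in fixed-size chunks instead of one
// JSON.stringify call over the whole key space. Each chunk is stringified
// independently, bounding peak memory to roughly one chunk at a time.
function* serializeInChunks<T>(
  tasks: Iterable<T>,
  chunkSize = 100,
): Generator<string> {
  let chunk: T[] = [];
  for (const task of tasks) {
    chunk.push(task);
    if (chunk.length >= chunkSize) {
      yield JSON.stringify(chunk);
      chunk = [];
    }
  }
  if (chunk.length > 0) yield JSON.stringify(chunk);
}
```

Each yielded string could then be uploaded as a separate object (e.g. a hypothetical `queue-<n>.json` per bucket) rather than one monolithic file.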
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (10)
- `.gitignore` (1 hunks)
- `src/backends/mirror.ts` (4 hunks)
- `src/backends/s3/mod.ts` (1 hunks)
- `src/backends/task_store.ts` (8 hunks)
- `src/backends/tasks.ts` (2 hunks)
- `src/backends/types.ts` (1 hunks)
- `src/backends/worker.ts` (1 hunks)
- `src/constants/message.ts` (1 hunks)
- `src/constants/time.ts` (1 hunks)
- `src/workers/mod.ts` (1 hunks)
✅ Files skipped from review due to trivial changes (3)
- src/workers/mod.ts
- .gitignore
- src/constants/message.ts
🧰 Additional context used
🧠 Learnings (1)
src/backends/tasks.ts (1)
Learnt from: destifo
PR: expnt/herald#5
File: src/backends/tasks.ts:0-0
Timestamp: 2025-01-08T06:46:12.523Z
Learning: In the Herald project, when managing worker availability for task processing, workers are removed from the workers map while processing a task and added back when the task is completed or fails. This prevents other tasks from using a busy worker.
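The availability protocol described in this learning can be sketched in a few lines; the `WorkerPool` class and its method names here are illustrative, not the project's actual API:

```typescript
// Track available workers in a map: a worker is removed while it processes
// a task and re-added when the task completes or fails, so a busy worker
// is never handed a second task.
class WorkerPool<W> {
  #available = new Map<string, W>();

  add(name: string, worker: W) {
    this.#available.set(name, worker);
  }

  async run<T>(name: string, task: (w: W) => Promise<T>): Promise<T> {
    const worker = this.#available.get(name);
    if (worker === undefined) {
      throw new Error(`worker ${name} is busy or unknown`);
    }
    this.#available.delete(name); // mark busy
    try {
      return await task(worker);
    } finally {
      this.#available.set(name, worker); // available again, even on failure
    }
  }
}
```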
🔇 Additional comments (12)
src/backends/worker.ts (3)
7-8: Confirm the constants exist and match the intended usage.
These newly imported constants (`TASK_QUEUE_DB`, `TASK_TIMEOUT`) need to be defined or exported properly in their respective files. Make sure they're the correct references to avoid runtime errors.
15-15: Interface rename looks correct.
Renaming from a mirror-specific interface to a more generic `StartMessage` provides clarity on what the worker expects to receive. Good practice for code maintainability.
26-29: Check for resource handling in KV usage.
A `Deno.openKv(dbName)` call is persistent. Consider confirming whether you need to close or dispose the KV store after the worker finishes or fails tasks, to avoid potential resource leaks.

src/backends/tasks.ts (2)
10-10: Using the bucket name for the worker is a good approach.
Naming the worker after the bucket makes logs and debugging simpler. Ensure consistency with how `self.name` is referenced in `worker.ts`.
45-48: Initialize workers with context carefully.
Posting a cloned version of the context is reasonable, but confirm that all required properties serialize properly. If the context is large or contains functions, you may encounter serialization issues.

src/backends/task_store.ts (3)
35-43: Multi-bucket KV initialization looks good.
Creating a map of `[bucketName, Deno.Kv]` effectively segments tasks per bucket. This design scales well for multi-bucket usage.
187-189: Bucket-based path is well-structured.
Placing each queue under `<bucket>/queue.json` is straightforward and keeps remote storage organized by bucket.
301-304: Initialization of the task store is consistent.
Passing `Object.keys(config.buckets)` aligns with the new multi-bucket logic in `TaskStore`. Good job keeping it consistent with the rest of the store's structure.

src/constants/time.ts (1)
5-5: Verify the task timeout duration.

A 4-minute timeout seems quite long. This could potentially lead to resource exhaustion if many tasks hit the maximum timeout duration simultaneously.
Could you provide the rationale for choosing 240 seconds? Consider:
- What's the average task completion time in production?
- What's the 95th percentile task completion time?
- Would a shorter timeout with retry logic be more appropriate?
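As one hedged illustration of the "shorter timeout with retry logic" alternative (the constant and helper names are made up for this sketch, not taken from Herald):

```typescript
// Sketch: run a task with a short per-attempt timeout and a few retries
// instead of a single long 4-minute window.
const TASK_TIMEOUT_MS = 30_000;

// Reject if the promise does not settle within `ms` milliseconds.
function withTimeout<T>(p: Promise<T>, ms: number): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<never>((_, reject) => {
    timer = setTimeout(() => reject(new Error("task timed out")), ms);
  });
  return Promise.race([p, timeout]).finally(() => clearTimeout(timer));
}

// Retry a task up to `retries` times, each attempt bounded by `timeoutMs`.
async function runWithRetries<T>(
  task: () => Promise<T>,
  retries = 3,
  timeoutMs = TASK_TIMEOUT_MS,
): Promise<T> {
  let lastErr: unknown;
  for (let attempt = 0; attempt < retries; attempt++) {
    try {
      return await withTimeout(task(), timeoutMs);
    } catch (err) {
      lastErr = err; // timed out or failed; try the next attempt
    }
  }
  throw lastErr;
}
```

Whether several short attempts beat one long window depends on the task-duration percentiles asked about above.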
src/backends/types.ts (1)
22-29: LGTM! Good addition to support bucket-specific task queues.

The addition of the `bucket` field to the `MirrorTask` interface helps track which bucket a task belongs to, supporting the PR objective of preventing task order shuffling.

src/backends/s3/mod.ts (1)
72-72: Verify empty string handling for x-amz-copy-source header.

The condition has been simplified from `!== undefined` to a truthy check. While cleaner, this changes how empty strings are handled. Please verify that empty strings for `x-amz-copy-source` should be treated the same as `undefined`.

❌ Verification inconclusive
🏁 Script executed:
```shell
#!/bin/bash
# Search for tests or handling of empty x-amz-copy-source header
rg -l "x-amz-copy-source.*\"\"" .
```

Length of output: 33
Manual Verification Required: Confirm Empty String Handling for `x-amz-copy-source` Header

The current changes simplify the condition to a truthy check, meaning an empty string (a falsy value) will now be treated the same as `undefined`. Our initial search for tests or explicit handling of empty strings provided no results. As a result, we couldn't verify whether the empty-string scenario is covered by tests or if the behavior change is intentional. Please manually verify that an empty string for `x-amz-copy-source` is acceptable or if it requires special handling.

src/backends/mirror.ts (1)
25-29: LGTM! Clean implementation of bucket extraction.

The `getBucketFromTask` function cleanly handles both S3 and Swift configurations.
Actionable comments posted: 3
🔭 Outside diff range comments (2)
src/backends/mirror.ts (1)
30-46: 💡 Verification agent 🧩 Analysis chain
Verify task queue persistence.
The task queue is now stored in a bucket-specific KV store. Ensure that:
- The KV store is properly closed after use
- The task order is preserved during re-enqueuing
🏁 Script executed:
```shell
#!/bin/bash
# Search for KV store cleanup patterns
rg -A 5 'openKv\('
```

Length of output: 1637
I'll now verify whether there is any explicit cleanup of the KV store (i.e. a call to a `close()` or similar method) after its use, and I'll also search for evidence that the enqueued tasks maintain their order during any potential re-enqueuing.

I'll generate a script to search for any instance of `.close(` in the repository so we can check whether the KV stores are being closed.
🏁 Script executed:
```shell
#!/bin/bash
rg -A 5 '\.close\('
```

Length of output: 457
KV Store Not Closed and Unverified Task Order Preservation
- The code in `src/backends/mirror.ts` opens a KV store via `Deno.openKv(...)` but never explicitly calls a closure method such as `kv.close()`. This could lead to resource leakage if the function is called repeatedly.
- There is no visible mechanism or safeguard ensuring that the task order is maintained during re-enqueuing; the reliance on `kv.enqueue(task)` implies that ordering depends solely on the underlying KV implementation, which has not been explicitly verified here.

src/backends/swift/buckets.ts (1)
135-137: 🛠️ Refactor suggestion: Add retry count to createBucket operation.

The `createBucket` function's retry logic should also be updated to use the conditional retry count like other operations.

```diff
 const response = await retryWithExponentialBackoff(
   fetchFunc,
+  bucketConfig.hasReplicas() || bucketConfig.isReplica ? 1 : 3,
 );
```

Also applies to: 191-193
♻️ Duplicate comments (1)
src/backends/worker.ts (1)
58-60: ⚠️ Potential issue: Fix undefined variable reference.
The variable `name` is used in both the log message and database name construction but is never declared. This will cause a runtime error.

Apply this diff to fix the reference:

```diff
- logger.info(`Worker started listening to tasks for bucket: ${name}`);
- const dbName = `${name}_${TASK_QUEUE_DB}`;
+ logger.info(`Worker started listening to tasks for bucket: ${self.name}`);
+ const dbName = `${self.name}_${TASK_QUEUE_DB}`;
```
🧹 Nitpick comments (9)
src/backends/worker.ts (1)
67-69: Consider extracting timeout duration to a constant.

The timeout duration is defined in `TASK_TIMEOUT`, but its value (240000ms = 4 minutes) is a magic number. Consider adding a comment explaining the rationale for this specific duration.

src/buckets/mod.ts (3)
25-26: Replace 'any' type with proper interface.

The use of `// deno-lint-ignore no-explicit-any` can be avoided by defining a proper type for the replica JSON structure.

Apply this diff to improve type safety:

```diff
+interface BucketJSON {
+  _name: string;
+  _config: S3Config | SwiftConfig;
+  _replicas?: BucketJSON[];
+  _typ: string;
+  _backend: string;
+}
 public static fromJSON(json: {
   _name: string;
   _config: S3Config | SwiftConfig;
   _replicas?: object[];
   _typ: string;
   _backend: string;
 }): Bucket {
   const replicas = json._replicas
-    // deno-lint-ignore no-explicit-any
-    ? json._replicas.map((replicaJson: any) => Bucket.fromJSON(replicaJson))
+    ? json._replicas.map((replicaJson: BucketJSON) => Bucket.fromJSON(replicaJson))
     : [];
```
45-49: Consider using a type guard for safer type casting.

The type casting in the `bucketName` getter could be made safer using type guards.
Apply this diff to improve type safety:
```diff
+private isS3Config(config: S3Config | SwiftConfig): config is S3Config {
+  return this.typ === "S3BucketConfig" || this.typ === "ReplicaS3Config";
+}
 get bucketName() {
-  return this.typ === "S3BucketConfig" || this.typ === "ReplicaS3Config"
-    ? (this.config as S3Config).bucket
-    : (this.config as SwiftConfig).container;
+  return this.isS3Config(this.config)
+    ? this.config.bucket
+    : this.config.container;
 }
```
71-73: Use strict equality comparison.

The `isReplica` getter uses loose equality (`==`), which is generally discouraged in TypeScript/JavaScript.

Apply this diff:

```diff
 get isReplica() {
-  return this.typ == "ReplicaS3Config" || this.typ == "ReplicaSwiftConfig";
+  return this.typ === "ReplicaS3Config" || this.typ === "ReplicaSwiftConfig";
 }
```

src/backends/s3/objects.ts (2)
169-169: Fix typo in success messages.

The word "Successful" is misspelled as "Successfull" in multiple log messages.
Apply this diff:
```diff
- logger.info(`Delete Object Successfull: ${response.statusText}`);
+ logger.info(`Delete Object Successful: ${response.statusText}`);
- logger.info(`Copy Object Successfull: ${response.statusText}`);
+ logger.info(`Copy Object Successful: ${response.statusText}`);
- logger.info(`Head Object Successfull: ${response.statusText}`);
+ logger.info(`Head Object Successful: ${response.statusText}`);
```

Also applies to: 208-208, 265-265
165-167: Remove duplicate error logging in deleteObject.

The error message is logged twice with slightly different variable names (`errMesage` vs direct string).

Apply this diff:

```diff
 if (response.status != 204) {
-  const errMesage = `Delete Object Failed: ${response.statusText}`;
+  const errMessage = `Delete Object Failed: ${response.statusText}`;
   logger.warn(`Delete Object Failed: ${response.statusText}`);
-  reportToSentry(errMesage);
+  reportToSentry(errMessage);
 }
```
139-143: Improve error handling for S3 URL generation.

The `getDownloadS3Url` function should handle potential errors when extracting request information.

```diff
- const getObjectUrl = getDownloadS3Url(
-   originalRequest,
-   primary.config as S3Config,
- );
+ const reqMeta = s3Utils.extractRequestInfo(originalRequest);
+ if (!reqMeta.bucket || !reqMeta.objectKey) {
+   const errMessage = "Invalid request: missing bucket or object key";
+   logger.error(errMessage);
+   reportToSentry(errMessage);
+   return;
+ }
+ const getObjectUrl = getDownloadS3Url(originalRequest, primary.config as S3Config);
```

src/config/types.ts (1)
124-124: Consider enhancing the auth_type schema with environment-specific validation.

While the schema correctly defines the auth types with a secure default, it could benefit from additional validation to prevent insecure configurations in production.
Consider adding a custom validator:
```diff
- auth_type: z.enum(["service_account", "none"]).default("service_account"),
+ auth_type: z.enum(["service_account", "none"])
+   .default("service_account")
+   .superRefine((val, ctx) => {
+     if (val === "none" && process.env.NODE_ENV === "production") {
+       ctx.addIssue({
+         code: z.ZodIssueCode.custom,
+         message: "Auth type 'none' is not allowed in production",
+       });
+     }
+   }),
```
src/auth/mod.ts (1)

232-234: Add documentation and logging to the getAuthType function.

The function should include JSDoc documentation and logging for better maintainability and observability.
Consider adding:
```diff
+/**
+ * Returns the configured authentication type.
+ * @returns {"service_account" | "none"} The current authentication type
+ */
 export function getAuthType() {
+  logger.debug(`Using authentication type: ${envVarsConfig.auth_type}`);
   return envVarsConfig.auth_type;
 }
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (13)
- `src/auth/mod.ts` (1 hunks)
- `src/backends/mirror.ts` (12 hunks)
- `src/backends/mod.ts` (2 hunks)
- `src/backends/s3/buckets.ts` (5 hunks)
- `src/backends/s3/objects.ts` (7 hunks)
- `src/backends/swift/buckets.ts` (16 hunks)
- `src/backends/swift/objects.ts` (15 hunks)
- `src/backends/types.ts` (2 hunks)
- `src/backends/worker.ts` (1 hunks)
- `src/buckets/mod.ts` (3 hunks)
- `src/config/types.ts` (1 hunks)
- `src/main.ts` (2 hunks)
- `src/utils/url.ts` (2 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- src/backends/types.ts
🧰 Additional context used
🧠 Learnings (2)
src/backends/s3/buckets.ts (1)
Learnt from: destifo
PR: expnt/herald#10
File: src/backends/s3/buckets.ts:0-0
Timestamp: 2025-01-28T17:36:22.149Z
Learning: The `routeQueryParamedRequest` and `headBucket` functions in `src/backends/s3/buckets.ts` already implement retry logic for replica failover when the primary request fails, supporting both S3 and Swift replicas.
src/backends/swift/buckets.ts (1)
Learnt from: destifo
PR: expnt/herald#10
File: src/backends/s3/buckets.ts:0-0
Timestamp: 2025-01-28T17:36:22.149Z
Learning: The `routeQueryParamedRequest` and `headBucket` functions in `src/backends/s3/buckets.ts` already implement retry logic for replica failover when the primary request fails, supporting both S3 and Swift replicas.
🔇 Additional comments (9)
src/backends/worker.ts (1)
71-74: LGTM! Robust timeout handling implementation.

The use of `Promise.race` with a timeout promise is a good pattern for handling long-running tasks.

src/backends/s3/buckets.ts (2)
102-102: LGTM! Improved retry logic for replicas.

The retry count adjustment (1 for replicas, 3 for primary) aligns well with the existing replica failover mechanism.
Also applies to: 153-153
107-109: LGTM! Enhanced error logging for replica failover.

The addition of detailed logging for primary bucket failures and replica attempts improves observability.
Also applies to: 157-160
src/backends/s3/objects.ts (1)
23-23: LGTM! Consistent retry logic implementation.

The retry count adjustment (1 for replicas, 3 for primary) is consistently implemented across all object operations.
Also applies to: 69-69, 236-236
src/utils/url.ts (2)
59-63: LGTM! Enhanced retry flexibility.

The addition of the `retries` parameter with a default value of 3 improves the function's configurability while maintaining backward compatibility.
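A minimal sketch of what such a signature might look like (the delay schedule and error handling here are illustrative assumptions, not the actual `retryWithExponentialBackoff` implementation in `src/utils/url.ts`):

```typescript
// Retry an async operation with exponential backoff. `retries` defaults
// to 3 to match the reviewed signature; the base delay and doubling
// schedule are made up for the sketch.
async function retryWithExponentialBackoff<T>(
  fn: () => Promise<T>,
  retries = 3,
  baseDelayMs = 100,
): Promise<T> {
  let lastErr: unknown;
  for (let attempt = 0; attempt < retries; attempt++) {
    try {
      return await fn();
    } catch (err) {
      lastErr = err;
      if (attempt < retries - 1) {
        // wait 100ms, 200ms, 400ms, ... between attempts
        await new Promise((resolve) =>
          setTimeout(resolve, baseDelayMs * 2 ** attempt)
        );
      }
    }
  }
  throw lastErr;
}
```

Callers that target replicas can pass `1` to disable retries, as the reviewed diffs do.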
127-132: LGTM! Consistent retry configuration.

The retry configuration is correctly passed through to the underlying retry mechanism.
src/backends/mirror.ts (1)
18-20: LGTM! Simplified bucket extraction.The function has been simplified to directly return the bucket name from the main configuration.
src/backends/swift/objects.ts (1)
36-43: LGTM! Improved error handling.The error handling for authentication token retrieval has been enhanced with proper early returns.
src/backends/swift/buckets.ts (1)
45-53: LGTM! Consistent error handling pattern.The error handling for authentication token retrieval follows the same improved pattern as in objects.ts.
Also applies to: 63-65
@destifo you have a conflict
Actionable comments posted: 1
🔭 Outside diff range comments (1)
.ghjk/lock.json (1)
51-55: 🛠️ Refactor suggestion ❓ Verification inconclusive
Consistent Update for `zstd_aa` in Secondary Entry

The `zstd_aa` dependency is similarly updated to `"v1.4.8,"` here. Again, check that the version string format (including the trailing comma) is intentional and uniformly processed across the system.
Attention: Verify Trailing Comma in Dependency Version
The `zstd_aa` dependency in `.ghjk/lock.json` (lines 51–55) now shows its version as `"v1.4.8,"` (with a trailing comma). Please verify whether this trailing comma is intentional and consistently handled by the system across all dependency entries. If the inclusion of the comma is not by design, consider removing it to maintain uniformity.
🧹 Nitpick comments (1)
.ghjk/lock.json (1)
26-29: Updated Version for `zstd_aa` Dependency

The version for `zstd_aa` has been updated to `"v1.4.8,"`. Please double-check that the trailing comma in the version string is deliberate and that version parsing routines (if any) will handle this format correctly.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (2)
- `.ghjk/deno.lock` is excluded by `!**/*.lock`
- `deno.lock` is excluded by `!**/*.lock`
📒 Files selected for processing (5)
- `.ghjk/deno.jsonc` (1 hunks)
- `.ghjk/lock.json` (4 hunks)
- `import_map.json` (1 hunks)
- `src/backends/swift/mod.ts` (1 hunks)
- `tools/compose/compose.s3.yml` (1 hunks)
✅ Files skipped from review due to trivial changes (3)
- tools/compose/compose.s3.yml
- import_map.json
- .ghjk/deno.jsonc
🔇 Additional comments (5)
src/backends/swift/mod.ts (1)
110-110: Simplified conditional check looks good.

The modified conditional check for the "x-amz-copy-source" header now relies on truthiness rather than explicitly checking against `undefined`. This is a cleaner approach, as `Headers.get()` returns either the value or `null` (not `undefined`), so the behavior remains unchanged while making the code more concise.

.ghjk/lock.json (4)
3-3: Renamed Key: `moduleEntries` → `sys_entries`

The key has been renamed as intended by the PR objectives. Please ensure that any downstream references (in code, documentation, or configuration files) are updated accordingly.
45-49: Details
❓ Verification inconclusive
Consistent Update for `tar_aa` in Secondary Entry

Another occurrence of the `tar_aa` dependency shows the version now being `"1.34"`. This consistency is good; however, please verify that no references were missed and that all parts of the build process support this change.
Below is a shell script you can run to search for all occurrences of the `tar_aa` dependency in the repository, especially to check that every instance (including in the lock file) correctly reflects version `"1.34"`. This will help ensure that no references were missed and that all parts of the build process support the change.

Please run the following script:
```shell
#!/bin/bash
# Search for all references to 'tar_aa@' in the codebase to verify version consistency.
echo "Searching for 'tar_aa@' references across the repository:"
rg "tar_aa@" -n .

# Additionally, check the specific lock file for consistency.
echo "Checking '.ghjk/lock.json' for 'tar_aa' references:"
rg "tar_aa" -n .ghjk/lock.json
```

Once you verify the output, please ensure that:
- Every reference to `tar_aa` shows version `"1.34"`.
- There are no lingering outdated references elsewhere in the repository.
Let me know if any discrepancies appear so we can address them accordingly.
20-24:Details
❓ Verification inconclusive
Updated Version for `tar_aa` Dependency

The version for `tar_aa` has been updated to `"1.34"`, consistent with the change noted in the PR summary. Verify that this lower version is compatible with all dependent components and that related tests/documentation are revised if necessary.
Below is a shell script that verifies the tar_aa dependency version in the lock file. Please run it and share the output so we can confirm that the update to version "1.34" (as shown in lines 20‑24) is indeed present and correct.
```shell
#!/bin/bash
# Verify that the dependency update for tar_aa is applied in .ghjk/lock.json
FILE=".ghjk/lock.json"
if [[ -f "$FILE" ]]; then
  echo "Displaying lines 20-24 from ${FILE}:"
  sed -n '20,24p' "$FILE"
  echo "Searching for tar_aa dependency update to version \"1.34\":"
  grep -R '"version": "1.34"' "$FILE"
else
  echo "Error: ${FILE} not found."
fi
```

Once we have the output from this script, we can finalize the rewritten review comment.
68-72:Details
❓ Verification inconclusive
Additional Entry: Updated Version for `zstd_aa`

Similarly, an additional entry for `zstd_aa` shows the version update to `"v1.4.8,"`. It would be beneficial to verify that this formatting is consistent across all dependency entries and that no unintentional artifacts (like an extra comma) are introduced.
Below is a shell script to help verify that the version string for `zstd_aa` (and similar dependency entries) consistently uses the intended formatting without any unintentional trailing artifacts (like an extra comma).

Please run the script to gather the necessary verification details. Once you have the results, we can confirm whether the formatting is consistent across dependency entries and whether the extra comma in `"v1.4.8,"` needs to be removed or intentionally retained.

Thank you.
🏁 Script executed:
```shell
#!/bin/bash
# Check for version entries with a trailing comma in the .ghjk/lock.json file.
echo "== Checking for version entries with a trailing comma =="
rg -n '"version":\s*"v[0-9]+\.[0-9]+\.[0-9]+,' .ghjk/lock.json

# Check for all entries related to zstd_aa in the file to verify formatting consistency.
echo ""
echo "== Checking for zstd_aa-related entries =="
rg -n 'zstd_aa' .ghjk/lock.json
```

Length of output: 952
Review: Verify Trailing Comma in zstd_aa Version Update
- The `.ghjk/lock.json` file shows that the version string `"v1.4.8,"` (with a trailing comma) is present consistently (e.g., lines 26, 51, and 68).
- Please verify that this trailing comma is intentional and that no unintentional formatting artifacts have been introduced across other dependency entries.
- If the comma is not required, update the version strings across the file accordingly.
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (8)
src/backends/swift/objects.ts (1)
346-350: Replica Fallback in `getObjectMeta`

The fallback on error is helpful for fault tolerance. Consider verifying that multiple attempts won't cause performance overhead in high-volume scenarios.

src/workers/mod.ts (2)
1-4: Check the import structure and usage.

The addition of `refreshWorkersContext` is appropriate for ensuring the worker environment stays up-to-date. Confirm that `initializeTaskHandler` and `refreshWorkersContext` do not introduce redundant or conflicting behavior when run concurrently.
16-20: Consider potential memory and performance implications of frequent context refresh.

Refreshing the entire context every 5 minutes is fine for many use-cases, but if `HeraldContext` grows large or sensitive, consider refining which data is refreshed. You may also want to track the overhead of frequent serialization (`JSON.stringify`) calls.

src/backends/tasks.ts (1)
45-52: Clarify worker startup messaging.

As far as the logic goes, broadcasting `{ type: "Start" }` with the entire serialized context is a straightforward approach. However, if `ctx` becomes large or contains sensitive tokens, consider limiting the data workers actually need.

src/backends/swift/keystone_token_store.ts (3)
15-32: Log duplicate Swift configs when skipping.

You silently skip registering duplicate Swift configurations without logging. This can lead to confusion if certain configurations are left out. It may be helpful to add a log statement or comment clarifying that a duplicate config was found and skipped.

```diff
 for (const config of swiftConfigs) {
   const configKey = KeystoneTokenStore.#getConfigKey(config);
   if (configAuthMetas.has(configKey)) {
+    console.debug(`Skipping duplicate Swift config for key: ${configKey}`);
     continue;
   }
   ...
 }
```
47-58: Handle individual refresh errors gracefully.

In `refreshTokens()`, if fetching a token fails for one of the configurations, the entire method continues silently. Consider logging or storing errors on a per-configuration basis so that partial refresh failures don't remain undetected.
75-98: Encapsulate token refresh scheduling in a configurable approach.

The current 55-minute hard-coded interval might be better suited as a configuration parameter, allowing more flexibility if token refresh intervals change in the future or differ between environments.
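A sketch of that suggestion, with the interval injected rather than hard-coded (the class and field names are illustrative, not Herald's actual code):

```typescript
// Token-refresh scheduler with an injectable interval instead of a
// hard-coded 55 minutes.
class TokenRefreshScheduler {
  #timer: ReturnType<typeof setInterval> | undefined;
  #refresh: () => Promise<void>;
  #intervalMs: number;

  constructor(
    refresh: () => Promise<void>,
    intervalMs = 55 * 60 * 1000, // default; override via config per environment
  ) {
    this.#refresh = refresh;
    this.#intervalMs = intervalMs;
  }

  start() {
    this.#timer = setInterval(() => {
      // catch errors so one failed refresh does not kill the interval
      this.#refresh().catch((err) =>
        console.error("token refresh failed", err)
      );
    }, this.#intervalMs);
  }

  stop() {
    if (this.#timer !== undefined) clearInterval(this.#timer);
  }
}
```

The interval could then come from the same config schema that holds the Swift credentials.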
src/backends/worker.ts (1)
100-117: Consider retry or backoff for failed tasks.

When a task times out or fails, the current implementation only logs an error. For resilience, you may want to retry or re-queue the task, possibly with a backoff. This ensures that transient errors do not permanently drop tasks.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
- `deno.lock` is excluded by `!**/*.lock`
📒 Files selected for processing (19)
- `import_map.json` (1 hunks)
- `src/backends/swift/auth.ts` (0 hunks)
- `src/backends/swift/buckets.ts` (17 hunks)
- `src/backends/swift/keystone_token_store.ts` (1 hunks)
- `src/backends/swift/objects.ts` (16 hunks)
- `src/backends/tasks.ts` (2 hunks)
- `src/backends/worker.ts` (1 hunks)
- `src/config/loader.ts` (0 hunks)
- `src/main.ts` (4 hunks)
- `src/types/mod.ts` (1 hunks)
- `src/utils/log.ts` (1 hunks)
- `src/workers/mod.ts` (2 hunks)
- `tests/iac/simple_bucket_test/iac_test.ts` (1 hunks)
- `tests/s3/iac/simple_bucket_test/iac_test.ts` (1 hunks)
- `tests/s3/sdk/upload_test.ts` (2 hunks)
- `tests/swift/basic/object_test.ts` (1 hunks)
- `tests/swift/iac/simple_bucket_test/iac_test.ts` (1 hunks)
- `tests/utils/mod.ts` (1 hunks)
- `utils/file.ts` (1 hunks)
💤 Files with no reviewable changes (2)
- src/backends/swift/auth.ts
- src/config/loader.ts
✅ Files skipped from review due to trivial changes (5)
- utils/file.ts
- tests/iac/simple_bucket_test/iac_test.ts
- tests/swift/iac/simple_bucket_test/iac_test.ts
- src/utils/log.ts
- tests/s3/iac/simple_bucket_test/iac_test.ts
🚧 Files skipped from review as they are similar to previous changes (5)
- src/main.ts
- tests/swift/basic/object_test.ts
- tests/s3/sdk/upload_test.ts
- src/backends/swift/buckets.ts
- tests/utils/mod.ts
🧰 Additional context used
🧠 Learnings (2)
src/workers/mod.ts (1)
Learnt from: destifo
PR: expnt/herald#5
File: src/backends/tasks.ts:0-0
Timestamp: 2025-03-12T21:08:17.871Z
Learning: In the Herald project, when managing worker availability for task processing, workers are removed from the workers map while processing a task and added back when the task is completed or fails. This prevents other tasks from using a busy worker.
src/backends/swift/objects.ts (1)
Learnt from: destifo
PR: expnt/herald#10
File: src/backends/s3/buckets.ts:0-0
Timestamp: 2025-03-12T21:08:17.871Z
Learning: The `routeQueryParamedRequest` and `headBucket` functions in `src/backends/s3/buckets.ts` already implement retry logic for replica failover when the primary request fails, supporting both S3 and Swift replicas.
🔇 Additional comments (21)
import_map.json (3)
29-29: Ensure Consistent Import Mapping for `std/path/`

The updated mapping now includes a trailing slash (`"std/path/"`), which aligns with the new guidelines and ensures consistent module resolution across the project.
33-33: Standardize Import Path for `std/fmt/`

Adding the trailing slash to `"std/fmt/"` provides uniformity with other standard library paths, improving maintainability and reducing ambiguity in how modules are resolved.
39-39: Adhere to Consistent Format for `std/testing/`

The trailing slash in `"std/testing/"` brings consistency to the import mappings for standard modules, which is beneficial for readability and future maintenance.

src/backends/swift/objects.ts (14)
3-3: Adoption of `getSwiftRequestHeaders`

No issues found. This updated import clarifies the usage of Swift-specific headers, removing the overhead of an additional token retrieval step.
10-10: Refined Import

Using `SwiftConfig` from shared config types is consistent with the overall design.
27-27: Potential Stream Handling Concern

Switching from `getBodyFromReq(req)` to `req.body` may alter how data is streamed or read. Please confirm that large or chunked payloads are still handled as expected.
37-39: Neatly Retrieving Swift Credentials

This direct approach using `ctx.keystoneStore.getConfigAuthMeta(config)` simplifies the flow and removes unnecessary steps.
94-96: Consistent Auth Retrieval

Using `keystoneStore.getConfigAuthMeta(config)` again ensures uniform Swift auth handling across methods.
110-110: Verify Reduced Retries
The condition sets fewer retries (1) if the bucket is a replica or has replicas. Double-check that this aligns with your reliability expectations.
113-117: Replica Fallback for GET
The fallback logic is appropriate. Ensure that the bucket name, logged at line 115, does not reveal sensitive information.
163-165: Consistent Swift Auth Retrieval
Mirroring the earlier approach. No issues found here.
233-235: Uniform Auth Pattern
Same pattern of retrieving Swift configuration, promoting consistency across the codebase.
394-396: Swift Auth Replacement
This continues the uniform token retrieval approach, improving maintainability.
409-409: Duplicate Retry Logic
Same note as line 110 about verifying the reduced retry count for replicas.
413-416: Replica Fallback for HEAD
Consistent fallback logic when the primary bucket fails, with informative logging.
467-469: Auth Meta Retrieval

Matches the repeated approach for extracting `storageUrl` and `authToken`. Good for consistency.
501-501: Mirroring Setup

Ensuring `bucketConfig` is passed to `prepareMirrorRequests` keeps the mirror logic consistent.

src/types/mod.ts (1)
6-6: Confirm token security scope and handling.

By embedding the `keystoneStore` into `HeraldContext`, be sure that sensitive info within the token store is safely managed, particularly if the entire context is later sent to workers or external modules. Consider whether only partial data (e.g., short-lived tokens) is necessary to minimize exposure of secrets.

Would you like me to generate a shell script to search for usages of `keystoneStore` and confirm if secure handling has been applied consistently?

src/workers/mod.ts (1)
13-13: Minor log message adjustment.
Renaming "Registering Worker: Task Handler" from plural to singular clarifies the action being performed.
src/backends/tasks.ts (1)
65-65: Straightforward worker initialization.
Switching from a direct task handler loop to calling `startWorkers` is coherent with the new approach of delegating tasks to multiple workers. Keep track of error handling if any worker fails on startup.
src/backends/worker.ts (1)
94-94: Undefined variable reference.
This issue was previously flagged. The variable `name` is never defined or assigned. This will trigger a reference error.

```diff
- logger.info(`Worker started listening to tasks for bucket: ${name}`);
+ logger.info(`Worker started listening to tasks for bucket: ${self.name}`);
```
Actionable comments posted: 0
🔭 Outside diff range comments (1)
src/backends/mirror.ts (1)
429-434: ⚠️ Potential issue
Fix parameter order in mirrorPutObject call.
The parameters passed to `mirrorPutObject` are in the wrong order, which could cause runtime errors.

```diff
- await mirrorPutObject(
-   ctx,
-   originalRequest,
-   mainBucketConfig,
-   backupBucketConfig,
- );
+ await mirrorPutObject(
+   ctx,
+   mainBucketConfig,
+   backupBucketConfig,
+   originalRequest,
+ );
```
♻️ Duplicate comments (1)
src/backends/swift/objects.ts (1)
109-112: ⚠️ Potential issue
Add retry count to deleteObject operation.
The `deleteObject` function should also be updated to use the conditional retry count like other operations such as `getObject`, `listObjects`, etc.

```diff
 const response = await retryWithExponentialBackoff(
   fetchFunc,
+  bucketConfig.hasReplicas() || bucketConfig.isReplica ? 1 : 3,
 );
```

This also applies to other similar operations like `putObject` (lines 51-53), `copyObject` (lines 483-485), and `completeMultipartUpload` (lines 624-626).
🧹 Nitpick comments (6)
src/backends/swift/objects.ts (1)
28-28: Inconsistent request body handling.
The `putObject` function now directly uses `req.body` instead of `getBodyFromReq(req)`, but this change is not applied consistently throughout the file.
Other methods like `deleteObject`, `listObjects` and `getObject` still use `getBodyFromReq(req)`. For consistency, consider using the same approach across all methods.

```diff
 return await fetch(reqUrl, {
   method: "PUT",
   headers: headers,
-  body: body,
+  body: getBodyFromReq(req),
 });
```

Or consistently update all other methods to use direct body access.
Also applies to: 47-49
src/backends/s3/objects.ts (4)
169-169: Fix spelling in log message.
The word "Successfull" has a spelling error and should be "Successful".

```diff
- logger.info(`Delete Object Successfull: ${response.statusText}`);
+ logger.info(`Delete Object Successful: ${response.statusText}`);
```
208-208: Fix spelling in log message.
The word "Successfull" has a spelling error and should be "Successful".

```diff
- logger.info(`Copy Object Successfull: ${response.statusText}`);
+ logger.info(`Copy Object Successful: ${response.statusText}`);
```
265-265: Fix spelling in log message.
The word "Successfull" has a spelling error and should be "Successful".

```diff
- logger.info(`Head Object Successfull: ${response.statusText}`);
+ logger.info(`Head Object Successful: ${response.statusText}`);
```
328-328: Use consistent bucket parameter passing.
This line still uses type casting while other similar calls have been updated to use the simplified approach.

```diff
- bucketConfig as S3BucketConfig,
+ bucketConfig,
```

src/backends/mirror.ts (1)
282-289: Add null check for bucket lookup.
The code assumes the bucket will always be found in the bucketStore, but it's safer to add null checking.

```diff
 const primaryBucket = bucketStore.buckets.find((bucket) =>
   bucket.bucketName === replica.bucketName
 )!;
+if (!primaryBucket) {
+  logger.error(`Primary bucket ${replica.bucketName} not found in bucket store`);
+  return;
+}
```

This pattern should be applied to all similar non-null assertions throughout the file (lines 323-330, 364-366, 386-388).
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (6)
- src/backends/mirror.ts (12 hunks)
- src/backends/s3/buckets.ts (5 hunks)
- src/backends/s3/objects.ts (7 hunks)
- src/backends/swift/objects.ts (20 hunks)
- src/backends/types.ts (2 hunks)
- src/utils/url.ts (2 hunks)
🚧 Files skipped from review as they are similar to previous changes (3)
- src/backends/s3/buckets.ts
- src/utils/url.ts
- src/backends/types.ts
🧰 Additional context used
🧠 Learnings (2)
src/backends/swift/objects.ts (1)
Learnt from: destifo
PR: expnt/herald#10
File: src/backends/s3/buckets.ts:0-0
Timestamp: 2025-03-12T21:08:17.871Z
Learning: The `routeQueryParamedRequest` and `headBucket` functions in `src/backends/s3/buckets.ts` already implement retry logic for replica failover when the primary request fails, supporting both S3 and Swift replicas.
src/backends/mirror.ts (1)
Learnt from: destifo
PR: expnt/herald#10
File: src/backends/mirror.ts:170-172
Timestamp: 2025-03-12T21:08:17.871Z
Learning: In the mirror operations (src/backends/mirror.ts), bucket existence is guaranteed by design as mirror tasks are only created for buckets that exist in the bucket store, which is initialized during startup. Therefore, non-null assertions when finding buckets in the store are safe.
🔇 Additional comments (7)
src/backends/swift/objects.ts (2)
38-40: Authentication mechanism improvement.
The authentication method has been updated to use `ctx.keystoneStore.getConfigAuthMeta(config)`, which is more efficient by retrieving both storage URL and auth token in a single operation.
115-118: Enhanced logging for replica failovers.
The improved logging provides more context when operations fail on the primary bucket and attempt to access replicas, making debugging and monitoring easier.
src/backends/s3/objects.ts (2)
23-23: Improved retry logic based on bucket configuration.
The conditional retry logic that adjusts based on bucket replica configuration is a good optimization, ensuring fewer retries when replicas are available.
Also applies to: 69-69, 236-236
27-30: Enhanced logging for replica failovers.
The addition of detailed warnings when operations fail on primary buckets provides better diagnostics and makes troubleshooting easier.
Also applies to: 73-76, 240-243
src/backends/mirror.ts (3)
31-33: Enhanced task queue with bucket-specific KV stores.
The implementation now uses bucket-specific key-value stores for task queues, which improves isolation and prevents task shuffling across different buckets.
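The isolation property can be illustrated with plain in-memory queues keyed by bucket name; this is a stand-in for the per-bucket `Deno.openKv` stores, and the `MirrorTask` shape here is deliberately simplified:

```typescript
// Illustrative stand-in for per-bucket task queues.
interface MirrorTask {
  command: string;
  bucketName: string;
}

class PerBucketQueues {
  private queues = new Map<string, MirrorTask[]>();

  enqueue(task: MirrorTask): void {
    // Each bucket gets its own queue, so tasks never interleave across buckets.
    const q = this.queues.get(task.bucketName) ?? [];
    q.push(task);
    this.queues.set(task.bucketName, q);
  }

  dequeue(bucketName: string): MirrorTask | undefined {
    // FIFO order is preserved within a bucket, independent of other buckets.
    return this.queues.get(bucketName)?.shift();
  }
}
```

A slow or backlogged bucket then only delays its own mirror tasks, which is the point of keying the stores per bucket.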
19-19: Simplified getBucketFromTask implementation.
The function now directly returns the `bucketName` from `mainBucketConfig`, which is a cleaner approach that avoids type-specific logic.
59-59: Streamlined replica handling.
Directly iterating over `bucketConfig.replicas` instead of filtering from a global config is more efficient and maintainable.
Actionable comments posted: 0
🔭 Outside diff range comments (3)
src/backends/swift/objects.ts (3)
483-485: 🛠️ Refactor suggestion
Add retry count to copyObject operation
The `copyObject` function's retry logic should be updated to use the conditional retry count pattern for consistency with other operations.

```diff
 const response = await retryWithExponentialBackoff(
   fetchFunc,
+  bucketConfig.hasReplicas() || bucketConfig.isReplica ? 1 : 3,
 );
```
624-626: 🛠️ Refactor suggestion
Add retry count to completeMultipartUpload operation
The `completeMultipartUpload` function should also implement the conditional retry logic for consistency:

```diff
 const response = await retryWithExponentialBackoff(
   fetchFunc,
+  bucketConfig.hasReplicas() || bucketConfig.isReplica ? 1 : 3,
 );
```
701-703: 🛠️ Refactor suggestion
Add retry count to uploadPart operation
The `uploadPart` function should also implement the conditional retry logic:

```diff
 const response = await retryWithExponentialBackoff(
   fetchFunc,
+  bucketConfig.hasReplicas() || bucketConfig.isReplica ? 1 : 3,
 );
```
🧹 Nitpick comments (1)
src/backends/swift/objects.ts (1)
28-28: Consider consistent request body handling
You've updated `putObject` to use `req.body` directly instead of `getBodyFromReq(req)`, but this approach isn't consistently applied across other methods that still use `getBodyFromReq(req)`.
For consistency, consider applying this change to other methods as well:

```diff
-body: getBodyFromReq(req),
+body: req.body,
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
- src/backends/swift/objects.ts (22 hunks)
🧰 Additional context used
🧠 Learnings (1)
src/backends/swift/objects.ts (1)
Learnt from: destifo
PR: expnt/herald#10
File: src/backends/s3/buckets.ts:0-0
Timestamp: 2025-03-12T21:08:17.871Z
Learning: The `routeQueryParamedRequest` and `headBucket` functions in `src/backends/s3/buckets.ts` already implement retry logic for replica failover when the primary request fails, supporting both S3 and Swift replicas.
🔇 Additional comments (7)
src/backends/swift/objects.ts (7)
3-10: Import refinement looks good
Simplifying imports by removing `getAuthTokenWithTimeouts` and `SwiftBucketConfig` reflects the new authentication approach using the Keystone token store.
38-40: Authentication improvement looks good
Using `ctx.keystoneStore.getConfigAuthMeta(config)` instead of separate authentication calls aligns with the PR objective to improve Swift authentication by caching tokens periodically.
111-111: Retry logic improvement looks good
Adjusting retry count based on bucket configuration is a smart optimization - using fewer retries when replicas are available.
115-118: Enhanced logging looks good
Adding more context to error logs when primary bucket operations fail will improve troubleshooting and debugging capabilities.
178-180: Add retry count to deleteObject operation
The `deleteObject` function's retry logic should also be updated to use the conditional retry count like other operations.

```diff
 const response = await retryWithExponentialBackoff(
   fetchFunc,
+  bucketConfig.hasReplicas() || bucketConfig.isReplica ? 1 : 3,
 );
```
553-553: Logging formatting looks good
The updated log message provides better clarity on the outcome of the operation.
656-664: Code simplification looks good
The simplified variable naming and return statement improves readability.
Actionable comments posted: 1
🧹 Nitpick comments (2)
src/backends/mirror.ts (2)
31-45: Consider concurrency safeguards.
When multiple tasks are enqueued concurrently, `lockedStorages.set(storageKey, currentCount + 1);` might cause race conditions as it isn't protected by any synchronization. You may want to introduce an atomic transaction, lock, or other concurrency solutions to ensure consistent increments.
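In a single JS isolate the read-modify-write shown is not interruptible on its own, but as soon as an `await` lands between the read and the write (or the counter is shared across workers), updates can be lost. A promise-chain mutex is one lightweight safeguard; this is a sketch with illustrative names, not the project's code:

```typescript
// Serialize async critical sections by chaining them onto a shared promise.
class Mutex {
  private tail: Promise<void> = Promise.resolve();

  runExclusive<T>(fn: () => Promise<T> | T): Promise<T> {
    const result = this.tail.then(fn);
    // Keep the chain alive even if fn rejects.
    this.tail = result.then(() => undefined, () => undefined);
    return result;
  }
}

const lockedStorages = new Map<string, number>();
const counterLock = new Mutex();

// Increment under the lock so concurrent enqueues cannot lose updates,
// even if the critical section later grows an await.
function incrementLocked(storageKey: string): Promise<number> {
  return counterLock.runExclusive(() => {
    const next = (lockedStorages.get(storageKey) ?? 0) + 1;
    lockedStorages.set(storageKey, next);
    return next;
  });
}
```

An alternative with the same effect would be to fold the counter update into the same atomic KV transaction that enqueues the task.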
198-217: Avoid forcing Swift configs into S3 shape.
Creating an artificial `S3Config` object from a Swift configuration can become error-prone. An adapter pattern or specialized function might cleanly separate Swift from S3 logic, improving readability and reducing potential errors.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
- src/backends/mirror.ts (13 hunks)
- src/backends/s3/objects.ts (8 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- src/backends/s3/objects.ts
🧰 Additional context used
🧠 Learnings (1)
src/backends/mirror.ts (1)
Learnt from: destifo
PR: expnt/herald#10
File: src/backends/mirror.ts:170-172
Timestamp: 2025-03-12T21:08:17.871Z
Learning: In the mirror operations (src/backends/mirror.ts), bucket existence is guaranteed by design as mirror tasks are only created for buckets that exist in the bucket store, which is initialized during startup. Therefore, non-null assertions when finding buckets in the store are safe.
⏰ Context from checks skipped due to timeout of 90000ms (1)
- GitHub Check: build-docker
🔇 Additional comments (9)
src/backends/mirror.ts (9)
2-14: No issues with updated imports.
These additions and modifications are consistent with the new bucket-based architecture.
19-19: Helper function is clear and concise.
Returning `task.mainBucketConfig.bucketName` directly is straightforward, with no identified issues.
54-59: Signature adjustments look good.
Transitioning to the `Bucket` type and iterating over `bucketConfig.replicas` is straightforward and aligns well with the new design.
132-151: Refactoring to `Bucket` parameters seems fine.
The updated logic to handle both S3 and Swift via `Bucket` is coherent. Validation checks for `primary.typ` are appropriately placed before casting.
285-306: Deletion mirroring changes are correct.
The function has consistent usage of `Bucket` and properly checks the replica type. No major issues found.
326-347: Copy operation adjustments are consistent.
Switching to the unified `Bucket` type, along with properly handling headers and calling the correct backend, looks fine.
362-379: Bucket creation mirror logic looks good.
The approach for both S3 and Swift remains consistent with the rest of the mirror strategy. No concerns.
385-408: Deletion of buckets is handled correctly.
Properly detects whether the replica is S3 or Swift, creating and modifying the request as needed.
413-417: Multipart completion reuse is a nice approach.
Delegating completion to `mirrorPutObject` keeps the logic simple and cohesive.
none, i.e. there will not be any auth scheme, apart from signature verification.
Migration notes
Summary by CodeRabbit
New Features
Refactor
`Bucket` class with JSON deserialization capabilities and additional properties.
Chores