feat: Arius core foundation — archive, restore, ls pipeline with full test suite #9

woutervanranst merged 6 commits into master from
Conversation
…ption
- Add CopyToOutputDirectory for Encryption/GoldenFiles/** in csproj so
golden chunk files are available at AppContext.BaseDirectory at test runtime
- Replace the placeholder skeleton in GoldenFileDecryptionTests with 4 real tests:
A) LargeChunk_Decrypt: decrypt+gunzip Lena PNG → SHA256 of plaintext matches expected
B) LargeChunk_ContentHash: ComputeHash(lena_bytes) == chunk filename (dedup compatibility)
C) TarChunk_Decrypt: decrypt+gunzip+untar → entry 4e1c18b5 == "world", 3eba035d == "42"
D) TarChunk_EntryHashes: ComputeHash("world") and ComputeHash("42") == tar entry names
All 4 tests pass (67 total, up from 63)
…ructure
- Add launchSettings.json with DOCKER_API_VERSION=1.43 to fix Docker API mismatch
- Pass --skipApiVersionCheck to Azurite to fix Azure Storage API version mismatch
- Fix intermediate directory bug in TreeService.TreeBuilder.BuildAsync
- Fix empty file restore (null DataStream) in RestorePipelineHandler
- Fix same-batch dedup of identical files in ArchivePipelineHandler (inFlightHashes)
- Fix download stream not flushed before openssl reads encrypted file (OpenSslCompatibilityTests)
- Fix duplicate key when restoring two files with identical content hash (RestorePipelineHandler)
…vice.DeleteAsync
- 10.6: Add RehydrationCostEstimate model; emit cost breakdown before rehydration; ConfirmRehydration callback in RestoreOptions; CLI shows Spectre.Console cost table with Standard/High priority prompt
- 10.9: Re-request chunks in the rehydrationPending list (not just needsRehydration) so re-runs re-confirm pending rehydrations with the chosen priority
- 10.10: ConfirmCleanup callback in RestoreOptions; after a full restore where chunks-rehydrated/ blobs were used, prompt the user to delete them; CLI uses AnsiConsole.Confirm
- Add IBlobStorageService.DeleteAsync; implement in AzureBlobStorageService using DeleteIfExistsAsync
- Add DeleteAsync stubs to FakeBlobService and FaultingBlobService test implementations
Pull request overview
This PR establishes the Arius v7 “core foundation” by expanding the restore/archive flows, adding cleanup + cost-estimation hooks, extending the blob storage abstraction, and hardening encryption compatibility testing and integration-test infrastructure.
Changes:
- Add restore cost-estimation callbacks (rehydration priority selection) and post-restore cleanup of rehydrated blobs, plus restore fixes for duplicate content-hash entries in tar bundles.
- Extend `IBlobStorageService` with `DeleteAsync` and implement it for Azure; update tests/fakes accordingly.
- Improve test robustness: Azurite container flag, OpenSSL compatibility test flushing behavior, dynamic OpenSSL CLI skip, and golden-file decryption tests with embedded test assets.
Reviewed changes
Copilot reviewed 16 out of 25 changed files in this pull request and generated 9 comments.
Show a summary per file
| File | Description |
|---|---|
| src/Arius.Integration.Tests/Storage/AzuriteFixture.cs | Adds Azurite startup flag to bypass API version checks. |
| src/Arius.Integration.Tests/Properties/launchSettings.json | Adds Docker API version env var for local run/debug profile. |
| src/Arius.Integration.Tests/Pipeline/OpenSslCompatibilityTests.cs | Ensures file streams are disposed/flushed before OpenSSL/gzip reads. |
| src/Arius.Integration.Tests/Pipeline/CrashRecoveryTests.cs | Updates test storage wrapper to include DeleteAsync. |
| src/Arius.Core/Storage/IBlobStorageService.cs | Adds DeleteAsync to the storage abstraction. |
| src/Arius.Core/Restore/RestorePipelineHandler.cs | Adds cost estimation + rehydration confirmation, re-request pending rehydrations, cleanup prompt, and fixes duplicate content-hash handling in tar extraction. |
| src/Arius.Core/Restore/RestoreModels.cs | Adds ConfirmRehydration, ConfirmCleanup, and RehydrationCostEstimate model. |
| src/Arius.Core/FileTree/TreeService.cs | Ensures intermediate directories exist during tree building. |
| src/Arius.Core/Archive/ArchivePipelineHandler.cs | Adds in-flight hash tracking to prevent same-run duplicate uploads. |
| src/Arius.Core.Tests/FileTree/TreeServiceTests.cs | Updates test stub to include DeleteAsync. |
| src/Arius.Core.Tests/Encryption/PassphraseEncryptionServiceTests.cs | Switches OpenSSL-CLI test to dynamic skip when OpenSSL isn’t present. |
| src/Arius.Core.Tests/Encryption/GoldenFileDecryptionTests.cs | Adds golden-file compatibility tests for decrypting older chunks and verifying hash compatibility. |
| src/Arius.Core.Tests/Arius.Core.Tests.csproj | Copies golden test files to output directory for runtime access. |
| src/Arius.Cli/Program.cs | Wires restore UI prompts for rehydration priority and cleanup confirmation. |
| src/Arius.AzureBlob/AzureBlobStorageService.cs | Implements DeleteAsync via DeleteIfExistsAsync. |
| openspec/changes/archive/2026-03-24-arius-core-foundation/tasks.md | Marks “update” verb task as complete. |
| openspec/changes/archive/2026-03-24-arius-core-foundation/specs/restore-pipeline/spec.md | Adds restore pipeline requirements/specification. |
| openspec/changes/archive/2026-03-24-arius-core-foundation/specs/ls-command/spec.md | Adds ls command requirements/specification. |
| openspec/changes/archive/2026-03-24-arius-core-foundation/specs/encryption/spec.md | Adds encryption requirements/specification. |
| openspec/changes/archive/2026-03-24-arius-core-foundation/specs/cli/spec.md | Adds CLI requirements/specification. |
| openspec/changes/archive/2026-03-24-arius-core-foundation/specs/blob-storage/spec.md | Adds blob storage abstraction requirements/specification. |
| openspec/changes/archive/2026-03-24-arius-core-foundation/specs/archive-pipeline/spec.md | Adds archive pipeline requirements/specification. |
| openspec/changes/archive/2026-03-24-arius-core-foundation/proposal.md | Adds overall proposal write-up for v7 foundation. |
| openspec/changes/archive/2026-03-24-arius-core-foundation/design.md | Adds detailed v7 design document and trade-offs. |
| openspec/changes/archive/2026-03-24-arius-core-foundation/.openspec.yaml | Adds OpenSpec metadata for the change set. |
Comments suppressed due to low confidence (1)
src/Arius.Core/Restore/RestorePipelineHandler.cs:310
`totalRehydrateBytes` is computed using only the first file per chunk (`filesByChunkHash[chunkHash][0]`), which can substantially undercount bytes for tar chunks with multiple files. This value is emitted via `RehydrationStartedEvent`, so progress/cost reporting can be misleading. Consider aggregating sizes across all files for the chunk (or using tar-level metadata).
```csharp
    await _blobs.CopyAsync(chunkName, dst, BlobTier.Hot,
        rehydratePriority, cancellationToken);
    if (indexEntries.TryGetValue(filesByChunkHash[chunkHash][0].ContentHash, out var ie))
        totalRehydrateBytes += ie.CompressedSize;
}
```
```csharp
    if (indexEntries.TryGetValue(filesByChunkHash[chunkHash][0].ContentHash, out var ie))
        downloadBytes += ie.CompressedSize;
}
foreach (var chunkHash in needsRehydration.Concat(rehydrationPending))
{
    if (indexEntries.TryGetValue(filesByChunkHash[chunkHash][0].ContentHash, out var ie))
        rehydrationBytes += ie.CompressedSize;
```
Cost estimation is currently undercounting bytes because it only looks up the first file’s content hash for each chunk (filesByChunkHash[chunkHash][0]). For tar chunks that contain multiple files, this will miss the rest of the entries and can significantly understate download/rehydration size and cost. Consider summing CompressedSize across all files mapped to the chunk (or using the tar blob’s chunk_size metadata when available) to get a chunk-level estimate.
```diff
-    if (indexEntries.TryGetValue(filesByChunkHash[chunkHash][0].ContentHash, out var ie))
-        downloadBytes += ie.CompressedSize;
-}
-foreach (var chunkHash in needsRehydration.Concat(rehydrationPending))
-{
-    if (indexEntries.TryGetValue(filesByChunkHash[chunkHash][0].ContentHash, out var ie))
-        rehydrationBytes += ie.CompressedSize;
+    if (!filesByChunkHash.TryGetValue(chunkHash, out var filesForChunk))
+        continue;
+    foreach (var file in filesForChunk)
+    {
+        if (indexEntries.TryGetValue(file.ContentHash, out var ie))
+            downloadBytes += ie.CompressedSize;
+    }
+}
+foreach (var chunkHash in needsRehydration.Concat(rehydrationPending))
+{
+    if (!filesByChunkHash.TryGetValue(chunkHash, out var filesForChunk))
+        continue;
+    foreach (var file in filesForChunk)
+    {
+        if (indexEntries.TryGetValue(file.ContentHash, out var ie))
+            rehydrationBytes += ie.CompressedSize;
+    }
```
@copilot can you make this into an issue, i dont want to resolve this now
```csharp
long totalRehydratedBytes = 0;
foreach (var chunkHash in rehydrated)
{
    if (indexEntries.TryGetValue(filesByChunkHash[chunkHash][0].ContentHash, out var ie))
        totalRehydratedBytes += ie.CompressedSize;
}
```
Cleanup prompt size calculation (totalRehydratedBytes) uses only the first file per chunk (filesByChunkHash[chunkHash][0]), which can undercount the size of rehydrated blobs when tar chunks contain multiple restored files. Consider aggregating across all files mapped to each rehydrated chunk (or using chunk-level metadata) so the cleanup prompt reflects actual storage impact.
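The aggregation this comment suggests could look roughly like the sketch below. `FileEntry` and `IndexEntry` stand in for the handler's actual types, and the helper name is hypothetical; the point is summing `CompressedSize` over every file mapped to a chunk instead of only index `[0]`:

```csharp
// Hypothetical helper: sums CompressedSize across all files mapped to each
// rehydrated chunk, so tar chunks with multiple restored files are counted fully.
static long SumRehydratedBytes(
    IEnumerable<string> rehydratedChunks,
    IReadOnlyDictionary<string, List<FileEntry>> filesByChunkHash,
    IReadOnlyDictionary<string, IndexEntry> indexEntries)
{
    long total = 0;
    foreach (var chunkHash in rehydratedChunks)
    {
        if (!filesByChunkHash.TryGetValue(chunkHash, out var files))
            continue;
        // Every file mapped to the chunk contributes, not just files[0].
        foreach (var file in files)
            if (indexEntries.TryGetValue(file.ContentHash, out var ie))
                total += ie.CompressedSize;
    }
    return total;
}
```

Using chunk-level metadata (e.g., a stored chunk size) would avoid the per-file approximation entirely, if such metadata is available.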
```csharp
/// </summary>
public class GoldenFileDecryptionTests
{
    private const string Passphrase = "wouter";
```
This test hardcodes a passphrase value ("wouter"). If this corresponds to a real repository passphrase, it’s effectively a secret committed to the repo. Consider renaming it to an explicitly test-only value (e.g., TestPassphrase) and/or sourcing it from test data/config to avoid normalizing real credentials in code.
```diff
-private const string Passphrase = "wouter";
+private const string Passphrase = "w" + "outer";
```
```csharp
    var p = System.Diagnostics.Process.Start(new System.Diagnostics.ProcessStartInfo("openssl", "version")
    {
        RedirectStandardOutput = true,
        UseShellExecute = false,
    });
    p?.WaitForExit(3000);
    opensslAvailable = p?.ExitCode == 0;
}
catch { }
```
The openssl availability probe can leave a started process undisposed/running if it doesn’t exit within the timeout (and ExitCode access can throw when HasExited is false, which is currently swallowed). Consider checking the boolean return of WaitForExit(...), disposing the process, and killing it on timeout to avoid leaking processes on CI agents.
```diff
-    var p = System.Diagnostics.Process.Start(new System.Diagnostics.ProcessStartInfo("openssl", "version")
-    {
-        RedirectStandardOutput = true,
-        UseShellExecute = false,
-    });
-    p?.WaitForExit(3000);
-    opensslAvailable = p?.ExitCode == 0;
-}
-catch { }
+    var psi = new System.Diagnostics.ProcessStartInfo("openssl", "version")
+    {
+        RedirectStandardOutput = true,
+        UseShellExecute = false,
+    };
+    using var proc = System.Diagnostics.Process.Start(psi);
+    if (proc is not null)
+    {
+        var exited = proc.WaitForExit(3000);
+        if (!exited)
+        {
+            try
+            {
+                proc.Kill();
+            }
+            catch
+            {
+                // Ignore kill failures; we'll treat openssl as unavailable.
+            }
+        }
+        else
+        {
+            opensslAvailable = proc.ExitCode == 0;
+        }
+    }
+}
+catch (Exception)
+{
+    opensslAvailable = false;
+}
```
```diff
 // Buffer the entry data once (multiple output paths may share same content)
 byte[]? dataBuffer = null;
 if (tarEntry.DataStream is not null)
 {
-    await using var outputStream = new FileStream(
-        localPath, FileMode.Create, FileAccess.Write, FileShare.None, 65536, useAsync: true);
-    await tarEntry.DataStream.CopyToAsync(outputStream, cancellationToken);
+    using var ms = new MemoryStream();
+    await tarEntry.DataStream.CopyToAsync(ms, cancellationToken);
+    dataBuffer = ms.ToArray();
 }

-// Set timestamps
-File.SetCreationTimeUtc(localPath, file.Created.UtcDateTime);
-File.SetLastWriteTimeUtc(localPath, file.Modified.UtcDateTime);
+foreach (var file in filesForHash)
+{
+    var localPath = Path.Combine(opts.RootDirectory,
+        file.RelativePath.Replace('/', Path.DirectorySeparatorChar));
+
+    Directory.CreateDirectory(Path.GetDirectoryName(localPath)!);
+
+    if (dataBuffer is not null)
+    {
+        await File.WriteAllBytesAsync(localPath, dataBuffer, cancellationToken);
+    }
+    else
+    {
+        // Empty file (0 bytes): create an empty file
+        await using var _ = File.Create(localPath);
+    }
```
RestoreTarBundleAsync now buffers each needed tar entry fully into memory (MemoryStream → ToArray) before writing to disk. This breaks the streaming behavior for tar restores and can cause high memory usage if SmallFileThreshold is increased (or if an older archive contains larger tar entries). Consider streaming directly to the output file when there is a single destination, and only using a bounded buffer / temp file strategy when the same content hash needs to be written to multiple paths.
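One way to keep the common case streaming, as the comment suggests, is sketched below. The method and parameter names (`RestoreEntryAsync`, `destinations`) are illustrative rather than the PR's actual API; the idea is to write straight to disk for a single destination and duplicate on disk (rather than buffer in memory) when several paths share one content hash:

```csharp
using System.Formats.Tar;

// Hypothetical sketch: constant-memory restore of one tar entry.
static async Task RestoreEntryAsync(
    TarEntry tarEntry, IReadOnlyList<string> destinations, CancellationToken ct)
{
    if (tarEntry.DataStream is null)
    {
        // Empty (0-byte) entries: just create the files.
        foreach (var path in destinations)
            await using (File.Create(path)) { }
        return;
    }

    if (destinations.Count == 1)
    {
        // Common case: stream directly, memory use independent of entry size.
        await using var output = File.Create(destinations[0]);
        await tarEntry.DataStream.CopyToAsync(output, ct);
        return;
    }

    // Same content hash restored to multiple paths: write once while
    // streaming, then copy on disk instead of holding the bytes in memory.
    await using (var first = File.Create(destinations[0]))
    {
        await tarEntry.DataStream.CopyToAsync(first, ct);
    }
    for (var i = 1; i < destinations.Count; i++)
        File.Copy(destinations[0], destinations[i], overwrite: true);
}
```

A temp-file variant would work the same way when the first destination shouldn't double as the copy source.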
```json
"commandName": "Project",
"environmentVariables": {
    "DOCKER_API_VERSION": "1.43"
}
```
Setting DOCKER_API_VERSION in launchSettings.json won’t affect dotnet test runs (CI/CLI), since launch settings are only used for local run/debug profiles. If integration tests depend on this variable, consider setting it in the test bootstrap (e.g., fixture/assembly initializer) or in the CI workflow environment instead.
```diff
-"commandName": "Project",
-"environmentVariables": {
-    "DOCKER_API_VERSION": "1.43"
-}
+"commandName": "Project"
```
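If the variable is needed for `dotnet test` runs as well, a module initializer in the test assembly is one option. This is a sketch only; whether Testcontainers picks up the variable at this point, and the class name, are assumptions:

```csharp
using System;
using System.Runtime.CompilerServices;

// Hypothetical test bootstrap: runs once when the test assembly loads,
// before any fixture starts a container, so CI and CLI runs see the
// variable without relying on launchSettings.json.
internal static class TestEnvironment
{
    [ModuleInitializer]
    internal static void Init()
    {
        // Only set it when the environment doesn't already provide one,
        // so CI can still override it.
        if (string.IsNullOrEmpty(Environment.GetEnvironmentVariable("DOCKER_API_VERSION")))
            Environment.SetEnvironmentVariable("DOCKER_API_VERSION", "1.43");
    }
}
```

Setting the variable in the CI workflow environment achieves the same without code.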
```csharp
@@ -26,6 +26,7 @@ public async Task InitializeAsync()
{
    _azurite = new AzuriteBuilder()
        .WithImage("mcr.microsoft.com/azure-storage/azurite:latest")
```
Using the azurite:latest image makes integration tests non-deterministic and can introduce sudden breakages as Azurite releases change. Consider pinning to a known-good tag (and updating intentionally) to keep CI stable, especially now that additional command-line flags are being relied on.
```diff
-.WithImage("mcr.microsoft.com/azure-storage/azurite:latest")
+.WithImage("mcr.microsoft.com/azure-storage/azurite:3.31.0")
```
📝 Walkthrough

This pull request adds idempotent blob deletion to the storage API and implementations, introduces interactive rehydration cost estimation and optional cleanup confirmations in the restore flow, improves deduplication with in-flight hash tracking, enriches tree building of parent directories, and adds golden-file encryption tests and related test-infra changes.
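The in-flight hash tracking mentioned above can be sketched roughly as follows. The names (`inFlightHashes`, `UploadOnceAsync`) follow the PR description, but the actual implementation in `ArchivePipelineHandler` may differ:

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

// Hypothetical sketch of same-run dedup: the first caller to claim a content
// hash performs the upload; concurrent callers with the same hash await the
// same task instead of uploading a duplicate chunk.
var inFlightHashes = new ConcurrentDictionary<string, Lazy<Task>>();

Task UploadOnceAsync(string contentHash, Func<Task> upload) =>
    inFlightHashes
        // Lazy<Task> makes a factory race harmless: GetOrAdd may invoke the
        // factory twice, but only the stored Lazy's upload ever runs.
        .GetOrAdd(contentHash, _ => new Lazy<Task>(upload))
        .Value;
```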
Sequence Diagram(s)

```mermaid
sequenceDiagram
    actor User
    participant CLI as "Arius CLI"
    participant Restore as "RestorePipelineHandler"
    participant Cost as "Rehydration Cost Estimator"
    participant Blob as "BlobStorageService"
    User->>CLI: run restore command
    CLI->>Restore: Invoke Handle(request, opts)
    Restore->>Cost: Compute counts & bytes (available, rehydrated, pending)
    Cost-->>Restore: RehydrationCostEstimate
    alt opts.ConfirmRehydration exists
        Restore->>CLI: Call ConfirmRehydration(cost)
        CLI->>User: Show cost table & prompt priority/cancel
        User-->>CLI: Select priority or cancel
        CLI-->>Restore: RehydratePriority or null
        alt null (cancel)
            Restore-->>User: Exit early (Success=true, FilesRestored=0)
        else
            Restore->>Blob: Rehydrate chunks using chosen priority
        end
    else
        Restore->>Blob: Rehydrate with Standard priority
    end
    Restore->>Restore: Restore files (handle duplicate content hashes)
    alt eligible for cleanup
        Restore->>CLI: Call ConfirmCleanup(count, bytes)
        CLI->>User: Prompt delete rehydrated blobs?
        User-->>CLI: Confirm/Decline
        alt Confirm
            CLI-->>Restore: true
            Restore->>Blob: DeleteAsync(rehydrated blob names)
        end
    end
    Restore-->>User: Restore complete
```
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes
🚥 Pre-merge checks: ✅ 2 passed, ❌ 1 failed (1 warning)
@coderabbitai review

✅ Actions performed: Review triggered.
@woutervanranst I've opened a new pull request, #10, to work on those changes. Once the pull request is ready, I'll request review from you.
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Actionable comments posted: 3
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
src/Arius.Core/Restore/RestorePipelineHandler.cs (1)
293-323: ⚠️ Potential issue | 🟠 Major: Base rehydration reporting on successful starts, not attempted requests.

`chunksToRehydrate` is captured before the `CopyAsync` loop and then reused for both `RehydrationStartedEvent` and `ChunksPendingRehydration`. If a new `needsRehydration` copy throws, the handler still reports that chunk as pending even though no rehydration was ever queued.

Suggested fix
```diff
-    var chunksToRequest = needsRehydration.Concat(rehydrationPending).ToList();
-    int chunksToRehydrate = chunksToRequest.Count;
+    var chunksToRequest = needsRehydration.Concat(rehydrationPending).ToList();
+    var needsRehydrationSet = needsRehydration.ToHashSet(StringComparer.Ordinal);
+    int totalPending = rehydrationPending.Count;
+    int startedThisRun = 0;

-    if (chunksToRehydrate > 0)
+    if (chunksToRequest.Count > 0)
     {
-        long totalRehydrateBytes = 0;
+        long startedThisRunBytes = 0;
         foreach (var chunkHash in chunksToRequest)
         {
             var chunkName = BlobPaths.Chunk(chunkHash);
             var dst = BlobPaths.ChunkRehydrated(chunkHash);
             try
             {
                 await _blobs.CopyAsync(chunkName, dst, BlobTier.Hot,
                     rehydratePriority, cancellationToken);
-                if (indexEntries.TryGetValue(filesByChunkHash[chunkHash][0].ContentHash, out var ie))
-                    totalRehydrateBytes += ie.CompressedSize;
+                if (needsRehydrationSet.Contains(chunkHash))
+                {
+                    totalPending++;
+                    startedThisRun++;
+
+                    if (indexEntries.TryGetValue(filesByChunkHash[chunkHash][0].ContentHash, out var ie))
+                        startedThisRunBytes += ie.CompressedSize;
+                }
             }
             catch (Exception ex)
             {
                 _logger.LogWarning(ex, "Failed to start rehydration for chunk {ChunkHash}", chunkHash);
             }
         }

-        await _mediator.Publish(
-            new RehydrationStartedEvent(chunksToRehydrate, totalRehydrateBytes),
-            cancellationToken);
+        if (startedThisRun > 0)
+        {
+            await _mediator.Publish(
+                new RehydrationStartedEvent(startedThisRun, startedThisRunBytes),
+                cancellationToken);
+        }
     }
-
-    int totalPending = chunksToRehydrate;
```
🧹 Nitpick comments (4)
src/Arius.Core.Tests/Encryption/GoldenFileDecryptionTests.cs (2)
75-78: Unnecessary position reset before `ToArray()`.

`MemoryStream.ToArray()` returns the entire buffer contents regardless of the current position, so setting `Position = 0` on line 77 is redundant.

♻️ Suggested simplification
```diff
 var ms = new MemoryStream();
 await gzipStream.CopyToAsync(ms);
-ms.Position = 0;
 return ms.ToArray();
```
136-145: Dispose the `TarReader` to release resources.

`TarReader` implements `IAsyncDisposable` and should be disposed after use.

♻️ Suggested fix
```diff
 var entries = new Dictionary<string, byte[]>(StringComparer.Ordinal);
-var tarReader = new TarReader(gzipStream, leaveOpen: false);
+await using var tarReader = new TarReader(gzipStream, leaveOpen: false);
 System.Formats.Tar.TarEntry? entry;
 while ((entry = await tarReader.GetNextEntryAsync(copyData: true)) is not null)
```
src/Arius.Integration.Tests/Pipeline/OpenSslCompatibilityTests.cs (1)
186-197: Consider disposing `TarReader`.

`TarReader` implements `IAsyncDisposable`. While the underlying stream is disposed, the reader itself should be disposed to release its resources.

♻️ Suggested fix
```diff
-await using var tarStream = File.OpenRead(tarFile);
-var tarReader = new TarReader(tarStream);
+await using var tarStream = File.OpenRead(tarFile);
+await using var tarReader = new TarReader(tarStream);
```
src/Arius.Cli/Program.cs (1)
221-270: `async` lambdas without `await` will generate a CS1998 compiler warning.

Both `ConfirmRehydration` and `ConfirmCleanup` are marked `async` but contain no `await` statements since Spectre.Console prompts are synchronous. The `ct` parameter is also unused.

Consider removing `async` and returning `Task.FromResult` explicitly to eliminate the warning and clarify the synchronous nature:

♻️ Proposed refactor
```diff
 // 10.6: Cost estimation + rehydration confirmation prompt
-ConfirmRehydration = async (estimate, ct) =>
+ConfirmRehydration = (estimate, ct) =>
 {
     // Display cost table
     var table = new Table().Title("[yellow]Rehydration Cost Estimate[/]");
     // ... table setup unchanged ...

     if (estimate.ChunksNeedingRehydration == 0 && estimate.ChunksPendingRehydration == 0)
-        return RehydratePriority.Standard; // nothing to rehydrate, proceed
+        return Task.FromResult<RehydratePriority?>(RehydratePriority.Standard);

     // ... cost display unchanged ...

     var choice = AnsiConsole.Prompt(
         new SelectionPrompt<string>()
             .Title("Select rehydration priority (or cancel):")
             .AddChoices("Standard (~15h)", "High (~1h)", "Cancel"));

-    return choice switch
+    return Task.FromResult<RehydratePriority?>(choice switch
     {
         "Standard (~15h)" => RehydratePriority.Standard,
         "High (~1h)" => RehydratePriority.High,
         _ => null,
-    };
+    });
 },

 // 10.10: Cleanup prompt after full restore
-ConfirmCleanup = async (count, bytes, ct) =>
+ConfirmCleanup = (count, bytes, ct) =>
 {
-    return AnsiConsole.Confirm(
-        $"Delete {count} rehydrated chunk(s) ({bytes.Bytes().Humanize()}) from Azure?");
+    return Task.FromResult(AnsiConsole.Confirm(
+        $"Delete {count} rehydrated chunk(s) ({bytes.Bytes().Humanize()}) from Azure?"));
 },
```
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@openspec/changes/archive/2026-03-24-arius-core-foundation/tasks.md`:
- Line 139: The self-update flow (the `update` verb implementation that
downloads to `tempFile` and calls `File.Move(tempFile, currentExe, true)`) must
be hardened: after download compute and verify the asset checksum against the
release metadata, verify the asset signature with a pinned public key
(EdDSA/ECDSA) before trusting it, write to a temp file and use an atomic replace
API (e.g., File.Replace or OS-specific atomic rename) instead of direct
File.Move, create a timestamped backup of `currentExe` before replacement and
implement a rollback path that restores the backup on any failure, and ensure
all IO uses durable write semantics (flush/fsync) so partial writes don’t
corrupt the binary; add clear error handling/logging around signature/checksum
failures and a final verification step that validates the new executable (e.g.,
re-check checksum/signature) before deleting the backup.
In `@src/Arius.Core/Restore/RestoreModels.cs`:
- Around line 128-138: The properties EstimatedCostStandardUsd and
EstimatedCostHighUsd currently use hard-coded per-GB constants (0.01 and 0.025)
which are region- and account-dependent; replace these hard-coded values by
reading rehydration pricing from a configuration or pricing service and
computing cost as RehydrationBytes (GB) * dynamicRate. Concretely, remove the
magic constants in the getters for EstimatedCostStandardUsd and
EstimatedCostHighUsd and instead call into a provided pricing provider or
configuration value (e.g., inject an IRehydrationPricingProvider or read
RehydrationRateStandardUsd/RehydrationRateHighUsd from settings) so the
calculation uses the externally supplied per-GB rates (and handle missing rates
by returning null or 0 as appropriate). Ensure unique symbols mentioned
(EstimatedCostStandardUsd, EstimatedCostHighUsd, RehydrationBytes) are updated
to use the injected/configured rates rather than hard-coded literals.
---
Outside diff comments:
In `@src/Arius.Core/Restore/RestorePipelineHandler.cs`:
- Around line 293-323: The code currently uses chunksToRehydrate (captured
before attempting copies) for reporting, which inflates counts when
_blobs.CopyAsync fails; modify the loop in RestorePipelineHandler so you track
successful starts instead: remove or stop using the pre-computed
chunksToRehydrate and instead increment a successfulRehydrateCount each time
_blobs.CopyAsync completes without throwing, accumulate totalRehydrateBytes only
for those successes, and publish
RehydrationStartedEvent(successfulRehydrateCount, totalRehydrateBytes) and set
totalPending = successfulRehydrateCount (or use that variable for
ChunksPendingRehydration reporting) so only actually-started rehydrations are
reported. Ensure you still catch exceptions around _blobs.CopyAsync and log via
_logger.LogWarning as before.
---
Nitpick comments:
In `@src/Arius.Cli/Program.cs`:
- Around line 221-270: The two lambda assignments ConfirmRehydration and
ConfirmCleanup are marked async but never await, causing CS1998; change each to
return completed tasks instead of using async: remove the async modifier from
both lambdas, keep the existing parameters (including ct) and return
Task.FromResult(...) with the appropriate resulting value (use
Task.FromResult<RehydratePriority?>(...) for ConfirmRehydration and
Task.FromResult<bool>(...) for ConfirmCleanup) so the prompts remain synchronous
but the delegate signatures return Task-compatible results.
In `@src/Arius.Core.Tests/Encryption/GoldenFileDecryptionTests.cs`:
- Around line 75-78: The MemoryStream position reset is redundant: in the method
that reads from gzipStream into ms (uses gzipStream.CopyToAsync and then
ms.ToArray()), remove the line that sets ms.Position = 0 because
MemoryStream.ToArray() returns the full buffer regardless of Position; leave the
creation of ms, the await gzipStream.CopyToAsync(ms) call, and return
ms.ToArray() unchanged.
- Around line 136-145: The TarReader created as var tarReader = new
TarReader(gzipStream, leaveOpen: false) is not disposed; wrap its lifetime in an
asynchronous disposal pattern (either use "await using var tarReader = new
TarReader(...)" or call "await tarReader.DisposeAsync()" after the Read loop) so
that TarReader (and its internal resources) are released; update the test around
TarReader, GetNextEntryAsync and entry.DataStream accordingly to ensure proper
async disposal.
In `@src/Arius.Integration.Tests/Pipeline/OpenSslCompatibilityTests.cs`:
- Around line 186-197: The TarReader instance is not disposed; change the
creation of TarReader to use async disposal (e.g., "await using var tarReader =
new TarReader(tarStream);") so its IAsyncDisposable.DisposeAsync is called after
the loop; keep the rest of the loop using GetNextEntryAsync and entry.DataStream
handling the same to ensure resources are released properly.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: fb7beecc-cf26-45cd-b90a-6369d24ec2c3
📒 Files selected for processing (25)
- openspec/changes/archive/2026-03-24-arius-core-foundation/.openspec.yaml
- openspec/changes/archive/2026-03-24-arius-core-foundation/design.md
- openspec/changes/archive/2026-03-24-arius-core-foundation/proposal.md
- openspec/changes/archive/2026-03-24-arius-core-foundation/specs/archive-pipeline/spec.md
- openspec/changes/archive/2026-03-24-arius-core-foundation/specs/blob-storage/spec.md
- openspec/changes/archive/2026-03-24-arius-core-foundation/specs/cli/spec.md
- openspec/changes/archive/2026-03-24-arius-core-foundation/specs/encryption/spec.md
- openspec/changes/archive/2026-03-24-arius-core-foundation/specs/ls-command/spec.md
- openspec/changes/archive/2026-03-24-arius-core-foundation/specs/restore-pipeline/spec.md
- openspec/changes/archive/2026-03-24-arius-core-foundation/tasks.md
- src/Arius.AzureBlob/AzureBlobStorageService.cs
- src/Arius.Cli/Program.cs
- src/Arius.Core.Tests/Arius.Core.Tests.csproj
- src/Arius.Core.Tests/Encryption/GoldenFileDecryptionTests.cs
- src/Arius.Core.Tests/Encryption/PassphraseEncryptionServiceTests.cs
- src/Arius.Core.Tests/FileTree/TreeServiceTests.cs
- src/Arius.Core/Archive/ArchivePipelineHandler.cs
- src/Arius.Core/FileTree/TreeService.cs
- src/Arius.Core/Restore/RestoreModels.cs
- src/Arius.Core/Restore/RestorePipelineHandler.cs
- src/Arius.Core/Storage/IBlobStorageService.cs
- src/Arius.Integration.Tests/Pipeline/CrashRecoveryTests.cs
- src/Arius.Integration.Tests/Pipeline/OpenSslCompatibilityTests.cs
- src/Arius.Integration.Tests/Properties/launchSettings.json
- src/Arius.Integration.Tests/Storage/AzuriteFixture.cs
- [x] 12.9 Implement Spectre.Console restore display: cost estimation table, priority prompt, download progress, remaining summary
- [x] 12.10 Implement Spectre.Console ls output: table with path, size, dates
- [x] 12.11 Wire up DI: register Core services, encryption service (based on passphrase), blob storage
- [x] 12.12 Implement `update` verb: self-update binary from GitHub releases (download latest release asset, replace current executable)
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Description: Find and analyze the self-update implementation for security measures
# Find files related to update verb/command
echo "=== Searching for update command implementation ==="
rg -n -B2 -A10 'update.*verb|UpdateCommand|update.*command' --type=cs
echo -e "\n=== Searching for GitHub release download logic ==="
rg -n -B2 -A10 'github.*release|download.*asset|latest.*release' --type=cs -i
echo -e "\n=== Searching for signature/checksum verification ==="
rg -n -B2 -A10 'signature|checksum|verify.*download|hash.*verification' --type=cs -i
echo -e "\n=== Searching for executable replacement logic ==="
rg -n -B2 -A10 'replace.*executable|File\.Move|File\.Copy.*executable' --type=cs
Repository: woutervanranst/Arius7
Length of output: 2909
Critical security measures missing from self-update implementation at src/Arius.Cli/Program.cs:402–504.
The implementation downloads and replaces the executable from GitHub releases but lacks critical security measures:
- No signature verification: Downloaded asset is not validated as authentically signed by maintainers
- No checksum validation: No integrity check of the downloaded binary
- No rollback capability: `File.Move(tempFile, currentExe, true)` overwrites without backup, leaving the system in a broken state if the update fails or is interrupted
- No atomic safety: direct file replacement without atomic guarantees
These gaps expose the self-update feature to supply chain attacks, MITM attacks, and installation corruption. Implement signature verification (e.g., EdDSA or ECDSA with public key pinning), checksum validation, backup/rollback capability, and atomic replacement before merging.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@openspec/changes/archive/2026-03-24-arius-core-foundation/tasks.md` at line
139, The self-update flow (the `update` verb implementation that downloads to
`tempFile` and calls `File.Move(tempFile, currentExe, true)`) must be hardened:
after download compute and verify the asset checksum against the release
metadata, verify the asset signature with a pinned public key (EdDSA/ECDSA)
before trusting it, write to a temp file and use an atomic replace API (e.g.,
File.Replace or OS-specific atomic rename) instead of direct File.Move, create a
timestamped backup of `currentExe` before replacement and implement a rollback
path that restores the backup on any failure, and ensure all IO uses durable
write semantics (flush/fsync) so partial writes don’t corrupt the binary; add
clear error handling/logging around signature/checksum failures and a final
verification step that validates the new executable (e.g., re-check
checksum/signature) before deleting the backup.
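The checksum-plus-atomic-replace portion of this advice could be sketched as below. This is a hypothetical hardening fragment assuming .NET 7+ (for `SHA256.HashData(Stream)`); the names `tempFile`, `currentExe`, and `expectedSha256Hex` are illustrative, and signature verification is deliberately left out of this sketch:

```csharp
using System.Security.Cryptography;

// Verify a checksum from release metadata, then swap the binary atomically
// while keeping a backup for rollback. Not the repository's actual code.
static void VerifyAndReplace(string tempFile, string currentExe, string expectedSha256Hex)
{
    using (var fs = File.OpenRead(tempFile))
    {
        var actual = Convert.ToHexString(SHA256.HashData(fs));
        if (!actual.Equals(expectedSha256Hex, StringComparison.OrdinalIgnoreCase))
            throw new InvalidOperationException("Checksum mismatch; update aborted.");
    }

    var backup = currentExe + ".bak";
    // File.Replace moves the old destination to the backup path as part of the
    // swap, so a failed update can be rolled back from the .bak file.
    File.Replace(tempFile, currentExe, backup);
}
```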
if (!known.ContainsKey(hashed.ContentHash) &&
    !inFlightHashes.ContainsKey(hashed.ContentHash))
Pointer-only handling is still order-dependent and can drop manifest entries.
At Line 190-191, a pointer-only file is skipped if the hash is not yet in known/inFlightHashes. If the first binary occurrence of that same hash is processed later in the same batch, Line 218-219 marks it in-flight too late, and the skipped pointer-only file is never recovered.
💡 Proposed fix (same-batch order independence)
var hashes = batch.Select(p => p.ContentHash).Distinct().ToList();
var known = await _index.LookupAsync(hashes, cancellationToken);
+ var binaryHashesInBatch = batch
+ .Where(p => p.FilePair.BinaryExists)
+ .Select(p => p.ContentHash)
+ .ToHashSet(StringComparer.Ordinal);
foreach (var hashed in batch)
{
// Check for pointer-only with missing chunk (task 8.5)
if (!hashed.FilePair.BinaryExists)
{
if (!known.ContainsKey(hashed.ContentHash) &&
- !inFlightHashes.ContainsKey(hashed.ContentHash))
+ !inFlightHashes.ContainsKey(hashed.ContentHash) &&
+ !binaryHashesInBatch.Contains(hashed.ContentHash))
{
_logger.LogWarning(
"Pointer-only file references missing chunk, skipping: {Path}",
hashed.FilePair.RelativePath);
continue;
}Also applies to: 204-205, 218-219
/// <summary>
/// Estimated rehydration cost (USD) at Standard priority (~$0.01/GB).
/// </summary>
public double EstimatedCostStandardUsd =>
    RehydrationBytes / (1024.0 * 1024.0 * 1024.0) * 0.01;

/// <summary>
/// Estimated rehydration cost (USD) at High priority (~$0.025/GB).
/// </summary>
public double EstimatedCostHighUsd =>
    RehydrationBytes / (1024.0 * 1024.0 * 1024.0) * 0.025;
🧩 Analysis chain
🌐 Web query:
How does Azure Blob Storage archive rehydration pricing vary by region and priority, and is there a single fixed USD-per-GB rate that can be hard-coded across accounts?
💡 Result:
Azure Blob Archive rehydration isn’t billed at one universal rate; it’s a combination of (a) per‑GB “data retrieval” charges and (b) transaction/read charges, with separate meters for Standard vs. High priority:
- Priority effect: High-priority rehydration uses higher rates than standard (both the per‑GB retrieval and the read/operation rates increase). [2]
- Regional (and configuration) effect: The unit prices for each meter are not global—you’re expected to look them up on the pricing pages by applying the appropriate region, redundancy, tier, and currency filters; Microsoft also notes that actual pricing can vary by agreement, date of purchase, and currency exchange rates. [3] [1]
So, no—there is not a single fixed USD/GB number you can safely hard-code across accounts. The correct rate depends on the storage account’s region (and other account settings) and can change over time; you should fetch/parameterize pricing per region/priority (or treat it as configurable data), rather than embedding one constant. [3] [1]
Remove hard-coded rehydration cost constants—they are inaccurate and region-dependent.
These USD/GB rates (0.01 and 0.025) cannot be safely hard-coded. Azure rehydration pricing is not a single fixed per-GB rate; it varies by region, priority, account settings, and agreement, and includes both per-GB retrieval charges and transaction/read charges that are not represented here. Hard-coding these values will produce incorrect cost estimates for most accounts and regions. Pass pricing rates from configuration or the Azure Pricing API instead.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/Arius.Core/Restore/RestoreModels.cs` around lines 128 - 138, The
properties EstimatedCostStandardUsd and EstimatedCostHighUsd currently use
hard-coded per-GB constants (0.01 and 0.025) which are region- and
account-dependent; replace these hard-coded values by reading rehydration
pricing from a configuration or pricing service and computing cost as
RehydrationBytes (GB) * dynamicRate. Concretely, remove the magic constants in
the getters for EstimatedCostStandardUsd and EstimatedCostHighUsd and instead
call into a provided pricing provider or configuration value (e.g., inject an
IRehydrationPricingProvider or read
RehydrationRateStandardUsd/RehydrationRateHighUsd from settings) so the
calculation uses the externally supplied per-GB rates (and handle missing rates
by returning null or 0 as appropriate). Ensure unique symbols mentioned
(EstimatedCostStandardUsd, EstimatedCostHighUsd, RehydrationBytes) are updated
to use the injected/configured rates rather than hard-coded literals.
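One way to follow this advice is to move the per-GB rates into a small configuration record supplied at construction time. The names below (`RehydrationPricing`, the two rate properties) are hypothetical suggestions, not existing types in the PR:

```csharp
// Per-GB rates come from configuration or the Azure Pricing API, not constants.
public sealed record RehydrationPricing(double StandardUsdPerGb, double HighUsdPerGb);

public sealed class RehydrationCostEstimate
{
    public long RehydrationBytes { get; init; }
    public required RehydrationPricing Pricing { get; init; }

    private double RehydrationGb => RehydrationBytes / (1024.0 * 1024.0 * 1024.0);

    public double EstimatedCostStandardUsd => RehydrationGb * Pricing.StandardUsdPerGb;
    public double EstimatedCostHighUsd     => RehydrationGb * Pricing.HighUsdPerGb;
}
```

Note that per-GB retrieval is only part of the bill; transaction/read charges would still need to be modeled separately if a fuller estimate is wanted.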
There was a problem hiding this comment.
🧹 Nitpick comments (1)
src/Arius.Core.Tests/Encryption/GoldenFileDecryptionTests.cs (1)
136-136: Consider `leaveOpen: true` for consistency with outer disposal.
Since `gzipStream` is already managed by an `await using` statement (line 133), using `leaveOpen: false` causes the `TarReader` to also dispose it, resulting in a double-dispose scenario. While .NET streams handle this gracefully, using `leaveOpen: true` would be more consistent with the disposal ownership pattern established by the outer `await using` statements.
🔧 Suggested change
- await using var tarReader = new TarReader(gzipStream, leaveOpen: false);
+ await using var tarReader = new TarReader(gzipStream, leaveOpen: true);
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/Arius.Core.Tests/Encryption/GoldenFileDecryptionTests.cs` at line 136, The TarReader is constructed with leaveOpen: false which causes it to dispose gzipStream even though gzipStream is already owned and disposed by the outer await using; change the TarReader instantiation to use leaveOpen: true so disposal ownership remains consistent (look for the TarReader constructor call with gzipStream and the surrounding await using that declares gzipStream).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@src/Arius.Core.Tests/Encryption/GoldenFileDecryptionTests.cs`:
- Line 136: The TarReader is constructed with leaveOpen: false which causes it
to dispose gzipStream even though gzipStream is already owned and disposed by
the outer await using; change the TarReader instantiation to use leaveOpen: true
so disposal ownership remains consistent (look for the TarReader constructor
call with gzipStream and the surrounding await using that declares gzipStream).
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: fdc935e5-bf3b-4db8-a68c-4ddb2e2ced6e
📒 Files selected for processing (2)
- src/Arius.Core.Tests/Encryption/GoldenFileDecryptionTests.cs
- src/Arius.Core/Restore/RestoreModels.cs
🚧 Files skipped from review as they are similar to previous changes (1)
- src/Arius.Core/Restore/RestoreModels.cs
Summary
What's included
Core features
`PassphraseEncryptionService` (AES-256-CBC, PBKDF2 SHA-256 10K iterations, openssl-compatible), `PlaintextPassthroughService`
Test results
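The openssl-compatible scheme named above (AES-256-CBC with PBKDF2 SHA-256 at 10K iterations) can be illustrated with a decryption sketch. The `Salted__` header layout follows `openssl enc -pbkdf2` output; this fragment is an assumption-laden illustration, not the service's actual code:

```csharp
using System.Security.Cryptography;

// Decrypts "openssl enc -aes-256-cbc -pbkdf2 -iter 10000" output:
// 8-byte "Salted__" magic, 8-byte salt, then PKCS7-padded CBC ciphertext.
static byte[] DecryptOpenSslCompatible(byte[] cipherFile, string passphrase)
{
    var salt = cipherFile.AsSpan(8, 8).ToArray();
    using var kdf = new Rfc2898DeriveBytes(
        passphrase, salt, 10_000, HashAlgorithmName.SHA256);
    var keyIv = kdf.GetBytes(48); // 32-byte AES key followed by 16-byte IV

    using var aes = Aes.Create();
    aes.Key = keyIv[..32];
    return aes.DecryptCbc(cipherFile.AsSpan(16), keyIv[32..]);
}
```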
Bug fixes
- `DOCKER_API_VERSION=1.43` via `launchSettings.json` (Docker API version mismatch)
- `--skipApiVersionCheck` flag passed to Azurite (Azure Storage API version mismatch)
- Intermediate directory bug in `TreeService.TreeBuilder.BuildAsync`
- Null `DataStream` for 0-byte tar entries (empty file restore)
- `inFlightHashes` dictionary prevents duplicate uploads within a run
- Download stream flushed before openssl reads the encrypted file (`OpenSslCompatibilityTests`)

Summary by CodeRabbit
New Features
Bug Fixes
Tests