AeroVault Wrapper-Stack and Cryptography: Design Conversation #276

axpnet · 2026-05-27T13:49:03Z · Maintainer

Anchor for the AeroVault design conversation in Design Threads. The full historical thread lives in the Community Roadmap discussion (the conversation that ran 2026-05-07 to 2026-05-19, with substantial design contribution from @EhudKirsh, credited inline in every AeroVault receipt and in the v3.8.0 release notes).

This post serves two roles at once: a checkpoint of every decision that came out of that conversation, and a starting base for everything that follows. Read it as the shared ground from which the next design questions on AeroVault start, not as a closing summary.

The pipeline at a glance

  +-------------+    +----------+    +-------+    +-------+
  | compression | -> | chunking | -> | crypt | -> |  ECC  |
  +-------------+    +----------+    +-------+    +-------+
    plaintext        cdc + packing    per chunk    on-storage

Each box is a first-class wrapper, not a sub-step buried inside another. The order is deliberate: compression before crypt (ciphertext does not compress, so the ratio only survives if compression runs first), crypt before ECC (so confidentiality survives parity-byte failures).

Checkpoint: agreed decisions

Wrapper-stack model

Decision: AeroVault is a pipeline of independently composable wrappers, applied as compression -> chunking -> crypt -> error-correction on write and reversed on read.

Why: each wrapper has its own job. Sub-stepping them inside one monolithic transform makes them inseparable for future swaps.

Status: ✅ Shipped in v3.8.0. Codified in docs/AEROVAULT-V3-SPEC.md.
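
A minimal sketch of the composition rule, assuming each wrapper exposes a `wrap`/`unwrap` pair (these names, and the `Pipeline` class, are illustrative and not taken from the spec). Only two stages are shown: `zlib` stands in for the compression wrapper and `ToyCrypt` is a deliberately insecure placeholder for the crypt wrapper. Chunking and error correction are omitted to keep the shape visible.

```python
import zlib
from typing import Protocol


class Wrapper(Protocol):
    def wrap(self, data: bytes) -> bytes: ...
    def unwrap(self, data: bytes) -> bytes: ...


class Compress:
    """Compression stage (zlib used purely as an example codec)."""

    def wrap(self, data: bytes) -> bytes:
        return zlib.compress(data)

    def unwrap(self, data: bytes) -> bytes:
        return zlib.decompress(data)


class ToyCrypt:
    """Placeholder for the crypt stage: repeating-key XOR, NOT secure."""

    def __init__(self, key: bytes) -> None:
        self.key = key

    def _xor(self, data: bytes) -> bytes:
        return bytes(b ^ self.key[i % len(self.key)] for i, b in enumerate(data))

    def wrap(self, data: bytes) -> bytes:
        return self._xor(data)

    def unwrap(self, data: bytes) -> bytes:
        return self._xor(data)


class Pipeline:
    """Applies stages in order on write and in reverse order on read."""

    def __init__(self, stages: list[Wrapper]) -> None:
        self.stages = stages

    def write(self, data: bytes) -> bytes:
        for stage in self.stages:
            data = stage.wrap(data)
        return data

    def read(self, data: bytes) -> bytes:
        for stage in reversed(self.stages):
            data = stage.unwrap(data)
        return data


pipeline = Pipeline([Compress(), ToyCrypt(b"k3y")])
stored = pipeline.write(b"hello " * 100)
restored = pipeline.read(stored)
```

Because each stage only sees bytes in and bytes out, swapping one codec for another touches a single class, which is the separability the decision is after.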

Pipeline order (post-correction)

Decision: compress -> chunk -> crypt -> ECC, with the initial sketch corrected mid-thread.

Original sketch: compress -> chunk -> BLAKE3 -> AES-256-GCM-SIV -> ECC (BLAKE3 listed as its own pipeline stage, which conflated hashing with crypt).

Correction: BLAKE3 is part of the chunking + crypt internals, not its own pipeline stage. The clean shape is the four-stage pipeline above.

Status: ✅ Captured in v3.8.0 release notes and spec.

Small-file packing (Ehud's contribution)

Decision: sub-threshold files (engine-derived ~256 KiB boundary) are concatenated into a pure-concatenation pack with no in-pack frame headers. The manifest is the index, carrying pack_offset + length per file. This is the "tar-ish framing without metadata bloat", in Ehud's words.

Why: lets the CDC chunker downstream stay multi-MiB chunked regardless of input file size, so dedup stays chunk-aligned even on directory trees full of tiny files.

Status: ✅ Shipped in v3.8.0. Verified end-to-end: a 254-file round trip (250 small + 2 duplicates + 2 large) collapsed to 4 physical chunks with 251 dedup hits and byte-identical extraction.
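
As an illustration of the pack-plus-manifest idea, here is a sketch under stated assumptions: the fixed `SMALL_FILE_THRESHOLD` is a stand-in for the engine-derived boundary, and `PackEntry`, `pack_small_files` and `extract` are hypothetical names. Only `pack_offset` and `length` come from the decision above.

```python
from dataclasses import dataclass

SMALL_FILE_THRESHOLD = 256 * 1024  # assumed stand-in for the engine-derived boundary


@dataclass(frozen=True)
class PackEntry:
    pack_offset: int
    length: int


def pack_small_files(files: dict[str, bytes]) -> tuple[bytes, dict[str, PackEntry]]:
    """Concatenate sub-threshold files; the manifest is the only index."""
    pack = bytearray()
    manifest: dict[str, PackEntry] = {}
    for name, content in files.items():
        if len(content) >= SMALL_FILE_THRESHOLD:
            continue  # large files skip the pack and go straight to the chunker
        manifest[name] = PackEntry(pack_offset=len(pack), length=len(content))
        pack += content  # no in-pack frame header is written
    return bytes(pack), manifest


def extract(pack: bytes, entry: PackEntry) -> bytes:
    """Recover one file by slicing the pack at its recorded offset."""
    return pack[entry.pack_offset : entry.pack_offset + entry.length]


files = {"a.txt": b"alpha", "b.txt": b"", "c.txt": b"gamma!"}
pack, manifest = pack_small_files(files)
```

The pack here is exactly `b"alphagamma!"`: nothing but file bytes, so the downstream chunker sees one long stream regardless of how many tiny files went in.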

Error-correction position

Decision: ECC runs last in the pipeline, on encrypted-and-packed bytes.

Why: ECC protects the on-storage form (the bytes the disk or remote backend actually holds). Encryption needs to be inside ECC so confidentiality survives ECC parity-byte degradation.

Status: 🔧 Pipeline slot exercised end-to-end in v3 Experimental. Algorithm choice and parameter tuning still open here for refinement.

Algorithm versioning

Decision: every wrapper carries an explicit algorithm version field in the manifest, with a forward-compat clause for adding new algorithms without breaking older vaults.

Why: lets future swaps (compression library, cipher, ECC scheme) ship without a format-version bump, as long as the wrapper contract is preserved.

Status: ✅ Baked into the v3 manifest format. Already exercised by v1, v2, v3 auto-detection in aeroftp-cli vault.
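
One way to picture the versioning clause, as a sketch only: the manifest field names (`wrapper`, `algorithm`, `version`) and the `DECODERS` registry are assumptions made for the example, not the v3 manifest layout.

```python
import zlib

# Hypothetical registry: (wrapper, algorithm, version) -> decoder function.
DECODERS = {
    ("compression", "zlib", 1): zlib.decompress,
    ("compression", "none", 1): lambda data: data,
}


class UnsupportedAlgorithm(Exception):
    """Raised when a manifest names an algorithm this reader does not know."""


def decode_stage(manifest_entry: dict, data: bytes) -> bytes:
    """Dispatch on the explicit algorithm/version recorded in the manifest."""
    key = (
        manifest_entry["wrapper"],
        manifest_entry["algorithm"],
        manifest_entry["version"],
    )
    try:
        decoder = DECODERS[key]
    except KeyError:
        raise UnsupportedAlgorithm(f"no decoder for {key}") from None
    return decoder(data)


entry = {"wrapper": "compression", "algorithm": "zlib", "version": 1}
plain = decode_stage(entry, zlib.compress(b"abc"))
```

Adding a new algorithm under this model means adding a registry row; older vaults keep resolving to the rows they were written with.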

Cryptography matrix

Layer	Algorithm	Standard	Why this choice
Content encryption	AES-256-GCM-SIV	RFC 8452	Nonce-misuse-resistant. Safe even when the chunked write path can't guarantee unique nonces
Filename encryption	AES-256-SIV	RFC 5297	Deterministic: same plaintext name maps to same ciphertext, preserves lookup without leaking name structure
Master key wrapping	AES-256-KW	RFC 3394	Standard key-wrap for encrypting per-file DEKs under the master key
Key derivation	Argon2id	RFC 9106	Memory-hard KDF, parameters tuned above OWASP 2024 baseline
Optional cascade	ChaCha20-Poly1305	RFC 8439	Defense-in-depth second cipher, enabled in v2 paranoid mode, inheritable into v3
Header integrity	HMAC-SHA512	RFC 4231	Strong MAC on the vault header so any tampering with metadata is detected before a key is used
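
To make the header-integrity row concrete, a small sketch using Python's standard `hmac` module. The layout (header body followed by a 64-byte tag) and the function names are assumptions for illustration; the actual header layout is defined in the spec.

```python
import hashlib
import hmac

MAC_LEN = 64  # HMAC-SHA512 output size in bytes


def seal_header(mac_key: bytes, header_body: bytes) -> bytes:
    """Append an HMAC-SHA512 tag to the header body (assumed layout)."""
    tag = hmac.new(mac_key, header_body, hashlib.sha512).digest()
    return header_body + tag


def verify_header(mac_key: bytes, header: bytes) -> bytes:
    """Return the header body, or raise if the tag does not match."""
    body, tag = header[:-MAC_LEN], header[-MAC_LEN:]
    expected = hmac.new(mac_key, body, hashlib.sha512).digest()
    if not hmac.compare_digest(tag, expected):  # constant-time comparison
        raise ValueError("header MAC mismatch")
    return body


mac_key = b"\x01" * 32
header = seal_header(mac_key, b"version=3")
body = verify_header(mac_key, header)
```

A reader that calls `verify_header` first never acts on a version byte or offset that was altered after sealing.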

Position against existing tools:

Tool	Content cipher	Filename cipher	KDF
AeroVault v3	AES-256-GCM-SIV (nonce-misuse-resistant)	AES-256-SIV	Argon2id above OWASP 2024
Cryptomator	AES-256-GCM (nonce-misuse-sensitive)	AES-SIV	scrypt
rclone-crypt	XSalsa20-Poly1305	EME-AES	scrypt

Where it lives in the codebase

Artifact	Path
Formal specification	`docs/AEROVAULT-V3-SPEC.md`
Engine	`src-tauri/src/aerovault/`
Telemetry / receipt	`src-tauri/src/vault_telemetry.rs` (`VaultReport`)
In-app receipt UI	inside the AeroVault dialog, exportable as text
CLI subcommand	`aeroftp-cli vault {create,add,info,extract}` with auto-detection across v1/v2/v3
Activity Log integration	each vault operation logged with the same redaction pipeline as the Debug Panel

What this thread is for going forward

Open design questions and trade-offs on any wrapper in the stack
Algorithm or parameter choices that need community visibility before they land
Interop questions (rclone, Cryptomator, Kopia, others)
ECC algorithm follow-ups now that the v3 pipeline slot is exercised end-to-end
Future wrapper additions or refinements (compression algorithm swaps, new cipher cascades, metadata storage variants, etc.)

If a question is small enough to live in the v4+ Wishlist, it belongs there (wishlist Discussion). If it is wishlist-scoped but specifically about AeroVault, drop a link from there back to a post in this thread for the design context. If it is roadmap-scoped (multi-day, architectural), it belongs in the Roadmap discussion; a link from there to a post here is fine when the AeroVault design framing is the focus.

References

Roadmap discussion (the conversation that produced this checkpoint, including all of @EhudKirsh's substantive design comments): link
AeroVault v3 specification: docs/AEROVAULT-V3-SPEC.md
v3.8.0 release notes (where the wrapper-stack first shipped): CHANGELOG entry
v4+ Wishlist: link
Community Roadmap discussion: link

EhudKirsh · 2026-05-28T02:33:26Z · Collaborator

I have plenty more to add on this topic, but I want to make sure we agree on the organisation first, so I know the right place to post each thing. Perhaps you forgot, but I thought we agreed in #271 (comment) to keep #272 as a roadmap exclusively for the wrappers/overlays, and to move anything exclusive to AeroVault to its dedicated repo. So I didn't expect this as a separate post; I thought you might post a comment like this in #272. If you agree, please rename #272 to "[ROADMAP] Wrappers/Overlays" or a similar name when you're ready, and also create a "[ROADMAP] Mobile App 📱". The latter is of course less of a priority and I don't plan on posting anything there for a while, but again, I have big ideas for it as well. I can think of at least one more idea for another [ROADMAP], but I want to hear definitively that you agree to this structure before opening it.

2 replies

axpnet May 28, 2026
Maintainer Author

I was actually trying to better organize the threads using the Discussion section to avoid creating too much confusion in ISSUES and to separate discussions from bugs/reports. It's not a firm decision, it's just a test. Moving the discussions to two different repos might actually disrupt the discussion a bit.

We need to make a final decision whether to discuss them in the AeroVault repo (which exists only for the crate and has intentionally remained at v2 while awaiting the direct upgrade to v4) or to concentrate everything here (since it's the cradle of AeroVault, where it truly lives and operates).

Perhaps the idea of two repos is more technically correct, but in this specific case, it's better to stay here. What do you think? Sorry for the change of heart, but I'm considering it; it's not a firm or fixed idea.

EhudKirsh May 28, 2026
Collaborator

No need to be sorry, you're very polite. I just didn't expect this post because I thought we already agreed on a structure.

I agree that it's best to leave the Issues page just for reporting bugs and security vulnerabilities, and everything else should go in the Discussions tab. I wrote about this in #270 (comment) just now. However, you can enable the Discussions tab in the AeroVault repo. It's also possible to start an issue there and convert it to a discussion later.

I prefer to use that repo because it fits better and declutters this repo. Otherwise the AeroVault repo will remain mostly empty in terms of issues, if its issues mostly go here. I see AeroVault as having the potential to be a standalone, strictly offline app that's independent of AeroFTP. So I think that part of its independence is using its own Issues and Discussions tabs.

I see AeroVault as a kind of dependency of AeroFTP. And who knows? Perhaps AeroVault will be a dependency of other apps in the future. It just feels a little out of place that, if AeroVault has its own dedicated repo, its design would be written exclusively in another repo, even the aspects that have nothing to do with the wrappers/overlays of AeroSync.

axpnet · 2026-06-03T11:26:03Z · Maintainer, Author

Quick security update for this thread.

The current AeroVault format version is now v3, shipped in the standalone crate aerovault 0.4.0 on crates.io, and it is fully backward compatible: it reads existing v2 vaults unchanged. There is no in-place conversion: new vaults are written as v3, and old ones keep working.

Why v3 landed now: the deep audit surfaced a chunk-splicing gap (CRYPTO-01). The old per-chunk authentication bound only the chunk index, which resets per file under one master key, so someone who could already edit a vault could swap one file's chunk for another's at the same index and have it still extract as authentic. v3 binds a per-file random id plus the chunk count and index into the AEAD, on both the inner layer and the cascade, at encrypt and at decrypt. The file id lives in the AES-SIV-encrypted (authenticated) manifest and the version byte is under the HMAC-SHA512 header MAC, so neither can be stripped to force the old path.
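
The splice can be shown with a toy model. This is a sketch, not the crate's code: HMAC-SHA256 stands in for the AEAD tag, and the field order and widths in `v3_aad` (16-byte file id, then two big-endian 64-bit integers) are assumptions chosen for the example.

```python
import hashlib
import hmac
import struct


def index_only_aad(chunk_index: int) -> bytes:
    """Old shape as described above: only the chunk index is bound."""
    return struct.pack(">Q", chunk_index)


def v3_aad(file_id: bytes, chunk_count: int, chunk_index: int) -> bytes:
    """v3 shape: per-file random id + chunk count + chunk index."""
    return file_id + struct.pack(">QQ", chunk_count, chunk_index)


def tag(key: bytes, aad: bytes, ciphertext: bytes) -> bytes:
    """Stand-in for the AEAD tag; the AAD is length-prefixed for unambiguous framing."""
    framed = struct.pack(">I", len(aad)) + aad + ciphertext
    return hmac.new(key, framed, hashlib.sha256).digest()


def verify(key: bytes, aad: bytes, ciphertext: bytes, stored_tag: bytes) -> bool:
    return hmac.compare_digest(tag(key, aad, ciphertext), stored_tag)


key = bytes(range(32))
file_a, file_b = b"A" * 16, b"B" * 16  # fixed ids so the example is deterministic
chunk_b0 = b"ciphertext of file B, chunk 0"

# The writer seals chunk 0 of file B under each scheme.
old_tag = tag(key, index_only_aad(0), chunk_b0)
new_tag = tag(key, v3_aad(file_b, 3, 0), chunk_b0)

# An editor moves B's chunk 0 (with its tag) into file A's slot 0.
splice_accepted_old = verify(key, index_only_aad(0), chunk_b0, old_tag)
splice_accepted_v3 = verify(key, v3_aad(file_a, 3, 0), chunk_b0, new_tag)

# Binding the chunk count also rejects a file that claims fewer chunks.
truncation_accepted_v3 = verify(key, v3_aad(file_b, 2, 0), chunk_b0, new_tag)
```

In the index-only model the reader reconstructs the same authenticated input for slot 0 of any file, so the moved chunk verifies; once the file id and count are part of that input, the reader's value for file A no longer matches what was sealed for file B.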

This was a security-driven step; the larger wrapper-stack work continues on its own track.

Verified before shipping: the crate test suite, real round-trips (v3 and v2 both byte-identical, tampered vaults rejected), and a cross-tool round-trip uploading a vault from AeroFTP and pulling it back with rclone, no regressions.

On where this lives: I hear your point about keeping AeroVault in its own repo. The 0.4.0 changelog is committed in axpdev-lab/aerovault and the crate is published on crates.io; this note is just a heads-up here since the audit ran alongside the AeroFTP work.

axpnet · 2026-06-08T08:21:03Z · Maintainer, Author

@EhudKirsh bringing the Error Correction design here into the AeroVault design thread, since this is the home you and I set for AeroVault design follow-ups. The track itself was opened in the roadmap (#272, section 4). This post reports what is actually built, makes good on the naming point you raised, lays out the placement decision with the full trade matrix and the reasoning behind every call, and puts the open questions in front of you for the pre-tag review pass we agreed on. Nothing here is tagged or in a release. This post is the review surface, on purpose, and everything in it is open to revision, not presented as closed.

Status: development is underway on a working branch (not pushed, not tagged). What is built and green on local gates right now: the embedded Reed-Solomon layer over the cipher blocks (create / scrub / repair, CLI and GUI), and, new since I started this write-up, the metadata-locator protection in section 4 (a corrupted manifest is now rebuilt from its own parity instead of being fatal, with a regression test). The optional detached placement (section 3) is the next slice and is presented here as design for your review before it lands. I am sharing this now to inform and to get your read; the testable release still waits for your review pass, per section 6.

1. Naming: the acronym is gone

Your point from the 2026-05-29 comment: stop using "ECC" for Error Correction, because in a thread full of cryptography it collides with Elliptic Curve Cryptography, and the second C was never well justified. You are right and I agreed. The whole surface is moving to "Error Correction" spelled out: UI strings, CLI help, the docs, and the code symbols and on-disk identifiers too, so the acronym does not survive anywhere a user or a reviewer would read it. If you still spot "ECC" after this lands, it is a miss; flag it.

2. The format decision: v4 is v3 plus Error Correction, not a new format

You framed this exactly right in May: "V4 will simply be V3 plus these added Error Correction files, which a V3 version of AeroFTP can simply ignore." That is the contract, and it held through implementation.

There is no separate v4 format and no separate v4 file. v3 already reserved an extension directory and an extension payload region in its 1024-byte header, with the rule that a reader rejects only critical = true extensions and silently skips non-critical ones. Error Correction ships as a non-critical extension in that slot.

Why this and not a separate _v4 format/file. We considered the separate-format route early and rejected it deliberately:

v3 already carries every hook the wrapper needs (the reserved extension slot, and the per-block cipher_hash that was added in v3 specifically as the future Error Correction detection hook),
a second format would have duplicated the entire reader/writer and thrown away the forward-compat property you asked for,
the algorithm-versioning clause we agreed on already says new wrappers enter as versioned extensions without a format bump. A separate v4 format would have contradicted our own rule.

The result: a pure v3 reader opens and extracts a v4 vault unchanged (it walks past the recovery extension), and "v4" is the feature label for "a v3 vault with the Error Correction wrapper enabled", not a version bump.
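
The critical/non-critical rule can be sketched as follows. `ExtensionEntry`, its field names and the numeric ids are hypothetical; only the rule itself (reject unknown critical extensions, silently skip unknown non-critical ones) is taken from the description above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExtensionEntry:
    ext_id: int      # hypothetical numeric identifier
    critical: bool
    payload: bytes


class UnsupportedCriticalExtension(Exception):
    pass


def select_extensions(entries, known_ids):
    """Return the extensions this reader understands, enforcing the critical flag."""
    understood = []
    for entry in entries:
        if entry.ext_id in known_ids:
            understood.append(entry)
        elif entry.critical:
            raise UnsupportedCriticalExtension(f"extension {entry.ext_id:#x}")
        # unknown and non-critical: skipped silently
    return understood


recovery = ExtensionEntry(ext_id=0x10, critical=False, payload=b"parity")
seen_by_pure_v3_reader = select_extensions([recovery], known_ids=set())
seen_by_ec_aware_reader = select_extensions([recovery], known_ids={0x10})
```

A reader with no knowledge of the recovery extension ends up with an empty list and carries on, which is the "walks past it" behaviour the contract relies on.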

3. The decision I want your eyes on: where the recovery data lives

This was a stated preference of mine, not a closed agreement between us, so it is the heart of this review. You pictured Error Correction as files "put to the side" (the par2 and Kopia separate-file model), and later as parity "appended to the end of its chunk". I shipped the first cut as an in-container block (the recovery data sealed inside the single .aerovault).

Rather than argue one model, here is the full comparison across the dimensions that actually decide it, including the append-per-chunk variant you raised, so the reasoning is on the table and not just the conclusion.

3.1 System comparison

Dimension	Embedded (in-container)	Detached (separate file)	Both	Append-per-chunk
Single sealed file = atomic backup	yes, one sealed file	no, two files	yes, the data file stays atomic	yes, one file
"Move one file, silently lose recovery" footgun	impossible	real risk	impossible (data self-protects)	impossible
Survives correlated damage to a data region AND its parity in one failure event	no (same device)	yes (parity on a 2nd medium)	yes (the other copy repairs)	no (chunk and its appended parity die together)
Survives a localized bad region within the parity budget	yes (data-front / parity-back separates them on HDD and optical; weaker on flash)	yes (independent file)	yes (strongest)	partial
Change redundancy ratio or add Error Correction later without rewriting the encrypted blob	no (rewrites the container)	yes (regenerate the detached file only)	yes (via the detached copy)	no (rewrites)
Overhead efficiency on mixed and small chunks	good (~20% fixed-grid, independent of chunk sizes)	good (same payload)	good (same payload)	poor (high relative overhead on small chunks)
Deduplication preserved	yes	yes	yes	complicates the manifest offsets
Forward-compat with v3-only readers	yes (skips the non-critical extension)	yes (never sees the file)	yes	entangles Error Correction with the chunk layout
Manifest locator recoverable (section 4)	yes (embedded metadata parity, implemented)	yes (detached mirror)	yes (best)	no
Audit story	one artifact, one codec	two artifacts	two artifacts, still one codec	one artifact
Operational simplicity (move, upload, sync)	one object	must keep two together	two objects	one object
Delete parity by hand to reclaim space (any file manager)	no (sealed in the container)	yes (select and delete the parity files anytime)	partial (the detached copy is deletable, but the embedded parity still bloats the file)	no
Protected file stays usable without the tool (viewable / editable / true byte size in any file manager)	no (parity inflates the file; only AeroFile knows the embedded percentage)	yes (the file is byte-identical, the detached file size is shown separately)	partial (the detached copy is clean, but the embedded parity still alters the file)	no

3.2 Failure-mode coverage, stated precisely

One thing I want to be exact about, because it is easy to oversell Error Correction and you would catch it: a detached parity copy does not make Error Correction a substitute for a second full backup. Parity at, say, 20 percent cannot reconstruct an entire lost medium; if the whole device holding the data dies, the data is gone and a fraction of parity cannot rebuild it. That is what a second copy is for. What the detached copy actually buys is decorrelation: it puts the recovery data on a different medium so that a single localized failure event cannot take out a block and its only recovery copy at the same time. With that framing honest:

Failure scenario	Embedded	Detached on a 2nd medium	Both
Bit flip or single bad sector in the data region	recovered	recovered	recovered
Localized bad region across several data blocks, within the parity budget	recovered	recovered	recovered
The parity region itself rots	not recoverable (no second copy of parity)	parity is the detached file, so a 2nd copy survives	recovered (the other copy repairs)
Manifest (the cipher_hash map) corrupted	recovered (embedded metadata parity, implemented, section 4)	recovered (detached mirror)	recovered
Header corrupted (the locator of locators)	not recoverable yet (header limit, section 4)	planned (detached file carries header parity)	planned
One localized failure event hits both a data block and the parity protecting it	not recoverable (same device)	recoverable (parity decorrelated)	recoverable
Total loss of the data medium	not recoverable (this is a backup problem, not an Error Correction one)	not recoverable	not recoverable

3.3 The recommendation, with the reasoning

I do not want to pick one model, because the matrix shows they cover different cells and the union covers all of them with one codec. The proposal:

one Reed-Solomon recovery payload, computed once,
three placements: embedded (default, sealed in the container), detached (separate file, for cross-medium decorrelation and for cheap re-parity without rewriting the encrypted blob), and both (if the embedded region is the casualty the detached copy repairs, and the reverse),
scrub and repair resolve the recovery source transparently: an explicit path first, then a detached file auto-detected next to the vault, then the embedded extension, and they always report which source was used.
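
The resolution order in the last point could look like this. It is a sketch: `DETACHED_SUFFIX` is a made-up placeholder (the detached file's name is still an open question in section 7), and the `exists` parameter is only there so the logic can be exercised without touching the filesystem.

```python
from pathlib import Path
from typing import Callable, Optional

DETACHED_SUFFIX = ".recovery"  # hypothetical: the real detached name is not settled


def resolve_recovery_source(
    vault: Path,
    explicit: Optional[Path] = None,
    has_embedded: bool = False,
    exists: Callable[[Path], bool] = Path.is_file,
) -> tuple[str, Optional[Path]]:
    """Pick the recovery source and report which one was chosen."""
    if explicit is not None:
        return ("explicit", explicit)          # 1. an explicit path always wins
    sibling = vault.with_name(vault.name + DETACHED_SUFFIX)
    if exists(sibling):
        return ("detached", sibling)           # 2. detached file next to the vault
    if has_embedded:
        return ("embedded", vault)             # 3. the embedded extension
    return ("none", None)


vault = Path("backups/photos.aerovault")
source = resolve_recovery_source(vault, has_embedded=True, exists=lambda p: False)
```

Returning the source kind alongside the path is what lets scrub and repair report which copy was actually used.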

Why embedded is the default and not detached: it preserves the single sealed file as an atomic backup (an AeroVault property I want to keep), it removes the footgun of separating recovery from data by moving one file, it is one artifact to verify, and it reuses the v3 extension slot with no new mechanism. The default should be the safe, no-surprise choice for the median user.

Why detached exists at all and is not just dropped: your case is real and embedded provably cannot cover it. The parity sits on the same device as the data, so it cannot decorrelate a localized failure that hits both, and it cannot be regenerated or re-rated without rewriting the encrypted blob. detached is the only model that does both. On flash specifically (USB sticks, SSDs) wear-leveling scrambles physical placement, so the embedded data-front / parity-back separation does not guarantee physical separation, and only a detached copy on a second device truly decorrelates. That is exactly the medium people use for the cold backups Error Correction targets.

Why this is cheap to build and not a second engine: the recovery payload is already a self-contained, serializable blob, and the reconstruction path already takes those bytes as an input rather than reading them only from the container. So a detached copy is a placement choice, not a parallel implementation. One codec, one audit pass, your point about audit tractability preserved.

Why append-per-chunk is dropped: once you have embedded plus detached, append-per-chunk adds no coverage the matrix does not already show, and it costs the fixed-grid overhead efficiency (it is poor on small chunks), couples Error Correction to the chunk layout, and lets a localized failure kill a chunk together with its adjacent parity. It is dominated by the union.

4. The metadata locator: now protected for the manifest (implemented)

Scrub locates damaged blocks by the per-block cipher_hash, which lives inside the encrypted manifest. So there is a chicken-and-egg: if the casualty is the manifest itself, scrub has no map to repair from, and the first cut protected the data blocks but not the locator. I started development and closed this for the manifest rather than leave it open:

on seal, when Error Correction is enabled, the engine now also computes a fixed-rate Reed-Solomon parity over the encrypted manifest and stores it as a second non-critical extension dedicated to the manifest, rebuilt on every seal,
that extension is located through the MAC-verified header, whose offsets survive a hit to the manifest region, so it can be found before any cipher_hash is read,
on open, a manifest that fails to decrypt is rebuilt from this parity and retried; a successful AES-256-GCM-SIV authentication on the rebuilt bytes is the correctness proof, so a wrong reconstruction can never be accepted,
repair persists the healed manifest even when no data block is damaged,
it is a non-critical extension, so a pure v3 reader still skips it and the forward-compat contract holds.

A regression test covers it (corrupt the manifest on disk, the vault still opens, extracts byte-identical, and repair heals the on-disk copy), with a negative control proving a non-Error-Correction vault with the same corruption is fatal, so the rebuild is demonstrably what saves it.
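
The open-time flow above reduces to "try, rebuild, re-authenticate". A sketch with stand-ins: `toy_decrypt` uses an HMAC check in place of the real AEAD, and the rebuild step is passed in as a callable because the Reed-Solomon reconstruction itself is out of scope here. All names are illustrative.

```python
import hashlib
import hmac
from typing import Callable, Optional

KEY = b"demo-key"  # illustration only


class AuthenticationError(Exception):
    pass


def toy_seal(plaintext: bytes) -> bytes:
    return plaintext + hmac.new(KEY, plaintext, hashlib.sha256).digest()


def toy_decrypt(blob: bytes) -> bytes:
    """Stand-in for authenticated decryption: verify the tag, return the body."""
    body, tag = blob[:-32], blob[-32:]
    if not hmac.compare_digest(tag, hmac.new(KEY, body, hashlib.sha256).digest()):
        raise AuthenticationError("manifest failed authentication")
    return body


def open_manifest(
    stored: bytes,
    decrypt: Callable[[bytes], bytes],
    rebuild_from_parity: Optional[Callable[[bytes], bytes]] = None,
) -> tuple[bytes, bool]:
    """Return (manifest plaintext, was_rebuilt)."""
    try:
        return decrypt(stored), False
    except AuthenticationError:
        if rebuild_from_parity is None:
            raise  # no Error Correction on this vault: corruption stays fatal
    candidate = rebuild_from_parity(stored)
    # The rebuilt bytes must authenticate; a wrong reconstruction raises here.
    return decrypt(candidate), True


good = toy_seal(b"manifest-v3")
corrupted = b"X" + good[1:]
result = open_manifest(corrupted, toy_decrypt, rebuild_from_parity=lambda _: good)
```

The second `decrypt` call is the important line: the rebuild is never trusted on its own, it only gets a chance to pass the same authentication as an undamaged manifest.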

The honest remaining limit: this protects the manifest, not the 1024-byte header itself. The header is small, fixed-layout and HMAC-protected, so a header hit currently fails closed at open. Covering the header too is the natural job of the detached copy (the case where the in-file copy is the casualty), and it is on the list below for your review rather than shipped silently.

5. The scheme and its parameters, with the reasoning for each

What is built on the embedded side today, and why each parameter is what it is:

Reed-Solomon over the cipher blocks, shard = cipher block. Chosen because it aligns 1:1 with the cipher_hash-addressable block that v3 already stores, so a damaged block is exactly one erasure unit. No second addressing scheme, which keeps the "one codec, one review pass" story you cared about.
Default 10 data + 2 parity (~20% overhead). Tolerates two lost shards per stripe of twelve. 20 percent is a deliberate balance for the cold-storage and USB or NAS media Error Correction targets; it is not meant to be a single fixed number, which is why it ships as profiles (below).
Fixed-grid payload layout. The earlier prototype carried roughly 200 percent overhead; the fixed grid brings it to roughly 20 percent and makes the overhead independent of chunk sizes, so a vault full of tiny files does not pay a different rate from one large file.
Per-shard BLAKE3 checksum (16 bytes). This localizes which shards are damaged, so Reed-Solomon treats them as known erasures rather than unknown errors. Erasure decoding is both cheaper and twice as powerful for the same parity: with K parity shards per stripe it can recover up to K erasures, versus K/2 unlocated errors. The checksum runs over ciphertext, so it works before any key is in play.
All-or-nothing repair. Reed-Solomon reconstruction is only correct when the surviving data shards and the parity shards used were themselves intact. Parity lives in an area scrub does not cover, so a rotted parity shard, or more erasures than parity in a stripe, can silently yield wrong bytes. So repair re-verifies every reconstructed block against its authenticated cipher_hash before writing anything, and persists only if all damaged blocks verify; otherwise it leaves the vault byte-for-byte untouched. The reason is concrete: persisting a wrong reconstruction would recompute parity over garbage on the next seal and destroy the very redundancy needed to recover. AES-256-GCM-SIV stays the only authority on tampering; recovery is for getting bytes back, not for trust.
Redundancy as profiles, not a raw percentage. Cloud, USB or NAS, Cold archive, Custom. Users reason about the medium they are protecting, not about a K/N ratio, and cloud durability is already redundant so its profile is light or off by default.
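
The interplay between per-shard checksums and all-or-nothing repair can be sketched with a much weaker code. Assumptions, stated loudly: a single XOR parity shard replaces Reed-Solomon (so at most one erasure per stripe), BLAKE2b replaces BLAKE3 (which is not in the Python standard library), and SHA-256 stands in for the authenticated `cipher_hash`. Function names are illustrative.

```python
import hashlib
from typing import Optional

CHECKSUM_LEN = 16  # matches the 16-byte per-shard checksum width above


def checksum(shard: bytes) -> bytes:
    # BLAKE2b is a stand-in for the per-shard BLAKE3 checksum.
    return hashlib.blake2b(shard, digest_size=CHECKSUM_LEN).digest()


def xor_parity(shards: list[bytes]) -> bytes:
    """XOR equal-length shards together (toy replacement for Reed-Solomon)."""
    out = bytearray(len(shards[0]))
    for shard in shards:
        for i, b in enumerate(shard):
            out[i] ^= b
    return bytes(out)


def repair_stripe(
    shards: list[bytes],
    parity: bytes,
    checksums: list[bytes],
    cipher_hashes: list[bytes],
) -> Optional[list[bytes]]:
    """Return the repaired stripe, or None if it must be left untouched."""
    damaged = [i for i, s in enumerate(shards) if checksum(s) != checksums[i]]
    if not damaged:
        return list(shards)
    if len(damaged) > 1:
        return None  # more erasures than this toy code's single parity shard
    idx = damaged[0]
    survivors = [s for i, s in enumerate(shards) if i != idx]
    rebuilt = xor_parity(survivors + [parity])
    # All-or-nothing: persist only if the block matches its authenticated hash.
    if hashlib.sha256(rebuilt).digest() != cipher_hashes[idx]:
        return None
    repaired = list(shards)
    repaired[idx] = rebuilt
    return repaired


data = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_parity(data)
sums = [checksum(s) for s in data]
hashes = [hashlib.sha256(s).digest() for s in data]

healed = repair_stripe([b"AAAA", b"B?BB", b"CCCC"], parity, sums, hashes)
rotted_parity_result = repair_stripe([b"AAAA", b"B?BB", b"CCCC"], b"\x00" * 4, sums, hashes)
```

With rotted parity the reconstruction produces bytes that fail the hash check, so the function returns `None` and the caller writes nothing, which mirrors the "leave the vault byte-for-byte untouched" rule.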

6. The commitment, restated

Per what we agreed: this does not get tagged, v3 does not leave Beta, and there is no milestone tag, until you have had a pre-audit review pass on this design and the spec. No deadline pressure from my side; the point of the gate is that the review happens. Anything you flag is a spec bug by default until shown otherwise, same rule as the code.

7. Open to changes and integrations

Nothing above is closed except where we explicitly agreed (the wrapper stack, the V4 = V3 + Error Correction contract, the recovery-not-just-generation requirement). I would specifically value your read, and counter-proposals, on:

Placement: does the embedded default with optional detached and both match your detached case, or would you set the default differently, or shape the resolution order differently?
The metadata locator: the manifest is now protected by an embedded fixed-rate parity (section 4, implemented); is mirroring it into the detached file and extending the same scheme to the 1024-byte header the right way to close the remaining header limit, or do you have a leaner one?
The parameters: the 10+2 default, the profiles and their ratios, the per-shard checksum width.
The detached format and naming: I have a working name in mind but I would rather settle it, and the on-disk framing (including the vault-binding so a wrong detached file is refused early), with you.
Anything the matrix gets wrong: if a cell is mis-scored or a failure mode is missing, that is a spec bug by the same rule as the code.

As always this track carries your name: the wrapper-stack framing, the V4 = V3 + Error Correction contract, the par2 reference, and the recovery-not-just-generation requirement are yours, and the credit stays inline in the spec and in the vault receipt.

7 replies

axpnet Jun 8, 2026
Maintainer Author

@EhudKirsh this is exactly the review pass the gate was for, thank you. Going point by point, because every one of these lands.

1. "Sidecar" is a property, not the separate-file placement

"When referring to Error Correction as a sidecar, the idea is that no matter how you structure it, it doesn't transform the original files that it was applied to [...] I prefer to keep the term Sidecar as a property of certain stages in the processing of data (Overlays) [...] This is in contrast to Compression, Chunking and Encryption, which can't achieve their goals by being Sidecars. [...] So to the same extent, 'embedding' sounds more like a sidecar than 'separating'."

You are right and this is a real correction, not a quibble. I was using "sidecar" to mean "the separate file", and your framing is cleaner: Sidecar is the property of a non-transforming overlay - Error Correction reads the bytes and adds protection without rewriting them, where compression / chunking / encryption must transform to do their job. Under that property the parity can live either embedded or separate; both are "sidecar" in your sense, and embedding is arguably more sidecar-like (the attachment the bike pulls), not less.

So I'm splitting the vocabulary as you ask:

Sidecar becomes the overlay class: a non-transforming overlay whose output rides alongside the originals.
the placement names drop "sidecar" entirely. Proposed: embedded / detached / both (the parity bytes are sealed inside the container, written as a separate file, or kept in both spots). I lean detached over separate/external precisely because of your point - separate/external carry the "independent vehicle" connotation you just dismantled, whereas detached says "not inside the container" without saying "a thing of its own", and it pairs symmetrically with embedded. Tell me which word reads right to you and I'll lock the CLI `--recovery-placement` and the docs to it.

"I see Sidecar as a property of certain overlays, which actually simplifies them and makes them more flexible by letting the storage of the additional useful data to be either separately from the main files or embedded in them."

Agreed, and that's the better mental model - placement becomes "where the sidecar's bytes go", not a different feature.

You also hint at a second non-transforming overlay we haven't discussed:

"another notable stage which we've yet to discuss, but it would be too much to discuss it now."

Noted and parked - when you're ready, open it and I'll bring the same treatment.

2. Table 3.1 missing the cost of `both`

"Please add a row in table 3.1 or combine it with the last row, which writes that the Both approach doubles the byte size of Error Correction compared to the 2 cols/approaches before it. [...] this is the main downside of having both."

Correct, and it's an honest omission - the matrix sold `both` as strictly best without pricing it. `both` stores the recovery payload twice, so it doubles the parity overhead (not the vault, but the Error-Correction bytes: at the 10+2 default, ~20% becomes ~40%). Adding a row:

Dimension	Embedded	Detached	Both	Append-per-chunk
Parity storage cost	1x	1x	2x (parity stored twice)	1x

with a note that beyond processing time and complexity, the doubled parity footprint is the real price of `both`.
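
The arithmetic behind the new row, as a small sketch (the function name is illustrative; overhead is measured relative to the protected data bytes, which is how 10 data + 2 parity comes out at roughly 20%):

```python
def parity_overhead(data_shards: int, parity_shards: int, copies: int = 1) -> float:
    """Parity bytes as a fraction of the protected data bytes."""
    return copies * parity_shards / data_shards


embedded_or_detached = parity_overhead(10, 2)       # one copy of the payload
both_placements = parity_overhead(10, 2, copies=2)  # payload stored twice
```

For a 1 GiB vault at the default this is about 205 MiB of parity for a single placement and about 410 MiB when both are kept.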

3. Support all of them

"since there are significant pros and cons, all of these options should be supported. This is what I'd expect from the best aspiring research platform/app on file management."

That's the plan - embedded (default), detached, both all ship; the matrix shows each owns cells the others don't, so dropping any of them loses coverage. The default stays the safe no-surprise one; the others are a flag away.

4. Different algorithm per placement

"I also wonder whether it might be better to use one algorithm, like Reed-Solomon, as either separate or embedded, and another algorithm in the other place. This way covers as much ground as possible with various Error Correction approaches to maximise the protection of data from corruption."

Genuinely interesting, and it would buy algorithmic decorrelation - a bug or blind spot in one codec wouldn't take out both copies in a `both` setup. The tension is with the "one codec, one audit pass" property I leaned on, which keeps the security review tractable.

My read: the architecture already makes this cheap, because placement is just "where the bytes go" and the codec is a separate choice - so a second algorithm is possible without a parallel engine. But I'd keep single-codec as the default (one audit surface for the median user) and treat a heterogeneous `both` (e.g. Reed-Solomon embedded + a different code detached) as an explicit opt-in once the single-codec path is audited and shipped. That way we don't pay the second-codec audit cost up front, but the door you're pointing at stays open. Does default-single / opt-in-heterogeneous match what you had in mind, or were you picturing it on by default?

5. Re-hashing cost

won't appending Error Correction require hashing the whole file or chunk again? [...] it's extra complexity and processing, so it's something to think about. Maybe it's possible to easily only hash parts of files.

Good instinct, and the design already does the "hash parts, not the whole" thing you're reaching for. Parity is computed over the cipher blocks, and each shard already carries its own per-shard BLAKE3 computed once at seal. So:

embedded and detached over a vault: no whole-file re-hash - the recovery payload is a self-contained blob and the per-shard checksums are the only hashing, done once.
generating Error Correction over an arbitrary external file (your point 8 below): one pass to shard-and-hash it, but still per-shard, never "the whole file again per chunk".

Append-per-chunk was exactly where this got ugly (it re-touches layout and offsets), which is one more reason it's dropped in favour of embedded + detached.
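
A rough sketch of the "hash parts, not the whole" idea, under stated assumptions: BLAKE2b from the Python standard library stands in for BLAKE3 (which is not in the stdlib), the 4 KiB shard length is arbitrary, and the function names are hypothetical.

```python
import hashlib

SHARD_LEN = 4096  # illustrative shard length, not the engine's value


def shard_digests(blob: bytes, shard_len: int = SHARD_LEN) -> list[bytes]:
    """One digest per shard, computed once when the blob is sealed."""
    # BLAKE2b is a stdlib stand-in for the per-shard BLAKE3 in the design.
    return [
        hashlib.blake2b(blob[i:i + shard_len], digest_size=32).digest()
        for i in range(0, len(blob), shard_len)
    ]


def damaged_shards(blob: bytes, sealed: list[bytes], shard_len: int = SHARD_LEN) -> list[int]:
    """Indices of shards whose current digest differs from the sealed one."""
    current = shard_digests(blob, shard_len)
    return [i for i, (now, then) in enumerate(zip(current, sealed)) if now != then]


blob = bytes(range(256)) * 64          # 16 KiB -> 4 shards
sealed = shard_digests(blob)
rotted = bytearray(blob)
rotted[5000] ^= 0xFF                   # one damaged byte, inside shard 1
print(damaged_shards(bytes(rotted), sealed))
```

A scrub compares shard by shard, so damage is localised to the affected shard index and no whole-file hash is ever needed.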

6. Sidecar format - look at Kopia / par2 first

my only feedback for now is to look at what Kopia and possibly other apps are doing. I think it's best to start with being able to support existing standards, and then improve upon these by making your own. This is what you've done with Cryptomator Vault and AeroVault, and are about to do with AeroCrypt over Rclone's Crypt.

Agreed, and that's the right order - embrace the standard, then improve. par2 is the established parity-sidecar format and Kopia's Reed-Solomon shard layout is the closest living reference, so I'll study both and align the detached-file framing to what's already proven before inventing AeroFTP-specific framing on top (vault-binding so a wrong file is refused early, etc.). Same playbook as Cryptomator -> AeroVault and Crypt -> AeroCrypt.

You can also do it with Syncthing and P2P transfer in #284, and AeroRsync [...] I'd like to expand more on this paragraph, but I will probably do so elsewhere

Looking forward to that - open it where it fits and I'll engage there.

7. CLI flags - settled

--ec and --error-correction seem like reasonable flags, so let's keep both. I'm not a fan of a --ecc flag for Error Correction, as you might have imagined.

Settled: --error-correction as the canonical flag, --ec as the short alias, and --ecc is gone (it dies with the acronym). That closes the open CLI naming question from the follow-up post.

8. Error Correction over any file, idempotency, and `both` ordering

I like the idea of picking just about any file and generating Error Correction for it.

This generalises Error Correction from "a vault wrapper" to "a standalone parity tool over any file" - which is exactly the par2 mental model and a natural fit. I'll scope it as its own surface (a parity-style command over an arbitrary path) so it doesn't get tangled into the vault-only path, but the codec is shared.

It's important to make sure that AeroFTP recognises its own embedded Error Correction so as to not generate Error Correction for another existing Error Correction needlessly.

Agreed - idempotency guard. The engine will detect its own parity (extension marker for embedded, format magic for detached) and refuse to wrap parity-over-parity unless explicitly forced.
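
Roughly what that guard could look like for the format-magic case. This is a sketch under assumptions: the magic value `AEROCORR` is the one quoted later in this thread for sync sidecars, its position at offset 0 is assumed, and `wrap_with_parity` / `make_parity` are hypothetical names.

```python
PARITY_MAGIC = b"AEROCORR"  # magic quoted later in the thread; offset 0 is an assumption


def is_parity_payload(blob: bytes) -> bool:
    """True when the input already looks like one of our own parity payloads."""
    return blob.startswith(PARITY_MAGIC)


def wrap_with_parity(blob: bytes, make_parity, force: bool = False) -> bytes:
    """Refuse parity-over-parity unless the caller explicitly forces it."""
    if is_parity_payload(blob) and not force:
        raise ValueError("input is already a parity payload; pass force=True to wrap it anyway")
    return PARITY_MAGIC + make_parity(blob)
```

The guard is a cheap prefix check before any codec work, so a repeated run over an already protected tree fails fast instead of stacking parity on parity.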

For the Both option, it's probably best to generate the Error Correction to a separate file and then embed it, as opposed to the other way around. It's simpler than finding where the file/chunk content ends, and where its Error Correction begins.

Accepted as the implementation order: for both, compute the detached payload first, then embed a copy - never embed-then-extract. Cleaner, no boundary-hunting inside the sealed blob. Good call.

9. Percentage selection (Kopia / QR levels) and absolute byte targets

KopiaUI gives the option of selecting the % in byte size of the input files [...] with 10% being the maximum. [...] QR codes [...] also use percentages: Low ~7%, Medium ~15%, Quartile ~25% and High ~30%. If you can support selecting absolute targets in byte size, that's also welcome.

This is the answer to my "profiles vs raw percentage" question, and your framing is better than mine. I'll expose both: a percentage dial and an optional absolute byte target. The familiar named levels map straight onto the QR scale you cite - Low ~7%, Medium ~15%, Quartile ~25%, High ~30% - which most people already recognise from QR codes, so we get a shared vocabulary for free.

One honest note: AeroFTP's current default is 10+2 (~20% parity), i.e. higher than Kopia's 10% cap and sitting between your QR "Medium" and "Quartile". I think that's right for the cold-storage / USB / NAS media this targets (Kopia leans on already-redundant repositories), but the point is the dial is exposed and the user picks - not a single baked number.
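
A minimal sketch of how the three inputs (named level, raw percentage, absolute byte target) could resolve to one parity budget. The function name, the "exactly one input" rule and the 100% upper bound are assumptions for illustration; only the four level percentages come from the QR scale above.

```python
import math

# Named stops taken from the QR scale quoted above.
QR_LEVELS = {"low": 7.0, "medium": 15.0, "quartile": 25.0, "high": 30.0}


def parity_budget_bytes(file_size, level=None, percent=None, target_bytes=None) -> int:
    """Resolve exactly one of level / percent / target_bytes to a byte budget."""
    chosen = [x for x in (level, percent, target_bytes) if x is not None]
    if len(chosen) != 1:
        raise ValueError("choose exactly one of level, percent, target_bytes")
    if target_bytes is not None:
        if target_bytes < 0:
            raise ValueError("target_bytes must be non-negative")
        return target_bytes
    pct = QR_LEVELS[level.lower()] if level is not None else percent
    if not 0 < pct <= 100:  # the 100% ceiling is an assumption
        raise ValueError("percentage out of range")
    return math.ceil(file_size * pct / 100)


print(parity_budget_bytes(1_000_000, level="Medium"))   # named stop
print(parity_budget_bytes(1_000_000, percent=20))       # the 10+2 default
print(parity_budget_bytes(1_000_000, target_bytes=4096))
```

The named levels are just presets on the same dial, so there is one code path and one vocabulary.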

10. "Profile names", JSON, metadata locator

I'm not sure what you mean by "profile names". I don't have more --json feedback currently. I don't know much about the metadata locator and parameters.

That's on me - "profiles" was my term for medium presets (cloud / usb-nas / cold-archive). Given your point 9, I'm collapsing that into one vocabulary: the QR-style named percentage levels (Low/Medium/Quartile/High + custom), so there's no second naming scheme to learn. No "profile names" to settle anymore - they become the percentage levels.

On the metadata locator and parameters (sections 4 and 5): those are implementation-side (how a corrupted manifest is rebuilt from its own parity, shard sizing, all-or-nothing repair) and nothing there needs a decision from you - I flagged them for transparency, not for a ruling. No action needed on your side.

11. Mounting - the aspect I left out

Another important aspect which you left out here, perhaps I didn't highlight it before in #162, is mounting. [...] it would be nice to be able to mount and view any profile before and after each wrapper stage.

You're right that I left it out, and it's a real gap in the post. Mounting exists in the codebase (AeroFTP ships a FUSE layer via fuser), so "mount and view at each wrapper stage" is an extension of something already there rather than a new engine - I'll give it its own design pass against #162 rather than bolt it onto this thread.

if the sidecar is a separate file rather than appended to the end, there will be barely any difference between the view that's on a cloud drive/external hard drive [...] and the view of it before the sidecar is created, because it's just adding separately.

This is a strong argument for the detached placement that the matrix didn't capture: with a detached file the mounted / cloud view is unchanged except for one extra object, whereas embedding rewrites the container the user sees. I'll add this as a row/note in 3.1 (detached = storage view stays stable; embedded = container changes) - it's a genuine point in detached's favour for mount-and-browse workflows, and it pairs with your point 1 (embedding is the more invasive of the two, even though it's the "more sidecar-like" attachment).

Summary of what changes from this pass

Terminology: "Sidecar" = overlay property; placements rename to embedded / detached / both (word for the middle one is yours to confirm).
Table 3.1: add the both = 2x parity cost row, and a detached = stable-storage-view row.
Flags settled: --error-correction + --ec, no --ecc.
Redundancy UI: QR-style named % levels + absolute byte target; "profiles" folded into these.
both ordering: detached-first, then embed.
Idempotency guard + Error Correction over arbitrary files added to the design.
Heterogeneous codecs and mount-per-stage: accepted as directions, scoped as opt-in / their own pass.

Still nothing tagged, v3 stays in Beta, per the gate. Anything above I've mis-read is a spec bug by the same rule. And as always the framing here is yours - the sidecar-as-property correction is going straight into the spec with your name on it.

EhudKirsh Jun 9, 2026
Collaborator

1. Sidecar Property

detached seems fine, but I can think of (at least) two types:

Putting all the Error Correction files in one folder, like at the root directory, even if part of the Error Correction is in deep subfolders.
Each chunk and file has a sibling chunk or file, which is its Error Correction. In this context, it's important to consider when Chunking and Error Correction are both utilised, since we have an ideal range for chunk sizes (which is case-dependent, as we discussed on a number of factors like how much RAM the user allocates, etc). A downside of the 2nd option is that since Error Correction will realistically only be at most 20% of the byte size, to double Kopia's example, this can result in chunks that are too small, and therefore inefficient, or even rejected for AWS S3 if they're under 5MB. So should the 2nd approach combine multiple of these into chunks that are larger than 5MB? What if they don't have enough byte size in some folders?
Then again, when Chunking is utilised, you flatten all the folders, right? I'm just trying to imagine what this will look like.

At any rate, you might want each of these detached to have its own tag or label.

2. Table 3.1

"Parity storage cost" will be unclear to a layman reader unfamiliar with this post. I strongly suggest using the term Byte Size.

4. Default Picture

I had a picture in my mind of the default Error Correction structure from my chats with AI,
but I don't want to share it in case it misleads you. I don't mind your draft default structure of detached.
Again, I think it's best to look at what Kopia does to learn from it, and then you can take it from there to develop improvements.

6. Another Sidecar Overlay

As for the other sidecar overlay I hinted at,
it doesn't need to have any space reserved like you reserved space for Error Correction in AeroVault V3 for V4.

Let's just say that it's something that should ideally work entirely offline and independent of AeroFTP,
and therefore it deserves its own dedicated repo like https://github.com/axpdev-lab/aerovault.
I am confident that the right move now is to focus on the 4 overlays we discussed, and leave this one for later.

9. Various Target Levels

AeroVault and AeroSync should have a <input type='number'> and a <input type='range'> slider for the level of zst compression,
along with the 3 labelled buttons of Fast, Balanced and Archive. Similarly, AeroVault and AeroSync should also have a
<input type='number'> and a <input type='range'> slider for the percentage in Error Correction with 2, 3 or 4 labelled levels.
Maybe Low ~7% and Medium ~15%, just like QR codes. The idea is that the users will be able to fine-tune an exact percentage from 1% to the maximum, and an exact level of zst, but the labelled buttons are for guidance and familiarity.

axpnet Jun 9, 2026
Maintainer Author

Thanks Ehud, going point by point.

1. detached sub-types & the <5 MB problem. I think the two "flavours" you describe aren't actually two placements: they fall out of whether chunking is on. With chunking ON the vault is already a flattened content-store, so parity lives alongside the chunks; and you're right that raw per-chunk parity would be tiny, so small parity fragments get coalesced into parity blobs above the provider's minimum part size (never sub-5 MB on S3). With chunking OFF (the whole-file overlay) parity is a single sibling sized to the chosen %. So "all parity in one folder" vs "sibling per chunk" map onto chunking-off vs chunking-on, not onto two separate modes. I do agree each resulting layout should carry its own explicit tag in the manifest. One open question for you: do you see any case where you'd want sub-5 MB parity siblings kept separate rather than coalesced?
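
To picture the coalescing, a small greedy sketch. It is only an illustration of the idea, not the planned layout: the function name is hypothetical, the threshold is a parameter, and folding a short tail into the previous blob is my assumption about how leftovers would be handled.

```python
MIB = 1024 * 1024
MIN_BLOB = 5 * MIB  # provider minimum used as the coalescing threshold


def coalesce(fragment_sizes: list[int], min_blob: int = MIN_BLOB) -> list[list[int]]:
    """Group small parity fragments into blobs of at least min_blob bytes."""
    blobs, current, current_size = [], [], 0
    for size in fragment_sizes:
        current.append(size)
        current_size += size
        if current_size >= min_blob:
            blobs.append(current)
            current, current_size = [], 0
    if current:
        if blobs:
            blobs[-1].extend(current)   # fold the short tail into the last blob
        else:
            blobs.append(current)       # everything together is still small: one object
    return blobs


print([sum(b) // MIB for b in coalesce([2 * MIB] * 7)])
```

Seven 2 MiB fragments end up as two blobs of 6 MiB and 8 MiB, so nothing below the threshold is written unless the whole set is below it.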

2. "Parity storage cost". Agreed, that phrasing is jargon. I'll switch Table 3.1 to an explicit byte framing ("parity byte size" / extra bytes written).

4. Default structure. Good: I'll keep detached as the draft default and study Kopia's parity-store layout as the reference before refining.

6. The other sidecar overlay. Fully agree: park it, give it its own offline-first repo when the time comes, and keep this round focused on the four overlays we already scoped.

9. Configurable levels. Yes to both, in AeroVault and AeroSync: a number input + range slider for the zstd level (with Fast/Balanced/Archive as the guided buttons), and the same number+range for the EC percentage with a few labelled stops (Low ~7%, Medium ~15%, like QR codes), fine-tunable from 1% up. Buttons for familiarity, slider for exact control. I'll capture it as the UI spec for when EC lands in the GUI.

Nothing here is in code or tagged yet: this stays the review surface, as agreed.

EhudKirsh Jun 9, 2026
Collaborator

do you see any case where you'd want sub-5 MB parity siblings kept separate rather than coalesced?

I'm not sure. I'm curious to see how AeroSync will handle Error Correction first, which might give me an idea.

axpnet Jun 9, 2026
Maintainer Author

No date to give you yet: the order is set by sequence, not by calendar.

Embedded EC over AeroVault is already code-complete on a working branch: create/scrub/repair, CLI and GUI, tests green. It's exactly the review surface from this thread, so it lands in a release once we close this review pass, nothing tagged before then.
AeroSync's EC handling, the part you're curious about, is the next slice after that. It's design-stage right now, so it gets its own pass and I'll bring the concrete behaviour back here for your read before it sets. That's also where the sub-5 MB coalescing question gets answered in practice, so your instinct to see AeroSync first is the right order.

So: AeroVault embedded ships first (review-gated, ready), AeroSync EC follows. No version lock on either, they go out when the review and the design are done, not on a fixed date.

axpnet · 2026-06-11T14:27:04Z

axpnet
Jun 11, 2026
Maintainer Author

AeroVault v3: stress validation for the Beta to Stable promotion

v3 has been the on-disk AeroVault format since aerovault 0.4.0 on crates.io, and it reads existing v2 vaults unchanged (no in-place conversion: new vaults are v3, old ones keep working). Before promoting it from Beta / opt-in to Stable, I wanted a technical green gate beyond the unit suite: a large, adversarial stress battery on the real release binary. Here is the result. Nothing is tagged; this is the validation evidence for the promotion call.

Method

4,914 real CLI operations across 7 scenario families, run in parallel on the release binary (so the Argon2id 128 MiB KDF runs for real on every open). Every case asserts one of two things: a byte-exact round-trip (SHA-256 of the extracted file equals the original), or a correct rejection (wrong input fails cleanly, never silently). A handful of representative commands are shown below; the battery runs thousands.

Scenario families

Round-trip integrity: every format (v1 / v2 / v2 --cascade / v3) by profile (fast / balanced / archive) by size (0 B to 1 MiB, boundary-heavy: 4095/4096/4097, 65535/65536) by content (random / zeros / text / repeating pattern). create, add, extract, SHA-256 must match.
Multi-file vaults + dedup: vaults of 5 to 35 files, identical files added under different names (dedup path), extract-and-verify every entry.
Unicode / hostile filenames: café, 日本語, emoji_🔒, spaces, tabs, quotes, brackets, Greek / Cyrillic / Arabic scripts.
Wrong-password rejection: extraction under hundreds of wrong passwords must fail and never return the real plaintext.
Tamper / integrity fuzzing: flip 1 to 8 random bytes anywhere in the vault, then extract. Corruption must never silently pass (either the authentication layer rejects it, or the flipped bytes landed in slack and the output is still bit-exact). This is the property that matters most for a Stable format.
Bogus / garbage parse: random bytes presented as a .aerovault, where info and extract must fail cleanly and never panic.
Large files: 5 to 100 MB single files, streaming / memory stress.
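
For readers who want the shape of the tamper family without the real binary, here is a toy sketch of the property being asserted. It is not the actual harness: an HMAC-SHA-256 tag over a plaintext payload stands in for the vault's authentication layer, and the key and function names are hypothetical.

```python
import hashlib
import hmac
import random

KEY = b"demo-key"   # hypothetical key for the toy container
TAG_LEN = 32


def seal(plaintext: bytes) -> bytes:
    """Toy container: 32-byte HMAC tag followed by the payload."""
    return hmac.new(KEY, plaintext, hashlib.sha256).digest() + plaintext


def open_sealed(container: bytes) -> bytes:
    tag, body = container[:TAG_LEN], container[TAG_LEN:]
    if not hmac.compare_digest(tag, hmac.new(KEY, body, hashlib.sha256).digest()):
        raise ValueError("authentication failed")
    return body


def flip_bytes(container: bytes, rng: random.Random) -> bytes:
    """Damage 1 to 8 distinct random offsets."""
    damaged = bytearray(container)
    for offset in rng.sample(range(len(damaged)), rng.randint(1, 8)):
        damaged[offset] ^= rng.randint(1, 255)  # non-zero XOR always changes the byte
    return bytes(damaged)


def fuzz(plaintext: bytes, iterations: int, seed: int = 0) -> int:
    """Return how many tampered containers were rejected; never accept wrong bytes."""
    rng = random.Random(seed)
    container = seal(plaintext)
    rejected = 0
    for _ in range(iterations):
        try:
            out = open_sealed(flip_bytes(container, rng))
        except ValueError:
            rejected += 1
            continue
        assert out == plaintext, "silent corruption"
    return rejected
```

Each iteration must end in one of two states, a clean rejection or a bit-exact output, which is the same two-outcome rule the battery applies to every case.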

Results

Scenario family	Cases	Pass
1. round-trip integrity	432	432
2. multi-file + dedup	30	30
3. unicode / hostile names	45	45
4. wrong-password reject	400	400
5. tamper / integrity fuzz	2,500	2,500
6. bogus / garbage parse	1,500	1,500
7. large files	7	7
Total	4,914	4,914 🟢

The first pass of this battery was not clean, and that is the point of running it. It flagged 144 failures, all in one corner: extracting from a v2 (legacy) vault to a named output file. The v3 format and its round-trip were never affected. The cause was a bug in the app's v2 extract wrapper: it derived the output directory from the destination and let the archive name the file itself, so the requested filename was ignored and a second extract to the same name failed with File exists. v1 and v3 already honored the full destination path; v2 did not. It is now fixed so v2 matches v1/v3, covered by a new v1+v2+v3 regression test, and the rerun above is clean. A Stable-promotion gate that surfaces and closes a real backward-compat bug before any tag is doing exactly its job.

Representative commands (the battery runs thousands of these)

Round-trip integrity, byte-exact across formats, profiles, sizes and content shapes:

# v3 / archive profile / 300 KB incompressible
$ aeroftp-cli vault create v.aerovault --vault-version v3 --profile archive
$ aeroftp-cli vault add     v.aerovault payload.bin
$ aeroftp-cli vault extract v.aerovault payload.bin out.bin
$ sha256sum payload.bin out.bin       # identical
9d399285f16a0bcee0d8420444731240018335bbf0415a949cf2c0797113a909  payload.bin
9d399285f16a0bcee0d8420444731240018335bbf0415a949cf2c0797113a909  out.bin

Tamper detection, a single flipped byte in the ciphertext is caught before any plaintext is returned:

$ printf '\x00' | dd of=v.aerovault bs=1 seek=120000 count=1 conv=notrunc
$ aeroftp-cli vault extract v.aerovault payload.bin out.bin
Error: Cipher block hash mismatch for chunk ccb51bfb...   # exit 1, rejected before any output

Wrong password, fails closed, no plaintext leak:

$ AEROFTP_VAULT_PASSWORD="not-the-password" aeroftp-cli vault extract v.aerovault payload.bin out.bin
Error: AES-KW unwrap failed   # exit 1, key unwrap fails, no plaintext produced

Garbage input, random bytes as a vault never panic the parser:

$ head -c 50000 /dev/urandom > junk.aerovault
$ aeroftp-cli vault info junk.aerovault
Error: Failed to read archive: invalid Zip archive: Could not find EOCD   # exit 1, clean rejection

Conclusion

🟢 4,914 / 4,914 cases pass. The integrity guarantee held across 2,500 byte-tamper iterations with zero silent corruption, and 1,500 garbage inputs with zero panics. The one real defect the battery found (the v2 named-extract bug above) is fixed and pinned by a new regression test. Combined with the existing unit suite (full Rust lib suite green) and the v3 audit (CRYPTO-01 chunk-splicing fix shipped in v3), this is a technical green gate. My recommendation: v3 is ready to move from Beta to Stable. As always, nothing here is tagged; flagging it for review first.

axpnet · 2026-06-11T14:46:14Z

axpnet
Jun 11, 2026
Maintainer Author

AeroVault v4 (v3 + Error Correction): development follow-up, audit, fixes, and live evidence

Status at a glance:

🟢 Green (done and verified): the Reed-Solomon codec and its single-source extraction, the audit (every critical/high/medium fixed), the full gate (fmt, clippy, 2,214 lib tests, audit), and the live evidence (8/8 scenarios). The engineering is settled.
🟡 Yellow (still open): the three pre-tag decisions at the bottom (extension names, default placement, level labels), and therefore the tag/release itself. These are calls to make, not code to write.

Following the Error Correction design I posted here on 8 June, I took the track end-to-end internally so the pre-tag review has something concrete to push on. The point of building it now is exactly to surface technical problems early, before anything is tagged, rather than discover them after a release. Nothing below is tagged or in a release. This is still the review surface; everything is open to change on your read, @EhudKirsh.

What is built (recap)

v4 is not a new format: it is v3 plus an Error-Correction layer. That layer is a non-critical extension, so a v3-only reader still opens a v4 vault (it just ignores the parity). The slices, as built:

Reed-Solomon codec extracted into a shared error_correction/ module (one source of truth for both vault and sync, so they can never drift on the wire format).
Vault parity, three placements: embedded (in-container, refreshed on every seal), detached (a sibling .aerovault.rec that leaves the encrypted container byte-identical, par2's model, and it lets you add parity to a vault created without it), and both.
AeroSync .aerorec parity sidecars, so a bit-rotted remote backup can be repaired on the next pull, without the original. This is the gap rclone has nothing for and Kopia can't add to an existing repo.
CLI (vault create --error-correction, scrub, repair, export-parity, strip-parity; sync --error-correction), GUI, and 47-locale i18n.

Audit: final assessment

I ran a full post-implementation audit (four independent review passes: parser/security, Reed-Solomon correctness, sync integration, and code quality/duplication). Headline:

The repair trust-model is sound. Every repair re-verifies the recovered bytes against an authenticated value before persisting: the vault header MAC, the AEAD-decrypted manifest's cipher_hash, or the sync file's expected SHA-256. A foreign or corrupt parity file can therefore only make repair fail, never overwrite good data. (Verified live, see below.)
The codec extraction is clean, single definition, zero duplicated shard math, vault format byte-identical (the v3 test suite is untouched and green).
The exploitable gaps were in the parsers that read untrusted sidecar bytes, and they are fixed.

Severity	Finding	Resolution
Critical	parity-payload parser could integer-overflow / over-allocate on a crafted header (sidecars are read from untrusted remotes)	checked arithmetic + bound-before-allocate, regression-tested
Critical	the GUI sync compare didn't exclude `*.aerorec`, so sidecars could be deleted as orphans	sidecars now always excluded from comparison
High	sync sidecar parser could over-allocate on a forged segment count	bounded before allocation
High	the GUI didn't remove a file's sidecar when the file was deleted	paired sidecar now cleaned up
High	the sync cost preview under-reported parity size by up to ~600x for small files	now computed from the real grid geometry, locked by an equality test
Medium	an unrepairable download could be reported healthy	added an explicit `verify_failed` counter

All critical/high/medium items are fixed on the branch, plus a set of low/quality items (dead-code removal, dependency pinning, +9 direct codec tests). Full report lives in the dev appendix.
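
The "bound-before-allocate" pattern behind the parser fixes, as a sketch. The header layout (two little-endian u32 fields) and the shard limit are hypothetical, not the real sidecar format. Python integers cannot overflow, so the checked-arithmetic half of the fix only shows up here as the explicit range checks.

```python
import struct

HEADER = struct.Struct("<II")   # hypothetical header: shard_count, shard_len
MAX_SHARDS = 256                # hypothetical sanity limit


def parse_parity(buf: bytes) -> list[bytes]:
    """Validate the declared geometry against the real buffer before slicing."""
    if len(buf) < HEADER.size:
        raise ValueError("truncated header")
    shard_count, shard_len = HEADER.unpack_from(buf)
    if shard_count == 0 or shard_count > MAX_SHARDS or shard_len == 0:
        raise ValueError("implausible geometry")
    # The declared size must equal the bytes actually present, so a forged
    # header cannot make the parser allocate more than it was handed.
    if shard_count * shard_len != len(buf) - HEADER.size:
        raise ValueError("declared size does not match payload")
    body = memoryview(buf)[HEADER.size:]
    return [bytes(body[i * shard_len:(i + 1) * shard_len]) for i in range(shard_count)]
```

The order matters: every length taken from untrusted bytes is checked against a limit and against the buffer before it drives any allocation or slice.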

Gate: all green

cargo fmt clean, clippy -D warnings --tests (lib and CLI) 0 warnings, full Rust lib suite 2,214 pass / 0 fail, cargo audit clean, TypeScript tsc clean, 63 frontend unit tests pass, CLI release build OK.

Live evidence (real CLI, no mocks)

Embedded parity: inject damage, detect it, repair it, byte-verify the recovered file:

$ aeroftp-cli vault create v.aerovault --error-correction --recovery-level 20
$ aeroftp-cli vault add v.aerovault secret.bin        # 300 KB incompressible
$ aeroftp-cli vault scrub v.aerovault --json
{ "checked": 1, "count": 0, "damaged": [], "parity_source": "embedded" }

# corrupt 48 bytes inside the encrypted data section, then scrub again:
$ aeroftp-cli vault scrub v.aerovault --json
{ "checked": 1, "count": 1, "damaged": [ { "id": "e840a885...", "on_disk_len": 300051 } ], "parity_source": "embedded" }

$ aeroftp-cli vault repair v.aerovault --json
{ "repaired": 1, "damaged": 1, "dry_run": false, "parity_source": "embedded" }

$ aeroftp-cli vault extract v.aerovault secret.bin out.bin
$ sha256sum secret.bin out.bin     # identical, recovery is bit-exact

Detached parity, and the safety property that matters most: a hostile recovery file is rejected, the vault is left untouched, and the genuine parity then repairs the same damage:

$ aeroftp-cli vault export-parity v2.aerovault
Wrote recovery file v2.aerovault.rec (12 shards, 300051 bytes protected, 20.1% overhead)

# a random 5 KB ".rec" handed to repair on a damaged vault:
$ aeroftp-cli vault repair v2.aerovault --parity bogus.rec --json
Error: recovery file does not match this vault   (exit 1, vault unchanged)

# the real sidecar repairs it:
$ aeroftp-cli vault repair v2.aerovault --parity v2.aerovault.rec --json
{ "repaired": 1, "parity_source": "explicit" }

(8/8 live scenarios pass, including strip-parity refusing to leave a vault with zero recovery.)

Interesting ideas the audit surfaced

These are not committed scope, they are the genuinely novel directions the audit turned up, worth your eyes:

Tiny-file EC policy. The fixed-grid v2 codec floors shards at 4 KiB, so a 100-byte file gets an ~8 KB sidecar (huge in percentage, tiny in absolute terms). For sync that's wasteful; a minimum-benefit gate (or a sub-4 KiB "tiny" profile) would skip or shrink parity for micro-files. Honest and cheap.
Per-directory recovery bundle. One .aerorec per protected subtree instead of one-per-file, which directly fixes the "object count doubles on the remote" cost on per-request-billed backends. A real improvement over par2's one-set-per-file framing for backup trees.
Provider-adaptive parity. S3-class storage already replicates (Kopia calls cloud ECC "overkill"); the consumer NAS/FTP targets are the ones with no built-in scrub. The Plan tab could recommend EC by backend reliability instead of treating all remotes alike.
Cross-file (global) parity. par2-style parity spread across a file set would recover whole-file loss, not just localized rot, a superset of what a single-file sidecar can do, and a natural fit for the backup story.
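
The tiny-file numbers can be reproduced with a short sketch. Assumptions: the 10+2 default geometry, the 4 KiB shard floor stated above, no header overhead counted, and a hypothetical gate (`worth_protecting`) with an arbitrary ratio.

```python
import math

DATA_SHARDS, PARITY_SHARDS = 10, 2   # the 10+2 default
SHARD_FLOOR = 4096                   # 4 KiB floor of the fixed-grid codec


def parity_bytes(file_size: int) -> int:
    """Parity payload size, ignoring any header overhead."""
    shard_len = max(SHARD_FLOOR, math.ceil(file_size / DATA_SHARDS))
    return PARITY_SHARDS * shard_len


def worth_protecting(file_size: int, max_ratio: float = 1.0) -> bool:
    """Hypothetical minimum-benefit gate: skip parity that outweighs the file."""
    return file_size > 0 and parity_bytes(file_size) <= max_ratio * file_size


for size in (100, 8192, 300_000):
    print(size, parity_bytes(size), worth_protecting(size))
```

A 100-byte file gets 8,192 bytes of parity and fails the gate, while a 300 KB file sits at the expected 20% and passes.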

Remaining work

v4 EC: remaining
├── Pre-tag, needs your call (this thread)
│   ├── sidecar extension names (.aerorec / .aerovault.rec)
│   ├── placement default (detached vs embedded)
│   └── the percentage-level UI (Low / Medium / Quartile / High)
├── Tracked follow-ups (non-blocking)
│   ├── preserve remote mtime on the engine repair path
│   ├── reconcile CLI/library default percentage
│   ├── GUI verify-failed counter (engine done) · resume parity hole
│   └── expose an EC level on the MCP sync surface
└── Future phases (designed)
    ├── windowed streaming → lift the 256 MiB sync size cap (container already multi-segment)
    └── background scrub + a health badge on vault cards

Decisions still open

Three calls are still open before this can be tagged:

Sidecar extension names. .aerovault.rec for a vault's recovery file, .aerorec for AeroSync parity sidecars.
Default parity placement. Detached (parity in a sibling file, so the encrypted container stays byte-identical and the storage/browse view does not change) vs embedded (parity inside the container). Current lean: detached.
Recovery-level UI labels. The percentage levels surface as Low / Medium / Quartile / High.

Everything else in the tree above is decided. The tag stays blocked until these are settled; any of them can also move back into the roadmap if that is a better home.

7 replies

EhudKirsh Jun 11, 2026
Collaborator

multipart-upload minimum part size

So in other words, is that chunking of large buckets into smaller chunks only for transfer as opposed to at rest for S3? I use the word bucket instead of a file here because we refer to S3.

EhudKirsh Jun 12, 2026
Collaborator

Also, you might want to add to table 3.1 above the detached advantages I just listed.
This is one case in which the Both option isn't benefiting from all the advantages of both.

axpnet Jun 12, 2026
Maintainer Author

Both good follow-ups.

On S3: yes, exactly. Multipart is purely a transfer mechanism: the parts are reassembled server-side into one object, so at rest there is no chunking and nothing is stored in pieces. The 5 MB minimum applies only to a part during the upload protocol (and not to the final part), never to the stored object, which is why a sub-5 MB file (or a small .aerocorrect sidecar) goes up fine as a single PUT.
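
A sketch of the transfer-side decision, to show where the minimum actually bites. The function name and the 8 MiB part size are illustrative choices, not AeroFTP settings; the 5 MiB floor is S3's multipart minimum, which does not apply to the final part.

```python
MIB = 1024 * 1024
MIN_PART = 5 * MIB   # S3 multipart minimum for every part except the last


def plan_upload(size: int, part_size: int = 8 * MIB) -> list[int]:
    """Byte length of each request; a single entry means one plain PUT."""
    if part_size < MIN_PART:
        raise ValueError("part size below the multipart minimum")
    if size <= part_size:
        return [size]                      # small object: no multipart at all
    full, tail = divmod(size, part_size)
    return [part_size] * full + ([tail] if tail else [])


print(plan_upload(300 * 1024))   # a small sidecar
print(plan_upload(20 * MIB))     # a larger object
```

The small sidecar is one request, and the 20 MiB object is three parts whose short 4 MiB tail is allowed because it is the last one; either way the stored result is a single object.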

On table 3.1: agreed on both counts. I will add the two detached advantages you listed (delete any parity file by hand from any file manager to reclaim space, and the protected files stay viewable and byte-measurable since EC is non-transformative), and I will flag that Both does not inherit all advantages of both placements: because it also writes embedded parity, it loses the clean byte size and the hand-reclaimable space that are exclusive to pure detached. I will update the table accordingly.

axpnet Jun 12, 2026
Maintainer Author

Done: table 3.1 now lists the two detached advantages (delete parity by hand to reclaim space, and the protected file stays usable and byte-measurable), and both rows show Both as partial since the embedded parity cancels them. I also renamed the placement column to detached across tables 3.1 and 3.2 and aligned the prose to match.

axpnet Jun 12, 2026
Maintainer Author

Link to the updated table, for reference: #276 (comment)

axpnet · 2026-06-12T09:55:16Z

axpnet
Jun 12, 2026
Maintainer Author

🟢 AeroVault v4 EC follow up: windowed streaming, the size cap is lifted

Continuing the large-file thread: the detached .aerocorrect error correction no longer loads the whole file into memory and is no longer capped at 256 MiB.

How it works, briefly:

A large file is tiled into fixed 64 MiB windows, and each window carries its own independent Reed-Solomon parity segment inside the same .aerocorrect sidecar. The format was already multi-segment (it is what lets one vault sidecar protect header, manifest and data separately), so this just uses that same shape for size.
Generation, verification and repair touch at most one window of plaintext at a time, so peak memory is bounded by the window, not by the file.
Repair is window-local and all-or-nothing: each window is rebuilt from its own parity into a temporary file, and the original is replaced atomically only if the whole repaired stream hashes back to the expected value, otherwise the original is left byte-for-byte untouched.
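
The tiling itself is simple enough to sketch. The function name is hypothetical; the 64 MiB window is the value stated above.

```python
MIB = 1024 * 1024
WINDOW = 64 * MIB   # fixed window size from the design


def windows(file_size: int, window: int = WINDOW) -> list[tuple[int, int]]:
    """(offset, length) of each window; only the last one may be short."""
    return [(off, min(window, file_size - off)) for off in range(0, file_size, window)]


print([length // MIB for _, length in windows(130 * MIB)])
```

For a 130 MiB file this gives three windows of 64, 64 and 2 MiB, and since each window is processed on its own, peak plaintext memory follows the window size rather than the file size.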

Live evidence (real CLI, MEGA, no mocks)

Scenario	Input	Result
Vault EC, corrupt then repair	250 KB vault	repaired from detached `.aerocorrect`, extract byte-identical
Sync EC, small file	300 KB	single window, magic `AEROCORR`, unchanged behaviour
Sync EC, large file	130 MiB	3 windows (64 + 64 + 2 MiB), sidecar 20 MiB, segments = 3
Verify on download, large	130 MiB	byte-identical, peak RSS 70 MB (the file is never fully in RAM)
Windowed repair	multi-window	rebuilt per window, all-or-nothing (unit proven)

Caps and safeguards

Item	Value	Why
Window size	64 MiB	balances per-window memory against the per-window segment table
Default per-file cap	1 GiB (overridable per call)	the sidecar is still held in memory during repair and grows with the file
Plaintext memory	one window	streaming generation, verification and repair
Repair safety	atomic temp then rename	the original is untouched unless the repaired stream verifies
Layout check	window tiling validated	a forged or foreign sidecar layout is rejected before any byte is written
Binding	content SHA-256	one format for any file, as agreed
Download guard	oversize sidecar rejected	a remote cannot serve a giant sidecar for a tiny file
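
The "atomic temp then rename" row can be sketched as follows. This is a generic illustration of the pattern with a hypothetical function name, not the engine's code; it assumes the repaired stream arrives as an iterable of byte chunks and that the temporary file lives in the same directory so the rename stays on one filesystem.

```python
import hashlib
import os
import tempfile


def replace_if_verified(path: str, repaired_chunks, expected_sha256: str) -> bool:
    """Write the repaired stream to a temp file; swap it in only if it verifies."""
    directory = os.path.dirname(os.path.abspath(path))
    digest = hashlib.sha256()
    fd, tmp = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as out:
            for chunk in repaired_chunks:
                digest.update(chunk)
                out.write(chunk)
            out.flush()
            os.fsync(out.fileno())
        if digest.hexdigest() != expected_sha256:
            return False                 # original stays byte-for-byte untouched
        os.replace(tmp, path)            # atomic swap
        tmp = None
        return True
    finally:
        if tmp is not None and os.path.exists(tmp):
            os.remove(tmp)               # never leave a half-repaired temp behind
```

The original is only ever touched by the final rename, so a failed or unverifiable repair leaves it exactly as it was.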

Tests executed

cargo test --lib: 2223 passed, 0 failed, including 8 new windowing tests (window tiling, windowed size estimate equality, multi-window round trip, per-window repair, streaming file generation equals in-memory, all-or-nothing on an unrecoverable window).
cargo test --bin aeroftp-cli: 219 passed, 0 failed.
clippy on lib and bin with --tests and -D warnings: clean.
cargo fmt --check, cargo audit, frontend typecheck, 257 frontend unit tests, 47 locales validation, release build: all green.

One honest note: the plaintext is fully streamed, but the sidecar itself is still read into memory on the repair path, which is why the default cap is 1 GiB for now. Lifting it further needs a streaming sidecar (parse and fetch parity window by window), which is a separate follow up.

Next: a top-level correct subcommand to apply this to any standalone file is the natural next step, and it also lets us live-test windowed repair end to end.

7 replies

axpnet Jun 12, 2026
Maintainer Author

Good questions, in order.

How the tests corrupt things, and the format point. You are right that a flip can change how a parser reads a file: corrupt a length or an offset and the structure is suddenly invalid. Error correction sidesteps that entirely, it protects the raw byte stream, so it restores the exact original bytes regardless of what the corruption did to the format interpretation. The harness just XORs a byte (or wipes a whole window) at chosen offsets, in the file and, for the sidecar tests, in the .aerocorrect itself. So it is real bit-rot, not a structural edit: the magic, header offsets and sidecar layout are all still present, the instance is just damaged until EC rebuilds it.
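
For readers who want to picture the corruption step, a minimal Python sketch of the two operations described (XOR one byte, or wipe a range) could look like this. The helper names `flip_byte` and `wipe_range` are hypothetical, and the real harness is not shown in this thread.

```python
def flip_byte(data: bytes, offset: int, mask: int = 0xFF) -> bytes:
    """Return a copy of data with the byte at offset XORed with mask."""
    if not 0 < mask <= 0xFF:
        raise ValueError("mask must be in 1..255 so the byte actually changes")
    damaged = bytearray(data)
    damaged[offset] ^= mask
    return bytes(damaged)


def wipe_range(data: bytes, start: int, length: int) -> bytes:
    """Return a copy of data with [start, start + length) zeroed, clamped to the data."""
    end = min(len(data), start + length)
    start = min(start, end)
    damaged = bytearray(data)
    damaged[start:end] = bytes(end - start)  # same-size replacement keeps the length
    return bytes(damaged)
```

Both helpers preserve the length of the input, which matches the point above: the structure is still present, only the bytes are damaged.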

The Crypt wrapper is the key insight. EC runs LAST in the wrapper pipeline (compression, chunking, crypt, then EC), so inside a .aerovault it protects the CIPHERTEXT, not the plaintext. The original file format is already gone at that point, and EC never sees or cares what it was: it protects the container bytes as-is. Repair restores the ciphertext, and only then does decryption give you the original file back. Corruption and recovery both happen at the container level.

Corrupting the file only, versus the file and its sidecar.

File damaged, sidecar intact: repair reads the intact parity, reconstructs the damaged region, re-verifies it against the authenticated cipher_hash / header MAC, and only then persists. Recovers up to the parity budget.
File and sidecar both damaged: the .aerocorrect carries its own BLAKE3 integrity over the whole thing, so any flip in it makes it fail to parse and it is rejected wholesale. If that vault also kept embedded parity (the Both placement), repair falls back to the embedded copy; otherwise there is nothing to repair from and the damaged vault is left byte-for-byte untouched, never a wrong overwrite. Same property par2 has: lose the recovery data too and you lose the recovery, which is exactly why the parity location is a choice.

On "v4 = v3 plus a detached sibling": yes, and embedded is already there too. Placement is a three-way choice:

detached (default): the .aerovault stays byte-identical to a plain v3, parity lives in the .aerocorrect sibling, the storage view stays clean.
embedded: parity is a non-critical extension stored INSIDE the .aerovault, auto-refreshed on every seal. A v3 reader just ignores it, so it stays forward-compatible.
both: embed AND write the sibling, for two independent recovery locations.

So a user who wants the parity inside the container already gets it, exactly as you suggest.
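
The three placements can be summarised as a small Python sketch. The enum and function names are hypothetical, and the "sibling first, then embedded" order is an assumption read from the fallback described above for the both placement.

```python
from enum import Enum
from typing import List


class Placement(Enum):
    DETACHED = "detached"  # parity only in the .aerocorrect sibling (default)
    EMBEDDED = "embedded"  # parity only inside the .aerovault
    BOTH = "both"          # embedded copy plus a sibling


def parity_sources(placement: Placement) -> List[str]:
    """Recovery locations to try, in order (assumed: sibling before embedded)."""
    if placement is Placement.DETACHED:
        return ["sibling"]
    if placement is Placement.EMBEDDED:
        return ["embedded"]
    return ["sibling", "embedded"]


def container_matches_plain_v3(placement: Placement) -> bool:
    """Only detached placement leaves the container byte-identical to a plain v3."""
    return placement is Placement.DETACHED
```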

And your last point is exactly right. Embedded placement is literally "EC appended as trailing bytes", and it works for any format that tolerates trailing data (the vault's extension area is built for it). The catch is that not every format ignores trailing bytes, so the universal version of your idea is the detached sibling: it protects ANY file without touching it. That is why detached is the default, and the planned correct subcommand will do exactly that for any standalone file, a .aerocorrect written next to it, no special reserved space required.

axpnet Jun 12, 2026
Maintainer Author

A few diagrams to make the above concrete.

1. Where error correction sits. EC is the last wrapper, so it protects the ciphertext, not your file. It never decodes the format, it just restores the exact bytes, then decryption gives the file back.

2. Parity placement. Detached is the default (the .aerovault stays byte-identical to a plain v3, parity in the sibling), embedded keeps the parity inside the container, both writes it in both places (in-container plus a sibling).

3. Recovery, file only versus file and sidecar. With the sidecar intact you reconstruct and re-verify against the authenticated hash. A lightly damaged sidecar self-heals first (its replicated locator is repaired and rotted parity shards are routed around) and still recovers within the parity budget. Only damage past that budget falls back to the embedded copy if one exists, otherwise the vault is left byte-for-byte untouched, never a wrong overwrite.

EhudKirsh Jun 12, 2026
Collaborator

Nice diagrams.

Please add the word 'sibling' to the 'both' panel as well, so there's no confusion about why it appears in the detached scenario but not there.

Also, surely if .aerocorrect is only slightly corrupted it can still recover .aerovault?

axpnet Jun 12, 2026
Maintainer Author

Thanks, glad they help.

Good catch on both, you're right that it should say "sibling" there too. both means the parity lives in two places at once, the embedded copy inside the container AND a sibling .aerocorrect next to it, so the word belongs in that panel exactly like in detached. I've updated the placement diagram so the three read consistently: detached = sibling only, embedded = in-container only, both = in-container plus sibling.

And on a slightly-corrupted .aerocorrect still recovering the vault: yes, and you pushed me to make it actually true rather than leave it all-or-nothing, so I did. The recovery diagram showed the end state (recovered, or left untouched) but not the in-between step where the sidecar heals itself first, so I redrew it to show that step explicitly.

The recovery file now self-heals. Its small critical metadata (the header, the segment directory, the per-window geometry, the part that locates everything) is stored redundantly, so a stray flip there is repaired from a good copy before anything else runs. The bulk of the file is parity, and that was already resilient: every shard, data and parity, carries its own checksum and reconstruction treats a mismatching shard as an erasure and routes around it. So a lightly-damaged .aerocorrect reconstructs the vault as long as the damage stays inside the parity budget.

The safety property is unchanged: recovery only writes after the rebuilt bytes re-verify against the authenticated values (the vault's header MAC and manifest cipher_hash, or the file's content hash for AeroSync). Damage beyond the budget makes the repair fail and leaves the original byte-for-byte untouched, never a wrong overwrite. Lose the recovery data past what it can self-heal and you lose the recovery, the same par2 property that makes the parity location a choice.

Both updated diagrams are refreshed in place in the diagrams post above.

EhudKirsh Jun 12, 2026
Collaborator

Considering that if you still have an uncorrupted file you can always generate new EC for it, maybe the focus of EC should be just on recovering the original file as opposed to healing itself? Perhaps I misunderstand Error Correction, but does the self-healing of .aerocorrect come at the byte-size expense of storing parity for the main file, like .aerovault?

axpnet · 2026-06-12T15:48:29Z

axpnet
Jun 12, 2026
Maintainer Author

Good question, and no, the self-healing does not come out of the main file's parity budget.

The sidecar has two parts: the parity (the Reed-Solomon shards that actually rebuild your .aerovault), and a tiny locator (the segment directory, the content hash, and per-window geometry). Only the locator is stored in triplicate. The parity itself is stored exactly once and is completely unchanged.

The locator is a few hundred bytes, and it is fixed regardless of file size. Concretely, the redundancy adds about 360 bytes total. On a 1 MB file the whole sidecar is 154,621 bytes at the medium level (15.4% overhead), of which the self-healing redundancy is those ~360 bytes, so roughly 0.2% of the sidecar and 0.04% of the file. Protect a 1 GB file and it is still ~360 bytes, now a rounding error. So self-healing does not reduce how much damage to the main file you can recover.
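
The percentages above can be checked with a few lines of Python. The only assumption is that "1 MB" and "1 GB" mean decimal units (1,000,000 and 1,000,000,000 bytes); the 360-byte figure is the approximate one quoted above.

```python
REDUNDANCY_BYTES = 360        # approximate self-healing redundancy quoted above
SIDECAR_BYTES = 154_621       # whole sidecar at the medium level for a 1 MB file
FILE_1MB = 1_000_000          # assumption: decimal megabyte
FILE_1GB = 1_000_000_000      # assumption: decimal gigabyte

sidecar_overhead_pct = 100 * SIDECAR_BYTES / FILE_1MB          # sidecar vs file
share_of_sidecar_pct = 100 * REDUNDANCY_BYTES / SIDECAR_BYTES  # redundancy vs sidecar
share_of_1mb_pct = 100 * REDUNDANCY_BYTES / FILE_1MB           # redundancy vs 1 MB file
share_of_1gb_pct = 100 * REDUNDANCY_BYTES / FILE_1GB           # redundancy vs 1 GB file

print(f"sidecar overhead:      {sidecar_overhead_pct:.2f}%")
print(f"redundancy / sidecar:  {share_of_sidecar_pct:.2f}%")
print(f"redundancy / 1 MB:     {share_of_1mb_pct:.3f}%")
print(f"redundancy / 1 GB:     {share_of_1gb_pct:.6f}%")
```

Because the redundancy is a fixed number of bytes, its share shrinks by a factor of 1,000 going from the 1 MB file to the 1 GB file.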

On the "maybe EC should just focus on recovering the original" point: that is exactly the focus, and self-healing serves it. You are right that if you still have a good original you can always regenerate parity, but EC is for the case where you do not. The failure I wanted to close is this one: your .aerovault is damaged AND the recovery file takes one stray bit flip in its metadata. The old framing put a single checksum over the whole sidecar, so that one flip rejected the entire recovery file, and you lost the repair even though 99.99% of the parity was intact. Triplicating only the locator means a lightly rotted sidecar still repairs the file, at the cost of those few hundred bytes. The parity shards already self-check individually, so a rotted parity shard is just treated as an erasure and routed around, no wholesale checksum needed.

So it is not "spend parity to heal the sidecar". It is "spend ~360 fixed bytes so a slightly damaged sidecar is still usable when you need it most".
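
A rough Python sketch of a triplicated, self-checking locator is below. It is a toy model, not the .aerocorrect layout: the record format (payload followed by a 16-byte checksum) is invented for illustration, and BLAKE2b from the standard library stands in for whatever per-copy checksum the real format uses.

```python
import hashlib
from typing import List, Optional

CHECKSUM_LEN = 16  # hypothetical checksum size for this sketch


def _checksum(payload: bytes) -> bytes:
    # Stand-in checksum; the real per-copy checksum is not specified in this thread.
    return hashlib.blake2b(payload, digest_size=CHECKSUM_LEN).digest()


def encode_locator(payload: bytes, copies: int = 3) -> List[bytes]:
    """Store the locator several times, each copy carrying its own checksum."""
    record = payload + _checksum(payload)
    return [record] * copies


def _verified(record: bytes) -> Optional[bytes]:
    if len(record) < CHECKSUM_LEN:
        return None
    payload, tag = record[:-CHECKSUM_LEN], record[-CHECKSUM_LEN:]
    return payload if _checksum(payload) == tag else None


def recover_locator(records: List[bytes]) -> Optional[bytes]:
    """Return the locator payload, or None if it cannot be recovered safely."""
    # 1) Any copy whose checksum verifies is good as-is.
    for record in records:
        payload = _verified(record)
        if payload is not None:
            return payload
    # 2) Otherwise vote byte by byte across three equal-length copies,
    #    and accept the result only if its checksum verifies.
    if len(records) == 3 and len({len(r) for r in records}) == 1:
        voted = bytes(a if a in (b, c) else b for a, b, c in zip(*records))
        return _verified(voted)
    return None
```

The vote is only trusted after the checksum passes, so a bad majority can make recovery fail but cannot produce a wrong locator.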

3 replies

EhudKirsh Jun 12, 2026
Collaborator

Maybe I'm onto something and maybe I'm onto nothing here, but if the issue is that a cryptographic hash may fail due to a bit flip,
why not consider fuzzy hashes here as well? As a second plan alongside BLAKE3, to quickly evaluate how corrupted .aerocorrect has become.

axpnet Jun 12, 2026
Maintainer Author

I like the instinct, and you have put your finger on the real property: a cryptographic hash is all-or-nothing, it tells you "identical or not" and nothing about degree. So the question "can we measure how damaged this is" is a good one.

The catch is that a recovery path can never act on "approximately right". The whole safety model is that we re-verify the rebuilt bytes against an exact hash before we persist anything, so a sidecar can only make a repair fail, never corrupt good data. A fuzzy match that says "this copy is 95% similar to what it should be" cannot be used as the source of truth, because we still do not know which 5% is wrong, and writing it back would be exactly the silent corruption we are trying to prevent. So fuzzy hashing cannot replace BLAKE3 for the integrity decision itself.

But here is the nice part: the "how corrupted" meter you are asking for already exists, and it is sharper than a single fuzzy score. The format is built from many small self-checking pieces. Every Reed-Solomon shard carries its own checksum, and the locator is stored in three copies each with its own checksum. So instead of one fuzzy number for the whole file, we already know exactly which shards and which copies failed. That gives a precise, actionable map: "12 of 15 shards intact in this window, 2 of 3 locator copies good", and from the shard count versus the parity budget we can even say up front whether a repair will succeed. That is strictly more information than a similarity score, and it is the same data reconstruction already uses to route around the damage.

There is also a practical issue with fuzzy hashes specifically here: the locator is tiny, well under a kilobyte, and ssdeep / TLSH style hashes want much more input to be meaningful and can collide, so on this data they would be weaker, not stronger.

Where your idea does land cleanly is diagnostics, not recovery. A "health" readout on verify (something like "this sidecar is 90% intact, repair likely" or "too damaged to recover") is genuinely useful, and we can build it for free from the shard and copy failure counts we already compute, no fuzzy hash and no new dependency. If that is the experience you were picturing, I am happy to surface it on the verify path.
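
As a sketch of what such a readout could compute, the Python below turns per-window shard failure counts into a verdict. It relies on the standard Reed-Solomon erasure property (a group survives as long as the number of lost shards does not exceed the number of parity shards). The class and function names are hypothetical, the 12 data + 3 parity split in the usage note is an example and not the real geometry, and the locator rule is simplified to "at least one good copy", ignoring the byte-majority path.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class WindowHealth:
    total_shards: int   # data + parity shards in this window
    parity_shards: int  # how many erasures the window can absorb
    failed_shards: int  # shards whose own checksum did not verify

    @property
    def repairable(self) -> bool:
        # Failed shards are treated as erasures; parity count is the budget.
        return self.failed_shards <= self.parity_shards


def health_readout(windows: Sequence[WindowHealth], good_locator_copies: int) -> str:
    """Summarise shard health without attempting any repair."""
    total = sum(w.total_shards for w in windows)
    failed = sum(w.failed_shards for w in windows)
    intact_pct = 100.0 * (total - failed) / total if total else 0.0
    ok = good_locator_copies >= 1 and all(w.repairable for w in windows)
    verdict = "repair likely" if ok else "too damaged to recover"
    return f"{intact_pct:.0f}% of shards intact, {verdict}"
```

With one window of 15 shards, 3 of them parity, and 3 failures, `health_readout([WindowHealth(15, 3, 3)], 2)` reports 80% of shards intact and a likely repair; a fourth failure in the same window flips the verdict.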

And to be fair to fuzzy hashing, it is a great tool in its own domain, finding near-duplicate or similar files, which is a different problem from deciding whether a recovery file is byte-exact. Thanks for pushing on this, it is the right kind of question.

EhudKirsh Jun 12, 2026
Collaborator

Also, BLAKE3 is faster, so it's best to start with it, and if a task can use either cryptographic or fuzzy hashes (hard to imagine them as alternatives), it's best to pick cryptographic for the speed.

I wasn't thinking too deeply, I just noticed a pattern and connected 2 dots.
Thanks for letting me know more about the use of hashes in Error Correction.

axpnet · 2026-06-12T22:12:14Z

axpnet
Jun 12, 2026
Maintainer Author

AeroVault v4 (v3 + Error Correction): ready to ship, asking for your go-ahead

🟢 This is engineering-complete and staged on every surface. Before I tag the release I want your read, @EhudKirsh, since we agreed v4 would go out on your review pass, and your testing and design input shaped most of it.

Where it stands:

The self-healing .aerocorrect format is done and the gate is fully green: cargo fmt, clippy -D warnings on the lib and the CLI with --tests, the full Rust test suite, cargo audit, the TypeScript and frontend unit tests, and 47-locale i18n at 100%. Live evidence on real backends holds, including the self-healing path on a corrupted sidecar recovering end to end.

Beyond the unit suite there is now a repeatable, seeded stress campaign for the self-healing format, run on the release binary. 2,017 end-to-end cases drive the real correct gen, verify and repair, each asserting one of two outcomes and never a silent wrong answer: a byte-exact recovery, or a correct fail-closed that leaves the original untouched. It is fully deterministic, so the same seed reproduces every byte.

Scenario	Cases	Pass
Damage within the parity budget recovers	240	240
Damage past the budget fails closed, file untouched	240	240
Corrupted locator copy heals from a good sibling	240	240
Two or three differing copies reconstruct by byte majority	240	240
Rotted geometry header heals from the directory copy	240	240
Destroyed locator fails closed	240	240
Foreign or garbage sidecar rejected, file left intact	120	120
Rotted parity shard routed around	180	180
Round-trip, 0 bytes to 1 MiB	144	144
Cross-implementation, app and `aerovault` crate	120	120
Multi-group, 24 MiB over several Reed-Solomon groups	4	4
Multi-window, 65 MiB over multiple windows	3	3
`aeroftp_correct_*` MCP tools end to end	6	6
Total	2,017	2,017 🟢
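
The shape of one seeded case can be sketched in Python as follows. This is an illustration of "same seed, same bytes" and of the two allowed outcomes, not the actual campaign driver; every name here is hypothetical.

```python
import random
from typing import List, Tuple


def corruption_plan(seed: int, file_size: int, flips: int) -> List[Tuple[int, int]]:
    """Derive (offset, xor_mask) pairs from a seed; the same seed gives the same plan."""
    rng = random.Random(seed)  # private generator, independent of global state
    return [(rng.randrange(file_size), rng.randrange(1, 256)) for _ in range(flips)]


def apply_plan(data: bytes, plan: List[Tuple[int, int]]) -> bytes:
    damaged = bytearray(data)
    for offset, mask in plan:
        damaged[offset] ^= mask
    return bytes(damaged)


def classify(original: bytes, damaged: bytes, after_repair: bytes,
             repair_reported_ok: bool) -> str:
    """Every case must land in exactly one of the two allowed outcomes."""
    if repair_reported_ok and after_repair == original:
        return "recovered"       # byte-exact recovery
    if not repair_reported_ok and after_repair == damaged:
        return "failed-closed"   # repair refused, file on disk left as it was
    raise AssertionError("silent wrong answer")
```

Anything outside the two outcomes raises, which is the property the table above asserts for all 2,017 cases.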

The standout: the app and the standalone aerovault crate emit byte-identical sidecars and repair each other's output, and the three aeroftp_correct_* MCP tools pass the same gen, verify, corrupt, repair, recover cycle.

The earlier open decisions are settled, the way this thread landed them. There is one unified detached recovery format, .aerocorrect, content-SHA bound, shared byte-for-byte between the app and the standalone aerovault crate (a cross-implementation fixture pins that). Detached is the default placement, so the encrypted container stays byte-identical to a plain v3. Overhead is the Low / Medium / Quartile / High levels (or a raw percentage).

Your two recent questions are answered in-thread: the self-healing redundancy costs about 360 fixed bytes and does not touch the main file's parity budget, and the fuzzy-hash idea is better served by the per-shard and per-copy failure counts we already compute (which I am happy to surface as a health readout on verify).

Your V3 Beta report on the crate discussion is handled too. Extract all, a first-class add-folder picker, the receipt save dialog, the "Time elapsed" wording, and a pre-flight memory guard so a very large input warns instead of OOMing are all done. Drag-and-drop onto the create screen, the change-mode action, true streaming vault I/O, and the .aerozip idea are honest follow-ups, and I will reply on that thread with the specifics.

On framing: there is no beta label. The validation here is extensive and the audit is continuous, so refinements land in later releases rather than gating this one. If you are comfortable, I will take that as the go-ahead and tag it. If anything still looks off to you, this is exactly the moment to say so and I will hold.

Thank you for the depth you brought to this one. It is a better format because of it.

4 replies

axpnet Jun 12, 2026
Maintainer Author

AeroVault compression, measured: AeroVault v3 vs zip / 7z / tar

Following the readiness note above, the one thing I would rather show than assert is how AeroVault actually compresses, so here is a full benchmark against the zip / 7z / tar encoders we ship.

Every format is driven through the one AeroFTP CLI, so the comparison is self-consistent: same machine, same corpora, same harness, timed under /usr/bin/time -v. Three repetitions, median reported. Every single row is round-trip verified, compress then extract, with the SHA-256 of the recovered file matching the original. 120 rows, zero round-trip failures.

The corpora span the honest range from compressible to incompressible:

Corpus	Shannon entropy (bits/byte)	Nature
text	4.38	English-ish prose
random	8.00	CSPRNG bytes (incompressible)
logs	4.82	repetitive structured logs
mixed	6.10	50% prose / 30% logs / 20% random

Shannon entropy is H = - Σ pᵢ · log₂ pᵢ over the byte distribution, sampled across the whole file (8 bits/byte means incompressible).
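
For anyone who wants to reproduce the entropy column on their own corpus, the formula translates directly into Python. This is a plain reading of the definition above, not the harness's code, and the function name is my own.

```python
import math
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Shannon entropy of the byte distribution, in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    n = len(data)
    # H = - sum(p_i * log2(p_i)) over the byte values that actually occur.
    return -sum((count / n) * math.log2(count / n) for count in Counter(data).values())
```

A buffer with all 256 byte values equally frequent scores 8 bits/byte, and a buffer made of a single repeated byte scores 0.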

Compressed size as a percentage of original, by corpus

text corpus at 32 MiB

Format	% of orig	% saved	Comp MB/s	Ext MB/s	Peak RSS MB	Enc
aerovault-fast	21.7	78.3	31.5	49.9	180	AES
aerovault-balanced	20.5	79.5	20.0	49.8	180	AES
aerovault-archive	15.6	84.4	2.0	52.7	180	AES
zip-aes	19.2	80.8	27.4	275.9	84	AES
7z-aes	15.1	84.9	0.9	63.1	131	AES
tar.gz	19.3	80.7	28.5	336.9	55	-
tar.xz	15.1	84.9	1.2	102.2	131	-
tar.bz2	12.8	87.2	11.3	39.4	55	-

mixed corpus at 32 MiB

Format	% of orig	% saved	Comp MB/s	Ext MB/s	Peak RSS MB	Enc
aerovault-fast	31.8	68.2	32.4	49.0	180	AES
aerovault-balanced	31.2	68.8	23.5	49.7	180	AES
aerovault-archive	28.4	71.6	1.8	47.7	185	AES
zip-aes	30.5	69.5	38.7	295.4	84	AES
7z-aes	28.4	71.6	1.6	85.9	144	AES
tar.gz	30.5	69.5	40.8	401.5	55	-
tar.xz	28.4	71.6	2.2	149.7	144	-
tar.bz2	26.9	73.1	6.3	29.5	56	-

The speed vs size tradeoff, with the Pareto frontier

Speed and size pull against each other, so the chart marks the Pareto frontier: a format is on it when no other is both smaller and faster, and the hollow points are dominated (some frontier format beats them on both axes at once). The faint green wedge below-right of the curve is the unreachable small-and-fast corner, which is why it sits empty.
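
The dominance rule can be applied mechanically to the text-corpus table above. The Python sketch below assumes the chart's speed axis is compress MB/s, which this post does not state explicitly; the numbers are copied from the table and the function name is my own.

```python
from typing import Dict, List, Tuple

# (compressed size as % of original, compress MB/s), text corpus at 32 MiB
TEXT_32MIB: Dict[str, Tuple[float, float]] = {
    "aerovault-fast": (21.7, 31.5),
    "aerovault-balanced": (20.5, 20.0),
    "aerovault-archive": (15.6, 2.0),
    "zip-aes": (19.2, 27.4),
    "7z-aes": (15.1, 0.9),
    "tar.gz": (19.3, 28.5),
    "tar.xz": (15.1, 1.2),
    "tar.bz2": (12.8, 11.3),
}


def pareto_frontier(points: Dict[str, Tuple[float, float]]) -> List[str]:
    """Formats for which no other format is both smaller and faster."""
    frontier = []
    for name, (size, speed) in points.items():
        dominated = any(
            other_size < size and other_speed > speed
            for other, (other_size, other_speed) in points.items()
            if other != name
        )
        if not dominated:
            frontier.append(name)
    return sorted(frontier, key=lambda n: points[n][0])  # smallest first


print(pareto_frontier(TEXT_32MIB))
```

Under that assumption the text-corpus frontier is tar.bz2, zip-aes, tar.gz and aerovault-fast; the other four rows are each beaten on both axes by at least one of those.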

Compress and extract throughput

Peak memory during compress, and time vs input size

What the numbers say

The KDF floor is real and deliberate. On the smallest input AeroVault spends about 778 ms to compress against zip-aes at 46 ms, because Argon2id at 128 MiB runs once per container. That is brute-force resistance, not a slow compressor. The clear lever is a cached-key batch mode so the KDF is paid once across many small containers rather than per file.

On ratio, AeroVault is competitive at the dense end. On text, aerovault-archive lands at 15.6% against 7z at 15.1%, a 0.5 point gap, and the very tightest is tar.bz2 at 12.8%. AeroVault-fast trades some ratio for far more speed: about 16x faster than the dense AeroVault-archive profile and roughly 33x faster than 7z, which is the right default for an interactive app.

One honest result the frontier surfaces: on the text corpus tar.bz2 dominates 7z, tar.xz and AeroVault-archive, because it is both smaller and faster than all three. This is corpus-specific, bzip2 is unusually strong on prose, but it is a fair signal that the densest-but-slowest profiles are not the rational pick on text.

On incompressible data every format sits near 100%, with the largest expansion being tar.bz2 at 100.5%. That delta over 100% is the honest cost of framing and encryption on already-random bytes.

Two concrete improvement levers fall out of this. The 7z integration currently ignores --level and runs LZMA2 at a fixed preset, so wiring the level through would let users trade speed for ratio on 7z like they already can on zip and the tar filters. And a denser AeroVault profile, LZMA2 or zstd at max, would close the last half point to 7z for users who want the smallest possible encrypted container.

Output was byte-stable across all repetitions, and the run is reproducible from the same seed and binary.

EhudKirsh Jun 13, 2026
Collaborator

> On incompressible data every format sits near 100%, with the largest expansion being tar.bz2 at 100.5%. That delta over 100% is the honest cost of framing and encryption on already-random bytes.

Can't there be a mechanism to reject increases in byte sizes due to compression and simply go with the original files instead?

EhudKirsh Jun 16, 2026
Collaborator

Let me ask the obvious question: why not switch the Archive and Balanced options from .zst to .tar.bz2 and .tar.gz/.zip respectively?
It seems like a better option. And to also show this chart wherever there is compression such as in AeroVault, AeroSync and AeroFile.
If .7z and tar.xz aren't the best, users should know this. Also, you can add .zst, especially the Fast option, to AeroFile.
With just a little bit of smart graph benchmarking like this, AeroFTP can outperform Kopia and Restic on compression.

axpnet Jun 16, 2026
Maintainer Author

Good question, and the benchmark is the right lens for it, so I am logging this as a tracked item rather than flipping the defaults on the spot.

The nuance the chart shows is that the best codec is corpus dependent. tar.bz2 dominated on prose specifically (bzip2 is unusually strong on text), but on mixed and incompressible data the gap closes and zstd keeps a large speed lead, which is exactly why the Fast default leans on zstd. So "Archive to bz2, Balanced to gz" is a real candidate, not a clear universal win, and that is the kind of call a per-surface benchmark should make rather than a snap switch.

Tracked as one item: re-evaluate the Archive and Balanced default codecs against the benchmark, surface the comparison chart wherever compression is offered (AeroVault, AeroSync, AeroFile) so the trade is visible instead of hidden, flag where 7z and tar.xz are not the rational pick, and add zstd (Fast especially) to AeroFile. The benchmark harness already exists, so this is evaluation and UI work, not new measurement.

Thanks for pushing on this.

axpnet · 2026-06-13T09:34:12Z

axpnet
Jun 13, 2026
Maintainer Author

After a docs and code review: naming clarity, AeroVault v4, and AeroCrypt Profiles

An honest note to open. We have just done a thorough pass over our documentation and our code together, and we found a number of things worth tidying, starting with terminology. Clear and consistent naming is something this community has asked for more than once, and it was overdue, so we are doing it in the open.

Three things come out of that review. First, we are settling on a clear name for our native client-side encryption overlay: AeroCrypt. The docs have referred to it as AeroFTP-Crypt, and the word overlay had drifted across surfaces, so we are fixing the vocabulary. Second, we are upgrading AeroCrypt from a CLI beta to a first-class component in the GUI. Third, we are adding the new AeroCrypt Profiles on top of it. The rest of this note is the concrete next step that gets us there.

To be precise about terms, the earlier request on this thread was for AeroVault v3 plus Error Correction, set out in the two comments above, the ready to go note and the benchmark. That addition, Reed-Solomon parity computed over the v3 ciphertext, is exactly what we mean by v4. The work is built, audited and benchmarked, so we are advancing with it.

v4 is the natural and awaited destination this thread has been building toward, not a detour. In plain terms it means a vault can now survive partial corruption: if bytes rot on disk, or a transfer damages part of the container, the Reed-Solomon parity rebuilds the damaged shards and the vault still opens, with verification happening before decryption so a tampered container fails closed. The overhead is a level you pick on the same QR style scale, and the Beta label comes off.

The same step consolidates the AeroVault cryptography into one audited engine, and that is what carries us to the rest of the destination. The native AeroCrypt overlay, which has matured through the CLI as a beta, comes to the GUI bound to a saved connection as an AeroCrypt Profile, so any commodity remote becomes a transparent zero knowledge store you browse in the normal dual panel, both sides encrypted and readable only inside AeroFTP. rclone-crypt stays beside it as the interop lane, the way Cryptomator sits beside AeroVault: we support the foreign format and we lead with ours. We will open a dedicated roadmap Discussion to carry that part forward.

To keep the two clearly apart, since they are easy to confuse, a quick reminder of the distinction:

	AeroVault	AeroCrypt overlay
Shape	one container file holds many files	per-file, one local file maps to one remote object
On disk	a single `.aerovault` you can move around	a folder of encrypted blobs with obfuscated names
File names	kept inside the encrypted manifest	obfuscated per object on the store
Cross-file dedup	yes	no, so the one-to-one map and per-file sync hold
Browse and transfer	open the whole container	object by object, in the normal dual panel
Best for	one portable sealed archive	turning any remote into a transparent encrypted store

Both are built on the same audited crypto core, one codec and one audit pass: only the shape after encryption differs.

To be clear, this is not a request for sign off, it is a status update: we are advancing to v4, and gathering remarks from the community along the way. The overlay chapter then begins on top of it. Two points where your input would help most: whether the encrypted scope should default to a subfolder so a profile is never accidentally half encrypted, and whether the platform cipher should be chosen automatically or offered as an explicit toggle. Thoughts from anyone here are welcome, on these or on the terminology cleanup itself.

Edit, after a fuller review pass. To keep the record straight: AeroCrypt for our own improved overlay, and Rclone Crypt for the interoperable one, were already settled with Ehud in the Wrappers/Overlays roadmap (#272), as part of the C set (AeroChunk, AeroCompress, AeroCrypt, AeroCorrect) and the convention that the Aero prefix signals something new or improved, not a Rust port. So the naming here is finishing that alignment on the surfaces that still lag, mainly the docs that still say AeroFTP-Crypt, not a fresh decision. We are consolidating the full thread and the documentation into one checkpoint so none of that history is lost. The naming, like the wrapper stack itself, is Ehud's.

0 replies

EhudKirsh · 2026-06-13T14:55:23Z

EhudKirsh
Jun 13, 2026
Collaborator

There should absolutely be a choice between using Rclone Crypt and AeroCrypt, any time any path is encrypted. Considering that there are risks such as nonce-misuse with Rclone Crypt and the experimental nature of AeroCrypt, I think that both should be opt-in.
So the user won't just tap Enter on the default setting, but actively click (GUI) or type (CLI) their preferred encryption algorithm.

I was going to write more about this at length in #272, but it's best to write about this now since Crypt for AeroSync seems to be right around the corner. Basically, I'm sitting on a figurative mountain of notes where I write ideas as soon as I come up with them
(all the time), and I'm trying to keep up with your fast development pace (which I'm NOT complaining about) and not miss opportunities to share these with you before you reach a stage and publish some.
I also prefer not to ask to delay you or ask how you should manage your time, as I did at the beginning of #272.
I also don't want to just dump all of my notes without properly organising and formatting them as I do in my comments.

I know that I wrote before that I like the idea of each wrapper being its own profile, but I'm rethinking this. With Rclone,
the number of remotes I have is about twice the number of cloud drives, since I set up a Crypt remote for almost every cloud drive.
So that's 2N number of remotes, where N is the number of cloud drives. With up to 4 overlays in this system, that's up to 5N.
To declutter, I suggest an alternative: each profile can have a set of clickable badges in the My Servers page.
Clicking on a wrapper's badge displays the view of items in the profile after unwrapping, so it's like mounting Rclone Crypt remotes,
which show files in a remote after peeling a layer of encryption, like TOR Onion with multiple layers of encryption.

So the order is:

AeroFile: Shows plaintext original files in their local paths.
AeroCompress: Shows the files in the server after extracting (decompressing).
AeroChunk: Shows the files in the server in their original number before chunking.
Crypt (Aero or Rclone): Shows the files in the server after decrypting.
AeroCorrect: Shows the files in the server without the .aerocorrect files or embedded sidecar.
The icon of the profile in the My Servers page OR right-click and click '▷ Connect': Shows the files exactly as they are in the server.

Basically, the first wrapper, whichever it is, shows the files in their original plaintext form when clicking on it or mounting it.
So, for example, without AeroCompress, it will be AeroChunk. I have more to discuss regarding the order of these two, by the way.
Each step should have a Before and After switch, with the '▷ Connect'/icon showing the overall Before and After of the whole set.

I'm guessing that previewing these remote files and their metadata will happen in RAM as opposed to temporary files, the same as whatever AeroFTP currently does when previewing files in cloud drives, and as Rclone does in normal and Crypt remotes.

To perform Rclone's equivalent of lsf or tree on a Crypt remote, aeroftp-cli will likely use a --wrapper or --overlay flag.
AeroMount will be able to mount each overlay if a user so chooses.

Each set will have its own path. When using Rclone, I personally like to direct Crypt remotes to "Crypt" subfolders. This leaves plaintext sibling items clearly separated from encrypted files and folders, including items that some cloud drives create and refuse to remove.
I like the idea of having one remote path for the entire set of all 4 overlays for simplicity. If a user wants multiple paths, either allow multiple sets for the same profile or duplicate the profiles and have a different set in each. By the way, I think now that it's best to make it impossible to change the order of the overlays, only choose which to use for each set. So if someone wants to try to compress Crypt, they can try that by using multiple sets, possibly by using multiple profiles. But for almost every (serious) use, the order shouldn't be flexible.
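
The "choose which overlays, never their order" rule could be modelled as below. This Python sketch only illustrates the proposal: the order follows the list earlier in this comment, while `CANONICAL_ORDER` and `build_wrapper_set` are hypothetical names and do not exist in AeroFTP.

```python
from typing import Iterable, List

# Fixed order from the list above: compress, then chunk, then crypt, then correct.
CANONICAL_ORDER = ("AeroCompress", "AeroChunk", "Crypt", "AeroCorrect")


def build_wrapper_set(enabled: Iterable[str]) -> List[str]:
    """Return the enabled overlays in canonical order, whatever order they were picked in."""
    chosen = set(enabled)
    unknown = chosen - set(CANONICAL_ORDER)
    if unknown:
        raise ValueError(f"unknown overlays: {sorted(unknown)}")
    return [name for name in CANONICAL_ORDER if name in chosen]
```

Picking Crypt and then AeroCompress still yields compress before crypt, so a user who really wants to compress ciphertext has to build a second set, as suggested above.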

As for the symbol of each overlay badge, I plan to write about it in #272.
If you want to start with Crypt, you should use the lock 🔒 symbol, like "E2E" profiles have.

I was thinking of suggesting even having the ability to have multiple Crypt wrappers in the same set for paranoid defence-in-depth double-encryption, but I think it's best to achieve this with only a single Crypt overlay. So when setting up any overlay, the user will choose its settings, including the number of layers of encryption in Crypt, each having a different algorithm: XSalsa, AES, ChaCha, etc.
Just in case a user distrusts using only one algorithm. Perhaps there should be a way to unwrap these layers individually to debug,
but otherwise, a user only needs to worry about keeping one password and seeing the overall Before and After, and that's it.
Just like with AeroVault V2.

1 reply

axpnet Jun 16, 2026
Maintainer Author

This is exactly the kind of note worth posting now rather than holding for #272, thank you.

> There should absolutely be a choice between using Rclone Crypt and AeroCrypt, any time any path is encrypted. Considering that there are risks such as nonce-misuse with Rclone Crypt and the experimental nature of AeroCrypt, I think that both should be opt-in.
> So the user won't just tap Enter on the default setting, but actively click (GUI) or type (CLI) their preferred encryption algorithm.

Agreed, and that is exactly how v4.0.5 behaves already: there is no default cipher, you actively choose AeroCrypt or Rclone Crypt by clicking in the GUI or typing in the CLI, so nobody encrypts by tapping Enter.

On the experimental worry, that is precisely why we just ran a severe pre-release audit on the native overlay and hardened it (the green note above). AeroCrypt is built on AES-256-GCM-SIV, which is nonce-misuse resistant by design, the very risk you point at in Rclone Crypt's construction, and the format now binds length and carries a key-bound config MAC. So the two genuinely sit on different footings, and both stay opt-in.

> I know that I wrote before that I like the idea of each wrapper being its own profile, but I'm rethinking this. With Rclone, the number of remotes I have is about twice the number of cloud drives ... So that's 2N number of remotes ... With up to 4 overlays in this system, that's up to 5N.
> To declutter, I suggest an alternative: each profile can have a set of clickable badges in the My Servers page. Clicking on a wrapper's badge displays the view of items in the profile after unwrapping ... like TOR Onion with multiple layers of encryption.

I think you are right, and the 2N to 5N explosion is the real argument against wrapper-as-profile. The badge set on a single profile reads as the cleaner direction, and it lines up with what AeroCrypt Profiles is meant to be: one saved connection, a set of clickable wrapper badges, each badge a before/after view after peeling that layer, the Connect action showing the overall before/after. That is the model I want to design around.

> So the order is: AeroFile ... AeroCompress ... AeroChunk ... Crypt (Aero or Rclone) ... AeroCorrect ...
> I think now that it's best to make it impossible to change the order of the overlays, only choose which to use for each set.
> Each set will have its own path ... I like the idea of having one remote path for the entire set ...
> I think it's best to achieve this with only a single Crypt overlay ... the user will choose its settings, including the number of layers of encryption in Crypt, each having a different algorithm ...
> previewing these remote files and their metadata will be in RAM as opposed to temporary files ...
> aeroftp-cli will likely use a --wrapper or --overlay flag. AeroMount will be able to mount each overlay ...
> If you want to start with Crypt, you should use the lock 🔒 symbol, like "E2E" profiles have.

I agree with this whole shape as the design direction: a fixed wrapper order (AeroFile, AeroCompress, AeroChunk, Crypt, AeroCorrect) where you choose which layers to enable rather than reorder them, one remote path per set with multiple sets when someone wants different paths, defense in depth living inside a single Crypt overlay as a setting rather than as multiple Crypt badges, and 🔒 for the Crypt badge to match the E2E profiles. One detail I want to keep open is the AeroCompress versus AeroChunk ordering you flagged, since I am not yet certain which should sit first.
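
To make the "choose which layers, never reorder them" rule concrete, here is a minimal Python sketch. It is an illustration only, not AeroFTP code: `FIXED_ORDER`, `build_stack` and `read_order` are hypothetical names, and the AeroCompress/AeroChunk order shown is the one quoted above, which is still an open question.

```python
from typing import Iterable, List, Tuple

# Write order as quoted in the thread (AeroFile is the plain profile underneath).
# Assumption: AeroCompress before AeroChunk; the thread leaves that pair open.
FIXED_ORDER: Tuple[str, ...] = ("AeroCompress", "AeroChunk", "Crypt", "AeroCorrect")


def build_stack(enabled: Iterable[str]) -> List[str]:
    """Return the enabled overlays in the fixed write order.

    The order in which the caller lists them is ignored on purpose:
    the user picks a subset, never a permutation.
    """
    chosen = set(enabled)
    unknown = chosen - set(FIXED_ORDER)
    if unknown:
        raise ValueError(f"unknown overlay(s): {sorted(unknown)}")
    return [name for name in FIXED_ORDER if name in chosen]


def read_order(stack: List[str]) -> List[str]:
    """Unwrapping peels the layers in reverse, outermost first."""
    return list(reversed(stack))
```

With this shape, `build_stack(["Crypt", "AeroCompress"])` and `build_stack(["AeroCompress", "Crypt"])` give the same stack, so "compress the ciphertext" can only be expressed by chaining two sets, as suggested in the quoted text.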

One point of fact rather than design: previews already resolve in RAM today, not via temp files, so that part of your guess matches how AeroFTP works now.

I do not want to turn this into commitments in the release thread, so I am not going to promise specific surfaces here. The honest next step is that this is the AeroCrypt Profiles chapter, and its detailed design (the badge set, the order question, per layer views and mounting, the symbols) belongs in the dedicated Wrappers/Overlays roadmap. I will carry your model there so we converge on it properly, and none of it gets lost. For v4.0.5 the scope stays what is built and audited: AeroVault v4, and AeroCrypt hardened and first-class, both opt-in.

On your earlier benchmark question, which I would rather answer with a fix than a promise:

> Can't there be a mechanism to reject increases in byte sizes due to compression and simply go with the original files instead?

🟢 Done, and it ships in the imminent v4.0.5. ZIP compression now decides per entry: if a deflate pass would not make a file smaller, that file is stored as-is instead of deflated, so packing incompressible data no longer inflates it from compression, while compressible files still deflate normally. To be precise about scope, this is the per-entry store mechanism that ZIP is built for. The single-stream codecs (tar.gz, tar.xz, tar.bz2) still carry their small inherent framing on already-random bytes, which is the unavoidable cost of choosing that container, not compression expanding the payload.
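
For readers who want to see the per-entry decision in miniature, here is a hedged Python sketch using only the standard library. It is not the AeroFTP implementation: `pick_method` and `pack` are hypothetical helpers, and the deflate level is an arbitrary choice for the example.

```python
import io
import zipfile
import zlib
from typing import Dict


def pick_method(data: bytes) -> int:
    """Pick ZIP_DEFLATED only when a deflate pass actually shrinks the entry."""
    # wbits=-15 produces a raw deflate stream, the form ZIP stores.
    comp = zlib.compressobj(level=6, method=zlib.DEFLATED, wbits=-15)
    deflated = comp.compress(data) + comp.flush()
    return zipfile.ZIP_DEFLATED if len(deflated) < len(data) else zipfile.ZIP_STORED


def pack(entries: Dict[str, bytes]) -> bytes:
    """Build a ZIP archive in memory, deciding store vs deflate per entry."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        for name, data in entries.items():
            zf.writestr(name, data, compress_type=pick_method(data))
    return buf.getvalue()
```

The sketch deflates each entry twice (once to decide, once to write), which keeps it short; a real packer would reuse the first pass.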

axpnet · 2026-06-16T08:19:13Z

axpnet
Jun 16, 2026
Maintainer Author

AeroCrypt, audited and hardened before it ships green

🟢 You flagged the experimental nature of AeroCrypt, and you were right to. So before AeroCrypt rides into v4.0.5 as a first-class component, we put the native overlay through a severe pre-release audit with one target: a strict external reviewer should find nothing. It now passes that bar, and here is the whole pass in the open.

This also answers your opt-in point directly: there is no default cipher. You actively pick AeroCrypt or Rclone Crypt, by clicking in the GUI or typing in the CLI, exactly as you asked. We lead with AeroCrypt and keep Rclone Crypt beside it as the interop lane.

Encryption that browses like any other server

There is a second thing worth calling out, which you pointed at earlier. Rclone Crypt used to open in its own modal, a bit like an AeroVault container. Now both overlays, AeroCrypt and Rclone Crypt, are integrated directly into the normal dual panel. You unlock once, and from then on the decrypted folder reads, navigates and transfers exactly like a connected server, with clear names on your side and ciphertext on the provider's.

The practical payoff is that this turns any cloud into a zero-knowledge target. You get the same experience you would have on a natively encrypted provider like Filen, MEGA or Internxt, except you can have it on any of the 28 backends, even the ones that offer no encryption of their own, with the outer layer keyed only by you. On a provider that already encrypts at rest, the overlay is a second independent layer on top of theirs, genuine double encryption where the key that matters never leaves your device.

What the audit looked at

Five independent reviewers swept the whole native surface (the shared crypto core, the file format, the Tauri/GUI provider commands, the CLI crypt subcommand, and the merge integrity), then every finding was verified against the code, and the two highest-impact ones were reproduced live on a real backend before and after the fix.

Findings, and what we did

| Severity | Finding | Status |
| --- | --- | --- |
| HIGH | Unauthenticated config let a hostile remote downgrade the KDF/cipher (`version` chosen from an unsigned file) | FIXED: key-bound config MAC + reject missing/unknown version |
| MEDIUM | No length binding: a truncated object decrypted to a shorter plaintext, silently | FIXED: per-file block-count binding (v3) |
| MEDIUM | Legacy v1 still writable, safe only by an implicit invariant | FIXED: v1 and v2 are read-only, new writes are v3 |
| HIGH (data loss) | Re-running `crypt init` on an existing overlay rotated the salt and orphaned every file | FIXED: refuse unless `--force` |
| LOW | Empty crypt password accepted by ls/put/get | FIXED: rejected |
| LOW | Tempfile failures could panic, header slice used `expect` | FIXED: clean exit codes, `map_err` |
| LOW/INFO | Config OOM, KDF on the async thread, non-atomic plaintext write, key zeroization | FIXED: size cap, `spawn_blocking`, atomic temp+rename, `Zeroizing` |

Two of these were not theoretical. We reproduced them live on a real backend on the shipping binary, then re-ran the same probes after the fix.

The format, in one view

AeroCrypt writes one encrypted object per file (AECR). The hardened v3 layout binds length into the ciphertext:

AECR v3 object
┌────────┬─────────┬───────────────┬───────────────┬───────────────┐
│ MAGIC  │ version │ total_blocks  │ wrapped DEK   │ block[0..T]   │
│ "AECR" │  0x03   │  u64 (LE)     │ AES-KW, 40 B  │ per 64 KiB    │
└────────┴─────────┴───────────────┴───────────────┴───────────────┘
header = 4 + 1 + 8 + 40 = 53 bytes
block[i] = nonce(12) ‖ AES-256-GCM-SIV( DEK, plaintext_i, AAD_i ) ‖ tag(16)
AAD_i    = "AeroCrypt overlay v3 block" ‖ i(u64 LE) ‖ total_blocks(u64 LE)

Each block authenticates both its own index and the total count, and the reader requires that exactly total_blocks blocks are present with no trailing bytes.
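
The layout above maps directly onto a fixed-size header plus a per-block AAD. The following Python sketch packs and parses just those two pieces, under the field sizes stated in the diagram. The function names are hypothetical, and the sketch accepts only version 0x03, whereas the real reader also opens v1 and v2 read-only.

```python
import struct
from typing import Tuple

MAGIC = b"AECR"
VERSION = 0x03
# "<" = little-endian, no padding: 4-byte magic, u8 version, u64 total_blocks, 40-byte wrapped DEK.
HEADER = struct.Struct("<4sBQ40s")
AAD_PREFIX = b"AeroCrypt overlay v3 block"


def pack_header(total_blocks: int, wrapped_dek: bytes) -> bytes:
    if len(wrapped_dek) != 40:
        raise ValueError("wrapped DEK must be 40 bytes (AES-KW of a 32-byte key)")
    return HEADER.pack(MAGIC, VERSION, total_blocks, wrapped_dek)


def parse_header(blob: bytes) -> Tuple[int, bytes]:
    if len(blob) < HEADER.size:
        raise ValueError("object shorter than the header")
    magic, version, total_blocks, wrapped_dek = HEADER.unpack_from(blob)
    if magic != MAGIC:
        raise ValueError("not an AECR object")
    if version != VERSION:
        raise ValueError("this sketch only handles v3")
    return total_blocks, wrapped_dek


def block_aad(index: int, total_blocks: int) -> bytes:
    """AAD_i = prefix || i (u64 LE) || total_blocks (u64 LE)."""
    return AAD_PREFIX + struct.pack("<QQ", index, total_blocks)
```

`HEADER.size` evaluates to 53, matching the 4 + 1 + 8 + 40 arithmetic, and any two blocks that differ in index or in total count get different AAD bytes.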

Why truncation now fails closed (the math)

Let a file of n plaintext bytes be split into T = ceil(n / 65536) blocks. Every block is sealed as

C_i = GCM-SIV_DEK( P_i , AAD_i ),  AAD_i = prefix ‖ i ‖ T

An attacker who drops the last k blocks leaves T - k valid blocks. Decryption of each surviving block still succeeds (its AAD is intact), but the reader checks

blocks_read == T   AND   bytes_consumed == len(data)

so T - k != T is rejected. Appending blocks or bytes fails the same two checks. Reordering fails because AAD_i pins the index. Cross-file splicing is impossible because the DEK is fresh-random per file.
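
The two reader checks can be exercised with a small Python model. Important caveat: the standard library has no AES-256-GCM-SIV, so `seal` and `open_block` below are a stand-in that only authenticates (HMAC-SHA256 truncated to 16 bytes) and leaves the plaintext unencrypted. Everything here is a hypothetical sketch of the framing logic, not the AeroCrypt codec.

```python
import hashlib
import hmac
import os
import struct
from typing import Tuple

BLOCK = 65536        # plaintext bytes per block, from the post
NONCE, TAG = 12, 16  # per-block framing sizes, from the post
PREFIX = b"AeroCrypt overlay v3 block"


def aad(i: int, total: int) -> bytes:
    return PREFIX + struct.pack("<QQ", i, total)


def seal(dek: bytes, pt: bytes, ad: bytes) -> bytes:
    # STAND-IN for AES-256-GCM-SIV: authenticates nonce, AAD and data, does NOT encrypt.
    nonce = os.urandom(NONCE)
    tag = hmac.new(dek, nonce + ad + pt, hashlib.sha256).digest()[:TAG]
    return nonce + pt + tag


def open_block(dek: bytes, blob: bytes, ad: bytes) -> bytes:
    if len(blob) < NONCE + TAG:
        raise ValueError("short block")
    nonce, pt, tag = blob[:NONCE], blob[NONCE:-TAG], blob[-TAG:]
    want = hmac.new(dek, nonce + ad + pt, hashlib.sha256).digest()[:TAG]
    if not hmac.compare_digest(tag, want):
        raise ValueError("block authentication failed")
    return pt


def write_body(dek: bytes, data: bytes) -> Tuple[int, bytes]:
    total = -(-len(data) // BLOCK)  # ceil(n / 65536)
    blocks = (seal(dek, data[i * BLOCK:(i + 1) * BLOCK], aad(i, total)) for i in range(total))
    return total, b"".join(blocks)


def read_body(dek: bytes, total: int, body: bytes) -> bytes:
    out, pos, step = [], 0, NONCE + BLOCK + TAG
    for i in range(total):
        chunk = body[pos:pos + step]
        if not chunk:
            raise ValueError("blocks_read != total_blocks")  # truncation
        out.append(open_block(dek, chunk, aad(i, total)))
        pos += len(chunk)
    if pos != len(body):
        raise ValueError("bytes_consumed != len(data)")      # trailing bytes
    return b"".join(out)
```

Dropping the last block trips the block-count check, swapping two blocks fails authentication because the index in the AAD no longer matches, and a body sealed under another file's DEK fails on the first block.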

Why a tampered config (or wrong password) now fails closed

The v3 config carries a key-bound MAC:

master_key = Argon2id( password , salt , 128 MiB, t=4, p=4 )
config_mac = HKDF-SHA256( master_key , "config MAC" ‖ version ‖ kdf_params ‖ salt )

It is recomputed and compared in constant time on every unlock. Since master_key already depends on salt, an attacker who rewrites version or salt cannot forge a matching config_mac without the password, so the downgrade is rejected. The same check turns a wrong password into a clear message instead of an empty listing.
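
A compact Python model of that unlock check, with the assumptions spelled out: Argon2id is not in the standard library, so `derive_master_key` uses PBKDF2 purely as a stand-in; HKDF is run with an empty salt because the post does not name one; and the byte encoding of `version` and `kdf_params` inside the HKDF info string is invented for the example. None of these names come from the AeroCrypt source.

```python
import hashlib
import hmac
from typing import Dict


def hkdf_sha256(ikm: bytes, info: bytes, length: int = 32) -> bytes:
    """HKDF (RFC 5869) with SHA-256 and the default all-zero salt."""
    prk = hmac.new(b"\x00" * 32, ikm, hashlib.sha256).digest()  # extract
    okm, block, counter = b"", b"", 1
    while len(okm) < length:                                     # expand
        block = hmac.new(prk, block + info + bytes([counter]), hashlib.sha256).digest()
        okm += block
        counter += 1
    return okm[:length]


def derive_master_key(password: str, salt: bytes) -> bytes:
    # STAND-IN for Argon2id(password, salt, 128 MiB, t=4, p=4).
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, 50_000)


def config_mac(master_key: bytes, version: int, kdf_params: bytes, salt: bytes) -> bytes:
    # "config MAC" || version || kdf_params || salt (field encoding is hypothetical).
    info = b"config MAC" + bytes([version]) + kdf_params + salt
    return hkdf_sha256(master_key, info)


def unlock(password: str, config: Dict) -> bytes:
    if not password:
        raise ValueError("empty password")
    if config.get("version") != 3:  # this sketch knows v3 only
        raise ValueError("missing or unknown version")
    key = derive_master_key(password, config["salt"])
    want = config_mac(key, config["version"], config["kdf_params"], config["salt"])
    if not hmac.compare_digest(want, config["mac"]):  # constant-time compare
        raise ValueError("config MAC mismatch: wrong password or tampered config")
    return key
```

Because the MAC key is derived from the password, an attacker who edits `salt` or `kdf_params` on the remote would also need the password to produce a matching `mac`, and a wrong password surfaces as the same explicit error rather than an empty listing.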

Where it sits next to rclone crypt

For completeness, here is the rclone crypt format we read and write on the interop lane, and the two formats side by side. Rclone crypt is mature and widely used; its content cipher is a stream AEAD, so it is not nonce-misuse resistant by construction, which is the property you flagged and the reason AeroCrypt leans on AES-256-GCM-SIV.

Live evidence (real backend, real CLI, no mocks)

| Probe | Before the fix | After the fix |
| --- | --- | --- |
| Round trip, 0 B / sub-block / exact 64 KiB / multi-block / unicode name / nested | byte-identical | byte-identical |
| Truncate a multi-block object, re-upload, then read | exit 0, silent truncated plaintext | exit 4, rejected (decrypt fails closed) |
| Re-init an existing overlay (same password) | original file orphaned, undecryptable | refused (exit 9), `--force` to override |
| Empty crypt password on `put` | accepted, wrote unrecoverable object | refused (exit 5) |
| Wrong password | silent empty listing, exit 0 | clean MAC error, exit 6 |

Gate

cargo fmt clean, clippy --all-targets -D warnings zero, the AeroCrypt codec suite 21/21, the full Rust test suite green (2297 lib + 384 integration and friends), and the frontend untouched (typecheck zero, 263 unit tests, 47 locale i18n at 100 percent). The CLI and the GUI share one codec, so the format guarantees are identical on both, and the data-loss and empty-password guards are mirrored on both surfaces.

What this means

AeroCrypt is no longer a CLI beta with rough edges. It is an audited, length-bound, tamper-evident overlay built on AES-256-GCM-SIV (nonce-misuse resistant by design, which is the property Rclone Crypt's construction does not give you), and it is opt-in with no silent default. That is the footing we wanted before it becomes the lead encryption option in the GUI, with Rclone Crypt staying beside it for interop.

The full write-up lives in our internal audit appendix. Happy to share any specific part if useful.

5 replies

axpnet Jun 16, 2026
Maintainer Author

A sneak peek at AeroCrypt Overlay in action

axpnet Jun 16, 2026
Maintainer Author

NOTE: The AeroCrypt and AeroVault audit was performed by Claude Fable with MAX effort within 72 hours of public release. It consumed an entire 5-hour session in about 15 minutes, orchestrating Opus agents. Interestingly, it didn't find many vulnerabilities, although some appeared serious. Given the excessive token consumption, I stopped using it, and within a few hours it was suspended by the US government, as you know.

EhudKirsh Jun 16, 2026
Collaborator

I've heard about the drama between Anthropic and the US government. I thought it was only threats.
I didn't realise that it would affect your ability to use Claude. If this is serious, you might want to open an Announcement 📢 discussion to go into more detail on what's going on and what's your plan forward with your AI assistants and AeroAgent.

At any rate, I was just in the middle of writing feedback for the screenshots you shared:

We need to think about how to make the Quick Connect page not too cluttered with all the new overlays. I suggest considering making a 'Wrappers/Overlays' collapsible section, and ticking it reveals each overlay as a tickable collapsible sub-section.
So there's no need to see the password and salt fields, algorithm selection, etc, unless Crypt is ticked.
I think the title of that encryption/crypt section should be something like 'Crypt (Encryption)'.
- No need to mention the words 'Profile' or 'overlay', because those are obvious and would also be true for the other 3 overlays,
  so there's no point repeating them.
- 'Aero' shouldn't be in the title if it's only one of the options.
- I think Rclone Crypt should be on the left since it's older.
I don't think there should be too much text there, especially "Readable only from AeroFTP" when Rclone Crypt is selected.
Surely Rclone should be able to decrypt what AeroFTP encrypts with Rclone Crypt.
Shouldn't the salt field also have an '👁'?
I think it's best if the 'Password' and 'Salt (optional)' fields are labelled from above, like the other fields, not just with placeholders.

axpnet Jun 16, 2026
Maintainer Author

Thanks Ehud, and no need to worry on the Claude side, my access is fine and AeroAgent is unaffected, it was a lighter situation than it sounded. I will keep an Announcement in mind if anything material changes, but there is nothing to report right now.

All five UI points from your screenshot feedback are done on the way to v4.0.5. 🟢

> We need to think about how to make the Quick Connect page not too cluttered with all the new overlays. I suggest considering making a 'Wrappers/Overlays' collapsible section.

Done. There is now a collapsible "Wrappers / Overlays" section, collapsed by default, so nothing shows until you open it. Crypt is the first sub-section inside it and future overlays slot in alongside, so the password, salt and algorithm fields only appear once you tick Crypt and pick a type. The section auto-opens when you edit a profile that already carries an overlay.

> I think the title of that encryption/crypt section should be something like 'Crypt (Encryption)'. No need to mention 'Profile' or 'overlay'. 'Aero' shouldn't be in the title if it's only one of the options. I think Rclone Crypt should be on the left since it's older.

Done. The section is titled "Crypt (Encryption)", with no "Profile", "overlay" or "Aero" in it. Rclone Crypt is now the left button and AeroCrypt (native) is on the right.

> I don't think there should be too much text there, especially "Readable only from AeroFTP" when Rclone Crypt is selected. Surely Rclone should be able to decrypt what AeroFTP encrypts with Rclone Crypt.

Agreed, that line was wrong for the interop case and is fixed. The general hint is shorter now, and the readability note is per type: Rclone Crypt reads "Standard rclone crypt format: also readable by rclone with this password", while the native format keeps "readable only from AeroFTP".

> Shouldn't the salt field also have an '👁'?

Done, the Salt field now has the same show/hide eye toggle as the password.

> I think it's best if the 'Password' and 'Salt (optional)' fields are labelled from above, like the other fields, not just with placeholders.

Done, both fields now have a label above them, with the placeholder kept only as a secondary hint.

These ship in v4.0.5. Thanks again for the precise pass.

axpnet Jun 16, 2026
Maintainer Author
