Conversation
Introduce content AAD binding (content encryption version 4) to tie ciphertext and chunks to their storage_key, adding version-aware encryption/decryption and a new streaming-v2 format. Add AAD support across chunked encoders/decoders and streaming encoders/decoders, and mark chunk metadata as v2 when AAD is used. Implement parallel chunk downloads with a concurrency limit to improve throughput. Replace the RwLock<HashMap> forest cache with a DashMap-based per-bucket cache that stores timestamps and a dirty flag, add TTL handling and invalidate APIs, and make FulaClient cloneable. Add the dashmap dependency. HPKE: improve getrandom error messages, expose a check_entropy() helper, and rename encrypt_for_multiple to wrap_dek_for_multiple (deprecated alias kept). Add constants for the new content crypto version and other helpers, plus several parallel key-rotation utilities and tests. Backward compatibility: legacy version 2 decryption remains supported.
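As a rough illustration of the AAD binding described above, the sketch below (using the aes-gcm crate and hypothetical helper names, not the actual fula-crypto API) passes the storage_key as associated data, so a ciphertext moved or replayed under a different key fails authentication on decrypt.

```rust
use aes_gcm::{
    aead::{Aead, KeyInit, Payload},
    Aes256Gcm, Key, Nonce,
};

/// Sketch: seal `plaintext` with the object's storage_key as associated data,
/// so a ciphertext replayed under a different key fails authentication.
fn seal_bound(key_bytes: &[u8; 32], nonce: &[u8; 12], storage_key: &str, plaintext: &[u8]) -> Vec<u8> {
    let cipher = Aes256Gcm::new(Key::<Aes256Gcm>::from_slice(key_bytes));
    cipher
        .encrypt(
            Nonce::from_slice(nonce),
            Payload { msg: plaintext, aad: storage_key.as_bytes() },
        )
        .expect("encrypt")
}

/// Returns None when either the ciphertext or the storage_key (the AAD) was altered.
fn open_bound(key_bytes: &[u8; 32], nonce: &[u8; 12], storage_key: &str, ciphertext: &[u8]) -> Option<Vec<u8>> {
    let cipher = Aes256Gcm::new(Key::<Aes256Gcm>::from_slice(key_bytes));
    cipher
        .decrypt(
            Nonce::from_slice(nonce),
            Payload { msg: ciphertext, aad: storage_key.as_bytes() },
        )
        .ok()
}
```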
Introduce sharded private-forest format (v3) and client-side handling. Adds ShardedPrivateForest, ShardManifest, ForestShard, EncryptedShardManifest, EncryptedForestShard, derive_shard_key, shard_for_path, compute_initial_shard_count, detect_forest_format and related constants (SHARDED_MIGRATION_THRESHOLD, RESHARD_THRESHOLD, MAX_SHARDS). Update fula-crypto exports for sharded types. Update encrypted client to support both monolithic and sharded forests: replace forest cache struct with ForestCacheEntry enum, add is_forest_sharded, load_shard, load_all_shards, ensure_forest_loaded, save_sharded_forest, migrate_to_sharded, migrate_to_monolithic, forest_file_count and shard-aware logic for put/get/list/delete/flush operations. Implement auto-migration/resharding flows, lazy shard loading, and per-shard dirty tracking. Switch forest detection from x-fula-forest header to deterministic index format detection (detect_forest_format) and remove usage of the x-fula-forest metadata header. Update rotation scanning to skip deterministic shard keys. Several API paths updated to handle both formats and ensure backward compatibility.
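The deterministic shard selection behind shard_for_path could look roughly like the sketch below; the actual derivation in fula-crypto (hash choice, key mixing, path normalization) may differ, so treat the names and logic as illustrative only.

```rust
/// Sketch: map a normalized forest path to a shard index deterministically,
/// so every client picks the same shard without coordination.
fn shard_for_path(path: &str, shard_count: u32) -> u32 {
    let digest = blake3::hash(path.as_bytes());
    // Take the first 4 bytes of the hash as a little-endian u32 and reduce mod shard_count.
    let n = u32::from_le_bytes(digest.as_bytes()[..4].try_into().unwrap());
    n % shard_count.max(1)
}
```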
Introduce streaming & concurrency improvements and security fixes from audit2: add a streaming download API (get_object_decrypted_to_writer / get_object_chunked_to_writer) to stream decrypted data to a writer with bounded memory; add bounded parallel chunk download/upload with semaphores and a MAX_CONCURRENT_CHUNK_UPLOADS constant; centralize parallel download logic into download_chunks_parallel. Add AAD support for chunked encoding (with_aad_and_chunk_size / with_aad), which produces the streaming-v2 format and binds chunk AAD to the storage key (upgrading metadata to version 4). Prevent leaking content_type in unencrypted chunk metadata and provide cleanup on partial upload failure (best-effort deletion of uploaded chunks). Change delete ordering to update/save the forest before deleting storage, and add best-effort chunk-object deletion when removing chunked files. Ensure forest path handling is normalized (normalize_dir_path) and dirty cache entries are retained on invalidation. Update share validation to use the original (unobfuscated) key and propagate this change through the Flutter/wasm/js bindings. Add extensive unit tests (tests/audit2_tests.rs) and chunked encoder/decoder tests to cover v3/v4 behavior, ordering, and AAD properties.
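A minimal sketch of bounded parallel chunk downloads with a tokio semaphore, in the spirit of download_chunks_parallel; the constant, signature, and error type are placeholders rather than the real fula-client code.

```rust
use std::sync::Arc;
use tokio::sync::Semaphore;

// Illustrative bound; the real client uses its own constant/config value.
const MAX_CONCURRENT_CHUNK_DOWNLOADS: usize = 8;

/// Sketch: fetch every chunk concurrently, but never more than the bound at a
/// time, and return the results in chunk order.
async fn download_chunks_parallel<F, Fut>(
    chunk_keys: Vec<String>,
    fetch: F,
) -> Result<Vec<Vec<u8>>, String>
where
    F: Fn(String) -> Fut + Clone + Send + 'static,
    Fut: std::future::Future<Output = Result<Vec<u8>, String>> + Send + 'static,
{
    let semaphore = Arc::new(Semaphore::new(MAX_CONCURRENT_CHUNK_DOWNLOADS));
    let mut handles = Vec::with_capacity(chunk_keys.len());
    for key in chunk_keys {
        let semaphore = Arc::clone(&semaphore);
        let fetch = fetch.clone();
        handles.push(tokio::spawn(async move {
            // Holding the permit is what bounds the number of in-flight fetches.
            let _permit = semaphore.acquire_owned().await.map_err(|e| e.to_string())?;
            fetch(key).await
        }));
    }
    // Awaiting the handles in spawn order preserves chunk order in the output.
    let mut chunks = Vec::with_capacity(handles.len());
    for handle in handles {
        chunks.push(handle.await.map_err(|e| e.to_string())??);
    }
    Ok(chunks)
}
```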
Address multiple security/audit findings and robustness issues for encrypted forests and chunked content. Key changes:
- Track per-shard ETags in the forest cache and persist/update them after conditional shard PUTs so writes can use If-Match to detect races (concurrent-writer protection); roll back/clean up on failures during multi-phase uploads/migrations (see the sketch after this list).
- Strengthen chunked download verification: switch to a verified streaming decoder, check chunk counts and total size, then finalize/verify the Bao root hash to detect truncation/tampering.
- Add opt-in cross-session replay protection via per-bucket sequence floors (get_forest_sequence / set_forest_sequence_floor / check_sequence_floor) and enforce them during forest loads.
- Defense-in-depth to block plaintext responses for paths known to have been uploaded encrypted: add an "encrypted" flag to ForestFileEntry, mark entries on encrypted uploads, and refuse plaintext reads when the forest says an object must be encrypted.
- Harden HPKE decryption error handling to collapse attacker-controllable failures to a generic authentication error while preserving local key parse errors.
- Improve listings/memory by avoiding unnecessary intermediate cloning when building DirectoryListing and list_all_files.
- Tests updated to include the new `encrypted` field defaulting to false where needed.
These changes are primarily to satisfy audit fixes (C-AUDIT-001/002/003/004/008) and to make upload/download flows more atomic and tamper-resistant.
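The first item's conditional shard writes can be illustrated with a small reqwest-based sketch, assuming an S3-compatible endpoint that honors If-Match and returns 412 Precondition Failed on a lost race; the URL and ETag plumbing are placeholders.

```rust
use reqwest::{Client, StatusCode};

/// Sketch: write a shard blob with If-Match so a concurrent writer that already
/// replaced the object turns this PUT into a 412 instead of a silent overwrite.
async fn put_shard_if_match(
    http: &Client,
    url: &str,
    body: Vec<u8>,
    expected_etag: &str,
) -> Result<Option<String>, reqwest::Error> {
    let resp = http
        .put(url)
        .header("If-Match", expected_etag)
        .body(body)
        .send()
        .await?;
    if resp.status() == StatusCode::PRECONDITION_FAILED {
        // Lost the race: the caller should reload the winning forest and retry.
        return Ok(None);
    }
    let resp = resp.error_for_status()?;
    // Persist the new ETag so the next conditional write races correctly.
    Ok(resp
        .headers()
        .get("ETag")
        .and_then(|v| v.to_str().ok())
        .map(|s| s.to_string()))
}
```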
NEW-7.2 — WAL + optimistic-merge retry for the disjoint-shard writer
race. Two concurrent writers touching disjoint shards could leave S3
with a manifest whose shard_sequences disagreed with the shard blobs
actually persisted, tripping the per-shard AEAD seq check on every
later read and locking readers out of those shards. Adds a per-bucket
WAL (crates/fula-client/src/wal.rs) MAC'd with a domain-separated
derived key, records each dirty Insert/Remove, and — critically —
appends a WalEntry::ShardWrote{ idx, seq, etag } the moment a phase-1
shard PUT succeeds so replay can reconcile the winning manifest's
shard_sequences/shard_etags to what S3 actually holds. save_forest /
save_sharded_forest now loop up to MAX_FLUSH_RETRIES on 412, evicting
the cache, re-loading the winner, replaying the WAL (ShardWrote first,
then Insert/Remove), and re-flushing. Startup recovery uses the same
replay path, gated once-per-session by wal_recovered_buckets.
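A simplified sketch of the WAL records and the replay order described above (ShardWrote reconciliation first, then the dirty edits); the real wal.rs adds MAC framing and on-disk encoding that are omitted here, and the struct shapes are assumptions.

```rust
use std::collections::HashMap;

/// Sketch of the WAL record shapes described in the commit text; the encoding
/// and the domain-separated MAC are omitted.
enum WalEntry {
    Insert { path: String, entry_bytes: Vec<u8> },
    Remove { path: String },
    ShardWrote { idx: u32, seq: u64, etag: String },
}

struct ManifestState {
    shard_sequences: HashMap<u32, u64>,
    shard_etags: HashMap<u32, String>,
    pending_inserts: Vec<(String, Vec<u8>)>,
    pending_removes: Vec<String>,
}

/// Replay after a 412: first reconcile the winner's manifest to the shard blobs
/// storage actually holds (ShardWrote), then re-apply the dirty edits.
fn replay_wal(entries: &[WalEntry], manifest: &mut ManifestState) {
    for e in entries {
        if let WalEntry::ShardWrote { idx, seq, etag } = e {
            manifest.shard_sequences.insert(*idx, *seq);
            manifest.shard_etags.insert(*idx, etag.clone());
        }
    }
    for e in entries {
        match e {
            WalEntry::Insert { path, entry_bytes } => {
                manifest.pending_inserts.push((path.clone(), entry_bytes.clone()));
            }
            WalEntry::Remove { path } => manifest.pending_removes.push(path.clone()),
            WalEntry::ShardWrote { .. } => {}
        }
    }
}
```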
NEW-2.1 — forest_entry_requires_encryption now calls
ensure_forest_loaded first and is async, so a cold cache no longer
creates a window where a storage backend serving plaintext for an
encrypted path is silently accepted. Updated all 5 read sites.
NEW-7.1 — rotate_bucket_with_journal now BLAKE3-MAC's each journal
line with a key derived via
KeyManager::derive_path_key("rotation-journal-mac:<bucket>"); forged
lines fail verification and are re-rotated on the next run.
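A minimal sketch of the journal-line MAC, assuming blake3's keyed mode and a hex-encoded tag per line; the key derivation via KeyManager::derive_path_key is stubbed out as a raw 32-byte input.

```rust
/// Sketch: tag a rotation-journal line with blake3's keyed mode.
fn mac_journal_line(mac_key: &[u8; 32], line: &str) -> String {
    hex::encode(blake3::keyed_hash(mac_key, line.as_bytes()).as_bytes())
}

/// Forged or truncated lines fail this check and are treated as not-yet-rotated.
fn verify_journal_line(mac_key: &[u8; 32], line: &str, tag_hex: &str) -> bool {
    let expected = blake3::keyed_hash(mac_key, line.as_bytes());
    hex::decode(tag_hex)
        .map(|tag| constant_time_eq(tag.as_slice(), expected.as_bytes()))
        .unwrap_or(false)
}

/// Constant-time comparison to avoid leaking how many MAC bytes matched.
fn constant_time_eq(a: &[u8], b: &[u8]) -> bool {
    a.len() == b.len() && a.iter().zip(b).fold(0u8, |acc, (x, y)| acc | (x ^ y)) == 0
}
```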
NEW-L.4 — the journal file is guarded by an exclusive fs2 lock;
concurrent runs return ClientError::RotationInProgress instead of
garbling the journal.
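The exclusive journal lock could be taken roughly as in the sketch below, assuming the fs2 crate's advisory file locks; mapping the WouldBlock error to ClientError::RotationInProgress is left to the caller.

```rust
use fs2::FileExt;
use std::fs::OpenOptions;
use std::path::Path;

/// Sketch: take an exclusive advisory lock on the journal before rotating.
/// A second concurrent run gets a WouldBlock error instead of appending
/// to (and garbling) the same journal.
fn lock_rotation_journal(path: &Path) -> std::io::Result<std::fs::File> {
    let file = OpenOptions::new().create(true).append(true).open(path)?;
    file.try_lock_exclusive()?; // errors if another run already holds the lock
    Ok(file)
}
```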
NEW-L.7 — the v5→v6 downgrade guard is persisted per-bucket under the
state dir (atomic temp+rename) and re-loaded on cold start, so a
forged v5 manifest served across a client restart is still refused.
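A sketch of the persisted downgrade guard's atomic write, assuming a simple one-file-per-bucket layout under the state dir; the real file format and paths may differ.

```rust
use std::fs;
use std::io::Write;
use std::path::Path;

/// Sketch: persist the per-bucket minimum accepted manifest version with a
/// write-to-temp-then-rename so a crash never leaves a torn or empty guard file.
fn persist_min_version(state_dir: &Path, bucket: &str, min_version: u32) -> std::io::Result<()> {
    let final_path = state_dir.join(format!("{bucket}.min_version"));
    let tmp_path = state_dir.join(format!("{bucket}.min_version.tmp"));
    let mut tmp = fs::File::create(&tmp_path)?;
    tmp.write_all(min_version.to_string().as_bytes())?;
    tmp.sync_all()?; // flush contents before the rename makes them visible
    fs::rename(&tmp_path, &final_path) // atomic on the same filesystem
}
```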
Also fixes a related latent bug in load_shard: the "already loaded"
short-circuit was returning cached bytes without re-checking
expected_seq against manifest.shard_sequences, which would have hidden
post-412 inconsistencies on a retry. The check is now unconditional.
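The fixed short-circuit amounts to re-validating the cached sequence before reuse, roughly as in this sketch with illustrative types:

```rust
/// Sketch of the fixed load_shard short-circuit: even when the shard is already
/// cached, the cached sequence is re-checked against the manifest before reuse,
/// so a stale entry left over from a lost 412 race is reloaded instead of served.
fn cached_shard_if_current(
    cached: Option<(u64, Vec<u8>)>, // (seq the shard was cached at, shard bytes)
    expected_seq: u64,              // manifest.shard_sequences[idx]
) -> Option<Vec<u8>> {
    match cached {
        Some((seq, bytes)) if seq == expected_seq => Some(bytes),
        _ => None, // mismatch or no cache entry: fall through to a fresh load
    }
}
```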
Out of scope / known edges:
- Narrow crash window between a successful phase-1 shard PUT and
the wal::append(ShardWrote) that records it. Recovery from that
window is outside NEW-7.2 and is tracked separately.
- NEW-2.2/2.3/3.1 and NEW-1.1/1.2/L.1/L.2/L.3/L.5/L.6 — ignored or
by design per the audit plan.
Tests: tests/audit3_wal_rotation_tests.rs (4 tests) covers forged-MAC
rejection, exclusive-lock contention, monolithic 412→WAL replay, and
sharded 412→WAL replay (the last was the one that exposed the
ShardWrote gap before the reconcile-on-replay logic was added).
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Known limitation that this change does not solve:
Dimension: Scalability (files)
Fula mechanism (evidence): auto-shard at 5 k files (private_forest.rs:740); MAX_SHARDS=256 ceiling gives a practical limit of ~2.56 M files/bucket (NEW-L.6)
WNFS mechanism (evidence): HAMT branching=16, bucket=3 (wnfs-hamt/constants.rs); no hard ceiling; O(log₁₆ N)
Score: Fula 3.5, WNFS 4.5
Improvable in Fula?: Yes: lift MAX_SHARDS plus add HAMT-inside-shard, or per-shard sub-shards. Forward-only migration; planned.
Add a buffered chunked-download path and per-chunk timeouts, plus a set of v7/indexing and migration improvements. Key changes:
- Config: add per_chunk_download_timeout and buffered_download_max_bytes with sensible defaults.
- Downloads: implement buffered download APIs (root-verified-before-emission) and internal helpers; enforce the buffered size ceiling and a per-chunk fetch timeout to avoid stalls (see the sketch after this list).
- Concurrency: replace the ShardedHamt Mutex with RwLock to allow concurrent readers; adjust code to use read()/write() accordingly.
- Migration/test hooks: introduce a test-fault-injection feature and atomic flags to simulate crashes at two migration points; add an env override for the migration heartbeat interval for faster test cycles.
- Manifest pinning: persist/read the manifest version in a MAC'd format (version\thex_mac) to detect tampering while accepting the legacy plain format for upgrade.
- Listing: add paginated directory listing (list_directory_paginated) and a paginated forest walk for FlatNamespace to bound memory for large prefixes; preserve legacy single-page behavior when unbounded.
- Cleanup/orphans: add orphan_drain_in_flight tracking, a concurrent chunk-delete implementation on native (returns failure count and logs), and enqueue-on-failure behavior for robust retry.
- Misc: expose an internal http_client accessor; verify the BAO root when resuming chunked uploads; add many test files and support modules.
These changes improve robustness (timeouts, verified buffered reads), testability (fault injection, heartbeat override), and scalability (paginated listings, concurrent deletes, reader concurrency).
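The per-chunk timeout mentioned in the Downloads bullet can be sketched with tokio::time::timeout; the signature and error type are placeholders for the real config plumbing.

```rust
use std::time::Duration;
use tokio::time::timeout;

/// Sketch: wrap a single chunk fetch in per_chunk_download_timeout so one
/// stalled connection fails fast instead of hanging the whole buffered download.
async fn fetch_chunk_with_timeout<Fut>(
    per_chunk_download_timeout: Duration,
    fetch: Fut,
) -> Result<Vec<u8>, String>
where
    Fut: std::future::Future<Output = Result<Vec<u8>, String>>,
{
    match timeout(per_chunk_download_timeout, fetch).await {
        Ok(result) => result,
        Err(_) => Err("chunk download timed out".to_string()),
    }
}
```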
Introduce end-to-end integrity and downgrade protections plus reliability helpers. Compute and store a BLAKE3 content_hash on upload and verify it on download (streaming and single-block) via enforce_content_hash; add enforce_min_version to prevent downgrade-to-no-AAD. Add forest_entry_lookup to load per-key forest entries for owner-read verification. Implement jittered exponential flush backoff with a process-wide counter and use it in retry paths (see the sketch below). Add a MultipartAbortGuard RAII type to best-effort abort orphaned multipart uploads (with wasm spawn support) and wire it into upload_large_file. Add WAL transactional group append/load support with a WAL_TRUNCATED_GROUPS counter and an append_group helper; expose diagnostics (flush_backoff_count, wal_truncated_groups_count) and include tests for the new behaviors. Also add the wasm-bindgen-futures dependency for wasm abort spawning.
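A sketch of the jittered exponential flush backoff, with illustrative base, cap, and jitter values rather than the shipped defaults.

```rust
use rand::Rng;
use std::time::Duration;

/// Sketch: the delay doubles with each attempt up to a cap, and a random jitter
/// spreads out competing writers so they do not retry in lockstep.
fn flush_backoff(attempt: u32) -> Duration {
    let base_ms: u64 = 100;
    let cap_ms: u64 = 5_000;
    let exp_ms = base_ms.saturating_mul(1u64 << attempt.min(16)).min(cap_ms);
    let jitter_ms = rand::thread_rng().gen_range(0..=exp_ms / 2);
    Duration::from_millis(exp_ms + jitter_ms)
}
```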
Introduce an LRU CachedBlockStore and wire it into the FlexibleBlockStore so block-level LRU caching can be enabled/disabled and sized in MB. Key changes:
- New CachedBlockStore with hit/miss counters, a with_mb helper, and inner_ref access; tests added/updated (see the sketch after this list).
- FlexibleBlockStore gains a Cached variant and with_cache_mb(), and is_persistent recurses through caches.
- IpfsPinningBlockStore: remove the internal DashMap cache and delegate reads/writes to the CachedBlockStore layer; simplify put/get/has/delete/block_size semantics.
- Add a Box<T: BlockStore> impl for BlockStore to allow boxed stores to be passed around easily.
- CLI/config: add a block_cache_mb option and wire it into AppState to optionally wrap the chosen store in the LRU cache.
- WAL: add PageWrote and DirIndexWrote entries and clear the WAL after migration; implement robust page/dir-index flush phases in EncryptedClient (load/flush/protect against races and concurrent modifications), plus manifest page loading and directory-index loading/rebuild logic.
- Crypto: export manifest/page/dir-index types and helpers (ManifestRoot/Page, EncryptedManifestPage, DirectoryIndex, derive keys, PAGE_SIZE/MAX_PAGES, etc.) and raise MAX_SHARDS; adjust private-forest handling and migrations to use meta-HAMT pages and reconcile flush state.
These changes separate caching concerns from the IPFS/pinning store, add persistent LRU caching support, and implement the meta-HAMT + directory-index features and WAL records needed for robust sharded manifest handling and migrations.
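A rough sketch of the LRU block-cache shape referenced in the first bullet, assuming the lru crate; names, counters, and sizing are illustrative and not the actual CachedBlockStore API.

```rust
use lru::LruCache;
use std::num::NonZeroUsize;
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::Mutex;

/// Sketch of an LRU block cache keyed by CID string with hit/miss counters.
struct BlockLru {
    cache: Mutex<LruCache<String, Vec<u8>>>,
    hits: AtomicU64,
    misses: AtomicU64,
}

impl BlockLru {
    fn with_capacity(entries: usize) -> Self {
        Self {
            cache: Mutex::new(LruCache::new(NonZeroUsize::new(entries.max(1)).unwrap())),
            hits: AtomicU64::new(0),
            misses: AtomicU64::new(0),
        }
    }

    /// A hit returns the cached block; a miss is counted and the caller falls
    /// back to the inner (persistent) block store.
    fn get(&self, cid: &str) -> Option<Vec<u8>> {
        let mut cache = self.cache.lock().unwrap();
        match cache.get(cid) {
            Some(block) => {
                self.hits.fetch_add(1, Ordering::Relaxed);
                Some(block.clone())
            }
            None => {
                self.misses.fetch_add(1, Ordering::Relaxed);
                None
            }
        }
    }

    fn put(&self, cid: String, block: Vec<u8>) {
        self.cache.lock().unwrap().put(cid, block);
    }
}
```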
Existing integration tests all run at the 16-shard default, so the meta-HAMT page loop was unexercised across flush/PUT/load. Add a test-only `FORCE_INITIAL_SHARD_COUNT` override (behind the `test-fault-injection` feature) and two integration tests that drive multi-page flush, fresh-client load, and the 412-replay race. Also pin `dir_index_seq` inside ManifestRoot so a forged-but-decryptable dir-index ciphertext under a different seq triggers rebuild instead of silently serving stale state; the ETag-only check can't catch that swap because `load_object` drops S3's HEAD ETag.
Introduce robust pinning and multipart handling changes: cache local IPFS multiaddrs (origins) in IpfsBlockStore and attach them to pin requests; add a semaphore to cap concurrent /pins POSTs (configurable via IpfsPinningConfig) to avoid burst 429/503s; implement retry/backoff for pinning-service add_pin with transient vs permanent classification and tests. Make MultipartUpload thread-safe and idempotent: store parts behind an interior mutex, add finished flag and cached ETag for idempotent complete/abort, provide detach() to suppress drop-time auto-abort, perform auto-abort in Drop when appropriate, and expose a bytes_uploaded stub. Update Flutter bindings to use a per-handle semaphore, add start_multipart_with_concurrency and detach_multipart, and adjust example usage. Add unit/integration tests covering semaphore behavior, origins forwarding, backoff behavior, and concurrency invariants.
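The transient-vs-permanent retry split for pinning-service calls can be sketched as below; the status classification, delays, and call signature are assumptions, not the shipped implementation.

```rust
use std::time::Duration;

/// Sketch of the error split: 429 and 5xx back off and retry, other statuses
/// are treated as permanent and fail immediately.
enum PinError {
    Transient(String),
    Permanent(String),
}

fn classify(status: u16) -> PinError {
    match status {
        429 | 500..=599 => PinError::Transient(format!("retryable status {status}")),
        _ => PinError::Permanent(format!("non-retryable status {status}")),
    }
}

/// Retry a pin call with exponential backoff, giving up early on permanent errors.
async fn add_pin_with_retry<F, Fut>(mut call: F, max_attempts: u32) -> Result<(), PinError>
where
    F: FnMut() -> Fut,
    Fut: std::future::Future<Output = Result<(), u16>>, // Err carries the HTTP status
{
    let mut delay = Duration::from_millis(250);
    let mut last = PinError::Transient("no attempt made".into());
    for _ in 0..max_attempts {
        match call().await {
            Ok(()) => return Ok(()),
            Err(status) => match classify(status) {
                PinError::Permanent(e) => return Err(PinError::Permanent(e)),
                PinError::Transient(e) => {
                    last = PinError::Transient(e);
                    tokio::time::sleep(delay).await;
                    delay = (delay * 2).min(Duration::from_secs(8));
                }
            },
        }
    }
    Err(last)
}
```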
Enhanced security, speed, and scalability. Most changes are on the client side.