rfc: mutability & encryption for forge by fforbeck · Pull Request #84 · storacha/RFC

fforbeck · 2026-03-04T16:47:27Z

After the feedback received and the new POC completed by @alanshaw, I decided to break it into 2 RFCs:

hannahhoward

Overall, this is good, but there are some critical missing bits that I think will emerge from working closely with @Peeja, @alanshaw and the existing Go devs.

Specifically,

"Service layer" -- is this a server? A secondary process on the dev machine? I don't think either of these is a good idea, and the server approach would break a number of design principles about our system (namely that all CIDs should be generated on the client). Personally I think a language port is gonna be WAY faster (especially with AI-aided dev) and produce a way less complex system, so I'd argue strongly for a full port. These aren't complex libraries and I think we could have them ported in a week or two with the AI helping. And then we have a single process for a single machine, way simpler to maintain and reason about.
There's a bit of unspecified confusion about how Pail works in the Forge context, which I think you might need to embed with @Peeja on guppy to really grok. So Guppy has a notion of "sources" -- i.e. data sources (usually large, deep directories) that get uploaded within a space. Each space has 1..n sources, and when you upload within a space, after the first upload of a source, only the "delta" gets updated -- Guppy knows how to upload just the blocks needed to make a new updated UnixFS root. So with mutability:

You have the list of sources which get updated, and you DEFINITELY want that to be represented by Pail + UCN.
You have the directory tree structure within the sources itself. This is currently UnixFS and is updated properly each incremental upload.
So the real question is about whether to use Pail for the whole directory tree, and I think that's a complicated question that merits further examination

Reasons not to use Pail:

These are extremely big complicated directories and Pail hasn't been tested at a scale even remotely close to working with these directories
The retrieval patterns and general usage for Pail are totally different from those for UnixFS -- so the downstream change implications of using Pail for the whole directory tree structure are unknown.

Reasons to use Pail:

Much more fine-grained "multi-writer" capabilities are unlocked if you use Pail for everything. If you used Pail for just the sources list, then you'd essentially have last-writer-wins on a per-source level -- if source X is in state A, and two different guppies make several changes to the directory tree structure, written as UnixFS, then the directory structure would by default ONLY get the changes of the last client to write. Note: we could apply a smarter merge outside of Pail, similar to the way I merge Markdown files in Clawracha. I actually believe this wouldn't be TOO hard.
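To make the "smarter merge outside of Pail" idea concrete, here is a minimal sketch of a three-way merge over `path → CID` maps. It is an illustration only: the `Tree` type, `mergeTrees`, and the placeholder CID strings are hypothetical, and this is not how Clawracha or Pail actually merge. It assumes both writers started from a known common base state and that an empty string means "path absent".

```go
package main

import (
	"fmt"
	"sort"
)

// Tree maps a file path to a content identifier string.
// A missing key (empty string on lookup) means the path does not exist.
// Hypothetical type for illustration; not a Guppy or Pail type.
type Tree map[string]string

// mergeTrees performs a three-way merge of two writers' trees against
// their common base. A path changed by only one side takes that side's
// value. A path changed differently by both sides keeps the base value
// and is reported as a conflict.
func mergeTrees(base, a, b Tree) (Tree, []string) {
	paths := map[string]struct{}{}
	for _, t := range []Tree{base, a, b} {
		for p := range t {
			paths[p] = struct{}{}
		}
	}

	merged := Tree{}
	var conflicts []string
	for p := range paths {
		bv, av, cv := base[p], a[p], b[p]
		v := bv
		switch {
		case av == cv: // both sides agree (or neither changed)
			v = av
		case av == bv: // only b changed this path
			v = cv
		case cv == bv: // only a changed this path
			v = av
		default: // both changed it differently
			conflicts = append(conflicts, p)
		}
		if v != "" {
			merged[p] = v
		}
	}
	sort.Strings(conflicts)
	return merged, conflicts
}

func main() {
	base := Tree{"docs/a.md": "cid-a1", "docs/b.md": "cid-b1"}
	guppy1 := Tree{"docs/a.md": "cid-a2", "docs/b.md": "cid-b1"}
	guppy2 := Tree{"docs/a.md": "cid-a1", "docs/b.md": "cid-b2", "docs/c.md": "cid-c1"}

	merged, conflicts := mergeTrees(base, guppy1, guppy2)
	fmt.Println(merged, conflicts)
}
```

With plain last-writer-wins, one of the two writers' edits would be dropped. Here both survive because they touch different paths, and only a genuine same-path conflict needs a further tie-break rule.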

Final sidebar: Current Guppy is also smart enough to only upload diff blocks for Files when they change. Encryption will kill that ability I believe, unless there's some useful way to encode only changes that works for encrypted data. Worth a google.

alanshaw · 2026-03-05T15:55:28Z

rfc/forge-mutability-encryption.md

+│          │                                                   │
+│          ▼                                                   │
+│  9. Publish to UCN: Name.publish(pailRootCID)                │
+│     → mutable name now points to updated index               │


This is "pail without CRDT" - in the case of multiple concurrent updates to the same name, the UCN resolution is to just use the first of the alphabetically sorted CIDs (IIRC). It means if 2 users start with the same pail, and both make an update, only 1 wins.

The Pail CRDT library allows the two updates to be applied, only resorting to alphabetically sorted CIDs when the two updates have the same causal order and touch the same key.
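As a rough illustration of the tie-break described above, the sketch below picks one winner from a set of concurrent heads by sorting their string forms and taking the first. The function name `resolveHead` and the placeholder head strings are hypothetical, not the UCN API, and the exact rule UCN applies was recalled from memory in the comment above, so treat this as an assumption.

```go
package main

import (
	"fmt"
	"sort"
)

// resolveHead returns the first of the lexicographically sorted heads.
// Hypothetical helper: it models "pail without CRDT", where concurrent
// updates are not merged and exactly one of them wins.
func resolveHead(heads []string) (string, bool) {
	if len(heads) == 0 {
		return "", false
	}
	// Copy before sorting so the caller's slice is left untouched.
	sorted := append([]string(nil), heads...)
	sort.Strings(sorted)
	return sorted[0], true
}

func main() {
	// Two users started from the same pail root and each published an
	// update. These strings are placeholders, not real CIDs.
	heads := []string{"bafyB-update-from-user-2", "bafyA-update-from-user-1"}

	winner, ok := resolveHead(heads)
	fmt.Println(winner, ok)
}
```

The losing update is simply discarded here. Per the comment above, the Pail CRDT library would apply both updates and fall back to this kind of ordering only when they share a causal order and touch the same key.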

alanshaw · 2026-03-05T16:05:46Z

rfc/forge-mutability-encryption.md

+│     - KMS info                                               │
+│          │                                                   │
+│          ▼                                                   │
+│  5. Extract encrypted content from CAR using encryptedDataCID│


Why don't we encrypt each block?

POC: storacha/guppy#376

fforbeck · 2026-03-11T17:20:19Z

After the feedback received and the new POC completed by @alanshaw, I decided to break it into 2 RFCs:

hannahhoward

I'm pretty set on using Pail, especially since the Go library exists, but if @alanshaw disagrees I'd be open to using the catalog approach.

hannahhoward · 2026-03-16T15:27:37Z

rfc/forge-encryption.md

+```
+
+**POC Status:**
+- Step 1 (gateway fetch): ✅ Existing infrastructure


no gateway in the forge context -- direct retrieval from SPs

@hannahhoward, when we run the guppy gateway serve, is that how the SPs will allow retrievals, through this "gateway"?

hannahhoward · 2026-03-16T15:29:01Z

rfc/forge-encryption.md

+
+Guppy SHOULD support two types of key rotation via CLI commands (KMS mode only):
+
+### KEK Rotation (Space Key)


This is pretty cool that we can rotate the core RSA key without reencrypting all the files.

hannahhoward · 2026-03-16T15:30:54Z

rfc/forge-encryption.md

+
+This mode does NOT provide access control — anyone with the key can decrypt.
+
+### KMS Mode (Production/Enterprise)


I believe all external forge users will be in this tier

hannahhoward · 2026-03-16T16:10:55Z

rfc/forge-mutability.md

+
+## Approaches Under Consideration
+
+### Option A: Simple Catalog (Alex's Proposal)


So:
I do not believe a single catalog file is a good idea for the size of Forge directories (potentially several thousand or up to a million files/folders) -- an update would mean a large upload just to change a single value.

Yes, you could start chunking past a certain size, but then you're chunking up a key value list and essentially reimplementing Pail from scratch. Pail already considers chunking and all that comes with it -- like inserts and deletes, as well as sorting and range querying.

And https://github.com/storacha/go-pail already exists
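The upload-size argument can be illustrated with a toy comparison: how many bytes must be re-uploaded to change one entry in a single flat catalog versus a catalog split into shards. This is only a sketch under assumptions. The JSON encoding, the FNV hash bucketing, the path and CID placeholder strings, and the shard count are all made up for illustration, and this is not how Pail shards data. As noted above, Pail also handles inserts, deletes, sorting and range queries, which this toy does not.

```go
package main

import (
	"encoding/json"
	"fmt"
	"hash/fnv"
)

// Catalog is a flat path -> CID listing (placeholder strings, not real CIDs).
type Catalog map[string]string

// encodedSize returns the serialized size in bytes, used here as a
// stand-in for "bytes that must be re-uploaded when this object changes".
func encodedSize(c Catalog) int {
	b, err := json.Marshal(c)
	if err != nil {
		panic(err)
	}
	return len(b)
}

// shardOf assigns a path to one of n shards by hashing it.
func shardOf(path string, n int) int {
	h := fnv.New32a()
	h.Write([]byte(path))
	return int(h.Sum32() % uint32(n))
}

// shard splits a catalog into n smaller catalogs.
func shard(c Catalog, n int) []Catalog {
	out := make([]Catalog, n)
	for i := range out {
		out[i] = Catalog{}
	}
	for p, cid := range c {
		out[shardOf(p, n)][p] = cid
	}
	return out
}

// buildCatalog generates n synthetic entries.
func buildCatalog(n int) Catalog {
	c := Catalog{}
	for i := 0; i < n; i++ {
		c[fmt.Sprintf("dir/file-%06d", i)] = fmt.Sprintf("placeholder-cid-%06d", i)
	}
	return c
}

func main() {
	const entries, shards = 100000, 256
	c := buildCatalog(entries)
	parts := shard(c, shards)

	changed := "dir/file-000042"
	flat := encodedSize(c)
	one := encodedSize(parts[shardOf(changed, shards)])

	fmt.Printf("flat catalog re-upload: %d bytes\n", flat)
	fmt.Printf("single shard re-upload: %d bytes\n", one)
}
```

The flat catalog pays its full size on every change, while the sharded form pays for one shard plus whatever index ties the shards together, which is the part that starts to look like reimplementing Pail.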

hannahhoward · 2026-03-16T16:12:48Z

rfc/forge-mutability.md

+
+**How it works**
+
+- **Namespace:** UCN Names - ed25519 keypairs that can be delegated and shared


UCN is a bit of a misnomer as a library one needs to port -- it's all of a few hundred lines of code and is really just scaffolding to provide a simple UI around clock/head + clock/advance.

Ultimately I think we can just use UCN + Pail

hannahhoward · 2026-03-16T16:16:22Z

rfc/forge-mutability.md

+**How it works**
+
+- **Namespace:** UCN Names - ed25519 keypairs that can be delegated and shared
+- **State Index:** Pail - sharded Merkle trie for `path → CID` mappings  


This is an interesting question as it relates to uploads -- we upload large folders as UnixFS -- this is important for retrieval as all software tends to assume UnixFS.

We've discussed various proposals for putting the whole index in Pail vs putting only the uploads in UnixFS.

I think we could get by with just Pail for the upload list, or, if we want to do Pail for everything, I think it would make sense to keep uploading in UnixFS (for traditional retrieval patterns) but also store all the file CIDs in Pail.

rfc: mutability & encryption for forge

b6a46df

fforbeck requested a review from a team March 4, 2026 16:47

fforbeck self-assigned this Mar 4, 2026

fforbeck mentioned this pull request Mar 4, 2026

Plan privacy/mutability story storacha/project-tracking#663

Open

hannahhoward reviewed Mar 5, 2026

alanshaw reviewed Mar 5, 2026

fforbeck added 2 commits March 11, 2026 14:16

rfc: forge encryption

a3a48cd

rfc: forge mutability

8cdaec7

fforbeck requested review from alanshaw and hannahhoward March 11, 2026 17:19

hannahhoward requested changes Mar 16, 2026

minor updates in the forge-encryption rfc

4d94f49

