fix(CIP-145): updates from forum discussion #149

Open
wants to merge 4 commits into
base: main
Changes from 2 commits
54 changes: 39 additions & 15 deletions CIPs/cip-145.md
@@ -34,12 +34,20 @@ below, and would result in pruning of Data Events coinciding with Time Events.
### Definitions
* A Ceramic `synchronization layer` synchronizes all the events for a Ceramic stream.
* A Ceramic `aggregation layer` converts all the events in a Ceramic stream to its corresponding tip state.
* A `fork point` is the last common ancestor of two events `A` and `B`.
* A `merge point` is the first common descendant of two events `A` and `B`.
* A `branch` for an event is the set of all events transitively covered by that event.
* A `pruned branch` is any branch that is not covered by the tip.
* A `fork point` for a branch is the earliest event on that branch that is not on another branch.
* A `merge point` for two events is an event that transitively covers both events.
* `A` is a `covered event` if another event `B` has `A`'s CID in its `prev` field.
* Said differently, if event `B` has event `A`'s CID in its `prev` field, then event `B` "covers" event `A`. By
transitivity, event `B` also covers every event covered by event `A`.
* The `time` of a Data Event is the timestamp of the earliest Time Event that covers the Data Event.
* `A` is an `uncovered event` if there is no event with `A`'s CID in its `prev` field.
* A `diverged stream` has more than one uncovered event.
* A `converged stream` has a single uncovered event.
* A `dominant Data Event` is one that is not covered by another Data Event.
* A `non-dominant Data Event` is one that is covered by another Data Event. This applies transitively through Time
Events.
* A `diverged stream` has more than one dominant Data Event.
* A `converged stream` has a single dominant Data Event.
* An `invalid Data Event` is one that has either an invalid signature or an expired CACAO.
* A `valid Data Event` is one that has a valid signature and a valid CACAO (if applicable).
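
The covering and dominance definitions above can be illustrated with a minimal Python sketch. The `Event` class, the string CIDs, and the helper names (`covered_by`, `uncovered_events`, `dominant_data_events`, `is_diverged`) are assumptions made for illustration and are not part of this CIP or of any Ceramic API. The init event is modelled as a Data Event purely for simplicity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    cid: str            # hypothetical stand-in for a real CID
    kind: str           # "data" or "time"
    prev: tuple = ()    # CIDs this event directly covers


def covered_by(events, cid):
    """Return every CID transitively covered by the event `cid`."""
    seen = set()
    stack = list(events[cid].prev)
    while stack:
        current = stack.pop()
        if current in events and current not in seen:
            seen.add(current)
            stack.extend(events[current].prev)
    return seen


def uncovered_events(events):
    """CIDs that no other event lists in its `prev` field."""
    referenced = {p for e in events.values() for p in e.prev}
    return {cid for cid in events if cid not in referenced}


def dominant_data_events(events):
    """Data Events not covered (even through Time Events) by another Data Event."""
    data = [cid for cid, e in events.items() if e.kind == "data"]
    dominated = set()
    for cid in data:
        dominated |= covered_by(events, cid)
    return {cid for cid in data if cid not in dominated}


def is_diverged(events):
    return len(dominant_data_events(events)) > 1


# Hypothetical stream: G is covered by A and B; Time Event T covers A.
stream = {e.cid: e for e in (
    Event("G", "data"),
    Event("A", "data", ("G",)),
    Event("B", "data", ("G",)),
    Event("T", "time", ("A",)),
)}

# A multi-prev Data Event M that covers both branches converges the stream.
merged = {**stream, "M": Event("M", "data", ("T", "B"))}
```

In this sketch `stream` is diverged because `A` and `B` are both dominant (the Time Event `T` covers `A` but does not make it non-dominant), while `merged` is converged because `M` covers every other Data Event.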

@@ -51,17 +59,33 @@ branches are pruned. Since pruning branches that contain Data Events would resul
a Data Event to contain multiple ancestor events in its `prev` field so that a new merge point Data Event can cover
multiple events.

1. If a stream is in a diverged state (see events `A` and `B` in fig. 5), we only consider branches that contain valid,
uncovered Data Events.
2. For branches that contain valid, uncovered Data Events, we only consider the first Data Event on a branch after the
fork.
3. The Data Event that is covered by the earliest Time Event wins (see event `A` in fig. 5).
4. If two uncovered Data Events are covered by Time Events at the same block height, the Data Event with the lower CID
wins.

Using multi-prev Data Events allows us to reduce the number of uncovered events and converge the stream so that there is
only a single uncovered event, without any data abandoned on pruned branches. The stream's converged/diverged state can
be determined by looking at the `prev` fields of all the Data Events for that stream.
1. If a stream is in a diverged state, each uncovered event is a candidate tip.
2. Branches that do not contain dominant Data Events cannot be the tip.
3. For branches that contain dominant Data Events, consider the earliest Data Event after a fork point.
Reviewer comment:

Would be nice to clarify that it's not just this event that will be included in the new tip, but its corresponding branch. I like first more than earliest, because the latter made me think of anchor time instead of ordering.

Suggested change
3. For branches that contain dominant Data Events, consider the earliest Data Event after a fork point.
3. For branches that contain dominant Data Events, consider the first Data Event after a fork point when electing a new tip branch.

Contributor Author:

Yes, we actually did mean to refer to anchor time here. This helps keep the language consistent with other places.

We hope the earlier definition of a fork point helps clarify what we mean:

* A `fork point` for a branch is the earliest event on that branch that is not on another branch. 

4. The branch with the earliest Data Event becomes the tip.
Reviewer comment:

It might be worth mentioning the case for step 4 in the example below, where one candidate branch is anchored and one isn't. Is the anchored one earlier by definition?

Contributor Author:

Yes, the anchored one is earlier by definition. A Data Event without a Time Event is as if it occurred at time infinity.

We'll update the rule to state this.

5. If multiple earliest Data Events have the same time, then the Data Event with the lowest CID becomes the tip.
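
As a rough illustration of rules (2) through (5) at a single fork, the Python sketch below picks a tip from a list of candidate branches. The `Candidate` record, its field names, the integer timestamps, and the use of plain string comparison for "lowest CID" are all assumptions made for this example. A Data Event with no covering Time Event is treated as occurring at time infinity, as discussed in review. A real implementation would need to apply the rules at every fork in the stream rather than to a flat list.

```python
import math
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Candidate:
    tip_cid: str                    # uncovered event at the end of the branch (rule 1)
    has_dominant_data: bool         # whether the branch contains a dominant Data Event (rule 2)
    first_data_cid: Optional[str]   # earliest Data Event after the fork point (rule 3)
    first_data_time: Optional[int]  # time of that Data Event; None if it has no Time Event


def select_tip(candidates):
    """Return the tip CID of the winning branch, or None if no branch is eligible."""
    # Rule 2: branches without a dominant Data Event cannot be the tip.
    eligible = [c for c in candidates
                if c.has_dominant_data and c.first_data_cid is not None]
    if not eligible:
        return None

    def sort_key(c):
        # An unanchored Data Event is treated as if it occurred at time infinity.
        time = math.inf if c.first_data_time is None else c.first_data_time
        # Rule 4 orders by time; rule 5 breaks ties by lowest CID (string order here).
        return (time, c.first_data_cid)

    return min(eligible, key=sort_key).tip_cid


# Hypothetical candidates; the timestamp value 6 is illustrative only.
candidates = [
    Candidate("Time 6", True, "Data E", 6),
    Candidate("Data F", True, "Data F", None),
    Candidate("Time 4", False, None, None),
]
winner = select_tip(candidates)
```

Here the anchored branch wins because its first Data Event has a finite time, while the unanchored branch sorts last and the branch without a dominant Data Event is never considered.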

For an example of these rules in action, see the figure below:
![Alt text](../assets/cip-145/rules1.png)

1. Based on rule (1), the stream state has 4 candidate tips, `Time 5`, `Time 6`, `Time 4`, and `Data F`. One of these
candidate tips will become the tip of the stream.
![Alt text](../assets/cip-145/rules2.png)
2. The stream forks between the branches for `Time 1` and `Data A`. Based on rules (3) and (4), the branch for `Data A`
is the only branch considered for tip selection.
![Alt text](../assets/cip-145/rules3.png)
3. This branch later forks into additional branches for `Data E`, `Time 4`, and `Data F`. Based on rules (3) and (4),
the branches for `Data E` and `Data F` are the only branches considered for tip selection.
![Alt text](../assets/cip-145/rules4.png)
4. Based on rule (5), since there is a Time Event for `Data E` but not for `Data F`, only the branch for `Data E` is
Reviewer comment:

I think this should be rule 4.

Suggested change
4. Based on rule (5), since there is a Time Event for `Data E` but not for `Data F`, only the branch for `Data E` is
4. Based on rule (4), since there is a Time Event for `Data E` but not for `Data F`, only the branch for `Data E` is

Reviewer comment:

Can we end up in the case where there is a yet-unknown, earlier anchor for the other branch in transit?

Contributor Author:

Good catch, yes, it should be rule 4. It is time that made the decision because a Data Event without a Time Event is as if it occurred at time infinity.

> Can we end up in the case where there is a yet-unknown, earlier anchor for the other branch in transit?

Yes, that's possible. If there were an as-yet-unknown Time 5.5 corresponding to Data F, then Time 5.5 would become the tip.

Although much less likely unless a malicious CAS attempts a late-publishing attack, it is also possible for, say, a Time 1.5 covering Data B to be discovered late. This would rewind the state of the stream, making Time 5 the tip.

Having said that, this spec provides a way for the application to resolve such a situation without data loss. A user can decide whether to override the default tip with a new event, while keeping the stream history intact.

considered for tip selection.
![Alt text](../assets/cip-145/rules5.png)
5. `Time 6` is the tip of the surviving branch, and therefore becomes the tip of the stream.
![Alt text](../assets/cip-145/rules6.png)

Using multi-prev Data Events allows us to reduce the number of dominant Data Events and converge the stream so that
there is only a single dominant Data Event, without any data abandoned on pruned branches. The stream's
converged/diverged state can be determined by looking at the `prev` fields of all the Data Events for that stream.

Events that have invalid signatures cannot be tips, even if uncovered. This has important implications for Data Events
with expired CACAOs. In figures 5 and 6, if we assume event `A` has an expired CACAO, the aggregation layer can choose
Binary file added assets/cip-145/rules1.png
Binary file added assets/cip-145/rules2.png
Binary file added assets/cip-145/rules3.png
Binary file added assets/cip-145/rules4.png
Binary file added assets/cip-145/rules5.png
Binary file added assets/cip-145/rules6.png