Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Plausible clocks #68

Closed
krasserm opened this issue May 10, 2015 · 1 comment · Fixed by #102

Comments

@krasserm
Copy link
Contributor

commented May 10, 2015

Overview

At the moment, each instance of EventsourcedActor (EA) has its own entry in a vector clock. This allows to distinguish concurrent from potentially causally related events, emitted by different EAs, either at the same location (i.e. to the same local event log) or at different locations (i.e. to different local event logs that are connected to a replicated event log). In the worst case, this may lead to vector clocks that grow with the number of EAs. This problem could be mitigated by organizing state around aggregates and only routing events between replicas with the same aggregate id, for example.

However, the majority of use cases are about detecting concurrent updates to replicated EA state at different locations. EA state is usually not replicated within a single location so that concurrent EA state updates within a location are not an issue. This makes causality tracking within a location less important and co-located EAs can share an entry in a vector clock. This means that vector clocks only grow with the number of locations (instead of the usually much larger number of EAs). Consequently, tracking of local time (using a counter) can be moved from EAs to a local log actor. A log actor anyway maintains a local counter which is the sequence number of written events.

EAs at the same location can rely on the fact that they receive all events in the same order. This order is identical with the storage order in a local event log. Should an application still decide to replicate state within a location and concurrent updates are made to these replicas, these updates may be reported as causally related updates which boils down to a last-writer-wins conflict resolution strategy. If this is not an option, EAs can opt-in to have their own vector clock entry which is described in #103. With this strategy, concurrent updates can also be reliably detected within a location (at the cost of larger vector clocks) (see #280).

Assuming a small number of locations, this more coarse-grained approach to causality tracking removes the burden from application developers to choose a design and event routing rules that keep vector clock sizes small, yet covering the majority of use cases. We therefore propose to change the causality tracking default to the described alternative. An opt-in to more fine-grained causality tracking is covered by #103 (see #280).

Formalism

The approach described above uses less vector clock entries than sources of concurrent activity (EAs) i.e. EAs at the same location share a vector clock entry. This is formalized in plausible clocks.

With plausible clocks, pairs of causally-related events are correctly ordered by the clock, although some concurrent events may also be ordered. In Eventuate, in particular:

  • events emitted concurrently at the same location may be ordered by the plausible clock, whereas
  • events emitted concurrently at different locations are never ordered by the plausible clock i.e. are always correctly detected as concurrent events.

Given a function T that maps events to their vector timestamps, for any two events x and y, the following conditions hold (see also VectorTime operators):

  • x = yT(x) equiv T(y) (equivalent) (1)
  • xyT(x) < T(y) (happened-before) (2)

(1) states that plausible clocks assign unique timestamps and (2) is called the weak clock condition. A reported ordering of events x and y means that they are either causally related or concurrent whereas a reported concurrency means that they are actually concurrent:

  • T(x) < T(y) ⇒ (xy) ∨ (x ↔︎ y)
  • T(x) conc T(y) ⇒ (x ↔︎ y)

Related tickets

Plausible clocks can be an enabler for advanced routing capabilities. More fine-grained causality tracking is covered in optional actor-level causality tracking and optional client-level causality tracking.

@krasserm krasserm added the ready label May 10, 2015

@krasserm krasserm changed the title Causality tracking alternatives Log actor vector clocks May 17, 2015

@krasserm krasserm changed the title Log actor vector clocks Log-level vector clocks May 17, 2015

@krasserm krasserm removed the ready label May 26, 2015

@drozzy

This comment has been minimized.

Copy link

commented May 29, 2015

This sounds great -- are there any problems with this approach?

@krasserm krasserm self-assigned this Jul 23, 2015

@krasserm krasserm added this to the 0.3 milestone Jul 23, 2015

krasserm added a commit that referenced this issue Jul 23, 2015
Initial hack
- only ExperimentalIntegrationSpec passes
- ExperimentalIntegrationSpec covers scenario in #68
- all other tests fail
krasserm added a commit that referenced this issue Jul 23, 2015
Initial hack
- ExperimentalIntegrationSpec successfully run scenario described in #68
- all other tests fail
krasserm added a commit that referenced this issue Jul 27, 2015
Initial implementation
- only LeveldbEventLog manages a vector clock
- ExperimentalIntegrationSpec covers scenario described in #68
- some LevelDB-based tests fail
- all Cassandra-based tests fail
krasserm added a commit that referenced this issue Aug 3, 2015
Enable integration and multi-jvm tests again
- event log specs still ignored
- causality specs for #68 added

- delay method unsupported (needs re-implementation)
- delay tests ignored

 - fixed snapshot saving bug
krasserm added a commit that referenced this issue Aug 3, 2015
Enable integration and multi-jvm tests again
- event log specs still ignored
- causality specs for #68 added

- delay method unsupported (needs re-implementation)
- delay tests ignored

 - fixed snapshot saving bug

@krasserm krasserm changed the title Log-level vector clocks Plausible clocks Sep 4, 2015

krasserm added a commit that referenced this issue Sep 8, 2015
Plausible clocks
- closes #68

@krasserm krasserm closed this in #102 Sep 8, 2015

@krasserm krasserm removed the in progress label Sep 8, 2015

@krasserm krasserm modified the milestones: 0.4, 0.3 Sep 8, 2015

krasserm added a commit that referenced this issue Jun 30, 2016
Initial work
- Optional dot for vector time
- Support for sequence gaps
- Some tests still failing [ci skip]
- superseded by #68 (see #280)
- work stopped (see #267)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
2 participants
You can’t perform that action at this time.