response-2018-10-10

lkuper · Oct 9, 2018 · 42e8cfd · 42e8cfd
1 parent 27bddfb
commit 42e8cfd
Showing 1 changed file with 46 additions and 0 deletions.
diff --git a/responses/7868-response-2018-10-10.md b/responses/7868-response-2018-10-10.md
@@ -0,0 +1,46 @@
+This is the original paper on Lamport clocks! It's one of those really fascinating older
+papers that seem to have a lot more license to discuss implications and internal
+digressions than modern papers do.
+
+    A system is distributed if the message transmission delay is not negligible compared to the
+    time between events in a single process.
+
+This is a really good definition of what a distributed system is. Actually—
+
+    …a single process is defined to be a set of events with an a priori total ordering.
+
+This is a _really_ good definition of a process.
+
+I guess that's what happens when you're in a position to essentially lay out a lot of the
+fundamental assumptions about a field.
+
+The paper builds the partial-order-based setup of analysis of distributed systems from
+these foundations, and then moves to Lamport clocks. A Lamport clock is a global, logical
+clock that respects the partial order of messages within a distributed system: i.e., it
+monotonically increases across a single process' events and across messages being sent and
+received from different processes.
+
+This can be implemented fairly easily, if all messages contain logical timestamps; each
+process maintains its own local clock (incrementing it between events) and upon
+receiving a message, updates its clock to be later than the message's timestamp if
+necessary.
+
+Lamport argues that the induced total ordering on events in the system is valuable for
+many kinds of distributed systems—and indeed it is, though of course as we mentioned in
+class it doesn't quite correctly capture actual causality in a distributed system.
+
+Lamport then extends this to a scheme that attempts to do wallclock synchronisation across
+multiple machines. It's the same idea: when you receive a message, if it attests to a time
+that's later than you think it is (modulo the expected minimum latency), then update your
+clock to stay in sync. Apparently this actually does keep your distributed system in sync;
+the proof of this is rather nontrivial, and I'm not going to try to parse its subtleties.
+
+I will however raise a question: so this system doesn't keep the distributed system in
+sync with _actual time_, right? Lamport needs the requirement that clocks never be set
+backwards for Very Understandable Reasons, but surely this means that the system will
+synchronise itself to the fastest clock in the network, right?
+
+I'd also like to have a look at what the state of the art was at the time for cooperative
+protocols that handled failure/latency. Byzantine protocols I assume come much later, but
+it would be interesting to discuss their “implementation of reliable multiprocess systems”
+and how it differs from state of the art today.