
Lecture 4: Primary/Backup Replication


The Design of a Practical System for Fault-Tolerant Virtual Machines


  1. The introduction says that it is more difficult to ensure deterministic execution on physical servers than on VMs. Why is this the case?
  2. What is a hypervisor?
  3. Both GFS and VMware FT provide fault tolerance. How should we think about when one or the other is better?
  4. How do Section 3.4's bounce buffers help avoid races?
  5. What is "an atomic test-and-set operation on the shared storage"?
  6. How much performance is lost by following the Output Rule?
  7. What if the application calls a random number generator? Won't that yield different results on primary and backup and cause the executions to diverge?
  8. How were the creators certain that they captured all possible forms of non-determinism?
  9. What happens if the primary fails just after it sends output to the external world?
  10. Section 3.4 talks about disk I/Os that are outstanding on the primary when a failure happens; it says "Instead, we re-issue the pending I/Os during the go-live process of the backup VM." Where are the pending I/Os located/stored, and how far back does the re-issuing need to go?
  11. How secure is this system?
  12. Is it reasonable to address only fail-stop failures? What other types of failures are there?





How does VM FT handle network partitions? That is, if the primary and the backup end up in different network partitions, is it possible that the backup will become a primary too, so that the system runs with two primaries?
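
To make the two-primaries concern concrete, here is a minimal sketch of how an "atomic test-and-set operation on the shared storage" (question 5) could arbitrate between replicas that have lost contact with each other. It is an illustration under stated assumptions, not VMware's implementation: `SharedDisk`, `test_and_set`, and `try_go_live` are hypothetical names, and an in-process lock stands in for the atomicity that a real storage server would provide.

```python
import threading


class SharedDisk:
    """Hypothetical stand-in for the shared storage (not a VMware API)."""

    def __init__(self):
        self._flag = False
        # Assumption: the lock models the atomicity the storage server provides.
        self._lock = threading.Lock()

    def test_and_set(self):
        """Atomically set the flag and return its previous value."""
        with self._lock:
            old = self._flag
            self._flag = True
            return old


def try_go_live(disk):
    """A replica may go live only if it was the first to set the flag."""
    return disk.test_and_set() is False


# Both replicas believe the other has failed and race to go live.
disk = SharedDisk()
results = []
threads = [
    threading.Thread(target=lambda: results.append(try_go_live(disk)))
    for _ in range(2)
]
for t in threads:
    t.start()
for t in threads:
    t.join()

winners = results.count(True)
```

In this sketch, whichever replica reads the old value `False` proceeds as the sole primary; the other sees `True` and must not go live, so a partition between the two cannot by itself produce two primaries as long as both can still reach the shared storage.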
