Skip to content

HTTPS clone URL

Subversion checkout URL

You can clone with HTTPS or Subversion.

Download ZIP
branch: master
Fetching contributors…

Cannot retrieve contributors at this time

265 lines (232 sloc) 8.406 kb
><!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<title>CAP, Harvest, and Yield</title>
<!-- metadata -->
<meta name="generator" content="S5" />
<meta name="version" content="S5 1.1" />
<meta name="presdate" content="20111115" />
<meta name="author" content="Matt Page" />
<!-- configuration parameters -->
<meta name="defaultView" content="slideshow" />
<meta name="controlVis" content="hidden" />
<!-- style sheet links -->
<link rel="stylesheet" href="ui/default/slides.css" type="text/css" media="projection" id="slideProj" />
<link rel="stylesheet" href="ui/default/outline.css" type="text/css" media="screen" id="outlineStyle" />
<link rel="stylesheet" href="ui/default/print.css" type="text/css" media="print" id="slidePrint" />
<link rel="stylesheet" href="ui/default/opera.css" type="text/css" media="projection" id="operaFix" />
<!-- S5 JS -->
<script src="ui/default/slides.js" type="text/javascript"></script>
</head>
<body>
<div class="layout">
<div id="controls"><!-- DO NOT EDIT --></div>
<div id="currentSlide"><!-- DO NOT EDIT --></div>
<div id="header"></div>
<div id="footer">
<h1>Tsumobi HQ / 2011-11-15</h1>
<h2>{C,A}P, Harvest, and Yield</h2>
</div>
</div>
<div class="presentation">
<div class="slide">
<h1>{C,A}P, Harvest, and Yield</h1>
<h2>Tsumobi Readings in (Distributed) Systems</h2>
<h3>Matt Page</h3>
</div>
<div class="slide">
<h1> Steal these slides! </h1>
<a href="https://github.com/mpage/presos/">https://github.com/mpage/presos/</a>
</div>
<div class="slide">
<h1> Gilbert and Lynch </h1>
"Brewer's Conjecture and the Feasibility of Consistent, Available, Parition-Tolerant Web Services"
<ul>
<li> First formal proof of Brewer's conjecture. </li>
<li> Characterization of systems according to Brewer's conjecture. </li>
</ul>
</div>
<div class="slide">
<h1>Brewer's Conjecture</h1>
<p> A shared data system can have at most two of the following three properties: </p>
<ul class="incremental">
<li> <b>Consistency</b> - Updates appear to happen at the same logical time across all nodes.</li>
<li> <b>Availability</b> - Requests to non-failing nodes always return a result. </li>
<li> <b>Partition tolerance</b> - System tolerates arbitrary message loss between groups of nodes. </li>
</ul>
</div>
<div class="slide">
<h1>Proof of Brewer's Conjecture</h1>
Asynchronous network model:
<ul>
<li> No clocks. </li>
<li> Decisions can be made only in response to messages received. </li>
</ul>
<p> Somewhat surprising: even with no message loss we cannot guarantee CAP. </p>
</div>
<div class="slide">
<h1>Proof of Brewer's Conjecture</h1>
<p> Consider the following execution: </p>
<center>
<img src="img/cap_proof.png" />
</center>
</div>
<div class="slide">
<h1>Proof of Brewer's Conjecture</h1>
Partially synchronous network model:
<ul>
<li> Each node has a clock. </li>
<li> Message delays are bounded. (t<sub>m</sub>)</li>
<li> Message processing is also bounded. (t<sub>local</sub>)</li>
</ul>
<p>
Previous "proof" still applies in the face of message loss.
CAP possible if no messages lost.
</p>
</div>
<div class="slide">
<h1> t-Connected Consistency - Informally</h1>
<ul>
<li> Only applies in the Partially Synchronous Network Model </li>
<li> Stale data may be returned given message loss </li>
<li> Strong consistency is restored after a sufficient interval with no message loss (T)</li>
</ul>
</div>
</div>
<div class="slide">
<h1> t-Connected Consistency - Formally</h1>
<p>
Atomic (strong) consistency under no message loss. Under message loss, &#8707; partial ordering P,
such that:
</p>
<ul>
<li> P order all writes and all reads w.r.t. writes </li>
<li> Reads return value written by previous write in P, or an initial value. </li>
<li> P is consistent with ordering of read/write requests submitted at each node </li>
<li> &#8707 sufficiently long interval of time T w/ no message loss. If &#1012; completes before
T begins and &#981; begins after T ends, then &#1012 comes before &#981 in P. </li>
</ul>
</div>
</div>
<div class="slide">
<h1> System with t-Connected Consistency </h1>
<ul>
<li> Centralized writer C </li>
<li> Read returns result, or stale data if no reply from C in 2*t<sub>m</sub> + t<sub>local</sub></li>
<li> Writes always succeed, but are replayed if no reply from C in 2*t<sub>m</sub> + t<sub>local</sub></li>
<li> C serializes all writes and periodically broadcasts the latest values, along with sequence numbers, to all nodes </li>
</ul>
</div>
<div class="slide">
<h1> Characterizing Systems</h1>
Given that we can only choose two of {C,A,P}, what useful systems can we build?
</div>
<div class="slide">
<h1> Characterizing Systems - AP </h1>
<p>
Sacrifice consistency: Tolerate stale/partial data in the face of failures. Use read-repair/
anti-entropy techniques to restore consistency once partition disappears.
</p>
<p>Examples:</p>
<ul>
<li> Dynamo clones (Riak, Voldemort, Cassandra)</li>
<li> Web caches </li>
</ul>
</div>
<div class="slide">
<h1> Characterizing Systems - CP </h1>
Sacrifice availability: Refuse updates in the face of failures.
Examples:
<ul>
<li> Single node DB systems </li>
<li> Partitioned datastores with primary master per partition (MongoDB, HBase)</li>
</ul>
</div>
<div class="slide">
<h1> Characterizing Systems - CA </h1>
<p> Let's pretend partitions don't happen: </p>
<img src="img/bury_head_in_sand.png" />
</div>
<div class="slide">
<h1> Characterizing Systems - CA </h1>
Partitions happen:
<ul>
<li> Switches fail. </li>
<li> Power (and backup power) fails. </li>
<li> SiteOps inadvertently disconnect network cables. </li>
<li> Undersea fiber is severed. </li>
</ul>
</div>
<div class="slide">
<h1> Takeaway </h1>
When a partition occurs, which of the following do you give up?
<ul>
<li> Consistency </li>
<li> Availability </li>
</ul>
</div>
<div class="slide">
<h1> Brewer and Fox </h1>
"Harvest, Yield, and Scalable Tolerant Systems"
<ul>
<li> Introduces the concepts of Harvest and Yield </li>
<li> Techniques for using Harvest to improve Yield </li>
</ul>
</div>
<div class="slide">
<h1> Harvest and Yield </h1>
<ul>
<li> <b> Yield </b> - Probability of successfully completing a request.</li>
<li> <b> Harvest </b> - Completeness of the returned answer. </li>
</ul>
</div>
<div class="slide">
<h1> Sacrificing Harvest to Improve Yield </h1>
<center> <img src="img/pixel_porn.png" /> </center>
</div>
<div class="slide">
<h1> Sacrificing Harvest to Improve Yield </h1>
<p>
Depending on the application, we can return a partial/stale result instead of no result
during failure scenarios.
</p>
<p>Examples:</p>
<ul class="incremental">
<li> Porn quality degredation on lossy or skinny links. </li>
<li> Partial page rendering on Amazon. </li>
<li> Static fallback newsfeed for Facebook. </li>
<li> Incomplete search results. </li>
</ul>
</div>
<div class="slide">
<h1> Harvest/Yield Applied to System Architecture </h1>
<ul>
<li> Sacrifice Yield at the subsystem (service) level. </li>
<li> Sacrifice Harvest at the composite level. </li>
</ul>
</div>
<div class="slide">
<h1> Orthogonal Mechanisms </h1>
<ul>
<li> Independent subsystems with little to no runtime interface to other subsystems. </li>
<li> (Almost) free lunch. </li>
</ul>
</div>
<div class="slide">
<h1> Orthogonal Mechanisms - Examples </h1>
<ul>
<li> PaaS functionality. (Instance scaling, health monitoring, etc.)</li>
<li> Application sandboxing. </li>
<li> SSL </li>
</ul>
</div>
<div class="slide">
<h1> Takeaway </h1>
<ul>
<li> Failure is a continuum. </li>
<li> Architect your systems so that individual subsystems can fail without reducing overall yield. </li>
</ul>
</div>
</body>
</html>
Jump to Line
Something went wrong with that request. Please try again.