* [Highly Available Transactions: Virtues and Limitations](http://www.bailis.org/papers/hat-vldb2014.pdf) (Bailis et al.)
* [The Incident Command System](http://www.high-reliability.org/files/The_Incident_Command_System.pdf) (Bigley and Roberts)
* [The Chubby Lock Service for Loosely Coupled Distributed Systems](http://static.googleusercontent.com/external_content/untrusted_dlcp/research.google.com/en/us/archive/chubby-osdi06.pdf) (Burrows)
+* [Paxos Made Live - An Engineering Perspective](http://www.cs.utexas.edu/users/lorenzo/corsi/cs380d/papers/paper2-1.pdf) (Chandra et al.)
* [Bigtable: a Distributed Storage System for Structured Data](http://www.read.seas.harvard.edu/~kohler/class/cs239-w08/chang06bigtable.pdf) (Chang et al.)
* [Spanner: Google’s Globally-Distributed Database](http://research.google.com/archive/spanner-osdi2012.pdf) (Corbett et al.)
* [Dynamo: Amazon’s Highly Available Key-Value Store](http://www.read.seas.harvard.edu/~kohler/class/cs239-w08/decandia07dynamo.pdf) (DeCandia et al.)
@@ -24,6 +25,7 @@
* [Fallacies of Distributed Computing Explained](http://www.rgoarchitects.com/Files/fallacies.pdf) (Rotem-Gal-Oz)
* [F1 - The Fault-Tolerant Distributed RDBMS Supporting Google’s Ad Business](http://research.google.com/pubs/archive/38125.pdf) (Shute et al.)
* [Dapper, A Large Scale Distributed Systems Tracing Infrastructure](http://research.google.com/pubs/archive/36356.pdf) (Sigelman et al.)
+* [Convergent and Commutative Replicated Data Types](http://hal.inria.fr/docs/00/55/55/88/PDF/techreport.pdf) (Shapiro et al.)
* [Resilient Distributed Datasets: a Fault-Tolerant Abstraction for In-Memory Cluster Computing](https://www.usenix.org/system/files/conference/nsdi12/nsdi12-final138.pdf) (Zaharia et al.)
* [The Human Side of Postmortems](https://docs.google.com/file/d/0Byl4UKRYLErDVlJMNDNjaThiR2M/edit) (Zwieback)
* [Crew Resource Management: a Positive Change for the Fire Service](http://www.iaff.org/06news/NearMissKit/6.%20Crew%20Resource%20Management/CRM.pdf)
@@ -32,6 +34,7 @@
* [Resilience Engineering: Part I](http://www.kitchensoap.com/2011/04/07/resilience-engineering-part-i/), [Part II](http://www.kitchensoap.com/2012/06/18/resilience-engineering-part-ii-lenses/) (Allspaw)
* [Systems Engineering: a Great Definition](http://www.kitchensoap.com/2011/07/18/systems-engineering-great-definition/) (Allspaw)
+* [The CAP FAQ](http://henryr.github.io/cap-faq/) (Robinson)
* [Some Rules for Engineering and Operations](http://blog.b3k.us/2012/01/24/some-rules.html) (Black)
* [Service Level Disagreements Part I](http://blog.b3k.us/2009/07/15/service-level-disagreements.html), [Part II](http://blog.b3k.us/2009/07/16/service-level-disagreements-2.html) (Black)
* [Design, Lessons, and Advice from Building Distributed Systems at Google](http://odbms.org/download/dean-keynote-ladis2009.pdf) (Dean)
@@ -41,23 +44,29 @@
* [Observations on Errors, Corrections, & Trust of Dependent Systems](http://perspectives.mvdirona.com/2012/02/26/ObservationsOnErrorsCorrectionsTrustOfDependentSystems.aspx) (Hamilton)
* [Life Beyond Distributed Transactions: An Apostate’s Opinion](http://cs.brown.edu/courses/cs227/archives/2012/papers/weaker/cidr07p15.pdf) (Helland)
* [Notes on Distributed Systems for Young Bloods](http://www.somethingsimilar.com/2013/01/14/notes-on-distributed-systems-for-young-bloods/) (Hodges)
+* [A brief history of Consensus, 2PC and Transaction Commit](http://betathoughts.blogspot.com/2007/06/brief-history-of-consensus-2pc-and.html) (Mc Keown)
+* [Principles of Robust Timing over the Internet](http://queue.acm.org/detail.cfm?id=1773943) (Ridoux and Veitch)
* [The Network is Reliable](http://aphyr.com/posts/288-the-network-is-reliable) (Kingsbury)
* [The Trouble with Timestamps](http://aphyr.com/posts/299-the-trouble-with-timestamps) (Kingsbury)
* [Call Me Maybe: Final Thoughts](http://aphyr.com/posts/286-call-me-maybe-final-thoughts) (Kingsbury)
* [Getting Real About Distributed Systems Reliability](http://blog.empathybox.com/post/19574936361/getting-real-about-distributed-system-reliability) (Kreps)
+* [What Distinguishes Distributed Computing From Parallel Computing?](http://colin-scott.github.io/blog/2014/03/30/what-distinguishes-distributed-computing-from-parallel-computing/) (Scott)
* [The Log: What every software engineer should know about real-time data's unifying abstraction](http://engineering.linkedin.com/distributed-systems/log-what-every-software-engineer-should-know-about-real-time-datas-unifying) (Kreps)
* [Incident Response at Heroku](https://blog.heroku.com/archives/2014/5/9/incident-response-at-heroku) (McGranaghan)
* [Observability at Twitter](https://blog.twitter.com/2013/observability-at-twitter) (Watson)
* [Stevey’s Google Platforms Rant](https://plus.google.com/112678702228711889851/posts/eVeouesvaVX) (Yegge)
+* [Your Coffee Shop Doesn’t Use Two-Phase Commit](http://www.enterpriseintegrationpatterns.com/docs/IEEE_Software_Design_2PC.pdf) (Hohpe)
####Presentations
* [Service Design Best Practices](http://www.mvdirona.com/jrh/TalksAndPapers/JamesHamilton_POA20090226.pdf) (Hamilton)
+* [Designs, Lessons and Advice from Building Large Distributed Systems](http://www.cs.cornell.edu/projects/ladis2009/talks/dean-keynote-ladis2009.pdf) (Dean)
####Books
* [The Field Guide To Understanding Human Error](http://www.amazon.com/Field-Guide-Understanding-Human-Error/dp/0754648265) (Dekker)
+* [Guide to Reliable Distributed Systems](http://www.amazon.com/Guide-Reliable-Distributed-Systems-High-Assurance/dp/1447124154) (Birman)
* [Agile Retrospectives: Making Good Teams Great](http://www.amazon.com/Agile-Retrospectives-Making-Teams-Great/dp/0977616649) (Derby et al.)
* [Better: A Surgeon’s Notes on Performance](http://www.amazon.com/dp/0312427654) (Gawande)
* [The Checklist Manifesto: How to Get Things Right](http://www.amazon.com/The-Checklist-Manifesto-ebook/dp/B0030V0PEW) (Gawande)
@@ -80,6 +89,7 @@
* [Ricon](http://ricon.io/)
* [Surge](http://surge.omniti.com/)
* [Velocity](http://velocityconf.com/)
+* [Strange Loop](https://thestrangeloop.com/)
####Courseware