http://www.gluecon.com/ The front page doesn't really describe the conference well; the session descriptions are more useful: https://docs.google.com/spreadsheets/d/1A6RoScrHsRn96u74o7uwSOa5VuJ7mfLKS_cgaVhyCq0/edit#gid=0 ...
"While pursuing success in a dynamic, complex environment with limited resources and multiple goal conflicts, a succession of small, everyday decisions eventually produced breakdowns on a massive ...
http://www.amazon.com/Inviting-Disaster-James-R-Chiles-ebook/dp/B0018ND83Y This book was in one of Nygard's footnotes in Release It! It's an awesome collection of research into catastrophic failures ...
Book: Failure is Not an Option
http://www.amazon.com/Failure-Is-Not-Option-Mission/dp/1439148813 A great book about how the operations team worked and grew from the beginning of NASA through all of the major Apollo missions. Although ...
Book: The Practice of Cloud System Administration
http://www.amazon.com/The-Practice-Cloud-System-Administration/dp/032194318X
Post / Website: Lambda Architecture
http://lambda-architecture.net Nathan Marz came up with the term Lambda Architecture (LA) for a generic, scalable and fault-tolerant data processing architecture, based on his experience working on distributed ...
Post: Distributed Systems Design
https://www.bluebox.net/insight/blog-article/distributed-systems-design-part-1-4
https://www.bluebox.net/insight/blog-article/distributed-systems-design-part-2-4
https://www.bluebox.net/insight/blog-article/distributed-systems-design-part-3-4 ...