import of work in progress website

1 parent ad0b65e commit e2c4f69d180adb8e122109ef33027ab02f680ba6 Noah Slater committed Aug 6, 2010
Showing with 8,231 additions and 1 deletion.
  1. +61 −0 README.md
  2. +5 −0 buy.html
  3. +488 −0 draft/api.html
  4. +41 −0 draft/balancing.html
  5. +61 −0 draft/btree.html
  6. BIN draft/btree/01.png
  7. +113 −0 draft/clustering.html
  8. +19 −0 draft/colophon.html
  9. +266 −0 draft/conflicts.html
  10. BIN draft/conflicts/01.png
  11. BIN draft/conflicts/02.png
  12. BIN draft/conflicts/03.png
  13. BIN draft/conflicts/04.png
  14. BIN draft/conflicts/05.png
  15. BIN draft/conflicts/06.png
  16. BIN draft/conflicts/07.png
  17. BIN draft/conflicts/08.png
  18. +213 −0 draft/consistency.html
  19. BIN draft/consistency/01.png
  20. BIN draft/consistency/02.png
  21. BIN draft/consistency/03.png
  22. BIN draft/consistency/04.png
  23. BIN draft/consistency/05.png
  24. BIN draft/consistency/06.png
  25. BIN draft/consistency/07.png
  26. +421 −0 draft/cookbook.html
  27. +143 −0 draft/design.html
  28. BIN draft/design/01.png
  29. +368 −0 draft/documents.html
  30. BIN draft/documents/01.png
  31. BIN draft/documents/02.png
  32. BIN draft/documents/03.png
  33. BIN draft/documents/04.png
  34. +31 −0 draft/foreword.html
  35. +133 −0 draft/formats.html
  36. BIN draft/formats/01.png
  37. +63 −0 draft/index.html
  38. +113 −0 draft/json.html
  39. +268 −0 draft/lists.html
  40. BIN draft/lists/01.png
  41. +79 −0 draft/mac.html
  42. +338 −0 draft/managing.html
  43. BIN draft/managing/01.png
  44. BIN draft/managing/02.png
  45. +234 −0 draft/notifications.html
  46. +218 −0 draft/performance.html
  47. +53 −0 draft/preface.html
  48. +460 −0 draft/recipes.html
  49. +118 −0 draft/replication.html
  50. +69 −0 draft/scaling.html
  51. +243 −0 draft/security.html
  52. +298 −0 draft/show.html
  53. BIN draft/show/01.png
  54. BIN draft/show/02.png
  55. +263 −0 draft/source.html
  56. +219 −0 draft/standalone.html
  57. BIN draft/standalone/01.png
  58. BIN draft/standalone/02.png
  59. BIN draft/standalone/03.png
  60. BIN draft/standalone/04.png
  61. BIN draft/standalone/05.png
  62. BIN draft/standalone/06.png
  63. BIN draft/standalone/07.png
  64. BIN draft/standalone/08.png
  65. BIN draft/standalone/09.png
  66. BIN draft/standalone/10.png
  67. BIN draft/standalone/11.png
  68. BIN draft/standalone/12.png
  69. +410 −0 draft/tour.html
  70. BIN draft/tour/01.png
  71. BIN draft/tour/02.png
  72. BIN draft/tour/03.png
  73. BIN draft/tour/04.png
  74. BIN draft/tour/05.png
  75. BIN draft/tour/06.png
  76. BIN draft/tour/07.png
  77. BIN draft/tour/08.png
  78. BIN draft/tour/09.png
  79. BIN draft/tour/10.png
  80. +260 −0 draft/transforming.html
  81. +61 −0 draft/unix.html
  82. +210 −0 draft/validation.html
  83. BIN draft/validation/01.png
  84. +575 −0 draft/views.html
  85. BIN draft/views/01.png
  86. BIN draft/views/02.png
  87. BIN draft/views/03.png
  88. BIN draft/views/04.png
  89. +159 −0 draft/why.html
  90. BIN draft/why/01.png
  91. BIN draft/why/02.png
  92. BIN draft/why/03.png
  93. +21 −0 draft/windows.html
  94. BIN image/couchdb_book.jpg
  95. BIN image/couchg_bg.png
  96. BIN image/couchg_bg_dark.png
  97. BIN image/couchg_bg_footer.png
  98. +153 −1 index.html
  99. +263 −0 script.js
  100. +100 −0 style.css
  101. +562 −0 style/custom.css
  102. +21 −0 style/ie.css
  103. +21 −0 style/print.css
  104. +16 −0 todo.txt
@@ -0,0 +1,61 @@
+# CouchDB: The Definitive Guide
+
+This is the home of the open source book “CouchDB: The Definitive Guide”.
+
+
+## Organisation
+
+`draft/` is the permanently work-in-progress version of the next edition.
+
+`editions/` has a list of all editions of the book. `editions/1/` is the current edition of the book; when we do a second edition, it will be under `editions/2`, and so on.
+
+`editions/1/en` is the main book content. Translations can be found under `editions/1/..`. Each language can be found under its respective language code (de, fr, ja…; see <http://en.wikipedia.org/wiki/List_of_ISO_639-1_codes>).
+
+
+## Contributions
+
+This book is open source under the Creative Commons Attribution 3.0 Unported license (http://creativecommons.org/licenses/by/3.0/).
+
+The authors encourage you to fork, improve and publish our work under the terms of the license. Big thanks to O’Reilly for allowing this!
+
+If you feel like giving back, please use GitHub pull-requests [TODO: link] to notify us of new content. We’re equally happy about issues [TODO: link] you raise to get our attention.
+
+Chris, Jan and Noah retain editorial control over any changes anyone submits.
+
+
+## Publication
+
+Every once in a while, O’Reilly will take the current `en` version and turn it into a printed book. When that’s done, we’ll create a new directory under `editions/` to hold a stable snapshot of that release.
+
+
+## Translations
+
+We’d like to encourage you to start translating the book into your native language (or language of choice, really). We’re continuously publishing our work and all translations, so our readers always get the most up-to-date information.
+
+If you would like to see a translation in a particular language, please first check whether one already exists. If not, follow the instructions below.
+
+
+### Starting a Translation
+
+Only ever start a translation from the `en/` directory under `editions/<number>`. Do not try to translate `draft/` as it is constantly changing. Do not try to translate from any other language as `en/` is most likely the most complete source.
+
+Here’s how you would start a German (de) translation:
+
+    cd editions/1
+    cp -r en de
+    git add de
+    git commit -m 'Start German translation' de
+
+
+#### Styles
+
+If you need custom CSS rules for your translation, please create a new file `style.css` in `editions/1/de/` and add another `<link rel="stylesheet" href="../style.css">` line to your HTML files.
+
+
+### Publishing a Translation
+
+You’re free to publish a translation under the aforementioned license. O’Reilly voiced interest in publishing translations as well, but no definite plans have been made. We’re happy to put you in touch with our editor to discuss printed editions of translations further.
+
+## Relax
+
+
@@ -0,0 +1,5 @@
+<meta charset="utf-8">
+
+<link rel="stylesheet" href="style.css">
+
+<script src="http://catalog.oreilly.com/catalog/9780596158163/widgets/buy_buttons.js"></script>
@@ -0,0 +1,41 @@
+<title>Load Balancing</title>
+
+<meta charset="utf-8">
+
+<link rel="stylesheet" href="../style.css">
+
+<link rel="prev" href="conflicts.html">
+
+<link rel="next" href="clustering.html">
+
+<h2 id="balancing">Load Balancing</h2>
+
+<p>Jill is woken up at 4:30 a.m. by her mobile phone. She receives text message after text message, one every minute. Finally, Joe calls. Joe is furious, and Jill has trouble understanding what Joe is saying. In fact, Jill has a hard time figuring out why Joe would call her in the middle of the night. Then she remembers: Joe is running an online shop selling sports gear on one of her servers, and he is furious because the server went down and now his customers in New Zealand are angry because they can’t get to the online shop.
+
+<p>This is a typical scenario, and you have probably seen many variations of it, being in the role of Jill, Joe, or both. If you are Jill, you want to sleep at night, and if you are Joe, you want your customers to buy from you whenever it pleases them.
+
+<h3 id="backup">Having a Backup</h3>
+
+<p>The problems persist: computers fail, and in many ways. There are hardware problems, power outages, bugs in the operating system or application software, etc. Only CouchDB doesn’t have any bugs. (Well, of course, that’s not true. All software has bugs, with the possible exception of things written by Daniel J. Bernstein and Donald Knuth.)
+
+<p>Whatever the cause is, you want to make sure that the service you are providing (in Jill and Joe’s case, the database for an online store) is resilient against failure. The road to resilience is a road of finding and removing single points of failure. A server’s power supply can fail. To keep the server from turning off during such an event, most come with at least two power supplies. To take this further, you could get a server where everything is duplicated (or more), but that would be a highly specialized (and expensive) piece of hardware. It is much cheaper to get two similar servers where one can take over if the other has a problem. However, you need to make sure both servers have the same set of data in order to switch them without a user noticing.
+
+<p>Removing all single points of failure will give you a highly available or a fault-tolerant system. The degree of tolerance is limited only by your budget. If you can’t afford to lose a customer’s shopping cart in any event, you need to store it on at least two servers in at least two geographically distant locations.
+
+<div class="aside note">
+
+<p>Amazon does this for the <a href="http://www.amazon.com">Amazon.com</a> website. If one data center is the victim of an earthquake, a user will still be able to shop.
+
+<p>It is likely, though, that Amazon’s problems are not your problems and that you will have a whole set of new problems when your data center goes away. But you still want to be able to live through a server failure.
+
+</div>
+
+<p>Before we dive into setting up a highly available CouchDB system, let’s look at another situation. Joe calls Jill during regular business hours and relays his customers’ complaints that loading the online shop takes “forever.” Jill takes a quick look at the server and concludes that this is a lucky problem to have, leaving Joe puzzled. Jill explains that Joe’s shop is suddenly attracting many more users who are buying things. Joe chimes in, “I got a great review on that blog. That’s where they must be coming from.” A quick referrer check reveals that indeed many of the new customers are coming from a single site. The blog post already includes comments from unhappy customers voicing their frustration with the slow site. Joe wants to make his customers happy and asks Jill what to do. Jill advises that they set up a second server that can take half of the load of the current server, making sure all requests get answered in a reasonable amount of time. Joe agrees, and Jill begins to set things up.
+
+<p>The solution to the outlined problem looks a lot like the earlier one for providing a fault-tolerant setup: install a second server and synchronize all data. The difference is that with fault tolerance, the second server just sits there and waits for the first one to fail. In the server-overload case, a second server helps answer all incoming requests. This case is not fault-tolerant: if one server crashes, the other will get all the requests and will likely break down, or at least provide very slow service, neither of which is acceptable.
+
+<p>Keep in mind that although the solutions look similar, high availability and fault tolerance are not the same. We’ll get back to the second scenario later on, but first we will take a look at how to set up a fault-tolerant CouchDB system.
+
+<p>We already gave it away in the previous chapters: the solution to synchronizing servers is replication.
+
+<script src="../script.js"></script>
@@ -0,0 +1,61 @@
+<title>The Power of B-trees</title>
+
+<meta charset="utf-8">
+
+<link rel="stylesheet" href="../style.css">
+
+<link rel="prev" href="json.html">
+
+<link rel="next" href="colophon.html">
+
+<h2 id="btree">The Power of B-trees</h2>
+
+<p>CouchDB uses a data structure called a B-tree to index its documents and views. We’ll look at B-trees enough to understand the types of queries they support and how they are a good fit for CouchDB.
+
+<p>This is our first foray into CouchDB internals. To use CouchDB, you don’t need to know what’s going on under the hood, but if you understand how CouchDB performs its magic, you’ll be able to pull tricks of your own. Additionally, if you understand the consequences of the ways you are using CouchDB, you will end up with smarter systems.
+
+<p>If you weren’t looking closely, CouchDB would appear to be a B-tree manager with an HTTP interface.
+
+<div class="aside note">
+
+<p>CouchDB is actually using a B+ tree, which is a slight variation of the B-tree that trades a bit of (disk) space for speed. When we say <em>B-tree</em>, we mean CouchDB’s <em>B+ tree</em>.
+
+</div>
+
+<p>A B-tree is an excellent data structure for storing huge amounts of data for fast retrieval. When there are millions and billions of items in a B-tree, that’s when they get fun. B-trees are usually a shallow but wide data structure. While other trees can grow very high, a typical B-tree has a single-digit height, even with millions of entries. This is particularly interesting for CouchDB, where the leaves of the tree are stored on a slow medium such as a hard drive. Accessing any part of the tree for reading or writing requires visiting only a few nodes, which translates to a few head seeks (which are what make a hard drive slow), and because the operating system is likely to cache the upper tree nodes anyway, only the seek to the final leaf node is needed.
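+
+<p>To make the “shallow but wide” claim concrete, here is a rough back-of-the-envelope sketch in Python. The function name <code>btree_height</code> and the fanout of 100 children per node are illustrative assumptions, not CouchDB’s actual node size; the point is only that the height grows with the logarithm of the number of entries.
+
```python
def btree_height(num_keys: int, fanout: int) -> int:
    """Return the smallest number of levels h with fanout**h >= num_keys.

    `fanout` is a hypothetical number of children per node, chosen for
    illustration only; it is not a figure taken from CouchDB.
    """
    height = 1
    capacity = fanout
    while capacity < num_keys:
        capacity *= fanout
        height += 1
    return height


if __name__ == "__main__":
    # Compare a wide tree (fanout 100) with a binary tree (fanout 2).
    for keys in (1_000_000, 1_000_000_000):
        print(keys, btree_height(keys, fanout=100), btree_height(keys, fanout=2))
```
+
+<p>Under these assumptions, a million entries fit in three levels and a billion in five, whereas a binary tree would need twenty levels for the same million entries.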
+
+<blockquote>
+
+<p>From a practical point of view, B-trees, therefore, guarantee an access time of less than 10 ms even for extremely large datasets.
+
+<p class="attribution">&mdash;Dr. Rudolf Bayer, inventor of the B-tree
+
+</blockquote>
+
+<p>CouchDB’s B-tree implementation is a bit different from the original. While it maintains all of the important properties, it adds Multi-Version Concurrency Control (MVCC) and an append-only design. B-trees are used to store the main database file as well as view indexes. One database is one B-tree, and one view index is one B-tree.
+
+<p>MVCC allows concurrent reads and writes without using a locking system. Writes are serialized, allowing only one write operation at any point in time for any single database. Write operations do not block reads, and there can be any number of read operations at any time. Each read operation is guaranteed a consistent view of the database. How this is accomplished is at the core of CouchDB’s storage model.
+
+<p>The short answer is that because CouchDB uses append-only files, the B-tree root node must be rewritten every time the file is updated. However, old portions of the file will never change, so every old B-tree root, should you happen to have a pointer to it, will also point to a consistent snapshot of the database.
+
+<p>Early in the book we explained how the MVCC system uses the document’s <code>_rev</code> value to ensure that only one person can change a document version. The B-tree is used to look up the existing <code>_rev</code> value for comparison. By the time a write is accepted, the B-tree can expect it to be an authoritative version.
+
+<p>Since old versions of documents are not overwritten or deleted when new versions come in, requests that are reading a particular version do not care if new ones are written at the same time. With an often changing document, there could be readers reading three different versions at the same time. Each version was the latest one when a particular client started reading it, but new versions were being written. From the point when a new version is <em>committed</em>, new readers will read the new version while old readers keep reading the old version.
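+
+<p>The snapshot behavior described above can be sketched with a toy model in Python. This is not CouchDB’s implementation: the class <code>AppendOnlyStore</code> and its methods are hypothetical, and a plain dictionary stands in for the B-tree root. It only illustrates why a reader holding an old root keeps seeing a consistent view while new versions are appended.
+
```python
class AppendOnlyStore:
    """Toy append-only store; a dict stands in for the B-tree root."""

    def __init__(self):
        self._log = []    # stands in for the database file; entries are never modified
        self._root = {}   # current "root": document id -> offset into the log

    def write(self, doc_id, body):
        # Append the new version; older versions stay where they are.
        self._log.append(body)
        # Build a new root instead of mutating the old one, so readers
        # that still hold the old root are unaffected.
        new_root = dict(self._root)
        new_root[doc_id] = len(self._log) - 1
        self._root = new_root

    def snapshot(self):
        # A reader grabs the current root once and keeps using it.
        return self._root

    def read(self, root, doc_id):
        return self._log[root[doc_id]]


if __name__ == "__main__":
    store = AppendOnlyStore()
    store.write("doc", "version 1")
    old_reader = store.snapshot()
    store.write("doc", "version 2")
    print(store.read(old_reader, "doc"))        # the old reader's view
    print(store.read(store.snapshot(), "doc"))  # a new reader's view
```
+
+<p>In this sketch the old reader still sees the first version after the second write, while a reader that starts afterward sees the second.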
+
+<p>In a B-tree, data is kept only in leaf nodes. CouchDB only ever appends data to the database file that keeps the B-tree on disk, so the file grows only at the end. Add a new document? The file grows at the end. Delete a document? That gets recorded at the end of the file. The consequence is a robust database file. Computers fail for plenty of reasons, such as power loss or failing hardware. Since CouchDB does not overwrite any existing data, it cannot corrupt anything that has been written and <em>committed</em> to disk already. See <a href="#figure/1">Figure 1, “Flat B-tree and append-only”</a>.
+
+<p>Committing is the process of updating the database file to reflect changes. This is done in the file footer, which is the last 4k of the database file. The footer is 2k in size and written twice in succession. First, CouchDB appends any changes to the file and then records the file’s new length in the first database footer. It then force-flushes all changes to disk. It then copies the first footer over to the second 2k of the file and force-flushes again.
+
+<div class="figure" id="figure/1">
+
+<img src="btree/01.png">
+
+<p class="caption">Figure 1. Flat B-tree and append-only
+
+</div>
+
+<p>If anywhere in this process a problem occurs—say, power is cut off and CouchDB is restarted later—the database file is in a consistent state and doesn’t need a checkup. CouchDB starts reading the database file backward. When it finds a footer pair, it makes some checks: if the first 2k are corrupt (a footer includes a checksum), CouchDB replaces it with the second footer and all is well. If the second footer is corrupt, CouchDB copies the first 2k over and all is well again. Only once both footers are flushed to disk successfully will CouchDB acknowledge that a write operation was successful. Data is never lost, and data on disk is never corrupted. This design is the reason for CouchDB having no <em>off</em> switch. You just terminate it when you are done.
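+
+<p>The double-footer recovery check can be sketched in Python as follows. The footer layout used here (a CRC-32 checksum followed by the data length, padded to 2k) and all function names are assumptions made for illustration; they are not CouchDB’s actual on-disk format. The sketch also assumes the footer pair sits at the very end of the file rather than scanning backward for it.
+
```python
import struct
import zlib

FOOTER_SIZE = 2048  # the text gives 2k per footer copy


def make_footer(data_length: int) -> bytes:
    """Build one footer: 4-byte CRC-32, 8-byte length, zero padding (assumed layout)."""
    payload = struct.pack(">Q", data_length)
    checksum = struct.pack(">I", zlib.crc32(payload))
    return (checksum + payload).ljust(FOOTER_SIZE, b"\0")


def parse_footer(footer: bytes):
    """Return the recorded length, or None if the checksum does not match."""
    (checksum,) = struct.unpack(">I", footer[:4])
    payload = footer[4:12]
    if zlib.crc32(payload) != checksum:
        return None
    return struct.unpack(">Q", payload)[0]


def recover_length(file_bytes: bytes):
    """Use whichever of the two trailing footer copies is intact."""
    first = file_bytes[-2 * FOOTER_SIZE:-FOOTER_SIZE]
    second = file_bytes[-FOOTER_SIZE:]
    for footer in (first, second):
        length = parse_footer(footer)
        if length is not None:
            return length
    return None  # both copies are damaged


if __name__ == "__main__":
    data = b"x" * 100
    footer = make_footer(len(data))
    damaged = bytearray(footer)
    damaged[5] ^= 0xFF  # simulate a torn write in one footer copy
    print(recover_length(data + bytes(damaged) + footer))
```
+
+<p>Because each copy carries its own checksum, a write that is interrupted while one copy is being updated still leaves the other copy usable, which is the property the paragraph above relies on.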
+
+<p>There’s a lot more to say about B-trees in general, and if and how SSDs change the runtime behavior. The Wikipedia article on <a href="http://en.wikipedia.org/wiki/B-tree">B-trees</a> is a good starting point for further investigations. Scholarpedia includes <a href="http://www.scholarpedia.org/article/B-tree_and_UB-tree">notes</a> by Dr. Rudolf Bayer, inventor of the B-tree.
+
+<script src="../script.js"></script>