= OSCON 2008, Session 1: CouchDB
== RDBMS
- CouchDB is not a relational database.
- Usually you design a schema up front
== What is CouchDB
- Stores *Documents* (individual data records)
- No Schema!
- Columns containing NULL don't make sense
- "My Business card doesn't have 'Fax Number:' and then NULL"
- Natural data behavior
== Documents
- Store your document data in a JSON string.
- The speaker noted that Ruby data structures map onto JSON with essentially no translation
- XML Sucks.
Short example:
{
"_id":"223BDCD",
"_rev":"834BC",
"age":54,
"name":"Darth Vader",
...
}
- The revision (`_rev`) lets you fetch a document, write a new copy, and save it as the latest revision of that document. This allows you to turn back time per document (the equivalent of a database row)!
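The fetch, modify, save cycle above can be sketched with a toy in-memory store. This is only an illustration of the idea, not CouchDB's API: `ToyStore` and its methods are hypothetical names, and the revision counter ("1", "2", ...) stands in for the opaque `_rev` strings a real server would hand out.

```python
import json


class ToyStore:
    """In-memory stand-in for a document store (hypothetical, not CouchDB)."""

    def __init__(self):
        self._docs = {}  # _id -> list of JSON strings, oldest revision first

    def get(self, doc_id):
        # Return a fresh copy of the latest revision.
        return json.loads(self._docs[doc_id][-1])

    def save(self, doc):
        history = self._docs.setdefault(doc["_id"], [])
        if history:
            current_rev = json.loads(history[-1])["_rev"]
            # A writer must present the revision it started from.
            if doc.get("_rev") != current_rev:
                raise ValueError("conflict: document changed since it was fetched")
        new_doc = dict(doc, _rev=str(len(history) + 1))
        history.append(json.dumps(new_doc))
        return new_doc["_rev"]

    def revisions(self, doc_id):
        # Older copies are kept, which is what makes "turning back time" possible.
        return [json.loads(raw) for raw in self._docs[doc_id]]


store = ToyStore()
store.save({"_id": "223BDCD", "name": "Darth Vader", "age": 54})

doc = store.get("223BDCD")      # fetch
doc["age"] = 55                 # write a new copy
new_rev = store.save(doc)       # save it as the latest revision

old_age = store.revisions("223BDCD")[0]["age"]
```

Saving `doc` a second time without re-fetching raises `ValueError`, because it still carries the revision it was fetched with.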
== How do I talk to it?
=== HTTP REST API
- Create: HTTP PUT /db/docid
- Read: HTTP GET /db/docid
- Update: HTTP PUT /db/docid (the body must include the document's current `_rev`)
- Delete: HTTP DELETE /db/docid
- You don't have to generate the ID yourself: leave it out and CouchDB assigns one. If you do provide one, it has to be a string.
- JSON doesn't deal with binary data, so you have to BASE64-encode it. There apparently is some other way to handle it.
- If you don't send well-formed JSON, the call results in an error. Beyond that, there is no way to specify rules that are enforced on writes.
- Type integrity checking: They don't care.
- Is there a way to get a document using something other than an id? Yes.
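As a rough sketch of what these calls look like from a client, the snippet below builds (but does not send) the requests using Python's standard library. The server address, the database name `people`, the helper `build_request`, and the `photo_base64` field are all assumptions made for illustration; the base64 step shows the workaround for binary data mentioned above.

```python
import base64
import json
import urllib.request

BASE_URL = "http://localhost:5984"  # assumed server address


def build_request(method, db, doc_id, doc=None):
    """Build (without sending) an HTTP request for a single document."""
    body = json.dumps(doc).encode("utf-8") if doc is not None else None
    req = urllib.request.Request(
        f"{BASE_URL}/{db}/{doc_id}", data=body, method=method
    )
    if body is not None:
        req.add_header("Content-Type", "application/json")
    return req


# JSON has no binary type, so raw bytes are carried as a base64 string.
photo = base64.b64encode(b"\x00\x01raw bytes").decode("ascii")
doc = {"name": "Darth Vader", "age": 54, "photo_base64": photo}

create = build_request("PUT", "people", "223BDCD", doc)
read = build_request("GET", "people", "223BDCD")
delete = build_request("DELETE", "people", "223BDCD")
```

Passing any of these objects to `urllib.request.urlopen` would actually send it, which of course needs a running server.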
There are 2 more features that make Couch really cool:
== Views
- Filter, Collate, Aggregate
- Powered by map/reduce! (They improved it a little bit!)
- Views are built incrementally and on demand. Reduction is optional.
- Sends diffs around to sync db data. VERY FAST!
- No write penalty with views.
- The view is simply the result of a map/reduce function stored in a btree.
=== Example: Tag Cloud
- We have a db full of tagged documents
- We must know how often each tag appears
- Use map/reduce!
- Works well since it's in Erlang, which can be massively parallel
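The tag cloud idea can be sketched in a few lines. Note the assumptions: CouchDB view functions are written in JavaScript and run inside the server, while this is a plain Python simulation of the same map and reduce steps. The sample documents and the names `map_doc`, `reduce_counts`, and `run_view` are made up for illustration, and unlike a real view this recomputes everything on each call instead of updating incrementally.

```python
from collections import defaultdict

# Hypothetical tagged documents.
docs = [
    {"_id": "a", "tags": ["erlang", "couchdb"]},
    {"_id": "b", "tags": ["couchdb", "json"]},
    {"_id": "c", "tags": ["couchdb"]},
]


def map_doc(doc):
    """Map step: emit one (tag, 1) pair per tag on the document."""
    for tag in doc.get("tags", []):
        yield tag, 1


def reduce_counts(values):
    """Reduce step: add up the emitted values for one key."""
    return sum(values)


def run_view(documents):
    # Group the emitted values by key, then reduce each group.
    grouped = defaultdict(list)
    for doc in documents:
        for key, value in map_doc(doc):
            grouped[key].append(value)
    return {key: reduce_counts(values) for key, values in sorted(grouped.items())}


tag_counts = run_view(docs)
```

Here `tag_counts` comes out as `{"couchdb": 3, "erlang": 1, "json": 1}`, which is the data a tag cloud needs.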
== Replication
- CouchDB was originally designed for offline replication of your database.
- Replication works a lot like rsync
- They don't use auto_increment
- Full new revisions of documents, not partial changes.
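A very loose sketch of the "full revisions, not partial changes" point: compare revisions on both sides and copy the whole document wherever they differ. This is a toy under heavy assumptions, not CouchDB's replication protocol. It treats each database as a plain dict, ignores conflicts and deletions, and the `replicate` function, the `laptop` and `server` names, and the second document are invented for illustration.

```python
def replicate(source, target):
    """Copy every document whose revision differs, as a whole document."""
    copied = []
    for doc_id, doc in source.items():
        if target.get(doc_id, {}).get("_rev") != doc["_rev"]:
            target[doc_id] = dict(doc)  # the full document, not a field-level diff
            copied.append(doc_id)
    return copied


laptop = {
    "223BDCD": {"_id": "223BDCD", "_rev": "2", "name": "Darth Vader", "age": 55},
    "9F0A": {"_id": "9F0A", "_rev": "1", "name": "Leia"},
}
server = {
    "223BDCD": {"_id": "223BDCD", "_rev": "1", "name": "Darth Vader", "age": 54},
}

first = replicate(laptop, server)    # copies the changed and the missing document
second = replicate(laptop, server)   # nothing left to copy
```

Like rsync, a second run right after the first has nothing to transfer.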
== Built for the future
- Written in Erlang.
- Non-locking MVCC and ACID compliant data store
- No locking of the data store ever
- Damien Katz invented it. He self-funded full-time development for 2 years.
- Now it's backed by IBM.