typos #40

Merged
1 commit merged into from
Feb 18, 2011
22 changes: 11 additions & 11 deletions docs/purpose.txt
@@ -14,23 +14,23 @@ appropriate for every storage situation (we still have files and hashmaps, after
and in particular the current SQL products available, present problems for large scale software.

But first lets review what is good about the relational model:
-1. It seperates data from look-up strategy, queries are declarative
-2. It is increadibly flexible--most desirable data operations can be expressed in relations
+1. It separates data from look-up strategy, queries are declarative
+2. It is incredibly flexible--most desirable data operations can be expressed in relations

The problems with current RDBMSes are the following:
1. Building a website on a centralized database combines the two worst enemies, disk io and network io, for every
server in the system, and centralizes them onto a single machine. This is clearly not feasible.
2. The relational model does not parallelize well, due to the difficulties of sharing state. Most databases favor
consistency over performance (2PC). You can't have both.
-3. SQL sucks for application developement. Most web apps are programs that construct little sql statement mini-programs
+3. SQL sucks for application development. Most web apps are programs that construct little sql statement mini-programs
by concatenating together strings. This completely ungainly. SQL is vender specific so you are tied to the vendors
-database. This means development suffers emensely. Unit testing any piece of code that involves database access is quite
-problematic becuase the unit test will be slow and will depend on a server running on another maching which can't be
-bootstraped as part of the test code. This could easily be solved if vendors provided a light-weight drop in
+database. This means development suffers immensely. Unit testing any piece of code that involves database access is quite
+problematic because the unit test will be slow and will depend on a server running on another machine which can't be
+bootstrapped as part of the test code. This could easily be solved if vendors provided a light-weight drop in
replacement that could be used for testing. Approximately 10000 products attempt to work around this problem
-by automatically mapping your application to sql. This has been called the vietnam of computer science (refering to the
+by automatically mapping your application to sql. This has been called the Vietnam of Computer Science (referring to the
American war, not the country...and of course computer scientists are not particularly interested in this problem as
-it is too pratical).
+it is too practical).

To begin to understand the problems with relational databases, let's first review some basic facts

@@ -54,9 +54,9 @@ our greatest enemy followed by the network. We can iterate over XXX items of an
look-up a single item via a log(n) b-tree index. We can iterate over YYY items in the time required to complete
a mysql request.

-The relational model is great, when accessed programmatically, but it is best for little applications (of which ther are many).
+The relational model is great, when accessed programmatically, but it is best for little applications (of which there are many).

-In a high scalbility scenario with a shared db, the advantages of the relational model disappear. No longer do you have
+In a high scalability scenario with a shared db, the advantages of the relational model disappear. No longer do you have
flexibility, what you have is a system in which most operations will bring the database to its knees along with
everyone depending on it. Think about the average table, of the set of possible queries, only a few can be issued without
difficulty. With this being the case, the abstraction of the lookup structure becomes a major problem--it is impossible to
@@ -72,7 +72,7 @@ simple puts and gets, and each operation goes first to the cache and then to the
faster than the db, but it can be distributed whereas the db cannot.

There are two basic caching strategies
-1. Local read/write with background repication
+1. Local read/write with background replication
2. Remote access

In the first strategy we try to keep each object on each server. We read and write to the local cache, and in the background