# Reviewing Online Transactional Databases

### Introduction

So far in structuring our databases, we have developed a schema for an online transactional processing (OLTP) database.  This is the relational database design behind most web applications.  It's a structure optimized for recording user transactions as they happen, where we may insert or update records multiple times a second.


However, this design is not optimal for analytics queries, where our data is updated much less often.  In this lesson, we'll review OLTP, explore some of the issues with it, and learn about a new database structure for performing analytical queries.

### 1. Online Transactional Processing: Changing Fast

Online transactional processing is meant for databases that are *transactional*, meaning that users are constantly inserting and updating small pieces of data.  For example, the database behind any consumer-facing website is transactional.

So then what databases are not *transactional*? Well, a database used for analytics does not have a lot of transactions.  Instead, we simply copy the data from our OLTP database over to another database, and perform queries on it.  Rather than users interacting with this database multiple times a second, we rarely insert or update data other than bulk copying additional data, maybe on a daily basis.
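
To make the bulk copy concrete, here is a minimal sketch using Python's built-in `sqlite3` module.  The two in-memory databases, the `appointments` columns, and the `bulk_copy` helper are all assumptions made for illustration; a real pipeline would likely use separate database servers and a scheduled job.

```python
import sqlite3

# Hypothetical stand-ins: two in-memory SQLite databases play the roles of
# the OLTP database and the separate analytics database.
oltp = sqlite3.connect(":memory:")
analytics = sqlite3.connect(":memory:")

schema = """
CREATE TABLE appointments (
    id INTEGER PRIMARY KEY,
    patient_id INTEGER NOT NULL,
    doctor_id INTEGER NOT NULL
)
"""
oltp.execute(schema)
analytics.execute(schema)

# During the day, the application writes small rows to the OLTP database.
with oltp:
    oltp.executemany(
        "INSERT INTO appointments (patient_id, doctor_id) VALUES (?, ?)",
        [(1, 10), (2, 10), (1, 11)],
    )


def bulk_copy(source, target):
    """Copy every appointment row from source to target in one batch."""
    rows = source.execute(
        "SELECT id, patient_id, doctor_id FROM appointments"
    ).fetchall()
    with target:  # one transaction for the whole batch
        target.executemany(
            "INSERT OR REPLACE INTO appointments (id, patient_id, doctor_id) "
            "VALUES (?, ?, ?)",
            rows,
        )
    return len(rows)


copied = bulk_copy(oltp, analytics)

# Analytical queries then run against the copy, not the live OLTP database.
per_doctor = analytics.execute(
    "SELECT doctor_id, COUNT(*) FROM appointments "
    "GROUP BY doctor_id ORDER BY doctor_id"
).fetchall()
print(copied, per_doctor)
```

Notice that the analytics database only receives one large write per batch, while the counting query never touches the database that users are interacting with.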

Notice that our OLTP database serves a different goal than our analytical database.  The OLTP database may be used concurrently by hundreds of users.  It needs to accommodate high throughput and is insert- and update-intensive.  So one of the main goals with an OLTP database is speedy updates and reads.
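
As a rough sketch of what this workload looks like, each user action becomes a short transaction that touches a single row.  The example below again uses Python's `sqlite3` module; the `patients` column names, the helper functions, and the sample values are assumptions for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical table; the column names are assumed for illustration.
conn.execute("""
CREATE TABLE patients (
    id INTEGER PRIMARY KEY,
    first_name TEXT NOT NULL,
    last_name TEXT NOT NULL,
    birthday TEXT
)
""")


def register_patient(conn, first_name, last_name, birthday):
    """Insert one row in its own short transaction and return the new id."""
    with conn:
        cur = conn.execute(
            "INSERT INTO patients (first_name, last_name, birthday) "
            "VALUES (?, ?, ?)",
            (first_name, last_name, birthday),
        )
    return cur.lastrowid


def rename_patient(conn, patient_id, last_name):
    """Update a single row, located by its primary key."""
    with conn:
        conn.execute(
            "UPDATE patients SET last_name = ? WHERE id = ?",
            (last_name, patient_id),
        )


patient_id = register_patient(conn, "Ada", "Byron", "1815-12-10")
rename_patient(conn, patient_id, "Lovelace")

# A point read by primary key, the typical OLTP lookup.
row = conn.execute(
    "SELECT first_name, last_name FROM patients WHERE id = ?",
    (patient_id,),
).fetchone()
print(row)
```

Each statement here reads or writes one row by its primary key, which is the kind of operation an OLTP database is tuned to handle many times a second.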

We serve this goal by having tables that are relatively small -- say 5-7 columns or fewer, which speeds up our read and write transactions. 

For example, our patients table has just a few columns.

> <img src="./patients-table.png" width="60%">

### 2. OLTP: Single Source of Truth

So we saw one reason why an OLTP database has tables with relatively few columns: it allows for speedy inserts and updates.  There is another reason why an OLTP database tends to have tables with a relatively small number of columns, and that is to maintain a single source of truth.

Remember that with an OLTP database, we try to prevent duplication of values in our database.  So if we are creating a database to manage doctors' appointments, we make sure that we only list a doctor's or patient's name in one record.  If we need to reference that patient elsewhere, say in an appointment, we instead just reference the `patient_id`.  And of course, the same goes for the doctor.

> <img src="./doctors-patients-join.png" width="60%">

So we only list the patient information one time in our database, and if we ever need to change the patient's name, we only need to provide the update in one location, and not across multiple records.

> <img src="./patients-table.png" width="60%">

When we eliminate this kind of repetition, we have *normalized* our database.  The level of normalization we are aiming for here is called *third normal form* (3NF).
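
The single source of truth described above can be sketched with Python's `sqlite3` module.  The table and column names below are assumptions based on the doctors and patients example, and the sample names are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite leaves foreign key enforcement off unless we turn it on.
conn.execute("PRAGMA foreign_keys = ON")

# Assumed schema: appointments store only ids, never names.
conn.executescript("""
CREATE TABLE patients (id INTEGER PRIMARY KEY, first_name TEXT, last_name TEXT);
CREATE TABLE doctors (id INTEGER PRIMARY KEY, last_name TEXT);
CREATE TABLE appointments (
    id INTEGER PRIMARY KEY,
    patient_id INTEGER NOT NULL REFERENCES patients(id),
    doctor_id INTEGER NOT NULL REFERENCES doctors(id)
);
INSERT INTO patients VALUES (1, 'Sam', 'Smith');
INSERT INTO doctors VALUES (1, 'Lee'), (2, 'Patel');
INSERT INTO appointments (patient_id, doctor_id) VALUES (1, 1), (1, 2);
""")

# The name lives in exactly one row, so one UPDATE changes it everywhere.
with conn:
    updated = conn.execute(
        "UPDATE patients SET last_name = 'Jones' WHERE id = 1"
    ).rowcount

# Every appointment picks up the new name through the join.
names = conn.execute("""
    SELECT patients.last_name, doctors.last_name
    FROM appointments
    JOIN patients ON patients.id = appointments.patient_id
    JOIN doctors ON doctors.id = appointments.doctor_id
    ORDER BY appointments.id
""").fetchall()
print(updated, names)
```

Only one row is updated, yet both appointments report the new last name, because the appointments never stored the name in the first place.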

### 3. Tightly Coupled Information

What is the third normal form of a database?  Well, as one computer scientist put it:

> "[every] non-key [attribute] must provide a fact about the key, the whole key, and nothing but the key"

If we look at the patients table, we can see that all of our non-key attributes (first name, last name, and birthday) do in fact describe the key, `patients.id`.

> <img src="./patients-table.png" width="60%">

And if we look at the appointments table we can see that we do not have any non-key attributes. 

> <img src="./doctors-patients-join.png" width="60%">

So, here we see that we meet the requirement that "[every] non-key [attribute] must provide a fact about the key, the whole key, and nothing but the key".  
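
To see why this requirement matters, it can help to look at a table that breaks it.  The sketch below is a hypothetical counter-example, not part of our schema: the `appointments_denormalized` table and its `patient_last_name` column are invented here to show a non-key attribute that describes the patient rather than the appointment.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE patients (id INTEGER PRIMARY KEY, last_name TEXT);
-- Hypothetical counter-example: patient_last_name is a fact about the
-- patient, not about the appointment identified by this table's key.
CREATE TABLE appointments_denormalized (
    id INTEGER PRIMARY KEY,
    patient_id INTEGER NOT NULL,
    doctor_id INTEGER NOT NULL,
    patient_last_name TEXT
);
INSERT INTO patients VALUES (1, 'Smith');
INSERT INTO appointments_denormalized (patient_id, doctor_id, patient_last_name)
VALUES (1, 1, 'Smith'), (1, 2, 'Smith');
""")

# Renaming the patient in the patients table leaves the copied names stale.
with conn:
    conn.execute("UPDATE patients SET last_name = 'Jones' WHERE id = 1")

# Find appointments whose copied name no longer matches the patient record.
stale = conn.execute("""
    SELECT a.id, a.patient_last_name, p.last_name
    FROM appointments_denormalized AS a
    JOIN patients AS p ON p.id = a.patient_id
    WHERE a.patient_last_name <> p.last_name
    ORDER BY a.id
""").fetchall()
print(stale)
```

Both appointment rows come back as stale, since each one kept its own copy of the old name.  Keeping every non-key attribute about the key is what prevents this kind of disagreement.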

So perhaps this is another benefit of this structure.  In addition to providing for fast inserts and updates, and a single source of truth, it also keeps our tables well organized, so that each column gives information about the primary key.

### Summary

In this lesson, we reviewed some of the benefits of using an online transactional processing (OLTP) database structure.  We saw that this structure organizes our data into tables with a small number of columns, which provides for fast inserts and updates to our data.  With an OLTP database, we avoid duplicating information, promoting a single source of truth.  And when we achieve this, our data is in third normal form.  An added benefit is that the non-key columns in a table each describe the table's primary key.