# Reviewing Online Transactional Databases

### Introduction

So far, we have structured our databases with a data model suited to online transactional processing (OLTP).  This is the relational database design behind most web applications.  It's a structure optimized for keeping a constantly updating record of user transactions -- where, say, we write to the database multiple times a second.

However, this design is not optimal for analytics queries, where our data is updated much less often.  So when we structure our data in an analytical database like Snowflake, we'll learn some different modeling techniques -- most notably the star schema.  Before moving on to the star schema, let's review OLTP.  We'll explore why data is structured the way it is in an OLTP database, which prepares us to learn a different structure for performing analytical queries.

### 1. Online Transactional Processing: Changing Fast

Online transactional processing is meant for databases that are *transactional*.  For example, the database behind any consumer-facing website is transactional: it records many small inserts and updates as users interact with the site.

So then, which databases are not *transactional*?  Well, our analytics database does not have a lot of transactions.  Instead, we simply load our data into our analytical database periodically, and then perform queries on it.  So with our analytical database, we will not really update or insert data other than by bulk copying additional data, maybe on a daily basis.
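To make the contrast concrete, here is a minimal sketch using Python's built-in `sqlite3` module.  The `rental` and `rental_history` tables, their columns, and the helper functions are simplified stand-ins made up for illustration, not the actual Pagila schema.  The first half shows the transactional pattern (many small writes, each its own transaction), and the second half shows the analytical pattern (one periodic bulk load, followed by queries).

```python
import sqlite3

# --- Transactional (OLTP) pattern: many small, independent writes ---
# Hypothetical, simplified table; not the real Pagila `rental` table.
oltp = sqlite3.connect(":memory:")
oltp.execute(
    "CREATE TABLE rental ("
    "rental_id INTEGER PRIMARY KEY, customer_id INTEGER, returned INTEGER DEFAULT 0)"
)


def rent_film(conn, customer_id):
    """Insert one rental in its own transaction and return its id."""
    with conn:  # commits on success, rolls back on error
        cur = conn.execute(
            "INSERT INTO rental (customer_id) VALUES (?)", (customer_id,)
        )
    return cur.lastrowid


def return_film(conn, rental_id):
    """Update a single rental row in its own transaction."""
    with conn:
        conn.execute(
            "UPDATE rental SET returned = 1 WHERE rental_id = ?", (rental_id,)
        )


first = rent_film(oltp, customer_id=1)
second = rent_film(oltp, customer_id=2)
return_film(oltp, first)

# --- Analytical pattern: periodic bulk load, then read-only queries ---
warehouse = sqlite3.connect(":memory:")
warehouse.execute(
    "CREATE TABLE rental_history (rental_id INTEGER, customer_id INTEGER, returned INTEGER)"
)

# Pretend this is the daily batch copied over from the OLTP database.
daily_batch = oltp.execute(
    "SELECT rental_id, customer_id, returned FROM rental"
).fetchall()
with warehouse:
    warehouse.executemany(
        "INSERT INTO rental_history VALUES (?, ?, ?)", daily_batch
    )

returned_count = warehouse.execute(
    "SELECT COUNT(*) FROM rental_history WHERE returned = 1"
).fetchone()[0]
print(returned_count)
```

In a real deployment the two sides would be different systems (for example, a Postgres application database and a warehouse), but the shape of the workload is the same: frequent single-row writes on one side, and occasional batch loads plus queries on the other.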

Notice that our OLTP database serves a different goal than our analytical database.  An OLTP database may be used by thousands of users simultaneously.  So it needs to accommodate high throughput for insert- and update-intensive operations.  So one of the main goals of an OLTP database is speedy updates and reads.

We serve this goal by having tables that are relatively small -- say 5-7 columns or fewer, which speeds up our read and write transactions. 

For example, if we look at the actor table in the Pagila database, we can see that it is just a few columns.

> <img src="./actor-table.png" width="90%">

### 2. OLTP: Single Source of Truth

So that's one reason why an OLTP database has tables with relatively few columns: it allows for speedy inserts and updates.  There is another reason why an OLTP database tends to have tables with a small number of columns, and that is to maintain a single source of truth.

Remember that with an OLTP database, we try to prevent duplication of values in our database.  So for example, with our Pagila database, if we want to view all of the films that an actor has been in, we still only list the actor's name once -- in the `actor` table -- and then represent the connection between an actor and a film in the `film_actor` join table.

> <img src="./film-actor.png" width="90%">

So we only list the actor information one time in our database, and if we ever need to change the actor's name, we only need to provide the update in one location, and not across multiple records.

> <img src="./actor-list.png" width="80%">

When we eliminate this repetition, we have *normalized* our database.  The level of normalization we are aiming for here is called *third normal form* (3NF).
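Here is a small sketch of the single source of truth idea, again using Python's built-in `sqlite3` module.  The table and column names follow the `actor`, `film`, and `film_actor` tables discussed above, but the schema is trimmed down and the rows are made-up sample values, not data from Pagila.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE actor (
        actor_id INTEGER PRIMARY KEY, first_name TEXT, last_name TEXT
    );
    CREATE TABLE film (film_id INTEGER PRIMARY KEY, title TEXT);
    -- Join table: one row per (actor, film) pairing, and no actor names.
    CREATE TABLE film_actor (
        actor_id INTEGER REFERENCES actor(actor_id),
        film_id INTEGER REFERENCES film(film_id),
        PRIMARY KEY (actor_id, film_id)
    );
    -- Made-up sample rows for illustration.
    INSERT INTO actor VALUES (1, 'ALEX', 'SMITH');
    INSERT INTO film VALUES (10, 'FILM A'), (20, 'FILM B');
    INSERT INTO film_actor VALUES (1, 10), (1, 20);
    """
)

# The actor appears in two films, but the name is stored in exactly one row,
# so changing it is a single-row update.
with conn:
    cur = conn.execute(
        "UPDATE actor SET last_name = 'JONES' WHERE actor_id = ?", (1,)
    )
updated_rows = cur.rowcount

# Every film listing picks up the new name through the join.
rows = conn.execute(
    """
    SELECT a.last_name, f.title
    FROM actor a
    JOIN film_actor fa ON fa.actor_id = a.actor_id
    JOIN film f ON f.film_id = fa.film_id
    WHERE a.actor_id = ?
    ORDER BY f.film_id
    """,
    (1,),
).fetchall()
print(updated_rows, rows)
```

Had we instead stored the actor's name on every film row, the same rename would need to touch one row per film, and any row we missed would leave the database disagreeing with itself.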

### 3. Tightly Coupled Information

What is third normal form?  As one computer scientist put it:

> "[every] non-key [attribute] must provide a fact about the key, the whole key, and nothing but the key"

If we look at the `actor` table, we can see that all of our non-key attributes -- first name and last name -- do in fact describe the key, `actor_id`.

> <img src="./actor-list.png" width="60%">

And if we look at the `film_actor` table we can see that we do not have any non-key attributes (other than the `last_update` column). 

> <img src="./film-actor-table.png" width="60%">

So, here we see that we meet the requirement that "[every] non-key [attribute] must provide a fact about the key, the whole key, and nothing but the key".  
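To see what it would look like to *break* this rule, consider a hypothetical, denormalized version of `film_actor` that also stores the actor's first name.  Its key is the pair (`actor_id`, `film_id`), but `first_name` is determined by `actor_id` alone, so it is a fact about only part of the key rather than "the whole key".  The sketch below is an illustration, not a full normalization checker: the helper `depends_only_on` and the sample rows are assumptions made up for this example, and a check over sample data can only show that the data is consistent with a dependency, not prove that the dependency holds in general.

```python
def depends_only_on(rows, determinant, attribute):
    """Return True if, in `rows`, each combination of values in the
    `determinant` columns maps to exactly one value of `attribute`."""
    seen = {}
    for row in rows:
        key = tuple(row[col] for col in determinant)
        # setdefault stores the first value seen for this key and returns it.
        if seen.setdefault(key, row[attribute]) != row[attribute]:
            return False
    return True


# Hypothetical denormalized rows: key is (actor_id, film_id).
denormalized_film_actor = [
    {"actor_id": 1, "film_id": 10, "first_name": "ALEX"},
    {"actor_id": 1, "film_id": 20, "first_name": "ALEX"},
    {"actor_id": 2, "film_id": 10, "first_name": "SAM"},
]

# first_name is fixed by actor_id alone: a fact about only part of the key.
partial_dependency = depends_only_on(
    denormalized_film_actor, ["actor_id"], "first_name"
)

# film_id alone does not determine first_name (film 10 has two actors).
by_film_only = depends_only_on(
    denormalized_film_actor, ["film_id"], "first_name"
)
print(partial_dependency, by_film_only)
```

Moving `first_name` out to a table keyed by `actor_id` alone removes the partial dependency, which is the arrangement the `actor` and `film_actor` tables already use.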

So this is another benefit of this structure.  In addition to providing for fast inserts and updates, and a single source of truth, it also keeps our tables well organized, so that each column gives information about the primary key.

As you may have guessed, we'll model our data differently in analytical databases, but we'll see how in a little bit.  For now, let's make sure we understand modeling and querying in transactional databases, and then we can see how we model our data differently in OLAP databases.

### Summary

In this lesson, we reviewed some of the benefits of using an online transactional processing (OLTP) database structure.  We saw that this structure organizes our data into tables with a small number of columns, which provides for fast inserts and updates to our data.  With an OLTP database, we avoid duplicating information, promoting a single source of truth.  When we achieve this, our data is in third normal form.  An added benefit is that the non-key columns in a table each provide a fact about that table's primary key.