# The ELT Pattern

### Introduction

As we know, a company generally has one database, the OLTP database, that backs the web application, and a separate data warehouse, the analytics (OLAP) database, for storing and querying analytics data.  As data engineers, before we can use this separate database, we first have to move data into it.  To do so, we'll use a pattern called **Extract, Load, and Transform**, or ELT for short.  Let's quickly go through the diagram below, and then we can move more slowly through the various steps.

> In the diagram below, we first extract data from our OLTP database or various APIs, and then load it into our OLAP database (below, the `source_tables` table).  Then we select data from our source table, transform it, and load it into the target tables (here, the products table).

<img src="./elt_paradigm.jpg" width="80%">

### 1. Extracting Data 

Let's start by focusing on the extract portion of the diagram above.  Extracting data just means retrieving it.  We've seen various ways to retrieve our data.  Here are a couple of examples.

1. Extracting from APIs

Remember that in an analytical database, we may want data from various outside sources, like Google Analytics to track website traffic, Mixpanel to see what users click on when visiting our site, or Hubspot to track how users open emails or respond during sales calls.

Each of these outside tools has an API that allows developers to *extract* the data.

```python
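import requests

# Placeholder URL: a real request needs a specific endpoint and credentials.
# requests requires the scheme (https://), otherwise it raises MissingSchema.
url = "https://www.api.mixpanel.com"
response = requests.get(url, timeout=10)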
```

2. Extracting from a database

Of course, some of the data may also be in our application database, our OLTP, so we need to extract data from the OLTP and load it into our OLAP as well.  For example, information about how much customers spend on various products, or which products are most popular, is most readily available in the OLTP database.  We would want to extract that data for further analysis.

```SQL
SELECT * FROM orders;
```
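The query above can be run from Python through a database connection.  Here is a minimal sketch, assuming an in-memory SQLite database as a stand-in for the OLTP database (a real application database would more likely be Postgres or MySQL).  The `orders` columns, the sample rows, and the `extract_orders` helper are made up for illustration.

```python
import sqlite3

# In-memory SQLite stands in for the OLTP database (an assumption).
oltp = sqlite3.connect(":memory:")
oltp.execute("CREATE TABLE orders (id INTEGER, product TEXT, amount TEXT)")
oltp.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, "keyboard", "49.50"), (2, "mouse", "19.25")],  # made-up sample rows
)


def extract_orders(conn):
    """Retrieve every row from the orders table, unchanged."""
    cursor = conn.execute("SELECT * FROM orders")
    return cursor.fetchall()


rows = extract_orders(oltp)
print(rows)
```

Notice that the extract step does not clean anything up: the `amount` values are still strings, exactly as they were stored.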

### 2. Loading Data

Once we extract the data, we then need to load it into our OLAP database.  To accomplish this, we first use SQL to create the tables that will hold our data, and then insert the data with an `INSERT INTO` statement.
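The two steps can be sketched as follows.  This is only an illustration: an in-memory SQLite database stands in for the OLAP database, and the `source_orders` table, its columns, and the sample rows are hypothetical.

```python
import sqlite3

# Rows as they might arrive from the extract step (made-up sample data).
extracted_rows = [
    (1, "keyboard", "49.50"),
    (2, "mouse", "19.25"),
]

# In-memory SQLite stands in for the analytics (OLAP) database.
olap = sqlite3.connect(":memory:")

# Step 1: create the table that will hold the raw data.
olap.execute(
    """
    CREATE TABLE IF NOT EXISTS source_orders (
        id INTEGER,
        product TEXT,
        amount TEXT
    )
    """
)

# Step 2: insert the extracted rows as-is with INSERT INTO.
olap.executemany(
    "INSERT INTO source_orders (id, product, amount) VALUES (?, ?, ?)",
    extracted_rows,
)
olap.commit()

loaded_count = olap.execute("SELECT COUNT(*) FROM source_orders").fetchone()[0]
print(loaded_count)
```

The `?` placeholders let the database driver handle quoting, which is safer than building the `INSERT INTO` statement with string formatting.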

### 3. Transforming Data

Now because we are loading the data directly into the database as it comes in from an API or our OLTP database, this data arrives raw and unformatted.

So after we load this raw data into the database, we'll then need to transform the data to get it into the structure we want.  As we know, when we get data from an API:

* there is generally more data than we need,
* that data may not be of the correct data type,
* and the data may not be properly separated into different tables.
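To make the three points above concrete, here is a small sketch applied to a single raw record.  The field names and the `transform` helper are invented for illustration, and plain Python is used only to show the ideas; in an ELT pipeline these steps typically run as SQL inside the analytics database.

```python
# A raw record as it might come back from an API (all fields are made up).
raw_order = {
    "order_id": "1001",
    "amount": "49.50",
    "customer_name": "Ada",
    "customer_email": "ada@example.com",
    "tracking_pixel": "abc123",  # a field we do not need
}


def transform(record):
    """Return (order, customer) dictionaries built from one raw record."""
    # 1. Keep only the fields we need (tracking_pixel is dropped).
    # 2. Cast strings to the correct data types.
    order = {
        "order_id": int(record["order_id"]),
        "amount": float(record["amount"]),
        "customer_email": record["customer_email"],
    }
    # 3. Separate customer details into their own table-like structure.
    customer = {
        "email": record["customer_email"],
        "name": record["customer_name"],
    }
    return order, customer


order, customer = transform(raw_order)
print(order)
print(customer)
```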

These are each different kinds of transformations that we may need to apply, once the data is loaded into our database.  Let's take another look at our ELT diagram, this time focusing on the `load`, `transform`, and `load again` steps.

<img src="./elt_paradigm.jpg" width="80%">

What the diagram indicates is that we first load some messy data into our database.  The original location where we load the data is called **staging**, and the tables with the unformatted data are called our **source tables**.  We then transform that data and load it into a more properly formatted set of tables, often called our **mart tables**.
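Here is a minimal sketch of that staging-to-mart step, run as SQL inside the database.  It assumes an in-memory SQLite database in place of the analytics database; the `source_products` table, its columns, and the sample rows are hypothetical, and `products` plays the role of the mart table.

```python
import sqlite3

# In-memory SQLite stands in for the analytics database.
olap = sqlite3.connect(":memory:")

# Staging: a source table holding raw, untyped data (made-up rows).
olap.execute(
    "CREATE TABLE source_products (id TEXT, name TEXT, price TEXT, raw_payload TEXT)"
)
olap.executemany(
    "INSERT INTO source_products VALUES (?, ?, ?, ?)",
    [("1", "keyboard", "49.50", "{}"), ("2", "mouse", "19.25", "{}")],
)

# Mart: select from the source table, transform, and load into the target table.
# The unneeded raw_payload column is dropped and the text columns are cast.
olap.execute(
    """
    CREATE TABLE products AS
    SELECT
        CAST(id AS INTEGER) AS id,
        name,
        CAST(price AS REAL) AS price
    FROM source_products
    """
)

products = olap.execute(
    "SELECT id, name, price FROM products ORDER BY id"
).fetchall()
print(products)
```

The source table is left untouched, so the transformation can be rerun or changed later without extracting the data again.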

Once we have our data formatted into our mart tables, we can then query our data, and generate reports from our properly formatted data to deliver to various stakeholders.
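As a rough illustration of such a report, the sketch below aggregates revenue per product from a clean mart table.  The `orders` table, its columns, and the rows are made up, and in-memory SQLite again stands in for the analytics database.

```python
import sqlite3

olap = sqlite3.connect(":memory:")

# A small mart table with clean, typed data (made-up rows).
olap.execute("CREATE TABLE orders (id INTEGER, product TEXT, amount REAL)")
olap.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, "keyboard", 50.0), (2, "mouse", 20.0), (3, "keyboard", 50.0)],
)

# Report: number of orders and total revenue per product, highest revenue first.
report = olap.execute(
    """
    SELECT product, COUNT(*) AS order_count, SUM(amount) AS revenue
    FROM orders
    GROUP BY product
    ORDER BY revenue DESC
    """
).fetchall()

for product, order_count, revenue in report:
    print(f"{product}: {order_count} orders, {revenue:.2f} revenue")
```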

### Summary

In this lesson, we learned about the ELT paradigm, where data is first extracted from external sources like an API or an application database, and then loaded into the analytics database.  With the ELT pattern, the data is simply transferred into the analytics database without much, if any, initial transformation.  Once the data is loaded into the analytics database, further transformations occur so that it is well formatted for something like a data dashboard, or for reports for internal stakeholders.