# Why Analytical Databases

### Introduction

So far, we have worked with databases that are designed to power modern web applications. In other words, these databases are generally used to respond to requests from customers.

For example, a customer of an ecommerce application may want to see their previous orders, or the status of a current order. To show this, the application performs a database query to look up the information.

But data is increasingly being used not just for *customers of the application*, but also for *internal members of a company*. For example, data analysts may want to calculate the most profitable products, marketing analysts may want to discover what drives customer acquisition, and machine learning engineers may want to use data to forecast future sales. As we'll see, to allow internal users to answer these questions, companies typically set up one or more analytical databases.

Before learning about the ins and outs of analytical databases, let's make sure we understand why we needed a database to power our web application in the first place. Then we can look at some of the additional data we may need in an analytical database.

### 1. Application data needs

So far, everything we have learned about databases applies to an application database. This kind of database is referred to as an **online transaction processing** database, or OLTP for short. It is our OLTP databases that power customer-facing applications.

For example, if we consider an ecommerce store, our web application needs a database to display different data on the page.

<img src="./shopify_store.png" width="80%">

Consider all of the data on the product page above. We are displaying the product name, the available sizes, the price, and a product description. When a user visits a product page, a query like the following is run:

```SQL
SELECT * FROM products WHERE products.id = 10
```

And then all of the needed data about that product is returned to be displayed on the page.
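
To make the lookup concrete, here is a minimal sketch of how application code might run that query, using Python's built-in `sqlite3` module and an in-memory database. The `products` columns, the sample rows, and the `get_product` helper are assumptions made for illustration; a real store's schema and database engine will differ.

```python
import sqlite3

# In-memory database standing in for the application's OLTP database.
conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row  # lets us read columns by name

# Hypothetical, simplified products table (a real schema has many more columns).
conn.execute(
    "CREATE TABLE products ("
    "id INTEGER PRIMARY KEY, name TEXT, size TEXT, price REAL, description TEXT)"
)
conn.executemany(
    "INSERT INTO products (id, name, size, price, description) VALUES (?, ?, ?, ?, ?)",
    [
        (9, "Canvas Tote", "One Size", 18.0, "A sturdy everyday bag."),
        (10, "Linen Shirt", "M", 45.0, "A lightweight summer shirt."),
    ],
)


def get_product(conn, product_id):
    """Return a single product as a dict, or None if the id does not exist."""
    row = conn.execute(
        "SELECT * FROM products WHERE products.id = ?", (product_id,)
    ).fetchone()
    return dict(row) if row is not None else None


product = get_product(conn, 10)
print(product)
```

Notice that the query touches a single row, identified by its primary key. Requests like this, small and specific to one customer's page view, are typical of what an application database serves.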

<img src="./product_row.png" width="60%">

The point for now is that we need a database to keep track of all of the information the web application needs in order to run. This may include **a lot** of information. For example, these are just some of the tables and related columns that power a Shopify store.

> <img src="./shopify-erd.png" width="60%">

Looks like a lot.  But as much data as we are storing to power a web application, there may be *additional* data that we want to load into our analytics database.

### 2. Analytical data needs

For an analytical database, we may need some data that comes from our application's OLTP database, but we may also want data that comes from other sources. Here are some examples of additional data we may want to work with:

1. Data from third-party tools
    * The marketing team may be using **Google Analytics** to discover what is driving users to the website
    * Perhaps it's a tool like **Mixpanel** to see what users click on once they visit the site
    * And perhaps they are using something like **HubSpot** to track sales calls to potential customers
    
None of this data would be stored in the OLTP database, as it isn't needed to power the application, and it is collected by tools built by someone else. But we may still want it in an analytics database so we can analyze the marketing funnel and understand the customer journey.
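
As a rough sketch of why it helps to have both kinds of data in one place, the example below loads a few order rows (as if copied from the OLTP database) next to a few marketing rows (as if exported from a third-party tool), then calculates revenue per traffic source. The table names, columns, and values are all hypothetical, and an in-memory SQLite database stands in for the analytics database.

```python
import sqlite3

# In-memory database standing in for an analytics database.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Hypothetical copy of order data from the application's OLTP database.
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL);
    INSERT INTO orders VALUES (1, 101, 40.0), (2, 102, 25.0), (3, 101, 15.0), (4, 103, 60.0);

    -- Hypothetical export from a third-party marketing tool:
    -- one row per customer with the source that brought them to the site.
    CREATE TABLE marketing_sessions (customer_id INTEGER, source TEXT);
    INSERT INTO marketing_sessions VALUES (101, 'paid_search'), (102, 'email'), (103, 'paid_search');
    """
)

# Join the two sources to see which acquisition channel drives the most revenue.
revenue_by_source = conn.execute(
    """
    SELECT m.source,
           COUNT(DISTINCT o.customer_id) AS customers,
           SUM(o.total) AS revenue
    FROM marketing_sessions AS m
    JOIN orders AS o ON o.customer_id = m.customer_id
    GROUP BY m.source
    ORDER BY revenue DESC
    """
).fetchall()

for source, customers, revenue in revenue_by_source:
    print(source, customers, revenue)
```

The web application never needs to run a query like this, but it is exactly the kind of question a marketing analyst would ask, and it can only be answered once both data sources live in the same database.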

2. External Trends

Above, we saw that we may want an analytics database to store some additional marketing or sales data. But we may also want to store data that is external to our product or marketing funnel. For example, we may want to store data that gives us a sense of the broader market. We could imagine pulling in data to answer questions about the following:

* Popular competitor products 
* Competitor pricing
* Demographic or consumer trends

This external information could be used by data professionals to benchmark current performance or predict future performance. And this data would likely not need to be stored in the OLTP database, as it is not needed to help the web application run.
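
For instance, competitor pricing could be benchmarked against our own prices with a query along these lines. This is only a sketch: the `competitor_prices` table, the shop names, and every price are made up, and matching products by name is a simplifying assumption.

```python
import sqlite3

# In-memory database standing in for an analytics database.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Hypothetical copy of our own product data from the OLTP database.
    CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT, price REAL);
    INSERT INTO products VALUES (1, 'Linen Shirt', 45.0), (2, 'Canvas Tote', 18.0);

    -- Hypothetical external data: prices collected for comparable competitor products.
    CREATE TABLE competitor_prices (product_name TEXT, competitor TEXT, price REAL);
    INSERT INTO competitor_prices VALUES
        ('Linen Shirt', 'Shop A', 40.0),
        ('Linen Shirt', 'Shop B', 50.0),
        ('Canvas Tote', 'Shop A', 20.0);
    """
)

# Compare each of our prices with the average competitor price for the same product.
price_benchmark = conn.execute(
    """
    SELECT p.name,
           p.price AS our_price,
           AVG(c.price) AS avg_competitor_price
    FROM products AS p
    JOIN competitor_prices AS c ON c.product_name = p.name
    GROUP BY p.id, p.name, p.price
    ORDER BY p.name
    """
).fetchall()

for name, our_price, avg_competitor_price in price_benchmark:
    print(name, our_price, avg_competitor_price)
```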

### Summary

In this lesson, we saw that the data needs of an internal team may be different from the data needed to power a typical web application. Because of this, we may want to store data that does not belong in a web application's database. For example, we saw that marketing data tracked through tools like Google Analytics, Mixpanel, or HubSpot may be useful to analyze in an analytical database to better understand the customer acquisition lifecycle.

Then, we also saw that data external to the company may be useful, whether to perform competitive analysis or to track demographic or consumer trends. This data would likely be stored in a separate analytical database, as it is not needed to power the web application, but is instead used for internal analysis.