# Ecommerce Modeling Reading

### Introduction

In the last lesson we were given a common use case of database modeling to store customer orders -- in this case, sneaker orders.


We started with data like the following:

<img src="./nike_data.png" width="100%">

[Data link](https://docs.google.com/spreadsheets/d/1AuTC1Me-Fm0_26VERGldctnThNrwWc8RbhIcAO4qa8c/edit#gid=0)

Ideally, we would wind up with a data model that looks like the following.

<img src="./nike-modeling.png">

In this lesson, we'll see how we got there.

### Beginning our Modeling

When working through a data modeling problem like the one above, there is a *transaction* that brings together different entities.  Here, the transaction is a purchase.

And we can begin to break up the entities with the who, what, and where questions.  Doing so, we can begin to identify a couple of initial tables:

> The names of the different tables may vary.

<img src="./id-products.png" width="70%">

> **Notice that** we got the above by thinking about different entities, like who and what.  But we could have gotten to the same place by identifying repetition in our data.  For example, in our original data the products show up multiple times.  And if we imagine that a user returned to our website and made another purchase, we would see that user's information repeated as well.

> <img src="./dup-data.png" width="100%">

### Connecting the tables

So now that we have our users and products, the next step may be to consider the relationship between them.

<img src="./id-products.png" width="70%">

A user has many products, and products have many users (defined by all of the users who purchase the product).  So we can place a join table to connect these, but really that connection is defined by another entity: an order.

> Take a look at our updated data modeling below.

<img src="./users-orders.png" width="100%">

So we can see that with the first order, a user bought jordans 22 for 125 dollars.  And with the second, the user bought pegasus 12 for 100 dollars.
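
The structure at this stage can be sketched with Python's built-in `sqlite3` module. This is a minimal sketch, not the lesson's actual schema: the `name` and `price` columns, storing the price on `products`, and the `1/1/22` date for both rows are assumptions made for illustration. Only `user_id`, `purchase_date`, and the product names and prices come from the text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
-- Assumption: the price lives on the product.
CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT, price INTEGER);
-- At this stage, each orders row joins one user to one product.
CREATE TABLE orders (
    id INTEGER PRIMARY KEY,
    user_id INTEGER REFERENCES users(id),
    product_id INTEGER REFERENCES products(id),
    purchase_date TEXT
);
INSERT INTO users VALUES (1, 'user 1');
INSERT INTO products VALUES (1, 'jordans 22', 125), (2, 'pegasus 12', 100);
INSERT INTO orders VALUES (1, 1, 1, '1/1/22'), (2, 1, 2, '1/1/22');
""")

# Follow each order to its user and its product.
rows = conn.execute("""
SELECT orders.id, users.name, products.name, products.price
FROM orders
JOIN users ON users.id = orders.user_id
JOIN products ON products.id = orders.product_id
ORDER BY orders.id
""").fetchall()

for row in rows:
    print(row)
```

Each row of `orders` points at exactly one user and one product, which is what lets it act as the join table between the two.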

### One more table

Looking at the current structure, you may notice a small problem.

<img src="./users-orders.png" width="100%">

The problem is that if we described what happened in the first two orders, we would likely call it one purchase of two items, not two separate orders.

This suggests that there should be an additional table, such as *line items*.  We would then represent that first purchase as one order with two different line items: jordans 22 and pegasus 12.

> You can think of a line item as each individual item on a receipt, with the receipt as a whole recording the order.

So with our updated data modeling, we would describe this as an order has many line items, and a line item belongs to an order.

With this update, now our data will look like the following:

<img src="./purchases.png" width="100%">

So now if we tell a story with this data, looking at orders, we can see that user 1 made two orders: one on 1/1/22 and another on 10/1/22.  To see what they bought in the first order, we look at the line_items table and see jordans 22 and pegasus 12.
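
The same story can be told in code. Here is a minimal sketch using Python's built-in `sqlite3` module, assuming hypothetical column names (`order_id`, `product_id`, `name`, `price`) and a hypothetical helper, `items_in_order`. The text only describes the items in the first order, so the second order is left without line items.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT, price INTEGER);
-- An order now records only who bought and when.
CREATE TABLE orders (
    id INTEGER PRIMARY KEY,
    user_id INTEGER REFERENCES users(id),
    purchase_date TEXT
);
-- A line item belongs to an order and points at one product.
CREATE TABLE line_items (
    id INTEGER PRIMARY KEY,
    order_id INTEGER REFERENCES orders(id),
    product_id INTEGER REFERENCES products(id)
);
INSERT INTO users VALUES (1, 'user 1');
INSERT INTO products VALUES (1, 'jordans 22', 125), (2, 'pegasus 12', 100);
INSERT INTO orders VALUES (1, 1, '1/1/22'), (2, 1, '10/1/22');
-- Only the first order's items are described, so order 2 has none here.
INSERT INTO line_items VALUES (1, 1, 1), (2, 1, 2);
""")


def items_in_order(conn, order_id):
    """Return the product names on one order, in line item order."""
    rows = conn.execute("""
        SELECT products.name
        FROM line_items
        JOIN products ON products.id = line_items.product_id
        WHERE line_items.order_id = ?
        ORDER BY line_items.id
    """, (order_id,)).fetchall()
    return [name for (name,) in rows]


# Dates of every order placed by user 1.
order_dates = [date for (date,) in conn.execute(
    "SELECT purchase_date FROM orders WHERE user_id = 1 ORDER BY id")]
first_order_items = items_in_order(conn, 1)

print(order_dates)
print(first_order_items)
```

Note that `orders` no longer has a `product_id` column. The link from an order to its products now goes through `line_items`.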

### Another route to line items

So above we identified the line items table mainly because it simply did not sound right to say that a user bought two different items on the same date.  It seemed like something was missing -- and that was that an order had many line items.

But we could have seen this issue another way.  And that is our old technique of looking for repetition in the data.

<img src="./users-orders.png" width="100%">

We can see that repetition in the first two rows of the orders table.  Notice that user_id and purchase_date have the same values for orders 1 and 2.  This indicates that these rows are part of the same thing, something not yet represented in our data model: they are two line items from the same order.
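
This kind of repetition can also be found with a query. The sketch below uses Python's built-in `sqlite3` module to rebuild the earlier orders table and group rows by `user_id` and `purchase_date`. The `product_id` column and the third row are hypothetical, added only so that one group is not repeated.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- The earlier structure: one orders row per product purchased.
CREATE TABLE orders (
    id INTEGER PRIMARY KEY,
    user_id INTEGER,
    product_id INTEGER,
    purchase_date TEXT
);
INSERT INTO orders VALUES
    (1, 1, 1, '1/1/22'),
    (2, 1, 2, '1/1/22'),
    (3, 1, 1, '10/1/22');  -- hypothetical third row
""")

# Find (user_id, purchase_date) pairs that appear on more than one row.
repeated = conn.execute("""
SELECT user_id, purchase_date, COUNT(*) AS row_count
FROM orders
GROUP BY user_id, purchase_date
HAVING COUNT(*) > 1
""").fetchall()

print(repeated)
```

A repeated pair is a hint, not proof, that a table is missing. A user could place two separate orders on the same day, so it is still worth checking the result against how the business would describe the purchase.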

Our final data modeling looks like the following.

<img src="./nike-modeling.png" width="100%">