# SQL Joins

Working with multiple tables at once.

The orders table never mentions company names; it only stores an account_id.  
The accounts table holds both the company name and its id, and each company has exactly one account, so the account_id on an order is enough to look up the company behind it.

**Database Normalization** - how the data is organized across the tables of the database. Three questions to keep in mind:
1. Are the tables storing logical groups of data?
2. Can I make changes in a single location, rather than in many tables holding the same information?
3. Can I access and manipulate data quickly and efficiently?
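
The "single location" question can be tried out with a small sketch. It uses Python's built-in sqlite3 module as a scratch database; the rows are made up, and only the table and column names follow these notes. Because the company name is stored once in accounts, a single UPDATE changes what every joined order shows.

```python
import sqlite3

# Hypothetical rows; only the table and column names follow these notes.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE accounts (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE orders (id INTEGER PRIMARY KEY, account_id INTEGER, total INTEGER);
INSERT INTO accounts VALUES (1001, 'Walmart');
INSERT INTO orders VALUES (1, 1001, 169), (2, 1001, 288);
""")

# The company name lives only in accounts, so one UPDATE is enough.
conn.execute("UPDATE accounts SET name = 'Walmart Inc.' WHERE id = 1001")

# Every order picks up the new name through the join.
names_on_orders = conn.execute("""
    SELECT orders.id, accounts.name
    FROM orders
    JOIN accounts
    ON orders.account_id = accounts.id
    ORDER BY orders.id;
""").fetchall()
print(names_on_orders)
```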

In most cases the database is already set up for you, so there is no need to go too in-depth.
Further reading: [Database Normalization](https://www.itprotoday.com/sql-server/sql-design-why-you-need-database-normalization)

## JOIN

JOIN combines the rows of two tables. The two tables need a column holding matching values, which is used to pair the rows up.

```sql
SELECT orders.*
FROM orders
JOIN accounts
ON orders.account_id = accounts.id;
```
We are selecting every column from orders, for the rows whose account_id matches an id in accounts.
- the table name always comes before the period
- the column name always comes after the period
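
To see the matching in action, here is a minimal sketch using Python's built-in sqlite3 module. The rows are invented for illustration (including an order whose account_id 9999 has no account); only the table and column names come from these notes. The plain JOIN keeps only the orders that find a matching account.

```python
import sqlite3

# Hypothetical rows; order 3 deliberately points at an account that does not exist.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE accounts (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE orders (id INTEGER PRIMARY KEY, account_id INTEGER, occurred_at TEXT);
INSERT INTO accounts VALUES (1001, 'Walmart'), (1011, 'Exxon Mobil');
INSERT INTO orders VALUES
    (1, 1001, '2015-10-06'),
    (2, 1011, '2016-12-26'),
    (3, 9999, '2016-01-02');
""")

# table.column on both sides of ON pairs each order with its account.
join_rows = conn.execute("""
    SELECT accounts.name, orders.occurred_at
    FROM orders
    JOIN accounts
    ON orders.account_id = accounts.id
    ORDER BY orders.id;
""").fetchall()

for name, occurred_at in join_rows:
    print(name, occurred_at)
```

Order 3 does not appear in the output because no row in accounts has id 9999.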

Examples:
1. If we want to pull only the account name and the dates on which that account placed an order, but none of the other columns:
```sql
SELECT accounts.name, orders.occurred_at
FROM orders
JOIN accounts
ON orders.account_id = accounts.id;
```
2. Alternatively, the query below pulls all the columns from both the accounts and orders tables:
```sql
SELECT *
FROM orders
JOIN accounts
ON orders.account_id = accounts.id;
```
3. And the first query you ran pulls all the information from only the orders table:
```sql
SELECT orders.*
FROM orders
JOIN accounts
ON orders.account_id = accounts.id;
```
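
The difference between these SELECT lists is easiest to see by looking at which columns come back. A rough sketch with Python's sqlite3 module follows; the three-column orders table is a cut-down, hypothetical version of the real one, and `column_names` is a helper made up for this example.

```python
import sqlite3

# Cut-down, hypothetical schemas: just enough columns to show the effect.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE accounts (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE orders (id INTEGER PRIMARY KEY, account_id INTEGER, occurred_at TEXT);
""")

def column_names(select_list):
    """Return the result column names for a given SELECT list over the same join."""
    cur = conn.execute(
        f"SELECT {select_list} FROM orders "
        "JOIN accounts ON orders.account_id = accounts.id"
    )
    return [col[0] for col in cur.description]

only_orders = column_names("orders.*")  # columns of orders only
everything = column_names("*")          # columns of orders, then accounts
print(only_orders)
print(everything)
```

Note that `SELECT *` returns two columns called id, one from each table, which is one reason to name columns explicitly.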

Questions.
1. Try pulling all the data from the accounts table, and all the data from the orders table.
```sql
SELECT *
FROM orders
JOIN accounts
ON orders.account_id = accounts.id;
```
2. Try pulling standard_qty, gloss_qty, and poster_qty from the orders table, and the website and the primary_poc from the accounts table.
```sql
SELECT orders.standard_qty, orders.gloss_qty, orders.poster_qty, accounts.website, accounts.primary_poc
FROM orders
JOIN accounts
ON orders.account_id = accounts.id;
```

These all work because the account_id values in orders match the id values in accounts.


### ERD

The ERD (entity relationship diagram) is a handy reference for seeing how the tables relate, and therefore which columns to use in a JOIN ... ON.  
![ERD](https://video.udacity-data.com/topher/2017/October/59e946e7_erd/erd.png)

- PK - Primary Key, a column that has a unique value for every row.
- FK - Foreign Key, a column in one table that refers to the primary key of another table, and so can be used to link the two tables.
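
As a sketch of what these two kinds of key mean in practice, the snippet below declares them in an in-memory SQLite database through Python's sqlite3 module. The reduced schema and rows are assumptions made for the example, and `try_insert` is a hypothetical helper; the real course database may declare its keys differently. Note that SQLite only enforces foreign keys once the `foreign_keys` pragma is switched on.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.executescript("""
CREATE TABLE accounts (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE orders (
    id INTEGER PRIMARY KEY,
    account_id INTEGER REFERENCES accounts(id)  -- FK pointing at the accounts PK
);
INSERT INTO accounts VALUES (1001, 'Walmart');
INSERT INTO orders VALUES (1, 1001);
""")

def try_insert(sql):
    """Run one INSERT and report whether the database accepted it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(sql)
        return "ok"
    except sqlite3.IntegrityError as exc:
        return f"rejected: {exc}"

dup_pk = try_insert("INSERT INTO accounts VALUES (1001, 'Duplicate')")  # PK must be unique
bad_fk = try_insert("INSERT INTO orders VALUES (2, 9999)")              # no such account
good = try_insert("INSERT INTO orders VALUES (3, 1001)")                # valid link
print(dup_pk, bad_fk, good, sep="\n")
```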



### Join more than one table

```sql
SELECT *
FROM web_events
JOIN accounts
ON web_events.account_id = accounts.id
JOIN orders
ON accounts.id = orders.account_id;
```
By using more than one JOIN we can create multiple links.

```sql
SELECT web_events.channel, accounts.name, orders.total
-- followed by the same FROM and JOIN clauses as the query above
```
And by listing columns in the SELECT clause we can pull just the ones we need.
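
One thing worth checking with a three-table join is how many rows come back. The sketch below runs the join on invented rows in an in-memory SQLite database (via Python's sqlite3 module): one account with 2 web events and 3 orders. Since web_events and orders are only related through the account, every event gets paired with every order of that account.

```python
import sqlite3

# Hypothetical rows: one account, 2 web events, 3 orders.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE accounts (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE web_events (id INTEGER PRIMARY KEY, account_id INTEGER, channel TEXT);
CREATE TABLE orders (id INTEGER PRIMARY KEY, account_id INTEGER, total INTEGER);
INSERT INTO accounts VALUES (1001, 'Walmart');
INSERT INTO web_events VALUES (1, 1001, 'direct'), (2, 1001, 'facebook');
INSERT INTO orders VALUES (1, 1001, 169), (2, 1001, 288), (3, 1001, 132);
""")

rows = conn.execute("""
    SELECT web_events.channel, accounts.name, orders.total
    FROM web_events
    JOIN accounts
    ON web_events.account_id = accounts.id
    JOIN orders
    ON accounts.id = orders.account_id;
""").fetchall()

# 2 events x 3 orders for the same account
print(len(rows))
```

So the result has 6 rows here, not 2 or 3, which matters if the joined rows are later counted or summed.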

To shorten the query to a more manageable length we can use aliases.
```sql
SELECT t1.column1 aliasname, t2.column2 aliasname2
FROM tablename AS t1
JOIN tablename2 AS t2
ON t1.id = t2.t1_id;  -- placeholder key columns
```
The AS keyword is optional, so the same query can also be written without it.
```sql
SELECT t1.column1 aliasname, t2.column2 aliasname2
FROM tablename t1
JOIN tablename2 t2
ON t1.id = t2.t1_id;  -- placeholder key columns
```
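
A quick way to convince yourself that AS really is optional is to run both spellings and compare. This is a sketch on made-up rows using Python's sqlite3 module; the aliases `account` and `units` are chosen just for the example.

```python
import sqlite3

# Hypothetical rows; table and column names follow these notes.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE accounts (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE orders (id INTEGER PRIMARY KEY, account_id INTEGER, total INTEGER);
INSERT INTO accounts VALUES (1001, 'Walmart');
INSERT INTO orders VALUES (1, 1001, 169), (2, 1001, 288);
""")

def run(sql):
    """Return (column names, rows) for a query."""
    cur = conn.execute(sql)
    return [col[0] for col in cur.description], cur.fetchall()

with_as = run("""
    SELECT a.name AS account, o.total AS units
    FROM orders AS o
    JOIN accounts AS a
    ON o.account_id = a.id
    ORDER BY o.id;
""")

without_as = run("""
    SELECT a.name account, o.total units
    FROM orders o
    JOIN accounts a
    ON o.account_id = a.id
    ORDER BY o.id;
""")

print(with_as == without_as)
```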

### Test Questions

1. Provide a table for all web_events associated with account name of Walmart. There should be three columns. Be sure to include the primary_poc, time of the event, and the channel for each event. Additionally, you might choose to add a fourth column to assure only Walmart events were chosen.
```sql
SELECT we.occurred_at, we.channel, act.primary_poc, act.name
FROM web_events we
JOIN accounts act
ON act.id = we.account_id
WHERE act.name = 'Walmart';
```
2. Provide a table that provides the region for each sales_rep along with their associated accounts. Your final table should include three columns: the region name, the sales rep name, and the account name. Sort the accounts alphabetically (A-Z) according to account name.
```sql
SELECT r.name "Region", sr.name "Sales Rep", act.name "Account"
FROM region r
JOIN sales_reps sr
ON r.id = sr.region_id
JOIN accounts act
ON sr.id = act.sales_rep_id
ORDER BY act.name;
```
3. Provide the name for each region for every order, as well as the account name and the unit price they paid (total_amt_usd/total) for the order. Your final table should have 3 columns: region name, account name, and unit price. A few accounts have 0 for total, so I divided by (total + 0.01) to ensure we never divide by zero.
```sql
SELECT r.name "Region", act.name "Account", o.total_amt_usd/(o.total+0.01) "Unit Price"
FROM orders o
JOIN accounts act
ON act.id = o.account_id
JOIN sales_reps sr
ON sr.id = act.sales_rep_id
JOIN region r
ON r.id = sr.region_id
ORDER BY "Unit Price";
```

* ON clauses usually set a foreign key equal to the primary key it refers to
* JOIN statements allow us to pull data from multiple tables
* Other clauses such as WHERE and ORDER BY can be used together with JOIN.

## LEFT and RIGHT joins

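- A plain JOIN is an inner join: it returns only the rows that have a match in both tables.
- A LEFT JOIN returns every row of the left table (the one in FROM), even when the right table has no matching row; the missing right-side columns are filled with NULL.
- A RIGHT JOIN is the inverse: every row of the right table is kept.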
```sql
SELECT *
FROM left_table
LEFT JOIN right_table
ON left_table.id = right_table.left_table_id;  -- placeholder key columns
```
In practice most outer joins are written as LEFT JOINs, because any RIGHT JOIN can be rewritten as a LEFT JOIN by swapping the order of the tables.
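
A small sketch of the difference between a plain JOIN and a LEFT JOIN, run against an in-memory SQLite database through Python's sqlite3 module. The rows are invented: three accounts, of which one ('Goldman Sachs' here) has placed no orders.

```python
import sqlite3

# Hypothetical rows: account 3 has no orders at all.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE accounts (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE orders (id INTEGER PRIMARY KEY, account_id INTEGER, standard_qty INTEGER);
INSERT INTO accounts VALUES (1, 'Walmart'), (2, 'Apple'), (3, 'Goldman Sachs');
INSERT INTO orders VALUES (1, 1, 150), (2, 2, 20);
""")

inner_rows = conn.execute("""
    SELECT a.name, o.standard_qty
    FROM accounts a
    JOIN orders o
    ON a.id = o.account_id
    ORDER BY a.id;
""").fetchall()

left_rows = conn.execute("""
    SELECT a.name, o.standard_qty
    FROM accounts a
    LEFT JOIN orders o
    ON a.id = o.account_id
    ORDER BY a.id;
""").fetchall()

print(inner_rows)  # only accounts that have an order
print(left_rows)   # every account; None (NULL) where there is no order
```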

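Putting a filter in the ON clause with **AND**, instead of in a **WHERE** clause, applies it while the rows are being matched. In a LEFT JOIN this effectively filters the right table before the join: every row of the left table is still returned, with NULLs wherever no right-table row passes the condition. A **WHERE** filter runs after the join and removes those rows entirely.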
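
Here is a sketch of the same condition placed in ON ... AND versus WHERE, using Python's sqlite3 module on invented rows. The threshold of 100 on standard_qty is just an example value.

```python
import sqlite3

# Hypothetical rows: Walmart has a large order, Apple a small one, Goldman Sachs none.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE accounts (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE orders (id INTEGER PRIMARY KEY, account_id INTEGER, standard_qty INTEGER);
INSERT INTO accounts VALUES (1, 'Walmart'), (2, 'Apple'), (3, 'Goldman Sachs');
INSERT INTO orders VALUES (1, 1, 150), (2, 2, 20);
""")

# Condition in ON: applied while matching, so all accounts survive.
filter_in_on = conn.execute("""
    SELECT a.name, o.standard_qty
    FROM accounts a
    LEFT JOIN orders o
    ON a.id = o.account_id AND o.standard_qty > 100
    ORDER BY a.id;
""").fetchall()

# Condition in WHERE: applied after the join, so NULL rows are dropped too.
filter_in_where = conn.execute("""
    SELECT a.name, o.standard_qty
    FROM accounts a
    LEFT JOIN orders o
    ON a.id = o.account_id
    WHERE o.standard_qty > 100
    ORDER BY a.id;
""").fetchall()

print(filter_in_on)
print(filter_in_where)
```

With the condition in ON, Apple and Goldman Sachs stay in the result with no order attached; with WHERE, only Walmart remains.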



Questions:
1. Provide a table that provides the region for each sales_rep along with their associated accounts. This time only for the Midwest region. Your final table should include three columns: the region name, the sales rep name, and the account name. Sort the accounts alphabetically (A-Z) according to account name.
```sql
SELECT r.name region, sr.name rep, act.name account
FROM sales_reps sr
LEFT JOIN region r
ON r.id = sr.region_id
LEFT JOIN accounts act
ON sr.id = act.sales_rep_id
WHERE r.name = 'Midwest'
ORDER BY account;
```
2. Provide a table that provides the region for each sales_rep along with their associated accounts. This time only for accounts where the sales rep has a first name starting with S and in the Midwest region. Your final table should include three columns: the region name, the sales rep name, and the account name. Sort the accounts alphabetically (A-Z) according to account name.
```sql
SELECT r.name region, sr.name rep, act.name account
FROM sales_reps sr
LEFT JOIN region r
ON r.id = sr.region_id
LEFT JOIN accounts act
ON sr.id = act.sales_rep_id
WHERE r.name = 'Midwest' AND sr.name LIKE 'S%'
ORDER BY account;
```
3. Provide a table that provides the region for each sales_rep along with their associated accounts. This time only for accounts where the sales rep has a last name starting with K and in the Midwest region. Your final table should include three columns: the region name, the sales rep name, and the account name. Sort the accounts alphabetically (A-Z) according to account name.
```sql
SELECT r.name region, sr.name rep, act.name account
FROM sales_reps sr
LEFT JOIN region r
ON r.id = sr.region_id
LEFT JOIN accounts act
ON sr.id = act.sales_rep_id
WHERE r.name = 'Midwest' AND sr.name LIKE '% K%'
ORDER BY account;
```
4. Provide the name for each region for every order, as well as the account name and the unit price they paid (total_amt_usd/total) for the order. However, you should only provide the results if the standard order quantity exceeds 100. Your final table should have 3 columns: region name, account name, and unit price. In order to avoid a division by zero error, adding .01 to the denominator here is helpful: total_amt_usd/(total+0.01).
```sql
SELECT r.name region, act.name account, o.total_amt_usd/(o.total+0.01) "Unit Price"
FROM orders o
LEFT JOIN accounts act
ON o.account_id = act.id
LEFT JOIN sales_reps sr
ON act.sales_rep_id = sr.id
LEFT JOIN region r
ON sr.region_id = r.id
WHERE o.standard_qty > 100;
```
5. Provide the name for each region for every order, as well as the account name and the unit price they paid (total_amt_usd/total) for the order. However, you should only provide the results if the standard order quantity exceeds 100 and the poster order quantity exceeds 50. Your final table should have 3 columns: region name, account name, and unit price. Sort for the smallest unit price first. In order to avoid a division by zero error, adding .01 to the denominator here is helpful: total_amt_usd/(total+0.01).
```sql
SELECT r.name region, act.name account, o.total_amt_usd/(o.total+0.01) "Unit Price"
FROM orders o
LEFT JOIN accounts act
ON o.account_id = act.id
LEFT JOIN sales_reps sr
ON act.sales_rep_id = sr.id
LEFT JOIN region r
ON sr.region_id = r.id
WHERE o.standard_qty > 100 AND o.poster_qty > 50
ORDER BY "Unit Price";
```
6. Provide the name for each region for every order, as well as the account name and the unit price they paid (total_amt_usd/total) for the order. However, you should only provide the results if the standard order quantity exceeds 100 and the poster order quantity exceeds 50. Your final table should have 3 columns: region name, account name, and unit price. Sort for the largest unit price first. In order to avoid a division by zero error, adding .01 to the denominator here is helpful: total_amt_usd/(total+0.01).
```sql
SELECT r.name region, act.name account, o.total_amt_usd/(o.total+0.01) "Unit Price"
FROM orders o
LEFT JOIN accounts act
ON o.account_id = act.id
LEFT JOIN sales_reps sr
ON act.sales_rep_id = sr.id
LEFT JOIN region r
ON sr.region_id = r.id
WHERE o.standard_qty > 100 AND o.poster_qty > 50
ORDER BY "Unit Price" DESC;
```
7. What are the different channels used by account id 1001? Your final table should have only 2 columns: account name and the different channels. You can try SELECT DISTINCT to narrow down the results to only the unique values.
```sql
SELECT DISTINCT act.name account, we.channel channel
FROM accounts act
LEFT JOIN web_events we
ON act.id = we.account_id
WHERE act.id = 1001;
```
8. Find all the orders that occurred in 2015. Your final table should have 4 columns: occurred_at, account name, order total, and order total_amt_usd.
```sql
SELECT o.occurred_at, a.name, o.total, o.total_amt_usd
FROM accounts a
JOIN orders o
ON o.account_id = a.id
WHERE o.occurred_at >= '2015-01-01' AND o.occurred_at < '2016-01-01'
ORDER BY o.occurred_at DESC;
```
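
One detail to watch with date filters: BETWEEN includes both of its endpoints, so an upper bound of January 1, 2016 can let an order from that day slip into the "2015" result. The sketch below compares BETWEEN with a half-open range (>= start, < end) on invented rows, using Python's sqlite3 module. It assumes dates stored as ISO 'YYYY-MM-DD' text, which SQLite compares as plain strings; other databases with real date types behave the same way at the endpoints.

```python
import sqlite3

# Hypothetical orders dated around the edges of 2015, stored as ISO date strings.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE orders (id INTEGER PRIMARY KEY, occurred_at TEXT);
INSERT INTO orders VALUES
    (1, '2014-12-31'),
    (2, '2015-01-01'),
    (3, '2015-12-31'),
    (4, '2016-01-01');
""")

# BETWEEN is inclusive on both ends, so the 2016-01-01 order is returned as well.
between_ids = [row[0] for row in conn.execute("""
    SELECT id FROM orders
    WHERE occurred_at BETWEEN '2015-01-01' AND '2016-01-01'
    ORDER BY id;
""")]

# Half-open range: includes the start, excludes the end.
half_open_ids = [row[0] for row in conn.execute("""
    SELECT id FROM orders
    WHERE occurred_at >= '2015-01-01' AND occurred_at < '2016-01-01'
    ORDER BY id;
""")]

print(between_ids)
print(half_open_ids)
```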