# Lesson 9 - SQL Aggregations - Part 2 of 2

## `DISTINCT`

`DISTINCT` is always used in `SELECT` statements, and it provides the unique rows for all columns written in the `SELECT` statement. Therefore, you only use `DISTINCT` once in any particular `SELECT` statement.

You could write:

`SELECT DISTINCT column1, column2, column3
FROM table1;`
which would return the unique (or DISTINCT) rows across all three columns.

You would not write:

`SELECT DISTINCT column1, DISTINCT column2, DISTINCT column3
FROM table1;`

You can think of `DISTINCT` the same way you might think of the word "unique".
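As a minimal sketch of the point above, the following uses Python's built-in `sqlite3` module with a small in-memory table. The table contents are made up for illustration; only the names `table1`, `column1`, and `column2` mirror the placeholders used above. It shows that `DISTINCT` applies to the combination of all selected columns, not to each column on its own.

```python
import sqlite3

# Hypothetical toy data; table1/column1/column2 mirror the placeholders in the text.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (column1 TEXT, column2 TEXT)")
conn.executemany(
    "INSERT INTO table1 VALUES (?, ?)",
    [("a", "x"), ("a", "x"), ("a", "y"), ("b", "x")],
)

# DISTINCT removes duplicate (column1, column2) combinations.
distinct_pairs = conn.execute(
    "SELECT DISTINCT column1, column2 FROM table1 ORDER BY column1, column2"
).fetchall()

# With a single column selected, only that column's unique values remain.
distinct_col1 = conn.execute(
    "SELECT DISTINCT column1 FROM table1 ORDER BY column1"
).fetchall()

print(distinct_pairs)  # [('a', 'x'), ('a', 'y'), ('b', 'x')]
print(distinct_col1)   # [('a',), ('b',)]
```

Note that `('a', 'x')` and `('a', 'y')` both survive in the first result: the rows differ in `column2`, so they are distinct rows even though `column1` repeats.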

*`DISTINCT` - Expert Tip:*

It’s worth noting that using `DISTINCT`, particularly in aggregations, can slow your queries down quite a bit.
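For reference, "using `DISTINCT` in an aggregation" usually means a form such as `COUNT(DISTINCT column)`. The sketch below, on a made-up `orders` table in an in-memory SQLite database, only illustrates what that form computes; it does not measure performance.

```python
import sqlite3

# Hypothetical toy data: six orders placed by three accounts.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, account_id INTEGER)")
conn.executemany(
    "INSERT INTO orders (account_id) VALUES (?)",
    [(1,), (1,), (2,), (3,), (3,), (3,)],
)

# COUNT(account_id) counts every non-NULL value;
# COUNT(DISTINCT account_id) counts each unique value once.
num_orders, num_accounts = conn.execute(
    "SELECT COUNT(account_id), COUNT(DISTINCT account_id) FROM orders"
).fetchone()

print(num_orders, num_accounts)  # 6 3
```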

*Examples:*

<img src="../SQL/ERD DAND.jpg" width="600" height="400">

Use `DISTINCT` to test if there are any accounts associated with more than one region.

The two queries below return the same number of rows (351), so we know that every account is associated with only one region. If any account were associated with more than one region, the first query would return more rows than the second.

`SELECT a.id as "account id", r.id as "region id", 
a.name as "account name", r.name as "region name"
FROM accounts a
JOIN sales_reps s
ON s.id = a.sales_rep_id
JOIN region r
ON r.id = s.region_id;`

and

`SELECT DISTINCT id, name
FROM accounts;`
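The row-count comparison can be sketched on a tiny in-memory SQLite database. The table and column names follow the queries above, but the rows are invented for illustration and the column aliases are simplified.

```python
import sqlite3

# Hypothetical toy data using the same table/column names as the queries above.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE region (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE sales_reps (id INTEGER PRIMARY KEY, name TEXT, region_id INTEGER);
CREATE TABLE accounts (id INTEGER PRIMARY KEY, name TEXT, sales_rep_id INTEGER);
INSERT INTO region VALUES (1, 'Northeast'), (2, 'West');
INSERT INTO sales_reps VALUES (10, 'Rep A', 1), (11, 'Rep B', 2);
INSERT INTO accounts VALUES (100, 'Acme', 10), (101, 'Globex', 10), (102, 'Initech', 11);
""")

# One row per (account, region) pairing.
joined_rows = conn.execute("""
    SELECT a.id, r.id, a.name, r.name
    FROM accounts a
    JOIN sales_reps s ON s.id = a.sales_rep_id
    JOIN region r ON r.id = s.region_id
""").fetchall()

# One row per unique account.
distinct_accounts = conn.execute(
    "SELECT DISTINCT id, name FROM accounts"
).fetchall()

# Equal counts mean no account is paired with more than one region.
one_region_per_account = len(joined_rows) == len(distinct_accounts)
print(len(joined_rows), len(distinct_accounts), one_region_per_account)  # 3 3 True
```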


Have any sales reps worked on more than one account?

Yes, all sales reps have worked on more than one account. The smallest number of accounts for any sales rep is 3.

`SELECT s.id, s.name, COUNT(*) num_accounts
FROM accounts a
JOIN sales_reps s
ON s.id = a.sales_rep_id
GROUP BY s.id, s.name
ORDER BY num_accounts;`

This query returns every sales rep (50 of them), so we can confirm that the grouped query above accounts for all of them:

`SELECT DISTINCT id, name
FROM sales_reps;`
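The reason this check matters is that an inner `JOIN` silently drops any sales rep with no accounts. Here is a minimal sketch on invented data, where one hypothetical rep (`Rep C`) deliberately has no accounts, so the two row counts disagree:

```python
import sqlite3

# Hypothetical toy data: Rep C (id 12) has no accounts on purpose.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE sales_reps (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE accounts (id INTEGER PRIMARY KEY, name TEXT, sales_rep_id INTEGER);
INSERT INTO sales_reps VALUES (10, 'Rep A'), (11, 'Rep B'), (12, 'Rep C');
INSERT INTO accounts VALUES (100, 'Acme', 10), (101, 'Globex', 10), (102, 'Initech', 11);
""")

# Grouped inner join: only reps that have at least one account appear.
reps_with_accounts = conn.execute("""
    SELECT s.id, s.name, COUNT(*) AS num_accounts
    FROM accounts a
    JOIN sales_reps s ON s.id = a.sales_rep_id
    GROUP BY s.id, s.name
    ORDER BY num_accounts
""").fetchall()

# Every rep, regardless of accounts.
all_reps = conn.execute("SELECT DISTINCT id, name FROM sales_reps").fetchall()

# Reps that the grouped query did not return.
missing_ids = sorted({r[0] for r in all_reps} - {r[0] for r in reps_with_accounts})
print(len(reps_with_accounts), len(all_reps), missing_ids)  # 2 3 [12]
```

When both counts match, as they do in the lesson's data (50 and 50), no rep was dropped by the join.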




## `HAVING`

`HAVING` works like a `WHERE` clause, except that its condition can use aggregate functions, which a `WHERE` clause cannot.

`HAVING` is the “clean” way to filter a query that has been aggregated, but this is also commonly done using a subquery. Essentially, any time you want to perform a `WHERE` on an element of your query that was created by an aggregate, you need to use `HAVING` instead.
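To illustrate the two approaches mentioned above, this sketch runs the same filter once with `HAVING` and once with a subquery, on a made-up `orders` table in an in-memory SQLite database. The threshold of 300 and the alias `totals` are arbitrary choices for the example.

```python
import sqlite3

# Hypothetical toy data: totals per account are 350, 50, and 410.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE orders (id INTEGER PRIMARY KEY, account_id INTEGER, total_amt_usd REAL);
INSERT INTO orders (account_id, total_amt_usd) VALUES
    (1, 100.0), (1, 250.0), (2, 50.0), (3, 400.0), (3, 10.0);
""")

# Filter the aggregated result directly with HAVING.
having_rows = conn.execute("""
    SELECT account_id, SUM(total_amt_usd) AS total_spent
    FROM orders
    GROUP BY account_id
    HAVING SUM(total_amt_usd) > 300
    ORDER BY account_id
""").fetchall()

# Same filter: aggregate in a subquery, then use WHERE on its output column.
subquery_rows = conn.execute("""
    SELECT account_id, total_spent
    FROM (
        SELECT account_id, SUM(total_amt_usd) AS total_spent
        FROM orders
        GROUP BY account_id
    ) AS totals
    WHERE total_spent > 300
    ORDER BY account_id
""").fetchall()

print(having_rows)    # [(1, 350.0), (3, 410.0)]
print(subquery_rows)  # [(1, 350.0), (3, 410.0)]
```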




Key takeaways - `HAVING` vs `WHERE`:

- `WHERE` subsets the returned data based on a logical condition.

- `WHERE` appears **after** the `FROM`, `JOIN`, and `ON` clauses, but **before** `GROUP BY`.

- `HAVING` appears **after** the `GROUP BY` clause, but **before** the `ORDER BY` clause.

- `HAVING` is like `WHERE`, but it works on logical statements involving aggregations.
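The takeaways above can be sketched with a small in-memory SQLite example. The data and thresholds are invented for illustration. `WHERE` removes individual rows before grouping, `HAVING` removes whole groups after aggregation, and SQLite rejects an aggregate placed in `WHERE`.

```python
import sqlite3

# Hypothetical toy data: account 1 has 3 orders, accounts 2 and 3 have 2 each.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE orders (id INTEGER PRIMARY KEY, account_id INTEGER, total_amt_usd REAL);
INSERT INTO orders (account_id, total_amt_usd) VALUES
    (1, 100.0), (1, 250.0), (1, 20.0),
    (2, 50.0), (2, 60.0),
    (3, 400.0), (3, 10.0);
""")

# HAVING only: keep accounts with at least 2 orders of any size.
grouped_only = conn.execute("""
    SELECT account_id, COUNT(*) AS num_orders
    FROM orders
    GROUP BY account_id
    HAVING COUNT(*) >= 2
    ORDER BY account_id
""").fetchall()

# WHERE then HAVING: drop small orders first, then keep accounts
# that still have at least 2 orders left.
where_then_having = conn.execute("""
    SELECT account_id, COUNT(*) AS num_orders
    FROM orders
    WHERE total_amt_usd >= 100
    GROUP BY account_id
    HAVING COUNT(*) >= 2
    ORDER BY account_id
""").fetchall()

# An aggregate inside WHERE is an error.
try:
    conn.execute(
        "SELECT account_id FROM orders WHERE COUNT(*) >= 2 GROUP BY account_id"
    )
    where_rejects_aggregate = False
except sqlite3.Error:
    where_rejects_aggregate = True

print(grouped_only)             # [(1, 3), (2, 2), (3, 2)]
print(where_then_having)        # [(1, 2)]
print(where_rejects_aggregate)  # True
```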


*Examples:*

How many of the sales reps have more than 5 accounts that they manage?

`SELECT s.id, s.name, COUNT(*) num_accounts
FROM accounts a
JOIN sales_reps s
ON s.id = a.sales_rep_id
GROUP BY s.id, s.name
HAVING COUNT(*) > 5
ORDER BY num_accounts;`


How many accounts have more than 20 orders?

`SELECT a.id, a.name, COUNT(*) num_orders
FROM orders o
JOIN accounts a
ON a.id = o.account_id
GROUP BY a.id, a.name
HAVING COUNT(*) > 20
ORDER BY num_orders;`


Which account has the most orders?

`SELECT a.id, a.name, COUNT(*) num_orders
FROM orders o
JOIN accounts a
ON a.id = o.account_id
GROUP BY a.id, a.name
ORDER BY num_orders DESC
LIMIT 1;`


How many accounts spent more than 30,000 USD total across all orders?

204 results.

`SELECT a.id, a.name, SUM(o.total_amt_usd) total_spent
FROM accounts a
JOIN orders o
ON a.id = o.account_id
GROUP BY a.id, a.name
HAVING SUM(o.total_amt_usd) > 30000
ORDER BY total_spent;`


How many accounts spent less than 1,000 USD total across all orders?

3 results.

`SELECT a.id, a.name, SUM(o.total_amt_usd) total_spent
FROM accounts a
JOIN orders o
ON a.id = o.account_id
GROUP BY a.id, a.name
HAVING SUM(o.total_amt_usd) < 1000
ORDER BY total_spent;`


Which account has spent the most with us?

EOG Resources

`SELECT a.id, a.name, SUM(o.total_amt_usd) total_spent
FROM accounts a
JOIN orders o
ON a.id = o.account_id
GROUP BY a.id, a.name
ORDER BY total_spent DESC
LIMIT 1;`


Which account has spent the least with us?

Nike

`SELECT a.id, a.name, SUM(o.total_amt_usd) total_spent
FROM accounts a
JOIN orders o
ON a.id = o.account_id
GROUP BY a.id, a.name
ORDER BY total_spent
LIMIT 1;`


Which accounts used facebook as a channel to contact customers more than 6 times?

`SELECT a.id, a.name, w.channel, COUNT(*) use_of_channel
FROM accounts a
JOIN web_events w
ON a.id = w.account_id
GROUP BY a.id, a.name, w.channel
HAVING COUNT(*) > 6 AND w.channel = 'facebook'
ORDER BY use_of_channel;`


Which account used facebook most as a channel? 

Gilead Sciences

`SELECT a.id, a.name, w.channel, COUNT(*) use_of_channel
FROM accounts a
JOIN web_events w
ON a.id = w.account_id
WHERE w.channel = 'facebook'
GROUP BY a.id, a.name, w.channel
ORDER BY use_of_channel DESC
LIMIT 1;`
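The two facebook queries above filter the channel in different places: one in `HAVING`, the other in `WHERE`. Because `w.channel` is a plain grouped column rather than an aggregate, both placements give the same rows. The sketch below checks this on invented `web_events` data with a lower threshold than the one used above.

```python
import sqlite3

# Hypothetical toy data: facebook counts are 3, 1, and 2 for accounts 1, 2, and 3.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE web_events (id INTEGER PRIMARY KEY, account_id INTEGER, channel TEXT)"
)
conn.executemany(
    "INSERT INTO web_events (account_id, channel) VALUES (?, ?)",
    [
        (1, "facebook"), (1, "facebook"), (1, "facebook"), (1, "direct"),
        (2, "facebook"), (2, "direct"), (2, "direct"),
        (3, "facebook"), (3, "facebook"),
    ],
)

# Channel condition placed in HAVING, next to the aggregate condition.
filter_in_having = conn.execute("""
    SELECT account_id, channel, COUNT(*) AS use_of_channel
    FROM web_events
    GROUP BY account_id, channel
    HAVING COUNT(*) > 1 AND channel = 'facebook'
    ORDER BY account_id
""").fetchall()

# Channel condition placed in WHERE, so other channels are dropped before grouping.
filter_in_where = conn.execute("""
    SELECT account_id, channel, COUNT(*) AS use_of_channel
    FROM web_events
    WHERE channel = 'facebook'
    GROUP BY account_id, channel
    HAVING COUNT(*) > 1
    ORDER BY account_id
""").fetchall()

print(filter_in_having)  # [(1, 'facebook', 3), (3, 'facebook', 2)]
print(filter_in_where)   # [(1, 'facebook', 3), (3, 'facebook', 2)]
```

Putting non-aggregate conditions in `WHERE` is the more conventional choice, since the rows are discarded before any grouping work is done.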


Which channel was most frequently used by most accounts?

All of the top 10 account and channel combinations use the `direct` channel.

`SELECT a.id, a.name, w.channel, COUNT(*) use_of_channel
FROM accounts a
JOIN web_events w
ON a.id = w.account_id
GROUP BY a.id, a.name, w.channel
ORDER BY use_of_channel DESC
LIMIT 10;`