## 5. HAVING
Filters results after aggregation. Similar to WHERE, but operates on aggregated data.

Remember, while `WHERE` filters **raw rows before aggregation**, `HAVING` filters **groups created by `GROUP BY`**.  

**Connection to GROUP BY:**  
- `GROUP BY` creates aggregated groups (e.g., total sales per customer).  
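- `HAVING` applies conditions to these aggregated values. Without `GROUP BY`, standard SQL treats the whole table as a single group, so `HAVING` can still filter on an aggregate computed over all rows (some engines, including older SQLite releases, require a `GROUP BY` first).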
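
To make the `WHERE` versus `HAVING` distinction concrete, here is a minimal sketch that runs the same aggregation with each clause. It assumes a throwaway in-memory SQLite table with made-up rows (not this notebook's `sales` data) and uses a separate connection named `demo` so the notebook's `conn` is left alone.

```python
import sqlite3

# Hypothetical in-memory table; the rows are illustrative only.
demo = sqlite3.connect(":memory:")
demo.execute("CREATE TABLE sales (customer_id INTEGER, amount REAL)")
demo.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [(1, 80.0), (1, 40.0), (2, 30.0), (2, 20.0), (3, 120.0)],
)

# WHERE drops individual rows first, so only rows with amount > 50 are summed.
where_rows = demo.execute(
    "SELECT customer_id, SUM(amount) FROM sales "
    "WHERE amount > 50 GROUP BY customer_id ORDER BY customer_id"
).fetchall()

# HAVING sums every row per customer, then drops groups whose total is <= 100.
having_rows = demo.execute(
    "SELECT customer_id, SUM(amount) FROM sales "
    "GROUP BY customer_id HAVING SUM(amount) > 100 ORDER BY customer_id"
).fetchall()

print(where_rows)   # [(1, 80.0), (3, 120.0)]
print(having_rows)  # [(1, 120.0), (3, 120.0)]
```

Customer 1 appears in both results but with different totals: `WHERE` removed the 40.0 row before summing, while `HAVING` kept the full total of 120.0 and only tested it afterwards.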


```sql
SELECT customer_id, SUM(amount) AS total_sales
FROM sales
GROUP BY customer_id
HAVING SUM(amount) > 100;
```

In [None]:
query_select = """
SELECT customer_id, SUM(amount) AS total_sales
FROM sales
GROUP BY customer_id
HAVING SUM(amount) > 100;
"""
df_select = pd.read_sql_query(query_select, conn)
print(df_select)

   customer_id  total_sales
0          101       225.25
1          102       401.00
2          103       150.00


## 6. ORDER BY
Sorts the result set by one or more columns.

```sql
SELECT product_id, amount
FROM sales
ORDER BY amount DESC; -- DESC for descending; ASC (the default) for ascending
```

In [None]:
query_select = """
SELECT product_id, amount
FROM sales
ORDER BY amount DESC;
"""
df_select = pd.read_sql_query(query_select, conn)
print(df_select)

   product_id  amount
0           2  200.50
1           2  200.50
2           1  150.00
3           1  150.00
4           3   75.25
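
Since `ORDER BY` accepts more than one column, a second column can break ties left by the first. The following is a small sketch of that, assuming a hypothetical in-memory SQLite table with illustrative rows and a separate `demo` connection rather than the notebook's `conn`.

```python
import sqlite3

# Hypothetical in-memory table; the rows are illustrative only.
demo = sqlite3.connect(":memory:")
demo.execute("CREATE TABLE sales (product_id INTEGER, amount REAL)")
demo.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [(2, 50.0), (1, 50.0), (3, 75.0), (1, 20.0)],
)

# Sort by amount (largest first); rows with equal amounts are then
# ordered by product_id ascending.
sorted_rows = demo.execute(
    "SELECT product_id, amount FROM sales "
    "ORDER BY amount DESC, product_id ASC"
).fetchall()

print(sorted_rows)  # [(3, 75.0), (1, 50.0), (2, 50.0), (1, 20.0)]
```

Without the second sort key, the relative order of the two 50.0 rows would be left to the database engine.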


## Some More SQL Essentials

### DISTINCT
Returns **unique values** from a column, removing duplicates.

```sql
-- Get unique customer IDs
SELECT DISTINCT customer_id
FROM sales;
```

In [None]:
query_select = """
SELECT DISTINCT customer_id
FROM sales;
"""
df_select = pd.read_sql_query(query_select, conn)
print(df_select)

   customer_id
0          101
1          102
2          103
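
When several columns are listed, `DISTINCT` applies to the combination of values rather than to each column separately. A minimal sketch of the difference, assuming a hypothetical in-memory SQLite table with illustrative rows and a separate `demo` connection:

```python
import sqlite3

# Hypothetical in-memory table; the rows are illustrative only.
demo = sqlite3.connect(":memory:")
demo.execute("CREATE TABLE sales (customer_id INTEGER, product_id INTEGER)")
demo.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [(101, 1), (101, 1), (101, 3), (102, 2)],
)

# One column: each customer_id appears once.
unique_customers = demo.execute(
    "SELECT DISTINCT customer_id FROM sales ORDER BY customer_id"
).fetchall()

# Two columns: only fully identical (customer_id, product_id) rows collapse.
unique_pairs = demo.execute(
    "SELECT DISTINCT customer_id, product_id FROM sales "
    "ORDER BY customer_id, product_id"
).fetchall()

print(unique_customers)  # [(101,), (102,)]
print(unique_pairs)      # [(101, 1), (101, 3), (102, 2)]
```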


### COUNT
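Counts rows. `COUNT(*)` counts every row in the result, `COUNT(column)` counts only rows where that column is not `NULL`, and `COUNT(DISTINCT column)` counts distinct non-`NULL` values. Add a `WHERE` clause to count only the rows that satisfy a condition.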

```sql
-- Count the total number of sales (rows)
SELECT COUNT(*) AS total_sales
FROM sales;

-- Count the number of unique customers
SELECT COUNT(DISTINCT customer_id) AS unique_customers
FROM sales;
```

In [None]:
query_select = """
SELECT COUNT(DISTINCT customer_id) AS unique_customers
FROM sales;
"""
df_select = pd.read_sql_query(query_select, conn)
print(df_select)

   unique_customers
0                 3
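
The different forms of `COUNT` diverge once a column contains `NULL` values. The sketch below illustrates this under stated assumptions: the table is a hypothetical in-memory SQLite table, and the `coupon_code` column is invented for the example and is not part of this notebook's `sales` table.

```python
import sqlite3

# Hypothetical in-memory table; coupon_code is an invented column.
demo = sqlite3.connect(":memory:")
demo.execute("CREATE TABLE sales (customer_id INTEGER, coupon_code TEXT)")
demo.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [(101, "SPRING"), (101, None), (102, "SPRING"), (103, None)],
)

# COUNT(*) counts all rows, COUNT(col) skips NULLs,
# COUNT(DISTINCT col) counts distinct non-NULL values.
counts = demo.execute(
    "SELECT COUNT(*), COUNT(coupon_code), COUNT(DISTINCT coupon_code) "
    "FROM sales"
).fetchone()

# A WHERE clause restricts which rows are counted.
sales_for_101 = demo.execute(
    "SELECT COUNT(*) FROM sales WHERE customer_id = 101"
).fetchone()[0]

print(counts)         # (4, 2, 1)
print(sales_for_101)  # 2
```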


### LIMIT
Restricts the number of rows returned, useful for sampling or previewing data.
```sql
-- Get the 10 most recent sales
SELECT *
FROM sales
ORDER BY sale_date DESC
LIMIT 10;
```

```sql
-- Count sales per customer, then keep only the 5 customers with the most sales
SELECT customer_id, COUNT(*) AS num_sales
FROM sales
GROUP BY customer_id
ORDER BY num_sales DESC
LIMIT 5;
```

In [None]:
query_select = """
SELECT customer_id, COUNT(*) AS num_sales
FROM sales
GROUP BY customer_id
ORDER BY num_sales DESC
LIMIT 5;
"""
df_select = pd.read_sql_query(query_select, conn)
print(df_select)

   customer_id  num_sales
0          102          2
1          101          2
2          103          1
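
Customers 101 and 102 tie on `num_sales` in the output above, and SQL does not guarantee an order among tied rows unless `ORDER BY` includes a tiebreaker. The sketch below adds one and also shows `OFFSET`, which SQLite accepts after `LIMIT` to skip rows. It assumes a hypothetical in-memory SQLite table with illustrative rows and a separate `demo` connection.

```python
import sqlite3

# Hypothetical in-memory table; the rows are illustrative only.
demo = sqlite3.connect(":memory:")
demo.execute("CREATE TABLE sales (customer_id INTEGER)")
demo.executemany(
    "INSERT INTO sales VALUES (?)",
    [(101,), (101,), (102,), (102,), (103,), (104,), (104,), (104,)],
)

# customer_id breaks ties on num_sales, so the row order is deterministic.
query = (
    "SELECT customer_id, COUNT(*) AS num_sales FROM sales "
    "GROUP BY customer_id "
    "ORDER BY num_sales DESC, customer_id ASC "
    "LIMIT 2 OFFSET ?"
)

first_page = demo.execute(query, (0,)).fetchall()   # rows 1-2
second_page = demo.execute(query, (2,)).fetchall()  # rows 3-4

print(first_page)   # [(104, 3), (101, 2)]
print(second_page)  # [(102, 2), (103, 1)]
```

With the tiebreaker in place, the two pages never overlap or skip a customer, which is not guaranteed when tied rows can come back in any order.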
