# SQL Scripts - Window Functions in SQL
- Run these scripts in MySQL 8.0+ (window functions are supported from 8.0 onward)

## Connect to MySQL
- Open a terminal.
- Root credentials: root / Mysql@123
- mysql -u piuser -p
- Enter the password when prompted (Pass#123).
- Account used in these notes: user piuser, password Pass#123, database piit.
- At the mysql prompt:
    - show databases;
    - use piit;
    - show tables;

### create or use database
- CREATE DATABASE IF NOT EXISTS piit;
- USE piit;

### Drop table if exists sales
- DROP TABLE IF EXISTS sales;

### Create the table
- CREATE TABLE sales (
    id INT PRIMARY KEY,
    salesperson VARCHAR(100) NOT NULL,
    region VARCHAR(50) NOT NULL,
    sale_date DATE NOT NULL,
    amount DECIMAL(10,2) NOT NULL
);
- DESC sales;

### Insert sample data
- INSERT INTO sales (id, salesperson, region, sale_date, amount) VALUES
(1, 'Alice', 'East', '2025-08-01', 200),
(2, 'Bob',   'East', '2025-08-02', 150),
(3, 'Alice', 'West', '2025-08-03', 300),
(4, 'Bob',   'East', '2025-08-04', 250),
(5, 'Alice', 'East', '2025-08-05', 100);
- select * from sales;

### Quick peek
- SELECT * FROM sales ORDER BY id;

### Window functions
- Window functions perform calculations across a set of rows related to the current row, without collapsing the result set like GROUP BY does.
They work with the OVER() clause, which defines the window (set of rows) for the calculation.
- Common uses:
    - Ranking rows
    - Running totals
    - Moving averages
    - Calculations within partitions
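
To make the contrast with GROUP BY concrete, here is a minimal sketch against the sales table created above. The alias region_total is only an illustrative name.

```sql
-- GROUP BY collapses the rows: one output row per region.
SELECT region, SUM(amount) AS region_total
FROM sales
GROUP BY region;

-- The window version keeps one output row per sale and
-- attaches the regional total to each of them.
SELECT id, salesperson, region, amount,
       SUM(amount) OVER (PARTITION BY region) AS region_total
FROM sales
ORDER BY region, id;
```

Both queries compute the same totals; only the second still shows the individual sales next to them.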

### Window function examples
- Rank sales within each region by amount (highest = rank 1)
#### SELECT salesperson,   region,   amount,   RANK() OVER (PARTITION BY region ORDER BY amount DESC) AS region_rank FROM sales ORDER BY region, amount DESC;
- PARTITION BY region → restarts the rank for each region.
- ORDER BY amount DESC → highest sales get rank 1.
- The outer ORDER BY region, amount DESC lists the rows in rank order, which makes the RANK() behavior easy to see.

### Running total per salesperson by sale_date
#### SELECT   salesperson,   sale_date,   amount,   SUM(amount) OVER ( PARTITION BY salesperson   ORDER BY sale_date  ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW  ) AS running_total FROM sales ORDER BY salesperson, sale_date;
- Uses SUM() as a window function.
- Orders by sale_date to keep the cumulative sum correct.
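
The explicit ROWS frame matters when two rows share the same sale_date. When ORDER BY is present and no frame is given, the default frame is RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW, which treats tied rows as peers and adds them all at once. The sketch below puts both variants side by side; the column names running_rows and running_default are illustrative.

```sql
SELECT salesperson, sale_date, amount,
       -- Explicit ROWS frame: accumulates one physical row at a time.
       SUM(amount) OVER (
           PARTITION BY salesperson
           ORDER BY sale_date
           ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
       ) AS running_rows,
       -- No frame given: the default RANGE frame includes every row
       -- that shares the current row's sale_date.
       SUM(amount) OVER (
           PARTITION BY salesperson
           ORDER BY sale_date
       ) AS running_default
FROM sales
ORDER BY salesperson, sale_date;
```

With the sample rows no salesperson has two sales on the same date, so the two columns match. They diverge only once a salesperson has tied sale_date values.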

### 3-row moving average per salesperson (current row + previous 2)
- Averages each sale with up to two preceding sales by the same salesperson. The frame counts rows, not calendar days.
#### SELECT   salesperson,   sale_date,  amount,   AVG(amount) OVER (     PARTITION BY salesperson     ORDER BY sale_date     ROWS BETWEEN 2 PRECEDING AND CURRENT ROW   ) AS moving_avg_3 FROM sales ORDER BY salesperson, sale_date;
- The window frame (ROWS BETWEEN 2 PRECEDING AND CURRENT ROW) looks at the current row and the two previous rows.
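
If the average should cover calendar days rather than a fixed number of rows, a RANGE frame with an interval offset is one option in MySQL 8.0+. This is a sketch of that variant, not part of the original scripts; moving_avg_3_days is an illustrative alias.

```sql
SELECT salesperson, sale_date, amount,
       AVG(amount) OVER (
           PARTITION BY salesperson
           ORDER BY sale_date
           -- Covers sales dated from two days before the current row's
           -- sale_date up to and including that date.
           RANGE BETWEEN INTERVAL 2 DAY PRECEDING AND CURRENT ROW
       ) AS moving_avg_3_days
FROM sales
ORDER BY salesperson, sale_date;
```

A RANGE frame with an offset requires a single ORDER BY expression of numeric or temporal type, which sale_date (a DATE column) satisfies.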

### Each sale as % of total sales in its region
- Find each sale’s percentage of total sales in its region
#### SELECT   salesperson,   region,  amount,  ROUND(amount / SUM(amount) OVER (PARTITION BY region) * 100, 2) AS pct_of_region FROM sales ORDER BY region, amount DESC;
- Divides each sale amount by the regional total.
- Keeps all rows without grouping them away.

### Summary of Window Functions
- PARTITION BY → Splits data into groups without reducing rows.
- ORDER BY in OVER() → Controls calculation order.
- Window frames (ROWS BETWEEN...) → Fine-tune which rows are considered.
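- Window functions do not filter rows. They are evaluated after WHERE, GROUP BY, and HAVING, so to filter on a window result, compute it in a subquery or CTE and filter in the outer query.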
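
Because a window function cannot be referenced directly in WHERE or HAVING, filtering on its result needs one extra step. A minimal sketch, using a CTE with the hypothetical name ranked, that keeps only the top sale (or tied top sales) in each region:

```sql
WITH ranked AS (
    SELECT salesperson, region, amount,
           RANK() OVER (PARTITION BY region ORDER BY amount DESC) AS region_rank
    FROM sales
)
SELECT salesperson, region, amount, region_rank
FROM ranked
WHERE region_rank = 1   -- the filter is applied after the window is computed
ORDER BY region, salesperson;
```

A derived table in the FROM clause works the same way; the CTE form is used here only for readability.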

## Other Window Functions
- Examples for DENSE_RANK, LAG, and LEAD on the same sales table. A few extra rows are inserted first to create ties, so the difference between RANK and DENSE_RANK actually shows up.
- Add rows that create ties within regions and extend the timeline for LEAD/LAG:
- INSERT INTO sales (id, salesperson, region, sale_date, amount) VALUES
(6, 'Bob',   'East', '2025-08-06', 250),   -- tie with Bob's earlier 250
(7, 'Carol', 'East', '2025-08-02', 150),   -- tie with Bob's 150
(8, 'Dan',   'West', '2025-08-04', 300);   -- tie with Alice's 300 (West)
- select * from sales;

### Dense Rank
- Rank sales in each region by amount, showing how RANK leaves gaps on ties while DENSE_RANK doesn’t.
#### SELECT   region,  salesperson,  amount,  RANK()  OVER (PARTITION BY region ORDER BY amount DESC) AS rnk,  DENSE_RANK() OVER (PARTITION BY region ORDER BY amount DESC) AS dense_rnk FROM sales ORDER BY region, amount DESC, salesperson;
- If two rows tie on amount, RANK() skips the next number (e.g., 1, 2, 2, 4), while DENSE_RANK() stays consecutive (1, 2, 2, 3).
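
For comparison, ROW_NUMBER() can sit next to the two ranking functions. It never repeats a number, so tied rows still receive distinct values. The sketch below adds id as a tie-breaker (an assumption made here so the numbering is deterministic); row_num is an illustrative alias.

```sql
SELECT region, salesperson, amount,
       ROW_NUMBER() OVER (PARTITION BY region ORDER BY amount DESC, id) AS row_num,
       RANK()       OVER (PARTITION BY region ORDER BY amount DESC)     AS rnk,
       DENSE_RANK() OVER (PARTITION BY region ORDER BY amount DESC)     AS dense_rnk
FROM sales
ORDER BY region, amount DESC, id;
```

Assuming all eight rows above are loaded, the East amounts are 250, 250, 200, 150, 150, 100. That gives rnk 1, 1, 3, 4, 4, 6 and dense_rnk 1, 1, 2, 3, 3, 4, while row_num runs from 1 to 6.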

###  LAG() — compare each sale to the previous one (per salesperson)
- Goal: For each salesperson, see the previous sale amount and the change from that sale.
#### SELECT   salesperson,  sale_date,  amount,  LAG(amount, 1) OVER (    PARTITION BY salesperson    ORDER BY sale_date  ) AS prev_amount,  amount - LAG(amount, 1) OVER (    PARTITION BY salesperson    ORDER BY sale_date ) AS change_from_prev FROM sales ORDER BY salesperson, sale_date;
- The first row per salesperson has NULL for prev_amount. If you prefer a default (e.g., 0), use LAG(amount, 1, 0).
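
The query above repeats the same OVER (...) definition twice. MySQL 8.0+ also accepts a named window, which keeps the two LAG() calls in sync. A sketch using the LAG(amount, 1, 0) default mentioned above; the window name w and the alias prev_amount_or_zero are illustrative.

```sql
SELECT salesperson, sale_date, amount,
       LAG(amount, 1, 0) OVER w          AS prev_amount_or_zero,
       amount - LAG(amount, 1, 0) OVER w AS change_from_prev
FROM sales
WINDOW w AS (PARTITION BY salesperson ORDER BY sale_date)
ORDER BY salesperson, sale_date;
```

With a default of 0, the first row per salesperson reports its full amount as change_from_prev instead of NULL, which may or may not be what a report needs.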

### LEAD() — peek at the next sale (per salesperson)
- Goal: For each sale, see the next one for the same salesperson and the expected change.
#### SELECT salesperson, sale_date, amount, LEAD(amount, 1) OVER ( PARTITION BY salesperson ORDER BY sale_date ) AS next_amount, LEAD(amount, 1) OVER ( PARTITION BY salesperson ORDER BY sale_date ) - amount AS change_to_next FROM sales ORDER BY salesperson, sale_date;
- The last row per salesperson has NULL for next_amount. To repeat the current amount instead, pass it as the default: LEAD(amount, 1, amount).

### Bonus: LAG() within a different partition/order
- Goal: Within each region, list amounts from highest to lowest and show the previous higher amount.
#### SELECT   region,  salesperson,  amount,  LAG(amount) OVER (    PARTITION BY region    ORDER BY amount DESC  ) AS prev_higher_amount,  amount - LAG(amount) OVER (    PARTITION BY region    ORDER BY amount DESC  ) AS diff_from_prev FROM sales ORDER BY region, amount DESC;
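
One caveat for this query: once amounts tie within a region (as the extra rows above arrange), ORDER BY amount DESC alone does not fix the order of the tied rows, so which of them counts as "previous" is not guaranteed. A sketch that adds id as a tie-breaker, which is an assumption about the desired order rather than part of the original script:

```sql
SELECT region, salesperson, amount,
       LAG(amount) OVER (
           PARTITION BY region
           ORDER BY amount DESC, id   -- id breaks ties deterministically
       ) AS prev_higher_amount
FROM sales
ORDER BY region, amount DESC, id;
```

For the second row of a tied pair, prev_higher_amount equals the row's own amount, so "previous higher" really means "previous higher or equal".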