
## SQL Window Functions
The notebook covers:
- Database and table creation
- Sample data loading
- Window functions:
  - `ROW_NUMBER`
  - `RANK`
  - `DENSE_RANK`
  - `NTILE`


## Create a Sample Database

In [None]:

CREATE DATABASE IF NOT EXISTS retail_analytics;


In [None]:
USE retail_analytics;


## Create Products Table

In [None]:

CREATE TABLE IF NOT EXISTS products (
    product_id INT PRIMARY KEY AUTO_INCREMENT,
    category VARCHAR(50),
    product_name VARCHAR(100),
    total_sales INT
);



## Load Sample Data

In [None]:

INSERT INTO products (category, product_name, total_sales) VALUES
('Electronics', 'iPhone 17', 120),
('Electronics', 'Samsung Galaxy S25', 115),
('Electronics', 'OnePlus Nord CE5', 115),
('Electronics', 'Redmi A4', 100),
('Electronics', 'Vivo V27', 100),
('Electronics', 'Realme Narzo', 95),
('Electronics', 'Samsung Galaxy A55', 90),
('Electronics', 'iPhone 16 Pro', 85);



## View the Table


In [None]:

SELECT *
FROM products;



## Rank products based on how many units they sold, where higher sales indicate higher popularity


**Start by viewing the required data**


Before applying any ranking logic, first select the relevant columns.

```sql
SELECT 
    product_name,
    total_sales
FROM products;
```

#### Popularity is determined by higher sales values, so the data must be sorted accordingly

```sql
SELECT 
    product_name,
    total_sales
FROM products
ORDER BY total_sales DESC;
```


### Why can't rows be numbered directly?

Row numbering without order is undefined because SQL tables do not guarantee row order. SQL must know how to arrange rows before assigning numbers.

`ROW_NUMBER()` is a window function that assigns a unique sequential number to each row. However, it requires additional context to determine how rows should be numbered.

The `OVER()` clause defines the window, or perspective, of the data over which the function operates.

`ROW_NUMBER() OVER ()`

This alone is incomplete for ranking because no ordering is specified. MySQL accepts the empty window, but the numbers are then assigned in an arbitrary order that says nothing about popularity.
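
To see why the ordering matters, the two forms can be placed side by side. This is a minimal sketch against the `products` table created above, assuming MySQL 8.0 or later; the aliases `unordered_num` and `ordered_num` are illustrative names, not part of the original notebook.

```sql
-- Compare a window with no ordering against an ordered one.
SELECT
    product_name,
    total_sales,
    -- No ORDER BY in the window: numbers follow whatever order the engine
    -- happens to read the rows in, which is not guaranteed.
    ROW_NUMBER() OVER () AS unordered_num,
    -- Ordered window: numbers follow total_sales from highest to lowest.
    ROW_NUMBER() OVER (ORDER BY total_sales DESC) AS ordered_num
FROM products;
```

Only `ordered_num` carries meaning as a popularity position; `unordered_num` may change between runs or after the table is modified.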




**To rank products by popularity, rows must be ordered by sales within the window**

```sql
ROW_NUMBER() OVER (ORDER BY total_sales DESC)
```




## Ranking Query

The complete query combines column selection, window definition, and row numbering.

In [None]:

SELECT 
    product_name,
    total_sales,
    ROW_NUMBER() OVER (ORDER BY total_sales DESC) AS popularity
FROM products;


SQL processes this query conceptually in the following way:
- Data is read from the `products` table.
- Rows are ordered within the window by `total_sales DESC`.
- `ROW_NUMBER()` assigns sequential numbers starting from `1`.
- The final result is returned with the calculated `popularity` column.
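
Two details follow from the steps above. The `ORDER BY` inside `OVER (...)` controls only how the numbers are assigned, not the order in which rows are returned, and rows with equal `total_sales` can be numbered in either order. The variant below is a sketch that addresses both; using `product_id` as the tie-breaker is an assumption made for illustration (earlier-inserted products win ties), not a rule from the original query.

```sql
SELECT
    product_name,
    total_sales,
    -- product_id breaks ties so the numbering is repeatable across runs
    -- (assumed tie-break rule; pick whatever suits the data).
    ROW_NUMBER() OVER (ORDER BY total_sales DESC, product_id) AS popularity
FROM products
-- Outer ORDER BY guarantees the rows are displayed in popularity order.
ORDER BY popularity;
```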


## Comparing ROW_NUMBER, RANK, and DENSE_RANK

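These three functions differ only when sales values tie:
- `ROW_NUMBER` always assigns distinct numbers, so tied rows are numbered in an arbitrary order.
- `RANK` gives tied rows the same rank and skips the ranks that follow, leaving gaps.
- `DENSE_RANK` gives tied rows the same rank without leaving gaps.

In the sample data, the two products with 115 sales both receive `RANK` 2 and `DENSE_RANK` 2; the products with 100 sales then receive `RANK` 4 but `DENSE_RANK` 3.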


In [None]:

SELECT 
    product_name,
    total_sales,
    ROW_NUMBER() OVER (ORDER BY total_sales DESC) AS row_num,
    RANK() OVER (ORDER BY total_sales DESC) AS rank_num,
    DENSE_RANK() OVER (ORDER BY total_sales DESC) AS dense_rank_num
FROM products;
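
One place where the choice of function matters is a "top N" filter. The sketch below keeps every product in the three highest sales levels by filtering on `DENSE_RANK`. Because window functions cannot be used directly in a `WHERE` clause, the rank is computed in a derived table first; the alias `ranked` and the cutoff of 3 are illustrative choices.

```sql
-- Keep every product whose sales fall in the top three distinct sales levels.
SELECT
    product_name,
    total_sales,
    dense_rank_num
FROM (
    -- Compute the rank first; it can only be filtered one level up.
    SELECT
        product_name,
        total_sales,
        DENSE_RANK() OVER (ORDER BY total_sales DESC) AS dense_rank_num
    FROM products
) AS ranked  -- illustrative alias for the derived table
WHERE dense_rank_num <= 3
ORDER BY dense_rank_num, product_name;
```

With the sample data loaded once, this returns five rows: one product at 120 sales, two at 115, and two at 100. Filtering on `ROW_NUMBER() <= 3` instead would return exactly three rows and cut off one of the tied products arbitrarily.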



## Market Segmentation

Segment products into **4 performance tiers** based on sales.

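`NTILE(n)` divides the ordered rows into `n` groups that are as equal in size as possible. When the row count is not divisible by `n`, the earlier buckets receive one extra row.

For example:

- `NTILE(4)` creates 4 performance buckets
- Bucket numbers start from 1
- Bucket 1 contains the highest-performing rows when the window is ordered by `total_sales DESC`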


In [None]:

SELECT 
    product_name,
    total_sales,
    NTILE(4) OVER (ORDER BY total_sales DESC) AS performance_bucket
FROM products;



This approach is commonly used for **pricing strategy**, **inventory planning**, and **promotions**.
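
Bucket numbers are easier to read in a report when mapped to labels. The following sketch wraps the `NTILE` query in a derived table and attaches a label with `CASE`; the tier names (`Top`, `High`, `Mid`, `Low`), the alias `bucketed`, and the `product_id` tie-breaker are all assumptions made for illustration.

```sql
SELECT
    product_name,
    total_sales,
    performance_bucket,
    -- Map each bucket number to a readable label (label names are hypothetical).
    CASE performance_bucket
        WHEN 1 THEN 'Top'
        WHEN 2 THEN 'High'
        WHEN 3 THEN 'Mid'
        ELSE 'Low'
    END AS performance_tier
FROM (
    SELECT
        product_name,
        total_sales,
        -- product_id makes the bucket assignment repeatable when sales tie.
        NTILE(4) OVER (ORDER BY total_sales DESC, product_id) AS performance_bucket
    FROM products
) AS bucketed  -- illustrative alias for the derived table
ORDER BY performance_bucket, total_sales DESC;
```

Note that `NTILE` splits by row count, not by value. With the eight sample rows, each bucket holds two rows, so the two products tied at 115 sales land in different buckets (1 and 2), as do the two tied at 100 (buckets 2 and 3).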



## Summary

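- Window functions enable **row-level analytics**: each row gets a value computed from a set of related rows, without collapsing rows the way `GROUP BY` does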
- `ROW_NUMBER`, `RANK`, `DENSE_RANK` handle ranking logic
- `NTILE` helps in segmentation and categorization

These patterns are widely used in **retail analytics, dashboards, and reporting systems**.
