## Basic analysis: gross margin
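Join the order detail and header tables on `order_id`, then aggregate per city. Gross margin is total profit divided by total amount, expressed as a percentage.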

In [0]:
%sql

SELECT b.city, sum(profit) / sum(amount) *100  as gross_margin_percent
FROM sales_order_detail a
JOIN sales_order_header b
ON a.order_id = b.order_id
GROUP BY city
ORDER BY gross_margin_percent desc

city,gross_margin_percent
Surat,19.698301113063856
Allahabad,18.27727353621641
Udaipur,18.15226225955026
Kolkata,17.748118699417862
Delhi,14.021343778728165
Thiruvananthapuram,13.90147856452931
Pune,13.55694274364565
Amritsar,12.07011315731085
Gangtok,7.600454890068234
Simla,7.569813062543272
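The query above divides the summed profit by the summed amount, which is not the same as averaging each line's own margin. A minimal sketch of the difference, assuming made-up rows and using Python's built-in `sqlite3` as a stand-in for Spark SQL (the table and column names follow the query; the values are invented for illustration):

```python
import sqlite3

# In-memory SQLite database standing in for the Spark tables (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE sales_order_header (order_id TEXT, city TEXT);
CREATE TABLE sales_order_detail (order_id TEXT, amount REAL, profit REAL);
-- Toy data, not taken from the real dataset.
INSERT INTO sales_order_header VALUES ('o1', 'Pune'), ('o2', 'Pune');
INSERT INTO sales_order_detail VALUES ('o1', 100.0, 25.0), ('o2', 200.0, 150.0);
""")

# Same shape as the notebook query, plus the per-row average for comparison.
ratio_of_sums, mean_of_ratios = conn.execute("""
SELECT sum(profit) / sum(amount) * 100 AS gross_margin_percent,
       avg(profit / amount) * 100      AS avg_row_margin_percent
FROM sales_order_detail a
JOIN sales_order_header b
ON a.order_id = b.order_id
GROUP BY b.city
""").fetchone()

print(ratio_of_sums, mean_of_ratios)
```

The ratio of sums weights each order line by its amount, so large orders influence the city's margin more than small ones; the per-row average treats every line equally.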


### Profit by category

In [0]:
%sql

SELECT category, sub_category, sum(profit) profit
FROM sales_order_detail a
GROUP BY category, sub_category
ORDER BY profit desc

category,sub_category,profit
Electronics,Printers,5964.0
Furniture,Bookcases,4888.0
Electronics,Accessories,3559.0
Clothing,Trousers,2847.0
Clothing,Stole,2559.0
Electronics,Phones,2207.0
Clothing,Hankerchief,2098.0
Clothing,T-shirt,1500.0
Clothing,Shirt,1131.0
Furniture,Furnishings,844.0


### Profitable months

In [0]:
%sql

SELECT category, sum(profit) as total_profit, sum(quantity) as total_quantity, month(order_date) as month
FROM sales_order_detail a
JOIN sales_order_header b
ON a.order_id = b.order_id
GROUP BY month,category
ORDER BY category, total_profit desc, month

category,total_profit,total_quantity,month
Clothing,5060.0,516,3
Clothing,3736.0,380,11
Clothing,2148.0,284,10
Clothing,1901.0,259,12
Clothing,1822.0,312,2
Clothing,1691.0,436,1
Clothing,-48.0,142,7
Clothing,-184.0,251,4
Clothing,-267.0,233,5
Clothing,-1075.0,276,8
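Note that `month(order_date)` alone would merge the same calendar month from different years if the data spans more than one year. A small sketch of that effect, assuming a hypothetical `toy_sales` table with invented values and using `sqlite3` in place of Spark SQL (SQLite has no `month()` function, so `strftime` is used instead):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical table and values, for illustration only.
CREATE TABLE toy_sales (order_date TEXT, profit REAL);
INSERT INTO toy_sales VALUES
  ('2018-04-10', 100.0),
  ('2019-04-05', -30.0),
  ('2018-05-01', 20.0);
""")

# Grouping by month only: April 2018 and April 2019 collapse into one row.
by_month = conn.execute("""
SELECT CAST(strftime('%m', order_date) AS INTEGER) AS month, sum(profit)
FROM toy_sales
GROUP BY month
ORDER BY month
""").fetchall()

# Grouping by year and month keeps the two Aprils apart.
by_year_month = conn.execute("""
SELECT CAST(strftime('%Y', order_date) AS INTEGER) AS year,
       CAST(strftime('%m', order_date) AS INTEGER) AS month,
       sum(profit)
FROM toy_sales
GROUP BY year, month
ORDER BY year, month
""").fetchall()

print(by_month)
print(by_year_month)
```

In Spark SQL the equivalent change would be to add `year(order_date)` to both the select list and the `GROUP BY`.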


## Sales performance
Uses the pre-joined ("denormalized") table `sales_denorm` and window functions that aggregate or rank within a partition.
``` SQL API 
partition, rank/sum etc: https://spark.apache.org/docs/latest/api/sql/index.html#dense_rank
```

In [0]:
%sql

SELECT 
distinct 
category, city, sum(quantity) OVER (PARTITION BY category, city) as quantity_for_city_category
--,* --uncomment to check details along with the aggregation
FROM sales_denorm
--WHERE category='Furniture' and month(order_date)=1
ORDER BY category,city, quantity_for_city_category desc
--limit 20

-- select * from sales_denorm where category='Furniture' and month(order_date)=1 order by quantity desc -- 110 rows

category,city,quantity_for_city_category
Clothing,Ahmedabad,168
Clothing,Allahabad,93
Clothing,Amritsar,26
Clothing,Bangalore,116
Clothing,Bhopal,158
Clothing,Chandigarh,179
Clothing,Chennai,56
Clothing,Delhi,201
Clothing,Gangtok,73
Clothing,Goa,114
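The query above sums within each partition but does not rank. A possible sketch of ranking cities within each category with `dense_rank`, assuming made-up rows in a toy `sales_denorm` table and using `sqlite3` as a stand-in for Spark SQL (window functions need SQLite 3.25 or newer):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Toy rows, not taken from the real dataset.
CREATE TABLE sales_denorm (category TEXT, city TEXT, quantity INTEGER);
INSERT INTO sales_denorm VALUES
  ('Clothing', 'Delhi', 5), ('Clothing', 'Delhi', 3),
  ('Clothing', 'Goa', 8),   ('Clothing', 'Pune', 2),
  ('Furniture', 'Delhi', 1), ('Furniture', 'Goa', 4);
""")

# Aggregate per (category, city) first, then rank cities inside each category.
ranked = conn.execute("""
SELECT category, city, qty,
       dense_rank() OVER (PARTITION BY category ORDER BY qty DESC) AS city_rank
FROM (
  SELECT category, city, sum(quantity) AS qty
  FROM sales_denorm
  GROUP BY category, city
)
ORDER BY category, city_rank, city
""").fetchall()

for row in ranked:
    print(row)
```

`dense_rank` gives tied cities the same rank and does not skip the next rank, so Delhi and Goa share rank 1 for Clothing here and Pune gets rank 2.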


## Sales performance: target achieved?
```
--fetch target and actual sales amount
--get insights: check whether the target was achieved or not
--use case: a basis for forecasting near-term performance at the CATEGORY level
```

In [0]:
%sql

SELECT t.category, t.year, t.month, t.target, a.total_sold_amount,
case 
  -- LEFT JOIN leaves total_sold_amount NULL when a month has no sales; treat that as 0
  when t.target > coalesce(a.total_sold_amount, 0) then 'not achieved'
  else 'achieved'
end as performance
FROM sales_target t
LEFT JOIN
(
  select category, month(order_date) as month, year(order_date) as year,sum(amount) as total_sold_amount
  from sales_denorm
  group by  category, year, month
) as a 
  on t.month = a.month
  and t.year = a.year
  and t.category=a.category
--where t.category='Furniture' and t.year=2018 and t.month=4
--LIMIT 10

category,year,month,target,total_sold_amount,performance
Furniture,2018,4,10400.0,8121.0,not achieved
Furniture,2018,5,10500.0,6220.0,not achieved
Furniture,2018,6,10600.0,5532.0,not achieved
Furniture,2018,7,10800.0,3483.0,not achieved
Furniture,2018,8,10900.0,9538.0,not achieved
Furniture,2018,9,11000.0,8704.0,not achieved
Furniture,2018,10,11100.0,6766.0,not achieved
Furniture,2018,11,11300.0,15165.0,achieved
Furniture,2018,12,11400.0,9474.0,not achieved
Furniture,2019,1,11500.0,21257.0,achieved
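Because the targets are LEFT JOINed to the sales, a target month with no sales at all has a NULL `total_sold_amount`. Comparing against NULL yields NULL, so a plain `target > total_sold_amount` test falls through to the `else` branch. A minimal sketch of that behaviour and of guarding it with `coalesce`, assuming a hypothetical pre-aggregated `monthly_sales` table, an invented month with no sales, and `sqlite3` in place of Spark SQL:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE sales_target (category TEXT, year INTEGER, month INTEGER, target REAL);
-- Hypothetical stand-in for the aggregated subquery in the notebook cell.
CREATE TABLE monthly_sales (category TEXT, year INTEGER, month INTEGER, total_sold_amount REAL);
-- Toy scenario: month 5 deliberately has no sales row.
INSERT INTO sales_target VALUES ('Furniture', 2018, 4, 10400.0), ('Furniture', 2018, 5, 10500.0);
INSERT INTO monthly_sales VALUES ('Furniture', 2018, 4, 8121.0);
""")

rows = conn.execute("""
SELECT t.month,
       CASE WHEN t.target > a.total_sold_amount THEN 'not achieved'
            ELSE 'achieved' END AS naive_performance,
       CASE WHEN t.target > coalesce(a.total_sold_amount, 0) THEN 'not achieved'
            ELSE 'achieved' END AS null_safe_performance
FROM sales_target t
LEFT JOIN monthly_sales a
  ON t.category = a.category AND t.year = a.year AND t.month = a.month
ORDER BY t.month
""").fetchall()

for row in rows:
    print(row)
```

For the month without sales, the naive comparison reports 'achieved' while the `coalesce` version reports 'not achieved'.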
