## Exercises - Basic SQL Queries

Here are some of the exercises for which you can write SQL queries to self evaluate.
* Ensure that we have required database and user for retail data. **We might provide the database as part of our labs.** Here are the instructions to use `psql` for setting up the required database (if required) and tables.

```shell
psql -U postgres -h localhost -p 5432 -W
```

```sql
CREATE DATABASE itversity_retail_db;
CREATE USER itversity_retail_user WITH ENCRYPTED PASSWORD 'retail_password';
GRANT ALL ON DATABASE itversity_retail_db TO itversity_retail_user;
```

* Create Tables using the script provided. You can either use `psql` or **SQL Workbench**.

```shell
psql -U itversity_retail_user \
  -h localhost \
  -p 5432 \
  -d itversity_retail_db \
  -W
```

* You can drop the existing tables.

```sql
DROP TABLE order_items;
DROP TABLE orders;
DROP TABLE customers;
DROP TABLE products;
DROP TABLE categories;
DROP TABLE departments;
```

* Once the tables are dropped you can run below script to create the tables for the purpose of exercises.

```sql
\i /data/retail_db/create_db_tables_pg.sql
```

* Data shall be loaded using the script provided.

```sql
\i /data/retail_db/load_db_tables_pg.sql
```

* Run queries to validate we have data in all the 3 tables.

### Exercise 1 - Customer order count

Get order count per customer for the month of 2014 January.
* Tables - orders and customers
* Data should be sorted in descending order by count and ascending order by customer id.
* Output should contain customer_id, customer_first_name, customer_last_name and customer_order_count.

In [2]:
%load_ext sql

In [3]:
%env DATABASE_URL=postgresql://itv002461_retail_user:7ji8g7gg8p8olbqbna5vz1tjyikaixco@pg.itversity.com:5433/itv002461_retail_db

env: DATABASE_URL=postgresql://itv002461_retail_user:7ji8g7gg8p8olbqbna5vz1tjyikaixco@pg.itversity.com:5433/itv002461_retail_db


In [None]:
%sql SELECT current_database()

In [None]:
%sql select column_name from information_schema.columns where table_name='customers'

In [55]:
%sql select column_name from information_schema.columns where table_name='orders'

 * postgresql://itv002461_retail_user:***@pg.itversity.com:5433/itv002461_retail_db
4 rows affected.


column_name
order_id
order_date
order_customer_id
order_status


In [18]:
%%sql

SELECT customer_id,customer_fname AS customer_first_name,customer_lname AS customer_last_name,count(1) AS customer_order_count
FROM orders o JOIN customers c ON o.order_customer_id=c.customer_id
WHERE to_char(order_date,'yyyy-mm-dd') LIKE '2014-01%'
GROUP BY c.customer_id
ORDER BY c.customer_id,customer_order_count DESC
LIMIT 10

 * postgresql://itv002461_retail_user:***@pg.itversity.com:5433/itv002461_retail_db
10 rows affected.


customer_id,customer_first_name,customer_last_name,customer_order_count
7,Melissa,Wilcox,4
8,Megan,Smith,2
13,Mary,Baldwin,1
14,Katherine,Smith,1
15,Jane,Luna,1
17,Mary,Robinson,1
18,Robert,Smith,2
24,Mary,Smith,2
26,Johnny,Hood,2
27,Mary,Vincent,1


In [None]:
%sql select order_id,order_date from orders where order_customer_id=24

### Exercise 2 - Dormant Customers

Get the customer details who have not placed any order for the month of 2014 January.
* Tables - orders and customers
* Data should be sorted in ascending order by customer_id
* Output should contain all the fields from customers

In [None]:
%%sql

SELECT c.* ,o.*
FROM customers c LEFT OUTER JOIN orders o
ON c.customer_id=o.order_customer_id and to_char(order_date,'yyyy-mm-dd') LIKE '2014-01%'
WHERE o.order_customer_id IS NULL
ORDER BY c.customer_id


In [22]:
%%sql

SELECT * FROM customers c2
WHERE customer_id NOT IN (SELECT c.customer_id 
                          FROM customers as c JOIN orders as o ON c.customer_id=o.order_customer_id 
                          WHERE to_char(order_date,'YYYY-MM')='2014-01' 
                          ORDER BY c.customer_id )
LIMIT 20

 * postgresql://itv002461_retail_user:***@pg.itversity.com:5433/itv002461_retail_db
20 rows affected.


customer_id,customer_fname,customer_lname,customer_email,customer_password,customer_street,customer_city,customer_state,customer_zipcode
1,Richard,Hernandez,XXXXXXXXX,XXXXXXXXX,6303 Heather Plaza,Brownsville,TX,78521
2,Mary,Barrett,XXXXXXXXX,XXXXXXXXX,9526 Noble Embers Ridge,Littleton,CO,80126
3,Ann,Smith,XXXXXXXXX,XXXXXXXXX,3422 Blue Pioneer Bend,Caguas,PR,725
4,Mary,Jones,XXXXXXXXX,XXXXXXXXX,8324 Little Common,San Marcos,CA,92069
5,Robert,Hudson,XXXXXXXXX,XXXXXXXXX,10 Crystal River Mall,Caguas,PR,725
6,Mary,Smith,XXXXXXXXX,XXXXXXXXX,3151 Sleepy Quail Promenade,Passaic,NJ,7055
9,Mary,Perez,XXXXXXXXX,XXXXXXXXX,3616 Quaking Street,Caguas,PR,725
10,Melissa,Smith,XXXXXXXXX,XXXXXXXXX,8598 Harvest Beacon Plaza,Stafford,VA,22554
11,Mary,Huffman,XXXXXXXXX,XXXXXXXXX,3169 Stony Woods,Caguas,PR,725
12,Christopher,Smith,XXXXXXXXX,XXXXXXXXX,5594 Jagged Embers By-pass,San Antonio,TX,78227


### Exercise 3 - Revenue Per Customer

Get the revenue generated by each customer for the month of 2014 January
* Tables - orders, order_items and customers
* Data should be sorted in descending order by revenue and then ascending order by customer_id
* Output should contain customer_id, customer_first_name, customer_last_name, customer_revenue.
* If there are no orders placed by customer, then the corresponding revenue for a give customer should be 0.
* Consider only COMPLETE and CLOSED orders

### Answer

In [23]:
%%sql

SELECT c.customer_id, c.customer_fname, c.customer_lname,
CASE 
    WHEN (round(sum(oi.order_item_subtotal)::numeric,2)) IS NULL THEN 0 
    ELSE (round(sum(oi.order_item_subtotal)::numeric,2)) 
END AS customer_revenue
FROM customers c LEFT OUTER JOIN orders o 
ON c.customer_id=o.order_customer_id 
LEFT OUTER JOIN order_items oi 
ON o.order_id=oi.order_item_order_id
WHERE to_char(o.order_date,'YYYY-MM') LIKE '2014-01%' AND o.order_status IN('COMPLETE','CLOSED')
GROUP BY c.customer_id
ORDER BY c.customer_id,customer_revenue DESC
LIMIT 10

 * postgresql://itv002461_retail_user:***@pg.itversity.com:5433/itv002461_retail_db
10 rows affected.


customer_id,customer_fname,customer_lname,customer_revenue
8,Megan,Smith,353.93
14,Katherine,Smith,704.93
15,Jane,Luna,0.0
17,Mary,Robinson,569.95
18,Robert,Smith,1309.85
26,Johnny,Hood,699.96
27,Mary,Vincent,0.0
28,Timothy,Smith,59.99
38,Mary,Smith,1209.83
42,Ethan,Smith,559.94


#### Practice query

In [25]:
%sql select column_name from information_schema.columns where table_name='order_items'

 * postgresql://itv002461_retail_user:***@pg.itversity.com:5433/itv002461_retail_db
6 rows affected.


column_name
order_item_id
order_item_order_id
order_item_product_id
order_item_quantity
order_item_subtotal
order_item_product_price


In [13]:
%%sql

SELECT o.order_customer_id,o.order_id,round(sum(order_item_subtotal)::numeric,2) AS customer_revenue
FROM  orders o LEFT OUTER JOIN order_items oi ON o.order_id=oi.order_item_order_id
WHERE to_char(order_date,'YYYY-MM') LIKE '2014-01%' AND order_status IN('COMPLETE','CLOSED')
GROUP BY o.order_id
ORDER BY order_customer_id,o.order_id,customer_revenue DESC
LIMIT 10

 * postgresql://itv002461_retail_user:***@pg.itversity.com:5433/itv002461_retail_db
10 rows affected.


order_customer_id,order_id,customer_revenue
4598,25882,879.93
6735,25888,299.98
10045,25889,119.95
3037,25891,319.97
1044,25895,599.97
6405,25897,615.98
3950,25898,129.99
8068,25899,
2382,25900,719.94
3099,25901,599.95


In [14]:
%%sql

SELECT o.order_id,count(order_item_subtotal)
FROM orders o LEFT OUTER JOIN order_items oi ON o.order_id=oi.order_item_order_id
WHERE to_char(order_date,'YYYY-MM') LIKE '2014-01%' AND order_status IN('COMPELTE','CLOSED')
group by order_id,order_item_subtotal
order by order_id
LIMIT 10

 * postgresql://itv002461_retail_user:***@pg.itversity.com:5433/itv002461_retail_db
10 rows affected.


order_id,count
25891,1
25891,1
25891,1
25899,0
25900,1
25900,1
25900,1
25932,1
25932,1
25932,1


### Exercise 4 - Revenue Per Category

Get the revenue generated for each category for the month of 2014 January
* Tables - orders, order_items, products and categories
* Data should be sorted in ascending order by category_id.
* Output should contain all the fields from category along with the revenue as category_revenue.
* Consider only COMPLETE and CLOSED orders

In [54]:
%sql select column_name from information_schema.columns where table_name='categories'

 * postgresql://itv002461_retail_user:***@pg.itversity.com:5433/itv002461_retail_db
3 rows affected.


column_name
category_id
category_department_id
category_name


In [56]:
%sql select column_name from information_schema.columns where table_name='products'

 * postgresql://itv002461_retail_user:***@pg.itversity.com:5433/itv002461_retail_db
6 rows affected.


column_name
product_id
product_category_id
product_name
product_description
product_price
product_image


In [11]:
%%sql

SELECT c.*,round(sum(order_item_subtotal)::numeric,2) AS category_revenue
FROM  categories c  LEFT JOIN products p ON c.category_id=p.product_category_id 
LEFT JOIN order_items oi ON p.product_id=oi.order_item_product_id 
LEFT JOIN orders o ON oi.order_item_order_id=o.order_id 
WHERE to_char(order_date,'YYYY-MM') LIKE '2014-01%' AND order_status IN('COMPLETE','CLOSED')
GROUP BY c.category_id
ORDER BY c.category_id
LIMIT 10

 * postgresql://itv002461_retail_user:***@pg.itversity.com:5433/itv002461_retail_db
10 rows affected.


category_id,category_department_id,category_name,category_revenue
2,2,Soccer,1094.88
3,2,Baseball & Softball,3214.41
4,2,Basketball,1299.98
5,2,Lacrosse,1299.69
6,2,Tennis & Racquet,1124.75
7,2,Hockey,1433.0
9,3,Cardio Equipment,133156.77
10,3,Strength Training,3388.96
11,3,Fitness Accessories,1509.73
12,3,Boxing & MMA,3998.46


### Exercise 5 - Product Count Per Department

Get the products for each department.
* Tables - departments, categories, products
* Data should be sorted in ascending order by department_id
* Output should contain all the fields from department and the product count as product_count

In [None]:
%sql select column_name from information_schema.columns where table_name='categories'

In [None]:
%sql select column_name from information_schema.columns where table_name='departments'

In [None]:
%sql select column_name from information_schema.columns where table_name='products'

In [4]:
%%sql
select  d.*,count(p.product_id) as product_count
from departments d  join categories c on d.department_id=c.category_department_id  
join products p on c.category_id=p.product_category_id
group by d.department_id
order by d.department_id
limit 10

6 rows affected.


department_id,department_name,product_count
2,Fitness,168
3,Footwear,168
4,Apparel,140
5,Golf,120
6,Outdoors,336
7,Fan Shop,149


#### Practice

In [None]:
%%sql

select departments.department_id,departments.department_name,products.product_id,count(products.product_id),products.product_name
from departments,categories,products
where departments.department_id=categories.category_department_id and categories.category_id=products.product_category_id  
and departments.department_id=4
group by departments.department_id,products.product_id
order by products.product_id
