### Setup

#### Load the "dotenv" extension to use environment variables in the notebook

In [4]:
%load_ext dotenv
%dotenv

The dotenv extension is already loaded. To reload it, use:
  %reload_ext dotenv


#### Load required libraries

In [5]:
from sqlalchemy import create_engine, text
import psycopg2 as ps
import pandas as pd
import os

#### Setup database connection

Make sure you create a `.env` file in the same directory as this notebook.

Your `.env` file should contain the definition of the following variables:
- DB_HOSTNAME
- DB_PORT
- DB_NAME
- DB_USER
- DB_PASS


Example:
```
DB_HOSTNAME = "localhost"
DB_PORT = 5432
DB_NAME = "retail_db"
DB_USER = "retail_user"
DB_PASS = "retailpassword"
```

In [6]:
DB_HOST = os.getenv("DB_HOSTNAME")
DB_PORT = os.getenv("DB_PORT")
DB_NAME = os.getenv("DB_NAME")
DB_USER = os.getenv("DB_USER")
DB_PASS = os.getenv("DB_PASS")
    
DB_CONNECTION_URI = "postgresql://%s:%s@%s:%s/%s" % (DB_USER, DB_PASS, DB_HOST, DB_PORT, DB_NAME)

# Create SQL engine with SQLAlchemy
sql_engine = create_engine(DB_CONNECTION_URI)

> You can now execute queries to the database with pandas!
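If the password contains characters such as `@` or `/`, or a variable is missing from `.env`, the plain `%` formatting above produces a URI that does not parse as intended. The following is a minimal sketch of one way to guard against both cases using only the standard library. The helper name `build_connection_uri` and the sample password are hypothetical and not part of the original notebook.

```python
import os
from urllib.parse import quote

REQUIRED_VARS = ("DB_HOSTNAME", "DB_PORT", "DB_NAME", "DB_USER", "DB_PASS")


def build_connection_uri(env=os.environ):
    """Build a PostgreSQL URI from environment-style settings (hypothetical helper)."""
    # Fail early with a clear message instead of building a URI that contains "None".
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError("Missing settings in .env: " + ", ".join(missing))

    # Percent-encode the credentials so "@", "/" or ":" cannot break the URI structure.
    user = quote(env["DB_USER"], safe="")
    password = quote(env["DB_PASS"], safe="")
    return "postgresql://%s:%s@%s:%s/%s" % (
        user, password, env["DB_HOSTNAME"], env["DB_PORT"], env["DB_NAME"]
    )


# Illustration with made-up settings; in the notebook you would call
# build_connection_uri() with no arguments to read the real environment.
example_uri = build_connection_uri({
    "DB_HOSTNAME": "localhost",
    "DB_PORT": "5432",
    "DB_NAME": "retail_db",
    "DB_USER": "retail_user",
    "DB_PASS": "p@ss/word",
})
print(example_uri)
```

The resulting string can be passed to `create_engine` in the same way as `DB_CONNECTION_URI`.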

# Exercises - Basic SQL Queries

The following exercises let you practice and self-evaluate the SQL query concepts covered so far.

- All the exercises are based on the retail database (retail tables).
- The tables are already set up and populated with data.
- The exercises use all 6 tables in the retail database.

## Solutions

**NOTE:** You can run queries using pandas and SQLAlchemy as follows:

```python
sql_df = pd.read_sql_query(
    '''
        SELECT *
        FROM courses
        ORDER BY course_id;
    ''',
    con = sql_engine
)
 
# Show query results
sql_df
```

### Exercise 1 - Customer order count

Get the order count per customer for January 2014.

- Tables - `orders` and `customers`
- Data should be sorted in descending order by count and ascending order by customer id
- Output should contain `customer_id`, `customer_fname`, `customer_lname` and `customer_order_count`.
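The two mechanics this exercise relies on, restricting rows to a single month and sorting by two keys in opposite directions, can be sketched on toy data. This is only an illustration under stated assumptions: the `visits` table and its columns are hypothetical, and an in-memory SQLite database stands in for PostgreSQL so the snippet runs without a server. A half-open date range (`>=` first day, `<` first day of the next month) is one common way to express the month filter.

```python
import sqlite3

# In-memory SQLite database with a hypothetical table (not part of the retail schema).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE visits (visitor_id INTEGER, visit_date TEXT);
    INSERT INTO visits VALUES
        (2, '2014-01-05'), (1, '2014-01-31'), (1, '2014-01-10'),
        (2, '2014-01-20'), (3, '2014-01-15'), (3, '2014-02-01');
""")

rows = conn.execute("""
    SELECT visitor_id, COUNT(*) AS visit_count
    FROM visits
    -- Half-open range: includes all of January, excludes February 1st
    WHERE visit_date >= '2014-01-01' AND visit_date < '2014-02-01'
    GROUP BY visitor_id
    -- Highest count first; ties are broken by the smaller id
    ORDER BY visit_count DESC, visitor_id ASC
""").fetchall()

print(rows)
```

Visitors 1 and 2 both have two January visits, so the second sort key decides their order.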

In [None]:
sql_df = pd.read_sql_query(
    '''
        YOUR QUERY HERE
    ''',
    con = sql_engine
)
 
# Show query results
sql_df

### Exercise 2 - Dormant Customers

Get the details of customers who did not place any order in January 2014.

- Tables - `orders` and `customers`
- Output columns - All columns from `customers`
- Data should be sorted in ascending order by `customer_id`

> HINT: You can use `NOT IN` or `NOT EXISTS` or `OUTER JOIN` to solve this problem.
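The three approaches named in the hint are all forms of an anti-join. The sketch below compares them on toy data. It is an illustration only: the `members` and `loans` tables are hypothetical, and an in-memory SQLite database is assumed in place of PostgreSQL so that the snippet is self-contained.

```python
import sqlite3

# Hypothetical tables, unrelated to the retail schema.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE members (member_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE loans (loan_id INTEGER PRIMARY KEY, member_id INTEGER, loan_date TEXT);
    INSERT INTO members VALUES (1, 'Ana'), (2, 'Ben'), (3, 'Cleo');
    INSERT INTO loans VALUES (10, 1, '2014-01-03'), (11, 3, '2014-02-09');
""")

# Shared month predicate, reused by each variant.
in_january = "l.loan_date >= '2014-01-01' AND l.loan_date < '2014-02-01'"

queries = {
    "NOT IN": f"""
        SELECT m.* FROM members AS m
        WHERE m.member_id NOT IN (
            SELECT l.member_id FROM loans AS l WHERE {in_january}
        )
        ORDER BY m.member_id
    """,
    "NOT EXISTS": f"""
        SELECT m.* FROM members AS m
        WHERE NOT EXISTS (
            SELECT 1 FROM loans AS l
            WHERE l.member_id = m.member_id AND {in_january}
        )
        ORDER BY m.member_id
    """,
    # The month filter sits in the ON clause so unmatched members survive the join.
    "OUTER JOIN": f"""
        SELECT m.* FROM members AS m
        LEFT OUTER JOIN loans AS l
            ON l.member_id = m.member_id AND {in_january}
        WHERE l.loan_id IS NULL
        ORDER BY m.member_id
    """,
}

results = {name: conn.execute(sql).fetchall() for name, sql in queries.items()}
for name, rows in results.items():
    print(name, rows)
```

One caveat: if the `NOT IN` subquery can return `NULL`, the outer query returns no rows at all, which is why `NOT EXISTS` is often the safer choice.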

In [16]:
sql_df = pd.read_sql_query(
    '''
        YOUR QUERY HERE
    ''',
    con = sql_engine
)
 
# Show query results
sql_df


### Exercise 3

Get the revenue generated by each customer in January 2014.

- Tables - `orders`, `order_items` and `customers`
- Data should be sorted in descending order by revenue and then ascending order by `customer_id`
- Output should contain `customer_id`, `customer_fname`, `customer_lname` and `customer_revenue`.
- If there are no orders placed by customer, then the corresponding revenue for a given customer should be 0.
- Consider only `COMPLETE` and `CLOSED` orders.
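The requirement that customers without qualifying orders still appear with a revenue of 0 is typically met with an outer join plus `COALESCE`. A simplified two-table sketch follows. It rests on assumptions: the `shops` and `sales` tables are hypothetical, the real exercise needs an additional join through `order_items`, and in-memory SQLite is used instead of PostgreSQL. The point to notice is that the status filter sits in the `ON` clause; placing it in `WHERE` would discard the rows with no match.

```python
import sqlite3

# Hypothetical tables, unrelated to the retail schema.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE shops (shop_id INTEGER PRIMARY KEY, shop_name TEXT);
    CREATE TABLE sales (
        sale_id INTEGER PRIMARY KEY, shop_id INTEGER, status TEXT, amount REAL
    );
    INSERT INTO shops VALUES (1, 'North'), (2, 'South'), (3, 'East');
    INSERT INTO sales VALUES
        (1, 1, 'COMPLETE', 50.0), (2, 1, 'CLOSED', 25.0),
        (3, 2, 'PENDING', 40.0), (4, 1, 'PENDING', 10.0);
""")

rows = conn.execute("""
    SELECT s.shop_id,
           s.shop_name,
           -- SUM over no matching rows is NULL; COALESCE turns it into 0
           COALESCE(SUM(x.amount), 0) AS shop_revenue
    FROM shops AS s
    LEFT OUTER JOIN sales AS x
        ON x.shop_id = s.shop_id
       AND x.status IN ('COMPLETE', 'CLOSED')
    GROUP BY s.shop_id, s.shop_name
    ORDER BY shop_revenue DESC, s.shop_id ASC
""").fetchall()

for row in rows:
    print(row)
```

Shop 2 has only a `PENDING` sale and shop 3 has no sales, yet both remain in the output with a revenue of 0.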

In [None]:
sql_df = pd.read_sql_query(
    '''
        YOUR QUERY HERE
    ''',
    con = sql_engine
)
 
# Show query results
sql_df

### Exercise 4

Get the revenue generated by each category in January 2014.


- Tables - `orders`, `order_items`, `products` and `categories`
- Data should be sorted in ascending order by `category_id`
- Output should contain all the fields from `categories` along with the revenue as `category_revenue`
- Consider only `COMPLETE` and `CLOSED` orders.

In [None]:
sql_df = pd.read_sql_query(
    '''
        YOUR QUERY HERE
    ''',
    con = sql_engine
)
 
# Show query results
sql_df

### Exercise 5

Get the count of products for each department.

- Tables - `departments`, `categories` and `products`
- Data should be sorted in ascending order by `department_id`
- Output should contain all the fields from departments and the product count as `product_count`

In [None]:
sql_df = pd.read_sql_query(
    '''
        YOUR QUERY HERE
    ''',
    con = sql_engine
)
 
# Show query results
sql_df