<img src="motorcycle.jpg" alt="Image of a motorcycle" height="250" width="250">

You're working for a company that sells motorcycle parts, and they've asked for some help in analyzing their sales data!

They operate three warehouses in the area, selling both retail and wholesale. They offer a variety of parts and accept credit card, cash, and bank transfer as payment methods. However, each payment type incurs a different fee.

The board of directors want to gain a better understanding of wholesale revenue by product line, and how this varies month-to-month and across warehouses. You have been tasked with calculating net revenue for each product line, grouping results by month and warehouse. The results should be filtered so that only `"Wholesale"` orders are included.

They have provided you with access to their database, which contains the following table called `sales`:

| Column | Data type | Description |
|--------|-----------|-------------|
| `order_number` | `VARCHAR` | Unique order number. |
| `date` | `DATE` | Date of the order, from June to August 2021. |
| `warehouse` | `VARCHAR` | The warehouse that the order was made from: `North`, `Central`, or `West`. |
| `client_type` | `VARCHAR` | Whether the order was `Retail` or `Wholesale`. |
| `product_line` | `VARCHAR` | Type of product ordered. |
| `quantity` | `INT` | Number of products ordered. | 
| `unit_price` | `FLOAT` | Price per product (dollars). |
| `total` | `FLOAT` | Total price of the order (dollars). |
| `payment` | `VARCHAR` | Payment method: `Credit card`, `Transfer`, or `Cash`. |
| `payment_fee` | `FLOAT` | Fee charged for the `payment` method, stored as a fraction of `total` (for example, `0.03` is 3%). |
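
The task does not spell out a formula for net revenue. One reading, which the query later in this notebook also uses, is `total * (1 - payment_fee)`, with `payment_fee` treated as a fraction of `total`. A minimal sketch of that arithmetic for order `N3` from the sample rows; the function name `net_revenue` is hypothetical and not part of the dataset:

```python
def net_revenue(total: float, payment_fee: float) -> float:
    """Return the order total after deducting the payment fee.

    Assumes payment_fee is a fraction of total (0.01 means 1%).
    """
    return total * (1 - payment_fee)


# Order N3 from the sample rows: total 605.44, paid by transfer (fee 0.01).
n3_net = net_revenue(605.44, 0.01)
print(round(n3_net, 2))
```

A cash order has a fee of `0.0`, so its net revenue equals its `total`.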


Your query output should be presented in the following format:

| `product_line` | `month` | `warehouse` | `net_revenue` |
|----------------|-----------|----------------------------|--------------|
| product_one | --- | --- | --- |
| product_one | --- | --- | --- |
| product_one | --- | --- | --- |
| product_one | --- | --- | --- |
| product_one | --- | --- | --- |
| product_one | --- | --- | --- |
| product_two | --- | --- | --- |
| ... | ... | ... | ... |



In [1]:
import pandas as pd
import sqlalchemy
engine = sqlalchemy.create_engine('postgresql://postgres:042711@localhost:2705/mei_database')
%load_ext sql
%sql $engine.url

In [2]:
sales = pd.read_csv('sales.csv')
sales.to_sql('sales', engine, if_exists='replace', index=False)

In [14]:
%%sql
ALTER TABLE sales
ALTER COLUMN date TYPE timestamp
USING date::timestamp;

SELECT table_name, column_name, data_type 
FROM INFORMATION_SCHEMA.COLUMNS
WHERE table_name = 'sales'

 * postgresql://postgres:***@localhost:2705/mei_database
Done.
10 rows affected.


table_name,column_name,data_type
sales,payment_fee,double precision
sales,date,timestamp without time zone
sales,quantity,bigint
sales,unit_price,double precision
sales,total,double precision
sales,order_number,text
sales,payment,text
sales,warehouse,text
sales,client_type,text
sales,product_line,text


In [6]:
%%sql
SELECT * 
FROM sales
LIMIT 5

 * postgresql://postgres:***@localhost:2705/mei_database
5 rows affected.


order_number,date,warehouse,client_type,product_line,quantity,unit_price,total,payment,payment_fee
N1,2021-06-01T00:00:00.000Z,North,Retail,Breaking system,9,19.29,173.61,Cash,0.0
N2,2021-06-01T00:00:00.000Z,North,Retail,Suspension & traction,8,32.93,263.45,Credit card,0.03
N3,2021-06-01T00:00:00.000Z,North,Wholesale,Frame & body,16,37.84,605.44,Transfer,0.01
N4,2021-06-01T00:00:00.000Z,North,Wholesale,Suspension & traction,40,37.37,1494.8,Transfer,0.01
N5,2021-06-01T00:00:00.000Z,North,Retail,Frame & body,6,45.44,272.61,Credit card,0.03


In [29]:
%%sql
-- month is a month name, so it sorts alphabetically (August, July, June)
SELECT product_line,
       TRIM(TO_CHAR(date, 'Month')) AS month,
       warehouse,
       ROUND(SUM(total * (1 - payment_fee))::numeric, 2) AS net_revenue
FROM sales
WHERE client_type = 'Wholesale'
GROUP BY product_line, month, warehouse
ORDER BY product_line, month, net_revenue DESC
LIMIT 10


 * postgresql://postgres:***@localhost:2705/mei_database
10 rows affected.


product_line,month,warehouse,net_revenue
Breaking system,August,Central,3009.1
Breaking system,August,West,2475.71
Breaking system,August,North,1753.19
Breaking system,July,Central,3740.94
Breaking system,July,West,3030.39
Breaking system,July,North,2568.55
Breaking system,June,Central,3648.14
Breaking system,June,North,1472.93
Breaking system,June,West,1200.64
Electrical system,August,North,4673.99
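
As a sanity check on the grouping logic, the same filter-and-aggregate steps can be sketched in plain Python. This is only an illustration run on the five sample rows shown earlier, not on the full table, so its numbers will not match the query output above. The function name `wholesale_net_revenue` and the tuple layout are assumptions made for this example:

```python
from collections import defaultdict
from datetime import date

# The five sample rows from this notebook, reduced to the columns needed:
# (date, warehouse, client_type, product_line, total, payment_fee)
sample_rows = [
    (date(2021, 6, 1), "North", "Retail", "Breaking system", 173.61, 0.0),
    (date(2021, 6, 1), "North", "Retail", "Suspension & traction", 263.45, 0.03),
    (date(2021, 6, 1), "North", "Wholesale", "Frame & body", 605.44, 0.01),
    (date(2021, 6, 1), "North", "Wholesale", "Suspension & traction", 1494.8, 0.01),
    (date(2021, 6, 1), "North", "Retail", "Frame & body", 272.61, 0.03),
]


def wholesale_net_revenue(rows):
    """Sum total * (1 - fee) per (product_line, month, warehouse), wholesale only."""
    totals = defaultdict(float)
    for order_date, warehouse, client_type, product_line, total, fee in rows:
        if client_type != "Wholesale":
            continue  # mirrors WHERE client_type = 'Wholesale'
        # %B gives the full month name (English under the default locale).
        key = (product_line, order_date.strftime("%B"), warehouse)
        totals[key] += total * (1 - fee)
    # Round after summing, as the SQL query does, and sort by the group key.
    return {key: round(value, 2) for key, value in sorted(totals.items())}


result = wholesale_net_revenue(sample_rows)
for key, value in result.items():
    print(*key, value)
```

Like the query, this sketch sorts months by name, so with the full data August would come before July and June. Sorting on the month number instead would give chronological order.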


In [5]:
%%sql
SELECT payment, avg(payment_fee)
FROM sales
GROUP BY payment
LIMIT 5;

 * postgresql://postgres:***@localhost:2705/mei_database
3 rows affected.


payment,avg
Transfer,0.0099999999999999
Credit card,0.0299999999999999
Cash,0.0
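
The averages print as `0.0099999999999999` and `0.0299999999999999` rather than `0.01` and `0.03`. The most likely cause is that `payment_fee` is stored as `double precision`, and binary floating point cannot represent most decimal fractions exactly. A small Python illustration of the same effect, independent of the database; the list of seven identical fees is made-up data for this example:

```python
import math
from decimal import Decimal

# Hypothetical data: seven transfer orders, each with a 1% fee.
float_fees = [0.01] * 7
decimal_fees = [Decimal("0.01")] * 7

float_avg = sum(float_fees) / len(float_fees)
decimal_avg = sum(decimal_fees) / len(decimal_fees)

# Floats are best compared with a tolerance rather than with ==.
print(math.isclose(float_avg, 0.01))

# Decimal arithmetic keeps the value exact.
print(decimal_avg == Decimal("0.01"))
```

For display purposes, rounding the average in the query, for example `round(avg(payment_fee)::numeric, 2)`, follows the same pattern the net revenue query uses.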


In [7]:
%%sql
SELECT DISTINCT client_type 
FROM sales

 * postgresql://postgres:***@localhost:2705/mei_database
2 rows affected.


client_type
Wholesale
Retail
