# SQL in Python Exercise

In this exercise we will use Jupyter and Python as our IDE for SQL. The Northwind database can be downloaded from [here](https://drive.google.com/file/d/1HCfNF5BsYUrQhhr-vnO_LFgs5THJUAh_/view?usp=sharing). Before turning to the actual tasks, let's recall how to connect to an existing database.

In [89]:
import pandas as pd
import sqlite3

In [90]:
def create_connection(db_file):
    """ create a database connection to the SQLite database
        specified by db_file
    :param db_file: database file
    :return: Connection object or None
    """
    conn = None
    try:
        conn = sqlite3.connect(db_file)
        return conn
    except sqlite3.Error as e:
        print(e)

    return conn

In [91]:
# if northwind is in the same working directory
con = create_connection("northwind (2).db")

> #### Warning
> If the file is not at the given path, the call still succeeds, but SQLite creates a new, empty database there instead of opening Northwind.

<sqlite3.Connection at 0x1fda9f35e40>
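One way to guard against the silent creation described in the warning is to open the file through an SQLite URI with `mode=rw`, which opens an existing file but refuses to create a new one. The sketch below is only an illustration: `open_existing` is a hypothetical helper, not part of the exercise, and the file name is made up.

```python
import os
import sqlite3
import tempfile
from pathlib import Path


def open_existing(db_file):
    """Hypothetical helper: connect only if db_file already exists."""
    # mode=rw opens the file read-write but, unlike the default, never creates it.
    uri = Path(db_file).resolve().as_uri() + "?mode=rw"
    return sqlite3.connect(uri, uri=True)


with tempfile.TemporaryDirectory() as tmp:
    # A path that is guaranteed not to exist (made-up file name).
    missing = os.path.join(tmp, "northwind_typo.db")
    try:
        open_existing(missing)
        opened = True
    except sqlite3.OperationalError as e:
        opened = False
        print("refused to create a new database:", e)
    # Check that no empty file was left behind.
    created = os.path.exists(missing)
```

With a wrong path this raises `sqlite3.OperationalError` right away, instead of failing later with "no such table".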

### Get all table names

**1. classic Python approach**

In [92]:
cursor = con.cursor()

res = cursor.execute("SELECT type, name FROM sqlite_master WHERE type='table'").fetchall()
all_tables = pd.DataFrame(res)
all_tables.columns = ["type","name"]
all_tables

Unnamed: 0,type,name
0,table,Employees
1,table,Categories
2,table,Customers
3,table,Shippers
4,table,Suppliers
5,table,Orders
6,table,Products
7,table,Order Details
8,table,Territories
9,table,EmployeeTerritories


**2. with pandas**

In [93]:
all_tables = pd.read_sql("SELECT type, name FROM sqlite_master WHERE type='table'", con)
all_tables

Unnamed: 0,type,name
0,table,Employees
1,table,Categories
2,table,Customers
3,table,Shippers
4,table,Suppliers
5,table,Orders
6,table,Products
7,table,Order Details
8,table,Territories
9,table,EmployeeTerritories


Using the **read_sql** function from pandas, we can easily pull the result of any SQL query into a table (a DataFrame) in Python.

In [94]:
orders = pd.read_sql("SELECT * FROM orders", con)
orders.ShippedDate.value_counts()

1998-04-10    8
1998-03-18    7
1998-01-23    6
1998-04-08    6
1998-04-24    6
             ..
1997-12-23    1
1997-03-26    1
1997-05-14    1
1998-04-09    1
1997-06-12    1
Name: ShippedDate, Length: 387, dtype: int64
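The same count can also be computed inside the database instead of with `value_counts()`. The following is a minimal sketch on a toy in-memory table (the rows are invented for illustration). It assumes we want to skip missing dates, which is what `value_counts()` does by default.

```python
import sqlite3

# Toy stand-in for the orders table; the rows are made up.
con_demo = sqlite3.connect(":memory:")
con_demo.execute("CREATE TABLE orders (OrderID INTEGER, ShippedDate TEXT)")
con_demo.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [(1, "1998-04-10"), (2, "1998-04-10"), (3, "1998-03-18"), (4, None)],
)

# GROUP BY + count(*) is the SQL counterpart of value_counts().
counts = con_demo.execute(
    """SELECT ShippedDate, count(*) AS n
       FROM orders
       WHERE ShippedDate IS NOT NULL
       GROUP BY ShippedDate
       ORDER BY n DESC, ShippedDate"""
).fetchall()
print(counts)
```

The same SQL string could be passed to `pd.read_sql(..., con)` to get the result as a DataFrame.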

### Task I: 
Write a query to get the product name and quantity per unit.

In [34]:
product_name_quantity = pd.read_sql(
    """SELECT ProductID, ProductName,sum(UnitsInStock) as UnitsInStock,
              sum(UnitsOnOrder) as UnitsOnOrder
              FROM products
    group by ProductName
    """, con)

product_name_quantity


Unnamed: 0,ProductID,ProductName,UnitsInStock,UnitsOnOrder
0,17,Alice Mutton,0,0
1,3,Aniseed Syrup,13,70
2,40,Boston Crab Meat,123,0
3,60,Camembert Pierrot,19,0
4,18,Carnarvon Tigers,42,0
...,...,...,...,...
72,7,Uncle Bob's Organic Dried Pears,15,0
73,50,Valkoinen suklaa,65,0
74,63,Vegie-spread,24,0
75,64,Wimmers gute Semmelknödel,22,80
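The query above reports stock levels. If the task is read literally, "quantity/unit" may instead refer to the `QuantityPerUnit` column of `products`. A minimal sketch of that reading, on a toy in-memory table filled with three rows copied from the `products` preview further down in this notebook:

```python
import sqlite3

# Toy copy of three products rows (values taken from the products preview).
con_demo = sqlite3.connect(":memory:")
con_demo.execute(
    "CREATE TABLE products (ProductID INTEGER, ProductName TEXT, QuantityPerUnit TEXT)"
)
con_demo.executemany(
    "INSERT INTO products VALUES (?, ?, ?)",
    [
        (1, "Chai", "10 boxes x 20 bags"),
        (2, "Chang", "24 - 12 oz bottles"),
        (3, "Aniseed Syrup", "12 - 550 ml bottles"),
    ],
)

# No aggregation is needed: each product already has one QuantityPerUnit value.
name_and_unit = con_demo.execute(
    "SELECT ProductName, QuantityPerUnit FROM products ORDER BY ProductName"
).fetchall()
for row in name_and_unit:
    print(row)
```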


### Task II: 
Write a query to get the most expensive and least expensive product (name and unit price).

In [49]:
# pd.read_sql(
#     """SELECT *
#               FROM products limit 2
#     """, con)

print(pd.read_sql(
    """SELECT ProductName, min(UnitPrice) as min_unitprice
              FROM products
    """, con))

print(pd.read_sql(
    """SELECT ProductName, max(UnitPrice) as max_unitprice
              FROM products
    """, con))

  ProductName  min_unitprice
0     Geitost            2.5
     ProductName  max_unitprice
0  Côte de Blaye          263.5
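The two queries above rely on an SQLite-specific behaviour: when a bare column such as `ProductName` appears next to a single `min()` or `max()`, SQLite takes it from the row holding the extreme value. Other database engines commonly reject such a query. A more portable sketch filters on subqueries instead and returns both products in one result. The table below is a toy stand-in using prices shown elsewhere in this notebook.

```python
import sqlite3

# Toy stand-in for the products table.
con_demo = sqlite3.connect(":memory:")
con_demo.execute("CREATE TABLE products (ProductName TEXT, UnitPrice REAL)")
con_demo.executemany(
    "INSERT INTO products VALUES (?, ?)",
    [("Chai", 18.0), ("Geitost", 2.5), ("Côte de Blaye", 263.5)],
)

# Keep the rows whose price equals the overall minimum or maximum.
extremes = con_demo.execute(
    """SELECT ProductName, UnitPrice
       FROM products
       WHERE UnitPrice = (SELECT min(UnitPrice) FROM products)
          OR UnitPrice = (SELECT max(UnitPrice) FROM products)
       ORDER BY UnitPrice"""
).fetchall()
print(extremes)
```

Unlike the bare-column version, this form also returns every product tied for the lowest or highest price.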


### Task III: 
Write a query to count current and discontinued products.




In [52]:
pd.read_sql(
    """SELECT discontinued, count(*)
              FROM products
              group by discontinued
    """, con)

Unnamed: 0,Discontinued,count(*)
0,0,69
1,1,8


### Task IV: 
Select all product names and their category names.




In [60]:
pd.read_sql(
    """SELECT distinct productname,categoryname
              FROM products as a 
        left join Categories as b
        on a.categoryid = b.categoryid
              
    """, con)

Unnamed: 0,ProductName,CategoryName
0,Alice Mutton,Meat/Poultry
1,Aniseed Syrup,Condiments
2,Boston Crab Meat,Seafood
3,Camembert Pierrot,Dairy Products
4,Carnarvon Tigers,Seafood
...,...,...
72,Uncle Bob's Organic Dried Pears,Produce
73,Valkoinen suklaa,Confections
74,Vegie-spread,Condiments
75,Wimmers gute Semmelknödel,Grains/Cereals


In [54]:
pd.read_sql(
    """SELECT *
              FROM Categories
              limit 3
    """, con)

Unnamed: 0,CategoryID,CategoryName,Description,Picture
0,1,Beverages,"Soft drinks, coffees, teas, beers, and ales",
1,2,Condiments,"Sweet and savory sauces, relishes, spreads, an...",
2,3,Confections,"Desserts, candies, and sweet breads",


### Task V: 
Select the product name, unit price and supplier region of all products whose supplier is not from the USA. (26 rows)


In [71]:

pd.read_sql(
    """SELECT distinct ProductName,UnitPrice,Region,Country
              FROM products as a
        left join suppliers as b
        on a.supplierid = b.supplierid
        where country not like ('USA')

    """, con)


Unnamed: 0,ProductName,UnitPrice,Region,Country
0,Alice Mutton,39.00,Victoria,Australia
1,Aniseed Syrup,10.00,,UK
2,Camembert Pierrot,34.00,,France
3,Carnarvon Tigers,62.50,Victoria,Australia
4,Chai,18.00,,UK
...,...,...,...,...
60,Tunnbröd,9.00,,Sweden
61,Valkoinen suklaa,16.25,,Finland
62,Vegie-spread,43.90,Victoria,Australia
63,Wimmers gute Semmelknödel,33.25,,Germany
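One detail of the query above is worth keeping in mind: a `WHERE` condition on a column of the right-hand table makes the `LEFT JOIN` behave like an inner join, because a comparison against `NULL` is never true. The sketch below shows this on toy tables. The product named "Mystery Item" and supplier ID 99 are invented to produce an unmatched row.

```python
import sqlite3

con_demo = sqlite3.connect(":memory:")
con_demo.executescript("""
    CREATE TABLE suppliers (SupplierID INTEGER, Region TEXT, Country TEXT);
    CREATE TABLE products (ProductName TEXT, UnitPrice REAL, SupplierID INTEGER);
    INSERT INTO suppliers VALUES (1, NULL, 'UK'), (2, 'LA', 'USA');
    -- SupplierID 99 has no matching supplier (hypothetical row).
    INSERT INTO products VALUES
        ('Chai', 18.0, 1), ('Cajun Seasoning', 22.0, 2), ('Mystery Item', 5.0, 99);
""")

base = """SELECT a.ProductName, a.UnitPrice, b.Region
          FROM products AS a
          LEFT JOIN suppliers AS b ON a.SupplierID = b.SupplierID
          WHERE {cond}
          ORDER BY a.ProductName"""

# NULL <> 'USA' evaluates to NULL, so the unmatched product is dropped.
strict = con_demo.execute(base.format(cond="b.Country <> 'USA'")).fetchall()

# Keeping products without a supplier requires an explicit NULL check.
lenient = con_demo.execute(
    base.format(cond="b.Country IS NULL OR b.Country <> 'USA'")
).fetchall()
print(strict)
print(lenient)
```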


In [70]:
pd.read_sql(
    """SELECT *
              FROM suppliers
              limit 3
    """, con)

Unnamed: 0,SupplierID,CompanyName,ContactName,ContactTitle,Address,City,Region,PostalCode,Country,Phone,Fax,HomePage
0,1,Exotic Liquids,Charlotte Cooper,Purchasing Manager,49 Gilbert St.,London,,EC1 4SD,UK,(171) 555-2222,,
1,2,New Orleans Cajun Delights,Shelley Burke,Order Administrator,P.O. Box 78934,New Orleans,LA,70117,USA,(100) 555-4822,,#CAJUN.HTM#
2,3,Grandma Kelly's Homestead,Regina Murphy,Sales Representative,707 Oxford Rd.,Ann Arbor,MI,48104,USA,(313) 555-5735,(313) 555-3349,


In [63]:
pd.read_sql(
    """SELECT *
              FROM products
              limit 3
    """, con)

Unnamed: 0,ProductID,ProductName,SupplierID,CategoryID,QuantityPerUnit,UnitPrice,UnitsInStock,UnitsOnOrder,ReorderLevel,Discontinued
0,1,Chai,1,1,10 boxes x 20 bags,18.0,39,0,10,0
1,2,Chang,1,1,24 - 12 oz bottles,19.0,17,40,25,0
2,3,Aniseed Syrup,1,2,12 - 550 ml bottles,10.0,13,70,25,0


### Task VI: 
Create a report that shows the number of orders for each month.



In [76]:
pd.read_sql(
    """SELECT strftime('%m', OrderDate) as month,count(*) as cnt
              FROM orders
              group by strftime('%m', OrderDate)
    """, con)

Unnamed: 0,month,cnt
0,1,88
1,2,83
2,3,103
3,4,105
4,5,46
5,6,30
6,7,55
7,8,58
8,9,60
9,10,64
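Grouping by `'%m'` alone merges the same calendar month from different years into one bucket. If a per-year breakdown is wanted, one option is to group by `'%Y-%m'` instead. A minimal sketch on a toy table with invented order dates, assuming the dates are stored as ISO 8601 text:

```python
import sqlite3

# Toy stand-in for the orders table; the dates are made up.
con_demo = sqlite3.connect(":memory:")
con_demo.execute("CREATE TABLE orders (OrderID INTEGER, OrderDate TEXT)")
con_demo.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [(1, "1996-07-04"), (2, "1996-07-05"), (3, "1997-07-01"), (4, "1998-05-06")],
)

# Month only: July 1996 and July 1997 land in the same group.
by_month = con_demo.execute(
    """SELECT strftime('%m', OrderDate) AS month, count(*) AS cnt
       FROM orders GROUP BY month ORDER BY month"""
).fetchall()

# Year and month: each calendar month of each year gets its own group.
by_year_month = con_demo.execute(
    """SELECT strftime('%Y-%m', OrderDate) AS ym, count(*) AS cnt
       FROM orders GROUP BY ym ORDER BY ym"""
).fetchall()
print(by_month)
print(by_year_month)
```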


### Task VII:
Create a report that shows the total quantity sold for each order. (830 rows)




In [83]:
pd.read_sql(
    """SELECT a.OrderID, sum(quantity) as total_ordered
              FROM orders as a 
              left join `Order Details` as b
              on a.OrderID = b.OrderID
              group by a.OrderID
    """, con)

Unnamed: 0,OrderID,total_ordered
0,10248,27
1,10249,49
2,10250,60
3,10251,41
4,10252,105
...,...,...
825,11073,30
826,11074,14
827,11075,42
828,11076,50


Unnamed: 0,OrderID,CustomerID,EmployeeID,OrderDate,RequiredDate,ShippedDate,ShipVia,Freight,ShipName,ShipAddress,...,ShipRegion,ShipPostalCode,ShipCountry,OrderID.1,ProductID,UnitPrice,Quantity,Discount,total,dow
0,10248,VINET,5,1996-07-04,1996-08-01,1996-07-16,3,32.38,Vins et alcools Chevalier,59 rue de l'Abbaye,...,,51100,France,10248,11,14.00,12,0.00,168.00,4
1,10248,VINET,5,1996-07-04,1996-08-01,1996-07-16,3,32.38,Vins et alcools Chevalier,59 rue de l'Abbaye,...,,51100,France,10248,42,9.80,10,0.00,98.00,4
2,10248,VINET,5,1996-07-04,1996-08-01,1996-07-16,3,32.38,Vins et alcools Chevalier,59 rue de l'Abbaye,...,,51100,France,10248,72,34.80,5,0.00,174.00,4
3,10249,TOMSP,6,1996-07-05,1996-08-16,1996-07-10,1,11.61,Toms Spezialitäten,Luisenstr. 48,...,,44087,Germany,10249,14,18.60,9,0.00,167.40,5
4,10249,TOMSP,6,1996-07-05,1996-08-16,1996-07-10,1,11.61,Toms Spezialitäten,Luisenstr. 48,...,,44087,Germany,10249,51,42.40,40,0.00,1696.00,5
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
2150,11077,RATTC,1,1998-05-06,1998-06-03,,2,8.53,Rattlesnake Canyon Grocery,2817 Milton Dr.,...,NM,87110,USA,11077,64,33.25,2,0.03,66.44,3
2151,11077,RATTC,1,1998-05-06,1998-06-03,,2,8.53,Rattlesnake Canyon Grocery,2817 Milton Dr.,...,NM,87110,USA,11077,66,17.00,1,0.00,17.00,3
2152,11077,RATTC,1,1998-05-06,1998-06-03,,2,8.53,Rattlesnake Canyon Grocery,2817 Milton Dr.,...,NM,87110,USA,11077,73,15.00,2,0.01,29.98,3
2153,11077,RATTC,1,1998-05-06,1998-06-03,,2,8.53,Rattlesnake Canyon Grocery,2817 Milton Dr.,...,NM,87110,USA,11077,75,7.75,4,0.00,31.00,3


### Task VIII: 
Create a report that shows how many years each employee has worked at the company. (9 rows)


In [95]:
pd.read_sql(
    """SELECT *
              FROM Employees
    """, con)

Unnamed: 0,EmployeeID,LastName,FirstName,Title,TitleOfCourtesy,BirthDate,HireDate,Address,City,Region,PostalCode,Country,HomePhone,Extension,Photo,Notes,ReportsTo,PhotoPath
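The cell above only lists the table. One possible way to compute the tenure is to subtract the hire year from a reference year and take one year off when the hire anniversary has not yet been reached. The sketch below is an assumption-laden illustration: the employee rows are invented, `HireDate` is assumed to be ISO 8601 text, and a fixed reference date is used so the result is reproducible (replace it with `date('now')` for today's value).

```python
import sqlite3

# Toy stand-in for the Employees table; names and dates are made up.
con_demo = sqlite3.connect(":memory:")
con_demo.execute("CREATE TABLE Employees (LastName TEXT, HireDate TEXT)")
con_demo.executemany(
    "INSERT INTO Employees VALUES (?, ?)",
    [
        ("Employee A", "1992-05-01"),
        ("Employee B", "1993-10-17"),
        ("Employee C", "1994-05-06"),
    ],
)

AS_OF = "1998-05-06"  # fixed reference date (assumption, for reproducibility)

# Year difference, minus 1 if this year's anniversary is still ahead.
tenure = con_demo.execute(
    """SELECT LastName, HireDate,
              CAST(strftime('%Y', :as_of) AS INTEGER)
                - CAST(strftime('%Y', HireDate) AS INTEGER)
                - (strftime('%m-%d', :as_of) < strftime('%m-%d', HireDate))
              AS years_worked
       FROM Employees
       ORDER BY HireDate""",
    {"as_of": AS_OF},
).fetchall()
for row in tenure:
    print(row)
```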


### Task IX: 
Which day of the week is the best for orders?

In [116]:
pd.read_sql(
    """SELECT strftime('%w', OrderDate) as dow, count(*) as cnt
              FROM orders
              group by dow
              order by cnt desc
              
    """, con)

DatabaseError: Execution failed on sql 'SELECT strftime('%w', OrderDate) as dow, count(*) as cnt
              FROM orders
              where orderid > 100
              limit(2)
              group by dow
              order by cnt desc
              
    ': near "group": syntax error
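The error message above belongs to a version of the query in which `LIMIT` was written before `GROUP BY`. SQLite expects the clauses in the order `WHERE`, `GROUP BY`, `ORDER BY`, `LIMIT`. A minimal sketch on a toy table (the order rows are invented) shows both the failing and the working order:

```python
import sqlite3

# Toy stand-in for the orders table; the rows are made up.
con_demo = sqlite3.connect(":memory:")
con_demo.execute("CREATE TABLE orders (OrderID INTEGER, OrderDate TEXT)")
con_demo.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [
        (10248, "1996-07-04"), (10249, "1996-07-05"), (10250, "1996-07-11"),
        (10251, "1996-07-12"), (10252, "1996-07-18"), (10253, "1996-07-08"),
    ],
)

# LIMIT before GROUP BY is rejected as a syntax error.
try:
    con_demo.execute(
        """SELECT strftime('%w', OrderDate) AS dow, count(*) AS cnt
           FROM orders WHERE OrderID > 100
           LIMIT 2
           GROUP BY dow ORDER BY cnt DESC"""
    )
    failed = False
except sqlite3.OperationalError as e:
    failed = True
    print("rejected:", e)

# Correct order: WHERE, GROUP BY, ORDER BY, LIMIT.
top_two = con_demo.execute(
    """SELECT strftime('%w', OrderDate) AS dow, count(*) AS cnt
       FROM orders WHERE OrderID > 100
       GROUP BY dow ORDER BY cnt DESC
       LIMIT 2"""
).fetchall()
print(top_two)
```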

In [112]:
pd.read_sql(
    """SELECT  strftime('%w', OrderDate) as dow, sum(((unitprice - discount)*Quantity)) as total
              FROM orders as a 
              left join `Order Details` as b
              on a.OrderID = b.OrderID
              group by dow
              order by total desc
              
    """, con)

Unnamed: 0,dow,total
0,5,283765.44
1,1,274653.05
2,2,271543.77
3,3,265739.51
4,4,255434.86
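Two details may help when reading this result. First, `strftime('%w', ...)` numbers the days from 0 (Sunday) to 6 (Saturday). Second, the `Discount` values visible in the joined output under Task VII (for example 0.03) look like fractions of the price rather than absolute amounts. If that assumption holds, revenue per line would be `UnitPrice * Quantity * (1 - Discount)` rather than `(UnitPrice - Discount) * Quantity`. The sketch below applies that formula to a few rows copied from that output. It is an alternative reading, not a correction confirmed by the data source.

```python
import sqlite3

con_demo = sqlite3.connect(":memory:")
con_demo.executescript("""
    CREATE TABLE orders (OrderID INTEGER, OrderDate TEXT);
    CREATE TABLE "Order Details"
        (OrderID INTEGER, UnitPrice REAL, Quantity INTEGER, Discount REAL);
    INSERT INTO orders VALUES (10248, '1996-07-04'), (11077, '1998-05-06');
    INSERT INTO "Order Details" VALUES
        (10248, 14.00, 12, 0.0), (10248, 9.80, 10, 0.0), (10248, 34.80, 5, 0.0),
        (11077, 33.25, 2, 0.03);
""")

# %w returns 0 for Sunday through 6 for Saturday.
DAY_NAMES = ["Sunday", "Monday", "Tuesday", "Wednesday",
             "Thursday", "Friday", "Saturday"]

rows = con_demo.execute(
    """SELECT strftime('%w', o.OrderDate) AS dow,
              sum(d.UnitPrice * d.Quantity * (1 - d.Discount)) AS total
       FROM orders AS o
       JOIN "Order Details" AS d ON o.OrderID = d.OrderID
       GROUP BY dow
       ORDER BY total DESC"""
).fetchall()

# Translate the numeric weekday into a readable name.
revenue_by_day = [(DAY_NAMES[int(dow)], total) for dow, total in rows]
for day, total in revenue_by_day:
    print(day, round(total, 2))
```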
