![DSL_logo](https://raw.githubusercontent.com/BrockDSL/SQL-Workshop/main/dsl_logo.png)


# SQL and Databases

During this workshop we'll learn how to interact with SQL databases. Our focus will be on pulling information from different tables and constructing queries.

# Before we Begin!

1. Please click the 'Copy to Drive' button in the toolbar above
1. Click on the gear icon next to your picture, select 'Editor', and make sure 'Show Line Numbers' is selected
1. Share in the chat box a quick hello and where you are in the world right now.

In [None]:
#Libraries
#We'll use Pandas to interact with the SQL file

import sqlite3
import pandas as pd
import matplotlib.pyplot as plt 

print("Done loading Library")

## What's a Database?

The diagram below represents how we can conceptualize a database as a series of _tables_ that reference each other. You can think of a _table_ as a very specific spreadsheet. We are going to be looking at a popular database used for teaching SQL that is called [NorthWind](https://docs.yugabyte.com/preview/sample-data/northwind/).

We are going to use a type of SQL connection/file called [SQLite](https://www.sqlite.org/index.html). This loads the SQL database into our environment so we can interact with it directly. Other systems, such as MySQL, require a connection to a separate SQL server. That kind of setup is typical when there is a lot of data to sort through.

Run the next cell to download and connect to that Database.

![ERD_Diagram](https://github.com/BrockDSL/SQL-Workshop/blob/main/Northwind_ERD.png?raw=true)

In [None]:
#Load SQLite File
!wget -O northwind.db "https://github.com/BrockDSL/SQL-Workshop/blob/main/northwind.db?raw=true"
try:
    connection = sqlite3.connect("northwind.db")
    print("Connection Successful!")
except sqlite3.Error as error:
    # Catch only database errors and show what SQLite reported
    print("Error connecting to the database:", error)
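One caveat about the connection step: `sqlite3.connect` creates a new, empty database file when the path does not exist, so a successful connection alone does not prove the download worked. The following is a minimal sketch of a stricter check. It assumes a throwaway file rather than `northwind.db`, and `has_tables` is a hypothetical helper name, not part of the workshop.

```python
import os
import sqlite3
import tempfile

def has_tables(path):
    """Return True if the SQLite file at `path` contains at least one table."""
    conn = sqlite3.connect(path)
    try:
        row = conn.execute(
            "SELECT COUNT(*) FROM sqlite_master WHERE type='table';"
        ).fetchone()
        return row[0] > 0
    finally:
        conn.close()

# Demonstration on a throwaway file (an assumption, not northwind.db)
with tempfile.TemporaryDirectory() as folder:
    demo_path = os.path.join(folder, "demo.db")

    # connect() succeeds even though the file did not exist before
    empty_result = has_tables(demo_path)

    # After creating a table, the check passes
    conn = sqlite3.connect(demo_path)
    conn.execute("CREATE TABLE Customers (CustomerID TEXT, ContactName TEXT);")
    conn.commit()
    conn.close()
    filled_result = has_tables(demo_path)

print(empty_result, filled_result)
```

Calling a check like this on `northwind.db` right after the download would flag an empty file before any queries are run.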


## Talking to the Database

We use a special kind of syntax to access the information that is in the database. We call this an _SQL query_. It is structured very much like a sentence.
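To make the sentence analogy concrete, here is a small sketch that reads a query clause by clause. It assumes a toy in-memory database with made-up rows instead of NorthWind, and it uses plain `sqlite3` so it runs on its own.

```python
import sqlite3

# Toy in-memory database; the rows are made up for illustration
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Customers (ContactName TEXT, Country TEXT);")
conn.executemany(
    "INSERT INTO Customers VALUES (?, ?);",
    [("Ana", "Mexico"), ("Bob", "Canada"), ("Chloe", "Canada")],
)

# Read it aloud: "select the contact name, from customers, where the country is Canada"
query = """
-- which columns do we want?
SELECT ContactName
-- which table holds them?
FROM Customers
-- which rows should be kept?
WHERE Country = 'Canada';
"""
rows = conn.execute(query).fetchall()
print(rows)
conn.close()
```

Each clause answers one question, which is why a query can usually be read aloud from top to bottom.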

### Show Tables

Our first query will show all the tables in our database, using the following:

```SQL
SELECT name FROM sqlite_master WHERE type='table';
```


In [None]:
SHOW_TABLES = \
"""

SELECT name FROM sqlite_master WHERE type='table';

"""
try:
    pd.read_sql_query(SHOW_TABLES, connection)
except:
    print("SQL is incorrect")
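The `try`/`except` pattern used in these cells prints "SQL is incorrect" for any failure, including a Python typo such as a misspelled variable name. As a hedged sketch of an alternative, the example below catches only database errors and shows SQLite's own message. It assumes plain `sqlite3` rather than Pandas and a toy in-memory table, and `run_query` is a hypothetical helper name.

```python
import sqlite3

# Toy in-memory database standing in for northwind.db
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Customers (ContactName TEXT, Phone TEXT);")

def run_query(connection, sql):
    """Run `sql` and return its rows, or None after printing SQLite's error."""
    try:
        return connection.execute(sql).fetchall()
    except sqlite3.Error as error:
        print("SQL is incorrect:", error)
        return None

# A valid query returns a list of rows
tables = run_query(conn, "SELECT name FROM sqlite_master WHERE type='table';")
print(tables)

# A misspelled table name now reports what went wrong
missing = run_query(conn, "SELECT * FROM Customer;")
conn.close()
```

Seeing the underlying message makes it easier to tell a mistake in the SQL apart from a mistake in the Python around it.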

### Show All Customers and Details

```SQL

SELECT * FROM Customers;

```

In [None]:
QUERY = \
"""

SELECT * FROM Customers;

"""
try:
    pd.read_sql_query(QUERY, connection)
except:
    print("SQL is incorrect")

### Show certain information about Customers

```SQL

SELECT ContactName, Phone FROM Customers;

```

In [None]:
QUERY = \
"""

SELECT ContactName, Phone FROM Customers;

"""
try:
    pd.read_sql_query(QUERY, connection)
except:
    print("SQL is incorrect")

### Selecting and Ordering

```SQL
SELECT ContactName, Country FROM Customers ORDER BY Country;

```

In [None]:
QUERY = \
"""

SELECT ContactName, Country FROM Customers ORDER BY Country;

"""
try:
    pd.read_sql_query(QUERY, connection)
except:
    print("SQL is incorrect")

### Selecting and Comparing Values

```SQL
SELECT * from Orders WHERE Freight > 50;

```

In [None]:
QUERY = \
"""

SELECT * from Orders WHERE Freight > 50;

"""
try:
    pd.read_sql_query(QUERY, connection)
except:
    print("SQL is incorrect")
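Filtering and ordering can be combined in one query. The sketch below assumes a toy `Orders` table with made-up freight values and uses plain `sqlite3` so it runs without the NorthWind file.

```python
import sqlite3

# Toy in-memory table; the values are made up for illustration
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Orders (OrderID INTEGER, Freight REAL);")
conn.executemany(
    "INSERT INTO Orders VALUES (?, ?);",
    [(1, 32.38), (2, 65.83), (3, 11.61), (4, 51.30)],
)

# WHERE filters the rows first; ORDER BY then sorts what is left.
# DESC reverses the default ascending order.
query = """
SELECT OrderID, Freight
FROM Orders
WHERE Freight > 50
ORDER BY Freight DESC;
"""
expensive = conn.execute(query).fetchall()
print(expensive)
conn.close()
```

The same comparison operators (`<`, `>=`, `=`, `<>`) work in the `WHERE` clause, and leaving out `DESC` sorts from smallest to largest.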

### Q1


In [None]:
QUERY = \
"""


"""
try:
    pd.read_sql_query(QUERY, connection)
except:
    print("SQL is incorrect")

### Q2

In [None]:
QUERY = \
"""


"""
try:
    pd.read_sql_query(QUERY, connection)
except:
    print("SQL is incorrect")

### Q3

In [None]:
QUERY = \
"""


"""
try:
    pd.read_sql_query(QUERY, connection)
except:
    print("SQL is incorrect")

## Aggregate Functions

We can perform some basic math with our SELECT statements using aggregate functions:

```
MIN
MAX
AVG
SUM
COUNT
```
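As a quick illustration, all five functions can appear in a single query. This sketch assumes a three-row toy `Products` table with made-up prices rather than the NorthWind data.

```python
import sqlite3

# Toy in-memory table; the prices are made up for illustration
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Products (ProductName TEXT, UnitPrice REAL);")
conn.executemany(
    "INSERT INTO Products VALUES (?, ?);",
    [("Tea", 10.0), ("Coffee", 20.0), ("Juice", 30.0)],
)

# Each aggregate collapses the whole column into a single value
query = """
SELECT MIN(UnitPrice), MAX(UnitPrice), AVG(UnitPrice), SUM(UnitPrice), COUNT(UnitPrice)
FROM Products;
"""
lowest, highest, average, total, how_many = conn.execute(query).fetchone()
print(lowest, highest, average, total, how_many)
conn.close()
```

Note that the query returns one row no matter how many rows the table holds.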

### AVG

What is the average price of all of the products that this company sells?

```SQL

Select AVG(UnitPrice) from Products;

```

In [None]:
QUERY = \
"""

Select AVG(UnitPrice) from Products;

"""
try:
    pd.read_sql_query(QUERY, connection)
except:
    print("SQL is incorrect")

### COUNT

How many employees does this company have?

```SQL

SELECT COUNT(EmployeeID) From Employees;

```

In [None]:
QUERY = \
"""

SELECT COUNT(EmployeeID) From Employees;

"""
try:
    pd.read_sql_query(QUERY, connection)
except:
    print("SQL is incorrect")

In [None]:
QUERY = \
"""

SELECT * From Employees;

"""
try:
    pd.read_sql_query(QUERY, connection)
except:
    print("SQL is incorrect")

### Q4

```SQL
;

```

In [None]:
QUERY = \
"""



"""
try:
    pd.read_sql_query(QUERY, connection)
except:
    print("SQL is incorrect")

### Q5

```SQL
;

```

In [None]:
QUERY = \
"""



"""
try:
    pd.read_sql_query(QUERY, connection)
except:
    print("SQL is incorrect")

### Q6

```SQL
;

```

In [None]:
QUERY = \
"""



"""
try:
    pd.read_sql_query(QUERY, connection)
except:
    print("SQL is incorrect")

## Selecting from Multiple Tables

The real power in SQL is the ability to create queries that span multiple tables. The catch in this scenario is that you need to match the _key_ between tables to generate your query.
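Matching a key can be thought of as two steps: look up the key in one table, then use it to filter another. The sketch below assumes two toy tables with made-up rows and runs the inner query on its own before nesting it.

```python
import sqlite3

# Two toy tables that share the EmployeeID key; the rows are made up
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Employees (EmployeeID INTEGER, FirstName TEXT);
CREATE TABLE Orders (OrderID INTEGER, EmployeeID INTEGER);
INSERT INTO Employees VALUES (1, 'Ana'), (2, 'Bob');
INSERT INTO Orders VALUES (100, 1), (101, 2), (102, 1);
""")

# Step 1: the inner query finds the key for the person we care about
inner = "SELECT EmployeeID FROM Employees WHERE FirstName = 'Ana'"
keys = conn.execute(inner).fetchall()

# Step 2: the outer query keeps the orders whose key is in that list
outer = f"SELECT OrderID FROM Orders WHERE EmployeeID IN ({inner}) ORDER BY OrderID;"
order_ids = conn.execute(outer).fetchall()

print(keys, order_ids)
conn.close()
```

Running the inner query by itself first is a handy way to check that it returns the keys you expect before building the full query around it.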


### Show Orders Handled by an Employee

Show all the orders associated with the employee Nancy Davolio.


```SQL

SELECT * FROM Orders WHERE EmployeeID 

IN(SELECT EmployeeID FROM Employees WHERE FirstName = 'Nancy');

```

In [None]:
QUERY = \
"""

SELECT * FROM Orders WHERE EmployeeID 

IN(SELECT EmployeeID FROM Employees WHERE FirstName = 'Nancy');


"""
try:
    pd.read_sql_query(QUERY, connection)
except:
    print("SQL is incorrect")

In [None]:
#Let's doublecheck

QUERY = \
"""

SELECT * FROM Employees;

"""
try:
    pd.read_sql_query(QUERY, connection)
except:
    print("SQL is incorrect")

### Show all products that are beverages

```SQL

SELECT * FROM Products WHERE CategoryID

IN(SELECT CategoryID FROM Categories WHERE CategoryName = 'Beverages');

```

In [None]:
QUERY = \
"""

SELECT * FROM Products WHERE CategoryID

IN(SELECT CategoryID FROM Categories WHERE CategoryName = 'Beverages');


"""
try:
    pd.read_sql_query(QUERY, connection)
except:
    print("SQL is incorrect")

In [None]:
#Let's doublecheck

QUERY = \
"""

SELECT * FROM Categories;

"""
try:
    pd.read_sql_query(QUERY, connection)
except:
    print("SQL is incorrect")

### Show all products that are proteins

```SQL

SELECT * FROM Products WHERE CategoryID

IN(SELECT CategoryID FROM Categories WHERE CategoryName = 'Meat/Poultry' OR CategoryName = 'Seafood');

```

In [None]:
QUERY = \
"""

SELECT * FROM Products WHERE CategoryID

IN(SELECT CategoryID FROM Categories WHERE CategoryName = 'Meat/Poultry' OR CategoryName = 'Seafood');

"""
try:
    pd.read_sql_query(QUERY, connection)
except:
    print("SQL is incorrect")

In [None]:
#Let's doublecheck

QUERY = \
"""

SELECT * FROM Categories;

"""
try:
    pd.read_sql_query(QUERY, connection)
except:
    print("SQL is incorrect")

### Q7

```SQL
;

```

In [None]:
QUERY = \
"""



"""
try:
    pd.read_sql_query(QUERY, connection)
except:
    print("SQL is incorrect")

### Q8

```SQL
;

```

In [None]:
QUERY = \
"""



"""
try:
    pd.read_sql_query(QUERY, connection)
except:
    print("SQL is incorrect")

### Q9

```SQL
;

```

In [None]:
QUERY = \
"""



"""
try:
    pd.read_sql_query(QUERY, connection)
except:
    print("SQL is incorrect")

## More complex Queries

Let's bring together all of these pieces and do some more sophisticated queries

### What is the average price of Beverages

```SQL

SELECT AVG(UnitPrice) FROM Products WHERE CategoryID

IN(SELECT CategoryID FROM Categories WHERE CategoryName = 'Beverages');

```

In [None]:
QUERY = \
"""

SELECT AVG(UnitPrice) FROM Products WHERE CategoryID

IN(SELECT CategoryID FROM Categories WHERE CategoryName = 'Beverages');

"""
try:
    pd.read_sql_query(QUERY, connection)
except:
    print("SQL is incorrect")

### How many products are proteins

```SQL


SELECT COUNT(*) FROM Products WHERE CategoryID

IN(SELECT CategoryID FROM Categories WHERE CategoryName = 'Meat/Poultry' OR CategoryName = 'Seafood');


```

In [None]:
QUERY = \
"""

SELECT COUNT(*) FROM Products WHERE CategoryID

IN(SELECT CategoryID FROM Categories WHERE CategoryName = 'Meat/Poultry' OR CategoryName = 'Seafood');


"""
try:
    pd.read_sql_query(QUERY, connection)
except:
    print("SQL is incorrect")

### What is the maximum freight charge for an order shipped by Speedy Express

```SQL

SELECT MAX(Freight) FROM Orders WHERE ShipVia

IN(SELECT ShipperID FROM Shippers WHERE CompanyName = 'Speedy Express');


```

In [None]:
QUERY = \
"""

SELECT MAX(Freight) FROM Orders WHERE ShipVia

IN(SELECT ShipperID FROM Shippers WHERE CompanyName = 'Speedy Express');


"""
try:
    pd.read_sql_query(QUERY, connection)
except:
    print("SQL is incorrect")
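The queries in this section filter with a nested `SELECT`. The same question can often be asked with a `JOIN`, which this workshop does not cover. The sketch below is offered only as a comparison: it assumes two toy tables with made-up rows and checks that both forms give the same average.

```python
import sqlite3

# Toy tables that share the CategoryID key; the rows are made up
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Categories (CategoryID INTEGER, CategoryName TEXT);
CREATE TABLE Products (ProductName TEXT, CategoryID INTEGER, UnitPrice REAL);
INSERT INTO Categories VALUES (1, 'Beverages'), (2, 'Seafood');
INSERT INTO Products VALUES ('Tea', 1, 10.0), ('Coffee', 1, 30.0), ('Crab', 2, 50.0);
""")

# The nested form used in this notebook
subquery_version = """
SELECT AVG(UnitPrice) FROM Products WHERE CategoryID
IN (SELECT CategoryID FROM Categories WHERE CategoryName = 'Beverages');
"""

# The JOIN form: match rows on the shared key, then filter by name
join_version = """
SELECT AVG(Products.UnitPrice)
FROM Products
JOIN Categories ON Products.CategoryID = Categories.CategoryID
WHERE Categories.CategoryName = 'Beverages';
"""

avg_subquery = conn.execute(subquery_version).fetchone()[0]
avg_join = conn.execute(join_version).fetchone()[0]
print(avg_subquery, avg_join)
conn.close()
```

Both forms rely on the same idea of matching the key between tables; which one reads better is largely a matter of preference for queries this small.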

### Q10

```SQL
;

```

In [None]:
QUERY = \
"""



"""
try:
    pd.read_sql_query(QUERY, connection)
except:
    print("SQL is incorrect")

### Q11

```SQL
;

```

In [None]:
QUERY = \
"""



"""
try:
    pd.read_sql_query(QUERY, connection)
except:
    print("SQL is incorrect")

### Q12

```SQL
;

```

In [None]:
QUERY = \
"""



"""
try:
    pd.read_sql_query(QUERY, connection)
except:
    print("SQL is incorrect")

## One last thing...

We can build a query against SQL in Pandas and then visualize the results by passing them to Matplotlib. Here we'll visualize who our best customers are.

In [None]:
QUERY = \
"""
SELECT * FROM Orders

"""
try:
    result = pd.read_sql_query(QUERY, connection)

    result = result.groupby("CustomerID").count().sort_values(by="OrderID",ascending = False)["OrderID"][0:10]

    plt.bar(result.index,result.values)
    plt.xticks(rotation = 45)
    plt.show()
except:
    print("SQL is incorrect")
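The cell above does the counting in Pandas with `groupby`. As an alternative sketch, the same top-ten count can be done inside the SQL query with `GROUP BY`. This example assumes a toy `Orders` table with made-up rows and plain `sqlite3`, and it stops short of plotting.

```python
import sqlite3

# Toy in-memory table; the rows are made up for illustration
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Orders (OrderID INTEGER, CustomerID TEXT);
INSERT INTO Orders VALUES
    (1, 'ALFKI'), (2, 'BONAP'), (3, 'ALFKI'),
    (4, 'ALFKI'), (5, 'BONAP'), (6, 'CHOPS');
""")

# GROUP BY makes one row per customer, COUNT counts that customer's orders,
# ORDER BY puts the busiest customers first, and LIMIT keeps the top ten.
query = """
SELECT CustomerID, COUNT(OrderID) AS OrderCount
FROM Orders
GROUP BY CustomerID
ORDER BY OrderCount DESC
LIMIT 10;
"""
top_customers = conn.execute(query).fetchall()
print(top_customers)
conn.close()
```

With Pandas, the result of a query like this could be passed to `plt.bar` directly, since the grouping and sorting have already happened in the database.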

### Q13

```SQL
;

```

In [None]:
QUERY = \
"""


"""
try:
    result = pd.read_sql_query(QUERY, connection)
    result = result.groupby("CategoryID").count()["ProductID"]
    plt.bar(result.index,result.values)
    plt.xticks(rotation = 45)
    plt.show()
except:
    print("SQL is incorrect")

### Q14

```SQL
;

```

In [None]:
QUERY = \
"""



"""
pd.read_sql_query(QUERY, connection)

---

## Congratulations!

Congratulations, you have successfully been introduced to SQL and how you can interact with it. It gets **very complex** quickly. There are lots of other places you can go with this.


To sign up for future sessions, please check us out on [Eventbrite](https://brockdsl.eventbrite.com).