<h1>Table of Contents<span class="tocSkip"></span></h1>
<div class="toc"><ul class="toc-item"><li><span><a href="#Lesson-Structure" data-toc-modified-id="Lesson-Structure-1"><span class="toc-item-num">1&nbsp;&nbsp;</span>Lesson Structure</a></span><ul class="toc-item"><li><span><a href="#Questions" data-toc-modified-id="Questions-1.1"><span class="toc-item-num">1.1&nbsp;&nbsp;</span>Questions</a></span></li><li><span><a href="#Outcomes" data-toc-modified-id="Outcomes-1.2"><span class="toc-item-num">1.2&nbsp;&nbsp;</span>Outcomes</a></span></li><li><span><a href="#Objectives" data-toc-modified-id="Objectives-1.3"><span class="toc-item-num">1.3&nbsp;&nbsp;</span>Objectives</a></span></li><li><span><a href="#Outline" data-toc-modified-id="Outline-1.4"><span class="toc-item-num">1.4&nbsp;&nbsp;</span>Outline</a></span></li></ul></li><li><span><a href="#Introduction-SQL" data-toc-modified-id="Introduction-SQL-2"><span class="toc-item-num">2&nbsp;&nbsp;</span>Introduction SQL</a></span></li><li><span><a href="#Demonstrating-SQLite-&amp;-Python" data-toc-modified-id="Demonstrating-SQLite-&amp;-Python-3"><span class="toc-item-num">3&nbsp;&nbsp;</span>Demonstrating SQLite &amp; Python</a></span><ul class="toc-item"><li><span><a href="#Connecting-to-our-sqlite-db-using-sqlite3" data-toc-modified-id="Connecting-to-our-sqlite-db-using-sqlite3-3.1"><span class="toc-item-num">3.1&nbsp;&nbsp;</span>Connecting to our sqlite db using sqlite3</a></span></li><li><span><a href="#Listing-the-tables-in-our-db" data-toc-modified-id="Listing-the-tables-in-our-db-3.2"><span class="toc-item-num">3.2&nbsp;&nbsp;</span>Listing the tables in our db</a></span></li><li><span><a href="#Now--select-everything-from-the-employees-table-just-to-get-a-feel-for-it" data-toc-modified-id="Now--select-everything-from-the-employees-table-just-to-get-a-feel-for-it-3.3"><span class="toc-item-num">3.3&nbsp;&nbsp;</span>Now  select everything from the employees table just to get a feel for it</a></span></li></ul></li><li><span><a href="#More-Complexity" 
data-toc-modified-id="More-Complexity-4"><span class="toc-item-num">4&nbsp;&nbsp;</span>More Complexity</a></span><ul class="toc-item"><li><span><a href="#I'm-pretty-lazy-and-I-don't-want-to-type..." data-toc-modified-id="I'm-pretty-lazy-and-I-don't-want-to-type...-4.1"><span class="toc-item-num">4.1&nbsp;&nbsp;</span>I'm pretty lazy and I don't want to type...</a></span></li><li><span><a href="#Why-not-just-load-the-full-results-directly-into-a-DataFrame?" data-toc-modified-id="Why-not-just-load-the-full-results-directly-into-a-DataFrame?-4.2"><span class="toc-item-num">4.2&nbsp;&nbsp;</span>Why not just load the full results directly into a DataFrame?</a></span></li><li><span><a href="#Specific-results-(and-order)-from-query" data-toc-modified-id="Specific-results-(and-order)-from-query-4.3"><span class="toc-item-num">4.3&nbsp;&nbsp;</span>Specific results (and order) from query</a></span></li><li><span><a href="#Queries-with-joins" data-toc-modified-id="Queries-with-joins-4.4"><span class="toc-item-num">4.4&nbsp;&nbsp;</span>Queries with joins</a></span></li><li><span><a href="#Want-More-Joins?" data-toc-modified-id="Want-More-Joins?-4.5"><span class="toc-item-num">4.5&nbsp;&nbsp;</span>Want More Joins?</a></span></li></ul></li><li><span><a href="#Exercises" data-toc-modified-id="Exercises-5"><span class="toc-item-num">5&nbsp;&nbsp;</span>Exercises</a></span></li></ul></div>

# Lesson Structure

## Questions

## Outcomes

You will be able to (YWBAT):
- build functions to perform queries
- execute a query using multiple joins

## Objectives

- create a Pandas DataFrame from a query using sqlite3
- compare and contrast using a dataframe vs using the results from a query
- write a single join query
- write a multiple join query

## Outline

* Take Questions
* Intro to SQL
* Connect to our sqlite db
* Discussion on reading SQL database as a dataframe
* More complex queries (joins)
* Practice various queries
* Complete a query with a join
* Complete a query using multiple joins
* Wrap up

# Introduction SQL

[An Introduction to SQL](intro_to_sql.ipynb)

# Demonstrating SQLite & Python

In [None]:
import pandas as pd
import sqlite3

In [None]:
ls

## Connecting to our sqlite db using sqlite3

In [None]:
conn = sqlite3.connect('data.sqlite')
cursor = conn.cursor()

## Listing the tables in our db

In [None]:
# Ask sqlite_master for the names of all the tables in the database
query = "SELECT name FROM sqlite_master WHERE type='table';"
res = cursor.execute(query).fetchall()
print(res) # Notice these are tuples; we'll extract only the names in the next cells

In [None]:
res[0]

In [None]:
table_names = [r[0] for r in res]
table_names
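Since we'll probably want this list again, here is a minimal sketch of the same steps wrapped in a function. The name `list_tables` and the two toy tables are made up for illustration, and an in-memory database stands in for `data.sqlite` so the cell runs on its own.

```python
import sqlite3


def list_tables(conn):
    """Return the names of all tables in the database as a list of strings."""
    query = "SELECT name FROM sqlite_master WHERE type='table' ORDER BY name;"
    # Each row is a 1-tuple like ('employees',), so keep only the first item
    return [row[0] for row in conn.execute(query).fetchall()]


# Toy in-memory database (an assumption; it stands in for data.sqlite)
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE offices (officeCode TEXT, city TEXT)")
conn.execute("CREATE TABLE employees (employeeNumber INTEGER, lastName TEXT)")

tables = list_tables(conn)
print(tables)
```

The `ORDER BY name` is only there so the names come back in a predictable order.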

## Now  select everything from the employees table just to get a feel for it

In [None]:
query = 'select * from employees limit 5;'
res = cursor.execute(query).fetchall()
res

# More Complexity

## I'm pretty lazy and I don't want to type...

Think about how much you have to type to get relevant info from the results of a new query.

In [None]:
# Write the query
query = 'select * from offices;'

# Execute the query
cursor.execute(query)

# fetch one or all results
# result = cursor.fetchone()
results = cursor.fetchall()

In [None]:
display(results)

In [None]:
# And if we want to see the descriptions
cursor.description
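To see what `cursor.description` actually holds, here is a small sketch on a toy in-memory table (the table contents are made up). In `sqlite3`, each column is described by a 7-item tuple where only the first item, the column name, is filled in.

```python
import sqlite3

# Toy in-memory database (an assumption; it stands in for data.sqlite)
conn = sqlite3.connect(":memory:")
cursor = conn.cursor()
cursor.execute("CREATE TABLE offices (officeCode TEXT, city TEXT)")
cursor.execute("INSERT INTO offices VALUES ('1', 'San Francisco')")

# description is only populated after a statement that returns rows
cursor.execute("SELECT * FROM offices")
description = cursor.description

# One tuple per column; the column name is the first item of each tuple
column_names = [col[0] for col in description]
print(column_names)
```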

If only there were a better way!

Wait, we have **functions**!!! → If you have to do it more than once, make a function!

In [None]:
def sql_with_cols(query, cursor=cursor):
    '''
    Run a query and return the full result, with the column names as the first row.
    '''
    result = cursor.execute(query).fetchall()
    # cursor.description has one tuple per column; its first item is the column name
    cols = tuple(description[0] for description in cursor.description)

    return [cols] + result

In [None]:
results = sql_with_cols(
    'select * from employees'
)

In [None]:
results

In [None]:
display("Description: ", results[0])

In [None]:
display("Items: ", results[1:])

In [None]:
# Now we can put these results into a DataFrame
def sql_to_df(query, cursor=cursor):
    '''
    Create a DataFrame directly from a SQL query.
    '''
    results = sql_with_cols(query, cursor)
    return pd.DataFrame(columns=results[0], data=results[1:])

In [None]:
df = sql_to_df('select * from employees')
df.head()

## Why not just load the full results directly into a DataFrame?

> What are the pros and cons of loading the results from a SQL query into a DataFrame?

- Pros
    - Easier to look at & manipulate

- Cons
    - Can pull too much data into memory if the query returns a lot of rows
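One way to soften the "too much data" problem is to let pandas read the query in pieces. This is a rough sketch, not part of the lesson's helper functions: it uses `pd.read_sql_query` with its `chunksize` argument on a made-up in-memory table, and the chunk size of 4 is arbitrary.

```python
import sqlite3

import pandas as pd

# Toy in-memory table with 10 made-up rows (stands in for a big table)
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (employeeNumber INTEGER, lastName TEXT)")
conn.executemany(
    "INSERT INTO employees VALUES (?, ?)",
    [(i, f"Name{i}") for i in range(10)],
)
conn.commit()

# Everything at once: fine for small results
df_all = pd.read_sql_query("SELECT * FROM employees", conn)

# With chunksize, we get an iterator of smaller DataFrames instead
chunk_sizes = []
for chunk in pd.read_sql_query("SELECT * FROM employees", conn, chunksize=4):
    chunk_sizes.append(len(chunk))

print(df_all.shape, chunk_sizes)
```

Often the simpler fix is to ask the database for less in the first place, with specific columns, a `WHERE` clause, or a `LIMIT`.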

## Specific results (and order) from query

In [None]:
# Select specific columns from a table
df = sql_to_df('''
    SELECT
        firstName as 'First Name',
        lastName as 'Last Name',
        jobTitle as 'Position',
        email
    FROM
        employees
''')

df.head()
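The query above picks the columns but leaves the row order up to the database. A minimal sketch of controlling the order with `ORDER BY`, using a toy in-memory `employees` table with made-up rows:

```python
import sqlite3

# Toy in-memory table; the rows are made up for illustration
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (firstName TEXT, lastName TEXT, jobTitle TEXT)")
conn.executemany(
    "INSERT INTO employees VALUES (?, ?, ?)",
    [
        ("Mary", "Patterson", "VP Sales"),
        ("Diane", "Murphy", "President"),
        ("Jeff", "Firrelli", "VP Marketing"),
    ],
)

query = """
    SELECT
        firstName AS "First Name",
        lastName AS "Last Name"
    FROM
        employees
    ORDER BY
        lastName ASC
"""
rows = conn.execute(query).fetchall()
print(rows)
```

Swapping `ASC` for `DESC` reverses the order.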

## Queries with joins

Let's look at a couple of tables that are related.

In [None]:
df_orderdetails = sql_to_df('''
    select
        *
    from
        orderdetails
''')
df_orderdetails.head()

In [None]:
df_products = sql_to_df('''
    select
        *
    from
        products
''')

df_products.head()

What if we want to get results from multiple tables?

Imagine we want to include the product info with the order details.

In [None]:
df_combo = sql_to_df('''
    select
        *
    from
        orderdetails
        JOIN products 
            on orderdetails.productCode = products.productCode
''')

df_combo.head()
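Two things are worth noticing about a `select *` join like this one. Here is a small sketch on toy in-memory tables (the rows are made up, and the `'ZZ'` product code deliberately has no match in `products`): a plain `JOIN` keeps only the rows that match in both tables, and the shared `productCode` column shows up twice.

```python
import sqlite3

# Toy in-memory tables; order 102 points at a product that does not exist
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orderdetails (orderNumber INTEGER, productCode TEXT, quantityOrdered INTEGER);
    CREATE TABLE products (productCode TEXT, productName TEXT);
    INSERT INTO orderdetails VALUES (100, 'A1', 2), (101, 'B2', 1), (102, 'ZZ', 5);
    INSERT INTO products VALUES ('A1', 'Toy Car'), ('B2', 'Toy Boat');
""")

cursor = conn.execute("""
    SELECT *
    FROM orderdetails
    JOIN products
        ON orderdetails.productCode = products.productCode
    ORDER BY orderdetails.orderNumber
""")

columns = [col[0] for col in cursor.description]
rows = cursor.fetchall()

print(columns)  # productCode is listed once per table
print(rows)     # order 102 is missing because 'ZZ' has no matching product
```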

What if we wanted only specific columns from the joined rows?

We can pick just the columns we need and add some new ones.

We can also make the query easier to read.

```SQL
SELECT
    -- We can add custom names for reference ("alias")
    od.orderNumber AS 'Order',
    p.productName AS Product,
    p.productCode AS Code,
    od.quantityOrdered AS Quantity,
    od.priceEach AS 'Price per Unit',

    -- We can do some math to make a new column too
    (od.priceEach * od.quantityOrdered) AS 'Total'
FROM
    -- We can also use aliases for table names
    orderdetails AS od
    JOIN products AS p
        ON od.productCode = p.productCode
```

In [None]:
df_combo = sql_to_df('''
    SELECT
        -- We can add custom names for reference ("alias")
        od.orderNumber AS 'Order',
        p.productName AS Product,
        p.productCode AS Code,
        od.quantityOrdered AS Quantity,
        od.priceEach AS 'Price per Unit',
        
        -- We can do some math to make a new column too
        (od.priceEach * od.quantityOrdered) AS 'Total'
    FROM
        -- We can also use aliases for table names
        orderdetails AS od
        JOIN products AS p
            ON od.productCode = p.productCode;
''')

df_combo.head()
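The same pattern extends to more than two tables: each extra `JOIN ... ON ...` adds one more table to the result. Below is a rough sketch on toy in-memory tables. The `orders` table, its `status` column, and all the rows are assumptions made up for this example, so check the real schema before reusing the query.

```python
import sqlite3

# Toy in-memory tables; "orders" and its columns are assumed for illustration
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (orderNumber INTEGER, status TEXT);
    CREATE TABLE orderdetails (orderNumber INTEGER, productCode TEXT,
                               quantityOrdered INTEGER, priceEach REAL);
    CREATE TABLE products (productCode TEXT, productName TEXT);
    INSERT INTO orders VALUES (100, 'Shipped'), (101, 'In Process');
    INSERT INTO orderdetails VALUES (100, 'A1', 2, 10.0), (101, 'B2', 1, 5.0);
    INSERT INTO products VALUES ('A1', 'Toy Car'), ('B2', 'Toy Boat');
""")

query = """
    SELECT
        o.orderNumber AS "Order",
        o.status AS Status,
        p.productName AS Product,
        (od.priceEach * od.quantityOrdered) AS Total
    FROM
        orders AS o
        -- First join: attach the line items to each order
        JOIN orderdetails AS od
            ON o.orderNumber = od.orderNumber
        -- Second join: attach the product info to each line item
        JOIN products AS p
            ON od.productCode = p.productCode
    ORDER BY
        o.orderNumber
"""
rows = conn.execute(query).fetchall()
print(rows)
```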

## Want More Joins?

Here's another notebook with some more info & practice on doing joins:

[DataEngineering/SQL/joins.ipynb](joins.ipynb)

# Exercises

[DataEngineering/SQL/sql_exercises.ipynb](sql_exercises.ipynb)