# Select SQL Lab

### Introduction
We've learned all the basics about setting up our SQL database, creating tables, and inserting data. In this lab we will practice querying the data from our salad shop database using `SELECT` statements.

### Loading the Data

In [20]:
import sqlite3
import pandas as pd


conn = sqlite3.connect('./grocery.db')
cursor = conn.cursor()

employees_url = "https://raw.githubusercontent.com/eng-6-22/mod-1-sql-curriculum/master/sql-select-lab/data/employees.csv"
ingredients_url = "https://raw.githubusercontent.com/eng-6-22/mod-1-sql-curriculum/master/sql-select-lab/data/ingredients.csv"
ingredients_df = pd.read_csv(ingredients_url)
employees_df = pd.read_csv(employees_url)
ingredients_df.to_sql('ingredients', conn, index=False, if_exists='replace')
employees_df.to_sql('employees', conn, index=False, if_exists='replace')
employees_df.columns

Index(['id', 'name', 'position', 'start_date', 'zip_code', 'phone_number'], dtype='object')

In [21]:
ingredients_df.columns

Index(['id', 'name', 'cost_per_ounce', 'calories_per_ounce', 'expiration'], dtype='object')

### Selecting data
Our database is now ready for querying. We will use the Python library `sqlite3` to run our SELECT statements.
To begin, let's select the employees.

Fill in the appropriate SQL statements inside of each function.

> We'll help you get started with the first one.

In [2]:
cursor = conn.cursor()
cursor.execute('select * from employees;')
cursor.fetchall()

[(1, 'Tom', 'manager', '2019-05-20', 10001, '212-438-3984'),
 (2, 'Alison', 'salad maker', '2020-01-04', 10023, '212-329-2489'),
 (3, 'Jerry', 'delivery driver', '2019-12-15', 10001, '212-219-1029'),
 (4, 'Brian', 'salad maker', '2019-12-04', 10002, '646-328-2309'),
 (5, 'Vanessa', 'clerk', '2019-11-14', 11201, '718-348-2348'),
 (6, 'Cindy', 'assistant manager', '2019-11-12', 10002, '212-300-1203'),
 (7, 'Charles', 'delivery driver', '2020-01-02', 10003, '212-993-2348'),
 (8, 'Alex', 'assistant manager', '2019-06-11', 10001, '718-293-1823'),
 (9, 'Naomi', 'manager', '2019-02-12', 10002, '212-493-2922'),
 (10, 'Isabel', 'salad maker', '2020-01-04', 10001, '212-390-9122'),
 (11, 'James', 'clerk', '2020-01-10', 10023, '718-645-2929'),
 (12, 'Max', 'salad maker', '2019-10-03', 10002, '212-555-2911')]

In [5]:
def all_employees():
    cursor.execute('select * from employees;')
    return cursor.fetchall()


In [6]:
len(all_employees()) == 12

True
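
The rows returned by `fetchall()` are plain tuples, so the column names are not attached to the values. As a minimal sketch, assuming a throwaway in-memory database (the `demo_conn` and `demo_cursor` names are illustrative and not part of the lab), the cursor's `description` attribute can be used to recover the names after a `SELECT *`:

```python
import sqlite3

# Throwaway in-memory database so the lab's grocery.db is left untouched.
demo_conn = sqlite3.connect(":memory:")
demo_cursor = demo_conn.cursor()
demo_cursor.execute("CREATE TABLE employees (id INTEGER, name TEXT, start_date TEXT)")
demo_cursor.executemany(
    "INSERT INTO employees VALUES (?, ?, ?)",
    [(1, "Tom", "2019-05-20"), (2, "Alison", "2020-01-04")],
)

demo_cursor.execute("SELECT * FROM employees ORDER BY id")
# description holds one 7-item tuple per result column; item 0 is the column name.
column_names = [column[0] for column in demo_cursor.description]
rows = demo_cursor.fetchall()

# Pair each value with its column name to make a row easier to read.
records = [dict(zip(column_names, row)) for row in rows]
print(column_names)  # ['id', 'name', 'start_date']
print(records[0])    # {'id': 1, 'name': 'Tom', 'start_date': '2019-05-20'}
```
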

In [7]:
def all_ingredients():
    cursor.execute('select * from ingredients')
    return cursor.fetchall()

In [8]:
len(all_ingredients()) == 20

True

In [9]:
def all_ingredient_names():
    cursor.execute('select name from ingredients')
    return cursor.fetchall()

In [10]:
len(all_ingredient_names()) == 20

True

In [15]:
def all_employee_start_dates():
    cursor.execute('select start_date from employees')
    return cursor.fetchall()

In [16]:
len(all_employee_start_dates()) == 12

True
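
Even when only one column is selected, `fetchall()` returns a list of one-element tuples rather than a flat list of values. The sketch below, which again assumes an illustrative in-memory database rather than the lab's `grocery.db`, shows one way to flatten the result:

```python
import sqlite3

# Illustrative in-memory table with three rows copied from the employees output above.
demo_conn = sqlite3.connect(":memory:")
demo_cursor = demo_conn.cursor()
demo_cursor.execute("CREATE TABLE employees (id INTEGER, name TEXT, start_date TEXT)")
demo_cursor.executemany(
    "INSERT INTO employees VALUES (?, ?, ?)",
    [(1, "Tom", "2019-05-20"), (2, "Alison", "2020-01-04"), (9, "Naomi", "2019-02-12")],
)

# ORDER BY makes the row order explicit instead of relying on insertion order.
demo_cursor.execute("SELECT start_date FROM employees ORDER BY id")
rows = demo_cursor.fetchall()            # [('2019-05-20',), ('2020-01-04',), ('2019-02-12',)]
start_dates = [row[0] for row in rows]   # ['2019-05-20', '2020-01-04', '2019-02-12']
print(start_dates)
```
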

### Select Practice

In the following exercises, answer each question by writing a SQL SELECT statement that returns the solution. After reading each question, first write the SQL statement, then run it using the cursor object to find the answer.

> HINT: use the WHERE clause and boolean logic to filter the data within the select statement.


Here's an example:
What ingredients cost less than $1/oz?

In [None]:
#statement = '''SELECT * FROM ingredients WHERE cost_per_ounce < 1.0'''
#cursor.execute(statement)
#cursor.fetchall()

In [24]:
def cheaper_ingredients():
    statement = """SELECT * FROM ingredients WHERE cost_per_ounce < 1.0"""
    cursor.execute(statement)
    return cursor.fetchall()

What date did the employee Naomi begin working?

In [25]:
def naomi_start_date():
    statement = """SELECT start_date FROM employees WHERE name = 'Naomi'"""
    cursor.execute(statement)
    return cursor.fetchall()

In [26]:
naomi_start_date() == [('2019-02-12',)]

True
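
The query above hard-codes the name inside the SQL string. When the value comes from a variable, `sqlite3` supports `?` placeholders, which avoids building SQL by string concatenation. This is a sketch under assumptions: `start_date_for` is a hypothetical helper, and the in-memory table stands in for the lab's database.

```python
import sqlite3

# Illustrative in-memory table; the rows are copied from the employees output above.
demo_conn = sqlite3.connect(":memory:")
demo_cursor = demo_conn.cursor()
demo_cursor.execute("CREATE TABLE employees (id INTEGER, name TEXT, start_date TEXT)")
demo_cursor.executemany(
    "INSERT INTO employees VALUES (?, ?, ?)",
    [(1, "Tom", "2019-05-20"), (9, "Naomi", "2019-02-12")],
)

def start_date_for(name):
    """Hypothetical helper: look up the start date for any employee name."""
    # The ? placeholder is filled in from the tuple passed as the second argument.
    demo_cursor.execute("SELECT start_date FROM employees WHERE name = ?", (name,))
    return demo_cursor.fetchall()

print(start_date_for("Naomi"))   # [('2019-02-12',)]
print(start_date_for("Nobody"))  # [] because no row matches
```
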

Which ingredients expire after February 3rd 2020?

In [27]:
def ingredients_expiring_after_feb():
    statement = """SELECT * FROM ingredients WHERE expiration > '2020-02-03'"""
    cursor.execute(statement)
    return cursor.fetchall()

In [28]:
ingredients_expiring_after_feb()

[(1, 'baby spinach', 0.19, 7, '2020-02-05'),
 (2, 'kale', 0.5, 10, '2020-02-10'),
 (4, 'broccoli', 1.15, 45, '2020-02-10'),
 (5, 'chicken', 3.5, 90, '2020-02-07'),
 (6, 'arugula', 0.5, 20, '2020-02-10'),
 (7, 'apples', 1.2, 30, '2020-02-08'),
 (8, 'pears', 1.1, 30, '2020-02-10'),
 (9, 'corn', 1.25, 40, '2020-02-08'),
 (14, 'avocado', 2.05, 70, '2020-02-08'),
 (15, 'romaine', 0.2, 5, '2020-02-07'),
 (20, 'balsamic vinaigrette', 1.45, 40, '2020-02-07')]
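
The date filter works even though the `expiration` values are stored as text, because dates written as `YYYY-MM-DD` sort the same way alphabetically as they do chronologically. The sketch below illustrates this with an in-memory table; the "stale lettuce" row is a made-up example and is not part of the lab's data.

```python
import sqlite3

# ISO-style dates compare correctly as plain strings...
print("2020-02-10" > "2020-02-03")   # True
# ...but other formats do not: this later date compares as "smaller".
print("1/15/2021" > "12/1/2019")     # False

demo_conn = sqlite3.connect(":memory:")
demo_cursor = demo_conn.cursor()
demo_cursor.execute("CREATE TABLE ingredients (name TEXT, expiration TEXT)")
demo_cursor.executemany(
    "INSERT INTO ingredients VALUES (?, ?)",
    [
        ("baby spinach", "2020-02-05"),
        ("kale", "2020-02-10"),
        ("stale lettuce", "2020-01-30"),  # hypothetical row, not in the lab data
    ],
)

demo_cursor.execute(
    "SELECT name FROM ingredients WHERE expiration > '2020-02-03' ORDER BY name"
)
still_fresh = [row[0] for row in demo_cursor.fetchall()]
print(still_fresh)  # ['baby spinach', 'kale']
```
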

Which ingredients cost more than 1.00 but less than 1.50?

In [29]:
def ingredients_middle_price():
    statement = """SELECT * FROM ingredients WHERE cost_per_ounce > 1.00 and cost_per_ounce < 1.50"""
    cursor.execute(statement)
    return cursor.fetchall()

In [30]:
all([(1.00 < ingredient[2] < 1.50) for ingredient in ingredients_middle_price()])

True
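
A range filter like this one is sometimes written with `BETWEEN`, but the two forms are not interchangeable: `BETWEEN` includes both endpoints, while `>` and `<` exclude them. The following sketch shows the difference on an in-memory table; the "demo dressing" row priced at exactly 1.50 is hypothetical and was added only to expose the boundary case.

```python
import sqlite3

demo_conn = sqlite3.connect(":memory:")
demo_cursor = demo_conn.cursor()
demo_cursor.execute("CREATE TABLE ingredients (name TEXT, cost_per_ounce REAL)")
demo_cursor.executemany(
    "INSERT INTO ingredients VALUES (?, ?)",
    [
        ("kale", 0.5),
        ("broccoli", 1.15),
        ("apples", 1.2),
        ("chicken", 3.5),
        ("demo dressing", 1.5),  # hypothetical row sitting exactly on the upper bound
    ],
)

# Strict comparison: both endpoints are excluded.
demo_cursor.execute(
    "SELECT name FROM ingredients "
    "WHERE cost_per_ounce > 1.00 AND cost_per_ounce < 1.50 ORDER BY name"
)
strict_names = [row[0] for row in demo_cursor.fetchall()]

# BETWEEN: both endpoints are included.
demo_cursor.execute(
    "SELECT name FROM ingredients "
    "WHERE cost_per_ounce BETWEEN 1.00 AND 1.50 ORDER BY name"
)
between_names = [row[0] for row in demo_cursor.fetchall()]

print(strict_names)   # ['apples', 'broccoli']
print(between_names)  # ['apples', 'broccoli', 'demo dressing']
```
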

Which employees' home zip code is 10001 or 10002?

In [34]:
def employees_in_nearby_zip_code():
    statement = """SELECT * FROM employees WHERE zip_code = 10001 or zip_code = 10002"""
    cursor.execute(statement)
    return cursor.fetchall()

In [35]:
all([(employee[-2] == 10001 or employee[-2] == 10002) for employee in employees_in_nearby_zip_code()])

True
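
When one column is compared against several values, the chain of `OR` conditions can also be written with the `IN` operator. As a small sketch on an illustrative in-memory table (four rows copied from the employees output earlier in the lab), both forms return the same rows:

```python
import sqlite3

demo_conn = sqlite3.connect(":memory:")
demo_cursor = demo_conn.cursor()
demo_cursor.execute("CREATE TABLE employees (id INTEGER, name TEXT, zip_code INTEGER)")
demo_cursor.executemany(
    "INSERT INTO employees VALUES (?, ?, ?)",
    [(1, "Tom", 10001), (2, "Alison", 10023), (4, "Brian", 10002), (5, "Vanessa", 11201)],
)

# Two equality checks joined with OR, as in the exercise above.
demo_cursor.execute(
    "SELECT name FROM employees WHERE zip_code = 10001 OR zip_code = 10002 ORDER BY id"
)
with_or = demo_cursor.fetchall()

# The same filter expressed with IN.
demo_cursor.execute(
    "SELECT name FROM employees WHERE zip_code IN (10001, 10002) ORDER BY id"
)
with_in = demo_cursor.fetchall()

print(with_in)             # [('Tom',), ('Brian',)]
print(with_or == with_in)  # True
```
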

### Conclusion

The SELECT statement is how we query the data in a SQL database. We use it to retrieve exactly the data we are interested in analyzing. In this lab, we started with some basic SELECT commands, using `SELECT *` to select everything from a table and `SELECT column_name` to select all the data from a specific column, or columns, in a table.

Most of the time, however, we will want to add filters to our SELECT statements so we can be more precise about the data we are querying. For example, if our employees table had 1,000,000 rows but we only wanted to see the rows for employees who started after a certain date, we would use a filter to limit the results of the query. The `WHERE` clause is how we filter in our SELECT statements.

At the end of the lab, we used the `WHERE` clause, in conjunction with boolean logic, to apply filters to the queries.
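
To tie these ideas together, here is a minimal sketch of a start-date filter combined with a second condition using `AND`. It assumes an illustrative in-memory table holding five rows copied from the employees output earlier in the lab; the cutoff date and the choice of position are arbitrary.

```python
import sqlite3

demo_conn = sqlite3.connect(":memory:")
demo_cursor = demo_conn.cursor()
demo_cursor.execute(
    "CREATE TABLE employees (id INTEGER, name TEXT, position TEXT, start_date TEXT)"
)
demo_cursor.executemany(
    "INSERT INTO employees VALUES (?, ?, ?, ?)",
    [
        (1, "Tom", "manager", "2019-05-20"),
        (2, "Alison", "salad maker", "2020-01-04"),
        (4, "Brian", "salad maker", "2019-12-04"),
        (7, "Charles", "delivery driver", "2020-01-02"),
        (10, "Isabel", "salad maker", "2020-01-04"),
    ],
)

# Only rows that satisfy BOTH conditions are returned.
demo_cursor.execute(
    "SELECT name FROM employees "
    "WHERE start_date > ? AND position = ? ORDER BY id",
    ("2020-01-01", "salad maker"),
)
recent_salad_makers = [row[0] for row in demo_cursor.fetchall()]
print(recent_salad_makers)  # ['Alison', 'Isabel']
```
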