# The Basics

## Simple SELECTs
1) Query all the data in the `pets` table.
2) Query only the first 5 rows of the `pets` table.
3) Query only the names and ages of the pets in the `pets` table.
4) Query the pets in the `pets` table, sorted youngest to oldest.
5) Query the pets in the `pets` table alphabetically.
6) Query all the male pets in the `pets` table.
7) Query all the cats in the `pets` table.
8) Query all the pets in the `pets` table that are at least 5 years old.
9) Query all the male dogs in the `pets` table. Do not include the sex or species column, since you already know them.
10) Get all the names of the dogs in the `pets` table that are younger than 5 years old.
11) Query all the pets in the `pets` table that are either male dogs or female cats.
12) Query the five oldest pets in the `pets` table.
13) Get the names and ages of all the female cats in the `pets` table sorted by age, descending.
14) Get all pets from `pets` whose names start with P.
15) Select all employees from `employees_null` where the salary is missing.
16) Select all employees from `employees_null` where the salary is below $35,000 or missing.
17) Select all employees from `employees_null` where the job title is missing. What do you see?
18) Who is the newest employee in `employees`? The most senior?
19) Select all employees from `employees` named Thomas.
20) Select all employees from `employees` named Thomas or Shannon.
21) Select all employees from `employees` named Robert, Lisa, or any name that begins with a J. In addition, only show employees who are _not_ in sales. This will be a little bit of a longer query.
    * _Hint:_ There will only be 6 rows in the result.

## Column Operations
22) Query the top 5 rows of the `employees` table to get a glimpse of these new data.
23) Query the `employees` table, but convert their salaries to Euros. 
    * _Hint:_ 1 Euro = 1.1 USD.
    * _Hint2:_ If you think the output is ugly, try out the `ROUND()` function.
24) Repeat the previous problem, but rename the column `salary_eu`.
25) Query the `employees` table, but combine the `firstname` and `lastname` columns into a single column in "Lastname, Firstname" format. Call this column `fullname`. For example, the first row should contain `Thompson, Christine` as `fullname`. Also, display the rounded `salary_eu` instead of `salary`.
    * _Hint:_ The string concatenation operator is `||`
26) Query the `employees` table, but replace `startdate` with `startyear` using the `SUBSTR()` function. Also include `fullname` and `salary_eu`.
27) Repeat the above problem, but instead of using `SUBSTR()`, use `STRFTIME()`.
28) Query the `employees` table, replacing `firstname`/`lastname` with `fullname` and `startdate` with `startyear`. Print out the salary in USD again, except format it with a dollar sign, comma separators, and no decimal. For example, the first row should read `$123,696`. This column should still be named `salary`.
    * _Hint:_ Check out SQLite's `printf` function.
    * _Hint2:_ The format string you'll need is `$%,.2d`. You should read more about such formatting strings as they're useful in Python, too!

**Note:** For the next few problems, you'll probably want to use `CASE`/`WHEN` statements.

29) Last year, only salespeople were eligible for bonuses. Create a column `bonus` that is "Yes" if you're eligible for a bonus, otherwise "No".
30) This year, only salespeople with a salary of $100,000 or higher are eligible for bonuses. Create a `bonus` column like in the last problem for salespeople with salaries of at least $100,000.
31) Next year, the bonus structure will be a little more complicated. You'll create a `target_comp` column which represents an employee's target total compensation after their bonus. Here is the company's bonus structure:

* Salespeople who make more than $100,000 will be eligible for a 10% bonus.
* Salespeople who make less than $100,000 will be eligible for a 5% bonus.
* Administrators will also be eligible for a 5% bonus.
* Anyone who does not meet any of the above descriptions is not eligible for a bonus.

Create this `target_comp` column, making sure to format _both_ the `salary` and `target_comp` columns nicely (i.e., with dollar signs and comma separators).

In [1]:
# Import pandas and sqlite3; the queries below run through a sqlite3 connection

import pandas as pd
import sqlite3

con = sqlite3.connect("../ladder.db")

1) Query all the data in the `pets` table.

In [2]:
pd.read_sql("SELECT * FROM pets;", con)

Unnamed: 0,name,sex,species,age
0,Chloe,F,dog,9
1,Paddington,M,dog,10
2,Petey,M,dog,7
3,Bella,F,dog,1
4,Glenn Coco,M,cat,6
5,Alanna,F,cat,3
6,Mimi,F,dog,4
7,Midge,F,cat,7
8,Eli,M,dog,8
9,Shuri,F,cat,2


2) Query only the first 5 rows of the `pets` table.

In [3]:
sql = '''
SELECT *
FROM pets
LIMIT 5
'''
pd.read_sql_query(sql, con)


Unnamed: 0,name,sex,species,age
0,Chloe,F,dog,9
1,Paddington,M,dog,10
2,Petey,M,dog,7
3,Bella,F,dog,1
4,Glenn Coco,M,cat,6


3) Query only the names and ages of the pets in the `pets` table.

In [4]:
sql = '''
SELECT name, age
FROM pets
LIMIT 5
'''
pd.read_sql_query(sql, con)


Unnamed: 0,name,age
0,Chloe,9
1,Paddington,10
2,Petey,7
3,Bella,1
4,Glenn Coco,6


4) Query the pets in the `pets` table, sorted youngest to oldest.

In [5]:
sql = '''
SELECT species, age
FROM pets
ORDER BY age 
LIMIT 5;
'''
pd.read_sql_query(sql, con)

Unnamed: 0,species,age
0,dog,1
1,cat,2
2,cat,3
3,cat,3
4,lobster,3


5) Query the pets in the `pets` table alphabetically.

In [6]:
sql = '''
SELECT species
FROM pets
ORDER BY species;
'''
pd.read_sql_query(sql, con)

Unnamed: 0,species
0,cat
1,cat
2,cat
3,cat
4,cat
5,cat
6,dog
7,dog
8,dog
9,dog


6) Query all the male pets in the `pets` table.

In [7]:
sql = '''
SELECT *
FROM pets
WHERE sex = 'M';
'''
pd.read_sql_query(sql, con)

Unnamed: 0,name,sex,species,age
0,Paddington,M,dog,10
1,Petey,M,dog,7
2,Glenn Coco,M,cat,6
3,Eli,M,dog,8
4,Oliver,M,cat,5
5,Pinchy,M,lobster,3


7) Query all the cats in the `pets` table.

In [8]:
sql = '''
SELECT *
FROM pets
WHERE species = 'cat';
'''
pd.read_sql_query(sql, con)

Unnamed: 0,name,sex,species,age
0,Glenn Coco,M,cat,6
1,Alanna,F,cat,3
2,Midge,F,cat,7
3,Shuri,F,cat,2
4,Oliver,M,cat,5
5,Dottie,F,cat,3


8) Query all the pets in the `pets` table that are at least 5 years old.

In [9]:
sql = '''
SELECT *
FROM pets
WHERE age >= 5;
'''
pd.read_sql_query(sql, con)

Unnamed: 0,name,sex,species,age
0,Chloe,F,dog,9
1,Paddington,M,dog,10
2,Petey,M,dog,7
3,Glenn Coco,M,cat,6
4,Midge,F,cat,7
5,Eli,M,dog,8
6,Oliver,M,cat,5


9) Query all the male dogs in the `pets` table. Do not include the sex or species column, since you already know them.

In [10]:
sql = '''
SELECT name, age
FROM pets
WHERE (sex = 'M') AND (species = 'dog');
'''
pd.read_sql_query(sql, con)

Unnamed: 0,name,age
0,Paddington,10
1,Petey,7
2,Eli,8


10) Get all the names of the dogs in the `pets` table that are younger than 5 years old.

In [11]:
sql = '''
SELECT name
FROM pets
WHERE (age < 5) AND (species = 'dog');
'''
pd.read_sql_query(sql, con)

Unnamed: 0,name
0,Bella
1,Mimi


11) Query all the pets in the `pets` table that are either male dogs or female cats.

In [12]:
sql = '''
SELECT *
FROM pets
WHERE (sex = 'M' AND species = 'dog')
   OR (sex = 'F' AND species = 'cat');
'''
pd.read_sql_query(sql, con)

Unnamed: 0,name,sex,species,age
0,Paddington,M,dog,10
1,Petey,M,dog,7
2,Alanna,F,cat,3
3,Midge,F,cat,7
4,Eli,M,dog,8
5,Shuri,F,cat,2
6,Dottie,F,cat,3
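
The parentheses in the query above are worth a closer look. In SQL, `AND` binds more tightly than `OR`, so here they only document the intent, but moving them changes the result. The sketch below illustrates this on a small in-memory table; the rows and the `con_demo` and `names` names are made up for the illustration and are not taken from `ladder.db`.

```python
import sqlite3

# Hypothetical four-row table, not the real contents of ladder.db.
con_demo = sqlite3.connect(":memory:")
con_demo.execute("CREATE TABLE pets (name TEXT, sex TEXT, species TEXT, age INTEGER)")
con_demo.executemany(
    "INSERT INTO pets VALUES (?, ?, ?, ?)",
    [("Rex", "M", "dog", 4), ("Luna", "F", "dog", 6),
     ("Tom", "M", "cat", 2), ("Kit", "F", "cat", 9)],
)

def names(where_clause):
    """Return the pet names matching a WHERE clause, sorted by name."""
    rows = con_demo.execute(
        f"SELECT name FROM pets WHERE {where_clause} ORDER BY name"
    )
    return [r[0] for r in rows]

# Explicit grouping, as in the answer above.
grouped = names("(sex = 'M' AND species = 'dog') OR (sex = 'F' AND species = 'cat')")
# No parentheses: AND is evaluated before OR, so the result is the same.
ungrouped = names("sex = 'M' AND species = 'dog' OR sex = 'F' AND species = 'cat'")
# Parentheses in a different place: a different (here empty) result.
regrouped = names("sex = 'M' AND (species = 'dog' OR sex = 'F') AND species = 'cat'")

print(grouped)    # ['Kit', 'Rex']
print(ungrouped)  # ['Kit', 'Rex']
print(regrouped)  # []
```

Writing the parentheses even when they are redundant keeps the intent obvious to the next reader.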


12) Query the five oldest pets in the `pets` table.

In [13]:
sql = '''
SELECT *
FROM pets
ORDER BY age DESC
LIMIT 5;
'''
pd.read_sql_query(sql, con)

Unnamed: 0,name,sex,species,age
0,Paddington,M,dog,10
1,Chloe,F,dog,9
2,Eli,M,dog,8
3,Petey,M,dog,7
4,Midge,F,cat,7


13) Get the names and ages of all the female cats in the `pets` table sorted by age, descending.

In [14]:
sql = '''
SELECT name, age
FROM pets
WHERE species = 'cat' AND sex = 'F'
ORDER BY age DESC;
'''
pd.read_sql_query(sql, con)

Unnamed: 0,name,age
0,Midge,7
1,Alanna,3
2,Dottie,3
3,Shuri,2


14) Get all pets from `pets` whose names start with P.

In [15]:
sql = '''
SELECT *
FROM pets
WHERE name LIKE 'P%'
ORDER BY age DESC;
'''
pd.read_sql_query(sql, con)

Unnamed: 0,name,sex,species,age
0,Paddington,M,dog,10
1,Petey,M,dog,7
2,Pinchy,M,lobster,3
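
As a rough illustration of how `LIKE` patterns behave in SQLite, the sketch below runs a few patterns against a tiny in-memory table. The four names and the `con_demo` and `match` helpers are invented for the example. It assumes SQLite's default settings, under which `LIKE` ignores case for ASCII letters; `%` matches any run of characters and `_` matches exactly one.

```python
import sqlite3

# Hypothetical names, chosen to show case handling and pattern length.
con_demo = sqlite3.connect(":memory:")
con_demo.execute("CREATE TABLE pets (name TEXT)")
con_demo.executemany(
    "INSERT INTO pets VALUES (?)",
    [("Paddington",), ("Petey",), ("pip",), ("Chloe",)],
)

def match(pattern):
    """Return the names matching a LIKE pattern, sorted by name."""
    rows = con_demo.execute(
        "SELECT name FROM pets WHERE name LIKE ? ORDER BY name", (pattern,)
    )
    return [r[0] for r in rows]

starts_with_p = match("P%")      # 'P' or 'p' followed by anything
five_letters_p = match("P____")  # 'P' followed by exactly four characters
contains_e = match("%e%")        # an 'e' anywhere in the name

print(starts_with_p)   # ['Paddington', 'Petey', 'pip']
print(five_letters_p)  # ['Petey']
print(contains_e)      # ['Chloe', 'Petey']
```

Because `LIKE` is already case-insensitive for ASCII under these defaults, wrapping the column in `LOWER()` matters for `=` comparisons but is usually not needed for `LIKE`.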


15) Select all employees from `employees_null` where the salary is missing.

In [16]:
sql = '''
SELECT *
FROM employees_null
WHERE salary IS NULL;
'''
pd.read_sql_query(sql, con)

Unnamed: 0,ID,firstname,lastname,job,salary,startdate
0,22,Maria,Owen,Sales,,2010-03-22
1,47,Shannon,Rivera,Administrator,,2007-09-12
2,53,Jennifer,Cruz,Sales,,2015-11-09
3,56,Russell,Rice,Sales,,2008-09-02
4,59,Andre,Levine,Sales,,2016-03-27
5,60,Anna,Fischer,Sales,,1994-12-25
6,69,Danielle,Orr,Sales,,2004-06-24
7,79,Tamara,Douglas,Sales,,2017-10-03
8,85,Candice,Wright,Sales,,1998-03-04
9,98,Bradley,Romero,Administrator,,2009-05-10


16) Select all employees from `employees_null` where the salary is below $35,000 or missing.

In [17]:
sql = '''
SELECT *
FROM employees_null
WHERE salary < 35000 OR salary IS NULL;
'''
pd.read_sql_query(sql, con)

Unnamed: 0,ID,firstname,lastname,job,salary,startdate
0,22,Maria,Owen,Sales,,2010-03-22
1,26,Samantha,Nichols,Sales,33248.0,2015-04-20
2,47,Shannon,Rivera,Administrator,,2007-09-12
3,53,Jennifer,Cruz,Sales,,2015-11-09
4,55,Julian,Martinez,Sales,34597.0,2009-06-19
5,56,Russell,Rice,Sales,,2008-09-02
6,59,Andre,Levine,Sales,,2016-03-27
7,60,Anna,Fischer,Sales,,1994-12-25
8,65,Michael,West,Sales,32781.0,1991-08-08
9,69,Danielle,Orr,Sales,,2004-06-24
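
The two queries above rely on `IS NULL` rather than `= NULL`. A minimal sketch of why, using a made-up three-row `staff` table (the table, rows, and helper names are assumptions for the illustration): any ordinary comparison against `NULL` evaluates to `NULL`, which `WHERE` treats as not true, so missing salaries have to be requested explicitly.

```python
import sqlite3

# Hypothetical table with one missing salary.
con_demo = sqlite3.connect(":memory:")
con_demo.execute("CREATE TABLE staff (name TEXT, salary REAL)")
con_demo.executemany(
    "INSERT INTO staff VALUES (?, ?)",
    [("Ana", 30000.0), ("Ben", None), ("Cy", 90000.0)],
)

def names(where_clause):
    """Return the staff names matching a WHERE clause, sorted by name."""
    rows = con_demo.execute(
        f"SELECT name FROM staff WHERE {where_clause} ORDER BY name"
    )
    return [r[0] for r in rows]

equals_null = names("salary = NULL")   # the comparison is NULL, so no rows
is_null = names("salary IS NULL")      # the correct test for missing values
below = names("salary < 35000")        # silently skips the NULL row
below_or_missing = names("salary < 35000 OR salary IS NULL")

print(equals_null)       # []
print(is_null)           # ['Ben']
print(below)             # ['Ana']
print(below_or_missing)  # ['Ana', 'Ben']
```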


17) Select all employees from `employees_null` where the job title is missing. What do you see?

In [18]:
sql = '''
SELECT *
FROM employees_null
WHERE job IS NULL;
'''
pd.read_sql_query(sql, con)

Unnamed: 0,ID,firstname,lastname,job,salary,startdate
0,101,UNKNOWN,UNKNOWN,,-1,


18) Who is the newest employee in `employees`? The most senior?

In [19]:
sql = '''
SELECT lastname, firstname, startdate
FROM employees 
WHERE startdate = (SELECT MAX(startdate) FROM employees)
   OR startdate = (SELECT MIN(startdate) FROM employees);
'''
pd.read_sql_query(sql, con)

Unnamed: 0,lastname,firstname,startdate
0,Conner,Roger,2019-05-24
1,Nash,Mary,1990-02-16


19) Select all employees from `employees` named Thomas.

In [20]:
sql = '''
SELECT firstname || ' ' || lastname AS "Name"
FROM employees
WHERE LOWER(firstname) = 'thomas' OR LOWER(lastname) = 'thomas';

'''
pd.read_sql_query(sql, con)

Unnamed: 0,Name
0,Thomas Peck
1,Thomas Miller
2,Thomas Gonzalez


20) Select all employees from `employees` named Thomas or Shannon.

In [21]:
sql = '''
SELECT firstname || ' ' || lastname AS "Name"
FROM employees
WHERE LOWER(firstname) IN ('thomas', 'shannon')
   OR LOWER(lastname) IN ('thomas', 'shannon');

'''
pd.read_sql_query(sql, con)

Unnamed: 0,Name
0,Thomas Peck
1,Shannon Bailey
2,Thomas Miller
3,Shannon Rivera
4,Thomas Gonzalez


21) Select all employees from `employees` named Robert, Lisa, or any name that begins with a J. In addition, only show employees who are _not_ in sales. This will be a little bit of a longer query.
    * _Hint:_ There will only be 6 rows in the result.

In [22]:
sql = '''
-- Match on first name only; this yields the 6 rows the hint expects.
SELECT firstname || ' ' || lastname AS "Name", job
FROM employees
WHERE job != 'Sales'
  AND (LOWER(firstname) IN ('robert', 'lisa')
       OR LOWER(firstname) LIKE 'j%');
'''
pd.read_sql_query(sql, con)

Unnamed: 0,Name,job
0,Janice Martin,IT
1,Lisa Cooper,Operations
2,Jose Rosario,Administrator
3,Jenny Barber,IT
4,Joseph Campos,Operations
5,Robert Diaz,IT


22) Query the top 5 rows of the `employees` table to get a glimpse of these new data.

In [23]:
sql = '''
SELECT *
FROM employees
LIMIT 5;
'''
pd.read_sql_query(sql, con)

Unnamed: 0,ID,firstname,lastname,job,salary,startdate
0,0,Christine,Thompson,Sales,123696,2005-01-20
1,1,Thomas,Peck,Sales,112972,2011-01-04
2,2,Christopher,Robles,IT,78426,2003-12-10
3,3,Elizabeth,Munoz,Sales,55824,1993-07-28
4,4,Janice,Martin,IT,62007,2011-02-25


23) Query the `employees` table, but convert their salaries to Euros.
    * _Hint:_ 1 Euro = 1.1 USD.
    * _Hint2:_ If you think the output is ugly, try out the `ROUND()` function.

In [24]:
sql = '''
SELECT ID, firstname, lastname, job, ROUND(salary / 1.1,2), startdate
FROM employees
LIMIT 5;
'''
pd.read_sql_query(sql, con)

Unnamed: 0,ID,firstname,lastname,job,"ROUND(salary / 1.1,2)",startdate
0,0,Christine,Thompson,Sales,112450.91,2005-01-20
1,1,Thomas,Peck,Sales,102701.82,2011-01-04
2,2,Christopher,Robles,IT,71296.36,2003-12-10
3,3,Elizabeth,Munoz,Sales,50749.09,1993-07-28
4,4,Janice,Martin,IT,56370.0,2011-02-25


24) Repeat the previous problem, but rename the column `salary_eu`.

In [25]:
sql = '''
SELECT ID, firstname, lastname, job, ROUND(salary / 1.1,2) AS salary_eu, startdate
FROM employees
LIMIT 5;
'''
pd.read_sql_query(sql, con)

Unnamed: 0,ID,firstname,lastname,job,salary_eu,startdate
0,0,Christine,Thompson,Sales,112450.91,2005-01-20
1,1,Thomas,Peck,Sales,102701.82,2011-01-04
2,2,Christopher,Robles,IT,71296.36,2003-12-10
3,3,Elizabeth,Munoz,Sales,50749.09,1993-07-28
4,4,Janice,Martin,IT,56370.0,2011-02-25


25) Query the `employees` table, but combine the `firstname` and `lastname` columns into a single column in "Lastname, Firstname" format. Call this column `fullname`. For example, the first row should contain `Thompson, Christine` as `fullname`. Also, display the rounded `salary_eu` instead of `salary`.
    * _Hint:_ The string concatenation operator is `||`

In [26]:
sql = '''
SELECT ID, lastname || ', ' || firstname AS fullname , job, ROUND(salary / 1.1,2) AS salary_eu, startdate
FROM employees
LIMIT 5;
'''
pd.read_sql_query(sql, con)

Unnamed: 0,ID,fullname,job,salary_eu,startdate
0,0,"Thompson, Christine",Sales,112450.91,2005-01-20
1,1,"Peck, Thomas",Sales,102701.82,2011-01-04
2,2,"Robles, Christopher",IT,71296.36,2003-12-10
3,3,"Munoz, Elizabeth",Sales,50749.09,1993-07-28
4,4,"Martin, Janice",IT,56370.0,2011-02-25


26) Query the `employees` table, but replace `startdate` with `startyear` using the `SUBSTR()` function. Also include `fullname` and `salary_eu`.

In [27]:
sql = '''

SELECT SUBSTR(startdate,1,4) AS startyear, lastname || ', ' || firstname AS fullname ,
ROUND(salary / 1.1,2) AS salary_eu
FROM employees
LIMIT 5;

'''
pd.read_sql_query(sql, con)

Unnamed: 0,startyear,fullname,salary_eu
0,2005,"Thompson, Christine",112450.91
1,2011,"Peck, Thomas",102701.82
2,2003,"Robles, Christopher",71296.36
3,1993,"Munoz, Elizabeth",50749.09
4,2011,"Martin, Janice",56370.0


27) Repeat the above problem, but instead of using `SUBSTR()`, use `STRFTIME()`.

In [28]:
sql = '''

SELECT STRFTIME('%Y',startdate) AS startyear, lastname || ', ' || firstname AS fullname ,
ROUND(salary / 1.1,2) AS salary_eu
FROM employees
LIMIT 5;

'''
pd.read_sql_query(sql, con)

Unnamed: 0,startyear,fullname,salary_eu
0,2005,"Thompson, Christine",112450.91
1,2011,"Peck, Thomas",102701.82
2,2003,"Robles, Christopher",71296.36
3,1993,"Munoz, Elizabeth",50749.09
4,2011,"Martin, Janice",56370.0
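
`SUBSTR()` and `STRFTIME()` give the same year for well-formed dates, but they differ on malformed input: `SUBSTR()` just slices characters, while SQLite's date functions return `NULL` when they cannot parse the value. A small sketch of that difference follows; the input strings are made up, and no table is needed.

```python
import sqlite3

con_demo = sqlite3.connect(":memory:")

# The same value is bound to both placeholders.
query = "SELECT SUBSTR(?, 1, 4), STRFTIME('%Y', ?)"

valid = con_demo.execute(query, ("2005-01-20", "2005-01-20")).fetchone()
invalid = con_demo.execute(query, ("not a date", "not a date")).fetchone()

print(valid)    # ('2005', '2005')
print(invalid)  # ('not ', None)
```

If the column might contain badly formatted dates, the `NULL` from `STRFTIME()` is easier to detect than a plausible-looking slice.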


28) Query the `employees` table, replacing `firstname`/`lastname` with `fullname` and `startdate` with `startyear`. Print out the salary in USD again, except format it with a dollar sign, comma separators, and no decimal. For example, the first row should read `$123,696`. This column should still be named `salary`.
    * _Hint:_ Check out SQLite's `printf` function.
    * _Hint2:_ The format string you'll need is `$%,.2d`. You should read more about such formatting strings as they're useful in Python, too!

**Note:** For the next few problems, you'll probably want to use `CASE`/`WHEN` statements.

In [29]:
sql = '''

SELECT  lastname || ', ' || firstname AS fullname , STRFTIME('%Y',startdate) AS startyear,
FORMAT('$%,.2d', ROUND(salary)) AS salary
FROM employees
LIMIT 5;

'''
pd.read_sql_query(sql, con)

Unnamed: 0,fullname,startyear,salary
0,"Thompson, Christine",2005,"$123,696"
1,"Peck, Thomas",2011,"$112,972"
2,"Robles, Christopher",2003,"$78,426"
3,"Munoz, Elizabeth",1993,"$55,824"
4,"Martin, Janice",2011,"$62,007"
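
Since the hint mentions that format strings are useful in Python too, here is a minimal sketch of the same dollar-and-comma formatting done on the Python side with f-strings. The sample values are taken from the outputs in this notebook; the variable names are only for illustration. The `,` in the format spec inserts thousands separators.

```python
salaries = [123696, 78426, 62007]

# Thousands separators on integers.
formatted = [f"${s:,}" for s in salaries]
print(formatted)  # ['$123,696', '$78,426', '$62,007']

# For a non-integer amount, decide explicitly between truncating and rounding.
target = 50611 * 1.05               # roughly 53141.55
truncated = f"${int(target):,}"     # drop the fractional part
rounded = f"${target:,.0f}"         # round to the nearest dollar
print(truncated)  # $53,141
print(rounded)    # $53,142
```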


29) Last year, only salespeople were eligible for bonuses. Create a column `bonus` that is "Yes" if you're eligible for a bonus, otherwise "No".

In [30]:
sql = '''

SELECT  lastname || ', ' || firstname AS fullname ,job, 
CASE 
WHEN LOWER(job) = 'sales' THEN 'Yes'
ELSE 'No'
END AS bonus
FROM employees
LIMIT 5;

'''
pd.read_sql_query(sql, con)

Unnamed: 0,fullname,job,bonus
0,"Thompson, Christine",Sales,Yes
1,"Peck, Thomas",Sales,Yes
2,"Robles, Christopher",IT,No
3,"Munoz, Elizabeth",Sales,Yes
4,"Martin, Janice",IT,No


30) This year, only salespeople with a salary of $100,000 or higher are eligible for bonuses. Create a `bonus` column like in the last problem for salespeople with salaries of at least $100,000.

In [31]:
sql = '''

SELECT  lastname || ', ' || firstname AS fullname ,job, salary,
CASE 
WHEN (LOWER(job) = 'sales' AND salary >= 100000) THEN 'Yes'
ELSE 'No'
END AS bonus
FROM employees
LIMIT 5;

'''
pd.read_sql_query(sql, con)

Unnamed: 0,fullname,job,salary,bonus
0,"Thompson, Christine",Sales,123696,Yes
1,"Peck, Thomas",Sales,112972,Yes
2,"Robles, Christopher",IT,78426,No
3,"Munoz, Elizabeth",Sales,55824,No
4,"Martin, Janice",IT,62007,No


31) Next year, the bonus structure will be a little more complicated. You'll create a `target_comp` column which represents an employee's target total compensation after their bonus. Here is the company's bonus structure:

* Salespeople who make more than $100,000 will be eligible for a 10% bonus.
* Salespeople who make less than $100,000 will be eligible for a 5% bonus.
* Administrators will also be eligible for a 5% bonus.
* Anyone who does not meet any of the above descriptions is not eligible for a bonus.

Create this `target_comp` column, making sure to format _both_ the `salary` and `target_comp` columns nicely (i.e., with dollar signs and comma separators).

In [33]:
sql = '''

SELECT  lastname || ', ' || firstname AS fullname ,job, salary,
    CASE 
        WHEN job = 'Sales' AND salary > 100000 THEN FORMAT('$%,.2d', salary * 1.10)
        WHEN job = 'Sales' AND salary < 100000 THEN FORMAT('$%,.2d', salary * 1.05)
        WHEN job = 'Administrator' THEN FORMAT('$%,.2d', salary * 1.05)
        ELSE FORMAT('$%,.2d', salary)
    END AS target_comp
FROM employees
LIMIT 8;

'''
pd.read_sql_query(sql, con)

Unnamed: 0,fullname,job,salary,target_comp
0,"Thompson, Christine",Sales,123696,"$136,065"
1,"Peck, Thomas",Sales,112972,"$124,269"
2,"Robles, Christopher",IT,78426,"$78,426"
3,"Munoz, Elizabeth",Sales,55824,"$58,615"
4,"Martin, Janice",IT,62007,"$62,007"
5,"Alvarez, Amanda",Administrator,50611,"$53,141"
6,"Cooper, Lisa",Operations,56265,"$56,265"
7,"Schwartz, Megan",Administrator,62615,"$65,745"
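
A `CASE` expression checks its `WHEN` branches top to bottom and uses the first one that is true, falling through to `ELSE` otherwise. The sketch below replays the bonus rules on a hypothetical five-row `staff` table (names and salaries are invented) and leaves out the dollar formatting so the numbers are easy to compare. It also shows what the rules as written imply for a salesperson earning exactly $100,000: neither "more than" nor "less than" applies, so that row ends up in the `ELSE` branch.

```python
import sqlite3

# Hypothetical rows covering each branch, including the $100,000 boundary.
con_demo = sqlite3.connect(":memory:")
con_demo.execute("CREATE TABLE staff (name TEXT, job TEXT, salary INTEGER)")
con_demo.executemany(
    "INSERT INTO staff VALUES (?, ?, ?)",
    [("Ana", "Sales", 120000), ("Ben", "Sales", 100000), ("Cy", "Sales", 80000),
     ("Di", "Administrator", 50000), ("Ed", "IT", 70000)],
)

rows = con_demo.execute("""
SELECT name,
    CASE
        WHEN job = 'Sales' AND salary > 100000 THEN CAST(salary * 1.10 AS INTEGER)
        WHEN job = 'Sales' AND salary < 100000 THEN CAST(salary * 1.05 AS INTEGER)
        WHEN job = 'Administrator' THEN CAST(salary * 1.05 AS INTEGER)
        ELSE salary  -- no bonus, including Sales at exactly 100000
    END AS target_comp
FROM staff
ORDER BY name
""").fetchall()

target_comp = dict(rows)
print(target_comp)
# {'Ana': 132000, 'Ben': 100000, 'Cy': 84000, 'Di': 52500, 'Ed': 70000}
```

Whether the boundary case should earn a bonus is a question for whoever owns the bonus policy; changing `>` to `>=` in the first branch would include it.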
