### Lots of data types in SQL
#### We will only use a few: text (a string of any length, close to str in Python), integer (like Python's int), and date (a calendar date)
#### Important note: always put single quotes around text and date values. A date written without quotes is interpreted as an integer expression, so 1998-05-18 is read as the subtraction 1998 - 5 - 18

### Full list of SQL data types (PostgreSQL documentation) below
https://www.postgresql.org/docs/9.4/static/datatype.html
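
The quoting note above can be checked quickly. This is a minimal sketch that assumes Python's built-in sqlite3 module and an in-memory database rather than the course's own setup; other databases may format the results differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Unquoted: the database sees integer arithmetic, 1998 - 5 - 18.
unquoted = conn.execute("SELECT 1998-05-18").fetchone()[0]

# Quoted: the value stays the text '1998-05-18'.
quoted = conn.execute("SELECT '1998-05-18'").fetchone()[0]

print(unquoted)  # 1975
print(quoted)    # 1998-05-18
conn.close()
```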

### SELECT WHERE STATEMENTS

#### WHERE puts conditions on which rows you are selecting. The Boolean operators AND, OR, and NOT are all usable in SQL

In [None]:
# example statement

SELECT NAME
FROM ANIMALS
WHERE NOT SPECIES = 'gorilla' AND NOT name = 'Max';

# alternatively, you can also write above statement as

SELECT NAME
FROM ANIMALS
WHERE NOT (species = 'gorilla' OR name = 'Max');

# or, last alternative

SELECT NAME
FROM ANIMALS
WHERE species != 'gorilla' AND name != 'Max';
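
The three WHERE forms above should select the same rows. The sketch below checks that on a tiny in-memory SQLite table; the four sample rows are made up for illustration and are not the real zoo data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE animals (name text, species text, birthdate date)")
# Hypothetical sample rows, only for this demonstration.
conn.executemany(
    "INSERT INTO animals VALUES (?, ?, ?)",
    [
        ("Max", "gorilla", "1996-03-02"),
        ("Max", "llama", "1997-07-11"),
        ("George", "gorilla", "1998-05-18"),
        ("Sue", "llama", "1995-09-30"),
    ],
)

queries = [
    "SELECT name FROM animals WHERE NOT species = 'gorilla' AND NOT name = 'Max'",
    "SELECT name FROM animals WHERE NOT (species = 'gorilla' OR name = 'Max')",
    "SELECT name FROM animals WHERE species != 'gorilla' AND name != 'Max'",
]
results = [conn.execute(q).fetchall() for q in queries]

# Only Sue the llama is neither a gorilla nor named Max.
print(results)  # [[('Sue',)], [('Sue',)], [('Sue',)]]
conn.close()
```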

### SQL can do comparisons beyond not, and, or
#### The comparison operators = < > <= >= != are all available in SQL

In [None]:
# example statement

SELECT name 
FROM animals
WHERE species = 'llama' AND 
birthdate <= '1998-12-31' AND 
birthdate >= '1995-01-01';
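
A date-range filter like the one above can be tried on a throwaway table. This sketch assumes SQLite, where the quoted dates are stored as text; because they are written as YYYY-MM-DD, text order matches calendar order. The animal names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE animals (name text, species text, birthdate date)")
# Hypothetical sample rows, only for this demonstration.
conn.executemany(
    "INSERT INTO animals VALUES (?, ?, ?)",
    [
        ("Larry", "llama", "1996-04-10"),   # llama, inside the range
        ("Lola", "llama", "1999-02-01"),    # llama, born too late
        ("Lou", "llama", "1994-12-31"),     # llama, born too early
        ("Gina", "gorilla", "1997-06-06"),  # inside the range, wrong species
    ],
)

in_range = conn.execute(
    """
    SELECT name
    FROM animals
    WHERE species = 'llama' AND
    birthdate <= '1998-12-31' AND
    birthdate >= '1995-01-01';
    """
).fetchall()

print(in_range)  # [('Larry',)]
conn.close()
```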

### The one thing SQL is kind of terrible at is listing the tables in a database and the columns in a table

In [2]:
# each SQL tool has its own way to produce these lists

# PostgreSQL: \dt and \d (tablename)
# MySQL: show tables and describe (tablename)
# SQLite: .tables and .schema (tablename)

# the parentheses mark a placeholder; do not type them
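
The dot-commands above only work inside the sqlite3 shell. From Python, a comparable listing can be sketched by querying SQLite's `sqlite_master` catalog and using `PRAGMA table_info`. This is specific to SQLite, and the two tables created here are just stand-ins shaped like the zoo tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE animals (name text, species text, birthdate date)")
conn.execute("CREATE TABLE diet (species text, food text)")

# Rough equivalent of .tables: table names from the SQLite catalog.
tables = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
]

# Rough equivalent of .schema animals: each row describes one column,
# and the column name is the second field.
columns = [row[1] for row in conn.execute("PRAGMA table_info(animals)")]

print(tables)   # ['animals', 'diet']
print(columns)  # ['name', 'species', 'birthdate']
conn.close()
```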

Reference
For reference, here's a list of all the tables in the zoo database:

animals
This table lists individual animals in the zoo. Each animal has only one row. There may be multiple animals with the same name, or even multiple animals with the same name and species.
name — the animal's name (example: 'George')
species — the animal's species (example: 'gorilla')
birthdate — the animal's date of birth (example: '1998-05-18')
diet
This table matches up species with the foods they eat. Every species in the zoo eats at least one sort of food, and many eat more than one. If a species eats more than one food, there will be more than one row for that species.
species — the name of a species (example: 'hyena')
food — the name of a food that species eats (example: 'meat')
taxonomy
This table gives the (partial) biological taxonomic names for each species in the zoo. It can be used to find which species are more closely related to each other evolutionarily.
name — the common name of the species (e.g. 'jackal')
species — the taxonomic species name (e.g. 'aureus')
genus — the taxonomic genus name (e.g. 'Canis')
family — the taxonomic family name (e.g. 'Canidae')
t_order — the taxonomic order name (e.g. 'Carnivora')
If you've never heard of this classification, don't worry about it; the details won't be necessary for this course. But if you're curious, Wikipedia articles Taxonomy and Biological classification may help.

ordernames
This table gives the common names for each of the taxonomic orders in the taxonomy table.
t_order — the taxonomic order name (e.g. 'Cetacea')
name — the common name (e.g. 'whales and dolphins')
The SQL for it
And here are the SQL commands that were used to create those tables. We won't cover the create table command until lesson 4, but it may be interesting to look at:

create table animals (  
       name text,
       species text,
       birthdate date);

create table diet (
       species text,
       food text);  

create table taxonomy (
       name text,
       species text,
       genus text,
       family text,
       t_order text); 

create table ordernames (
       t_order text,
       name text);

In [None]:
# Experimenting with features of SQL

QUERY = "select max(name) from animals;"
# returns the animal name at the end of the alphabet

QUERY = "select * from animals limit 10;"
# returns 10 animals; they look alphabetical by name, but without ORDER BY the order is not guaranteed

QUERY = "select * from animals where species = 'orangutan' order by birthdate;"
# returns all orangutans and ordered by birthdate field ascending

QUERY = "select name from animals where species = 'orangutan' order by birthdate desc;"
# returns names of orangutans, ordered by birthdate descending (youngest first)

QUERY = "select name, birthdate from animals order by name limit 10 offset 20;"
# returns name and birthdate ordered by name, skipping the first 20 rows and returning the next 10 (rows 21-30)

QUERY = "select species, min(birthdate) from animals group by species;"
# returns columns species & min(birthdate), grouping the data by species

QUERY = '''
select name, count(*) as num from animals
group by name
order by num desc
limit 5;
'''
# returns the name of animals & the count of all names as the field 'num'
# the data is also grouped by name and ordered by the 'num' field desc.
# limit of 5 returned entries
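
To see the last query in action, here is a small sketch that runs it from Python against an in-memory SQLite table. The rows are hypothetical, and the limit is lowered to 2 so the output stays short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE animals (name text, species text, birthdate date)")
# Hypothetical sample rows: three animals named Max, two named Sue, one George.
conn.executemany(
    "INSERT INTO animals VALUES (?, ?, ?)",
    [
        ("Max", "gorilla", "1996-03-02"),
        ("Max", "llama", "1997-07-11"),
        ("Max", "hyena", "2001-01-15"),
        ("Sue", "llama", "1995-09-30"),
        ("Sue", "gorilla", "1999-12-01"),
        ("George", "gorilla", "1998-05-18"),
    ],
)

QUERY = '''
select name, count(*) as num from animals
group by name
order by num desc
limit 2;
'''
top_names = conn.execute(QUERY).fetchall()

print(top_names)  # [('Max', 3), ('Sue', 2)]
conn.close()
```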

In [None]:
# Just a few of the SELECT Clauses below

# limit & offset

limit count [offset skip]
# count = how many rows to return
# skip = how far into the results to start

# example:
limit 10 offset 50
# Returns 10 rows, starting with the 51st row


# order by

Order by columns[desc]
# columns = which columns to sort by, separated with commas
# desc = sort in reverse order (descending)

# example:
order by species, name
# sort result rows first by the species column,
# then within each species sort by the name column

# group by

Group by columns
# columns = which columns to use as groupings when aggregating

# example:
select species, min(birthdate) from animals group by species;

select name, count(*) as num from animals group by name;
# count(*) = count all the rows
# as num - and call the count column 'num'
# group by name = aggregate by values of the name column
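
The order by and group by clauses can be sketched on a few rows. This assumes SQLite through Python's sqlite3 module and uses made-up animals; the group by result is turned into a dict because the order of groups is not guaranteed without an order by.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE animals (name text, species text, birthdate date)")
# Hypothetical sample rows, only for this demonstration.
conn.executemany(
    "INSERT INTO animals VALUES (?, ?, ?)",
    [
        ("Sue", "llama", "1995-09-30"),
        ("Max", "llama", "1997-07-11"),
        ("George", "gorilla", "1998-05-18"),
        ("Max", "gorilla", "1996-03-02"),
    ],
)

# Sort by species first, then by name within each species.
by_species_name = conn.execute(
    "select species, name from animals order by species, name;"
).fetchall()

# One row per species, carrying the earliest birthdate in that group.
oldest = dict(
    conn.execute(
        "select species, min(birthdate) from animals group by species;"
    ).fetchall()
)

print(by_species_name)
# [('gorilla', 'George'), ('gorilla', 'Max'), ('llama', 'Max'), ('llama', 'Sue')]
print(oldest["gorilla"], oldest["llama"])  # 1996-03-02 1995-09-30
conn.close()
```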

### SQL and Python are a lot like each other!

#### count(*) = len(results)
#### limit 100 offset 10 = results[10:110]
#### order by column = sorted(results, key=lambda x: x[column])
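
These equivalences can be checked side by side. The sketch below uses a hypothetical one-column table called `numbers` (not part of the zoo database) so that the SQL result and the Python result are easy to compare.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table holding 200 down to 1, inserted in reverse order.
conn.execute("CREATE TABLE numbers (n integer)")
conn.executemany("INSERT INTO numbers VALUES (?)", [(i,) for i in range(200, 0, -1)])

results = conn.execute("SELECT n FROM numbers").fetchall()

# count(*) versus len()
count_sql = conn.execute("SELECT count(*) FROM numbers").fetchone()[0]
count_py = len(results)

# order by versus sorted()
ordered_sql = conn.execute("SELECT n FROM numbers ORDER BY n").fetchall()
ordered_py = sorted(results, key=lambda x: x[0])

# limit 100 offset 10 versus a slice
page_sql = conn.execute(
    "SELECT n FROM numbers ORDER BY n LIMIT 100 OFFSET 10"
).fetchall()
page_py = ordered_py[10:110]

print(count_sql == count_py, ordered_sql == ordered_py, page_sql == page_py)
# True True True
conn.close()
```

In the Python versions every row is fetched first, which is the cost the next section is about.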

### So why do it in the database?
### SPEED & SPACE!

#### Sorting a million rows in Python takes about a second. If a web page does that work on every request, that is time the visitor spends waiting for the data; letting the database sort and filter avoids the wait and the memory needed to hold every row

In [None]:
# count all the species quiz!

# write a query that returns all the species in the zoo, 
# and how many animals of each species there are, sorted 
# with the most populous species at the top

select species, count(*) as num from animals
group by species
order by num desc;

In [None]:
# Insert: adding rows to a table

# To add a row:
    Insert into table values(42, 'stuff');
    
# If the new values aren't in the same order as the table's columns:
    Insert into table(col2, col1)
        values('stuff', 42);
        
select_query = 'Select...'
insert_query = 'Insert...'

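SELECT_QUERY = "SELECT name, birthdate FROM animals WHERE species = 'opossum' ORDER BY birthdate DESC;"

# dates are written as 'YYYY-MM-DD', matching the other birthdates in the table
INSERT_QUERY = "INSERT INTO animals VALUES ('Baby Nicolas', 'opossum', '2017-07-01');"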
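
Both insert forms can be tried on an empty in-memory table and then verified with a select. This is a sketch assuming SQLite via Python's sqlite3 module; the second opossum, 'Olive', is a made-up row that shows the column-list form, and the dates use the YYYY-MM-DD format.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE animals (name text, species text, birthdate date)")

# Values in the same order as the table's columns.
conn.execute("INSERT INTO animals VALUES ('Baby Nicolas', 'opossum', '2017-07-01');")

# Values in a different order, so the columns are named explicitly.
conn.execute(
    "INSERT INTO animals (species, name, birthdate) "
    "VALUES ('opossum', 'Olive', '2015-04-09');"
)
conn.commit()

opossums = conn.execute(
    "SELECT name, birthdate FROM animals "
    "WHERE species = 'opossum' ORDER BY birthdate DESC;"
).fetchall()

print(opossums)  # [('Baby Nicolas', '2017-07-01'), ('Olive', '2015-04-09')]
conn.close()
```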



In [None]:
# Join Statements!

# Joining Tables!

select T.thing, S.stuff
# columns to return
from T join S
# joined tables
on T.target = S.match
# join condition

OR

# Simple join

select T.thing, S.stuff
# columns to return
from T, S
# tables
where T.target = S.match
# restriction


# example quiz:

# Find the names of the individual animals that eat fish.
#
# The animals table has columns (name, species, birthdate) for each individual.
# The diet table has columns (species, food) for each food that a species eats.

select animals.name
from animals join diet
on animals.species = diet.species
where food = 'fish';

OR

select name from animals, diet
where animals.species = diet.species
and diet.food = 'fish';
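
The two join styles above should give the same answer. The sketch below runs both on small in-memory SQLite tables; the animals and diet rows are hypothetical, and the results are sorted in Python because neither query has an order by.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE animals (name text, species text, birthdate date)")
conn.execute("CREATE TABLE diet (species text, food text)")
# Hypothetical sample rows, only for this demonstration.
conn.executemany(
    "INSERT INTO animals VALUES (?, ?, ?)",
    [
        ("Flipper", "dolphin", "2001-08-20"),
        ("Sue", "llama", "1995-09-30"),
        ("Pat", "otter", "2003-02-14"),
    ],
)
conn.executemany(
    "INSERT INTO diet VALUES (?, ?)",
    [("dolphin", "fish"), ("llama", "grass"), ("otter", "fish"), ("otter", "crab")],
)

explicit_join = sorted(
    conn.execute(
        """
        select animals.name
        from animals join diet
        on animals.species = diet.species
        where food = 'fish';
        """
    ).fetchall()
)

simple_join = sorted(
    conn.execute(
        """
        select name from animals, diet
        where animals.species = diet.species
        and diet.food = 'fish';
        """
    ).fetchall()
)

print(explicit_join)  # [('Flipper',), ('Pat',)]
print(simple_join)    # [('Flipper',), ('Pat',)]
conn.close()
```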

In [None]:
# Which species does the zoo have only one of?
# IMPORTANT LESSON

select species, count(*) as num
from animals 
group by species
having num = 1;

# where is a restriction on the source tables (applied before aggregation)
# having is a restriction on the result after aggregation!

select food, count(*) as num 
from animals, diet
where animals.species = diet.species
group by diet.food
having num = 1;
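
Here is a sketch of the first having query on made-up rows, assuming SQLite. One caveat: referring to the alias `num` inside having works in SQLite, but PostgreSQL does not accept output aliases there, so the second query spells out `count(*)`, which is the more portable form.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE animals (name text, species text, birthdate date)")
# Hypothetical sample rows: two gorillas, one llama, one hyena.
conn.executemany(
    "INSERT INTO animals VALUES (?, ?, ?)",
    [
        ("George", "gorilla", "1998-05-18"),
        ("Max", "gorilla", "1996-03-02"),
        ("Sue", "llama", "1995-09-30"),
        ("Hank", "hyena", "2001-01-15"),
    ],
)

# having filters the groups after count(*) has been computed.
only_one = sorted(
    conn.execute(
        "select species, count(*) as num from animals "
        "group by species having num = 1;"
    ).fetchall()
)

# Same restriction without relying on the alias.
portable = sorted(
    conn.execute(
        "select species, count(*) as num from animals "
        "group by species having count(*) = 1;"
    ).fetchall()
)

print(only_one)  # [('hyena', 1), ('llama', 1)]
conn.close()
```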

### Multiple Join Statements!

In [3]:
# animals
# This table lists individual animals in the zoo. 
# Each animal has only one row. There may be multiple animals 
# with the same name, or even multiple animals with the same name 
# and species.

# name — the animal's name (example: 'George')
# species — the animal's species (example: 'gorilla')
# birthdate — the animal's date of birth (example: '1998-05-18')

# taxonomy
# This table gives the (partial) biological taxonomic names 
# for each species in the zoo. It can be used to find which 
# species are more closely related to each other evolutionarily.

# name — the common name of the species (e.g. 'jackal')
# species — the taxonomic species name (e.g. 'aureus')
# genus — the taxonomic genus name (e.g. 'Canis')
# family — the taxonomic family name (e.g. 'Canidae')
# t_order — the taxonomic order name (e.g. 'Carnivora')
# If you've never heard of this classification, don't worry about it; the details won't be necessary for this course. But if you're curious, Wikipedia articles Taxonomy and Biological classification may help.

# ordernames
# This table gives the common names for each of the taxonomic orders 
# in the taxonomy table.
# t_order — the taxonomic order name (e.g. 'Cetacea')
# name — the common name (e.g. 'whales and dolphins')

SELECT ordernames.name, count(animals.species) as num
FROM ordernames
JOIN taxonomy ON ordernames.t_order = taxonomy.t_order
JOIN animals ON taxonomy.name = animals.species
GROUP BY ordernames.t_order
ORDER BY num desc;
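
The cell above fails when run as Python because SQL has to be passed to a database as a string. A minimal sketch of doing that with Python's sqlite3 module follows; the three tables are filled with a handful of illustrative rows, not the real zoo data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE animals (name text, species text, birthdate date)")
conn.execute(
    "CREATE TABLE taxonomy (name text, species text, genus text, "
    "family text, t_order text)"
)
conn.execute("CREATE TABLE ordernames (t_order text, name text)")

# Illustrative rows only: two jackals and one dolphin.
conn.executemany(
    "INSERT INTO animals VALUES (?, ?, ?)",
    [
        ("Jack", "jackal", "2002-03-03"),
        ("Jill", "jackal", "2004-06-21"),
        ("Flipper", "dolphin", "2001-08-20"),
    ],
)
conn.executemany(
    "INSERT INTO taxonomy VALUES (?, ?, ?, ?, ?)",
    [
        ("jackal", "aureus", "Canis", "Canidae", "Carnivora"),
        ("dolphin", "truncatus", "Tursiops", "Delphinidae", "Cetacea"),
    ],
)
conn.executemany(
    "INSERT INTO ordernames VALUES (?, ?)",
    [("Carnivora", "carnivores"), ("Cetacea", "whales and dolphins")],
)

QUERY = """
SELECT ordernames.name, count(animals.species) as num
FROM ordernames
JOIN taxonomy ON ordernames.t_order = taxonomy.t_order
JOIN animals ON taxonomy.name = animals.species
GROUP BY ordernames.t_order
ORDER BY num desc;
"""
per_order = conn.execute(QUERY).fetchall()

print(per_order)  # [('carnivores', 2), ('whales and dolphins', 1)]
conn.close()
```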


### Quiz 1 - Q

#### SQL Statement Order
#### Don't forget: to start SQLite, type sqlite3 in the terminal, and also open the database you want to query (for example, by passing the database file name to sqlite3)

In [None]:
SELECT Composer, COUNT(*)
FROM Track
GROUP BY Composer
ORDER BY COUNT(*) DESC
LIMIT 10;

### Quiz 2 - U

In [None]:
SELECT name, milliseconds
FROM Track
WHERE milliseconds >2500000
AND Milliseconds < 2600000
ORDER BY Milliseconds;

### Quiz 3 - E

In [None]:
SELECT Artist.name, Album.Title
FROM Album JOIN Artist
on Artist.ArtistId = Album.ArtistId
WHERE Artist.Name = 'Iron Maiden'
OR Artist.Name = 'Amy Winehouse';

### Quiz 4 - R

In [None]:
SELECT BillingCountry, COUNT(*) as totalInvoices
FROM Invoice
GROUP BY BillingCountry
ORDER BY totalInvoices desc
Limit 3;

### Quiz 5 - I

In [None]:
SELECT Customer.Email, Customer.FirstName, Customer.LastName, SUM(Invoice.Total) as Total
FROM Customer JOIN Invoice
ON Customer.CustomerId = Invoice.CustomerId
GROUP BY Customer.Email
ORDER BY Total desc
LIMIT 1;

### Quiz 6 - E

In [None]:
SELECT Customer.Email, Customer.FirstName, Customer.LastName, Genre.Name
FROM Customer
JOIN Invoice ON Customer.CustomerId = Invoice.CustomerId
JOIN InvoiceLine ON Invoice.InvoiceId = InvoiceLine.InvoiceId
JOIN Track ON InvoiceLine.TrackId = Track.TrackId
JOIN Genre ON Track.GenreId = Genre.GenreId
WHERE Genre.Name = 'Rock'
GROUP BY Customer.Email
ORDER BY Customer.Email;

### Quiz 7 - S

In [None]:
SELECT BillingCity, SUM(Total)
FROM Invoice
GROUP BY BillingCity
ORDER BY SUM(Total) desc
LIMIT 1;

### Quiz 8 - R

In [None]:
SELECT Invoice.BillingCity, COUNT(Genre.Name), Genre.Name
FROM Invoice
JOIN InvoiceLine ON Invoice.InvoiceId = InvoiceLine.InvoiceId
JOIN Track ON InvoiceLine.TrackId = Track.TrackId
JOIN Genre ON Track.GenreId = Genre.GenreId
WHERE Invoice.BillingCity = 'Prague'
GROUP BY Genre.Name
ORDER BY COUNT(Genre.Name) desc
LIMIT 3;

### Quiz 9 - F

In [None]:
SELECT Artist.Name, COUNT(Genre.Name)
FROM Genre
JOIN Track ON Genre.GenreId = Track.GenreId
JOIN Album ON Track.AlbumId = Album.AlbumId
JOIN Artist ON Album.ArtistId = Artist.ArtistId
WHERE Genre.Name = 'Rock'
GROUP BY Artist.Name
ORDER BY COUNT(Genre.Name) desc
LIMIT 10;

### Quiz 10 - U

In [None]:
SELECT Invoice.BillingCity, COUNT(InvoiceLine.TrackId) as Numtracks
FROM Invoice
JOIN InvoiceLine ON Invoice.InvoiceId = InvoiceLine.InvoiceId
JOIN Track ON InvoiceLine.TrackId = Track.TrackId
JOIN Genre ON Genre.GenreId = Track.GenreId
WHERE Invoice.BillingCountry = 'France'
AND Genre.Name = 'Alternative & Punk'
GROUP BY Invoice.BillingCity
ORDER BY Numtracks desc;

### Quiz 11 - N