### 5. Select Where

The syntax of the select statement with a where clause:

select columns from tables where condition ;

Columns are separated by commas; use * to select all columns.

The condition is a Boolean expression on column values. SQL supports the Boolean operations and, or, and not, which work the same way as in Python.

We can switch between the form (not X) and (not Y) and the form not (X or Y) because of a rule of logic called De Morgan's law. You can read more about it in its Wikipedia article.
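
To make the two forms concrete, here is a minimal sketch that runs the same filter both ways. It assumes Python's built-in sqlite3 module and an in-memory animals table; the sample rows are made up for illustration and are not the course's zoo data.

```python
import sqlite3

# In-memory database with hypothetical sample rows (not the real zoo data).
conn = sqlite3.connect(":memory:")
conn.execute("create table animals (name text, species text, birthdate date)")
conn.executemany(
    "insert into animals values (?, ?, ?)",
    [
        ("George", "gorilla", "1998-05-18"),
        ("Max", "gorilla", "2001-04-23"),
        ("George", "hyena", "2004-11-02"),
        ("Molly", "orangutan", "2010-07-30"),
    ],
)

# Form 1: (not X) and (not Y)
form_1 = conn.execute(
    "select name, species from animals "
    "where (not (species = 'gorilla')) and (not (name = 'George')) "
    "order by name"
).fetchall()

# Form 2: not (X or Y), which De Morgan's law says is equivalent
form_2 = conn.execute(
    "select name, species from animals "
    "where not (species = 'gorilla' or name = 'George') "
    "order by name"
).fetchall()

print(form_1)
print(form_2)
```

With these sample rows, both queries keep only the animal that is neither a gorilla nor named George.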

### 6. Comparison Operators

The comparison operators in SQL are almost the same as the ones in Python: < for less than, > for greater than, != for not equal, <= for less than or equal, and so forth. One difference is that SQL uses = instead of == to represent equality. You can apply all the basic comparison operators to strings, numbers, dates, and other values.

The columns in the animals table are name (a text string), species (also a text string), and birthdate (a date).

Reminder: Dates in our databases will always be in the international standard format, e.g. '1999-12-31'. Make sure to put single quotes around dates.

As a preview, here's the SQL command used to initially create this table:

~~~~
create table animals (  
    name text,  
    species text,  
    birthdate date  
);  
~~~~
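
As a rough illustration of the comparison operators on strings and dates, the sketch below builds the animals table in an in-memory database using Python's built-in sqlite3 module. The rows are hypothetical. Note one assumption specific to SQLite: it stores these dates as text, so the date comparison works because the ISO format sorts chronologically.

```python
import sqlite3

# In-memory animals table; the rows are made up for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("create table animals (name text, species text, birthdate date)")
conn.executemany(
    "insert into animals values (?, ?, ?)",
    [
        ("George", "gorilla", "1998-05-18"),
        ("Max", "gorilla", "2001-04-23"),
        ("Alice", "hyena", "2004-11-02"),
        ("Molly", "orangutan", "2010-07-30"),
    ],
)

# Equality uses a single = in SQL (Python would use ==).
gorillas = conn.execute(
    "select name from animals where species = 'gorilla' order by name"
).fetchall()

# Dates go in single quotes; ISO-format dates compare in chronological order.
born_2000_or_later = conn.execute(
    "select name from animals where birthdate >= '2000-01-01' order by birthdate"
).fetchall()

# Strings compare alphabetically: names at or after 'George'.
george_onward = conn.execute(
    "select name from animals where name >= 'George' order by name"
).fetchall()

print(gorillas)
print(born_2000_or_later)
print(george_onward)
```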

### 7. SQL sucks at some things

#### Reference 
For reference, here's a list of all the tables in the zoo database:

**animals**  
This table lists individual animals in the zoo. Each animal has only one row. There may be multiple animals with the same name, or even multiple animals with the same name and species.
* name — the animal's name (example: 'George')
* species — the animal's species (example: 'gorilla')
* birthdate — the animal's date of birth (example: '1998-05-18')

**diet**  
This table matches up species with the foods they eat. Every species in the zoo eats at least one sort of food, and many eat more than one. If a species eats more than one food, there will be more than one row for that species.
* species — the name of a species (example: 'hyena')
* food — the name of a food that species eats (example: 'meat')

**taxonomy**  
This table gives the (partial) biological taxonomic names for each species in the zoo. It can be used to find which species are more closely related to each other evolutionarily.
* name — the common name of the species (e.g. 'jackal')
* species — the taxonomic species name (e.g. 'aureus')
* genus — the taxonomic genus name (e.g. 'Canis')
* family — the taxonomic family name (e.g. 'Canidae')
* t_order — the taxonomic order name (e.g. 'Carnivora')

If you've never heard of this classification, don't worry about it; the details won't be necessary for this course. But if you're curious, Wikipedia articles Taxonomy and Biological classification may help.

**ordernames**  
This table gives the common names for each of the taxonomic orders in the taxonomy table.
* t_order — the taxonomic order name (e.g. 'Cetacea')
* name — the common name (e.g. 'whales and dolphins')


#### The SQL for it  
And here are the SQL commands that were used to create those tables. We won't cover the create table command until lesson 4, but it may be interesting to look at:

~~~~
create table animals (  
       name text,
       species text,
       birthdate date);

create table diet (
       species text,
       food text);  

create table taxonomy (
       name text,
       species text,
       genus text,
       family text,
       t_order text); 

create table ordernames (
       t_order text,
       name text);
~~~~
       
*Remember: In SQL, we always put string and date values inside single quotes.*

### 8. Experiment Page

~~~~
# Uncomment one of these QUERY variables at a time and use "Test Run" to run it.
# You'll see the results below.  Then try your own queries as well!
#

#QUERY = "select max(name) from animals;"

#QUERY = "select * from animals limit 10;"

#QUERY = "select * from animals where species = 'orangutan' order by birthdate;"

#QUERY = "select name from animals where species = 'orangutan' order by birthdate desc;"

#QUERY = "select name, birthdate from animals order by name limit 10 offset 20;"

#QUERY = "select species, min(birthdate) from animals group by species;"

#QUERY = '''
#select name, count(*) as num from animals
#group by name
#order by num desc
#limit 5;
#'''
~~~~

### 9. Select Clauses

Here are the new select clauses introduced in the previous video:

**... limit count**  
Return just the first count rows of the result table.

**... limit count offset skip**  
Return count rows starting after the first skip rows.

**... order by columns**  
**... order by columns desc**  
Sort the rows using the columns (one or more, separated by commas) as the sort key. Numerical columns will be sorted in numerical order; string columns in alphabetical order. With desc, the order is reversed (desc-ending order).

**... group by columns**  
Change the behavior of aggregations such as max, count, and sum. With group by, the aggregation will return one row for each distinct value in columns.
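
The sorting and paging clauses can be tried out with a small sketch like the one below. It assumes Python's built-in sqlite3 module and a handful of made-up animals; the helper page_of_names is a hypothetical name introduced only for this example.

```python
import sqlite3

# In-memory animals table; the rows are made up for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("create table animals (name text, species text, birthdate date)")
conn.executemany(
    "insert into animals values (?, ?, ?)",
    [
        ("Alice", "hyena", "2004-11-02"),
        ("Bob", "gorilla", "1998-05-18"),
        ("Carol", "orangutan", "2010-07-30"),
        ("Dave", "gorilla", "2001-04-23"),
        ("Eve", "hyena", "2012-01-15"),
    ],
)


def page_of_names(page_number, page_size=2):
    """Return one page of names in alphabetical order (pages start at 0)."""
    return conn.execute(
        "select name from animals order by name limit ? offset ?",
        (page_size, page_number * page_size),
    ).fetchall()


first_page = page_of_names(0)   # limit 2 offset 0
second_page = page_of_names(1)  # limit 2 offset 2

# order by ... desc puts the latest birthdate first; limit 1 keeps only that row.
youngest = conn.execute(
    "select name from animals order by birthdate desc limit 1"
).fetchall()

print(first_page, second_page, youngest)
```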

### 10. Count all the species

Select clauses  
These are all the select clauses we've seen in the lesson so far.

where  
The where clause expresses restrictions: it filters a table for rows that follow a particular rule. where supports equalities, inequalities, and Boolean operators (among other things):
* where species = 'gorilla' — return only rows that have 'gorilla' as the value of the species column.
* where name >= 'George' — return only rows where the name column is alphabetically at or after 'George'.
* where species != 'gorilla' and name != 'George' — return only rows where species isn't 'gorilla' and name isn't 'George'.

limit / offset  
The limit clause sets a limit on how many rows to return in the result table. The optional offset clause says how far to skip ahead into the results. So limit 10 offset 100 will return 10 results starting with the 101st.

order by  
The order by clause tells the database how to sort the results — usually according to one or more columns. So order by species, name says to sort results first by the species column, then by name within each species.
Ordering happens before limit/offset, so you can use them together to extract pages of alphabetized results. (Think of the pages of a dictionary.)

The optional desc modifier tells the database to order results in descending order — for instance from large numbers to small ones, or from Z to A.

group by  
The group by clause is only used with aggregations, such as max or sum. Without a group by clause, a select statement with an aggregation will aggregate over the whole selected table(s), returning only one row. With a group by clause, it will return one row for each distinct value of the column or expression in the group by clause.
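
The difference between aggregating with and without group by can be sketched as follows, assuming Python's built-in sqlite3 module and a few hypothetical animals rows.

```python
import sqlite3

# In-memory animals table; the rows are made up for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("create table animals (name text, species text, birthdate date)")
conn.executemany(
    "insert into animals values (?, ?, ?)",
    [
        ("George", "gorilla", "1998-05-18"),
        ("Max", "gorilla", "2001-04-23"),
        ("Alice", "hyena", "2004-11-02"),
        ("Molly", "orangutan", "2010-07-30"),
    ],
)

# Without group by: the aggregation covers the whole table and returns one row.
whole_table = conn.execute("select count(*) from animals").fetchall()

# With group by: one row for each distinct value of species.
per_species = conn.execute(
    "select species, count(*) as num from animals "
    "group by species order by species"
).fetchall()

print(whole_table)
print(per_species)
```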

### 11. Insert: Adding rows

The basic syntax for the insert statement:

insert into table ( column1, column2, ... ) values ( val1, val2, ... );

If the values are in the same order as the table's columns (starting with the first column), you don't have to specify the columns in the insert statement:

insert into table values ( val1, val2, ... );

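For instance, if a table has three columns (a, b, c) and you want to insert into a and b, you can leave off the column names from the insert statement. But if you want to insert into b and c, or a and c, you have to specify the columns. (PostgreSQL accepts the shorter form; some databases, such as SQLite, require a value for every column whenever the column names are omitted.)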

A single insert statement can only insert into a single table. (Contrast this with the select statement, which can pull data from several tables using a join.)
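
Here is a small sketch of both insert forms, assuming Python's built-in sqlite3 module; the animals being inserted are made up. Because SQLite requires a value for every column when the column list is omitted, the example only omits the list for a full row and names the columns for the partial one.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table animals (name text, species text, birthdate date)")

# Values for every column, in table order: the column list can be omitted.
conn.execute("insert into animals values ('Wibble', 'opossum', '2014-11-01')")

# Values for only some columns: name them explicitly.
# The unlisted column (species) is left as null, which Python sees as None.
conn.execute(
    "insert into animals (name, birthdate) values ('Pip', '2015-02-03')"
)

rows = conn.execute(
    "select name, species, birthdate from animals order by birthdate"
).fetchall()
print(rows)
```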

### 13. Find join

To join two tables, first choose the join condition, or the rule you want the database to use to match rows from one table up with rows of the other table. Then write a join in terms of the columns in each table.

For instance, if you want to join tables T and S by matching rows where T.color is the same as S.paint, you'd write a select statement using T join S on T.color = S.paint.
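
The T and S example might look like this in practice. This is only a sketch using Python's built-in sqlite3 module: the columns item and price, and all of the rows, are hypothetical additions so that the join has something to return.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# T.color and S.paint come from the example above; item and price are made up.
conn.execute("create table T (item text, color text)")
conn.execute("create table S (paint text, price integer)")
conn.executemany(
    "insert into T values (?, ?)",
    [("chair", "red"), ("table", "blue"), ("lamp", "green")],
)
conn.executemany(
    "insert into S values (?, ?)",
    [("red", 5), ("blue", 7), ("yellow", 9)],
)

# The join condition matches each row of T with rows of S of the same color.
joined = conn.execute(
    "select T.item, T.color, S.price "
    "from T join S on T.color = S.paint "
    "order by T.item"
).fetchall()
print(joined)
```

Rows with no match on the other side (the green lamp, the yellow paint) do not appear in the result.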

### 14. After aggregating

The having clause works like the where clause, but it applies after group by aggregations take place. The syntax is like this:

select columns from tables group by column having condition ;

Usually, at least one of the columns will be an aggregate function such as count, max, or sum on one of the tables' columns. In order to apply having to an aggregated column, you'll want to give it a name using as. For instance, if you had a table of items sold in a store, and you wanted to find all the items that have sold more than five units, you could use:
~~~~
select name, count(*) as num from sales group by name having num > 5;
~~~~

---
You can have a select statement that uses only where, or only group by, or group by and having, or where and group by, or all three of them!

But it doesn't usually make sense to use having without group by.

If you use both where and having, the where condition will filter the rows that are going into the aggregation, and the having condition will filter the rows that come out of it.
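
To see where and having filter at different stages, consider the sketch below. It assumes Python's built-in sqlite3 module and a hypothetical sales table with one row per unit sold; the day column and all rows are made up. SQLite accepts the alias num inside having, whereas PostgreSQL expects the aggregate itself there (for example having count(*) >= 2).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical sales table: one row per unit sold.
conn.execute("create table sales (name text, day date)")
conn.executemany(
    "insert into sales values (?, ?)",
    [
        ("apple", "2024-01-01"),
        ("apple", "2024-01-02"),
        ("apple", "2024-01-03"),
        ("pear", "2024-01-01"),
        ("pear", "2024-01-02"),
        ("fig", "2024-01-02"),
        ("fig", "2024-01-03"),
        ("fig", "2024-01-03"),
    ],
)

# having alone: count every sale, then keep items with at least two.
all_days = conn.execute(
    "select name, count(*) as num from sales "
    "group by name having num >= 2 order by name"
).fetchall()

# where + having: drop the first day's sales before counting,
# then apply the same threshold to the counts that come out.
later_days = conn.execute(
    "select name, count(*) as num from sales "
    "where day >= '2024-01-02' "
    "group by name having num >= 2 order by name"
).fetchall()

print(all_days)
print(later_days)
```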

You can read more about having here:

http://www.postgresql.org/docs/9.4/static/sql-select.html#SQL-HAVING

There are a few different ways to find the foods that are eaten by exactly one animal in the zoo, but here's one of them:
~~~~
select food, count(animals.name) as num
       from diet join animals 
       on diet.species = animals.species
       group by food
       having num = 1;
~~~~
And here is another:
~~~~
select food, count(animals.name) as num
       from diet, animals 
       where diet.species = animals.species
       group by food
       having num = 1;
~~~~