Lecture 5: SQL
======================

Quantifiers: existential and universal
------------------------------------

A massive amount of user interviewing has suggested that something called "shmear" (or *schmear*) is of critical importance to market success.  You decide to look for competitors that have some shmear on the menu:

In [36]:
%sql SELECT DISTINCT made_by FROM bagel WHERE name LIKE '%shmear%';

Done.


made_by
Bobs Bagels
BAGEL CORP


A condition of this form (*"there exists some r s.t. C(r)"*) is known as an **existential** quantifier.  As is apparent above, these are fairly easy to write in SQL.  A **universal** quantifier (of the form *"C(r) for all r"*) is a bit harder, but still simple enough: we rewrite it as a negated existential, since *"C(r) for all r"* holds exactly when *there is no r for which C(r) fails*.

So, for example, to find competitors with products that *all* have shmear in them:

In [37]:
%%sql
SELECT DISTINCT made_by
FROM bagel
WHERE made_by NOT IN (
    SELECT made_by
    FROM bagel
    WHERE name NOT LIKE '%shmear%');

Done.


made_by
Bobs Bagels
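
The rewrite of "for all" as "there is no counterexample" can also be expressed with a correlated `NOT EXISTS` subquery. The sketch below is a hedged illustration, not part of the lecture's notebook: it uses Python's built-in `sqlite3` module and a small made-up `bagel` table (the rows are assumptions, not the course data) to check that the `NOT IN` form and the `NOT EXISTS` form agree.

```python
import sqlite3

# Hypothetical toy data, NOT the lecture's actual bagel table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE bagel (name TEXT, made_by TEXT)")
conn.executemany(
    "INSERT INTO bagel VALUES (?, ?)",
    [
        ("Plain with shmear", "Bobs Bagels"),
        ("Egg with shmear", "Bobs Bagels"),
        ("Plain with shmear", "BAGEL CORP"),
        ("Plain bagel", "BAGEL CORP"),  # a counterexample for BAGEL CORP
    ],
)

# "All products have shmear" == "the maker has no product lacking shmear".
not_in = conn.execute("""
    SELECT DISTINCT made_by
    FROM bagel
    WHERE made_by NOT IN (
        SELECT made_by FROM bagel WHERE name NOT LIKE '%shmear%')
    ORDER BY made_by
""").fetchall()

# The same condition written as a correlated NOT EXISTS subquery.
not_exists = conn.execute("""
    SELECT DISTINCT b.made_by
    FROM bagel b
    WHERE NOT EXISTS (
        SELECT 1 FROM bagel b2
        WHERE b2.made_by = b.made_by
          AND b2.name NOT LIKE '%shmear%')
    ORDER BY b.made_by
""").fetchall()

print(not_in)
print(not_exists)
```

On this toy data both queries return only the maker whose every product contains "shmear".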


NULL values in SQL
-----------------

Let's look at an odd pair of queries:

In [38]:
%sql SELECT * FROM purchase WHERE bagel_name LIKE '%shmear%';

Done.


bagel_name,franchise,date,quantity,purchaser_age
Plain with shmear,Bobs Bagels,1,12,28.0
Egg with shmear,Bobs Bagels,2,6,47.0
Plain with shmear,BAGEL CORP,2,12,24.0
Plain with shmear,BAGEL CORP,3,1,17.0
Plain with shmear,Bobs Bagels,4,24,


In [39]:
%%sql
SELECT * FROM purchase
WHERE bagel_name LIKE '%shmear%' 
  AND (purchaser_age >= 5 OR purchaser_age < 5);

Done.


bagel_name,franchise,date,quantity,purchaser_age
Plain with shmear,Bobs Bagels,1,12,28
Egg with shmear,Bobs Bagels,2,6,47
Plain with shmear,BAGEL CORP,2,12,24
Plain with shmear,BAGEL CORP,3,1,17


We see that `NULL` values are treated specially.  SQL actually uses three truth values: `TRUE`, `FALSE`, and `UNKNOWN`.  Any comparison involving a `NULL` value evaluates to `UNKNOWN`, and a `WHERE` clause passes a tuple through only when its condition evaluates to `TRUE`.  Here both `purchaser_age >= 5` and `purchaser_age < 5` are `UNKNOWN` for the purchase with a missing age, so their `OR` is `UNKNOWN` as well and the tuple is dropped.  We can of course handle `NULL` values explicitly though:

In [40]:
%%sql
SELECT * FROM purchase
WHERE bagel_name LIKE '%shmear%'
  AND (purchaser_age >= 5 OR purchaser_age < 5 
       OR purchaser_age IS NULL);

Done.


bagel_name,franchise,date,quantity,purchaser_age
Plain with shmear,Bobs Bagels,1,12,28.0
Egg with shmear,Bobs Bagels,2,6,47.0
Plain with shmear,BAGEL CORP,2,12,24.0
Plain with shmear,BAGEL CORP,3,1,17.0
Plain with shmear,Bobs Bagels,4,24,
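
The three-valued logic behind these results can be checked in isolation. The following is a minimal sketch using Python's built-in `sqlite3` module, where SQL's `UNKNOWN` surfaces as `NULL` (Python `None`); the tiny `purchase` table and its ages are made up for illustration and are not the lecture's data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# A comparison with NULL evaluates to NULL (SQL's UNKNOWN); sqlite3 returns None.
cmp_result = conn.execute("SELECT NULL >= 5").fetchone()[0]

# UNKNOWN OR UNKNOWN is still UNKNOWN.
or_result = conn.execute("SELECT (NULL >= 5) OR (NULL < 5)").fetchone()[0]

# IS NULL always gives a definite answer (SQLite reports true as 1).
is_null_result = conn.execute("SELECT NULL IS NULL").fetchone()[0]

# Hypothetical ages: WHERE keeps a row only when the condition is TRUE.
conn.execute("CREATE TABLE purchase (purchaser_age REAL)")
conn.executemany("INSERT INTO purchase VALUES (?)", [(28.0,), (17.0,), (None,)])

kept = conn.execute(
    "SELECT COUNT(*) FROM purchase "
    "WHERE purchaser_age >= 5 OR purchaser_age < 5"
).fetchone()[0]

kept_with_null = conn.execute(
    "SELECT COUNT(*) FROM purchase "
    "WHERE purchaser_age >= 5 OR purchaser_age < 5 OR purchaser_age IS NULL"
).fetchone()[0]

print(cmp_result, or_result, is_null_result, kept, kept_with_null)
```

The row with the missing age is counted only once the explicit `IS NULL` test is added, mirroring the pair of queries above.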


What happens when there are nulls in a join?

In [41]:
%%sql 
SELECT DISTINCT b.name 
FROM bagel b, purchase p 
WHERE b.name = p.bagel_name AND b.made_by = p.franchise;

Done.


name
Plain with shmear
Egg with shmear
eBagel Expansion Pack


We're missing bagels which were never purchased!

Inner and Outer Joins
--------------------

A join query using a `WHERE` clause like the one just shown is actually an `INNER JOIN`, and can also be written as follows:

In [42]:
%%sql 
SELECT DISTINCT b.name 
FROM bagel b
    INNER JOIN purchase p ON b.name = p.bagel_name AND b.made_by = p.franchise;

Done.


name
Plain with shmear
Egg with shmear
eBagel Expansion Pack


An `INNER JOIN` on tables `A` and `B` with join condition `C(A,B)` returns only tuple pairs `(a,b)` such that `C(a,b) = TRUE`.  If, as in our example above, there is no `b` such that `C(a,b)` is true, then `a` is simply not returned in the output multiset.

We can use an `OUTER JOIN` instead, however, which comes in three varieties: `LEFT`, `RIGHT`, and `FULL`. 

In our current situation, what we need is a `LEFT OUTER JOIN`.  In addition to the matching pairs, a left outer join returns `(a, NULL)` for every left tuple `a` such that there is no `b` for which `C(a,b) = TRUE`:

In [43]:
%%sql 
SELECT DISTINCT b.name 
FROM bagel b
    LEFT OUTER JOIN purchase p ON b.name = p.bagel_name AND b.made_by = p.franchise;

Done.


name
Plain with shmear
Egg with shmear
eBagel Drinkable Bagel
eBagel Expansion Pack
Organic Flax-seed bagel chips
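
To make the `(a, NULL)` padding visible, it helps to select a column from the right-hand table as well. The sketch below is an illustrative assumption rather than the lecture's own code: it uses Python's built-in `sqlite3` module with a two-row toy `bagel` table and a one-row toy `purchase` table (made-up data), and compares the inner join with the left outer join. The last query, which filters on `p.bagel_name IS NULL`, is one common way to list only the unmatched left tuples.

```python
import sqlite3

# Hypothetical toy data, NOT the lecture's actual tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE bagel (name TEXT, made_by TEXT);
    CREATE TABLE purchase (bagel_name TEXT, franchise TEXT, quantity INTEGER);
    INSERT INTO bagel VALUES
        ('Plain with shmear', 'Bobs Bagels'),
        ('Organic Flax-seed bagel chips', 'BAGEL CORP');
    INSERT INTO purchase VALUES ('Plain with shmear', 'Bobs Bagels', 12);
""")

# Same join condition as in the lecture's queries.
join_condition = "ON b.name = p.bagel_name AND b.made_by = p.franchise"

# Inner join: only bagels with at least one matching purchase.
inner_rows = conn.execute(
    "SELECT b.name, p.quantity FROM bagel b "
    f"INNER JOIN purchase p {join_condition} ORDER BY b.name"
).fetchall()

# Left outer join: unmatched bagels appear with NULL (None) purchase columns.
left_rows = conn.execute(
    "SELECT b.name, p.quantity FROM bagel b "
    f"LEFT OUTER JOIN purchase p {join_condition} ORDER BY b.name"
).fetchall()

# Keep only the padded rows, i.e. bagels that were never purchased.
never_purchased = conn.execute(
    "SELECT b.name FROM bagel b "
    f"LEFT OUTER JOIN purchase p {join_condition} "
    "WHERE p.bagel_name IS NULL"
).fetchall()

print(inner_rows)
print(left_rows)
print(never_purchased)
```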
