#  SQL Queries 01

For more SQL examples in the SQLite3 dialect, see [SQLite3 tutorial](https://www.techonthenet.com/sqlite/index.php).

For a deep dive, see [SQL Queries for Mere Mortals](https://www.amazon.com/SQL-Queries-Mere-Mortals-Hands/dp/0134858336/ref=dp_ob_title_bk).

## Data

In [1]:
%load_ext sql

In [20]:
%sql sqlite:///data/faculty.db

'Connected: None@data/faculty.db'

In [21]:
%%sql

SELECT * FROM sqlite_master WHERE type='table';

Done.


type,name,tbl_name,rootpage,sql
table,person,person,2,"CREATE TABLE person ( 	""index"" BIGINT, person_id BIGINT, first TEXT, last TEXT, age BIGINT, height FLOAT, weight BIGINT, country_id TEXT, gender_id BIGINT )"
table,confidential,confidential,18,"CREATE TABLE confidential ( 	""index"" BIGINT, person_id BIGINT, salary BIGINT )"
table,person_language,person_language,33,"CREATE TABLE person_language ( 	""index"" BIGINT, person_id BIGINT, language_id BIGINT )"
table,language,language,50,"CREATE TABLE language ( 	""index"" BIGINT, language_id BIGINT, language_name TEXT )"
table,gender,gender,55,"CREATE TABLE gender ( 	""index"" BIGINT, gender_id BIGINT, gender TEXT )"
table,country,country,57,"CREATE TABLE country ( 	""index"" BIGINT, country_id TEXT, country TEXT, nationality TEXT )"


## Basic Structure

```SQL
SELECT DISTINCT value_expression AS alias
FROM tables AS alias
WHERE predicate
ORDER BY value_expression
```

### Types

- Character (Fixed width, variable width)
- National Character (Fixed width, variable width)
- Binary
- Numeric (Exact, Approximate)
- Boolean
- DateTime
- Interval

The SQL standard specifies that character strings and datetime literals are enclosed by single quotes. Two single quotes within a string are interpreted as a literal single quote.

```sql
'Gilligan''s island'
```
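A minimal sketch of the quoting rule, run against an in-memory SQLite database through Python's standard `sqlite3` module. The in-memory connection is an assumption made for illustration and is not the `faculty.db` file used in the notebook.

```python
import sqlite3

# Throwaway in-memory database (assumption: not the notebook's faculty.db).
conn = sqlite3.connect(":memory:")

# Inside the SQL literal, the doubled single quote stands for one quote character.
(title,) = conn.execute("SELECT 'Gilligan''s island'").fetchone()
print(title)

conn.close()
```

When the string comes from Python rather than from the SQL text, passing it as a `?` parameter avoids the manual doubling altogether.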

#### The CAST function

```sql
CAST(X AS CHARACTER(10))
```
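The following sketch shows `CAST` in SQLite specifically. It assumes SQLite's type-affinity behaviour, where a type name such as `CHARACTER(10)` is treated as text and the declared length is not enforced; other databases may pad or truncate instead.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Cast an integer to text, a numeric string to an integer, and an integer to a float.
row = conn.execute(
    "SELECT CAST(42 AS CHARACTER(10)), CAST('17' AS INTEGER), CAST(3 AS REAL)"
).fetchone()
print(row)

conn.close()
```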

### Value expression

- Literal
- Column reference
- Function
- CASE
- (Value expression)
- (SELECT expression)

which may be prefixed with unary operators `-` and `+` and combined with binary operators appropriate for the data type.

### Binary operators

#### Concatenation

```SQL
A || B
```

#### Mathematical

```SQL
A + B
A - B
A * B
A / B
```
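A small sketch of the concatenation and mathematical operators above, evaluated by SQLite on literal values. One behaviour worth noticing in SQLite is that `/` between two integers performs integer division.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# || concatenates strings; the arithmetic operators act on numbers.
row = conn.execute(
    "SELECT 'SQL' || 'ite', 7 + 2, 7 - 2, 7 * 2, 7 / 2, 7 / 2.0"
).fetchone()
print(row)

conn.close()
```

Making one operand a float (`7 / 2.0`) is a simple way to get a fractional result.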

#### Date and time arithmetic

```SQL
'2018-08-29' + 3
'11:59' + '00:01'
```
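The expressions above are written in a generic SQL style. SQLite stores dates and times as text, so a sketch of the same two calculations in the SQLite3 dialect would use the built-in `date()` and `time()` functions with modifier strings instead of `+`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Add 3 days to a date and 1 minute to a time using SQLite's modifier syntax.
row = conn.execute(
    "SELECT date('2018-08-29', '+3 days'), time('11:59', '+1 minutes')"
).fetchone()
print(row)

conn.close()
```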

In [23]:
%%sql

SELECT DISTINCT language_name
FROM language
LIMIT 5;

Done.


language_name
PHP
Clojure
Dylan
GNU Octave
D


### Sorting

```SQL
SELECT DISTINCT value_expression AS alias
FROM tables AS alias
ORDER BY value_expression
```

In [24]:
%%sql

SELECT DISTINCT language_name
FROM language
ORDER BY language_name ASC
LIMIT 5;

Done.


language_name
ASP
Assembly
AutoIt
Awk
Bash


### Filtering

For efficiency, place the most stringent filters first.

```SQL
SELECT DISTINCT value_expression AS alias
FROM tables AS alias
WHERE predicate
ORDER BY value_expression
```

#### Predicates for filtering rows

- Comparison operators (=, <>, <, >, <=, >=)
- BETWEEN start AND end
- IN(A, B, C)
- LIKE
- IS NULL
- REGEXP

Use the NOT prefix for negation.
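The predicates listed above can be tried out on a toy table. The sketch below assumes a hypothetical four-row `person` table built in memory (not the notebook's data) and a small helper `names` introduced only for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical toy data; one age is NULL on purpose.
conn.execute("CREATE TABLE person (first TEXT, age INTEGER)")
conn.executemany(
    "INSERT INTO person VALUES (?, ?)",
    [("Ann", 16), ("Bob", 40), ("Cy", None), ("Dee", 17)],
)

def names(predicate):
    """Return first names of rows matching the given WHERE predicate."""
    sql = f"SELECT first FROM person WHERE {predicate} ORDER BY first"
    return [row[0] for row in conn.execute(sql)]

between = names("age BETWEEN 16 AND 17")      # both bounds are inclusive
in_list = names("age IN (16, 40)")
like = names("first LIKE 'D%'")               # % matches any run of characters
is_null = names("age IS NULL")
negated = names("age NOT BETWEEN 16 AND 17")  # the NULL age does not match either way
print(between, in_list, like, is_null, negated)

conn.close()
```

Note that the row with a NULL age is returned by neither `BETWEEN` nor `NOT BETWEEN`; only `IS NULL` finds it.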

#### Combining predicates

```sql
AND
OR
```

Use parentheses to indicate order of evaluation for compound statements.
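To see why the parentheses matter, here is a sketch on a hypothetical in-memory `person` table (toy rows, not the notebook's data). `AND` binds more tightly than `OR`, so the two queries below return different rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE person (first TEXT, age INTEGER, country_id TEXT)")
conn.executemany(
    "INSERT INTO person VALUES (?, ?, ?)",
    [("Ann", 16, "US"), ("Bob", 40, "US"), ("Cy", 16, "FR"), ("Dee", 40, "FR")],
)

# Without parentheses this reads as: age = 16 OR (age = 40 AND country_id = 'FR')
no_parens = [r[0] for r in conn.execute(
    "SELECT first FROM person "
    "WHERE age = 16 OR age = 40 AND country_id = 'FR' ORDER BY first"
)]

# With parentheses the OR is evaluated first, then combined with the country test.
with_parens = [r[0] for r in conn.execute(
    "SELECT first FROM person "
    "WHERE (age = 16 OR age = 40) AND country_id = 'FR' ORDER BY first"
)]
print(no_parens, with_parens)

conn.close()
```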

In [25]:
%%sql

SELECT first, last, age
FROM person
WHERE age BETWEEN 16 AND 17
LIMIT 5;

Done.


first,last,age
Antoine,Beard,16
Augustine,Mejia,16
Boris,Mejia,16
Brain,Haney,16
Burl,Mayo,17


### Joins

Joins combine data from 1 or more tables to form a new result set.

#### Natural join

Uses all columns with the same name in Table1 and Table2 as the join condition.

```SQL
FROM Table1 
NATURAL INNER JOIN Table2
```

#### Inner join

General form of INNER JOIN using ON

```SQL
FROM Table1 
INNER JOIN Table2
ON Table1.Column = Table2.Column
```

If both tables share a column with the same name

```SQL
FROM Table1
INNER JOIN Table2
USING (Column)
```
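As a rough check that the three spellings of an inner join agree, the sketch below builds two hypothetical in-memory tables that share only the column `person_id` (toy rows, not the notebook's data) and runs the `ON`, `USING` and `NATURAL` forms side by side.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE person (person_id INTEGER, first TEXT);
    CREATE TABLE confidential (person_id INTEGER, salary INTEGER);
    INSERT INTO person VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy');
    INSERT INTO confidential VALUES (1, 100), (3, 300);
""")

def run(join_clause):
    """Run the same SELECT with a different join clause and return all rows."""
    sql = f"SELECT first, salary FROM person {join_clause} ORDER BY first"
    return conn.execute(sql).fetchall()

with_on = run("INNER JOIN confidential ON person.person_id = confidential.person_id")
with_using = run("INNER JOIN confidential USING (person_id)")
natural = run("NATURAL INNER JOIN confidential")
print(with_on)

conn.close()
```

The natural join only matches the other two because `person_id` is the single column name the tables have in common. With more shared names, such as the `index` column in the notebook's tables, it would join on all of them.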

Joining more than two tables

```SQL
FROM (Table1 
      INNER JOIN Table2
      ON Table1.column1 = Table2.Column1)
      INNER JOIN Table3 
      ON Table3.column2 = Table2.Column2
```

#### Outer join

General form of OUTER JOIN using ON

```SQL
FROM Table1 
RIGHT OUTER JOIN Table2
ON Table1.Column = Table2.Column
```

```SQL
FROM Table1 
LEFT OUTER JOIN Table2
ON Table1.Column = Table2.Column
```

```SQL
FROM Table1 
FULL OUTER JOIN Table2
ON Table1.Column = Table2.Column
```
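A sketch contrasting an inner join with a left outer join on two hypothetical in-memory tables (toy rows, not the notebook's data). It sticks to `LEFT OUTER JOIN` because SQLite only supports `RIGHT` and `FULL OUTER JOIN` from version 3.39.0 onward.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE person (person_id INTEGER, first TEXT);
    CREATE TABLE confidential (person_id INTEGER, salary INTEGER);
    INSERT INTO person VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy');
    INSERT INTO confidential VALUES (1, 100), (3, 300);
""")

# Inner join: only people with a matching salary row survive.
inner = conn.execute(
    "SELECT first, salary FROM person "
    "INNER JOIN confidential ON person.person_id = confidential.person_id "
    "ORDER BY first"
).fetchall()

# Left outer join: every person is kept; missing salaries come back as NULL (None).
left = conn.execute(
    "SELECT first, salary FROM person "
    "LEFT OUTER JOIN confidential ON person.person_id = confidential.person_id "
    "ORDER BY first"
).fetchall()
print(inner, left)

conn.close()
```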

In [26]:
%%sql

SELECT first, last, language_name 
FROM person
INNER JOIN person_language ON person.person_id = person_language.person_id
INNER JOIN language ON language.language_id = person_language.language_id
LIMIT 10;

Done.


first,last,language_name
Aaron,Alexander,Haskell
Aaron,Kirby,GNU Octave
Aaron,Kirby,haXe
Aaron,Kirby,Falcon
Abram,Allen,TypeScript
Abram,Boyer,Io
Abram,Boyer,Lua
Abram,Boyer,Falcon
Adan,Brown,F#
Adolph,Dalton,Dart


### Set operations 

```SQL
SELECT a, b 
FROM table1
SetOp
SELECT a, b 
FROM table2
```

where SetOp is `INTERSECT`, `EXCEPT`, `UNION` or `UNION ALL`.

#### Intersection

```sql
INTERSECT
```

Alternative using `INNER JOIN`
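One way the `INNER JOIN` alternative might look is sketched below on two hypothetical in-memory tables `table1` and `table2` (toy rows). The join has to compare every selected column and add `DISTINCT`, since `INTERSECT` removes duplicate rows on its own.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (a INTEGER, b TEXT);
    CREATE TABLE table2 (a INTEGER, b TEXT);
    INSERT INTO table1 VALUES (1, 'x'), (2, 'y'), (2, 'y'), (3, 'z');
    INSERT INTO table2 VALUES (2, 'y'), (3, 'z'), (4, 'w');
""")

# Set operator: rows present in both tables, duplicates removed.
via_intersect = conn.execute(
    "SELECT a, b FROM table1 INTERSECT SELECT a, b FROM table2 ORDER BY a"
).fetchall()

# Join alternative: match on all columns and de-duplicate explicitly.
via_join = conn.execute(
    "SELECT DISTINCT table1.a, table1.b FROM table1 "
    "INNER JOIN table2 ON table1.a = table2.a AND table1.b = table2.b "
    "ORDER BY table1.a"
).fetchall()
print(via_intersect, via_join)

conn.close()
```

The two forms can differ when columns contain NULL: `INTERSECT` treats two NULLs as the same value, whereas the `=` in the join condition does not.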

#### Union

```SQL
UNION
UNION ALL  -- does not eliminate duplicate rows
```

#### Difference

```SQL
EXCEPT
```

Alternative using `OUTER JOIN` with test for `NULL`
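The `OUTER JOIN` alternative can be sketched as follows, again on two hypothetical in-memory tables `table1` and `table2` (toy rows). Rows of `table1` with no partner in `table2` get NULLs on the right-hand side, and the `IS NULL` test keeps only those.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (a INTEGER, b TEXT);
    CREATE TABLE table2 (a INTEGER, b TEXT);
    INSERT INTO table1 VALUES (1, 'x'), (2, 'y'), (2, 'y'), (3, 'z');
    INSERT INTO table2 VALUES (2, 'y'), (3, 'z'), (4, 'w');
""")

# Set operator: rows of table1 that do not appear in table2.
via_except = conn.execute(
    "SELECT a, b FROM table1 EXCEPT SELECT a, b FROM table2 ORDER BY a"
).fetchall()

# Join alternative: keep left rows whose right-hand match is missing (NULL).
via_join = conn.execute(
    "SELECT DISTINCT table1.a, table1.b FROM table1 "
    "LEFT OUTER JOIN table2 ON table1.a = table2.a AND table1.b = table2.b "
    "WHERE table2.a IS NULL "
    "ORDER BY table1.a"
).fetchall()
print(via_except, via_join)

conn.close()
```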

In [27]:
%%sql

DROP VIEW IF EXISTS language_view;
CREATE VIEW language_view AS
SELECT first, last, language_name 
FROM person
INNER JOIN person_language ON person.person_id = person_language.person_id
INNER JOIN language ON language.language_id = person_language.language_id
;

Done.
Done.


[]

In [28]:
%%sql

SELECT *
FROM language_view 
LIMIT 10;

Done.


first,last,language_name
Aaron,Alexander,Haskell
Aaron,Kirby,GNU Octave
Aaron,Kirby,haXe
Aaron,Kirby,Falcon
Abram,Allen,TypeScript
Abram,Boyer,Io
Abram,Boyer,Lua
Abram,Boyer,Falcon
Adan,Brown,F#
Adolph,Dalton,Dart


In [29]:
%%sql

SELECT *
FROM language_view 
WHERE language_name = 'Python'
UNION
SELECT *
FROM language_view 
WHERE language_name = 'Haskell'
LIMIT 10;

Done.


first,last,language_name
Aaron,Alexander,Haskell
Andree,Douglas,Haskell
Arlie,Terrell,Python
Boyd,Blackwell,Haskell
Buck,Howe,Haskell
Carlton,Richard,Haskell
Carylon,Zamora,Python
Clarisa,Rodgers,Python
Dinorah,O'brien,Haskell
Dorian,Lloyd,Haskell


In [30]:
%%sql

SELECT *
FROM language_view 
WHERE language_name IN ('Python', 'Haskell')
ORDER BY first
LIMIT 10;

Done.


first,last,language_name
Aaron,Alexander,Haskell
Andree,Douglas,Haskell
Arlie,Terrell,Python
Boyd,Blackwell,Haskell
Buck,Howe,Haskell
Carlton,Richard,Haskell
Carylon,Zamora,Python
Clarisa,Rodgers,Python
Dinorah,O'brien,Haskell
Dorian,Lloyd,Haskell


### Aggregate functions

```SQL
COUNT
MIN
MAX
AVG
SUM
```

In [31]:
%%sql

SELECT count(language_name) 
FROM language_view;

Done.


count(language_name)
2297


### Grouping

```SQL
SELECT a, MIN(b) AS min_b, MAX(b) AS max_b, AVG(b) AS mean_b
FROM table
GROUP BY a
HAVING mean_b > 5
```

The `HAVING` clause is analogous to the `WHERE` clause, but filters on aggregate conditions. Note that the `WHERE` clause filters rows BEFORE the grouping is done.

Note: Any variable in the SELECT part that is not an aggregate function needs to be in the GROUP BY part.

```SQL
SELECT a, b, c, COUNT(d)
FROM table
GROUP BY a, b, c
```
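The difference between filtering before and after grouping can be seen in a small sketch. The table `t` and its rows are hypothetical, created in memory for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (a TEXT, b INTEGER)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [("x", 1), ("x", 9), ("y", 6), ("y", 8), ("z", 2), ("z", 4)],
)

# HAVING: average over all rows of each group, then drop groups with mean <= 4.
having = conn.execute(
    "SELECT a, AVG(b) AS mean_b FROM t GROUP BY a HAVING AVG(b) > 4 ORDER BY a"
).fetchall()

# WHERE: drop rows with b <= 4 first, then average whatever is left in each group.
where = conn.execute(
    "SELECT a, AVG(b) AS mean_b FROM t WHERE b > 4 GROUP BY a ORDER BY a"
).fetchall()
print(having, where)

conn.close()
```

Group `x` shows the effect: its mean is 5.0 over both rows, but 9.0 once the `WHERE` clause has removed the row with `b = 1`.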

In [32]:
%%sql

SELECT language_name, count(*) AS n
FROM language_view
GROUP BY language_name
HAVING n > 45;

Done.


language_name,n
AutoIt,61
Bash,48
ECMAScript,48
GNU Octave,49
JavaScript,48
Perl,55
PowerShell,50
Prolog,50


### The CASE switch

#### Simple CASE

```SQL
SELECT name,
(CASE sex 
 WHEN 'M' THEN 1.5*dose
 WHEN 'F' THEN dose
 END) as adjusted_dose
FROM table
```

#### Searched CASE

```SQL
SELECT name,
(CASE  
 WHEN sex = 'M' THEN 1.5*dose
 WHEN sex = 'F' THEN dose
 END) as adjusted_dose
FROM table
```
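The two forms can be compared on a hypothetical in-memory `patients` table (the table name and rows are assumptions for this sketch). It also shows what happens when no `WHEN` branch matches and there is no `ELSE`: the expression evaluates to NULL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE patients (name TEXT, sex TEXT, dose REAL)")
conn.executemany(
    "INSERT INTO patients VALUES (?, ?, ?)",
    [("Al", "M", 10.0), ("Bea", "F", 10.0), ("Cam", None, 10.0)],
)

# Simple CASE: compares one expression (sex) against each WHEN value.
simple = conn.execute(
    "SELECT name, (CASE sex WHEN 'M' THEN 1.5*dose WHEN 'F' THEN dose END) "
    "AS adjusted_dose FROM patients ORDER BY name"
).fetchall()

# Searched CASE: each WHEN carries its own full predicate.
searched = conn.execute(
    "SELECT name, (CASE WHEN sex = 'M' THEN 1.5*dose WHEN sex = 'F' THEN dose END) "
    "AS adjusted_dose FROM patients ORDER BY name"
).fetchall()
print(simple)

conn.close()
```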

In [33]:
%%sql

SELECT first, last, language_name,
(CASE
    WHEN language_name LIKE 'H%' THEN 'Hire'
    ELSE 'FIRE'
END
) AS outcome
FROM language_view
LIMIT 10;

Done.


first,last,language_name,outcome
Aaron,Alexander,Haskell,Hire
Aaron,Kirby,GNU Octave,FIRE
Aaron,Kirby,haXe,Hire
Aaron,Kirby,Falcon,FIRE
Abram,Allen,TypeScript,FIRE
Abram,Boyer,Io,FIRE
Abram,Boyer,Lua,FIRE
Abram,Boyer,Falcon,FIRE
Adan,Brown,F#,FIRE
Adolph,Dalton,Dart,FIRE


## Exercises

1. Find the youngest and oldest faculty member of each gender.

2. Find the median age of the faculty members who know Python.

As SQLite3 does not provide a median function, you can create a User Defined Function (UDF) to do this. See [documentation](https://docs.python.org/2/library/sqlite3.html#sqlite3.Connection.create_function).

3. Arrange countries by the average age of faculty in descending order. Countries are only included in the table if there are at least 3 faculty members from that country.

4. Which country has the highest average body mass index (BMI) among the faculty? Recall that BMI is weight (kg) / (height (m))^2.

5. Do obese faculty (BMI > 30) know more languages on average than non-obese faculty?
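As a hint for exercise 2, here is one possible sketch of a median UDF using Python's `sqlite3` module. Because a median summarizes many rows, it is registered with `create_aggregate` rather than `create_function`. The `Median` class and the toy `ages` table are assumptions for illustration, and the query for faculty who know Python is left to the reader.

```python
import sqlite3
import statistics

class Median:
    """Aggregate that collects the non-NULL values of a group and returns their median."""

    def __init__(self):
        self.values = []

    def step(self, value):
        # Called once per row; skip NULLs like the built-in aggregates do.
        if value is not None:
            self.values.append(value)

    def finalize(self):
        # Called once per group; an empty group yields NULL.
        return statistics.median(self.values) if self.values else None

conn = sqlite3.connect(":memory:")
conn.create_aggregate("median", 1, Median)

# Hypothetical toy data to exercise the aggregate.
conn.execute("CREATE TABLE ages (age INTEGER)")
conn.executemany("INSERT INTO ages VALUES (?)", [(30,), (40,), (50,), (70,), (None,)])

(median_age,) = conn.execute("SELECT median(age) FROM ages").fetchone()
print(median_age)

conn.close()
```

With the `%sql` magic, the aggregate would need to be registered on the same underlying connection that the magic uses, which depends on how the notebook is set up.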