# SQL Tutorial 1: Simple Queries

- Run `pip install -r requirements.txt` to install `jupysql` *before* running `jupyter lab`
- Load the [jupysql](https://jupysql.ploomber.io/) extension

In [1]:
%load_ext sql

- Connect to the `penguins.db` SQLite database
- This database has one table called `penguins` which was created from `penguins.csv`

In [2]:
%sql sqlite:///penguins.db

- Create a SQL cell by putting `%%sql` on the first line and a valid SQL query in the rest of the cell.
- The query `select * from penguins` gets all columns for all rows from the `penguins` table.
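- The semi-colon marks the end of the statement. `jupysql` will run a single query without it, but tools such as the `sqlite3` command-line shell require it, so always include it.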
- The `jupysql` extension truncates the display to 10 rows by default.

In [3]:
%%sql
select *
from penguins;

species,island,bill_length_mm,bill_depth_mm,flipper_length_mm,body_mass_g,sex
Adelie,Torgersen,39.1,18.7,181.0,3750.0,MALE
Adelie,Torgersen,39.5,17.4,186.0,3800.0,FEMALE
Adelie,Torgersen,40.3,18.0,195.0,3250.0,FEMALE
Adelie,Torgersen,,,,,
Adelie,Torgersen,36.7,19.3,193.0,3450.0,FEMALE
Adelie,Torgersen,39.3,20.6,190.0,3650.0,MALE
Adelie,Torgersen,38.9,17.8,181.0,3625.0,FEMALE
Adelie,Torgersen,39.2,19.6,195.0,4675.0,MALE
Adelie,Torgersen,34.1,18.1,193.0,3475.0,
Adelie,Torgersen,42.0,20.2,190.0,4250.0,


- We can select only the columns we want by giving their names (in any order)
- And use `limit N` to specify a maximum number of rows

In [4]:
%%sql
select island, species, sex
from penguins
limit 5;

island,species,sex
Torgersen,Adelie,MALE
Torgersen,Adelie,FEMALE
Torgersen,Adelie,FEMALE
Torgersen,Adelie,
Torgersen,Adelie,FEMALE
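
- The same idea can be tried outside the notebook with Python's built-in `sqlite3` module. This is a minimal sketch, assuming a tiny in-memory stand-in for the `penguins` table (three made-up rows, not the real `penguins.db` file).

```python
import sqlite3

# Hypothetical stand-in for the penguins table, held in memory only.
conn = sqlite3.connect(":memory:")
conn.execute("create table penguins (species text, island text, sex text)")
conn.executemany(
    "insert into penguins values (?, ?, ?)",
    [
        ("Adelie", "Torgersen", "MALE"),
        ("Adelie", "Torgersen", "FEMALE"),
        ("Gentoo", "Biscoe", "FEMALE"),
    ],
)

# Columns come back in the order we list them, not the order they are stored in,
# and `limit 2` caps the result at two rows.
rows = conn.execute("select island, species from penguins limit 2").fetchall()
print(rows)
```

- Each result row is a tuple with `island` first and `species` second, even though the table stores them the other way round.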


- The order of the clauses in the `select` statement matters: putting `from` before `select` is a syntax error

In [5]:
%%sql
from penguins
select island, species, sex
limit 5;

RuntimeError: If using snippets, you may pass the --with argument explicitly.
For more details please refer: https://jupysql.ploomber.io/en/latest/compose.html#with-argument


Original error message from DB driver:
(sqlite3.OperationalError) near "from": syntax error
[SQL: from penguins
select island, species, sex
limit 5;]
(Background on this error at: https://sqlalche.me/e/20/e3q8)

If you need help solving this issue, send us a message: https://ploomber.io/community
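
- Most of the message above comes from `jupysql` and SQLAlchemy; the part that matters is the line from the database driver. A minimal sketch using Python's `sqlite3` module directly, with an empty in-memory table standing in for `penguins`, shows the same underlying error on its own.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table penguins (species text, island text, sex text)")

# Same clauses as the failing cell, with `from` placed before `select`.
bad_query = "from penguins select island, species, sex limit 5"
try:
    conn.execute(bad_query)
    message = None
except sqlite3.OperationalError as err:
    # SQLite rejects the statement while parsing, before looking at any data.
    message = str(err)

print(message)
```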


- Let's change the configuration of our display
  - These are *not* SQL or SQLite commands: they are specific to Jupyter and the `jupysql` plugin

In [6]:
%config SqlMagic.autolimit = 0
%config SqlMagic.displaylimit = 0

- The `distinct` keyword eliminates duplicated rows from the results
- We don't have to sort the data first (unlike `sort | uniq` in the Unix shell)

In [7]:
%%sql
select distinct island, species, sex
from penguins;

island,species,sex
Torgersen,Adelie,MALE
Torgersen,Adelie,FEMALE
Torgersen,Adelie,
Biscoe,Adelie,FEMALE
Biscoe,Adelie,MALE
Dream,Adelie,FEMALE
Dream,Adelie,MALE
Dream,Adelie,
Dream,Chinstrap,FEMALE
Dream,Chinstrap,MALE
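
- To see that `distinct` does not depend on the rows being sorted, here is a minimal sketch with Python's `sqlite3` module. The four-row table is made up for illustration, and the duplicated rows are deliberately not adjacent.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table penguins (species text, island text)")
conn.executemany(
    "insert into penguins values (?, ?)",
    [
        ("Adelie", "Dream"),
        ("Gentoo", "Biscoe"),
        ("Adelie", "Dream"),   # duplicate, separated from its twin
        ("Gentoo", "Biscoe"),  # duplicate, separated from its twin
    ],
)

all_rows = conn.execute("select species, island from penguins").fetchall()
unique_rows = conn.execute("select distinct species, island from penguins").fetchall()
print(len(all_rows), len(unique_rows))
```

- The first query returns all four rows; the second returns one row per distinct combination of values.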


- To sort results, we can use `order by` followed by a column name and either `asc` (for "ascending") or `desc` (for "descending")
  - If we just use `order by`, we get ascending order
  - But please make the ordering explicit

In [8]:
%%sql
select distinct island, species
from penguins
order by species asc, island desc;

island,species
Torgersen,Adelie
Dream,Adelie
Biscoe,Adelie
Dream,Chinstrap
Biscoe,Gentoo
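
- When `order by` lists several columns, the first is the primary sort key and later ones only break ties. A minimal sketch with Python's `sqlite3` module, assuming a small hand-made table with the same island and species pairs as the result above, inserted in scrambled order:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table penguins (species text, island text)")
conn.executemany(
    "insert into penguins values (?, ?)",
    [
        ("Gentoo", "Biscoe"),
        ("Adelie", "Biscoe"),
        ("Chinstrap", "Dream"),
        ("Adelie", "Torgersen"),
        ("Adelie", "Dream"),
    ],
)

# species is sorted A to Z; within one species, island is sorted Z to A.
ordered = conn.execute(
    "select island, species from penguins order by species asc, island desc"
).fetchall()
for row in ordered:
    print(row)
```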


- We can *filter* data by specifying a Boolean condition using `where`
- Each row is tested independently
- Only the rows that pass the test are kept

In [9]:
%%sql
select island, species, sex
from penguins
where sex = 'FEMALE'
limit 5;

island,species,sex
Torgersen,Adelie,FEMALE
Torgersen,Adelie,FEMALE
Torgersen,Adelie,FEMALE
Torgersen,Adelie,FEMALE
Torgersen,Adelie,FEMALE


- Re-run that with `distinct` to show what kinds of rows it selects

In [10]:
%%sql
select distinct island, species, sex
from penguins
where sex = 'FEMALE';

island,species,sex
Torgersen,Adelie,FEMALE
Biscoe,Adelie,FEMALE
Dream,Adelie,FEMALE
Dream,Chinstrap,FEMALE
Biscoe,Gentoo,FEMALE


- When using `and`, a row must pass *both* tests in order to be included in the results

In [11]:
%%sql
select distinct island, species, sex
from penguins
where sex = 'FEMALE' and island = 'Dream';

island,species,sex
Dream,Adelie,FEMALE
Dream,Chinstrap,FEMALE


- When using `or`, a row is included if it passes *either* test (or both)

In [12]:
%%sql
select distinct island, species, sex
from penguins
where sex = 'FEMALE' or island = 'Dream';

island,species,sex
Torgersen,Adelie,FEMALE
Biscoe,Adelie,FEMALE
Dream,Adelie,FEMALE
Dream,Adelie,MALE
Dream,Adelie,
Dream,Chinstrap,FEMALE
Dream,Chinstrap,MALE
Biscoe,Gentoo,FEMALE
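
- The difference between `and` and `or` is easiest to see side by side. This is a minimal sketch with Python's `sqlite3` module, assuming a made-up four-row table that covers every combination of two islands and two sexes; the helper `matching` is hypothetical and only exists for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table penguins (island text, sex text)")
conn.executemany(
    "insert into penguins values (?, ?)",
    [
        ("Dream", "FEMALE"),
        ("Dream", "MALE"),
        ("Biscoe", "FEMALE"),
        ("Biscoe", "MALE"),
    ],
)

def matching(condition):
    # The condition is hard-coded, trusted SQL in this sketch; do not build
    # queries from user input this way.
    query = f"select island, sex from penguins where {condition}"
    return conn.execute(query).fetchall()

both = matching("sex = 'FEMALE' and island = 'Dream'")
either = matching("sex = 'FEMALE' or island = 'Dream'")
print(len(both), len(either))
```

- Every row that passes the `and` test also passes the `or` test, so the first result is always a subset of the second.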


- We can negate a condition with `not` or use `!=` (the not-equals operator)

In [13]:
%%sql
select distinct island, species, sex
from penguins
where sex != 'FEMALE';

island,species,sex
Torgersen,Adelie,MALE
Biscoe,Adelie,MALE
Dream,Adelie,MALE
Dream,Chinstrap,MALE
Biscoe,Gentoo,MALE
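
- The two spellings of negation select the same rows. A minimal sketch with Python's `sqlite3` module, assuming a made-up table that includes one row whose `sex` is missing (inserted as Python's `None`):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table penguins (island text, sex text)")
conn.executemany(
    "insert into penguins values (?, ?)",
    [("Dream", "FEMALE"), ("Dream", "MALE"), ("Biscoe", "MALE"), ("Dream", None)],
)

with_operator = conn.execute(
    "select island, sex from penguins where sex != 'FEMALE'"
).fetchall()
with_not = conn.execute(
    "select island, sex from penguins where not (sex = 'FEMALE')"
).fetchall()
print(with_operator)
print(with_not)
```

- Both queries return only the `MALE` rows. The row with the missing `sex` appears in neither, just as the rows with an empty `sex` are absent from the output above.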


- We can rename columns using `as`
  - This changes the name in the results, *not* in the database

In [14]:
%%sql
select island as location, sex
from penguins
limit 5;

location,sex
Torgersen,MALE
Torgersen,FEMALE
Torgersen,FEMALE
Torgersen,
Torgersen,FEMALE
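
- One way to check that `as` only relabels the result is to compare the column names of the result with those of the table. This is a minimal sketch with Python's `sqlite3` module, using a one-row stand-in table; `pragma table_info` is specific to SQLite.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table penguins (island text, sex text)")
conn.execute("insert into penguins values ('Torgersen', 'MALE')")

# cursor.description holds one entry per result column; the name is at index 0.
cursor = conn.execute("select island as location, sex from penguins")
result_names = [col[0] for col in cursor.description]

# pragma table_info returns one row per table column; the name is at index 1.
table_names = [row[1] for row in conn.execute("pragma table_info(penguins)")]
print(result_names, table_names)
```

- The result reports `location`, while the table itself still has a column called `island`.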


- We can do arithmetic and string operations on values while selecting them

In [15]:
%%sql
select flipper_length_mm / 10.0 as flipper_cm, body_mass_g / 1000.0 as weight_kg
from penguins
limit 5;

flipper_cm,weight_kg
18.1,3.75
18.6,3.8
19.5,3.25
,
19.3,3.45
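
- Writing `10.0` rather than `10` is a useful habit: in SQLite, dividing one integer by another gives an integer result. The measurements in this table are already stored as real numbers, so here it is a safeguard, but with integer columns it changes the answer. A minimal sketch with Python's `sqlite3` module, using literal values instead of the table:

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# A select with no from clause simply evaluates the expressions.
int_div, real_div, with_null = conn.execute(
    "select 181 / 10, 181 / 10.0, null / 10.0"
).fetchone()
print(int_div, real_div, with_null)  # 18 18.1 None
```

- The third value shows that arithmetic on a missing value produces a missing value, which is why one row of the output above is empty.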


- What is that empty row doing there? (The notebook displays the missing values as `None`.)
- SQL uses a special value `null` to represent missing data
- It is not 0, not the empty string, not NaN: it means "I don't know"
- Nothing is equal to it

In [16]:
%%sql
select *
from penguins
where sex == null;

species,island,bill_length_mm,bill_depth_mm,flipper_length_mm,body_mass_g,sex


- But nothing is *not* equal to it either
  - If we don't know the value, we don't know that something *isn't* the same value

In [17]:
%%sql
select *
from penguins
where sex != null;

species,island,bill_length_mm,bill_depth_mm,flipper_length_mm,body_mass_g,sex


- In fact, `null` isn't even equal or not equal to itself
  - If I have two values, and I don't know what they are, I can't tell if they're equal or not equal
- Use `is null` and `is not null` to test

In [18]:
%%sql
select *
from penguins
where sex is null;

species,island,bill_length_mm,bill_depth_mm,flipper_length_mm,body_mass_g,sex
Adelie,Torgersen,,,,,
Adelie,Torgersen,34.1,18.1,193.0,3475.0,
Adelie,Torgersen,42.0,20.2,190.0,4250.0,
Adelie,Torgersen,37.8,17.1,186.0,3300.0,
Adelie,Torgersen,37.8,17.3,180.0,3700.0,
Adelie,Dream,37.5,18.9,179.0,2975.0,
Gentoo,Biscoe,44.5,14.3,216.0,4100.0,
Gentoo,Biscoe,46.2,14.4,214.0,4650.0,
Gentoo,Biscoe,47.3,13.8,216.0,4725.0,
Gentoo,Biscoe,44.5,15.7,217.0,4875.0,
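
- The rules for `null` can be checked directly by asking SQLite to evaluate each comparison. This is a minimal sketch with Python's `sqlite3` module, which returns SQL `null` as Python's `None` and SQLite's true and false as `1` and `0`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Each expression compares null with null in a different way.
eq, ne, is_null, is_not_null = conn.execute(
    "select null = null, null != null, null is null, null is not null"
).fetchone()
print(eq, ne, is_null, is_not_null)  # None None 1 0
```

- Both `=` and `!=` produce `null` rather than true or false, and `where` keeps only rows whose condition is true. That is why the two earlier queries returned no rows while `is null` found the missing values.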
