# Set-Up

In [1]:
import sqlite3
import pandas as pd

In [2]:
conn = sqlite3.connect('../data/data.db')
c = conn.cursor()

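def execute_statement(statement):
    """Run a SQL statement and return the result set as a DataFrame."""
    c.execute(statement)
    res = c.fetchall()
    # c.description holds one tuple per result column; the first item is the column name
    column_names = [description[0] for description in c.description]
    return pd.DataFrame(res, columns=column_names)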

# NULL

NULL is used in databases to mark the absence of a value. For example, if one sensor fails while temperature readings are being taken, the missing reading from that sensor can be represented with the NULL marker. This is different from 0 (or an empty string), because 0 could be a valid reading from the sensor.

When rows are filtered with a WHERE clause, a condition involving a NULL marker never evaluates to true. Because NULL is not a value, the condition evaluates to neither true nor false but to 'unknown', and WHERE keeps only the rows for which the condition is true. Some may expect _LENGTH(column) < 20_ to be true for a NULL marker (as NULL is sometimes displayed as an empty string), but it evaluates to 'unknown', so the row is filtered out.
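The behaviour described above can be checked with a small sketch. It uses a separate in-memory database and a hypothetical `readings` table invented for illustration, so it does not touch `data.db` or the `conn` and `c` objects from the set-up.

```python
import sqlite3

# Hypothetical in-memory table, used only for illustration.
demo = sqlite3.connect(':memory:')
demo.execute('CREATE TABLE readings (sensor TEXT, note TEXT)')
demo.executemany('INSERT INTO readings VALUES (?, ?)',
                 [('a', 'ok'), ('b', ''), ('c', None)])

# Sensor 'c' is not returned: LENGTH(NULL) is NULL, so the
# comparison is 'unknown' rather than true.
short_notes = demo.execute(
    'SELECT sensor FROM readings WHERE LENGTH(note) < 20 ORDER BY sensor'
).fetchall()

# Negating the condition does not return it either:
# NOT 'unknown' is still 'unknown'.
not_short_notes = demo.execute(
    'SELECT sensor FROM readings WHERE NOT (LENGTH(note) < 20)'
).fetchall()

print(short_notes)      # [('a',), ('b',)]
print(not_short_notes)  # []
```

The empty string in row 'b' passes the filter because its length is 0, while the NULL marker in row 'c' is excluded by both the condition and its negation.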

# Checking For NULL
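To check for NULL markers, use the IS NULL operator in the WHERE clause. The example below left joins the LeicestershireBirds table to a per-taxon count of the rows in the cites table. A bird with no matching row in cites has no count, so its Count column is NULL after the join. The IS NULL operator then returns all of the birds in Leicestershire that do not appear in the CITES trade database.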

In [3]:
execute_statement('''SELECT LeicestershireBirds.Taxon
                     FROM LeicestershireBirds
                     LEFT JOIN (
                         SELECT Taxon, Count(Taxon) AS Count
                         FROM cites GROUP BY Taxon)
                     AS Cit
                     ON LeicestershireBirds.Taxon = Cit.Taxon
                     WHERE Cit.Count IS NULL''')

Unnamed: 0,Taxon
0,Cygnus olor
1,Cygnus columbianus
2,Cygnus cygnus
3,Anser fabalis
4,Anser brachyrhynchus
...,...
274,Emberiza citrinella
275,Emberiza cirlus
276,Emberiza schoeniclus
277,Emberiza melanocephala


The opposite check uses the IS NOT NULL operator in the WHERE clause. The example below performs the same join between LeicestershireBirds and the per-taxon counts from the cites table, and then uses IS NOT NULL to return all of the birds in Leicestershire that do appear in the CITES trade database.

In [4]:
execute_statement('''SELECT LeicestershireBirds.Taxon
                     FROM LeicestershireBirds
                     LEFT JOIN (
                         SELECT Taxon, Count(Taxon) AS Count
                         FROM cites GROUP BY Taxon)
                     AS Cit
                     ON LeicestershireBirds.Taxon = Cit.Taxon
                     WHERE Cit.Count IS NOT NULL''')

Unnamed: 0,Taxon
0,Alopochen aegyptiaca
1,Anas crecca
2,Anas acuta
3,Aythya nyroca
4,Oxyura jamaicensis
5,Bubulcus ibis
6,Egretta garzetta
7,Ardea alba
8,Ciconia nigra
9,Platalea leucorodia
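Both checks rely on the IS operators rather than on `=`. Because any comparison with a NULL marker evaluates to 'unknown', a condition written as `= NULL` matches no rows at all. A minimal sketch, again using a hypothetical in-memory table `t` invented for illustration:

```python
import sqlite3

# Hypothetical table with one ordinary value and one NULL marker.
demo = sqlite3.connect(':memory:')
demo.execute('CREATE TABLE t (x INTEGER)')
demo.executemany('INSERT INTO t VALUES (?)', [(1,), (None,)])

def count_where(condition):
    """Count the rows of t that satisfy the given WHERE condition."""
    return demo.execute('SELECT COUNT(*) FROM t WHERE ' + condition).fetchone()[0]

equals_null = count_where('x = NULL')     # comparison is 'unknown' for every row
is_null = count_where('x IS NULL')        # matches the NULL marker
is_not_null = count_where('x IS NOT NULL')  # matches the ordinary value

print(equals_null, is_null, is_not_null)  # 0 1 1
```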
