<font size='5'><b>Filtering and Ordering in SQL</b></font>

In [1]:
import pandas as pd
import numpy as np
import sqlite3
conn = sqlite3.connect('pets_database.db')
cur = conn.cursor()

In [2]:
cur.execute("""SELECT * FROM cats
               ORDER BY age DESC;""")
df = pd.DataFrame(cur.fetchall())
df.columns = [x[0] for x in cur.description]
df.head(6)

Unnamed: 0,id,name,age,breed,owner_id
0,4,Moe,10.0,Tabby,
1,3,Lil' Bub,5.0,American Shorthair,
2,1,Maru,3.0,Scottish Fold,1.0
3,5,Patches,2.0,Calico,
4,2,Hana,1.0,Tabby,1.0
5,6,,,Tabby,


In [3]:
df.shape

(6, 5)

To get the details of only the oldest cat, sort by age in descending order and add LIMIT 1:

In [4]:
cur.execute("""SELECT * FROM cats
               ORDER BY age DESC
               LIMIT 1;""")
df = pd.DataFrame(cur.fetchall())
df.columns = [x[0] for x in cur.description]
df.head()

Unnamed: 0,id,name,age,breed,owner_id
0,4,Moe,10,Tabby,


For the youngest cat, changing the ORDER BY to ASC looks like it should do the trick. Let's try it out:

In [5]:
cur.execute("""SELECT * FROM cats
               ORDER BY age ASC
               LIMIT 1;""")
df = pd.DataFrame(cur.fetchall())
df.columns = [x[0] for x in cur.description]
df.head()

Unnamed: 0,id,name,age,breed,owner_id
0,6,,,Tabby,
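
The query above returns the cat with id 6, whose age is missing, rather than the youngest cat with a known age. SQLite treats NULL as smaller than any other value when sorting, so it comes first in ascending order. A minimal sketch of one way around this, assuming an in-memory copy of the cats table rebuilt from the output shown earlier (the column types are a guess, not taken from pets_database.db):

```python
import sqlite3

# Hypothetical in-memory stand-in for the cats table; rows copied from the output above.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("""CREATE TABLE cats (id INTEGER PRIMARY KEY, name TEXT,
               age INTEGER, breed TEXT, owner_id INTEGER);""")
cur.executemany("INSERT INTO cats VALUES (?, ?, ?, ?, ?);", [
    (1, "Maru", 3, "Scottish Fold", 1),
    (2, "Hana", 1, "Tabby", 1),
    (3, "Lil' Bub", 5, "American Shorthair", None),
    (4, "Moe", 10, "Tabby", None),
    (5, "Patches", 2, "Calico", None),
    (6, None, None, "Tabby", None),
])

# Filter out rows with a missing age before sorting and limiting.
youngest = cur.execute("""SELECT name, age FROM cats
               WHERE age IS NOT NULL
               ORDER BY age ASC
               LIMIT 1;""").fetchall()
print(youngest)  # [('Hana', 1)]
```

Filtering with WHERE keeps the intent explicit: rows with an unknown age are not candidates for "youngest".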


<b>Using the WHERE clause</b>

In [7]:
cur.execute("""SELECT * FROM cats
               WHERE age BETWEEN 1 AND 3
               ORDER BY age DESC;""")
df = pd.DataFrame(cur.fetchall())
df.columns = [x[0] for x in cur.description]
df.head()

Unnamed: 0,id,name,age,breed,owner_id
0,1,Maru,3,Scottish Fold,1.0
1,5,Patches,2,Calico,
2,2,Hana,1,Tabby,1.0
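
BETWEEN includes both endpoints, which is why the cats aged exactly 1 and 3 appear above. The sketch below checks that against the equivalent pair of comparisons, assuming a hypothetical in-memory copy of the cats table rebuilt from the earlier output:

```python
import sqlite3

# Hypothetical in-memory stand-in for the cats table; rows copied from the output above.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("""CREATE TABLE cats (id INTEGER PRIMARY KEY, name TEXT,
               age INTEGER, breed TEXT, owner_id INTEGER);""")
cur.executemany("INSERT INTO cats VALUES (?, ?, ?, ?, ?);", [
    (1, "Maru", 3, "Scottish Fold", 1),
    (2, "Hana", 1, "Tabby", 1),
    (3, "Lil' Bub", 5, "American Shorthair", None),
    (4, "Moe", 10, "Tabby", None),
    (5, "Patches", 2, "Calico", None),
    (6, None, None, "Tabby", None),
])

# BETWEEN low AND high is shorthand for >= low AND <= high.
with_between = cur.execute("""SELECT name FROM cats
               WHERE age BETWEEN 1 AND 3
               ORDER BY age DESC;""").fetchall()
with_comparisons = cur.execute("""SELECT name FROM cats
               WHERE age >= 1 AND age <= 3
               ORDER BY age DESC;""").fetchall()
print(with_between == with_comparisons)  # True
```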


As you may have noticed, one cat has no name. You can find rows with missing values by testing a column with <b>IS NULL</b>:

In [8]:
cur.execute("""SELECT * FROM cats
               WHERE name IS null;""")
df = pd.DataFrame(cur.fetchall())
df.columns = [x[0] for x in cur.description]
df.head()

Unnamed: 0,id,name,age,breed,owner_id
0,6,,,Tabby,
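
The test has to be written with IS NULL rather than `= NULL`, because comparing anything to NULL with `=` never evaluates to true. A short sketch of the difference, assuming a hypothetical in-memory copy of the cats table rebuilt from the earlier output:

```python
import sqlite3

# Hypothetical in-memory stand-in for the cats table; rows copied from the output above.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("""CREATE TABLE cats (id INTEGER PRIMARY KEY, name TEXT,
               age INTEGER, breed TEXT, owner_id INTEGER);""")
cur.executemany("INSERT INTO cats VALUES (?, ?, ?, ?, ?);", [
    (1, "Maru", 3, "Scottish Fold", 1),
    (2, "Hana", 1, "Tabby", 1),
    (3, "Lil' Bub", 5, "American Shorthair", None),
    (4, "Moe", 10, "Tabby", None),
    (5, "Patches", 2, "Calico", None),
    (6, None, None, "Tabby", None),
])

# "= NULL" matches nothing: the comparison yields NULL, which WHERE treats as false.
equals_null = cur.execute("SELECT id FROM cats WHERE name = NULL;").fetchall()
# IS NULL is the operator that actually tests for a missing value.
is_null = cur.execute("SELECT id FROM cats WHERE name IS NULL;").fetchall()
print(equals_null)  # []
print(is_null)      # [(6,)]
```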


SQL also provides aggregate functions such as <b>COUNT</b>. These functions can return the average of a column's values, retrieve the minimum and maximum values of a column, sum the values in a column, or count the records that meet certain conditions. For the time being, let's just use <b>COUNT</b>:

In [9]:
cur.execute("""SELECT COUNT(owner_id) 
               FROM cats
               WHERE owner_id = 1;""").fetchall()
#df = pd.DataFrame(cur.fetchall())
#df.columns = [x[0] for x in cur.description]
#df.head()

[(2,)]
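
The other aggregates mentioned above work the same way. The sketch below runs several of them at once, assuming a hypothetical in-memory copy of the cats table rebuilt from the earlier output. Note that COUNT(column) skips NULL values, while COUNT(*) counts every row:

```python
import sqlite3

# Hypothetical in-memory stand-in for the cats table; rows copied from the output above.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("""CREATE TABLE cats (id INTEGER PRIMARY KEY, name TEXT,
               age INTEGER, breed TEXT, owner_id INTEGER);""")
cur.executemany("INSERT INTO cats VALUES (?, ?, ?, ?, ?);", [
    (1, "Maru", 3, "Scottish Fold", 1),
    (2, "Hana", 1, "Tabby", 1),
    (3, "Lil' Bub", 5, "American Shorthair", None),
    (4, "Moe", 10, "Tabby", None),
    (5, "Patches", 2, "Calico", None),
    (6, None, None, "Tabby", None),
])

# All aggregates ignore NULL inputs except COUNT(*), which counts rows.
row = cur.execute("""SELECT COUNT(*), COUNT(name), AVG(age),
                      MIN(age), MAX(age), SUM(age)
               FROM cats;""").fetchone()
total_rows, named_cats, avg_age, min_age, max_age, sum_age = row
print(total_rows, named_cats)     # 6 5
print(min_age, max_age, sum_age)  # 1 10 21
print(avg_age)                    # average over the five known ages
```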

Aggregate functions become even more useful with the <b>GROUP BY</b> clause, which computes the aggregate once per group of rows:

In [10]:
cur.execute("""SELECT breed, owner_id, COUNT(breed) 
               FROM cats
               GROUP BY breed
               ORDER BY breed DESC;""")
df = pd.DataFrame(cur.fetchall())
df.columns = [x[0] for x in cur.description]
df.head()

Unnamed: 0,breed,owner_id,COUNT(breed)
0,Tabby,1.0,3
1,Scottish Fold,1.0,1
2,Calico,,1
3,American Shorthair,,1
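
In the query above, owner_id is neither grouped nor aggregated, so SQLite fills it in from an arbitrarily chosen row of each group (other databases reject such a query outright). A sketch of a variant that selects only the grouped column and aggregates, assuming a hypothetical in-memory copy of the cats table rebuilt from the earlier output; the aliases n_cats and n_owned are illustrative names:

```python
import sqlite3

# Hypothetical in-memory stand-in for the cats table; rows copied from the output above.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("""CREATE TABLE cats (id INTEGER PRIMARY KEY, name TEXT,
               age INTEGER, breed TEXT, owner_id INTEGER);""")
cur.executemany("INSERT INTO cats VALUES (?, ?, ?, ?, ?);", [
    (1, "Maru", 3, "Scottish Fold", 1),
    (2, "Hana", 1, "Tabby", 1),
    (3, "Lil' Bub", 5, "American Shorthair", None),
    (4, "Moe", 10, "Tabby", None),
    (5, "Patches", 2, "Calico", None),
    (6, None, None, "Tabby", None),
])

# Every selected column is either the grouping key or an aggregate.
per_breed = cur.execute("""SELECT breed,
                      COUNT(*) AS n_cats,          -- all cats of the breed
                      COUNT(owner_id) AS n_owned   -- only cats with an owner
               FROM cats
               GROUP BY breed
               ORDER BY breed DESC;""").fetchall()
for breed, n_cats, n_owned in per_breed:
    print(breed, n_cats, n_owned)
```

With this data, the Tabby group has three cats but only one with an owner, which the single owner_id value in the original output cannot show.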


<font size='5'><b>Filtering and Ordering Lab</b></font>

In [11]:
conn = sqlite3.connect('dogs.db')
cur = conn.cursor()

In [12]:
cur.execute("""SELECT * FROM dogs;""")
df = pd.DataFrame(cur.fetchall())
df.columns = [x[0] for x in cur.description]
df.head()

Unnamed: 0,id,name,age,gender,breed,temperament,hungry
0,1,Snoopy,3,M,beagle,friendly,1
1,2,McGruff,10,M,bloodhound,aware,0
2,3,Scooby,6,M,great dane,hungry,1
3,4,Little Ann,5,F,coonhound,loyal,0
4,5,Pickles,13,F,black lab,mischievous,1


In [13]:
cur.execute("""SELECT name, breed 
               FROM dogs
               WHERE gender = 'F';""")
df = pd.DataFrame(cur.fetchall())
df.columns = [x[0] for x in cur.description]
df.head()

Unnamed: 0,name,breed
0,Little Ann,coonhound
1,Pickles,black lab
2,Lassie,collie
3,Snowy,fox terrier


In [18]:
cur.execute("""SELECT name 
               FROM dogs
               ORDER BY name ASC;""")
df = pd.DataFrame(cur.fetchall())
df.columns = [x[0] for x in cur.description]
df.head(10)

Unnamed: 0,name
0,
1,Clifford
2,Lassie
3,Little Ann
4,McGruff
5,Pickles
6,Scooby
7,Snoopy
8,Snowy


In [20]:
cur.execute("""SELECT *
               FROM dogs
               WHERE name IS null;""")
df = pd.DataFrame(cur.fetchall())
df.columns = [x[0] for x in cur.description]
df.head(10)

Unnamed: 0,id,name,age,gender,breed,temperament,hungry
0,9,,4,M,golden retriever,playful,1


In [22]:
cur.execute("""SELECT name, breed
               FROM dogs
               WHERE hungry = 1;""")
df = pd.DataFrame(cur.fetchall())
df.columns = [x[0] for x in cur.description]
df.head(10)

Unnamed: 0,name,breed
0,Snoopy,beagle
1,Scooby,great dane
2,Pickles,black lab
3,Clifford,big red
4,Lassie,collie
5,,golden retriever


In [23]:
cur.execute("""SELECT name, age, temperament
               FROM dogs
               ORDER BY age DESC
               LIMIT 1;""")
df = pd.DataFrame(cur.fetchall())
df.columns = [x[0] for x in cur.description]
df.head()

Unnamed: 0,name,age,temperament
0,Pickles,13,mischievous


In [24]:
cur.execute("""SELECT * FROM dogs
               ORDER BY age ASC
               LIMIT 3;""")
df = pd.DataFrame(cur.fetchall())
df.columns = [x[0] for x in cur.description]
df.head(3)

Unnamed: 0,id,name,age,gender,breed,temperament,hungry
0,1,Snoopy,3,M,beagle,friendly,1
1,6,Clifford,4,M,big red,smiley,1
2,9,,4,M,golden retriever,playful,1


In [27]:
cur.execute("""SELECT name, breed
               FROM dogs
               WHERE age BETWEEN 5 AND 10
               ORDER BY age DESC;""")
df = pd.DataFrame(cur.fetchall())
df.columns = [x[0] for x in cur.description]
df.head()

Unnamed: 0,name,breed
0,McGruff,bloodhound
1,Snowy,fox terrier
2,Lassie,collie
3,Scooby,great dane
4,Little Ann,coonhound


In [32]:
cur.execute("""SELECT name, age, hungry
               FROM dogs
               WHERE hungry = 1 AND age BETWEEN 2 AND 7
               ORDER BY name ASC;""")
df = pd.DataFrame(cur.fetchall())
df.columns = [x[0] for x in cur.description]
df.head()

Unnamed: 0,name,age,hungry
0,,4,1
1,Clifford,4,1
2,Lassie,7,1
3,Scooby,6,1
4,Snoopy,3,1
