# <font color='#eb3483'> SQL Exercises </font>
In today's exercises, you'll get the chance to hone your SQL skills. We'll be using the same database as in class (Nobel Prize winners). Start by importing the sqlite3 and pandas packages.

In [1]:
import sqlite3
import pandas as pd

## <font color='#eb3483'> Exercise 1 - Connecting to the Database </font>

Connect to the Nobel database (located in the data folder).

In [2]:
conn = sqlite3.connect('data/nobel.db')
cur = conn.cursor()

List all the tables in the database, and their columns.

In [7]:
query = "SELECT * FROM sqlite_master WHERE type='table';"
# Run the query and fetch all of its results
results = cur.execute(query).fetchall()
for table in results:
    # Index 4 of each sqlite_master row is the sql column (the CREATE TABLE statement)
    print(table[4])

CREATE TABLE "prizes" (
"year" TEXT,
  "category" TEXT,
  "laureate_id" TEXT
)
CREATE TABLE "laureates" (
"id" TEXT,
  "firstname" TEXT,
  "surname" TEXT,
  "born" TEXT,
  "died" TEXT,
  "bornCountry" TEXT,
  "bornCountryCode" TEXT,
  "bornCity" TEXT,
  "diedCountry" TEXT,
  "diedCountryCode" TEXT,
  "diedCity" TEXT,
  "gender" TEXT
)
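
If you would rather get the column names as Python lists than read them off the CREATE TABLE text, SQLite's `PRAGMA table_info` is one option. The sketch below is only an illustration: it builds a small in-memory table that mirrors `prizes` rather than opening data/nobel.db, and `list_columns` is a hypothetical helper name, not part of the exercise.

```python
import sqlite3

# Toy in-memory database standing in for data/nobel.db (assumption for illustration)
demo = sqlite3.connect(":memory:")
demo.execute('CREATE TABLE prizes ("year" TEXT, "category" TEXT, "laureate_id" TEXT)')


def list_columns(connection):
    """Return {table_name: [column names]} for every table in the database."""
    tables = connection.execute(
        "SELECT name FROM sqlite_master WHERE type='table';"
    ).fetchall()
    schema = {}
    for (name,) in tables:
        # PRAGMA table_info yields one row per column:
        # (cid, name, type, notnull, dflt_value, pk)
        # Table names cannot be bound as parameters, so the name is quoted by hand
        columns = connection.execute(f'PRAGMA table_info("{name}");').fetchall()
        schema[name] = [col[1] for col in columns]
    return schema


schema = list_columns(demo)
print(schema)
```

Passing `conn` from the cell above instead of `demo` should list the columns of the real tables in the same way.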


## <font color='#eb3483'> Exercise 2 - Select Basics</font>
Select the first ten records from the laureates table.

In [8]:
pd.read_sql("SELECT * FROM laureates LIMIT 10;", conn)

Unnamed: 0,id,firstname,surname,born,died,bornCountry,bornCountryCode,bornCity,diedCountry,diedCountryCode,diedCity,gender
0,1,Wilhelm Conrad,Röntgen,1845-03-27,1923-02-10,Prussia (now Germany),DE,Lennep (now Remscheid),Germany,DE,Munich,male
1,2,Hendrik A.,Lorentz,1853-07-18,1928-02-04,the Netherlands,NL,Arnhem,the Netherlands,NL,,male
2,3,Pieter,Zeeman,1865-05-25,1943-10-09,the Netherlands,NL,Zonnemaire,the Netherlands,NL,Amsterdam,male
3,4,Henri,Becquerel,1852-12-15,1908-08-25,France,FR,Paris,France,FR,,male
4,5,Pierre,Curie,1859-05-15,1906-04-19,France,FR,Paris,France,FR,Paris,male
5,6,Marie,Curie,1867-11-07,1934-07-04,Russian Empire (now Poland),PL,Warsaw,France,FR,Sallanches,female
6,8,Lord,Rayleigh,1842-11-12,1919-06-30,United Kingdom,GB,"Langford Grove, Maldon, Essex",United Kingdom,GB,,male
7,9,Philipp,Lenard,1862-06-07,1947-05-20,Hungary (now Slovakia),SK,Pressburg (now Bratislava),Germany,DE,Messelhausen,male
8,10,J.J.,Thomson,1856-12-18,1940-08-30,United Kingdom,GB,Cheetham Hill,United Kingdom,GB,Cambridge,male
9,11,Albert A.,Michelson,1852-12-19,1931-05-09,Prussia (now Poland),PL,Strelno (now Strzelno),USA,US,"Pasadena, CA",male


Find the birth and death dates for Albert Einstein.

In [9]:
pd.read_sql("SELECT born, died FROM laureates WHERE firstname= 'Albert' AND surname = 'Einstein';", conn)

Unnamed: 0,born,died
0,1879-03-14,1955-04-18


Find the Nobel Laureates who died in 2015 and whose first name begins with 'Y'.

In [10]:
pd.read_sql("SELECT * FROM laureates WHERE firstname LIKE 'Y%' AND died LIKE '2015%';", conn)

Unnamed: 0,id,firstname,surname,born,died,bornCountry,bornCountryCode,bornCity,diedCountry,diedCountryCode,diedCity,gender
0,794,Yves,Chauvin,1930-10-10,2015-01-27,Belgium,BE,Menin,France,FR,Tours,male
1,826,Yoichiro,Nambu,1921-01-18,2015-07-05,Japan,JP,Tokyo,Japan,JP,Osaka,male
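
When the year or the initial comes from a variable, it is usually safer to bind them as query parameters than to paste them into the SQL string. The following is a minimal sketch using plain sqlite3 on a toy in-memory table. The three rows are copied from outputs shown in this notebook, and `died_in_year_with_initial` is a hypothetical helper name.

```python
import sqlite3

# Toy in-memory table with a few rows taken from the outputs above
demo = sqlite3.connect(":memory:")
demo.execute("CREATE TABLE laureates (firstname TEXT, surname TEXT, died TEXT)")
demo.executemany(
    "INSERT INTO laureates VALUES (?, ?, ?)",
    [
        ("Yves", "Chauvin", "2015-01-27"),
        ("Yoichiro", "Nambu", "2015-07-05"),
        ("Albert", "Einstein", "1955-04-18"),
    ],
)


def died_in_year_with_initial(connection, year, initial):
    """Find laureates whose first name starts with `initial` and who died in `year`."""
    query = (
        "SELECT firstname, surname FROM laureates "
        "WHERE firstname LIKE ? AND died LIKE ? "
        "ORDER BY surname;"
    )
    # The % wildcard is part of the bound value, not of the SQL text
    return connection.execute(query, (f"{initial}%", f"{year}%")).fetchall()


rows = died_in_year_with_initial(demo, 2015, "Y")
print(rows)
```

The `?` placeholders let SQLite handle quoting, so a value containing a quote character cannot break the query.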


**Challenge** Find the last three Nobel Laureates born in 1900. Hint: to solve this, you might want to order your results by birth date (check out the ORDER BY operator [here](https://www.sqlitetutorial.net/sqlite-order-by/)).

In [17]:
pd.read_sql("SELECT * FROM laureates WHERE born LIKE '1900%' ORDER BY born DESC LIMIT 3;", conn)

Unnamed: 0,id,firstname,surname,born,died,bornCountry,bornCountryCode,bornCity,diedCountry,diedCountryCode,diedCity,gender
0,198,Richard,Kuhn,1900-12-03,1967-07-31,Austria-Hungary (now Austria),AT,Vienna,West Germany (now Germany),DE,Heidelberg,male
1,385,Ragnar,Granit,1900-10-30,1991-03-12,Russian Empire (now Finland),FI,Helsinki,Sweden,SE,Stockholm,male
2,354,Hans,Krebs,1900-08-25,1981-11-22,Germany,DE,Hildesheim,United Kingdom,GB,Oxford,male
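
Ordering by `born` works even though the column is TEXT, because dates written as YYYY-MM-DD sort the same way alphabetically and chronologically. The sketch below checks this on a toy in-memory table (rows copied from outputs in this notebook) by comparing plain text ordering with ordering through SQLite's `date()` function.

```python
import sqlite3

# Toy in-memory table; birth dates are taken from the outputs shown in this notebook
demo = sqlite3.connect(":memory:")
demo.execute("CREATE TABLE laureates (surname TEXT, born TEXT)")
demo.executemany(
    "INSERT INTO laureates VALUES (?, ?)",
    [
        ("Krebs", "1900-08-25"),
        ("Kuhn", "1900-12-03"),
        ("Granit", "1900-10-30"),
        ("Einstein", "1879-03-14"),
    ],
)

# Plain text comparison on the ISO-formatted string
text_order = demo.execute(
    "SELECT surname, born FROM laureates "
    "WHERE born LIKE '1900%' ORDER BY born DESC;"
).fetchall()

# Same query, but ordering on the value parsed by SQLite's date() function
date_order = demo.execute(
    "SELECT surname, born FROM laureates "
    "WHERE born LIKE '1900%' ORDER BY date(born) DESC;"
).fetchall()

print(text_order)
print(text_order == date_order)
```

This shortcut relies on the zero-padded year-month-day layout; a format such as DD/MM/YYYY would not sort correctly as text.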


## <font color='#eb3483'> Exercise 2 - Select Aggregations </font>


Find the number of Nobel Prizes awarded between 1950 and 1960 (inclusive).

In [4]:
pd.read_sql("SELECT year, COUNT(*) FROM prizes WHERE year >= '1950' AND year <= '1960' GROUP BY year;", conn)

Unnamed: 0,year,COUNT(*)
0,1950,8
1,1951,7
2,1952,7
3,1953,6
4,1954,8
5,1955,5
6,1956,9
7,1957,6
8,1958,9
9,1959,7
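
The query above returns one count per year. If a single total for the whole period is wanted instead, dropping the GROUP BY and using BETWEEN (which is inclusive on both ends) is one way to get it. The sketch below uses an invented toy table, so its numbers say nothing about the real database.

```python
import sqlite3

# Toy in-memory table; years, categories and laureate ids are invented for illustration
demo = sqlite3.connect(":memory:")
demo.execute("CREATE TABLE prizes (year TEXT, category TEXT, laureate_id TEXT)")
demo.executemany(
    "INSERT INTO prizes VALUES (?, ?, ?)",
    [
        ("1949", "physics", "a"),
        ("1950", "physics", "b"),
        ("1950", "peace", "c"),
        ("1955", "chemistry", "d"),
        ("1960", "peace", "e"),
        ("1961", "physics", "f"),
    ],
)

# One number for the whole period; BETWEEN includes both endpoints
(total,) = demo.execute(
    "SELECT COUNT(*) FROM prizes WHERE year BETWEEN '1950' AND '1960';"
).fetchone()

# One row per year, as in the exercise solution
per_year = demo.execute(
    "SELECT year, COUNT(*) FROM prizes "
    "WHERE year BETWEEN '1950' AND '1960' GROUP BY year ORDER BY year;"
).fetchall()

print(total)
print(per_year)
```

Because `year` is stored as TEXT, the bounds are written as strings; the comparison is alphabetical, which matches numeric order for four-digit years.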


Find the number of Nobel Prizes awarded in each year.

In [21]:
pd.read_sql("SELECT year, COUNT(*) FROM prizes GROUP BY year;", conn)

Unnamed: 0,year,COUNT(*)
0,1901,6
1,1902,7
2,1903,7
3,1904,6
4,1905,5
...,...,...
111,2015,11
112,2016,11
113,2017,12
114,2018,13


In which year was the greatest number of Nobel Prizes awarded?

In [23]:
pd.read_sql("SELECT year, COUNT(*) as num_awards \
             FROM prizes GROUP BY year \
             ORDER BY num_awards DESC \
             LIMIT 1;", conn)

Unnamed: 0,year,num_awards
0,2001,15
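
One caveat with `LIMIT 1` is that it returns a single row even when several years share the top count. A possible way to keep ties is to compare each year's count with the maximum count, sketched here on an invented toy table using a common table expression (supported by reasonably recent SQLite versions).

```python
import sqlite3

# Toy in-memory table where 2000 and 2001 are tied (invented data)
demo = sqlite3.connect(":memory:")
demo.execute("CREATE TABLE prizes (year TEXT, category TEXT, laureate_id TEXT)")
demo.executemany(
    "INSERT INTO prizes VALUES (?, ?, ?)",
    [
        ("2000", "physics", "a"),
        ("2000", "peace", "b"),
        ("2001", "physics", "c"),
        ("2001", "chemistry", "d"),
        ("2002", "peace", "e"),
    ],
)

query = """
WITH counts AS (
    SELECT year, COUNT(*) AS num_awards
    FROM prizes
    GROUP BY year
)
SELECT year, num_awards
FROM counts
WHERE num_awards = (SELECT MAX(num_awards) FROM counts)
ORDER BY year;
"""

# Every year whose count equals the maximum is returned, not just one of them
top_years = demo.execute(query).fetchall()
print(top_years)
```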


## <font color='#eb3483'> Exercise 3 - Joins </font>

Which year has the most women Nobel Laureates?

In [24]:
pd.read_sql("SELECT year, COUNT(*) as num_awards \
             FROM prizes \
             JOIN laureates \
             ON prizes.laureate_id = laureates.id \
             WHERE gender = 'female' \
             GROUP BY year \
             ORDER BY num_awards DESC \
             LIMIT 1;", conn)

Unnamed: 0,year,num_awards
0,2009,5


Which category has the most women Nobel Laureates?

In [26]:
pd.read_sql("SELECT category, COUNT(*) as num_awards \
             FROM prizes \
             JOIN laureates \
             ON prizes.laureate_id = laureates.id \
             WHERE gender = 'female' \
             GROUP BY category \
             ORDER BY num_awards DESC \
             LIMIT 1;", conn)

Unnamed: 0,category,num_awards
0,peace,17
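
Note that `COUNT(*)` after the join counts prize rows, so a laureate who won twice in the same group is counted twice. If the question is read strictly as "how many distinct laureates", `COUNT(DISTINCT laureates.id)` is an alternative. The sketch below shows the difference on an invented toy dataset; whether it changes the answer on the real database is not checked here.

```python
import sqlite3

# Toy in-memory tables (invented data): laureate '1' has two physics prizes
demo = sqlite3.connect(":memory:")
demo.execute("CREATE TABLE laureates (id TEXT, gender TEXT)")
demo.execute("CREATE TABLE prizes (year TEXT, category TEXT, laureate_id TEXT)")
demo.executemany(
    "INSERT INTO laureates VALUES (?, ?)",
    [("1", "female"), ("2", "female"), ("3", "male")],
)
demo.executemany(
    "INSERT INTO prizes VALUES (?, ?, ?)",
    [
        ("2001", "physics", "1"),
        ("2005", "physics", "1"),
        ("2003", "peace", "2"),
        ("2004", "peace", "3"),
    ],
)

query = """
SELECT category,
       COUNT(*) AS num_awards,
       COUNT(DISTINCT laureates.id) AS num_laureates
FROM prizes
JOIN laureates ON prizes.laureate_id = laureates.id
WHERE gender = 'female'
GROUP BY category
ORDER BY category;
"""

# num_awards counts prize rows; num_laureates counts each person once per category
counts = demo.execute(query).fetchall()
print(counts)
```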


Which three countries have the most Nobel Prizes? We count a prize for a country if the laureate who won it was born there.

In [27]:
pd.read_sql("SELECT bornCountryCode, COUNT(*) as num_awards \
             FROM prizes \
             JOIN laureates \
             ON prizes.laureate_id = laureates.id \
             GROUP BY bornCountryCode \
             ORDER BY num_awards DESC \
             LIMIT 3;", conn)

Unnamed: 0,bornCountryCode,num_awards
0,US,274
1,GB,103
2,DE,82
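
GROUP BY puts all rows with a missing value into a single NULL group, which can end up in the ranking if some laureates have no birth country recorded. Whether that happens in the real database is an assumption to verify; the sketch below only shows, on invented toy data, how such rows can be filtered out.

```python
import sqlite3

# Toy in-memory tables (invented data): laureate '3' has no birth country code
demo = sqlite3.connect(":memory:")
demo.execute("CREATE TABLE laureates (id TEXT, bornCountryCode TEXT)")
demo.execute("CREATE TABLE prizes (year TEXT, category TEXT, laureate_id TEXT)")
demo.executemany(
    "INSERT INTO laureates VALUES (?, ?)",
    [("1", "US"), ("2", "US"), ("3", None), ("4", "GB")],
)
demo.executemany(
    "INSERT INTO prizes VALUES (?, ?, ?)",
    [
        ("2001", "physics", "1"),
        ("2002", "physics", "2"),
        ("2003", "peace", "3"),
        ("2004", "peace", "3"),
        ("2005", "peace", "3"),
        ("2006", "chemistry", "4"),
    ],
)

base = """
SELECT bornCountryCode, COUNT(*) AS num_awards
FROM prizes
JOIN laureates ON prizes.laureate_id = laureates.id
{where}
GROUP BY bornCountryCode
ORDER BY num_awards DESC
LIMIT 3;
"""

# Without a filter, the NULL group is ranked like any other country code
unfiltered = demo.execute(base.format(where="")).fetchall()

# With the filter, only rows that have a recorded country code are counted
filtered = demo.execute(
    base.format(where="WHERE bornCountryCode IS NOT NULL")
).fetchall()

print(unfiltered[0])
print(filtered)
```

If missing values are stored as empty strings rather than NULL, the filter would need an extra condition such as `AND bornCountryCode != ''`.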
