# Pandas with SQL

In [37]:
import sqlite3 as sqlite
import pandas as pd
from sqlalchemy import create_engine

Create a SQLite database, `population.db`, with a single table named `Population` that holds the populations of several countries.

In [38]:
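# The _data directory must already exist; sqlite3 creates the file but not the folder.
with sqlite.connect('./_data/population.db') as con:
    cur = con.cursor()
    cur.execute("CREATE TABLE IF NOT EXISTS Population(id INTEGER PRIMARY KEY, country TEXT, population INT)")
    # Passing NULL for the INTEGER PRIMARY KEY column lets SQLite assign the next id.
    cur.execute("INSERT INTO Population VALUES(NULL,'Germany',81197537)")
    cur.execute("INSERT INTO Population VALUES(NULL,'France', 66415161)")
    cur.execute("INSERT INTO Population VALUES(NULL,'Spain', 46439864)")
    cur.execute("INSERT INTO Population VALUES(NULL,'Italy', 60795612)")
    # Spain is inserted a second time, so it appears twice (ids 3 and 5) when the table is read.
    cur.execute("INSERT INTO Population VALUES(NULL,'Spain', 46439864)")
    # Leaving the with block commits the transaction; it does not close the connection.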
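
The rows can also be supplied as data instead of being written into the SQL text. The following is a minimal sketch, assuming an in-memory database (`:memory:`) so that the file created above is left untouched; the `rows` list is illustrative and holds the four distinct countries from the cell above.

```python
import sqlite3

# Illustrative data: the four distinct countries used in this notebook.
rows = [
    ("Germany", 81197537),
    ("France", 66415161),
    ("Spain", 46439864),
    ("Italy", 60795612),
]

# ":memory:" creates a throwaway database that lives only for this connection.
con = sqlite3.connect(":memory:")
con.execute(
    "CREATE TABLE IF NOT EXISTS Population("
    "id INTEGER PRIMARY KEY, country TEXT, population INT)"
)
# The ? placeholders are filled in by the driver, one tuple per row.
con.executemany("INSERT INTO Population(country, population) VALUES (?, ?)", rows)
con.commit()

row_count = con.execute("SELECT COUNT(*) FROM Population").fetchone()[0]
print(row_count)
con.close()
```

Naming the target columns in the `INSERT` statement means the `id` column can be omitted entirely, and SQLite still assigns it automatically.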

## Connect with SQLAlchemy

`create_engine` requires only one argument, a database URL. The URL names the dialect (here `sqlite`), optionally the driver, and the location of the database; for SQLite that location is the path to the database file.

In [39]:
connection = create_engine('sqlite:///_data/population.db')
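
To see how such a URL breaks down, it can be parsed without opening any database. This is a small sketch using SQLAlchemy's `make_url` helper; the absolute path `/tmp/population.db` is a made-up example, not a file used in this notebook.

```python
from sqlalchemy.engine import make_url

# make_url only parses the string; it does not connect to anything.
relative_url = make_url("sqlite:///_data/population.db")
print(relative_url.drivername)  # the dialect part of the URL
print(relative_url.database)    # the path to the database file

# Hypothetical absolute path: note the fourth slash, which belongs to the path.
absolute_url = make_url("sqlite:////tmp/population.db")
print(absolute_url.database)
```

Three slashes followed by a path give a location relative to the working directory, while four slashes give an absolute path. A URL of just `sqlite://` with no path refers to an in-memory database.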

## Reading SQL data with Pandas

`pandas` provides three functions for reading SQL data into a `DataFrame`:
- `read_sql_query()`
- `read_sql_table()`
- `read_sql()`

### 1. `read_sql_query()`

Requires 2 parameters as input:
1. String with an SQL query
2. Connection object to the database.

In [40]:
sql = 'SELECT * FROM Population'

with connection.connect() as con:
  df = pd.read_sql_query(sql, con)

In [41]:
df.head()

| | id | country | population |
|---|---|---|---|
| 0 | 1 | Germany | 81197537 |
| 1 | 2 | France | 66415161 |
| 2 | 3 | Spain | 46439864 |
| 3 | 4 | Italy | 60795612 |
| 4 | 5 | Spain | 46439864 |
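
Because the first argument is an arbitrary query, filtering and sorting can be pushed down to the database. The sketch below assumes a fresh in-memory database seeded with the four distinct countries, and passes the threshold through the `params` argument instead of formatting it into the SQL string. It uses a plain `sqlite3` connection, which `read_sql_query()` also accepts; the 60 million threshold is an arbitrary example value.

```python
import sqlite3

import pandas as pd

# Throwaway in-memory database with the same schema as population.db.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE Population(id INTEGER PRIMARY KEY, country TEXT, population INT)")
con.executemany(
    "INSERT INTO Population(country, population) VALUES (?, ?)",
    [("Germany", 81197537), ("France", 66415161), ("Spain", 46439864), ("Italy", 60795612)],
)

sql = """
    SELECT country, population
    FROM Population
    WHERE population > ?
    ORDER BY population DESC
"""
# The value in params replaces the ? placeholder; sqlite3 uses the "qmark" style.
large = pd.read_sql_query(sql, con, params=(60_000_000,))
print(large)
con.close()
```

The placeholder syntax depends on the database driver, so a different backend may expect a style other than `?`.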


### 2. `read_sql_table()`
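Unlike `read_sql_query()`, this function takes a table name instead of a query and reads the whole table. It needs SQLAlchemy; a plain `sqlite3` connection is not supported. Its parameters are: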
1. Database table name
2. Connection object

In [42]:
table = 'Population'

with connection.connect() as con:
  df = pd.read_sql_table(table, con)

In [43]:
df.head()

| | id | country | population |
|---|---|---|---|
| 0 | 1 | Germany | 81197537 |
| 1 | 2 | France | 66415161 |
| 2 | 3 | Spain | 46439864 |
| 3 | 4 | Italy | 60795612 |
| 4 | 5 | Spain | 46439864 |
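
`read_sql_table()` does not have to return every column. As a sketch, assuming an in-memory SQLite engine seeded from an illustrative `seed` DataFrame, the optional `columns` and `index_col` arguments select a subset of columns and use `id` as the index of the resulting DataFrame.

```python
import pandas as pd
from sqlalchemy import create_engine

# In-memory SQLite database, so nothing is written to disk.
engine = create_engine("sqlite://")

# Illustrative data: the four distinct countries used in this notebook.
seed = pd.DataFrame(
    {
        "id": [1, 2, 3, 4],
        "country": ["Germany", "France", "Spain", "Italy"],
        "population": [81197537, 66415161, 46439864, 60795612],
    }
)

with engine.connect() as con:
    # Write the table and read it back over the same connection.
    seed.to_sql("Population", con, index=False)
    subset = pd.read_sql_table(
        "Population",
        con,
        index_col="id",                      # use the id column as the index
        columns=["country", "population"],   # read only these columns
    )

print(subset)
```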


### 3. `read_sql()`
A convenience wrapper around the two functions above: a SQL query string is passed on to `read_sql_query()`, and a table name is passed on to `read_sql_table()`.

In [44]:
sql = 'SELECT * FROM Population'

with connection.connect() as con:
  df = pd.read_sql(sql, con)

In [45]:
df.head()

| | id | country | population |
|---|---|---|---|
| 0 | 1 | Germany | 81197537 |
| 1 | 2 | France | 66415161 |
| 2 | 3 | Spain | 46439864 |
| 3 | 4 | Italy | 60795612 |
| 4 | 5 | Spain | 46439864 |


In [46]:
table = 'Population'

with connection.connect() as con:
  df = pd.read_sql(table, con)

In [47]:
df.head()

| | id | country | population |
|---|---|---|---|
| 0 | 1 | Germany | 81197537 |
| 1 | 2 | France | 66415161 |
| 2 | 3 | Spain | 46439864 |
| 3 | 4 | Italy | 60795612 |
| 4 | 5 | Spain | 46439864 |
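
The two calls above can be compared side by side. The sketch below assumes an in-memory SQLite engine seeded from an illustrative `seed` DataFrame, and checks that `read_sql()` yields the same columns and shape whether it receives a query or a table name. Over a plain `sqlite3` connection only the query form is available, since reading by table name relies on SQLAlchemy.

```python
import pandas as pd
from sqlalchemy import create_engine

# In-memory SQLite database, so nothing is written to disk.
engine = create_engine("sqlite://")

# Illustrative data: the four distinct countries used in this notebook.
seed = pd.DataFrame(
    {
        "id": [1, 2, 3, 4],
        "country": ["Germany", "France", "Spain", "Italy"],
        "population": [81197537, 66415161, 46439864, 60795612],
    }
)

with engine.connect() as con:
    seed.to_sql("Population", con, index=False)
    # A query string is handled like read_sql_query().
    from_query = pd.read_sql("SELECT * FROM Population", con)
    # A bare table name is handled like read_sql_table().
    from_table = pd.read_sql("Population", con)

print(list(from_query.columns) == list(from_table.columns))
print(from_query.shape == from_table.shape)
```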
