# ETL Pipeline: Extraction (part 3)

In this notebook, we work through extracting data from SQL databases. Pandas can run a SQL query against a database and return the result as a DataFrame. Which library supplies the connection depends on the type of SQL database you are working with: the [sqlite3 library](https://www.sqlite.org/about.html) for SQLite files, or the [sqlalchemy library](https://www.sqlalchemy.org/) for SQLite and other databases.

### Pandas and SQLite3

We can use the sqlite3 library with pandas when the data is stored in a SQLite database file. The sqlite3 module ships with Python's standard library, so nothing extra needs to be installed.

In [None]:
import sqlite3
import pandas as pd

# connect to the database
conn = sqlite3.connect('population_data.db')

# run a query
pd.read_sql('SELECT * FROM population_data', conn)

In [None]:
pd.read_sql('SELECT "Country_Name", "Country_Code", "1960" FROM population_data', conn)
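The year columns in the query above are wrapped in double quotes because their names are numbers. The sketch below shows why this matters. It uses a hypothetical in-memory table named `demo` with made-up values, not the real `population_data.db`.

```python
import sqlite3

# Hypothetical in-memory table that mimics the layout of population_data;
# the country name and figure are made up for illustration.
demo_conn = sqlite3.connect(":memory:")
demo_conn.execute('CREATE TABLE demo ("Country_Name" TEXT, "1960" REAL)')
demo_conn.execute("INSERT INTO demo VALUES ('Examplia', 1000.0)")

# Double quotes mark "1960" as a column name, so the stored value comes back.
quoted = demo_conn.execute('SELECT "1960" FROM demo').fetchall()

# Without quotes, 1960 is an integer literal and is returned unchanged.
unquoted = demo_conn.execute("SELECT 1960 FROM demo").fetchall()

print(quoted)
print(unquoted)
demo_conn.close()
```

The first query returns the value stored in the column, while the second returns the number 1960 for every row, which is easy to mistake for real data.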

### Pandas and SQLAlchemy

We can use the SQLAlchemy library with pandas when working with a different type of database such as MySQL or PostgreSQL. It also works with SQLite, which is what we use here.
Instructions on how to configure the connection can be found in the [SQLAlchemy engine documentation](http://docs.sqlalchemy.org/en/latest/core/engines.html).

In [None]:
# to find the correct file path, use the python os library:
import os
print(os.getcwd())

In [None]:
from sqlalchemy import create_engine

""" Below we will create a database engine using the file path printed above"""

engine = create_engine('sqlite:////f/population_data.db')
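The four slashes in the URL above come from combining the `sqlite:///` prefix with an absolute path that starts with its own slash. As a minimal sketch, assuming a POSIX-style file system, the URL can be built from the working directory instead of typed by hand. The helper name `sqlite_url` is hypothetical and not part of SQLAlchemy.

```python
import os

def sqlite_url(db_path):
    """Return a SQLAlchemy-style SQLite URL for db_path (hypothetical helper)."""
    absolute = os.path.abspath(db_path)
    # "sqlite:///" is the scheme plus an empty host; an absolute POSIX path
    # contributes its own leading slash, giving four slashes in total.
    return "sqlite:///" + absolute

url = sqlite_url("population_data.db")
print(url)
```

The resulting string can then be passed to `create_engine`. On Windows the absolute path begins with a drive letter, so the URL has only three slashes.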

In [None]:
pd.read_sql("SELECT * FROM population_data", engine).head(2)

#### Trying out some SQL queries

In [None]:
# Write a query that finds Aruba's population in 1960 and 1961 and the change between those two years
# Column names go in double quotes; string values such as 'Aruba' go in single quotes
pd.read_sql("""SELECT "1960", "1961", "1961" - "1960" AS "Change" FROM population_data WHERE "Country_Name" = 'Aruba'""", engine)

In [None]:
# Write a query that finds the population of Belgium and Luxembourg in 1975
# Column names go in double quotes; string values go in single quotes
pd.read_sql("""SELECT "Country_Name", "1975" FROM population_data WHERE "Country_Name" = 'Belgium' OR "Country_Name" = 'Luxembourg'""", engine)
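When the list of countries comes from a variable, the filter can be written with `IN` and bound parameters instead of a chain of `OR` conditions. The sketch below is one way to do that with a plain sqlite3 connection. The `demo` table and its values are made up for illustration and stand in for `population_data`.

```python
import sqlite3
import pandas as pd

# Hypothetical in-memory table standing in for population_data;
# the population figures are placeholders, not real data.
demo_conn = sqlite3.connect(":memory:")
demo_conn.execute('CREATE TABLE demo ("Country_Name" TEXT, "1975" REAL)')
demo_conn.executemany(
    "INSERT INTO demo VALUES (?, ?)",
    [("Belgium", 1.0), ("Luxembourg", 2.0), ("Aruba", 3.0)],
)
demo_conn.commit()

countries = ["Belgium", "Luxembourg"]

# One "?" placeholder per country; sqlite3 binds the values safely.
placeholders = ", ".join("?" for _ in countries)
query = (
    'SELECT "Country_Name", "1975" FROM demo '
    f'WHERE "Country_Name" IN ({placeholders})'
)

result = pd.read_sql(query, demo_conn, params=countries)
print(result)
demo_conn.close()
```

Binding values through `params` keeps them out of the SQL text, so names that contain quotes or other special characters do not break the query.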