## Running SQL in Jupyter Notebooks

You may want to run a quick analysis directly against a database without having to set up a pandas DataFrame first. This is especially useful when you have to join two or more tables.

In [1]:
# import libraries
import pandas as pd
import sqlite3

In [2]:
# install the ipython-sql library
!pip install ipython-sql



In [3]:
# create a dataframe to import data for the tutorial
df = pd.DataFrame({'name': ['Juan', 'Victoria', 'Mary'],
                   'age': [23, 34, 43],
                   'city': ['Miami', 'Buenos Aires', 'Santiago']})

df

,name,age,city
0,Juan,23,Miami
1,Victoria,34,Buenos Aires
2,Mary,43,Santiago


Connecting with sqlite3 creates the database file if it does not already exist. You can also connect to an existing database that you have stored, but in this case we are creating a new one.

In [4]:
# use sqlite3 to create a connection
cnn = sqlite3.connect('jupyter_sql_tutorial.db')
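
As a minimal sketch of the point above, the snippet below shows that sqlite3.connect creates the database file when it is missing. The temporary directory and the file name demo.db are assumptions made for illustration, so that the tutorial's own database is left untouched.

```python
import os
import sqlite3
import tempfile

# Hypothetical location: a throwaway directory, so nothing real is overwritten
tmp_dir = tempfile.mkdtemp()
db_path = os.path.join(tmp_dir, "demo.db")

existed_before = os.path.exists(db_path)  # the file is not there yet

# Connecting to a path that does not exist yet creates a new database
conn = sqlite3.connect(db_path)
conn.execute("CREATE TABLE demo (id INTEGER)")
conn.commit()

exists_after = os.path.exists(db_path)  # the file is on disk after the commit
conn.close()

print(existed_before, exists_after)
```

Connecting to a path that already holds a database opens it instead of creating a new one.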

Next we write the DataFrame to the database with df.to_sql. We are only using two arguments for this function: the first one, 'people', is the name of the table that we'll be creating; the second one, cnn, is the connection variable created in the previous cell.

In the same cell we use a Jupyter magic command to load the SQL extension. A single % sign marks a line magic, which applies to one line; %% marks a cell magic, which applies to the whole cell.

In [5]:
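# write the DataFrame to a table named 'people'
# (raises ValueError if the table already exists, e.g. when the cell is rerun)
df.to_sql('people', cnn)
# load the ipython-sql extension
%load_ext sql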
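
With its default settings, to_sql raises a ValueError when the table already exists and stores the DataFrame index as an extra column. The sketch below shows one possible variation using the if_exists and index parameters of to_sql; the in-memory database and the variable name conn are assumptions made so that the tutorial file is not modified.

```python
import sqlite3
import pandas as pd

df = pd.DataFrame({'name': ['Juan', 'Victoria', 'Mary'],
                   'age': [23, 34, 43],
                   'city': ['Miami', 'Buenos Aires', 'Santiago']})

# In-memory database (an assumption for this sketch)
conn = sqlite3.connect(':memory:')

# if_exists='replace' drops and recreates the table instead of raising on a rerun;
# index=False leaves the DataFrame index out of the table
df.to_sql('people', conn, if_exists='replace', index=False)
df.to_sql('people', conn, if_exists='replace', index=False)  # a second run still works

# Inspect the resulting table: column names and number of rows
columns = [row[1] for row in conn.execute("PRAGMA table_info(people)")]
row_count = conn.execute("SELECT count(*) FROM people").fetchone()[0]
conn.close()

print(columns, row_count)
```

Whether to keep the index column depends on the analysis; the rest of this tutorial keeps the default, which is why an index column appears in the query output.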

Next we want to point the ipython-sql extension to the database we want to use.

In [6]:
%sql sqlite:///jupyter_sql_tutorial.db

'Connected: @jupyter_sql_tutorial.db'

Now we are ready to run our queries.

In [7]:
%%sql
SELECT *
FROM people

 * sqlite:///jupyter_sql_tutorial.db
Done.


index,name,age,city
0,Juan,23,Miami
1,Victoria,34,Buenos Aires
2,Mary,43,Santiago
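
For comparison, here is a rough sketch of the same SELECT run through the plain sqlite3 module rather than the %%sql magic. It assumes an in-memory copy of the people table without the index column, built only for this example.

```python
import sqlite3

# In-memory stand-in for the tutorial database (an assumption for this sketch)
conn = sqlite3.connect(':memory:')
conn.execute("CREATE TABLE people (name TEXT, age INTEGER, city TEXT)")
conn.executemany(
    "INSERT INTO people VALUES (?, ?, ?)",
    [('Juan', 23, 'Miami'),
     ('Victoria', 34, 'Buenos Aires'),
     ('Mary', 43, 'Santiago')],
)

# Run the query and collect the column names and the rows
cursor = conn.execute("SELECT * FROM people")
column_names = [description[0] for description in cursor.description]
rows = cursor.fetchall()
conn.close()

print(column_names)
for row in rows:
    print(row)
```

The magic saves this boilerplate: the query is written on its own and the result is rendered as a table in the notebook.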


In [8]:
%%sql
SELECT count(*)
FROM people

 * sqlite:///jupyter_sql_tutorial.db
Done.


count(*)
3


In [9]:
%%sql
SELECT sum(age) AS age_sum
FROM people

 * sqlite:///jupyter_sql_tutorial.db
Done.


age_sum
100
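
The introduction mentions joins as a main use case. The sketch below shows what such a query might look like; the city_country table and its contents are hypothetical and are not part of the tutorial database. It uses the sqlite3 module and an in-memory database so that it runs outside a notebook.

```python
import sqlite3

conn = sqlite3.connect(':memory:')

# Same sample data as the tutorial's people table (without the index column)
conn.execute("CREATE TABLE people (name TEXT, age INTEGER, city TEXT)")
conn.executemany(
    "INSERT INTO people VALUES (?, ?, ?)",
    [('Juan', 23, 'Miami'),
     ('Victoria', 34, 'Buenos Aires'),
     ('Mary', 43, 'Santiago')],
)

# Hypothetical lookup table, added only to have something to join against
conn.execute("CREATE TABLE city_country (city TEXT, country TEXT)")
conn.executemany(
    "INSERT INTO city_country VALUES (?, ?)",
    [('Miami', 'USA'),
     ('Buenos Aires', 'Argentina'),
     ('Santiago', 'Chile')],
)

# Match each person to a country through the shared city column
query = """
SELECT p.name, p.city, c.country
FROM people AS p
JOIN city_country AS c ON c.city = p.city
ORDER BY p.name
"""
joined = conn.execute(query).fetchall()
conn.close()

for row in joined:
    print(row)
```

In a notebook, the text of the query could go in a %%sql cell instead, provided both tables exist in the connected database.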
