# Connecting to SQLite databases with Python
## Objectives
- Extracting database tables and loading them into a `pandas` dataframe.  
- Exporting `pandas` dataframes into an SQLite database.  
  
We will be working with SQLite, a file-based database. The Python library for it, `sqlite3`, is part of the standard library, so it needs no separate installation.

## Imports
`pip3 install pandas`

In [12]:
# imports
import sqlite3
import pandas as pd

## Loading data into SQLite

### Connection
Below we create the connection to SQLite. `sqlite3.connect()` opens the database file if it already exists and creates a new file if it does not.  
  
Next, we create a cursor with the connection's `.cursor()` method. The cursor is the object through which we execute SQL statements and fetch query results, much as we would type statements into an SQL client.

In [13]:
# Connecting to a file
conn = sqlite3.connect("mydb.db")

# Initializing the cursor
cur = conn.cursor()
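
For experimenting without leaving a file on disk, `sqlite3.connect()` also accepts the special name `":memory:"`. The sketch below is a minimal illustration of the connect-then-cursor pattern described above; the names `demo_conn` and `demo_cur` are hypothetical and not part of this notebook.

```python
import sqlite3

# ":memory:" creates a temporary database in RAM instead of a file on disk;
# it disappears when the connection is closed.
demo_conn = sqlite3.connect(":memory:")
demo_cur = demo_conn.cursor()

# A trivial query to confirm the cursor can execute SQL
demo_cur.execute("SELECT 1 + 1")
result = demo_cur.fetchone()[0]
print(result)

# Release the connection once we are done with it
demo_conn.close()
```

The later sketches in this notebook use the same in-memory approach so that they do not modify `mydb.db`.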

### Creation of an SQL table in Python
After creating the connection and initializing the cursor, we create a table in the database. To do this, we call the `.execute()` method on the cursor object and pass the SQL statement we want to run as a string.
  
**The following SQL statement reads as so:**  
"Create a table named 'people' if it does not currently exist in the database, with three columns named 'ssn', 'name', and 'age'. For 'ssn' the datatype is integer and the column is the primary key. For 'name' the datatype is varchar(255) and the column cannot be null. Lastly, the column 'age' will contain values of datatype integer."  
  
NOTE: We use a triple-quoted string so that the SQL statement can span several lines and keep its layout readable.  
NOTE: `cur.execute()` is equivalent to the chained call `sqlite3.connect("mydb.db").cursor().execute()`.

In [14]:
# Creating the table/entry
cur.execute("""
CREATE TABLE IF NOT EXISTS people (
    ssn INTEGER PRIMARY KEY,
    name VARCHAR(255) NOT NULL,
    age INTEGER)
""")

<sqlite3.Cursor at 0x11c7ad340>
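
To check that the table was created with the intended columns, one option is SQLite's `PRAGMA table_info` statement. The sketch below runs against a throwaway in-memory database; `conn_demo`, `cur_demo`, and `columns` are hypothetical names used only for this illustration.

```python
import sqlite3

conn_demo = sqlite3.connect(":memory:")
cur_demo = conn_demo.cursor()

# Same schema as the 'people' table above
cur_demo.execute("""
CREATE TABLE IF NOT EXISTS people (
    ssn INTEGER PRIMARY KEY,
    name VARCHAR(255) NOT NULL,
    age INTEGER)
""")

# PRAGMA table_info returns one row per column:
# (cid, name, type, notnull, dflt_value, pk)
columns = cur_demo.execute("PRAGMA table_info(people)").fetchall()
for cid, name, col_type, notnull, default, pk in columns:
    print(name, col_type, "NOT NULL" if notnull else "", "PK" if pk else "")

conn_demo.close()
```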

### Inserting data into SQL using Python
To insert data into the SQLite database we again use the `.execute()` method on our cursor. With the `INSERT OR IGNORE` statement, SQLite attempts to insert each row into the table. If a row conflicts with existing data (e.g., a duplicate primary key or a violated unique constraint), that row is skipped and the remaining rows are still inserted.
  
**The following SQL statement reads as so:**  
"I would like to insert into the database table named 'people', and the columns that I would like to insert data into are 'ssn', 'name', and 'age'. The values to insert are as follows."  
  


In [15]:
# Insertion of data
cur.execute("""
INSERT OR IGNORE INTO people (ssn, name, age) VALUES
(1010, 'Mike', 25),
(9090, 'Hannah', 18),
(7654, 'Michelle', 22),
(2363, 'Josh', 35),
(1264, 'Blake', 55)
""")

<sqlite3.Cursor at 0x11c7ad340>
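
When the values come from Python variables rather than a hand-written statement, a common alternative is to pass them separately through `?` placeholders with `.executemany()`. The sketch below is an illustration on an in-memory database, not the method used in this notebook; the names `conn_demo`, `cur_demo`, `rows`, and `stored` are hypothetical. It also shows `INSERT OR IGNORE` skipping a row whose primary key already exists.

```python
import sqlite3

conn_demo = sqlite3.connect(":memory:")
cur_demo = conn_demo.cursor()
cur_demo.execute(
    "CREATE TABLE people ("
    "ssn INTEGER PRIMARY KEY, name VARCHAR(255) NOT NULL, age INTEGER)"
)

# The third row reuses ssn 1010, so it conflicts with the first row
rows = [
    (1010, "Mike", 25),
    (9090, "Hannah", 18),
    (1010, "Duplicate Mike", 99),
]

# Each tuple fills the three '?' placeholders in order
cur_demo.executemany(
    "INSERT OR IGNORE INTO people (ssn, name, age) VALUES (?, ?, ?)",
    rows,
)
conn_demo.commit()

stored = cur_demo.execute(
    "SELECT ssn, name, age FROM people ORDER BY ssn"
).fetchall()
print(stored)

conn_demo.close()
```

Only the first two rows are stored; the conflicting third row is ignored without raising an error.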

### Committing our executions to the database 
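To save our changes to the SQLite database we must call the `.commit()` method on our connection. Until then, the inserted rows belong to an open transaction and are lost if the connection is closed without committing.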

In [16]:
# Committing our script
conn.commit()
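
As an alternative to calling `.commit()` by hand, a `sqlite3` connection can be used as a context manager: the block is committed if it finishes normally and rolled back if it raises an exception. The sketch below illustrates this on an in-memory database; `conn_demo` and `names` are hypothetical names. Note that the `with` block does not close the connection.

```python
import sqlite3

conn_demo = sqlite3.connect(":memory:")
conn_demo.execute(
    "CREATE TABLE people ("
    "ssn INTEGER PRIMARY KEY, name VARCHAR(255) NOT NULL, age INTEGER)"
)

# Successful block: committed automatically on exit
with conn_demo:
    conn_demo.execute("INSERT INTO people VALUES (1010, 'Mike', 25)")

# Failing block: the second insert reuses ssn 1010, which violates the
# primary key, so every statement in this block is rolled back
try:
    with conn_demo:
        conn_demo.execute("INSERT INTO people VALUES (9090, 'Hannah', 18)")
        conn_demo.execute("INSERT INTO people VALUES (1010, 'Clash', 40)")
except sqlite3.IntegrityError as exc:
    print("Rolled back:", exc)

names = [row[0] for row in conn_demo.execute("SELECT name FROM people")]
print(names)

conn_demo.close()
```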

## Extracting data from SQL into a Pandas dataframe
To load data from an SQLite database into a pandas dataframe we use the `pandas` function `pd.read_sql_query()`, passing it an SQL query as a string together with the open connection.

In [17]:
# Initialize the connection
conn = sqlite3.connect("mydb.db")

# Extracting the data from SQLite, data type is dataframe on import
sql = pd.read_sql_query("SELECT * FROM people", conn)

# Display
print(sql)
print('\n')
print(sql.info())
print(sql.describe())

    ssn      name  age
0  1010      Mike   25
1  1264     Blake   55
2  2363      Josh   35
3  7654  Michelle   22
4  9090    Hannah   18


<class 'pandas.core.frame.DataFrame'>
RangeIndex: 5 entries, 0 to 4
Data columns (total 3 columns):
 #   Column  Non-Null Count  Dtype 
---  ------  --------------  ----- 
 0   ssn     5 non-null      int64 
 1   name    5 non-null      object
 2   age     5 non-null      int64 
dtypes: int64(2), object(1)
memory usage: 248.0+ bytes
None
              ssn        age
count     5.00000   5.000000
mean   4276.20000  31.000000
std    3807.35961  14.815532
min    1010.00000  18.000000
25%    1264.00000  22.000000
50%    2363.00000  25.000000
75%    7654.00000  35.000000
max    9090.00000  55.000000
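
`pd.read_sql_query()` also accepts a `params=` argument, which is useful when the query should filter on a value held in a Python variable. The sketch below is a minimal illustration on an in-memory copy of a few rows; `conn_demo` and `older` are hypothetical names, and the age threshold of 30 is chosen only for the example.

```python
import sqlite3
import pandas as pd

conn_demo = sqlite3.connect(":memory:")
conn_demo.execute(
    "CREATE TABLE people ("
    "ssn INTEGER PRIMARY KEY, name VARCHAR(255) NOT NULL, age INTEGER)"
)
conn_demo.executemany(
    "INSERT INTO people VALUES (?, ?, ?)",
    [(1010, "Mike", 25), (2363, "Josh", 35), (1264, "Blake", 55)],
)
conn_demo.commit()

# The '?' placeholder is filled from params, so the value never has to be
# pasted into the SQL string itself
older = pd.read_sql_query(
    "SELECT name, age FROM people WHERE age > ? ORDER BY age",
    conn_demo,
    params=(30,),
)
print(older)

conn_demo.close()
```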


## Inserting Pandas dataframes into SQL

In [18]:
# Creating a dataframe
df = pd.DataFrame({
    'ssn' : [9999, 8888, 7777],
    'name' : ['Jack', 'David', 'Rick'],
    'age' : [88, 44, 31]
})

### Exporting the dataframe into the SQL database
To export a dataframe into the SQLite database we first establish the connection via `sqlite3.connect()`. We then call `.to_sql()` on the dataframe, passing the name of the table to create or append to, the connection via `con=`, and the behavior when the table already exists via `if_exists=`. Lastly, `index=False` keeps the dataframe index from being written as an extra column.  
  
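NOTE: `if_exists=` accepts three values: 'fail' (the default, raises an error if the table already exists), 'replace' (drops the existing table and recreates it from the dataframe), and 'append' (adds the rows to the existing table).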

In [19]:
# Establishing connection to the SQLite database
conn = sqlite3.connect('mydb.db')

# Exporting the dataframe, and placing it into the SQLite database
df.to_sql('people', con=conn, if_exists='append', index=False)

3
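
To see how `if_exists=` behaves, the sketch below appends rows to an existing table, reads the table back to confirm the result, and then shows `'fail'` refusing to write because the table is already there. It runs on an in-memory database; `conn_demo`, `new_rows`, `after_append`, and `failed` are hypothetical names used only for this illustration.

```python
import sqlite3
import pandas as pd

conn_demo = sqlite3.connect(":memory:")
conn_demo.execute(
    "CREATE TABLE people ("
    "ssn INTEGER PRIMARY KEY, name VARCHAR(255) NOT NULL, age INTEGER)"
)
conn_demo.execute("INSERT INTO people VALUES (1010, 'Mike', 25)")
conn_demo.commit()

new_rows = pd.DataFrame({
    "ssn": [9999, 8888],
    "name": ["Jack", "David"],
    "age": [88, 44],
})

# 'append' adds the dataframe rows to the existing table
new_rows.to_sql("people", con=conn_demo, if_exists="append", index=False)
after_append = pd.read_sql_query("SELECT * FROM people ORDER BY ssn", conn_demo)
print(after_append)

# 'fail' raises a ValueError because the table already exists
failed = False
try:
    new_rows.to_sql("people", con=conn_demo, if_exists="fail", index=False)
except ValueError as exc:
    failed = True
    print("Refused:", exc)

conn_demo.close()
```

Reading the table back after an export is a quick way to confirm that the rows landed where expected.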