# **E. Database (SQLite)**

In this section, we're going to learn what a database is and how to work with one in Python.



## _Objective_
1. **Database**: Understanding what a database is
2. **Database practice with SQLite3**: Practising working with a database in Python

# [1. Database]

We live in the era of big data, so pulling data from a database, pushing data into one, and processing it into a form suitable for analysis have become crucial tasks.<br>
First, let's find out what a database is.


<img src = 'https://imgur.com/KPX8DXj.jpg' align=left width=200 height=200/>

## 1. What is a Database?

### (1) Definition

A database is an integrated store of data, kept with minimal duplication and shared by multiple users, typically within an organization.

## 2. Structure of a Database
A database table has a structure similar to a DataFrame, since both are tabular objects.<br>
Let's take a closer look at the structure of a database by comparing it to DataFrames.

<img src= 'https://imgur.com/b6edS7y.jpg' align=left width=600 height=600 />

### (1) Relation
In a DataFrame, data are arranged and managed as a table.<br>
Data stored in a relational database are also managed as tables, and each table is called a **`relation`**.

### (2) Attributes

In relational databases, an attribute is a named property that describes one characteristic of the data.  
Every value in a given column belongs to that column's attribute.

In DataFrames, attributes are usually called **columns** or **variables**.

### (3) Tuples

Any single row, which represents a single entity, is called a **tuple** or **record** in the database.
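The three terms can be seen side by side in a minimal sketch. The `pet` table, its columns, and its rows are hypothetical and exist only for illustration; the sketch uses the standard-library `sqlite3` module that the next part introduces.

```python
import sqlite3

# Hypothetical demo relation (table), not part of the tutorial's data
demo_conn = sqlite3.connect(":memory:")
demo_conn.execute("CREATE TABLE pet (Name TEXT, Species TEXT, Age INTEGER)")
demo_conn.executemany(
    "INSERT INTO pet VALUES (?, ?, ?)",
    [("Rex", "dog", 3), ("Tom", "cat", 5)],
)

# Attributes: PRAGMA table_info returns one row per column,
# and index 1 of each row holds the column name.
attributes = [row[1] for row in demo_conn.execute("PRAGMA table_info(pet)")]

# Tuples: every fetched row is a Python tuple, one per record.
tuples = demo_conn.execute("SELECT * FROM pet ORDER BY rowid").fetchall()

print(attributes)
print(tuples)
demo_conn.close()
```

Here the relation is `pet`, the attributes are its column names, and each element of `tuples` is one record.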


# [2. Database Practice with SQLite]
SQLite is a database management system like MySQL or PostgreSQL, but it does not run as a separate server: it is a lightweight library that stores the database in a single file or in memory.<br>
Let's use SQLite to practice manipulating databases in Python.

## 1. Creating Database

### (1) CREATE relation

First, import the `sqlite3` library, which provides the database implementation.

In [1]:
import sqlite3

With `connect()`, you can create a temporary in-memory database and get a connection object for it.

In [2]:
conn = sqlite3.connect(":memory:")

With `.cursor()`, you can create a cursor on the connected database object.

In [3]:
cur = conn.cursor()

Specify the attributes of the relation to be created.

In [4]:
query = """
CREATE TABLE student
(Name VARCHAR(20), Age INTEGER, Sex VARCHAR(10));
"""

With `.execute()`, you can create a relation.

In [5]:
conn.execute(query)

<sqlite3.Cursor at 0x28d03444b90>

Check the type of the `conn` object.

In [6]:
type(conn)

sqlite3.Connection
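In `sqlite3`, `Connection.execute()` is a shortcut that creates a cursor internally and returns it, which is why the cell output above shows a `sqlite3.Cursor`. The sketch below does the same step through an explicit cursor and then checks that the table exists. It uses a separate, hypothetical connection named `demo_conn` so that it does not touch the `conn` object from the cells above.

```python
import sqlite3

# Separate demo connection so the tutorial's `conn` is left untouched
demo_conn = sqlite3.connect(":memory:")
demo_cur = demo_conn.cursor()

# Same CREATE TABLE statement, issued through the cursor
demo_cur.execute(
    "CREATE TABLE student (Name VARCHAR(20), Age INTEGER, Sex VARCHAR(10))"
)

# sqlite_master is SQLite's built-in catalog of schema objects
demo_cur.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
tables = [row[0] for row in demo_cur.fetchall()]

print(tables)
demo_conn.close()
```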

### (2) INSERT data

First, specify the tuples to be inserted into the relation.

In [7]:
form = "INSERT INTO student VALUES(?, ?, ?)"
data = [('Jake', 20, 'Male'),
        ('Anna', 23, 'Femlae'),
        ('Peter', 21, 'Male')]

You can insert data using the `conn.executemany()` method.

In [8]:
conn.executemany(form, data)

<sqlite3.Cursor at 0x28d03444ab0>

Save the changes with the `conn.commit()` method.

In [9]:
conn.commit()
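As a variation on the insert-then-commit steps above, a `sqlite3` connection can also be used as a context manager: the transaction is committed if the block finishes and rolled back if it raises. The sketch below assumes a fresh, hypothetical connection named `demo_conn` and two sample rows, then reads the rows back to confirm they were stored.

```python
import sqlite3

demo_conn = sqlite3.connect(":memory:")
demo_conn.execute(
    "CREATE TABLE student (Name VARCHAR(20), Age INTEGER, Sex VARCHAR(10))"
)

rows = [("Jake", 20, "Male"), ("Peter", 21, "Male")]

# Commits on success, rolls back on an exception.
# Note: this does not close the connection.
with demo_conn:
    demo_conn.executemany("INSERT INTO student VALUES (?, ?, ?)", rows)

# Read the data back to verify the insert
count = demo_conn.execute("SELECT COUNT(*) FROM student").fetchone()[0]
stored = demo_conn.execute("SELECT * FROM student ORDER BY rowid").fetchall()

print(count)
print(stored)
demo_conn.close()
```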

## 2. DataFrame <-> Database
You can use Pandas to read a database table and convert it to a DataFrame.

### (1) Read Database with Pandas

Import modules for the conversion of the database.

In [10]:
import pandas as pd
import pandas.io.sql as sql

Once the libraries are imported, you can read a database table into a DataFrame using `pd.read_sql()`.

In [11]:
pd.read_sql('select * from student', conn)

| | Name | Age | Sex |
|---|---|---|---|
| 0 | Jake | 20 | Male |
| 1 | Anna | 23 | Femlae |
| 2 | Peter | 21 | Male |


### (2) Converting Database to DataFrame

By assigning the DataFrame returned by `pd.read_sql()` to a variable, you can manipulate data from the database in the same way as any other DataFrame.

In [12]:
df = pd.read_sql('select * from student', conn)
df

| | Name | Age | Sex |
|---|---|---|---|
| 0 | Jake | 20 | Male |
| 1 | Anna | 23 | Femlae |
| 2 | Peter | 21 | Male |


### (3) Converting DataFrame to Database
The `.to_sql()` method writes a DataFrame object to a database table.

First, create an in-memory database engine.<br>
※ In-memory database: a database that keeps its data in main memory rather than on disk.

`SQLAlchemy` is a Python SQL toolkit that gives application developers the full power and flexibility of SQL.<br>
Let's import `SQLAlchemy`'s `create_engine()` function, which produces an engine object that accesses a database identified by a URL.

In [13]:
from sqlalchemy import create_engine
engine = create_engine('sqlite://', echo=False)

Write `df` to a database table named `student`.

In [14]:
df.to_sql('student', con=engine)

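You can query the database by calling `.execute()` on a connection and `.fetchall()` on the result.<br>
`engine.execute()` was removed in SQLAlchemy 2.0, so the cell below opens a connection explicitly and wraps the SQL string in `text()`.

In [15]:
from sqlalchemy import text
with engine.connect() as connection:
    rows = connection.execute(text("SELECT * FROM student")).fetchall()
rows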

[(0, 'Jake', 20, 'Male'), (1, 'Anna', 23, 'Femlae'), (2, 'Peter', 21, 'Male')]
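The leading `0, 1, 2` in each row above is the DataFrame index, which `.to_sql()` writes as an extra column by default. The sketch below shows one way to avoid that with `index=False`. It also passes a plain `sqlite3` connection as `con`, which pandas accepts in place of a SQLAlchemy engine. The names `demo_conn` and `demo_df` and the two sample rows are hypothetical.

```python
import sqlite3

import pandas as pd

demo_conn = sqlite3.connect(":memory:")
demo_df = pd.DataFrame(
    {"Name": ["Jake", "Peter"], "Age": [20, 21], "Sex": ["Male", "Male"]}
)

# index=False keeps the DataFrame index out of the table
demo_df.to_sql("student", demo_conn, index=False)

# Read the table back to confirm only the three data columns were stored
round_trip = pd.read_sql("SELECT * FROM student", demo_conn)

print(round_trip)
demo_conn.close()
```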

## 3. Database Manipulation

Data can be manipulated either with SQL on a relation or with Pandas on a DataFrame.<br>
Let's learn by comparing the two methods.

### (1) SELECT Name

#### Data of a person named 'Jake'

In [16]:
#SQL
pd.read_sql("select * from student where Name = 'Jake'", conn)

| | Name | Age | Sex |
|---|---|---|---|
| 0 | Jake | 20 | Male |


In [17]:
#Pandas
df[df.Name == 'Jake']

| | Name | Age | Sex |
|---|---|---|---|
| 0 | Jake | 20 | Male |


### (2) SELECT Age

#### Data of people over the age of 21

In [18]:
#SQL
pd.read_sql('select * from student where Age>21', conn)

| | Name | Age | Sex |
|---|---|---|---|
| 0 | Anna | 23 | Femlae |


In [19]:
#Pandas
df[df.Age>21]

| | Name | Age | Sex |
|---|---|---|---|
| 1 | Anna | 23 | Femlae |


### (3) SELECT Sex

#### Data of all men

In [20]:
#SQL
pd.read_sql("select * from student where Sex = 'Male'", conn)

| | Name | Age | Sex |
|---|---|---|---|
| 0 | Jake | 20 | Male |
| 1 | Peter | 21 | Male |


In [21]:
#Pandas
df[df.Sex=='Male']

| | Name | Age | Sex |
|---|---|---|---|
| 0 | Jake | 20 | Male |
| 2 | Peter | 21 | Male |
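The SQL filters above embed the search value directly in the query string. When the value comes from a variable, one common alternative is to bind it through a `?` placeholder, the same mechanism used earlier for `INSERT`. The sketch below repeats the three filters that way with plain `sqlite3`. The connection `demo_conn` and its rows are hypothetical demo data modelled on the `student` table.

```python
import sqlite3

demo_conn = sqlite3.connect(":memory:")
demo_conn.execute(
    "CREATE TABLE student (Name VARCHAR(20), Age INTEGER, Sex VARCHAR(10))"
)
demo_conn.executemany(
    "INSERT INTO student VALUES (?, ?, ?)",
    [("Jake", 20, "Male"), ("Anna", 23, "Female"), ("Peter", 21, "Male")],
)

# Each `?` is filled from the parameter tuple, so no quoting is needed
by_name = demo_conn.execute(
    "SELECT * FROM student WHERE Name = ?", ("Jake",)
).fetchall()
over_21 = demo_conn.execute(
    "SELECT * FROM student WHERE Age > ?", (21,)
).fetchall()
men = demo_conn.execute(
    "SELECT * FROM student WHERE Sex = ? ORDER BY rowid", ("Male",)
).fetchall()

print(by_name)
print(over_21)
print(men)
demo_conn.close()
```

Binding values this way also sidesteps the question of which quote character to use for string literals inside the SQL text.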
