![image.png](https://raw.githubusercontent.com/fjvarasc/DSPXI/master/figures/py_logo.png)


<h1>Table of Contents<span class="tocSkip"></span></h1>
<div class="toc"><ul class="toc-item"><li><span><a href="#Database-operations" data-toc-modified-id="Database-operations-3"><span class="toc-item-num">3&nbsp;&nbsp;</span>Database operations</a></span><ul class="toc-item"><li><span><a href="#Modifying-database-rows" data-toc-modified-id="Modifying-database-rows-3.1"><span class="toc-item-num">3.1&nbsp;&nbsp;</span>Modifying database rows</a></span></li><li><span><a href="#Inserting-rows-with-Python" data-toc-modified-id="Inserting-rows-with-Python-3.2"><span class="toc-item-num">3.2&nbsp;&nbsp;</span>Inserting rows with Python</a></span></li><li><span><a href="#Passing-parameters-into-a-query" data-toc-modified-id="Passing-parameters-into-a-query-3.3"><span class="toc-item-num">3.3&nbsp;&nbsp;</span>Passing parameters into a query</a></span></li><li><span><a href="#Updating-rows" data-toc-modified-id="Updating-rows-3.4"><span class="toc-item-num">3.4&nbsp;&nbsp;</span>Updating rows</a></span></li><li><span><a href="#Deleting-rows" data-toc-modified-id="Deleting-rows-3.5"><span class="toc-item-num">3.5&nbsp;&nbsp;</span>Deleting rows</a></span></li><li><span><a href="#Creating-tables" data-toc-modified-id="Creating-tables-3.6"><span class="toc-item-num">3.6&nbsp;&nbsp;</span>Creating tables</a></span></li><li><span><a href="#Creating-tables-with-pandas" data-toc-modified-id="Creating-tables-with-pandas-3.7"><span class="toc-item-num">3.7&nbsp;&nbsp;</span>Creating tables with pandas</a></span></li><li><span><a href="#Altering-tables-with-Pandas" data-toc-modified-id="Altering-tables-with-Pandas-3.8"><span class="toc-item-num">3.8&nbsp;&nbsp;</span>Altering tables with Pandas</a></span></li><li><span><a href="#Altering-tables-with-Pandas" data-toc-modified-id="Altering-tables-with-Pandas-3.9"><span class="toc-item-num">3.9&nbsp;&nbsp;</span>Altering tables with Pandas</a></span></li></ul></li></ul></div>

# Database operations

## Modifying database rows
We can use the `sqlite3` package to modify a SQLite database by inserting, updating, or deleting rows. Creating the connection works the same way as it does when querying a table, so we'll skip that part.

## Inserting rows with Python
To insert a row, we need to write an `INSERT` query that adds a new row to the `airlines` table. We specify 9 values to insert, one for each column in `airlines`.
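As a minimal sketch of such an insert, the example below builds a throwaway in-memory database with a simplified stand-in for `airlines`. The column names follow the output shown in this section, but the column types and the inserted values are assumptions made for illustration.

```python
import sqlite3

# In-memory database with a simplified stand-in for the airlines table.
# Column names follow the notebook output; the types are assumed.
conn = sqlite3.connect(":memory:")
conn.execute(
    "create table airlines ("
    '"index" integer, id integer, name text, alias text, iata text, '
    "icao text, callsign text, country text, active text)"
)

# One value per column, in column order: 9 columns, 9 values.
conn.execute(
    "insert into airlines values "
    "(6048, 19846, 'Test flight', '', '', null, null, null, 'Y')"
)
conn.commit()  # make the insert permanent

inserted = conn.execute("select id, name, active from airlines").fetchall()
print(inserted)
```

Because the query lists no column names, the values must be given in the same order as the table's columns.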

Let's look at the `airlines` table. By loading it into a DataFrame and calling `tail()`, we can check the last rows of the table:

In [18]:
df = pd.read_sql_query("select * from airlines;", connection)
df.tail()

| | index | id | name | alias | iata | icao | callsign | country | active | airplanes |
|---|---|---|---|---|---|---|---|---|---|---|
| 6044 | 6044 | 19830 | All Australia | All Australia | 88 | 8K8 | | Australia | Y | |
| 6045 | 6045 | 19831 | Fly Europa | | ER | RWW | | Spain | Y | |
| 6046 | 6046 | 19834 | FlyPortugal | | PO | FPT | FlyPortugal | Portugal | Y | |
| 6047 | 6047 | 19845 | FTI Fluggesellschaft | | | FTI | | Germany | N | |
| 6048 | 6048 | 19846 | Test flight4 | | | | | | Y | |


## Passing parameters into a query
In the last query, we hardcoded the values we wanted to insert into the database. Most of the time, the data you insert into a database won't be hardcoded. Instead, it will consist of dynamic values that come from downloaded data or from user input.

When working with dynamic data, it might be tempting to insert values using Python string formatting:

In [58]:
name = "Test Flight 256"
connection.execute("insert into airlines values (6049, 19847, '{0}', '', '', null, null, null, 'Y')".format(name));

You want to avoid doing this! Inserting values with Python string formatting makes your program vulnerable to [SQL Injection](https://en.wikipedia.org/wiki/SQL_injection) attacks. Luckily, `sqlite3` has a straightforward way to pass in dynamic values without relying on string formatting:

In [59]:
values = ('Test Flight12345', 'Y')
connection.execute("insert into airlines values (6049, 19847, ?, '', '', null, null, null, ?)", values);

Each `?` placeholder in the query is replaced by an item from `values`. The first `?` is replaced by the first item in `values`, the second by the second, and so on. This works for any type of query. The result is a SQLite [parameterized query](https://www.sqlite.org/lang_expr.html), which avoids SQL injection issues.
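To illustrate why placeholders are safer, the sketch below inserts a deliberately hostile-looking string and shows that it is stored as plain data. The three-column table and the sample values are assumptions for the example. It also shows the named-placeholder style, which `sqlite3` supports alongside `?`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Simplified, hypothetical version of the airlines table.
conn.execute("create table airlines (id integer, name text, active text)")

# With string formatting, a value like this could change the query itself.
name = "x'); drop table airlines; --"

# Positional placeholders: tuple items fill the ? marks in order.
conn.execute("insert into airlines values (?, ?, ?)", (19847, name, "Y"))

# Named placeholders: a dict supplies the values by key.
conn.execute(
    "insert into airlines values (:id, :name, :active)",
    {"id": 19848, "name": "Test Flight 2", "active": "Y"},
)
conn.commit()

# The hostile string was stored verbatim and the table still exists.
rows = conn.execute("select id, name from airlines order by id").fetchall()
print(rows)
```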

## Updating rows
We can modify rows in a SQLite table using the `execute` method:

In [60]:
values = ('USA', 19847)
connection.execute("update airlines set country=? where id=?", values);

We can then verify that the update happened:

In [61]:
pd.read_sql_query("select * from airlines where id=19847;", connection)

| | index | id | name | alias | iata | icao | callsign | country | active |
|---|---|---|---|---|---|---|---|---|---|
| 0 | 6049 | 19847 | Test Flight 256 | | | | | USA | Y |
| 1 | 6049 | 19847 | Test Flight12345 | | | | | USA | Y |
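Besides re-querying the table, the cursor returned by `execute` reports how many rows a statement changed through its `rowcount` attribute. The sketch below assumes a simplified three-column `airlines` table with made-up rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Simplified, hypothetical version of the airlines table.
conn.execute("create table airlines (id integer, name text, country text)")
conn.executemany(
    "insert into airlines values (?, ?, ?)",
    [
        (19846, "Test flight4", None),
        (19847, "Test Flight 256", None),
        (19847, "Test Flight12345", None),
    ],
)

# execute returns a cursor; rowcount is the number of rows the update touched.
cursor = conn.execute("update airlines set country=? where id=?", ("USA", 19847))
updated = cursor.rowcount
conn.commit()

countries = conn.execute("select country from airlines where id=19847").fetchall()
print(updated, countries)
```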


## Deleting rows
Finally, we can delete rows from a table using the `execute` method:

In [62]:
values = (19847, )
connection.execute("delete from airlines where id=?", values);

We can then verify that the deletion happened, by making sure no rows match our query:

In [63]:
pd.read_sql_query("select * from airlines where id=19847;", connection)

| | index | id | name | alias | iata | icao | callsign | country | active |
|---|---|---|---|---|---|---|---|---|---|
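Deletes, like inserts and updates, run inside a transaction in the default mode of `sqlite3`, so they can be undone with `rollback` until `commit` is called. The following sketch assumes a two-column stand-in for `airlines`, and `count_rows` is a hypothetical helper defined only for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Simplified, hypothetical version of the airlines table.
conn.execute("create table airlines (id integer, name text)")
conn.executemany(
    "insert into airlines values (?, ?)",
    [(19846, "Test flight4"), (19847, "Test Flight 256")],
)
conn.commit()


def count_rows(connection):
    """Return the number of rows currently visible in airlines."""
    return connection.execute("select count(*) from airlines").fetchone()[0]


# The delete is visible on this connection right away...
conn.execute("delete from airlines where id=?", (19847,))
after_delete = count_rows(conn)

# ...but it can still be undone until it is committed.
conn.rollback()
after_rollback = count_rows(conn)

# Deleting again and committing makes the change permanent.
conn.execute("delete from airlines where id=?", (19847,))
conn.commit()
after_commit = count_rows(conn)

print(after_delete, after_rollback, after_commit)
```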


## Creating tables
We can create tables by executing a SQL query. We can create a table to represent each daily flight on a route, with the following columns:

* id — integer
* departure — date, when the flight left the airport
* arrival — date, when the flight arrived at the destination
* number — text, the flight number
* route_id — integer, the id of the route the flight was flying

In [65]:
connection.execute("create table daily_flights (id integer, departure date, arrival date, number text, route_id integer)");

Once we create a table, we can insert data into it normally:

In [66]:
connection.execute("insert into daily_flights values (1, '2016-09-28 0:00', '2016-09-28 12:00', 'T1', 1)");

When we query the table, we'll now see the row:

In [67]:
pd.read_sql_query("select * from daily_flights;", connection)

| | id | departure | arrival | number | route_id |
|---|---|---|---|---|---|
| 0 | 1 | 2016-09-28 0:00 | 2016-09-28 12:00 | T1 | 1 |
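Running the same `create table` statement twice raises an error because the table already exists. One possible way around this, sketched below on an in-memory database, is SQLite's `if not exists` clause. The sketch also uses `pragma table_info` to list the columns that were actually created.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

ddl = (
    "create table if not exists daily_flights "
    "(id integer, departure date, arrival date, number text, route_id integer)"
)
conn.execute(ddl)
conn.execute(ddl)  # no error: the table already exists, so nothing happens

# pragma table_info returns one row per column; position 1 holds the column name.
column_names = [row[1] for row in conn.execute("pragma table_info(daily_flights)")]
print(column_names)
```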


## Creating tables with pandas
The pandas package gives us a much faster way to create tables. We just have to create a DataFrame first, then export it to a SQL table. First, we'll create a DataFrame:

In [19]:
from datetime import datetime
df = pd.DataFrame(
    [[1, datetime(2016, 9, 29, 0, 0) , datetime(2016, 9, 29, 12, 0), 'T1', 1]], 
    columns=["id", "departure", "arrival", "number", "route_id"]
)
df

| | id | departure | arrival | number | route_id |
|---|---|---|---|---|---|
| 0 | 1 | 2016-09-29 | 2016-09-29 12:00:00 | T1 | 1 |


Then, we can call the [to_sql](http://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.to_sql.html) method to write `df` to a table in a database. We set the `if_exists` parameter to `"replace"` so that any existing table named `daily_flights` is dropped and replaced:

In [69]:
df.to_sql("daily_flights", connection, if_exists="replace")

We can then verify that everything worked by querying the database:

In [20]:
pd.read_sql_query("select * from daily_flights;", connection)

| | level_0 | index | id | departure | arrival | number | route_id | delay_minutes |
|---|---|---|---|---|---|---|---|---|
| 0 | 0 | 0 | 1 | 2016-09-29 00:00:00.000000 | 2016-09-29 12:00:00.000000 | T1 | 1 | |
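By default, `to_sql` also writes the DataFrame's row index as an extra column, which is where columns such as `index` come from. The sketch below uses an in-memory database as an assumption, passes `index=False` to leave the index out, and uses `parse_dates` when reading so the date columns come back as timestamps rather than text.

```python
import sqlite3
from datetime import datetime

import pandas as pd

conn = sqlite3.connect(":memory:")

df = pd.DataFrame(
    [[1, datetime(2016, 9, 29, 0, 0), datetime(2016, 9, 29, 12, 0), "T1", 1]],
    columns=["id", "departure", "arrival", "number", "route_id"],
)

# index=False keeps the DataFrame's row index out of the SQL table.
df.to_sql("daily_flights", conn, if_exists="replace", index=False)

# parse_dates converts the stored date strings back into timestamps.
result = pd.read_sql_query(
    "select * from daily_flights", conn, parse_dates=["departure", "arrival"]
)
print(list(result.columns))
```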


## Altering tables
One of the hardest parts of working with real-world data is that the fields recorded for each record often change. Using our airline example, we may decide to add an `airplanes` field to the `airlines` table that indicates how many airplanes each airline owns. Luckily, there's a way to alter a table to add columns in SQLite:

In [71]:
connection.execute("alter table airlines add column airplanes integer;");

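Note that we didn't call `commit` here. In its default mode on current Python versions, the `sqlite3` module implicitly opens a transaction only before `INSERT`, `UPDATE`, `DELETE`, and `REPLACE` statements, so an `alter table` query issued while no transaction is open takes effect immediately. If earlier changes are still uncommitted, the alteration becomes part of that open transaction and still needs a `commit`. We can now query and see the extra column: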

In [21]:
pd.read_sql_query("select * from airlines limit 1;", connection)


| | index | id | name | alias | iata | icao | callsign | country | active | airplanes |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0 | 1 | Private flight | \N | - | | | | Y | |


Note that every value in the new `airplanes` column is null in SQLite (which translates to `None` in Python) because no values have been set for the column yet.
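The null values can be seen directly with `sqlite3`, without going through pandas. This is a small sketch on an in-memory database, assuming a two-column stand-in for `airlines`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Simplified, hypothetical version of the airlines table.
conn.execute("create table airlines (id integer, name text)")
conn.execute("insert into airlines values (1, 'Private flight')")
conn.commit()

# Add the new column; existing rows get null for it.
conn.execute("alter table airlines add column airplanes integer")

column_names = [row[1] for row in conn.execute("pragma table_info(airlines)")]
row = conn.execute("select id, name, airplanes from airlines").fetchone()
print(column_names)
print(row)  # the null in the new column arrives in Python as None
```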

## Altering tables with Pandas
It's also possible to use Pandas to alter tables by exporting the table to a DataFrame, making modifications to the DataFrame, then exporting the DataFrame to a table:

In [76]:
df = pd.read_sql("select * from daily_flights", connection)
df["delay_minutes"] = None
df.to_sql("daily_flights", connection, if_exists="replace")

The above code will add a column called `delay_minutes` to the `daily_flights` table.
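The round trip can be sketched end to end as follows. The in-memory database and the sample row are assumptions for the example. `index=False` is passed so that the DataFrame's row index is not written back as an extra column on each round trip. Since `if_exists="replace"` drops and recreates the table, pandas chooses the column types of the new table.

```python
import sqlite3

import pandas as pd

conn = sqlite3.connect(":memory:")
conn.execute(
    "create table daily_flights "
    "(id integer, departure date, arrival date, number text, route_id integer)"
)
conn.execute(
    "insert into daily_flights values "
    "(1, '2016-09-28 0:00', '2016-09-28 12:00', 'T1', 1)"
)
conn.commit()

# Export the table to a DataFrame, add the column, and write it back.
df = pd.read_sql("select * from daily_flights", conn)
df["delay_minutes"] = None
df.to_sql("daily_flights", conn, if_exists="replace", index=False)

column_names = [row[1] for row in conn.execute("pragma table_info(daily_flights)")]
print(column_names)
```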