# Using SQLite with Python

Documentation can be found [here](https://docs.python.org/3/library/sqlite3.html).

In [3]:
import pandas as pd
import sqlite3 as sq

## Some DB basics

SQLite offers persistent storage instead of relying on RAM, and offers full CRUD (create, read, update, delete) support. RAM offers very fast access to data, but as we learnt in the last module, it is volatile: any data disappears at shutdown, and it cannot easily be shared between multiple users. A SQL database still gives fast access to data, but stores it on disk (a local file, in SQLite's case) or on an external server; it also allows multiple users to query simultaneously, and stores data relationally in tables, which allows for more efficient storage.
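As a rough illustration of the four CRUD operations, here is a minimal sketch using the standard `sqlite3` module. The `users` table, its columns, and the sample rows are made up for this example, and an in-memory database is used so that nothing is written to disk.

```python
import sqlite3

# In-memory database so the sketch leaves no file behind;
# pass a filename instead (e.g. "example.db") for persistent storage.
con = sqlite3.connect(":memory:")
cur = con.cursor()

# Create: define a (hypothetical) table and insert some rows
cur.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, city TEXT)")
cur.executemany(
    "INSERT INTO users (name, city) VALUES (?, ?)",
    [("Ada", "London"), ("Grace", "New York")],
)

# Read: fetch the rows back
rows_after_insert = cur.execute(
    "SELECT name, city FROM users ORDER BY id"
).fetchall()

# Update: change one user's city
cur.execute("UPDATE users SET city = ? WHERE name = ?", ("Cambridge", "Ada"))

# Delete: remove one user
cur.execute("DELETE FROM users WHERE name = ?", ("Grace",))

con.commit()
rows_final = cur.execute("SELECT name, city FROM users ORDER BY id").fetchall()
con.close()

print(rows_after_insert)
print(rows_final)
```

The `?` placeholders let `sqlite3` bind the values safely rather than building the SQL string by hand.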

SQL gives us relational databases, as opposed to NoSQL, which covers non-relational databases such as document stores. In a relational database, the emphasis is on linking tables through keys, so that large values are not repeated many times in a single column.

## Comparison between Pandas and SQLite

More documentation [here](https://pandas.pydata.org/pandas-docs/stable/getting_started/comparison/comparison_with_sql.html).

Storing large datasets in CSV files becomes problematic when manual updates are required, for example updating the address of a user who is listed multiple times. SQL queries get around this by systematically editing every row that matches the selected conditions, and you can easily store queries so that they are re-run when triggered.
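The address example can be sketched as follows. This is only an illustration: the `orders` table, its columns, and the sample data are assumptions invented for the example, not part of any dataset used in this notebook.

```python
import sqlite3

con = sqlite3.connect(":memory:")  # throwaway database for the sketch

# Hypothetical table in which the same customer appears on several rows
con.execute(
    "CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer TEXT, address TEXT)"
)
con.executemany(
    "INSERT INTO orders (customer, address) VALUES (?, ?)",
    [
        ("alice", "1 Old Road"),
        ("bob", "5 High Street"),
        ("alice", "1 Old Road"),
    ],
)

# A single UPDATE statement edits every row that matches the condition
cur = con.execute(
    "UPDATE orders SET address = ? WHERE customer = ?",
    ("9 New Lane", "alice"),
)
updated_count = cur.rowcount  # number of rows changed by the UPDATE
con.commit()

addresses = con.execute(
    "SELECT customer, address FROM orders ORDER BY order_id"
).fetchall()
con.close()

print(updated_count)
print(addresses)
```

Both of Alice's rows are changed by the one statement, while Bob's row is left alone because it does not match the `WHERE` clause.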

It is possible to join tables loaded from CSV files on common values using the pandas `merge` function and its `left_on`, `right_on` and `how` parameters. SQL, however, lets you define relationships between columns in different tables (keys), which can then be referred to when filtering or combining data across tables, so a join is typically a single short query.
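To make the comparison concrete, here is a small sketch that performs the same left join twice, once with `pd.merge` and once with a SQL `JOIN`. The `customers` and `orders` frames and their column names are made up for illustration, and the data is copied into an in-memory SQLite database with `DataFrame.to_sql`.

```python
import sqlite3
import pandas as pd

# Hypothetical data; in practice these might come from pd.read_csv
customers = pd.DataFrame({"customer_id": [1, 2], "name": ["Ada", "Grace"]})
orders = pd.DataFrame(
    {"order_id": [10, 11, 12], "cust_id": [1, 1, 2], "item": ["book", "pen", "lamp"]}
)

# pandas: join on differently named key columns
merged = pd.merge(
    orders, customers, left_on="cust_id", right_on="customer_id", how="left"
)

# SQLite: load the same frames as tables, then join them in one query
con = sqlite3.connect(":memory:")
customers.to_sql("customers", con, index=False)
orders.to_sql("orders", con, index=False)

joined = pd.read_sql_query(
    """
    SELECT o.order_id, o.item, c.name
    FROM orders AS o
    LEFT JOIN customers AS c ON o.cust_id = c.customer_id
    ORDER BY o.order_id
    """,
    con,
)
con.close()

print(merged[["order_id", "item", "name"]])
print(joined)
```

Both approaches should give the same three rows; the SQL version also selects only the columns of interest in the same statement.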


## Creating a SQLite database

In [4]:
con = sq.connect("tutorial.db")