Sometimes a database is part of your project, or will be very helpful in your project.  This notebook gives a quick overview of Python's two built-in database interfaces: interfaces to DBM and SQLite.

# `dbm`: easy interfaces with simple DBM-style databases

[DBM](https://en.wikipedia.org/wiki/DBM_(computing)) databases are very old, very simple, and very useful.  They're an extremely simple form of key-value storage (like an even simpler JSON file): all keys and values must be strings (or bytes) when accessing a database via Python.  DBM databases aren't as fault-tolerant, robust, or flexible as other database options, so their use cases are fairly limited these days, especially with SQLite (see a bit further down in this notebook) being as easy to use, similarly fast, and way more flexible and widely available.  That said, sometimes the simplicity of a DBM database is all you need.

The Python interface to DBM is *very* simple, and mostly looks like accessing a dictionary.  That can make `dbm` pretty useful in spite of its limitations, since other database interfaces require more code and more complexity.  The only tricky-ish part is that the file modes are a bit different than normal files: `"r"` opens an existing database read-only (the default), `"w"` opens an existing database for reading and writing, `"c"` does the same but creates the database if it doesn't exist, and `"n"` always creates a new, empty database.  Check the `dbm` documentation in Python for details.

In [1]:
import dbm
# "c" mode -> open in read+write mode, creating the database if it doesn't exist
with dbm.open("my_dbm_database.db", "c") as my_db:
    my_db["Henry"] = "Data scientist"
    # this will throw an error
    # my_db["Number"] = 100
    
# there is now a DBM database stored in `my_dbm_database.db`
# (some backends add their own file extension to this name),
# which can be re-loaded and accessed later.
with dbm.open("my_dbm_database.db", "r") as my_db:
    # note that text stored in these databases gets converted to bytestrings
    print(my_db["Henry"])

b'Data scientist'
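
Because everything comes back as bytes, and only strings or bytes go in, a bit of conversion is usually needed around a DBM database.  The following is a minimal sketch of one way to do that, not the only one; the scratch directory and the `example_dbm` file name are made up for illustration.

```python
import dbm
import os
import tempfile

# Hypothetical scratch location so the example does not touch real files.
tmp_dir = tempfile.mkdtemp()
db_path = os.path.join(tmp_dir, "example_dbm")

with dbm.open(db_path, "c") as db:
    # Convert non-string values to str (or bytes) before storing them.
    db["Number"] = str(100)
    db["Henry"] = "Data scientist"

with dbm.open(db_path, "r") as db:
    # Values come back as bytes: decode to text, then convert as needed.
    number = int(db["Number"].decode("utf-8"))
    job = db["Henry"].decode("utf-8")
    # keys() returns bytes too; the `in` operator works like a dictionary's.
    keys = sorted(key.decode("utf-8") for key in db.keys())
    has_henry = "Henry" in db

print(number, job, keys, has_henry)
# 100 Data scientist ['Henry', 'Number'] True
```

For anything more structured than a single number or string, serializing values with `json.dumps` before storing them is a common choice, at which point SQLite may be the simpler option.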


# `sqlite3`: easily work with local SQL databases

Note: if you don't have any experience with SQL, this library won't be any use to you--but SQL is extremely easy to learn.  It's definitely something you *should* learn, but it goes beyond the scope of what we're covering in these workshops.

Most SQL databases rely on a *client-server* architecture: you have one program running that hosts the database and runs queries, and a separate program that lets the user write queries.  These are generally on different machines: the server is running on, well, a server somewhere, and the client program is running on your machine, sending your queries to the server to be executed.  This is great for something like a central database for a company, school, or department.

But SQL is *super* convenient for writing all sorts of data queries.  [SQLite](https://www.sqlite.com/index.html) is a SQL database engine that makes it easy to use SQL in your own projects without needing a whole, dedicated database server.  SQLite differs from other SQL implementations in a few important ways:
- It is *self-contained.*  There is no server and client.  SQLite reads and writes databases to and from single files on your computer.
- It is *zero-configuration.*  Most SQL server implementations need a fair bit of configuration before you can use them, both on the server and client end.  SQLite requires nothing other than pointing it at the right file to get up and running.
- It supports most, but not all, of the SQL language.  It doesn't have some of the conveniences of bigger implementations (stored procedures, for example), but it's also not designed for the same kinds of complex queries and workloads.

In other words: SQLite is an amazing choice when your project would benefit from a (usually small) database.  But it is *not* a good choice for running your company or school's central database off of.

Python's standard library has a `sqlite3` module that lets you interact with SQLite databases very easily.

In [2]:
import sqlite3

with sqlite3.connect("my_sqlite_db.db") as con:
    # interactions with the database go through the `cursor` object.
    # `cursor.execute("some SQL code")` executes SQL queries in the
    # database.
    cursor = con.cursor()
    
    # create a table
    cursor.execute("drop table if exists TestTable")
    cursor.execute("create table TestTable (id int, name text)")
    
    cursor.execute("insert into TestTable values (1, 'Henry')")
    cursor.execute("insert into TestTable values (2, 'George')")
    cursor.execute("insert into TestTable values (3, 'Justin')")
    
    # call con.commit()--not cursor.commit()--to save any changes
    con.commit()
    
    # cursor.execute() returns the cursor itself, which holds the query results
    query_result = cursor.execute("select * from TestTable")
    print(query_result)
    
    # iterate through it to get results as tuples, one tuple per row, one entry
    # per column.
    for row in query_result:
        print(row)

<sqlite3.Cursor object at 0x000001CDE8DDE880>
(1, 'Henry')
(2, 'George')
(3, 'Justin')
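
When the values in a query come from variables rather than being typed into the SQL string, `sqlite3` supports `?` placeholders.  Here is a small sketch of that, reusing the `TestTable` layout from above; the `people` list and the `wanted` variable are made up for illustration, and `":memory:"` is used so that no file is created.

```python
import sqlite3
from contextlib import closing

# Hypothetical rows to insert; in practice these might come from user input.
people = [(1, "Henry"), (2, "George"), (3, "Justin")]

# ":memory:" creates a temporary database in RAM instead of a file.
# closing() makes sure the connection gets closed; `with con:` on its own
# only commits (or rolls back on an exception), it does not close.
with closing(sqlite3.connect(":memory:")) as con:
    with con:
        con.execute("create table TestTable (id int, name text)")
        # "?" placeholders let sqlite3 handle quoting of the values.
        con.executemany("insert into TestTable values (?, ?)", people)

    wanted = "George"
    matches = con.execute(
        "select id, name from TestTable where name = ?", (wanted,)
    ).fetchall()

print(matches)
# [(2, 'George')]
```

Placeholders are generally preferable to building SQL with string formatting, since they avoid quoting mistakes and SQL injection.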


If you know some SQL, you can use pretty much all of that knowledge in SQLite: joins, filters, creating tables, etc.  The only thing to be careful of is how SQLite handles data types: it is not as strict as other SQL databases when it comes to enforcing types.  It'll often coerce types, and it's happy to have a column that stores a mix of integer and text values.  Usually this isn't an issue: if you're using SQLite, you probably aren't doing something where type checking has to be handled by the database itself.  Either you can handle that in your own code, or you have little enough data that it isn't an issue.  But it can be a bit of a stumbling block for people coming from other SQL implementations.
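
To make the type behaviour concrete, here is a rough sketch that puts an integer, a numeric-looking string, and a plain string into the same `int` column, then asks SQLite what it actually stored.  The `Mixed` table is made up for illustration; `typeof()` is a built-in SQLite function.

```python
import sqlite3
from contextlib import closing

with closing(sqlite3.connect(":memory:")) as con:
    con.execute("create table Mixed (id int)")
    # An integer, text that looks like an integer, and text that does not.
    con.executemany("insert into Mixed values (?)", [(1,), ("2",), ("two",)])
    stored_types = con.execute(
        "select id, typeof(id) from Mixed order by rowid"
    ).fetchall()

print(stored_types)
# [(1, 'integer'), (2, 'integer'), ('two', 'text')]
```

The string `"2"` is coerced to an integer, while `"two"` is accepted as text even though the column was declared `int`.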

There are several other libraries worth knowing about for database access:
- `sqlalchemy`: interfaces to all sorts of SQL databases, with all sorts of different implementations.
- `pyodbc`: connect to any SQL database that uses the ODBC connection standard (pretty much all SQL servers have this available, but it may be disabled in some installations).
- `redis`: access Redis databases, which are like DBM, but far, far more robust and flexible.  And faster.
- `pymongo`: connect to MongoDB databases, which are *document stores*.  They essentially store their data as JSON/JSON-like objects, which is a different paradigm to SQL's relational database model.

There are many more out there.  Pretty much all major databases will have a Python interface, either through more generic libraries like `sqlalchemy` and `pyodbc`, or through a more custom-tailored interface.