Coming from a computational science/applied mathematics background, many of the ideas of data science/machine learning are second nature to me. One thing that was very new to me was creating and manipulating databases. In computational science we don't really deal with databases, and I/O is something to avoid as much as possible because it limits performance. Therefore, in this blog post I'll be going over what I have learned about SQL databases, i.e. what they are, how to set them up, and how to use them.
SQL stands for Structured Query Language. It is a domain-specific language designed for one purpose: querying data stored in a relational database. There are plenty of good references on how to learn SQL query commands, two that I used are,
However, this is not what I intend to cover here. Instead, I would like to look at how one can create and interact with different implementations of SQL databases, and specifically how to do this using Python. There are many different implementations of SQL: SQLite, Oracle, MySQL, PostgreSQL, etc. The basic operations on SQL databases that are common to all the implementations are summarized by the acronym C.R.U.D.:
- Create: How to create a database and tables.
- Read: How to read from a table in a database.
- Update: How to update the values in a table in the database.
- Delete: How to delete rows from a table in the database.
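As a minimal sketch of the four operations, here they are run against an in-memory SQLite database using Python's standard-library sqlite3 module. SQLite is used only because it needs no server; the `users` table, its columns, and the sample rows are hypothetical and chosen purely for illustration.

```python
import sqlite3

# In-memory database: nothing is written to disk and it vanishes on close.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Create: define a table and add some rows (hypothetical schema and data).
cur.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, age INTEGER)")
cur.executemany(
    "INSERT INTO users (name, age) VALUES (?, ?)",
    [("Ada", 36), ("Grace", 45)],
)
conn.commit()

# Read: fetch the rows back from the table.
rows_after_create = cur.execute(
    "SELECT name, age FROM users ORDER BY id"
).fetchall()

# Update: change a value in an existing row.
cur.execute("UPDATE users SET age = ? WHERE name = ?", (37, "Ada"))
conn.commit()
ada_age = cur.execute(
    "SELECT age FROM users WHERE name = ?", ("Ada",)
).fetchone()[0]

# Delete: remove a row from the table.
cur.execute("DELETE FROM users WHERE name = ?", ("Grace",))
conn.commit()
rows_after_delete = cur.execute("SELECT name, age FROM users").fetchall()

print(rows_after_create)
print(ada_age)
print(rows_after_delete)

conn.close()
```

The `?` placeholders let the library substitute the values safely instead of building the SQL string by hand.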
For now the SQL implementations I'll be focusing on are,
and
- PostgreSQL and Python's interfaces to it, sqlalchemy and psycopg2.
As we'll see, most of the differences between working with the two implementations will be in how we create the databases. The queries will be largely the same, but the libraries we use to interact with the databases will differ depending on the SQL implementation. I'll be updating this blog post as time goes on, so check back later for new additions.
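To give a rough idea of what "same query, different library" can look like in practice, here is a small sketch built on Python's DB-API 2.0 convention, which both sqlite3 and psycopg2 follow. Each driver module exposes a `paramstyle` attribute naming its placeholder syntax (`qmark` for sqlite3, `pyformat` for psycopg2). The helper names `placeholder` and `find_by_name`, and the `users` table, are hypothetical; only the sqlite3 path is actually run here, since the PostgreSQL path would need a running server.

```python
import sqlite3


def placeholder(dbapi_module):
    """Return the positional parameter marker for a DB-API 2.0 driver module."""
    style = dbapi_module.paramstyle
    if style == "qmark":                 # e.g. sqlite3
        return "?"
    if style in ("format", "pyformat"):  # e.g. psycopg2
        return "%s"
    raise ValueError(f"unsupported paramstyle: {style}")


def find_by_name(conn, dbapi_module, name):
    """Run the same SELECT on any supported driver; only the marker differs."""
    mark = placeholder(dbapi_module)
    cur = conn.cursor()
    cur.execute(f"SELECT id, name FROM users WHERE name = {mark}", (name,))
    return cur.fetchall()


# Demonstration with SQLite (hypothetical table and data).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO users (name) VALUES (?)", ("Ada",))
conn.commit()

result = find_by_name(conn, sqlite3, "Ada")
print(result)
conn.close()

# With PostgreSQL the call would look the same, assuming an open psycopg2
# connection named pg_conn (not created here):
#     find_by_name(pg_conn, psycopg2, "Ada")
```

Only the placeholder is formatted into the SQL text; the value itself is still passed as a parameter, so the driver handles quoting.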
To install the requirements (everything except Python itself) with pip, run the following in the main directory:
pip install -r requirements.txt