# Intro to Relational Databases

### Introduction

When we say relational databases, we mean database systems like Postgres, MySQL, or SQLite, which we work with using a language called SQL.  Relational databases are used to store and retrieve data, and to do so quickly.  So far, we have simply used CSV files to store our data, or perhaps in a past life we gained some experience with Microsoft Excel.  These are good tools, but as the size of our data increases and the questions we have about the data become more complex, we need to move to a relational database that we query with SQL.

### SQL vs Excel

Now one way to learn about SQL is to compare it to some software that we have already used in our non-coding life: namely a spreadsheet like Microsoft Excel or a Google spreadsheet.  

> If you're not familiar with spreadsheets, that's ok -- we'll explore the concepts that we'll need to know.  

A spreadsheet is good for organizing, storing, and asking questions about our data.  Let's get started by using a Google spreadsheet to organize some information.  Imagine that we run a barber shop, and we want to use a Google spreadsheet to help us keep track of our customers and employees.  To do so, we created the following spreadsheet.

<img src="./barbershop.png" width='80%'/>

<img src="table-names.png" width="40%">

At the very bottom of the Google spreadsheet, you can see that the first sheet in the file is for storing information about Employees and the second stores information about Customers.

Now a lot of the components that we see in the Google spreadsheet we'll also see in SQL.

* Table
    * The `Employees` spreadsheet is similar to a table in a database.  A table stores information about just a single entity.  So for example, we have separate tables for `Customers` and `Employees`.  We'll discuss how to know when to separate data into multiple tables in future lessons.
    
* Columns
    * The table above has columns of `Name`, `Phone Number`, and `Email`.  In a database, each table will also have columns used to store different attributes about our data.
    
* Rows
    * We see each individual `employee` is stored in a separate row.  It will be the same in SQL.  For each individual *member* of a table, we will have a separate row, and each attribute of that row is in a column. 
    
* Document Name
    * Finally, notice that our Google document has a name of `Barbershop` at the top.  This document holds separate spreadsheets about employees and customers.  Similarly, we will create a SQL database named `Barbershop` that will hold our tables of `employees` and `customers`.
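
To make the analogy concrete, here is a rough sketch in plain Python of how these pieces nest inside one another. The employee values are made-up placeholders, not data from the spreadsheet above, and a real database stores its data quite differently. The sketch only illustrates the vocabulary of database, table, column, and row.

```python
# A rough mental model only: a database holds tables, a table holds rows,
# and each row has one value per column.
# The names and contact details below are hypothetical placeholders.
barbershop = {
    "employees": [
        {"name": "Employee A", "phone_number": "555-0100", "email": "a@example.com"},
        {"name": "Employee B", "phone_number": "555-0101", "email": "b@example.com"},
    ],
    "customers": [],  # a second table, left empty here
}

tables = list(barbershop)                    # the table names
columns = list(barbershop["employees"][0])   # the column names of one table
first_row = barbershop["employees"][0]       # one row describes one employee

print(tables)
print(columns)
print(first_row["name"])
```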

### Get started with SQL

There are various relational database systems that we can use, such as Postgres, MySQL, or SQLite.  They all work similarly, so we'll get started with SQLite3, as it's lightweight and easy to set up.

If we have a Mac, we can install SQLite3 with the following:

`brew install sqlite3`

Now that we have installed the SQLite3 software, the next step is to create a database.

![](./create-database.png)

Now we could type the command to create a table directly into the SQLite console, but before we do so, let's write it out below.

A database only makes sense if it has at least one table.  Here's the command to create an `employees` table with columns of `name`, `phone_number`, and `email`:

```sql
CREATE TABLE employees (name TEXT, phone_number TEXT, email TEXT);
```

We can think of the statement above as two main components:

* `CREATE TABLE employees`
    * The first component is "CREATE TABLE employees". `CREATE TABLE` is the SQL command to create a new table in the database. We always follow `CREATE TABLE` with the name of the table, in this case `employees`.

* `(name TEXT, phone_number TEXT, email TEXT)`
    * The second part of the statement concerns the columns of our new table. Inside the parentheses we specify the name of each column followed by the datatype of that column.  Like Python, SQLite has different datatypes.  But for now, we'll only use the datatypes of TEXT and INTEGER. Above, our first column is called `name`, which has the datatype `TEXT`. The columns are separated by commas.
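
To check how the two components turn into an actual table, the statement can be run from Python with the standard library's `sqlite3` module. This is a minimal sketch: it assumes an in-memory database (`":memory:"`) so that nothing is written to disk, and it uses SQLite's `PRAGMA table_info` to read back the column names and datatypes.

```python
import sqlite3

# An in-memory database keeps this sketch from touching any files on disk.
conn = sqlite3.connect(":memory:")

# The same statement as above: table name first, then (column datatype, ...).
conn.execute("CREATE TABLE employees (name TEXT, phone_number TEXT, email TEXT)")

# PRAGMA table_info returns one row per column:
# (cid, name, type, notnull, dflt_value, pk)
column_info = conn.execute("PRAGMA table_info(employees)").fetchall()
columns = [(row[1], row[2]) for row in column_info]

for name, datatype in columns:
    print(name, datatype)

conn.close()
```

Each printed line pairs a column name with the datatype declared for it inside the parentheses.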

### Executing our Command 
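The easiest way to execute the above code is to place our CREATE TABLE command in a separate file called `create_employees.sql`. Note that this version also adds an `id` column, which is explained in the Primary Keys section below.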

```sql
CREATE TABLE employees (id INTEGER PRIMARY KEY AUTOINCREMENT, name TEXT, phone_number TEXT, email TEXT);
```

And then from our terminal, we can execute our CREATE TABLE command against our database with the following:


`sqlite3 barbershop.db < create_employees.sql`

The above line tells SQLite3 to run the SQL statement in the `create_employees.sql` file in our `barbershop.db` database.
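
The same step can also be done from Python, which may be handy if the `sqlite3` command-line tool is not available. The sketch below is an approximation of the terminal command, not a replacement for it. It assumes a temporary working directory (so it will not overwrite a real `barbershop.db`) and writes `create_employees.sql` itself before running it.

```python
import os
import sqlite3
import tempfile

# Work in a throwaway directory so the sketch does not overwrite real files.
workdir = tempfile.mkdtemp()
sql_path = os.path.join(workdir, "create_employees.sql")
db_path = os.path.join(workdir, "barbershop.db")

# Write the same statement that the lesson places in create_employees.sql.
with open(sql_path, "w") as f:
    f.write(
        "CREATE TABLE employees (id INTEGER PRIMARY KEY AUTOINCREMENT, "
        "name TEXT, phone_number TEXT, email TEXT);\n"
    )

# Read the file and run every statement in it against the database file.
conn = sqlite3.connect(db_path)
with open(sql_path) as f:
    conn.executescript(f.read())
conn.commit()
conn.close()

db_exists = os.path.exists(db_path)
print(db_exists)
```

As with the terminal command, connecting to a database file that does not exist yet creates it.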

Our table is now stored in the barbershop.db file. We can check on this by viewing our database with the SQLite3 console:

`sqlite3 barbershop.db`

`.tables`

If successful, you should receive a response with the table names.

Running `.schema` in place of `.tables` should return the schema of the tables in the database.

![](./create-database.png)
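
`.tables` and `.schema` are commands of the SQLite3 console, so they are not available from other programs. As a rough equivalent, the sketch below reads the same information from `sqlite_master`, SQLite's built-in catalog table. It assumes an in-memory database with the `employees` table created on the spot, in place of the `barbershop.db` file.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employees (id INTEGER PRIMARY KEY AUTOINCREMENT, "
    "name TEXT, phone_number TEXT, email TEXT)"
)

# sqlite_master is SQLite's built-in catalog of tables and indexes.
# Names starting with "sqlite_" are internal bookkeeping tables, so skip them.
rows = conn.execute(
    "SELECT name, sql FROM sqlite_master "
    "WHERE type = 'table' AND name NOT LIKE 'sqlite_%'"
).fetchall()

table_names = [name for name, _ in rows]     # roughly what .tables lists
schemas = {name: sql for name, sql in rows}  # roughly what .schema prints

print(table_names)
print(schemas["employees"])

conn.close()
```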

#### Primary Keys

What happens when two rows in a table have exactly the same values in every column? How do we differentiate between the duplicates?  We do so by assigning each row an `id`. 

```sql
CREATE TABLE employees (id INTEGER PRIMARY KEY AUTOINCREMENT, name TEXT, phone_number TEXT, email TEXT);
```
You'll notice the above CREATE TABLE statement includes one new column at the beginning: 

`id INTEGER PRIMARY KEY AUTOINCREMENT, `

Above we add a new column of `id`, which is a primary key that auto-increments. A primary key is a field in a table which uniquely identifies each row in that table. Primary keys must contain unique values. So here, the id will uniquely identify each of our rows. AUTOINCREMENT tells SQLite to automatically assign each new row an id one higher than the last. For example, the first employee will automatically be assigned the id 1, and the second employee will be assigned the id 2. (In SQLite, an `INTEGER PRIMARY KEY` column is filled in automatically even without this keyword; `AUTOINCREMENT` additionally ensures that an id is never reused after a row is deleted.)
> We'll see this in action when we create data for our tables in the next lesson.
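
As a small preview, the sketch below shows how the `id` column tells apart two rows that are otherwise identical. It assumes an in-memory database, uses made-up employee values, and relies on an `INSERT` statement, which is only covered properly in the next lesson.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employees (id INTEGER PRIMARY KEY AUTOINCREMENT, "
    "name TEXT, phone_number TEXT, email TEXT)"
)

# Two rows with identical (hypothetical) values in every column we supply.
duplicate = ("Sam", "555-0100", "sam@example.com")
for _ in range(2):
    conn.execute(
        "INSERT INTO employees (name, phone_number, email) VALUES (?, ?, ?)",
        duplicate,
    )

# We never supplied an id, so SQLite assigned one to each row.
rows = conn.execute("SELECT id, name FROM employees ORDER BY id").fetchall()
ids = [row[0] for row in rows]

print(ids)

conn.close()
```

Because the two ids differ, either row can be looked up or changed on its own even though every other column matches.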

### Conclusion

At the beginning of the lesson we talked about some of the key concepts of SQL databases. Databases are made up of **tables** which store information about a single entity. These tables are made up of **columns** that store different attributes about our data. **Rows** in the table represent individual members of a table. For each individual member of a table, we will have a separate row, and each attribute of that row is in a column. 
In our example, the barbershop database has a table called `employees`. The columns of our `employees` table are attributes of the employees, like name and phone number. 
Each row of our `employees` table will represent one member of this entity. In our next lesson, we will learn how to insert the data, or rows, into the table.