# Experimenting with PostgreSQL

This time we will create a table of movie data in a SQL database. In this case, a PostgreSQL database has already been set up for us to use.

If you look in the `docker-compose.yaml` file you will see how the database settings are defined. They are repeated here for convenience:
- database address & port = postgres & 5432
- database name = ibddb
- database username = ibduser
- database password = datarocks
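
These settings combine into a single connection URL of the form `postgresql://user:password@host:port/dbname`. As a minimal sketch, the URL can be assembled from its parts like this. The names `DB_SETTINGS` and `build_postgres_url` are made up for illustration and are not part of the course setup.

```python
from urllib.parse import quote_plus

# Settings as listed above (they come from docker-compose.yaml).
DB_SETTINGS = {
    "host": "postgres",
    "port": 5432,
    "database": "ibddb",
    "username": "ibduser",
    "password": "datarocks",
}


def build_postgres_url(settings):
    """Assemble a postgresql:// URL from its parts (hypothetical helper)."""
    # quote_plus escapes characters such as '@' or ':' that would otherwise
    # be misread as URL separators if they appeared in the credentials.
    user = quote_plus(settings["username"])
    password = quote_plus(settings["password"])
    return (
        f"postgresql://{user}:{password}"
        f"@{settings['host']}:{settings['port']}/{settings['database']}"
    )


db_url = build_postgres_url(DB_SETTINGS)
print(db_url)
```

Keeping the parts separate makes it easier to change one setting, for example the password, without retyping the whole URL.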

## First we need to establish a connection to the database 

We use the sqlalchemy package for this:

In [31]:
import sqlalchemy as sa

In [3]:
engine = sa.create_engine('postgresql://ibduser:datarocks@postgres:5432/ibddb')

## Creating our database table

Note how this differs from when we were working with Mongo: here we have to declare the datatype that each field in the table will hold at the time the table is created.

In [5]:
create_table_query = '''CREATE TABLE movies
      (ID INT PRIMARY KEY     NOT NULL,
      Title           TEXT    NOT NULL,
      Director        TEXT    NOT NULL,
      Year            INT     NOT NULL,
      BoxOffice         REAL); '''
# Execute a command: this creates a new table
engine.execute(create_table_query)

<sqlalchemy.engine.cursor.LegacyCursorResult at 0x7fddf438a4a0>
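
The `NOT NULL` constraints declared in the `CREATE TABLE` statement are checked on every insert. The sketch below illustrates this with Python's built-in `sqlite3` module and an in-memory database. This is a stand-in chosen so the example runs without the Docker setup, not the PostgreSQL database used in this notebook. SQLite uses `?` placeholders and is more lenient about column types than PostgreSQL, but it rejects a missing `NOT NULL` value in the same way.

```python
import sqlite3

# Stand-in only: an in-memory SQLite database, NOT the PostgreSQL container.
conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE movies
       (ID INT PRIMARY KEY NOT NULL,
        Title     TEXT NOT NULL,
        Director  TEXT NOT NULL,
        Year      INT  NOT NULL,
        BoxOffice REAL)"""
)

insert_sql = (
    "INSERT INTO movies (ID, Title, Director, Year, BoxOffice) "
    "VALUES (?, ?, ?, ?, ?)"
)

# BoxOffice has no NOT NULL constraint, so a missing value is accepted.
conn.execute(insert_sql, (1, "Star Wars", "George Lucas", 1977, None))

# Title is declared NOT NULL, so a missing title is rejected.
try:
    conn.execute(insert_sql, (2, None, "Christopher Nolan", 2020, 363_700_000))
    rejected = False
except sqlite3.IntegrityError as exc:
    rejected = True
    print("Insert rejected:", exc)

# Only the first, valid record made it into the table.
row_count = conn.execute("SELECT COUNT(*) FROM movies").fetchone()[0]
print("Rows in table:", row_count)
```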

## Inserting data into the database

### Creating some fake data records

You will notice we've used a similar form to when we did this in MongoDB. However, this is purely so that it is clear which data goes into which field. You will see that we need to extract the field values when it comes to inserting the data into the database.

In [6]:
sample_movie1 = {
    "Title": "Star Wars",
    "Director": "George Lucas",
    "Year": 1977,
    "BoxOffice": 775_000_000
}

sample_movie2 = {
    "Title": "Tenet",
    "Director": "Christopher Nolan",
    "Year": 2020,
    "BoxOffice": 363_700_000
}

sample_movie3 = {
    "Title": "Batman Begins",
    "Director": "Christopher Nolan",
    "Year": 2005,
    "BoxOffice": 371_900_000
}

### Inserting a single record into the table

We can use the sqlalchemy engine to insert a record into the movies table we created. The values are passed separately from the query text and fill the `%s` placeholders in order.

In [11]:
postgres_insert_query = """ INSERT INTO movies (ID, Title, Director, Year, BoxOffice) VALUES (%s,%s,%s,%s,%s)"""
record_to_insert = (1, *sample_movie1.values())
engine.execute(postgres_insert_query, record_to_insert)

<sqlalchemy.engine.cursor.LegacyCursorResult at 0x7fddf438ac50>
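
Unpacking `sample_movie1.values()` works because the dictionary keys happen to be written in the same order as the columns in the `INSERT` statement. A slightly more defensive sketch picks the values out by name instead. `MOVIE_COLUMNS` and `to_record` are hypothetical names introduced here for illustration.

```python
# Column order expected by the INSERT statement (after ID).
MOVIE_COLUMNS = ("Title", "Director", "Year", "BoxOffice")


def to_record(movie_id, movie):
    """Return (ID, Title, Director, Year, BoxOffice), whatever the key order."""
    return (movie_id, *(movie[column] for column in MOVIE_COLUMNS))


# Same data as sample_movie1, but with the keys deliberately shuffled.
shuffled_movie = {
    "Year": 1977,
    "BoxOffice": 775_000_000,
    "Title": "Star Wars",
    "Director": "George Lucas",
}

record = to_record(1, shuffled_movie)
print(record)
```

A missing key raises a `KeyError` here, which is usually easier to debug than a value silently landing in the wrong column.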

### Inserting multiple records into the table

In a very similar fashion, we can reuse the insert query with a list of records, and they will all be inserted into the table.

In [19]:
records_to_insert = [(2, *sample_movie2.values()), (3, *sample_movie3.values())]
engine.execute(postgres_insert_query, records_to_insert)

<sqlalchemy.engine.cursor.LegacyCursorResult at 0x7fddca3a7bb0>

## Retrieving data from the table

To do this we will switch gears and use the ipython-sql extension, which allows us to call SQL directly in a Jupyter code cell. We still need to use the sqlalchemy engine as the connection point, and there is a little bit of setup.

In [12]:
%load_ext sql

In [13]:
%sql $engine.url

'Connected: ibduser@ibddb'

#### We can now SELECT the first five records from our movies table

In [21]:
%sql SELECT * FROM movies LIMIT 5

 * postgresql://ibduser:***@postgres:5432/ibddb
3 rows affected.


id,title,director,year,boxoffice
1,Star Wars,George Lucas,1977,775000000.0
2,Tenet,Christopher Nolan,2020,363700000.0
3,Tenet,Christopher Nolan,2020,363700000.0


#### Or we could only SELECT movies that came out after 2001

In [32]:
%sql SELECT * FROM movies WHERE year > 2001;

 * postgresql://ibduser:***@postgres:5432/ibddb
2 rows affected.


id,title,director,year,boxoffice
2,Tenet,Christopher Nolan,2020,363700000.0
3,Tenet,Christopher Nolan,2020,363700000.0
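
The same kind of filter can also be run from plain Python, with the cut-off year passed as a bound parameter instead of being typed into the SQL text. The sketch below again uses the built-in `sqlite3` module with an in-memory database as a stand-in for PostgreSQL, loaded with the three sample movies defined earlier. `movies_after` is a hypothetical helper name.

```python
import sqlite3

# Stand-in only: an in-memory SQLite database, NOT the PostgreSQL container.
conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE movies
       (ID INT PRIMARY KEY NOT NULL,
        Title     TEXT NOT NULL,
        Director  TEXT NOT NULL,
        Year      INT  NOT NULL,
        BoxOffice REAL)"""
)

# executemany runs the same INSERT once for each tuple in the list.
conn.executemany(
    "INSERT INTO movies VALUES (?, ?, ?, ?, ?)",
    [
        (1, "Star Wars", "George Lucas", 1977, 775_000_000),
        (2, "Tenet", "Christopher Nolan", 2020, 363_700_000),
        (3, "Batman Begins", "Christopher Nolan", 2005, 371_900_000),
    ],
)


def movies_after(connection, year):
    """Return (Title, Year) for movies released after `year`, oldest first."""
    cursor = connection.execute(
        "SELECT Title, Year FROM movies WHERE Year > ? ORDER BY Year",
        (year,),  # bound parameter: the value is never pasted into the SQL
    )
    return cursor.fetchall()


recent = movies_after(conn, 2001)
for title, year in recent:
    print(year, title)
```

Binding the value keeps the query text fixed, so the same function works for any year and avoids building SQL by string concatenation.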


## Feel free to experiment and play - note that all the data is destroyed when the Docker containers are shut down and deleted