# Make a SQL database using `psycopg2`

To create a database in PostgreSQL, you have to issue the command while connected to a database that already exists.

By default every PostgreSQL cluster has a database named `postgres`, so in the code below we connect to that database
before issuing the command to create the new database named in the `MY_DB` variable.

Note: If the database already exists, this will fail with a `DuplicateDatabase` error.
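
One way to avoid that error is to ask the `pg_database` system catalog whether the name is already taken before creating anything. The sketch below is a minimal illustration and not part of the original workflow: `database_exists` is a hypothetical helper name, and `connection` is assumed to be an open `psycopg2` connection to the `postgres` database.

```python
def database_exists(connection, db_name):
    """Return True if a database called `db_name` is listed in pg_database.

    `connection` is assumed to be an open psycopg2 connection.
    """
    cursor = connection.cursor()
    try:
        # %s is a psycopg2 placeholder, so the name is passed as a value
        # instead of being pasted into the SQL text
        cursor.execute("SELECT 1 FROM pg_database WHERE datname = %s;", (db_name,))
        return cursor.fetchone() is not None
    finally:
        cursor.close()
```

With a helper like this, the `CREATE DATABASE` step can be skipped whenever `database_exists(connection, MY_DB)` returns `True`.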

In [1]:
import psycopg2

### Let's define our connection variables and turn them into a "connection string"

We're using an f-string here, a feature introduced in Python 3.6. It's much more readable than the older styles of string substitution.

In [4]:
MY_DB = "teaching_bucket"
UN = "aaron"
PW = "my_password"
HOST = "localhost"

connection_string = f"postgresql://{UN}:{PW}@{HOST}:5432/postgres"
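
A connection string in this URL form breaks if the username or password contains characters such as `@`, `/`, or `:`. A possible way to guard against that is to percent-encode the credentials first. This is a sketch using only the standard library; `build_connection_string` is a hypothetical helper, and the default port of 5432 simply mirrors the value used above.

```python
from urllib.parse import quote


def build_connection_string(user, password, host, database, port=5432):
    """Build a postgresql:// URL with percent-encoded credentials."""
    # safe="" makes quote() encode every reserved character, including "/"
    safe_user = quote(user, safe="")
    safe_password = quote(password, safe="")
    return f"postgresql://{safe_user}:{safe_password}@{host}:{port}/{database}"


print(build_connection_string("aaron", "p@ss/word", "localhost", "postgres"))
# prints postgresql://aaron:p%40ss%2Fword@localhost:5432/postgres
```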

### Connect to the database and execute a SQL command to create a new database

The raw SQL here is `CREATE DATABASE name_of_database;`

The process of doing this from Python includes:
- connect with `psycopg2`
- create a `cursor()` for the connection
- use the cursor to `execute()` a SQL statement
- `close()` the cursor, `commit()` our changes to the connection, then `close()` the connection

Note that we've set the "isolation level" to `ISOLATION_LEVEL_AUTOCOMMIT`. This is only necessary for statements that PostgreSQL refuses to run inside a transaction block, such as `CREATE DATABASE`.

In [5]:
connection = psycopg2.connect(connection_string)
connection.set_isolation_level(psycopg2.extensions.ISOLATION_LEVEL_AUTOCOMMIT)
cursor = connection.cursor()

# Execute the CREATE command
query_to_make_db = f"CREATE DATABASE {MY_DB};"
cursor.execute(query_to_make_db)

# Commit and close the database connection
cursor.close()
connection.commit()
connection.close()
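
Because the database name is pasted into the SQL text with an f-string, a name containing spaces, uppercase letters, or quotes would produce invalid SQL or a differently named database. The sketch below shows the quoting rule PostgreSQL uses for identifiers, written in plain Python; `quote_identifier` is a hypothetical helper, not a `psycopg2` function. In real code the `psycopg2.sql` module (`sql.SQL` and `sql.Identifier`) is likely the better tool for composing such statements.

```python
def quote_identifier(name):
    """Wrap a name in double quotes, doubling any embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


MY_DB = "teaching_bucket"
query_to_make_db = f"CREATE DATABASE {quote_identifier(MY_DB)};"
print(query_to_make_db)
# prints CREATE DATABASE "teaching_bucket";
```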

### Load the `PostGIS` extension into the database

Assuming your PostgreSQL installation has PostGIS available, a single command will enable the spatial functionality within the database.

Note that we're redefining our connection string to point at the new database, not the `postgres` database we used earlier.

In [9]:
connection_string = f"postgresql://{UN}:{PW}@{HOST}:5432/{MY_DB}"
    
connection = psycopg2.connect(connection_string)
cursor = connection.cursor()

# IF NOT EXISTS keeps this cell safe to re-run
cursor.execute("CREATE EXTENSION IF NOT EXISTS postgis;")

cursor.close()
connection.commit()
connection.close()
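
To confirm that the extension is active, one option is to call the `PostGIS_Version()` SQL function, which PostGIS provides. A minimal sketch, assuming `connection` is an open `psycopg2` connection to the new database; `get_postgis_version` is a hypothetical helper name.

```python
def get_postgis_version(connection):
    """Return the string reported by PostGIS_Version().

    The query raises an error from the database if PostGIS is not enabled.
    """
    cursor = connection.cursor()
    try:
        cursor.execute("SELECT PostGIS_Version();")
        (version,) = cursor.fetchone()
        return version
    finally:
        cursor.close()
```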

### Don't repeat yourself!

Imagine a script where you use a lot of SQL commands. Do you really want to type all those cursor, execute, close, and commit commands each time? NO!

Here's a function that handles all of the plumbing for you. All you need to provide is the connection string, the SQL command, and `True` or `False` depending on whether `ISOLATION_LEVEL_AUTOCOMMIT` is needed.

In [10]:
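def execute_sql(connection_str, sql_statement, autocommit=False):
    """Open a connection, run one SQL statement, commit, and clean up."""

    connection = psycopg2.connect(connection_str)
    if autocommit:
        # Needed for statements that cannot run inside a transaction block
        connection.set_isolation_level(psycopg2.extensions.ISOLATION_LEVEL_AUTOCOMMIT)

    try:
        cursor = connection.cursor()
        cursor.execute(sql_statement)
        cursor.close()
        connection.commit()
    finally:
        # Always release the connection, even if the statement fails
        connection.close()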

### Let's use our new function to load the UUID extension

In [11]:
execute_sql(connection_string, 'CREATE EXTENSION IF NOT EXISTS "uuid-ossp";')
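
`execute_sql` opens and closes a connection for every statement, which is convenient but adds overhead when many statements run back to back. A possible variation, sketched here under the assumption that the caller already holds an open `psycopg2` connection, runs a list of statements in one transaction. `run_statements` is a hypothetical name, not part of `psycopg2`.

```python
def run_statements(connection, statements):
    """Run several SQL statements on one already-open connection.

    All statements are committed together; if any of them fails,
    the whole batch is rolled back and the error is re-raised.
    Returns the number of statements executed.
    """
    statements = list(statements)
    cursor = connection.cursor()
    try:
        for statement in statements:
            cursor.execute(statement)
        connection.commit()
    except Exception:
        connection.rollback()
        raise
    finally:
        cursor.close()
    return len(statements)
```

This approach does not suit statements such as `CREATE DATABASE`, which cannot run inside a transaction block.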

# Check out the `postgis-helpers` module

If you find yourself doing a lot of SQL work within Python, you may want to take a look at the [`postgis-helpers` module](https://github.com/aaronfraint/postgis-helpers).

It includes a variety of helper functions for connecting to PostgreSQL databases, loading data in, extracting data out, and modifying data in place.