Tutorials:
- http://www.postgresqltutorial.com/
- https://www.postgresql.org/docs/11/tutorial-sql.html

Postgres jupyter kernel: https://github.com/bgschiller/postgres_kernel

`pgspecial` must be installed to run PostgreSQL special (backslash) commands such as `\dt` in this notebook.

Use the `sql` magic and give it a compatible, SQLAlchemy-style connection URI.

In [16]:
%load_ext sql

The sql extension is already loaded. To reload it, use:
  %reload_ext sql


# Initialisation

In [17]:
uri = "postgresql+psycopg2://postgres:postgres@db.postgres.app.com"

In [18]:
%sql {uri}

'Connected: postgres@None'

SQL for table deletion and creation

In [19]:
%sql select current_database()

 * postgresql+psycopg2://postgres:***@db.postgres.app.com
1 rows affected.


current_database
postgres


In [20]:
%%sql

SELECT * FROM pg_tables
WHERE schemaname NOT IN ('pg_catalog', 'information_schema');

 * postgresql+psycopg2://postgres:***@db.postgres.app.com
1 rows affected.


schemaname,tablename,tableowner,tablespace,hasindexes,hasrules,hastriggers,rowsecurity
public,country,postgres,,True,False,False,False


In [21]:
%%sql

DROP TABLE IF EXISTS Person;
DROP TABLE IF EXISTS Country;

 * postgresql+psycopg2://postgres:***@db.postgres.app.com
Done.
Done.


[]

# Create table

**Primary key**, auto-increment (see [post](https://chartio.com/resources/tutorials/how-to-define-an-auto-increment-primary-key-in-postgresql/)). There are two options:
- use the `SERIAL` or `BIGSERIAL` data type when creating a new table;
- create a custom `SEQUENCE`.

In [22]:
%%sql

CREATE TABLE Country (
    country_id varchar(2) PRIMARY KEY,
    country_name varchar(255)
);

CREATE TABLE Person (
    person_id SERIAL PRIMARY KEY,
    person_first varchar(255),
    person_last varchar(255),
    country_id varchar(2) NOT NULL,
    FOREIGN KEY (country_id) REFERENCES Country(country_id),
    CON
);

 * postgresql+psycopg2://postgres:***@db.postgres.app.com
Done.
(psycopg2.errors.SyntaxError) syntax error at or near ")"
LINE 8: );
        ^

[SQL: CREATE TABLE Person (
    person_id SERIAL PRIMARY KEY,
    person_first varchar(255),
    person_last varchar(255),
    country_id varchar(2) NOT NULL,
    FOREIGN KEY (country_id) REFERENCES Country(country_id),
    CON
);]
(Background on this error at: http://sqlalche.me/e/f405)
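The syntax error comes from the dangling `CON` line, apparently the start of an unfinished `CONSTRAINT` clause. Because `Person` was never created, the later `INSERT` and `JOIN` cells fail with `relation "person" does not exist`. Below is a minimal sketch of the corrected schema. It runs against Python's built-in `sqlite3` as a stand-in so that it works without a server. As an assumption for this sketch, `INTEGER PRIMARY KEY` replaces PostgreSQL's `SERIAL`, which SQLite does not have. On PostgreSQL the original statement should parse once the `CON` line and the comma before it are removed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")

conn.executescript("""
CREATE TABLE Country (
    country_id   varchar(2) PRIMARY KEY,
    country_name varchar(255)
);

CREATE TABLE Person (
    person_id    INTEGER PRIMARY KEY,  -- SQLite stand-in for SERIAL (assumption)
    person_first varchar(255),
    person_last  varchar(255),
    country_id   varchar(2) NOT NULL,
    FOREIGN KEY (country_id) REFERENCES Country(country_id)
);
""")

conn.execute("INSERT INTO Country VALUES ('FR', 'France')")
conn.execute(
    "INSERT INTO Person (person_first, person_last, country_id) "
    "VALUES ('Napolean', 'Bonaparte', 'FR')"
)

# 'CU' has not been inserted into Country, so the foreign key rejects this row.
try:
    conn.execute(
        "INSERT INTO Person (person_first, person_last, country_id) "
        "VALUES ('Luis', 'Alvarez', 'CU')"
    )
    fk_enforced = False
except sqlite3.IntegrityError:
    fk_enforced = True

person_count = conn.execute("SELECT COUNT(*) FROM Person").fetchone()[0]
print(fk_enforced, person_count)
```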


List the tables, either with the psql special command `\dt` or by querying *pg_catalog*.

In [23]:
%%sql
\dt

 * postgresql+psycopg2://postgres:***@db.postgres.app.com
1 rows affected.


Schema,Name,Type,Owner
public,country,table,postgres


In [24]:
%%sql
DESCRIBE accounts;

 * postgresql+psycopg2://postgres:***@db.postgres.app.com
(psycopg2.errors.SyntaxError) syntax error at or near "DESCRIBE"
LINE 1: DESCRIBE accounts;
        ^

[SQL: DESCRIBE accounts;]
(Background on this error at: http://sqlalche.me/e/f405)
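`DESCRIBE` is MySQL syntax and is not part of PostgreSQL's SQL. In psql the closest equivalent is `\d table_name`; from code, the standard `information_schema.columns` view gives the same information. The sketch below shows one way to wrap that query for any DB-API cursor that uses psycopg2-style named parameters. The helper name `describe` and its defaults are hypothetical, not something the notebook defines.

```python
# Query against the standard information_schema view; %(name)s is the
# named-parameter style used by psycopg2.
DESCRIBE_SQL = """
SELECT column_name, data_type, is_nullable
FROM information_schema.columns
WHERE table_schema = %(schema)s
  AND table_name = %(table)s
ORDER BY ordinal_position
"""


def describe(cursor, table, schema="public"):
    """Return (column_name, data_type, is_nullable) rows for a table.

    Hypothetical helper: the name is lower-cased because PostgreSQL folds
    unquoted identifiers (CREATE TABLE Country ...) to lower case. A table
    created with a quoted mixed-case name would need the exact spelling.
    """
    cursor.execute(DESCRIBE_SQL, {"schema": schema, "table": table.lower()})
    return cursor.fetchall()
```

With a psycopg2 connection `conn`, this would be called as `describe(conn.cursor(), "Country")`.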


In [25]:
%%sql
SELECT * FROM pg_catalog.pg_tables
WHERE schemaname != 'pg_catalog'
AND schemaname != 'information_schema';

 * postgresql+psycopg2://postgres:***@db.postgres.app.com
1 rows affected.


schemaname,tablename,tableowner,tablespace,hasindexes,hasrules,hastriggers,rowsecurity
public,country,postgres,,True,False,False,False


# Insert rows

In [26]:
%%sql

INSERT INTO Country(country_id, country_name) 
VALUES 
('FR', 'France'),
('CU', 'CUBA');

 * postgresql+psycopg2://postgres:***@db.postgres.app.com
2 rows affected.


[]

The `pg_relation_size()` function returns the size of the table itself, not including indexes or additional objects; use `pg_total_relation_size()` to include them.

In [27]:
%%sql

SELECT pg_size_pretty(pg_relation_size('Country'));

 * postgresql+psycopg2://postgres:***@db.postgres.app.com
1 rows affected.


pg_size_pretty
8192 bytes


In [28]:
%%sql

INSERT INTO Person (person_first, person_last, country_id) 
VALUES 
('Napolean', 'Bonaparte', 'FR'),
('Luis','Alvarez', 'CU');

 * postgresql+psycopg2://postgres:***@db.postgres.app.com
(psycopg2.errors.UndefinedTable) relation "person" does not exist
LINE 1: INSERT INTO Person (person_first, person_last, country_id) 
                    ^

[SQL: INSERT INTO Person (person_first, person_last, country_id) 
VALUES 
('Napolean', 'Bonaparte', 'FR'),
('Luis','Alvarez', 'CU');]
(Background on this error at: http://sqlalche.me/e/f405)


Accessing the RDBMS dictionary.

In [29]:
%%sql

SELECT * FROM pg_catalog.pg_tables
WHERE schemaname != 'pg_catalog';

 * postgresql+psycopg2://postgres:***@db.postgres.app.com
8 rows affected.


schemaname,tablename,tableowner,tablespace,hasindexes,hasrules,hastriggers,rowsecurity
public,country,postgres,,True,False,False,False
information_schema,sql_features,postgres,,False,False,False,False
information_schema,sql_implementation_info,postgres,,False,False,False,False
information_schema,sql_languages,postgres,,False,False,False,False
information_schema,sql_packages,postgres,,False,False,False,False
information_schema,sql_parts,postgres,,False,False,False,False
information_schema,sql_sizing,postgres,,False,False,False,False
information_schema,sql_sizing_profiles,postgres,,False,False,False,False


In [30]:
%%sql

SELECT sql FROM postgres 
WHERE name='Person';

 * postgresql+psycopg2://postgres:***@db.postgres.app.com
(psycopg2.errors.UndefinedTable) relation "postgres" does not exist
LINE 1: SELECT sql FROM postgres 
                        ^

[SQL: SELECT sql FROM postgres 
WHERE name='Person';]
(Background on this error at: http://sqlalche.me/e/f405)
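The query above looks like an adaptation of the SQLite idiom `SELECT sql FROM sqlite_master WHERE name = ...`, which returns the stored `CREATE` statement. PostgreSQL has no relation named `postgres` and does not keep the DDL text in a catalog column, so the query fails. Outside the notebook, `pg_dump --schema-only` is the usual way to recover it. For comparison, here is a minimal sketch of the SQLite version using the standard-library `sqlite3` module. The simplified `Person` table is an assumption made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Person ("
    "person_id INTEGER PRIMARY KEY, person_first TEXT, person_last TEXT)"
)

# sqlite_master is SQLite's catalog table; its `sql` column holds the DDL text.
ddl = conn.execute(
    "SELECT sql FROM sqlite_master WHERE type = 'table' AND name = ?",
    ("Person",),
).fetchone()[0]
print(ddl)
```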


SQL as a Query Language.

In [31]:
%%sql

SELECT person_first as first, person_last AS last, country_name AS nationality
FROM Person 
INNER JOIN country 
ON Person.country_id = Country.country_id;

 * postgresql+psycopg2://postgres:***@db.postgres.app.com
(psycopg2.errors.UndefinedTable) relation "person" does not exist
LINE 2: FROM Person 
             ^

[SQL: SELECT person_first as first, person_last AS last, country_name AS nationality
FROM Person 
INNER JOIN country 
ON Person.country_id = Country.country_id;]
(Background on this error at: http://sqlalche.me/e/f405)
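The join itself is fine; it fails only because `Person` was never created. To show what it returns once both tables exist, here is a small sketch that runs the same query against an in-memory SQLite database through Python's `sqlite3`. The SQLite column types are an assumption standing in for the PostgreSQL ones, and the rows are the ones inserted earlier in this notebook.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Country (country_id TEXT PRIMARY KEY, country_name TEXT);
CREATE TABLE Person (
    person_id    INTEGER PRIMARY KEY,
    person_first TEXT,
    person_last  TEXT,
    country_id   TEXT NOT NULL REFERENCES Country(country_id)
);
INSERT INTO Country VALUES ('FR', 'France'), ('CU', 'CUBA');
INSERT INTO Person (person_first, person_last, country_id)
VALUES ('Napolean', 'Bonaparte', 'FR'), ('Luis', 'Alvarez', 'CU');
""")

# Same query as the cell above, with an ORDER BY so the row order is stable.
rows = conn.execute("""
SELECT person_first AS first, person_last AS last, country_name AS nationality
FROM Person
INNER JOIN Country ON Person.country_id = Country.country_id
ORDER BY Person.person_id
""").fetchall()

for row in rows:
    print(row)
```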


Visualizing the entity-relationship diagram (ERD).

In [12]:
import ibis

In [13]:
import eralchemy

![](erd_from_sqlalchemy.png)

## Homework walk-through

Convert the flat file data in `data/flat.csv` into a well-structured relational database in SQLite3 stored as `data/faculty.db`. Note: salary information is confidential and should be kept in a separate table from other personal data.
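One possible decomposition, sketched with only the standard library (`csv` and `sqlite3`), is shown below. The table and column names are hypothetical choices, not a prescribed solution, and the three inline rows are copied from the `flat.sample(3)` output shown below, without the unnamed index column. The database is in-memory so the sketch is self-contained; the homework would connect to `data/faculty.db` instead.

```python
import csv
import io
import sqlite3

# Three rows from the notebook's sample output; the real file is larger.
SAMPLE = """\
name,gender,age,height,weight,salary,nationality,code,country,language1,language2,language3,first,last
Lynwood Pope,Male,20,1.51,40,88000,Jordanian,JO,Jordan,ASP,Scala,,Lynwood,Pope
Aaron Kirby,Male,59,1.69,43,80000,Spanish,SP,Spain,Falcon,haXe,GNU Octave,Aaron,Kirby
Clarita Carver,Female,35,1.66,71,84000,Finnish,FI,Finland,Prolog,Erlang,Smalltalk,Clarita,Carver
"""

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE country (
    country_id   TEXT PRIMARY KEY,
    country_name TEXT,
    nationality  TEXT
);
CREATE TABLE person (
    person_id  INTEGER PRIMARY KEY,
    first TEXT, last TEXT, gender TEXT,
    age INTEGER, height REAL, weight REAL,
    country_id TEXT NOT NULL REFERENCES country(country_id)
);
-- Salary lives in its own table so access to it can be restricted.
CREATE TABLE salary (
    person_id INTEGER PRIMARY KEY REFERENCES person(person_id),
    salary    INTEGER
);
-- language1..language3 become one row per (person, language).
CREATE TABLE person_language (
    person_id INTEGER NOT NULL REFERENCES person(person_id),
    language  TEXT NOT NULL,
    PRIMARY KEY (person_id, language)
);
""")

for row in csv.DictReader(io.StringIO(SAMPLE)):
    # OR IGNORE keeps the first row seen for each country code.
    conn.execute(
        "INSERT OR IGNORE INTO country VALUES (?, ?, ?)",
        (row["code"], row["country"], row["nationality"]),
    )
    cur = conn.execute(
        "INSERT INTO person (first, last, gender, age, height, weight, country_id) "
        "VALUES (?, ?, ?, ?, ?, ?, ?)",
        (row["first"], row["last"], row["gender"], int(row["age"]),
         float(row["height"]), float(row["weight"]), row["code"]),
    )
    person_id = cur.lastrowid
    conn.execute("INSERT INTO salary VALUES (?, ?)", (person_id, int(row["salary"])))
    for key in ("language1", "language2", "language3"):
        if row[key]:  # skip empty language slots
            conn.execute(
                "INSERT OR IGNORE INTO person_language VALUES (?, ?)",
                (person_id, row[key]),
            )
conn.commit()

counts = {
    table: conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]
    for table in ("country", "person", "salary", "person_language")
}
print(counts)
```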

In [32]:
import pandas as pd

In [33]:
flat = pd.read_csv('../data/flat.csv', keep_default_na=False)
flat.sample(3)

Unnamed: 0,name,gender,age,height,weight,salary,nationality,code,country,language1,language2,language3,first,last
915,Lynwood Pope,Male,20,1.51,40,88000,Jordanian,JO,Jordan,ASP,Scala,,Lynwood,Pope
1,Aaron Kirby,Male,59,1.69,43,80000,Spanish,SP,Spain,Falcon,haXe,GNU Octave,Aaron,Kirby
239,Clarita Carver,Female,35,1.66,71,84000,Finnish,FI,Finland,Prolog,Erlang,Smalltalk,Clarita,Carver


In [34]:
%%sql

USE faculty;

 * postgresql+psycopg2://postgres:***@db.postgres.app.com
(psycopg2.errors.SyntaxError) syntax error at or near "USE"
LINE 1: USE faculty;
        ^

[SQL: USE faculty;]
(Background on this error at: http://sqlalche.me/e/f405)
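`USE` is MySQL-style syntax. A PostgreSQL connection is bound to a single database, so switching databases means reconnecting: `\c faculty` in psql, or a connection URI whose path names the database when using the `%sql` magic. The sketch below derives such a URI with `urllib.parse`. The helper `with_database` is hypothetical, and it assumes a database called `faculty` already exists on the server.

```python
from urllib.parse import urlsplit, urlunsplit


def with_database(uri, database):
    """Return `uri` with its path replaced by `/<database>` (hypothetical helper)."""
    parts = urlsplit(uri)
    return urlunsplit(parts._replace(path="/" + database))


uri = "postgresql+psycopg2://postgres:postgres@db.postgres.app.com"
faculty_uri = with_database(uri, "faculty")
print(faculty_uri)
```

In the notebook, `%sql {faculty_uri}` would then open the new connection.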


In [22]:
%%sql

DROP TABLE IF EXISTS Person;
DROP TABLE IF EXISTS Country;

CREATE TABLE Country (
    country_id varchar(2) PRIMARY KEY,
    country_name varchar(255)
);

CREATE TABLE Person (
    person_id SERIAL PRIMARY KEY,
    name varchar(255),
    age INTEGER NOT NULL,
    country_id varchar(2) NOT NULL,
      FOREIGN KEY (country_id) REFERENCES Country(country_id)
);

 * postgresql+psycopg2://postgres:***@db.postgres.app.com/postgres
Done.
Done.
Done.
Done.


[]

In [23]:
%%sql

INSERT INTO Country(country_id, country_name) 
VALUES ('FR', 'France'), ('CU', 'CUBA');

 * postgresql+psycopg2://postgres:***@db.postgres.app.com/postgres
2 rows affected.


[]

In [24]:
%%sql
DELETE FROM Country

 * postgresql+psycopg2://postgres:***@db.postgres.app.com/postgres
2 rows affected.


[]

In [25]:
%%sql
SELECT * FROM Country

 * postgresql+psycopg2://postgres:***@db.postgres.app.com/postgres
0 rows affected.


country_id,country_name


In [26]:
from sqlalchemy import create_engine

In [27]:
engine = create_engine(uri)

In [28]:
conn = engine.connect()

In [29]:
flat.columns

Index(['name', 'gender', 'age', 'height', 'weight', 'salary', 'nationality',
       'code', 'country', 'language1', 'language2', 'language3', 'first',
       'last'],
      dtype='object')

In [30]:
# rename() relabels the index by default; pass columns= to rename columns
flat.rename(columns={'code': 'country_id',
                     'country': 'country_name'},
            inplace=True)


In [31]:
country = flat[['country_id', 'country_name']]

In [103]:
country.set_index('country_id').to_sql('Country', engine, if_exists='append')

ProgrammingError: (psycopg2.errors.UndefinedColumn) column "country_id" of relation "Country" does not exist
LINE 1: INSERT INTO "Country" (country_id, country_name) VALUES ('GB...
                               ^

[SQL: INSERT INTO "Country" (country_id, country_name) VALUES (%(country_id)s, %(country_name)s)]
[parameters: ({'country_id': 'GB', 'country_name': 'United Kingdom'}, {'country_id': 'SP', 'country_name': 'Spain'}, {'country_id': 'IT', 'country_name': 'Italy'}, {'country_id': 'IT', 'country_name': 'Italy'}, {'country_id': 'UY', 'country_name': 'Uruguay'}, {'country_id': 'CM', 'country_name': 'Cambodia'}, {'country_id': 'CM', 'country_name': 'Cameroon'}, {'country_id': 'BE', 'country_name': 'Belgium'}  ... displaying 10 of 1523 total bound parameter sets ...  {'country_id': 'ET', 'country_name': 'Ethiopia'}, {'country_id': 'VE', 'country_name': 'Venezuela'})]
(Background on this error at: http://sqlalche.me/e/f405)

# %%sql
SELECT * FROM Country

In [62]:
flat.to_sql?

Signature:
flat.to_sql(
    name,
    con,
    schema=None,
    if_exists='fail',
    index=True,
    index_label=None,
    chunksize=None,
    dtype=None,
    method=None,
)
Docstring:
Write records stored in a DataFrame to a SQL database.

Databases supported by SQLAlchemy [1]_ are supported. Tables can be
newly created, appended to, or overwritten.

Par

In [None]:
import asyncio

async def coro(n):
    # `n` instead of `int`, so the built-in type is not shadowed
    print(f"running {n}")
    await asyncio.sleep(1)
    print(f"continuing {n}")
    return n

In [109]:
asyncio.run(coro)

RuntimeError: asyncio.run() cannot be called from a running event loop
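Two things go wrong in the call above. `asyncio.run()` expects a coroutine object, so it needs `coro(1)` rather than the function `coro`. Jupyter also already runs an event loop, so inside a notebook cell the coroutine is awaited directly with `await coro(1)`. Below is a minimal sketch of the script form, meant to run outside Jupyter. The parameter is named `n` to avoid shadowing the built-in `int`, and the sleep is shortened for the example.

```python
import asyncio


async def coro(n):
    print(f"running {n}")
    await asyncio.sleep(0.1)  # shortened from 1 second for the example
    print(f"continuing {n}")
    return n


# In a plain script no event loop is running yet, so asyncio.run() can
# create one. Note the call coro(1): it produces the coroutine object.
result = asyncio.run(coro(1))
print(result)
```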