<div style="width: 100%; background-color: #ef7d22; text-align: center">
<br><br>

<h1 style="color: white; font-weight: bold;">
    Python DB-API
</h1>

<br><br> 
</div>


Python defines a standard interface for all SQL relational database systems, called the DB-API.  Most database drivers within the Python ecosystem follow this API standard; any features specific to a particular Relational Database Management System (RDBMS), such as PostgreSQL, are communicated at the SQL level rather than with special Python methods.

The Python Enhancement Proposal (PEP) 249 describes the requirements of the DB-API 2.0.  Each adapter module exposes its degree of support, and its choices among optional features, through module-level attributes such as `apilevel`, `paramstyle`, and `threadsafety`.

## Adapter capabilities
![orange-divider](https://user-images.githubusercontent.com/7065401/98619088-44ab6000-22e1-11eb-8f6d-5532e68ab274.png)

For comparison, let us inspect adapters to an SQLite database and a PostgreSQL database.  Some of the module attributes are encoded compactly; the following tables explain the codes.


| threadsafety | Meaning
|-------------:|:--------------------------------------
| 0            | Threads may not share the module.
| 1            | Threads may share the module, but not connections.
| 2            | Threads may share the module and connections.
| 3            | Threads may share the module, connections and cursors.

---

| paramstyle | Meaning
|-----------:|:----------------------------------------
| qmark      | Question mark style, e.g. ...WHERE name=?
| numeric    | Numeric, positional style, e.g. ...WHERE name=:1
| named      | Named style, e.g. ...WHERE name=:name
| format     | ANSI C printf format codes, e.g. ...WHERE name=%s
| pyformat   | Python extended format codes, e.g. ...WHERE name=%(name)s

In [1]:
import sqlite3
print(f"API level       | {sqlite3.apilevel}")
print(f"Parameter style | {sqlite3.paramstyle}")
print(f"Thread safety   | {sqlite3.threadsafety}")

API level       | 2.0
Parameter style | qmark
Thread safety   | 1


In [2]:
import psycopg2
print(f"API level       | {psycopg2.apilevel}")
print(f"Parameter style | {psycopg2.paramstyle}")
print(f"Thread safety   | {psycopg2.threadsafety}")

API level       | 2.0
Parameter style | pyformat
Thread safety   | 2
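
To make the parameter styles concrete, here is a minimal sketch using an in-memory SQLite database, so it runs without a server.  The `people` table and the variable names are hypothetical and used only for illustration.  The `sqlite3` module accepts the `named` style in addition to the `qmark` style it reports.

```python
import sqlite3

# Hypothetical throwaway database; nothing is written to disk
demo_conn = sqlite3.connect(":memory:")
demo_cur = demo_conn.cursor()
demo_cur.execute("CREATE TABLE people (name TEXT, age INTEGER)")

# qmark style: positional values are supplied as a sequence
demo_cur.execute("INSERT INTO people VALUES (?, ?)", ("Alice", 37))

# named style: values are supplied as a mapping
demo_cur.execute("INSERT INTO people VALUES (:name, :age)",
                 {"name": "Bob", "age": 41})

demo_cur.execute("SELECT name, age FROM people ORDER BY name")
rows = demo_cur.fetchall()
print(rows)
demo_conn.close()
```

In every style the values travel separately from the SQL text, so the adapter handles quoting rather than the application.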


## A connection and cursors
![orange-divider](https://user-images.githubusercontent.com/7065401/98619088-44ab6000-22e1-11eb-8f6d-5532e68ab274.png)

The threadsafety level of our `psycopg2` adapter tells us that we can share a single connection among all the threads we may wish to use, while cursors should remain distinct between threads.  This lesson will not use Python threading, which is a separate course, but we can create multiple cursors in the main thread if we wish.  For this lesson, we simply assume that a database called `ine` exists, and that the configured PostgreSQL user and password will work.

In [3]:
user = 'ine_student'
pwd = 'ine-password'
host = 'localhost'
port = '5432'
db = 'ine'
conn = psycopg2.connect(database=db, host=host, user=user, password=pwd, port=port)

If it is convenient, we can work with multiple cursors.  Keep in mind, however, that a commit or a rollback happens at the connection level.  Still, it may be useful, for example, to create temporary cursors within a function, and only pass around a connection object.

In [4]:
cur = conn.cursor()
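
The relationship between cursors and their connection can be sketched with the standard-library `sqlite3` module, which follows the same DB-API pattern.  This is an illustration under assumptions, not part of the lesson's PostgreSQL session: the in-memory database, the `users` table, and the helper `count_users()` are all hypothetical.

```python
import sqlite3

def count_users(conn):
    """Use a temporary cursor; only the connection is passed around."""
    cursor = conn.cursor()
    try:
        cursor.execute("SELECT count(*) FROM users")
        return cursor.fetchone()[0]
    finally:
        cursor.close()

demo = sqlite3.connect(":memory:")
cur_a = demo.cursor()
cur_b = demo.cursor()

cur_a.execute("CREATE TABLE users (username TEXT)")
demo.commit()

# Two different cursors write within the same connection-level transaction
cur_a.execute("INSERT INTO users VALUES ('Alice')")
cur_b.execute("INSERT INTO users VALUES ('Bob')")
before_rollback = count_users(demo)

# One rollback on the connection discards the work of both cursors
demo.rollback()
after_rollback = count_users(demo)
print(before_rollback, after_rollback)
```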

The main action we perform with a cursor is to `.execute()` SQL statements.

In [5]:
# Create the table with cursor#1
sql_create = """
CREATE TABLE IF NOT EXISTS users (
  user_id SERIAL PRIMARY KEY,
  username VARCHAR(50) UNIQUE NOT NULL,
  password VARCHAR(50) NOT NULL,
  age SMALLINT,
  created_on TIMESTAMP NOT NULL
);
"""
cur.execute('DROP TABLE IF EXISTS users;')
cur.execute(sql_create)

PostgreSQL supports the `IF NOT EXISTS` extension in `CREATE TABLE` statements, so the statement does not fail when the table already exists.  However, if a table already exists, a second `CREATE TABLE` with this option will ignore the field names and data types in the new SQL statement.

At this point, the table exists only inside the open transaction on this connection; nothing is permanent, or visible to other connections, until that transaction is committed.  In fact, when we attempt to commit, it is *possible* that some action by another connection would be inconsistent with ours, and the transaction would be rolled back.  In this case, as in most cases, the commit will succeed.

In [6]:
conn.commit()

As described, a `CREATE TABLE IF NOT EXISTS` can succeed at the query level without altering an existing table.

In [7]:
sql_bad_create = """
CREATE TABLE IF NOT EXISTS users (
  not_an_id SERIAL PRIMARY KEY,
  not_a_user INTEGER UNIQUE NOT NULL,
  not_a_password VARCHAR(30) NOT NULL
);
"""
cur.execute(sql_bad_create)
conn.commit()
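
The same behavior can be reproduced in a self-contained way with SQLite, which also accepts `IF NOT EXISTS`.  This is only a sketch under that assumption: the in-memory database is hypothetical, and `PRAGMA table_info` is SQLite-specific rather than something PostgreSQL provides.

```python
import sqlite3

demo = sqlite3.connect(":memory:")

# First definition actually creates the table
demo.execute("""CREATE TABLE IF NOT EXISTS users (
                  user_id INTEGER PRIMARY KEY,
                  username TEXT NOT NULL)""")

# Second definition is silently skipped because the table exists
demo.execute("""CREATE TABLE IF NOT EXISTS users (
                  not_an_id INTEGER PRIMARY KEY,
                  not_a_user INTEGER NOT NULL)""")

# Column names are the second field of each table_info row
columns = [row[1] for row in demo.execute("PRAGMA table_info(users)")]
print(columns)
demo.close()
```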

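We can check the table structure using a query, and verify which version exists in the database.  Because the query filters only on `table_name`, it also lists the columns of any `users` table that lives in another schema.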

In [8]:
sql_schema = """
SELECT column_name, data_type, character_maximum_length, 
       column_default, is_nullable
FROM INFORMATION_SCHEMA.COLUMNS 
WHERE table_name = 'users';
"""
cur.execute(sql_schema)
cur.fetchall()

[('user_id', 'integer', None, "nextval('users_user_id_seq'::regclass)", 'NO'),
 ('username', 'character varying', 50, None, 'NO'),
 ('password', 'character varying', 50, None, 'NO'),
 ('age', 'smallint', None, None, 'YES'),
 ('created_on', 'timestamp without time zone', None, None, 'NO'),
 ('user_id',
  'integer',
  None,
  "nextval('business.users_user_id_seq'::regclass)",
  'NO'),
 ('username', 'character varying', 50, None, 'NO'),
 ('password', 'character varying', 50, None, 'NO'),
 ('age', 'smallint', None, None, 'YES'),
 ('created_at', 'timestamp with time zone', None, 'CURRENT_TIMESTAMP', 'YES'),
 ('zipcode', 'character', 5, 'NULL::bpchar', 'YES')]

## Working with data
![orange-divider](https://user-images.githubusercontent.com/7065401/98619088-44ab6000-22e1-11eb-8f6d-5532e68ab274.png)

With the table we created above, let us write some data to it.  Remember that the `psycopg2` adapter uses the `pyformat` parameter style.

In [9]:
from datetime import datetime
def add_user(conn, user):
    # We make a new cursor every time this function is called
    # ... would work even if the function was called per-thread
    cursor = conn.cursor()
    user['now'] = datetime.now().isoformat()
    user['age'] = user.get('age')
    sql = """INSERT INTO users (username, password, age, created_on) 
             VALUES (%(username)s, %(password)s, %(age)s, %(now)s)"""
    cursor.execute(sql, user)

We can call this function with user data a few times.

In [10]:
users_info = [
  dict(username='Alice', password='bad_pw_1', age=37),
  dict(username='Bob', password='bad_pw_2'),
  dict(username='Carlos', password='bad_pw_3', age=62)
]
for user_info in users_info:
    add_user(conn, user_info)

So far, so good.  However, these data have not yet been committed to the database; they exist only in the open transaction.  Cursors on the current connection see them as present, but another connection will not yet.

In [11]:
cur.execute("SELECT * FROM users;")
for row in cur:
    print(row)

(1, 'Alice', 'bad_pw_1', 37, datetime.datetime(2020, 11, 30, 16, 27, 30, 115556))
(2, 'Bob', 'bad_pw_2', None, datetime.datetime(2020, 11, 30, 16, 27, 30, 116392))
(3, 'Carlos', 'bad_pw_3', 62, datetime.datetime(2020, 11, 30, 16, 27, 30, 116660))


In [12]:
conn2 = psycopg2.connect(database=db, host=host, user=user, password=pwd, port=port)
cur2 = conn2.cursor()
cur2.execute("SELECT * FROM users;")
cur2.fetchmany(2)

[]

To make the data available to all connections, we want to commit it.

In [13]:
conn.commit()
cur2.execute("SELECT * FROM users;")
print(cur2.fetchone())
print(next(cur2))

(1, 'Alice', 'bad_pw_1', 37, datetime.datetime(2020, 11, 30, 16, 27, 30, 115556))
(2, 'Bob', 'bad_pw_2', None, datetime.datetime(2020, 11, 30, 16, 27, 30, 116392))
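
The visibility rule shown in the last few cells can also be tried without a PostgreSQL server.  The following is a minimal sketch using two `sqlite3` connections to a temporary file; the file name, the `users` table, and the helper `count_rows()` are hypothetical, and SQLite's locking differs from PostgreSQL's in other respects.

```python
import os
import sqlite3
import tempfile

def count_rows(conn):
    # fetchall() exhausts the statement so no read lock lingers
    return conn.execute("SELECT count(*) FROM users").fetchall()[0][0]

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "demo.db")
    writer = sqlite3.connect(path)
    reader = sqlite3.connect(path)

    writer.execute("CREATE TABLE users (username TEXT)")
    writer.commit()

    # The INSERT implicitly opens a transaction on the writer connection
    writer.execute("INSERT INTO users VALUES ('Alice')")
    seen_by_writer = count_rows(writer)
    seen_before_commit = count_rows(reader)

    writer.commit()
    seen_after_commit = count_rows(reader)

    writer.close()
    reader.close()

print(seen_by_writer, seen_before_commit, seen_after_commit)
```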


## Uncommitted data
![orange-divider](https://user-images.githubusercontent.com/7065401/98619088-44ab6000-22e1-11eb-8f6d-5532e68ab274.png)

A batch of SQL statements may not succeed.  In such a case, we may not wish for *any* of them to be recorded, so we call `.rollback()` on the connection to tell the server to discard the pending transaction.  We might roll back because of a problem the server reports, or we may do so because of something we determine at an application level.

In [14]:
def add_many(conn, users_info):
    try:
        for user_info in users_info:
            if 'password' in user_info['password']:
                raise ValueError(f"Terrible password for {user_info['username']}")
            add_user(conn, user_info)
    except Exception as err:
        conn.rollback()
        print("Transaction rolled back because of:", type(err).__name__)
        print(err)
        return False
    else:
        return True

Perhaps the datatypes are wrong:

In [15]:
users_bad_data = [
    dict(username='Dave', password='insecure_1'),
    dict(username='Erin', password='insecure_2'),
    dict(username='Faythe', password='insecure_3', age="ABC")
]
add_many(conn, users_bad_data)

Transaction rolled back because of: InvalidTextRepresentation
invalid input syntax for type smallint: "ABC"
LINE 2:              VALUES ('Faythe', 'insecure_3', 'ABC', '2020-11...
                                                     ^



False

Or perhaps a uniqueness constraint is violated:

In [16]:
users_dup_data = [
    dict(username='Dave', password='insecure_1'),
    dict(username='Erin', password='insecure_2'),
    dict(username='Carlos', password='bad_pw_4')
]
add_many(conn, users_dup_data)

Transaction rolled back because of: UniqueViolation
duplicate key value violates unique constraint "users_username_key"
DETAIL:  Key (username)=(Carlos) already exists.



False

Or it might be that the application itself is able to exclude some data:

In [17]:
users_app_rules = [
    dict(username='Grace', password='insecure_77'),
    dict(username='Heidi', password='insecure_88'),
    dict(username='Ivan', password='password_55')
]
add_many(conn, users_app_rules)

Transaction rolled back because of: ValueError
Terrible password for Ivan


False
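
The all-or-nothing pattern is not specific to PostgreSQL.  As a rough, self-contained sketch, the same idea with `sqlite3` looks like the following; the function `insert_all_or_nothing()` and the one-column `users` table are hypothetical names chosen for this illustration.

```python
import sqlite3

def insert_all_or_nothing(conn, usernames):
    """Insert every name, or none of them if any insert fails."""
    try:
        for name in usernames:
            conn.execute("INSERT INTO users (username) VALUES (?)", (name,))
    except sqlite3.Error:
        conn.rollback()   # discard the whole pending batch
        return False
    conn.commit()
    return True

demo = sqlite3.connect(":memory:")
demo.execute("CREATE TABLE users (username TEXT UNIQUE NOT NULL)")
demo.commit()

ok_first = insert_all_or_nothing(demo, ["Alice", "Bob"])
# 'Alice' violates the uniqueness constraint, so 'Dave' is discarded too
ok_second = insert_all_or_nothing(demo, ["Dave", "Alice"])

names = [row[0] for row in
         demo.execute("SELECT username FROM users ORDER BY username")]
print(ok_first, ok_second, names)
```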

## Working in batches
![orange-divider](https://user-images.githubusercontent.com/7065401/98619088-44ab6000-22e1-11eb-8f6d-5532e68ab274.png)


For the last few cells, we will configure the connection to AUTOCOMMIT.  That is, every time an insertion is made, a COMMIT is implicitly performed afterwards.

In [18]:
conn.set_session(autocommit=True)

We have seen several ways to fetch the results from a query.  We can use `.fetchone()`, or `.fetchmany()`, or `.fetchall()`.  We can also loop over the cursor object to bind each row, or using the same Python iterator protocol, call `next(cursor)`.
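
As a compact recap, the fetching styles can be combined on a single result set.  The sketch below assumes an in-memory SQLite database with a hypothetical `nums` table; each call consumes rows from where the previous one stopped.

```python
import sqlite3

demo = sqlite3.connect(":memory:")
demo_cur = demo.cursor()
demo_cur.execute("CREATE TABLE nums (n INTEGER)")
for i in range(1, 8):
    demo_cur.execute("INSERT INTO nums VALUES (?)", (i,))

demo_cur.execute("SELECT n FROM nums ORDER BY n")
first = demo_cur.fetchone()      # a single row, as a tuple
pair = demo_cur.fetchmany(2)     # a list with the next two rows
fourth = next(demo_cur)          # iterator protocol, one row
rest = demo_cur.fetchall()       # every remaining row
exhausted = demo_cur.fetchone()  # None once nothing is left
print(first, pair, fourth, rest, exhausted)
demo.close()
```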

A similar batch capability is available for executing statements: `.executemany()`.  In concept, this could be many SELECT queries, but more commonly, it is many INSERT or UPDATE commands.

In [19]:
now = datetime.now().isoformat()
users_more = [
    dict(username='Sybil', password='M7c&sd31&0hA', age=44, created_on=now),
    dict(username='Trudy', password='y9bD6SA2O%$t', age=22, created_on=now),
    dict(username='Vanna', password='9$Ts9HK*3!tR', age=55, created_on=now)
]
sql = """
INSERT INTO users (username, password, age, created_on) 
VALUES (%(username)s, %(password)s, %(age)s, %(created_on)s)
"""
cur.executemany(sql, users_more)

In practice, you probably want to catch exceptions and do conditional rollbacks and remediation around your `.executemany()` calls.  But we assume it succeeded, and was automatically committed.

Querying it again, we can explicitly ask for details on the columns returned by a query, and for the number of rows.

In [20]:
cur.execute('SELECT user_id, username, age FROM users;')
for item in cur.description:
    print(item)
print("Rows returned:", cur.rowcount)

Column(name='user_id', type_code=23)
Column(name='username', type_code=1043)
Column(name='age', type_code=21)
Rows returned: 6
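
One common use of `.description` is to label each row with its column names.  The sketch below assumes an in-memory SQLite database and a hypothetical `users` table.  Adapters differ in how much metadata they fill in, but under the DB-API the first item of each description entry is the column name.

```python
import sqlite3

demo = sqlite3.connect(":memory:")
demo_cur = demo.cursor()
demo_cur.execute("""CREATE TABLE users (
                      user_id INTEGER PRIMARY KEY,
                      username TEXT,
                      age INTEGER)""")
demo_cur.execute("INSERT INTO users (username, age) VALUES ('Alice', 37)")

demo_cur.execute("SELECT user_id, username, age FROM users")
# The first field of each description entry is the column name
column_names = [col[0] for col in demo_cur.description]
records = [dict(zip(column_names, row)) for row in demo_cur.fetchall()]
print(records)
demo.close()
```

Note that `.rowcount` is less portable: PEP 249 allows an adapter to report -1 when the number of rows cannot be determined.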


One thing to notice is that the SERIAL `user_id` column was incremented on the various failures that were not committed.  This makes sense since a unique sequential number has to be assigned before the server can know whether that transaction will be committed.

In [21]:
cur.fetchall()

[(1, 'Alice', 37),
 (2, 'Bob', None),
 (3, 'Carlos', 62),
 (11, 'Sybil', 44),
 (12, 'Trudy', 22),
 (13, 'Vanna', 55)]

## Summary
![orange-divider](https://user-images.githubusercontent.com/7065401/98619088-44ab6000-22e1-11eb-8f6d-5532e68ab274.png)


Using adapters that follow the DB-API requires learning only a few fairly simple APIs, while offering flexibility at the Python level.  Once you have mastered those, everything else you really need to know is specific to PostgreSQL as an RDBMS, and is accessed via SQL interfaces rather than Python functions or methods.