# Course 3, Module 3: Advanced Techniques and Best Practices

This final notebook covers advanced topics and best practices that are essential for building robust, scalable, and maintainable applications with Python and PostgreSQL.

We will explore:
1.  **Proper Error Handling**: How to catch and inspect specific database errors.
2.  **Bulk Operations**: How to insert many rows of data at once using `executemany()`, and how to measure what it actually gains you.
3.  **Calling Stored Procedures**: How to execute server-side logic from your Python application.
4.  **Connection Pooling (Conceptual)**: Understanding why connection pooling is critical for performance in real-world applications.

--- 
## Setup

We'll import `psycopg2` and `time` for a performance test, then define our connection details.

In [None]:
import psycopg2
import time

DB_HOST = "localhost"
DB_NAME = "people"
DB_USER = "fahad"
DB_PASS = "secret"

--- 
## 1. Proper Error Handling

So far, we've caught generic `psycopg2.Error`. However, `psycopg2` provides specific exception types, allowing you to handle different errors in different ways. A common example is trying to insert a duplicate value into a column with a `UNIQUE` constraint.

In [None]:
from psycopg2 import errors  # exception classes named after PostgreSQL error codes (psycopg2 >= 2.8)

# Let's reuse the 'users' table, which has a UNIQUE constraint on username
sql_insert = "INSERT INTO users (username, password) VALUES (%s, %s);"

try:
    # If an exception escapes the connection's `with` block, psycopg2 rolls the
    # transaction back, so the handlers below need no explicit conn.rollback().
    with psycopg2.connect(host=DB_HOST, dbname=DB_NAME, user=DB_USER, password=DB_PASS) as conn:
        with conn.cursor() as cur:
            # This first insert works as long as 'testuser' does not exist yet
            cur.execute(sql_insert, ('testuser', 'testpass'))
            print("First insert successful.")

            # This second one fails because 'testuser' is already taken
            cur.execute(sql_insert, ('testuser', 'anotherpass'))
            print("Second insert successful.")  # This line will not be reached
except errors.UniqueViolation as e:
    # Same condition as checking e.pgcode == '23505' (unique_violation) on a generic error
    print(f"Error: The username 'testuser' already exists (SQLSTATE {e.pgcode}). Please choose another.")
except psycopg2.Error as e:
    print(f"A different database error occurred: {e}")
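
The `pgcode` check can be generalised into a small lookup so that each kind of failure gets its own message. The following is a minimal sketch, not part of `psycopg2`: the names `FRIENDLY_MESSAGES`, `describe_db_error`, and `is_retryable` are hypothetical, and the message wording is only illustrative. The SQLSTATE codes themselves come from PostgreSQL's error-code table.

```python
# Hypothetical helper: map PostgreSQL SQLSTATE codes (as found in e.pgcode)
# to messages that are safe to show to an end user.
FRIENDLY_MESSAGES = {
    "23505": "That value already exists. Please choose another.",  # unique_violation
    "23503": "The referenced record does not exist.",              # foreign_key_violation
    "23502": "A required field is missing.",                       # not_null_violation
    "23514": "A value failed a validation rule.",                  # check_violation
}

# Errors that are usually worth retrying in a fresh transaction.
RETRYABLE_CODES = {"40001", "40P01"}  # serialization_failure, deadlock_detected


def describe_db_error(pgcode):
    """Return a user-facing message for a SQLSTATE code (or None)."""
    if pgcode is None:
        # pgcode is None when the error did not come from the server,
        # for example when the connection could not be established.
        return "The database could not be reached."
    if pgcode in FRIENDLY_MESSAGES:
        return FRIENDLY_MESSAGES[pgcode]
    if pgcode.startswith("23"):
        # Class 23 covers every integrity-constraint violation.
        return "The data violates an integrity constraint."
    return f"Unexpected database error (SQLSTATE {pgcode})."


def is_retryable(pgcode):
    """True if the failed transaction can reasonably be attempted again."""
    return pgcode in RETRYABLE_CODES
```

Inside an `except psycopg2.Error as e:` block, you could then call `print(describe_db_error(e.pgcode))` instead of hard-coding one branch per error.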

--- 
## 2. Bulk Operations with `executemany()`

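Inserting thousands of rows one at a time in a loop is slow because every `execute()` call costs a network round trip. `executemany()` accepts the whole list in a single call, which keeps the code shorter. Be aware, though, that the psycopg2 documentation states that its `executemany()` is currently not faster than calling `execute()` in a loop; for real speedups it points to the helpers `psycopg2.extras.execute_batch()` and `psycopg2.extras.execute_values()`.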

Let's create a table and compare the performance.

In [None]:
sql_setup = """
DROP TABLE IF EXISTS perf_test CASCADE;
CREATE TABLE perf_test (
    id INT,
    name TEXT
);
"""
with psycopg2.connect(host=DB_HOST, dbname=DB_NAME, user=DB_USER, password=DB_PASS) as conn:
    with conn.cursor() as cur:
        cur.execute(sql_setup)
print("Table 'perf_test' created.")

# Prepare a large list of data to insert
data_to_insert = [(i, f'User {i}') for i in range(10000)]

In [None]:
print("--- Testing slow method: loop with execute() ---")
start_time = time.time()

with psycopg2.connect(host=DB_HOST, dbname=DB_NAME, user=DB_USER, password=DB_PASS) as conn:
    with conn.cursor() as cur:
        for record in data_to_insert:
            cur.execute("INSERT INTO perf_test (id, name) VALUES (%s, %s);", record)
    conn.commit()

end_time = time.time()
print(f"Looping with execute() took: {end_time - start_time:.4f} seconds")

In [None]:
# Clear the table first
with psycopg2.connect(host=DB_HOST, dbname=DB_NAME, user=DB_USER, password=DB_PASS) as conn:
    with conn.cursor() as cur:
        cur.execute("DELETE FROM perf_test;")

print("--- Testing fast method: executemany() ---")
start_time = time.time()

with psycopg2.connect(host=DB_HOST, dbname=DB_NAME, user=DB_USER, password=DB_PASS) as conn:
    with conn.cursor() as cur:
        cur.executemany("INSERT INTO perf_test (id, name) VALUES (%s, %s);", data_to_insert)
    conn.commit()

end_time = time.time()
print(f"Using executemany() took: {end_time - start_time:.4f} seconds")

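Compare the two timings on your own machine. With psycopg2 the gap is usually small, because `executemany()` still sends one statement per row. For large bulk inserts, `psycopg2.extras.execute_values()` or PostgreSQL's `COPY` is the better choice.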
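
One way to cut the number of round trips using only plain DB-API calls is to pack many rows into a single multi-row `INSERT`. The sketch below assumes the `perf_test (id, name)` table from above. The helper names `iter_batches` and `insert_in_batches` and the default batch size are illustrative assumptions, not part of `psycopg2`.

```python
from itertools import islice


def iter_batches(rows, batch_size):
    """Yield successive lists of at most batch_size rows."""
    iterator = iter(rows)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


def insert_in_batches(cur, rows, batch_size=1000):
    """Insert (id, name) rows into perf_test, one multi-row INSERT per batch.

    `cur` is any DB-API cursor that uses the %s placeholder style.
    Returns the number of statements sent to the server.
    """
    statements = 0
    for batch in iter_batches(rows, batch_size):
        # One "(%s, %s)" group per row; the values stay parameterised.
        placeholders = ", ".join(["(%s, %s)"] * len(batch))
        params = [value for row in batch for value in row]
        cur.execute(
            f"INSERT INTO perf_test (id, name) VALUES {placeholders};",
            params,
        )
        statements += 1
    return statements
```

With the 10,000 rows in `data_to_insert`, `insert_in_batches(cur, data_to_insert)` would send 10 statements instead of 10,000. `psycopg2.extras.execute_values(cur, "INSERT INTO perf_test (id, name) VALUES %s", data_to_insert, page_size=1000)` applies the same idea without the manual string building.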

--- 
## 3. Calling Stored Procedures

You can execute server-side logic (stored procedures/functions) from Python. This is useful for encapsulating complex business logic in the database.

Let's create a simple function that counts the rows in the `users` table and call it from Python.

In [None]:
sql_function = """
CREATE OR REPLACE FUNCTION get_user_count() RETURNS BIGINT AS $$
BEGIN
    RETURN (SELECT COUNT(*) FROM users);
END;
$$ LANGUAGE plpgsql;
"""

with psycopg2.connect(host=DB_HOST, dbname=DB_NAME, user=DB_USER, password=DB_PASS) as conn:
    with conn.cursor() as cur:
        cur.execute(sql_function)
print("Stored function 'get_user_count' created.")

In [None]:
with psycopg2.connect(host=DB_HOST, dbname=DB_NAME, user=DB_USER, password=DB_PASS) as conn:
    with conn.cursor() as cur:
        # The correct way to call a function is with SELECT
        cur.execute("SELECT get_user_count();")
        user_count = cur.fetchone()[0]
        print(f"The stored function reported {user_count} users.")
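
The DB-API also defines `cursor.callproc()`, which psycopg2 implements for PostgreSQL functions. A minimal sketch of a wrapper around it follows; the helper name `call_scalar_function` is an assumption made for illustration.

```python
def call_scalar_function(cur, func_name, args=()):
    """Call a server-side function via callproc() and return its single value.

    `cur` is an open cursor; `args` are passed to the function as parameters.
    Returns None if the function produced no row.
    """
    cur.callproc(func_name, args)
    row = cur.fetchone()
    return row[0] if row is not None else None
```

With an open cursor, `call_scalar_function(cur, "get_user_count")` would return the same count as the `SELECT` above. Note that `callproc()` only works with functions. Procedures created with `CREATE PROCEDURE` (PostgreSQL 11 and later) must be run with a `CALL` statement through `execute()`.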

--- 
## 4. Connection Pooling (Conceptual)

In a real-world application like a web server, you might get hundreds of requests per second. Opening and closing a new database connection for every single request is extremely slow and inefficient. The process of establishing a connection (TCP handshake, authentication) is expensive.

The solution is **Connection Pooling**. A pool is a cache of pre-connected, ready-to-use database connections. When your application needs to talk to the database, it quickly "borrows" a connection from the pool and "returns" it when done. This avoids the expensive connection setup process on every request.

`psycopg2` ships a basic connection pool in the `psycopg2.pool` module: `SimpleConnectionPool` for single-threaded code and `ThreadedConnectionPool` for multi-threaded applications.
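
The borrow-and-return cycle can be sketched as a small context manager. This is one possible pattern, not the only one: `create_pool` and `borrowed_connection` are hypothetical helper names, while `ThreadedConnectionPool`, `getconn()`, and `putconn()` are the real `psycopg2.pool` API.

```python
from contextlib import contextmanager


def create_pool(minconn, maxconn, **conn_kwargs):
    """Build a thread-safe pool; conn_kwargs are the usual connection arguments."""
    # Imported here so borrowed_connection() can be tried without a database driver.
    from psycopg2.pool import ThreadedConnectionPool
    return ThreadedConnectionPool(minconn, maxconn, **conn_kwargs)


@contextmanager
def borrowed_connection(pool):
    """Borrow a connection, commit on success, roll back on error, always return it."""
    conn = pool.getconn()
    try:
        yield conn
        conn.commit()
    except Exception:
        conn.rollback()
        raise
    finally:
        # Hand the connection back so other requests can reuse it.
        pool.putconn(conn)
```

A web application would typically create the pool once at startup, for example `pool = create_pool(1, 10, host=DB_HOST, dbname=DB_NAME, user=DB_USER, password=DB_PASS)`. Each request would then wrap its database work in `with borrowed_connection(pool) as conn:`, and the application would call `pool.closeall()` at shutdown.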

--- 
## Conclusion

This notebook covered key techniques for writing professional-grade database code:

1.  **Handle errors specifically** by catching `psycopg2.Error` and inspecting its properties.
2.  Use **`executemany()`** to insert many rows with one call, measure it against a plain loop, and consider `psycopg2.extras.execute_values()` when you need large performance gains.
3.  Encapsulate logic in the database by **calling stored procedures** from Python.
4.  Understand the necessity of **connection pooling** for scalable applications.