# Chapter 18: SQLite3 Fundamentals

Python ships with `sqlite3` in the standard library, giving you a fully functional
relational database with zero external dependencies. SQLite stores the entire database
in a single file (or in memory), making it perfect for prototyping, testing, embedded
applications, and small-to-medium workloads.

## What You Will Learn
- Connecting to SQLite databases (file-based and in-memory)
- Creating tables with `CREATE TABLE`
- Full CRUD operations: `INSERT`, `SELECT`, `UPDATE`, `DELETE`
- Parameterized queries to prevent SQL injection
- Batch inserts with `executemany()`
- Fetching results: `fetchone()`, `fetchall()`, `fetchmany()`
- Memory-efficient cursor iteration
- DB-API 2.0 (PEP 249) overview

## Connecting to a Database

Use `sqlite3.connect()` to open (or create) a database. The special string
`":memory:"` creates a temporary in-memory database that vanishes when the
connection is closed -- ideal for experimentation and testing.
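
As a minimal sketch of the difference between the two modes, the following writes a row through one connection to a file and reads it back through a second connection, then shows that a fresh in-memory database starts empty. The file name `demo.db`, the `notes` table, and the temporary directory are illustrative assumptions. `conn.execute()` is a `sqlite3` convenience (not part of DB-API 2.0) that creates a cursor for you.

```python
import os
import sqlite3
import tempfile
from contextlib import closing

with tempfile.TemporaryDirectory() as tmp_dir:
    # Hypothetical database file inside a throwaway directory
    db_path = os.path.join(tmp_dir, "demo.db")

    # First connection: create a table and store one row
    with closing(sqlite3.connect(db_path)) as conn:
        conn.execute("CREATE TABLE notes (body TEXT)")
        conn.execute("INSERT INTO notes (body) VALUES (?)", ("persisted",))
        conn.commit()

    # Second connection: the row is still there because it lives in the file
    with closing(sqlite3.connect(db_path)) as conn:
        file_rows = conn.execute("SELECT body FROM notes").fetchall()

# Every new in-memory connection starts with no tables at all
with closing(sqlite3.connect(":memory:")) as conn:
    memory_tables = conn.execute(
        "SELECT name FROM sqlite_master WHERE type='table'"
    ).fetchall()

print(f"Rows read back from file: {file_rows}")
print(f"Tables in a fresh in-memory database: {memory_tables}")
```

`contextlib.closing` is used here so each connection is closed even if a statement raises.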

In [None]:
import sqlite3

# In-memory database -- no file on disk
conn: sqlite3.Connection = sqlite3.connect(":memory:")
print(f"Connection type: {type(conn)}")
print(f"SQLite version:  {sqlite3.sqlite_version}")
print(f"DB-API version:  {sqlite3.apilevel}")

# A cursor executes SQL statements and retrieves results
cur: sqlite3.Cursor = conn.cursor()
print(f"Cursor type:     {type(cur)}")

# File-based database (commented out to keep examples self-contained):
# conn = sqlite3.connect("my_app.db")  # creates the file if it doesn't exist

## Creating Tables with CREATE TABLE

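SQLite stores every value in one of five storage classes: `INTEGER`, `TEXT`, `REAL`,
`BLOB`, and `NULL`. A declared column type acts as a preferred type (an "affinity")
rather than a strict constraint. An `INTEGER PRIMARY KEY` column is an alias for
SQLite's internal row ID, so SQLite assigns the next ID automatically when you omit
the column from an `INSERT`. Without the `AUTOINCREMENT` keyword, the ID of a deleted
row may later be reused.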
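
To make the row ID and column-type behavior concrete, here is a small sketch. The `items` table and its columns are hypothetical; `typeof()` is a built-in SQLite function that reports how a value is actually stored.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, qty INTEGER, label TEXT)")

# Omitting id lets SQLite assign the next row ID
first_id = conn.execute(
    "INSERT INTO items (qty, label) VALUES (?, ?)", (5, "a")
).lastrowid
second_id = conn.execute(
    "INSERT INTO items (qty, label) VALUES (?, ?)", (7, "b")
).lastrowid

# An INTEGER column converts numeric-looking text to an integer,
# but it stores non-numeric text unchanged instead of rejecting it
conn.execute("INSERT INTO items (qty, label) VALUES (?, ?)", ("42", "c"))
conn.execute("INSERT INTO items (qty, label) VALUES (?, ?)", ("many", "d"))

stored_types = conn.execute(
    "SELECT label, typeof(qty) FROM items ORDER BY id"
).fetchall()
conn.close()

print(f"Assigned ids: {first_id}, {second_id}")
print(f"Storage per row: {stored_types}")
```

If you need the database to reject mistyped values, validate them in Python before inserting or add a `CHECK` constraint to the column.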

In [None]:
# Create a 'books' table
cur.execute("""
    CREATE TABLE IF NOT EXISTS books (
        id    INTEGER PRIMARY KEY,
        title TEXT    NOT NULL,
        author TEXT   NOT NULL,
        year  INTEGER,
        price REAL
    )
""")
conn.commit()

# Verify the table was created by querying sqlite_master
cur.execute("SELECT name FROM sqlite_master WHERE type='table'")
tables: list[tuple[str]] = cur.fetchall()
print(f"Tables in database: {[t[0] for t in tables]}")

## INSERT: Adding Rows

Use `INSERT INTO` to add rows. **Always** use parameterized queries (`?` placeholders)
instead of string formatting to prevent SQL injection attacks.
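
Besides `?`, `sqlite3` also accepts named placeholders (`:name`) bound from a dictionary, which can be easier to read when a statement has many parameters. The sketch below assumes a cut-down `books` table and made-up values purely for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE books (id INTEGER PRIMARY KEY, title TEXT NOT NULL, year INTEGER)"
)

# Named placeholders are filled from a dict; the values here are hypothetical
new_book = {"title": "Example Title", "year": 2020}
conn.execute("INSERT INTO books (title, year) VALUES (:title, :year)", new_book)

# A hostile-looking string is compared as plain data, so it matches nothing
hostile = "x' OR '1'='1"
matches = conn.execute(
    "SELECT title FROM books WHERE title = :title", {"title": hostile}
).fetchall()

all_titles = conn.execute("SELECT title FROM books").fetchall()
conn.close()

print(f"Rows matching hostile input: {matches}")
print(f"Rows in table: {all_titles}")
```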

In [None]:
# Single insert with parameterized query (? placeholders)
cur.execute(
    "INSERT INTO books (title, author, year, price) VALUES (?, ?, ?, ?)",
    ("The Pragmatic Programmer", "David Thomas", 1999, 49.99),
)
print(f"Inserted row with id: {cur.lastrowid}")

# DANGER: Never do this -- vulnerable to SQL injection!
# title = "'; DROP TABLE books; --"
# cur.execute(f"INSERT INTO books (title, author, year, price) VALUES ('{title}', ...)")

# Safe: bound parameters are sent as data and are never parsed as SQL
user_input: str = "'; DROP TABLE books; --"
cur.execute(
    "INSERT INTO books (title, author, year, price) VALUES (?, ?, ?, ?)",
    (user_input, "Unknown", 2024, 0.0),
)
print(f"Safely inserted malicious string as data, id: {cur.lastrowid}")
conn.commit()

## executemany(): Batch Inserts

`executemany()` runs one SQL statement once for each parameter tuple in a sequence
or iterator. It is more concise than calling `execute()` in a loop and is usually
faster.
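
The parameters do not have to be a list. As a sketch, the following feeds `executemany()` from a generator so the rows are never all held in a Python list, and wraps the batch in `with conn:` so it is committed as one transaction (or rolled back if an exception occurs). The `squares` table and the `square_rows()` helper are hypothetical.

```python
import sqlite3

def square_rows(limit):
    """Yield (n, n squared) tuples one at a time (hypothetical helper)."""
    for n in range(1, limit + 1):
        yield (n, n * n)

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE squares (n INTEGER PRIMARY KEY, n_squared INTEGER)")

# 'with conn:' commits on success and rolls back if the block raises
with conn:
    cur = conn.executemany(
        "INSERT INTO squares (n, n_squared) VALUES (?, ?)",
        square_rows(1000),
    )

inserted = cur.rowcount  # total rows modified across the whole batch
total = conn.execute("SELECT SUM(n_squared) FROM squares").fetchone()[0]
conn.close()

print(f"Inserted {inserted} rows, sum of squares = {total}")
```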

In [None]:
books_to_insert: list[tuple[str, str, int, float]] = [
    ("Clean Code", "Robert C. Martin", 2008, 39.99),
    ("Design Patterns", "Gang of Four", 1994, 54.99),
    ("Refactoring", "Martin Fowler", 1999, 47.99),
    ("Python Cookbook", "David Beazley", 2013, 59.99),
    ("Fluent Python", "Luciano Ramalho", 2015, 49.99),
]

cur.executemany(
    "INSERT INTO books (title, author, year, price) VALUES (?, ?, ?, ?)",
    books_to_insert,
)
conn.commit()

print(f"Inserted {len(books_to_insert)} books")
print(f"Total rows affected: {cur.rowcount}")

## SELECT: Querying Data

The `SELECT` statement retrieves rows. Use the cursor's fetch methods to
control how results are returned:

| Method | Returns | Use When |
|--------|---------|----------|
| `fetchone()` | One row or `None` | You need exactly one result |
| `fetchall()` | List of all rows | Result set fits in memory |
| `fetchmany(n)` | Up to `n` rows | Processing in batches |
| Cursor iteration | One row at a time | Large result sets |
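
The fetch methods in the table all draw from the same result set, and each call continues where the previous one stopped. A short sketch, assuming a throwaway `nums` table holding the integers 0 through 6:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE nums (n INTEGER)")
conn.executemany("INSERT INTO nums (n) VALUES (?)", [(i,) for i in range(7)])

cur = conn.execute("SELECT n FROM nums ORDER BY n")

first = cur.fetchone()       # consumes row 0
next_two = cur.fetchmany(2)  # consumes rows 1 and 2
rest = cur.fetchall()        # consumes everything that is left
after_end = cur.fetchone()   # nothing left to read

conn.close()

print(f"fetchone():   {first}")
print(f"fetchmany(2): {next_two}")
print(f"fetchall():   {rest}")
print(f"fetchone() after the end: {after_end}")
```

To read the rows again from the start, execute the query again.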

In [None]:
# fetchall() -- get every row at once
cur.execute("SELECT id, title, author, year, price FROM books ORDER BY year")
all_books: list[tuple] = cur.fetchall()

print(f"Total books: {len(all_books)}\n")
for book in all_books:
    book_id, title, author, year, price = book
    print(f"  [{book_id}] {title} by {author} ({year}) - ${price:.2f}")

In [None]:
# fetchone() -- get a single row
cur.execute("SELECT title, price FROM books WHERE id = ?", (1,))
row: tuple | None = cur.fetchone()

if row is not None:
    title, price = row
    print(f"Book 1: {title} (${price:.2f})")
else:
    print("Book not found")

# fetchone() returns None when no match
cur.execute("SELECT title FROM books WHERE id = ?", (9999,))
missing: tuple | None = cur.fetchone()
print(f"Missing book: {missing}")

In [None]:
# fetchmany(n) -- process results in batches
cur.execute("SELECT title FROM books ORDER BY title")

batch_num: int = 1
while True:
    batch: list[tuple] = cur.fetchmany(3)
    if not batch:
        break
    print(f"Batch {batch_num}: {[row[0] for row in batch]}")
    batch_num += 1

## Memory-Efficient Cursor Iteration

For large result sets, iterate directly over the cursor. This processes one row
at a time without loading everything into memory.

In [None]:
# Direct cursor iteration -- memory-efficient for large datasets
cur.execute("SELECT title, price FROM books WHERE price > ? ORDER BY price DESC", (40.0,))

print("Books over $40 (streamed one at a time):")
for title, price in cur:  # iterates row by row
    print(f"  {title}: ${price:.2f}")

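# The cursor is its own iterator: __iter__ returns self, and __next__ yields the
# next row (like fetchone(), but it raises StopIteration at the end instead of returning None)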
print(f"\nCursor supports iteration: {hasattr(cur, '__iter__')}")

## UPDATE and DELETE: Modifying Data

`UPDATE` changes existing rows; `DELETE` removes them. Both support `WHERE`
clauses and parameterized queries. Always check `cursor.rowcount` to verify
how many rows were affected.
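
One way to act on `rowcount` is to turn it into a success flag, since an `UPDATE` or `DELETE` that matches no rows is not an error. The following is a sketch only; the `set_price()` helper and the two-column `books` table are hypothetical.

```python
import sqlite3

def set_price(conn: sqlite3.Connection, title: str, price: float) -> bool:
    """Update a book's price; return True if at least one row changed."""
    cur = conn.execute(
        "UPDATE books SET price = ? WHERE title = ?", (price, title)
    )
    return cur.rowcount > 0

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE books (title TEXT, price REAL)")
conn.execute("INSERT INTO books (title, price) VALUES (?, ?)", ("Clean Code", 39.99))

updated = set_price(conn, "Clean Code", 44.99)    # title exists
not_found = set_price(conn, "No Such Book", 1.0)  # no matching row, no exception

# DELETE behaves the same way: zero matches simply means rowcount == 0
deleted = conn.execute("DELETE FROM books WHERE price > ?", (100.0,)).rowcount
conn.commit()
conn.close()

print(f"updated={updated}, not_found={not_found}, deleted={deleted}")
```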

In [None]:
# UPDATE: change the price of a specific book
cur.execute(
    "UPDATE books SET price = ? WHERE title = ?",
    (44.99, "Clean Code"),
)
print(f"Rows updated: {cur.rowcount}")

# Verify the update
cur.execute("SELECT title, price FROM books WHERE title = ?", ("Clean Code",))
print(f"After update: {cur.fetchone()}")

# DELETE: remove the malicious-string test row
cur.execute("DELETE FROM books WHERE author = ?", ("Unknown",))
print(f"\nRows deleted: {cur.rowcount}")
conn.commit()

# Confirm deletion
cur.execute("SELECT COUNT(*) FROM books")
count: int = cur.fetchone()[0]
print(f"Remaining books: {count}")

## DB-API 2.0 (PEP 249) Overview

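Python's `sqlite3` module conforms to **PEP 249** (DB-API 2.0), the standard
interface for database access in Python. This means the patterns you learn here
carry over to other databases (PostgreSQL via `psycopg2`, MySQL via
`mysql-connector-python`, etc.). Placeholder syntax is one detail that varies by
driver: `psycopg2`, for example, uses `%s` rather than `?`, so check each module's
`paramstyle`.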

### Core DB-API 2.0 Concepts

| Component | Description |
|-----------|-------------|
| `connect()` | Module-level function that returns a `Connection` |
| `Connection` | Represents a database session; manages transactions |
| `Cursor` | Executes SQL and fetches results |
| `execute()` | Run a single SQL statement with optional parameters |
| `executemany()` | Run a statement for each item in a sequence |
| `fetchone/all/many()` | Retrieve query results |
| `commit()` / `rollback()` | Transaction control |
| `close()` | Release the connection (cursors have their own `close()`) |
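
The `commit()` / `rollback()` row is the one the earlier cells exercise least, so here is a minimal sketch of a two-step change that is undone when the second step fails. The `accounts` table, the names, and the balances are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER NOT NULL)"
)
conn.executemany(
    "INSERT INTO accounts (name, balance) VALUES (?, ?)",
    [("alice", 100), ("bob", 50)],
)
conn.commit()

try:
    # Step 1 succeeds and is pending in the current transaction
    conn.execute(
        "UPDATE accounts SET balance = balance - 30 WHERE name = ?", ("alice",)
    )
    # Step 2 violates the NOT NULL constraint and raises IntegrityError
    conn.execute("UPDATE accounts SET balance = NULL WHERE name = ?", ("bob",))
    conn.commit()
except sqlite3.IntegrityError:
    # Undo step 1 as well, restoring the last committed state
    conn.rollback()

# Each row is a (name, balance) pair, so the cursor can feed dict() directly
balances = dict(conn.execute("SELECT name, balance FROM accounts"))
conn.close()

print(f"Balances after rollback: {balances}")
```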

### Module-Level Attributes

In [None]:
# DB-API 2.0 module-level attributes
print(f"apilevel:      {sqlite3.apilevel!r}")       # '2.0'
print(f"threadsafety:  {sqlite3.threadsafety}")      # 1 = threads may share the module; Python 3.11+ reports 0, 1, or 3 depending on the SQLite build
print(f"paramstyle:    {sqlite3.paramstyle!r}")      # 'qmark' means ? placeholders

# Cursor description after a SELECT -- column metadata
cur.execute("SELECT id, title, price FROM books LIMIT 1")
print("\nCursor description (column metadata):")
if cur.description:
    for col in cur.description:
        # DB-API layout: (name, type_code, display_size, internal_size, precision, scale, null_ok)
        # sqlite3 fills in only the name; the other six fields are always None
        print(f"  Column: {col[0]!r}, type_code: {col[1]}")

# Clean up
conn.close()
print("\nConnection closed.")

## Summary

### Key Takeaways

| Topic | What You Learned |
|-------|------------------|
| **Connecting** | `sqlite3.connect(":memory:")` for in-memory, or pass a file path |
| **Creating tables** | `CREATE TABLE IF NOT EXISTS` with typed columns |
| **INSERT** | Always use `?` placeholders to prevent SQL injection |
| **Batch inserts** | `executemany()` is cleaner and faster than looping |
| **SELECT** | `fetchone()`, `fetchall()`, `fetchmany()` for different needs |
| **Cursor iteration** | Iterate directly over the cursor for memory efficiency |
| **UPDATE / DELETE** | Check `rowcount` to verify how many rows were affected |
| **DB-API 2.0** | Standard interface (PEP 249) shared across database drivers |

### Best Practices
- **Always use parameterized queries** -- never use f-strings or `%` formatting with SQL
- **Call `commit()`** after write operations to persist changes
- **Close connections** when done to release resources
- **Use `IF NOT EXISTS`** on `CREATE TABLE` to make scripts re-runnable
- **Prefer cursor iteration** over `fetchall()` for large result sets
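
One detail worth keeping in mind alongside these practices: using a connection in a `with` block manages the transaction (commit on success, rollback on an exception) but does not close the connection. A short sketch with a throwaway table `t`:

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# The with block commits the transaction when it exits normally
with conn:
    conn.execute("CREATE TABLE t (x INTEGER)")
    conn.execute("INSERT INTO t (x) VALUES (?)", (1,))

# The connection is still open and usable after the with block
row_count = conn.execute("SELECT COUNT(*) FROM t").fetchone()[0]

# Closing remains an explicit step
conn.close()

try:
    conn.execute("SELECT 1")
    raised_after_close = False
except sqlite3.ProgrammingError:
    # Using a closed connection raises ProgrammingError
    raised_after_close = True

print(f"Rows visible after the with block: {row_count}")
print(f"Error raised after close(): {raised_after_close}")
```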