# **Chapter 29: Database Fundamentals for Testers**

---

## **29.1 Introduction to Database Testing**

### **What is Database Testing?**

Database testing is the process of validating that the database layer of an application functions correctly, maintains data integrity, performs efficiently, and ensures data security. While application testing verifies that buttons click and screens display correctly, database testing verifies that data persists, relationships remain valid, and business rules are enforced at the data layer.

**Why Database Testing is Critical:**

1. **Data Integrity:** Ensures that data entered through the UI is stored correctly without corruption or loss
2. **Business Logic Enforcement:** Validates that constraints, triggers, and stored procedures enforce business rules (e.g., "account balance cannot be negative")
3. **Performance:** Catches slow queries that degrade user experience, which matters most where response time is already constrained (for example, mobile apps on slow networks)
4. **Security:** Ensures sensitive data is encrypted, access is controlled, and SQL injection vulnerabilities are prevented
5. **Migration Safety:** Validates that schema changes and data migrations don't corrupt existing data

**The Tester's Role in Database Testing:**
Developers typically focus on writing and optimizing queries; testers verify that:
- Data mapping between UI and database is correct (field names, data types)
- Transactions complete successfully or roll back properly on failure
- Concurrent users don't corrupt shared data
- Backup and recovery mechanisms work as expected

---

## **29.2 Database Architecture**

Understanding database types helps testers choose appropriate validation strategies.

### **29.2.1 Relational Databases (RDBMS)**

**Characteristics:**
- Data organized in tables with predefined schemas
- Uses SQL (Structured Query Language)
- ACID compliance (Atomicity, Consistency, Isolation, Durability)
- Supports complex relationships via foreign keys

**Common RDBMS:**
- **PostgreSQL:** Open-source, advanced features, JSON support
- **MySQL/MariaDB:** Widely used in web applications
- **Oracle:** Enterprise-grade, financial systems
- **Microsoft SQL Server:** .NET ecosystem integration
- **SQLite:** Embedded databases in mobile apps

**Architecture Diagram:**
```
┌─────────────────────────────────────┐
│           Application                │
│    (Mobile/Web/Desktop)             │
└─────────────┬───────────────────────┘
              │ SQL/ORM
┌─────────────▼───────────────────────┐
│    Database Management System        │
│   (PostgreSQL/MySQL/Oracle)          │
│                                     │
│  ┌─────────┐ ┌─────────┐ ┌────────┐ │
│  │  Users  │ │ Orders  │ │Products│ │
│  │  Table  │ │  Table  │ │ Table  │ │
│  └─────────┘ └─────────┘ └────────┘ │
│         │           │                │
│         └─────┬─────┘                │
│        Relationships                 │
│   (Foreign Keys, Indexes)            │
└─────────────────────────────────────┘
```

### **29.2.2 NoSQL Databases**

**Characteristics:**
- Schema-less or flexible schemas
- Designed for horizontal scaling
- Various types: Document, Key-Value, Column-Family, Graph

**Types for Testers:**

**Document Stores (MongoDB, Couchbase):**
Flexible schema: documents in the same collection can have different fields.
```json
{
  "_id": "user123",
  "name": "John Doe",
  "email": "john@example.com",
  "preferences": { "theme": "dark", "notifications": true }
}
```

**Key-Value Stores (Redis, DynamoDB):**
```
Key: session:user123
Value: eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9...
```

**Column-Family (Cassandra, HBase):**
- Optimized for write-heavy operations
- Rows are identified by a key and columns are grouped into column families; rows in the same table can hold different columns

**Testing Implications:**
- **Relational:** Focus on schema validation, referential integrity, complex joins
- **NoSQL:** Focus on data consistency models (eventual vs strong), document structure validation, missing field handling
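
The "missing field handling" point can be sketched as a small structure check that runs over documents fetched from a collection. This is a minimal illustration, not a full schema validator: `REQUIRED_FIELDS` and `find_document_problems` are hypothetical names, and the documents are plain Python dictionaries standing in for whatever a real driver would return.

```python
# Hypothetical schema: required field name -> expected Python type.
REQUIRED_FIELDS = {"_id": str, "name": str, "email": str}


def find_document_problems(doc, required=REQUIRED_FIELDS):
    """Return a list of human-readable problems found in one document."""
    problems = []
    for field, expected_type in required.items():
        if field not in doc:
            problems.append(f"missing field: {field}")
        elif not isinstance(doc[field], expected_type):
            problems.append(
                f"wrong type for {field}: expected {expected_type.__name__}, "
                f"got {type(doc[field]).__name__}"
            )
    return problems


complete = {"_id": "user123", "name": "John Doe", "email": "john@example.com"}
legacy = {"_id": "user456", "name": "Jane Roe"}  # written before 'email' existed

print(find_document_problems(complete))  # []
print(find_document_problems(legacy))    # ['missing field: email']
```

Optional fields such as `preferences` are deliberately left out of the required set, so documents may omit them without being flagged.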

---

## **29.3 SQL for Testers**

SQL (Structured Query Language) is essential for testers to verify data states, create test data, and validate business logic.

### **29.3.1 Basic SQL Commands**

**SELECT - Retrieving Data:**
```sql
-- Basic retrieval
SELECT * FROM users;

-- Specific columns
SELECT username, email, created_at FROM users;

-- Filtered data (WHERE clause)
SELECT * FROM users WHERE status = 'active';

-- Pattern matching
SELECT * FROM products WHERE name LIKE '%iPhone%';

-- Sorting
SELECT * FROM orders ORDER BY created_at DESC;

-- Limiting results (without ORDER BY, which 10 rows come back is not guaranteed)
SELECT * FROM transactions LIMIT 10;

-- Aggregation
SELECT COUNT(*) as total_users FROM users;
SELECT AVG(price) as avg_price FROM products;
SELECT MAX(order_date) as latest_order FROM orders;
```

**INSERT - Creating Test Data:**
```sql
-- Single record
INSERT INTO users (username, email, age, status) 
VALUES ('testuser', 'test@example.com', 25, 'active');

-- Multiple records
INSERT INTO products (name, price, category) VALUES 
('Laptop', 999.99, 'Electronics'),
('Mouse', 29.99, 'Accessories'),
('Keyboard', 79.99, 'Accessories');
```

**UPDATE - Modifying Data:**
```sql
-- Update specific record
UPDATE users 
SET status = 'inactive', last_login = NOW() 
WHERE user_id = 123;

-- Update with condition (BE CAREFUL - always use WHERE)
UPDATE accounts 
SET balance = balance - 100 
WHERE account_id = 456;
```

**DELETE - Removing Data:**
```sql
-- Delete matching records (here, every expired session)
DELETE FROM sessions WHERE expired_at < NOW();

-- NEVER run without WHERE in production!
-- DELETE FROM users; -- This deletes everything!
```
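
To practice these commands without touching a shared environment, one option is an in-memory SQLite database from Python's standard library. The sketch below is an assumption about how you might set up such a sandbox; the table and rows are invented for illustration, and SQLite's dialect differs in places from MySQL and PostgreSQL (for example, it uses `?` placeholders).

```python
import sqlite3

# In-memory database: discarded when the connection closes.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE users ("
    " user_id INTEGER PRIMARY KEY,"
    " username TEXT NOT NULL,"
    " status TEXT NOT NULL)"
)

# INSERT: create test data.
conn.executemany(
    "INSERT INTO users (username, status) VALUES (?, ?)",
    [("alice", "active"), ("bob", "active"), ("carol", "inactive")],
)

# UPDATE: the WHERE clause limits the change to one row.
cur = conn.execute(
    "UPDATE users SET status = 'inactive' WHERE username = ?", ("bob",)
)
updated_rows = cur.rowcount

# DELETE: again scoped by WHERE.
cur = conn.execute("DELETE FROM users WHERE username = ?", ("carol",))
deleted_rows = cur.rowcount

# SELECT: verify the final state.
remaining = conn.execute(
    "SELECT username, status FROM users ORDER BY username"
).fetchall()
print(updated_rows, deleted_rows, remaining)
```

Checking `rowcount` after each UPDATE or DELETE is a cheap way to confirm that a statement touched exactly the rows you intended.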

### **29.3.2 Advanced SQL for Testing**

**JOIN Operations (Validating Relationships):**
```sql
-- INNER JOIN: Only matching records
SELECT users.username, orders.order_id, orders.total
FROM users
INNER JOIN orders ON users.user_id = orders.user_id;

-- LEFT JOIN: All users even without orders
SELECT users.username, COUNT(orders.order_id) as order_count
FROM users
LEFT JOIN orders ON users.user_id = orders.user_id
GROUP BY users.user_id, users.username;

-- Complex validation: users who spent more than $1000 in the last month
-- (DATE_SUB is MySQL syntax; PostgreSQL uses NOW() - INTERVAL '1 month')
SELECT u.username, u.email, SUM(o.total) as total_spent
FROM users u
JOIN orders o ON u.user_id = o.user_id
WHERE o.order_date >= DATE_SUB(NOW(), INTERVAL 1 MONTH)
GROUP BY u.user_id, u.username, u.email
HAVING SUM(o.total) > 1000;
```

**Subqueries (Nested Validation):**
```sql
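-- Find users who haven't placed orders
-- (NOT EXISTS is used instead of NOT IN, which returns no rows
--  at all if orders.user_id contains a NULL)
SELECT * FROM users u
WHERE NOT EXISTS (SELECT 1 FROM orders o WHERE o.user_id = u.user_id);

-- Find all products in the category of the most expensive product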
SELECT * FROM products 
WHERE category = (
    SELECT category FROM products 
    ORDER BY price DESC LIMIT 1
);
```

**Data Integrity Checks:**
```sql
-- Find orphaned records (referential integrity violations)
SELECT o.order_id, o.user_id 
FROM orders o
LEFT JOIN users u ON o.user_id = u.user_id
WHERE u.user_id IS NULL;

-- Find duplicate emails (should be unique)
SELECT email, COUNT(*) as duplicate_count
FROM users
GROUP BY email
HAVING COUNT(*) > 1;  -- repeating the aggregate is portable; aliases in HAVING are not

-- Check for NULL values in required fields
SELECT * FROM users 
WHERE email IS NULL OR username IS NULL;
```
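
The three checks above can be tried against deliberately broken data. The following sketch assumes an in-memory SQLite database with invented rows: one order pointing at a missing user, one duplicated email, and one missing email. No foreign key is declared, so the database does not reject the bad rows and the queries have something to find.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (user_id INTEGER PRIMARY KEY, email TEXT);
    CREATE TABLE orders (order_id INTEGER PRIMARY KEY, user_id INTEGER);
    INSERT INTO users VALUES (1, 'a@example.com'), (2, 'a@example.com'), (3, NULL);
    INSERT INTO orders VALUES (10, 1), (11, 99);  -- user 99 does not exist
""")

# Orphaned orders: child rows whose parent is missing.
orphaned = conn.execute("""
    SELECT o.order_id FROM orders o
    LEFT JOIN users u ON o.user_id = u.user_id
    WHERE u.user_id IS NULL
""").fetchall()

# Duplicate emails: NULLs are excluded so they are reported separately.
duplicates = conn.execute("""
    SELECT email, COUNT(*) FROM users
    WHERE email IS NOT NULL
    GROUP BY email
    HAVING COUNT(*) > 1
""").fetchall()

# Required field left empty.
missing_email = conn.execute(
    "SELECT user_id FROM users WHERE email IS NULL"
).fetchall()

print(orphaned, duplicates, missing_email)
```

Seeding known-bad rows like this is also a useful way to confirm that an integrity query can fail at all before trusting a clean result from it.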

---

## **29.4 Database Relationships and Keys**

Understanding how tables relate is crucial for testing data consistency across the application.

### **29.4.1 Types of Relationships**

**One-to-One (1:1):**
- One user has one profile
- One employee has one desk assignment
```sql
-- User table
CREATE TABLE users (
    user_id INT PRIMARY KEY,
    username VARCHAR(50) NOT NULL
);

-- Profile table with foreign key
CREATE TABLE profiles (
    profile_id INT PRIMARY KEY,
    user_id INT UNIQUE,  -- UNIQUE enforces 1:1
    bio TEXT,
    FOREIGN KEY (user_id) REFERENCES users(user_id)
);
```

**One-to-Many (1:N):**
- One customer has many orders
- One category has many products
```sql
-- Parent table
CREATE TABLE categories (
    category_id INT PRIMARY KEY,
    name VARCHAR(100)
);

-- Child table with foreign key
CREATE TABLE products (
    product_id INT PRIMARY KEY,
    name VARCHAR(200),
    category_id INT,
    FOREIGN KEY (category_id) REFERENCES categories(category_id)
);
```

**Many-to-Many (M:N):**
- Students enroll in many courses; courses have many students
- Products in many orders; orders contain many products
```sql
-- Junction/Link table required
CREATE TABLE students (
    student_id INT PRIMARY KEY,
    name VARCHAR(100)
);

CREATE TABLE courses (
    course_id INT PRIMARY KEY,
    title VARCHAR(200)
);

-- Junction table
CREATE TABLE enrollments (
    student_id INT,
    course_id INT,
    enrollment_date DATE,
    PRIMARY KEY (student_id, course_id),
    FOREIGN KEY (student_id) REFERENCES students(student_id),
    FOREIGN KEY (course_id) REFERENCES courses(course_id)
);
```

### **29.4.2 Keys Explained**

**Primary Key (PK):**
- Uniquely identifies each record
- Cannot contain NULL values
- Typically auto-incrementing integer or UUID
```sql
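CREATE TABLE orders (
    -- AUTO_INCREMENT is MySQL syntax; PostgreSQL uses GENERATED ALWAYS AS IDENTITY
    order_id INT PRIMARY KEY AUTO_INCREMENT
    -- other fields go here, separated by commas
);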

**Foreign Key (FK):**
- Establishes relationship between tables
- Ensures referential integrity
- Prevents deletion of parent if children exist (by default)
```sql
CREATE TABLE order_items (
    item_id INT PRIMARY KEY,
    order_id INT,
    product_id INT,
    FOREIGN KEY (order_id) REFERENCES orders(order_id)
        ON DELETE CASCADE  -- Delete items when order deleted
        ON UPDATE CASCADE  -- Update if order_id changes
);
```

**Composite Key:**
- Primary key consisting of multiple columns
- Common in junction tables
```sql
CREATE TABLE permissions (
    user_id INT,
    resource_id INT,
    permission_level VARCHAR(20),
    PRIMARY KEY (user_id, resource_id)  -- Composite key
);
```

**Unique Key:**
- Ensures no duplicate values in a column; most databases allow multiple NULLs, though SQL Server permits only one
- Different from Primary Key: table can have multiple Unique constraints
```sql
CREATE TABLE users (
    user_id INT PRIMARY KEY,
    email VARCHAR(100) UNIQUE,  -- No duplicate emails
    username VARCHAR(50) UNIQUE
);
```
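
A quick way to see these key rules in action is to attempt inserts that should be rejected. The sketch below assumes SQLite via Python's standard library and reuses the `users` and `permissions` shapes from this section; `is_rejected` is a hypothetical helper, not part of any library.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (
        user_id INTEGER PRIMARY KEY,
        email TEXT UNIQUE
    );
    CREATE TABLE permissions (
        user_id INTEGER,
        resource_id INTEGER,
        permission_level TEXT,
        PRIMARY KEY (user_id, resource_id)
    );
    INSERT INTO users VALUES (1, 'a@example.com');
    INSERT INTO permissions VALUES (1, 100, 'read');
""")


def is_rejected(sql):
    """Return True if the statement violates a key constraint."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True


duplicate_pk = is_rejected("INSERT INTO users VALUES (1, 'b@example.com')")
duplicate_email = is_rejected("INSERT INTO users VALUES (2, 'a@example.com')")
duplicate_pair = is_rejected("INSERT INTO permissions VALUES (1, 100, 'write')")
new_pair = is_rejected("INSERT INTO permissions VALUES (1, 101, 'write')")

print(duplicate_pk, duplicate_email, duplicate_pair, new_pair)
```

The last insert is accepted because a composite key only requires the combination of `user_id` and `resource_id` to be unique, not each column on its own.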

---

## **29.5 CRUD Operations Testing**

CRUD (Create, Read, Update, Delete) represents the four basic operations. Testers must verify each operation maintains data integrity.

### **29.5.1 Testing Create Operations**

**Validation Points:**
- Data appears correctly in database after UI submission
- Auto-generated fields (timestamps, IDs) populate correctly
- Default values apply when fields left blank
- Constraints prevent invalid data (negative prices, future birthdates)

```python
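# Assumes psycopg2 (PostgreSQL); other drivers expose equivalent exception classes
from psycopg2 import IntegrityError
from psycopg2.errors import ForeignKeyViolation


class CreateOperationTesting: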
    """
    Testing Create/INSERT operations
    """
    
    def test_user_creation_cascades_to_database(self, db_connection):
        """
        Verify user registration creates correct DB record
        """
        # UI Action: Register new user
        ui_data = {
            "username": "testuser_2026",
            "email": "test2026@example.com",
            "password": "SecurePass123!"  # Should be hashed in DB
        }
        
        # Execute UI registration (via API or automation)
        user_id = register_user_via_api(ui_data)
        
        # Database Verification
        cursor = db_connection.cursor()
        cursor.execute(
            "SELECT username, email, password_hash, created_at, status "
            "FROM users WHERE user_id = %s",
            (user_id,)
        )
        db_record = cursor.fetchone()
        
        # Assertions
        assert db_record[0] == ui_data["username"], "Username mismatch"
        assert db_record[1] == ui_data["email"], "Email mismatch"
        assert db_record[2] != ui_data["password"], "Password not hashed!"
        assert db_record[3] is not None, "Created timestamp missing"
        assert db_record[4] == "pending_verification", "Default status not applied"
    
    def test_create_enforces_constraints(self, db_connection):
        """
        Verify database rejects invalid data
        """
        # Try to insert duplicate email (unique constraint)
        cursor = db_connection.cursor()
        try:
            cursor.execute(
                "INSERT INTO users (username, email) VALUES (%s, %s)",
                ("newuser", "existing@example.com")  # Existing email
            )
            db_connection.commit()
            assert False, "Should have raised IntegrityError"
        except IntegrityError as e:
            # MySQL reports "Duplicate entry"; PostgreSQL reports "unique constraint"
            assert "Duplicate entry" in str(e) or "unique constraint" in str(e).lower()
            # Clear the failed transaction so the connection is usable again
            db_connection.rollback()

        # Try to insert negative value (check constraint)
        try:
            cursor.execute(
                "INSERT INTO products (name, price) VALUES (%s, %s)",
                ("Test Product", -10.00)
            )
            db_connection.commit()
            assert False, "Should reject negative price"
        except IntegrityError:
            db_connection.rollback()  # Expected: the constraint rejected the row
    
    def test_create_with_relationships(self, db_connection):
        """
        Verify foreign key constraints on creation
        """
        # Try to create order for non-existent user
        cursor = db_connection.cursor()
        try:
            cursor.execute(
                "INSERT INTO orders (user_id, total) VALUES (99999, 100.00)"
            )
            assert False, "Should fail - user doesn't exist"
        except ForeignKeyViolation:
            # Expected; roll back so the statements below run in a clean transaction
            db_connection.rollback()
        
        # Valid creation with existing user
        cursor.execute(
            "INSERT INTO orders (user_id, total, status) VALUES (1, 100.00, 'pending') RETURNING order_id"
        )
        order_id = cursor.fetchone()[0]
        
        # Verify order_items can be created for this order
        cursor.execute(
            "INSERT INTO order_items (order_id, product_id, quantity, price) VALUES (%s, %s, %s, %s)",
            (order_id, 1, 2, 50.00)
        )
        db_connection.commit()
```
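
Two of the validation points above, default values and auto-generated fields, can be checked in isolation by inserting only what the UI would supply and reading back what the database filled in. This is a minimal sketch assuming SQLite and a simplified `users` table; a real schema will define its own defaults.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE users (
        user_id INTEGER PRIMARY KEY AUTOINCREMENT,
        username TEXT NOT NULL,
        status TEXT NOT NULL DEFAULT 'pending_verification',
        created_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
    )
""")

# Insert only the field the "UI" supplies; the database fills in the rest.
cur = conn.execute("INSERT INTO users (username) VALUES (?)", ("testuser_2026",))
new_id = cur.lastrowid

user_id, status, created_at = conn.execute(
    "SELECT user_id, status, created_at FROM users WHERE user_id = ?", (new_id,)
).fetchone()

print(user_id, status, created_at)
```

If the application sets these values itself instead of relying on the database, the same read-back check still applies; only the place where a bug would be fixed changes.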

### **29.5.2 Testing Read Operations**

**Validation Points:**
- Data retrieved matches what was stored
- Filters (WHERE clauses) return correct subsets
- Sorting (ORDER BY) works as expected
- Pagination returns correct page of results
- JOINs return complete, accurate data sets

```python
class ReadOperationTesting:
    def test_search_returns_accurate_results(self, db_connection):
        """
        Verify search functionality queries database correctly
        """
        # Setup: Insert test products
        cursor = db_connection.cursor()
        test_products = [
            ("iPhone 15 Pro", "Electronics", 999.00),
            ("iPhone 15 Case", "Accessories", 29.00),
            ("Samsung Galaxy S24", "Electronics", 899.00)
        ]
        cursor.executemany(
            "INSERT INTO products (name, category, price) VALUES (%s, %s, %s)",
            test_products
        )
        db_connection.commit()
        
        # Execute search query (simulating app's search)
        search_term = "iPhone"
        cursor.execute(
            "SELECT name, category, price FROM products WHERE name LIKE %s ORDER BY price DESC",
            (f"%{search_term}%",)
        )
        results = cursor.fetchall()
        
        # Verify
        assert len(results) == 2, f"Expected 2 results, got {len(results)}"
        assert results[0][0] == "iPhone 15 Pro", "Highest price should be first"
        assert all("iPhone" in row[0] for row in results), "Non-matching product returned"
    
    def test_report_query_accuracy(self, db_connection):
        """
        Test complex reporting queries used in dashboards
        """
        cursor = db_connection.cursor()
        
        # Monthly revenue report query (typical business report)
        query = """
        SELECT 
            DATE_FORMAT(order_date, '%Y-%m') as month,
            COUNT(*) as total_orders,
            SUM(total) as revenue,
            AVG(total) as avg_order_value
        FROM orders
        WHERE order_date >= DATE_SUB(NOW(), INTERVAL 12 MONTH)
        GROUP BY DATE_FORMAT(order_date, '%Y-%m')
        ORDER BY month DESC
        """
        
        cursor.execute(query)
        monthly_data = cursor.fetchall()
        
        # Validate calculations manually for first month
        if monthly_data:
            month, orders, revenue, avg = monthly_data[0]
            
            # Verify: avg should equal revenue/orders
            calculated_avg = revenue / orders
            assert abs(calculated_avg - avg) < 0.01, "Average calculation incorrect"
```
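
Pagination is listed in the validation points but not exercised above. One way to test it is to walk every page and confirm that no row is skipped or repeated. The sketch below assumes SQLite with `LIMIT`/`OFFSET` paging and 25 invented products; `fetch_page` is a hypothetical stand-in for the application's own paging query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (product_id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO products (product_id, name) VALUES (?, ?)",
    [(i, f"Product {i:02d}") for i in range(1, 26)],  # 25 rows
)


def fetch_page(page, page_size=10):
    """Return one page of product IDs; ORDER BY keeps the paging deterministic."""
    offset = (page - 1) * page_size
    rows = conn.execute(
        "SELECT product_id FROM products ORDER BY product_id LIMIT ? OFFSET ?",
        (page_size, offset),
    ).fetchall()
    return [row[0] for row in rows]


pages = [fetch_page(n) for n in (1, 2, 3, 4)]
seen = [pid for page in pages for pid in page]

print([len(page) for page in pages])  # [10, 10, 5, 0]
```

The boundary cases are where paging bugs usually hide: the partially filled last page and the first page past the end.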

### **29.5.3 Testing Update Operations**

**Validation Points:**
- Only intended records are modified
- Updated values are stored correctly
- Update timestamps auto-populate
- Constraints still enforced on updates
- Concurrent updates don't cause lost updates

```python
class UpdateOperationTesting:
    def test_update_preserves_data_integrity(self, db_connection):
        """
        Verify updates don't corrupt related data
        """
        cursor = db_connection.cursor()
        
        # Setup: Create order with one item so there is related data to protect
        cursor.execute("INSERT INTO orders (user_id, total, status) VALUES (1, 100.00, 'pending') RETURNING order_id")
        order_id = cursor.fetchone()[0]
        cursor.execute(
            "INSERT INTO order_items (order_id, product_id, quantity, price) VALUES (%s, %s, %s, %s)",
            (order_id, 1, 2, 50.00)
        )
        db_connection.commit()

        # Update order status
        new_status = "shipped"
        tracking_number = "TRACK123456"

        cursor.execute(
            "UPDATE orders SET status = %s, tracking_number = %s, updated_at = NOW() WHERE order_id = %s",
            (new_status, tracking_number, order_id)
        )
        db_connection.commit()

        # Verify update
        cursor.execute("SELECT status, tracking_number FROM orders WHERE order_id = %s", (order_id,))
        result = cursor.fetchone()

        assert result[0] == new_status
        assert result[1] == tracking_number

        # Verify related records unchanged (the item created in setup still exists)
        cursor.execute("SELECT COUNT(*) FROM order_items WHERE order_id = %s", (order_id,))
        item_count = cursor.fetchone()[0]
        assert item_count == 1, "Update changed related order items unexpectedly!"
    
    def test_concurrent_update_handling(self, db_connection):
        """
        Test that simultaneous updates don't cause race conditions
        """
        import threading

        user_id = 1

        # Setup: start from a known balance of 1000
        cursor = db_connection.cursor()
        cursor.execute("UPDATE accounts SET balance = 1000 WHERE user_id = %s", (user_id,))
        db_connection.commit()

        def update_balance(user_id, amount):
            # Each thread opens its own connection (get_new_connection is the
            # project's connection helper); sharing one connection would
            # serialize the statements and hide real concurrency problems
            conn = get_new_connection()
            try:
                cur = conn.cursor()
                cur.execute("UPDATE accounts SET balance = balance + %s WHERE user_id = %s", (amount, user_id))
                conn.commit()
            finally:
                conn.close()

        # Simulate two simultaneous deposits
        t1 = threading.Thread(target=update_balance, args=(user_id, 100))
        t2 = threading.Thread(target=update_balance, args=(user_id, 200))

        t1.start()
        t2.start()
        t1.join()
        t2.join()

        # Verify final balance (should be 1300, not 1100 or 1200)
        cursor.execute("SELECT balance FROM accounts WHERE user_id = %s", (user_id,))
        final_balance = cursor.fetchone()[0]

        assert final_balance == 1300, f"Race condition detected! Balance: {final_balance}"
```
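
The "lost update" named in the validation points is easier to recognize once you have seen it happen deterministically. The sketch below assumes SQLite and simulates the interleaving on a single connection: two clients both read the balance before either writes. It then repeats the deposits with the arithmetic done inside the UPDATE statement. `fresh_account` and `read_balance` are hypothetical helpers.

```python
import sqlite3


def fresh_account():
    """Create an in-memory database holding one account with balance 1000."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE accounts (user_id INTEGER PRIMARY KEY, balance INTEGER)")
    conn.execute("INSERT INTO accounts VALUES (1, 1000)")
    return conn


def read_balance(conn):
    return conn.execute("SELECT balance FROM accounts WHERE user_id = 1").fetchone()[0]


# Unsafe read-modify-write: both clients read 1000 before either one writes.
conn = fresh_account()
seen_by_a = read_balance(conn)
seen_by_b = read_balance(conn)
conn.execute("UPDATE accounts SET balance = ? WHERE user_id = 1", (seen_by_a + 100,))
conn.execute("UPDATE accounts SET balance = ? WHERE user_id = 1", (seen_by_b + 200,))
unsafe_result = read_balance(conn)  # client A's deposit has been overwritten

# Safe variant: the database computes the new balance inside the UPDATE.
conn = fresh_account()
conn.execute("UPDATE accounts SET balance = balance + ? WHERE user_id = 1", (100,))
conn.execute("UPDATE accounts SET balance = balance + ? WHERE user_id = 1", (200,))
safe_result = read_balance(conn)

print(unsafe_result, safe_result)  # 1200 1300
```

When reviewing application code, a SELECT followed by an UPDATE that writes a value computed in the application is the pattern to look for; it needs row locking or a version check to be safe under concurrency.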

### **29.5.4 Testing Delete Operations**

**Validation Points:**
- Correct record deleted (verify by ID, not just count)
- Cascading deletes work (ON DELETE CASCADE)
- Orphaned records don't remain (if no cascade)
- Soft deletes (if implemented) set flag rather than remove row

```python
class DeleteOperationTesting:
    def test_soft_delete_implementation(self, db_connection):
        """
        Verify soft delete updates flag rather than removing row
        """
        cursor = db_connection.cursor()
        
        # Create test user
        cursor.execute("INSERT INTO users (username) VALUES ('todelete') RETURNING user_id")
        user_id = cursor.fetchone()[0]
        
        # Perform soft delete (UI delete action)
        cursor.execute(
            "UPDATE users SET deleted_at = NOW(), status = 'inactive' WHERE user_id = %s",
            (user_id,)
        )
        db_connection.commit()
        
        # Verify not in active users list
        cursor.execute("SELECT COUNT(*) FROM users WHERE user_id = %s AND deleted_at IS NULL", (user_id,))
        active_count = cursor.fetchone()[0]
        assert active_count == 0, "User still appears as active"
        
        # Verify record exists in database (for audit/recovery)
        cursor.execute("SELECT * FROM users WHERE user_id = %s", (user_id,))
        assert cursor.fetchone() is not None, "Record permanently deleted (not soft deleted)"
    
    def test_cascade_delete_behavior(self, db_connection):
        """
        Verify foreign key cascade rules
        """
        cursor = db_connection.cursor()
        
        # Create user with orders
        cursor.execute("INSERT INTO users (username) VALUES ('cascade_test') RETURNING user_id")
        user_id = cursor.fetchone()[0]
        
        cursor.execute("INSERT INTO orders (user_id, total) VALUES (%s, 50.00) RETURNING order_id", (user_id,))
        order_id = cursor.fetchone()[0]
        
        cursor.execute("INSERT INTO order_items (order_id, product_id, quantity) VALUES (%s, 1, 1)", (order_id,))
        db_connection.commit()
        
        # Delete user (should cascade to orders if configured)
        cursor.execute("DELETE FROM users WHERE user_id = %s", (user_id,))
        db_connection.commit()
        
        # Verify orders also deleted (if ON DELETE CASCADE set)
        cursor.execute("SELECT COUNT(*) FROM orders WHERE user_id = %s", (user_id,))
        order_count = cursor.fetchone()[0]
        
        # This assertion depends on your schema design
        # If cascade: should be 0
        # If restrict: delete should have failed
        assert order_count == 0, "Orphaned orders remain after user deletion"
```
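
Because the expected outcome depends on the foreign key rule, it can help to see both behaviors side by side. The sketch below assumes SQLite, where foreign key enforcement must be switched on per connection, and builds the same two tables once with `ON DELETE CASCADE` and once with `ON DELETE RESTRICT`. `build_db` and `count_orders` are hypothetical helpers.

```python
import sqlite3


def build_db(on_delete):
    """Create users/orders tables whose foreign key uses the given ON DELETE rule."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves enforcement off by default
    conn.execute("CREATE TABLE users (user_id INTEGER PRIMARY KEY)")
    conn.execute(
        "CREATE TABLE orders ("
        " order_id INTEGER PRIMARY KEY,"
        " user_id INTEGER REFERENCES users(user_id) ON DELETE " + on_delete + ")"
    )
    conn.execute("INSERT INTO users VALUES (1)")
    conn.execute("INSERT INTO orders VALUES (10, 1)")
    conn.commit()
    return conn


def count_orders(conn):
    return conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]


# CASCADE: deleting the parent removes the child rows as well.
cascade_db = build_db("CASCADE")
cascade_db.execute("DELETE FROM users WHERE user_id = 1")
orders_after_cascade = count_orders(cascade_db)

# RESTRICT: the delete itself is refused while children exist.
restrict_db = build_db("RESTRICT")
try:
    restrict_db.execute("DELETE FROM users WHERE user_id = 1")
    delete_blocked = False
except sqlite3.IntegrityError:
    delete_blocked = True
orders_after_restrict = count_orders(restrict_db)

print(orders_after_cascade, delete_blocked, orders_after_restrict)  # 0 True 1
```

In a real test suite, pick the assertion that matches the documented schema rule; a test that passes under either behavior verifies nothing.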

---

## **29.6 Data Integrity Testing**

Data integrity ensures accuracy and consistency of data throughout its lifecycle.

### **29.6.1 Types of Data Integrity**

**Entity Integrity:**
- Primary keys must be unique and not NULL
- No duplicate records in primary tables

**Referential Integrity:**
- Foreign keys must reference existing primary keys
- No orphaned child records

**Domain Integrity:**
- Data types must be correct (dates in date fields, numbers in numeric fields)
- Values must be within acceptable ranges (age > 0 and < 150)
- Format validation (emails contain @, phone numbers have valid format)

**User-Defined Integrity:**
- Business rules enforced via constraints (salary >= minimum_wage)
- Custom validation (end_date > start_date)
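
Domain and user-defined rules like the ones above are often implemented as `CHECK` constraints, which makes them directly testable. The sketch below assumes SQLite and a hypothetical `subscriptions` table that combines an age range with the `end_date > start_date` rule; dates are stored as ISO strings so they compare correctly as text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE subscriptions (
        subscription_id INTEGER PRIMARY KEY,
        age INTEGER CHECK (age > 0 AND age < 150),
        start_date TEXT NOT NULL,
        end_date TEXT NOT NULL,
        CHECK (end_date > start_date)
    )
""")


def try_insert(age, start_date, end_date):
    """Return 'accepted' or 'rejected' for one candidate row."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO subscriptions (age, start_date, end_date) VALUES (?, ?, ?)",
                (age, start_date, end_date),
            )
        return "accepted"
    except sqlite3.IntegrityError:
        return "rejected"


results = {
    "valid row": try_insert(30, "2026-01-01", "2026-12-31"),
    "age out of range": try_insert(200, "2026-01-01", "2026-12-31"),
    "end before start": try_insert(30, "2026-12-31", "2026-01-01"),
}
print(results)
```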

### **29.6.2 Integrity Testing Implementation**

```python
class DataIntegrityTesting:
    def test_primary_key_uniqueness(self, db_connection):
        """
        Verify no duplicate primary keys exist
        """
        cursor = db_connection.cursor()
        
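        # Map each table to its primary key column
        # (the table names are plural, so "{table}_id" would not match)
        primary_keys = {
            'users': 'user_id',
            'orders': 'order_id',
            'products': 'product_id',
            'categories': 'category_id'
        }

        for table, pk in primary_keys.items():
            # Identifiers come from the fixed mapping above, never from user input
            cursor.execute(f"""
                SELECT {pk}, COUNT(*) AS duplicate_count
                FROM {table}
                GROUP BY {pk}
                HAVING COUNT(*) > 1
            """)
            duplicates = cursor.fetchall()
            assert len(duplicates) == 0, f"Duplicate PKs found in {table}: {duplicates}"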
    
    def test_foreign_key_integrity(self, db_connection):
        """
        Verify all foreign keys reference valid records
        """
        cursor = db_connection.cursor()
        
        # Check orders reference valid users
        cursor.execute("""
            SELECT o.order_id, o.user_id 
            FROM orders o
            LEFT JOIN users u ON o.user_id = u.user_id
            WHERE u.user_id IS NULL
        """)
        orphaned_orders = cursor.fetchall()
        assert len(orphaned_orders) == 0, f"Orphaned orders found: {orphaned_orders}"
        
        # Check order_items reference valid orders
        cursor.execute("""
            SELECT oi.item_id, oi.order_id
            FROM order_items oi
            LEFT JOIN orders o ON oi.order_id = o.order_id
            WHERE o.order_id IS NULL
        """)
        orphaned_items = cursor.fetchall()
        assert len(orphaned_items) == 0, f"Orphaned order items: {orphaned_items}"
    
    def test_domain_constraints(self, db_connection):
        """
        Verify data fits within valid domains
        """
        cursor = db_connection.cursor()
        
        # Check for invalid dates (future or implausibly old birthdates)
        cursor.execute("""
            SELECT user_id, birth_date 
            FROM user_profiles 
            WHERE birth_date > CURRENT_DATE OR birth_date < '1900-01-01'
        """)
        invalid_dates = cursor.fetchall()
        assert len(invalid_dates) == 0, f"Invalid birth dates: {invalid_dates}"
        
        # Check for negative amounts in financial tables
        cursor.execute("""
            SELECT order_id, total 
            FROM orders 
            WHERE total < 0
        """)
        negative_amounts = cursor.fetchall()
        assert len(negative_amounts) == 0, f"Negative order totals: {negative_amounts}"
        
        # Check email format (basic pattern)
        cursor.execute("""
            SELECT user_id, email 
            FROM users 
            WHERE email NOT LIKE '%@%.%' OR email LIKE '% %'
        """)
        invalid_emails = cursor.fetchall()
        assert len(invalid_emails) == 0, f"Invalid email formats: {invalid_emails}"
    
    def test_not_null_constraints(self, db_connection):
        """
        Verify required fields are never NULL
        """
        cursor = db_connection.cursor()
        
        required_fields = [
            ('users', 'email'),
            ('users', 'created_at'),
            ('orders', 'user_id'),
            ('orders', 'order_date'),
            ('products', 'price')
        ]
        
        for table, field in required_fields:
            cursor.execute(f"""
                SELECT COUNT(*) FROM {table} WHERE {field} IS NULL
            """)
            null_count = cursor.fetchone()[0]
            assert null_count == 0, f"NULL values found in {table}.{field}: {null_count} records"
```

---

## **29.7 Transaction Testing (ACID Properties)**

Transactions ensure database consistency even during failures or concurrent access.

### **29.7.1 ACID Properties**

**Atomicity:** All operations complete successfully or none do (all-or-nothing)

**Consistency:** Database remains in valid state before and after transaction

**Isolation:** Concurrent transactions don't interfere with each other

**Durability:** Committed changes survive system crashes
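
Atomicity is the easiest of the four properties to demonstrate in a few lines. The sketch below assumes SQLite and a hypothetical `transfer` helper: the receiver is credited first, then the sender is debited, and a `CHECK` constraint rejects a negative balance. When the debit fails, the earlier credit must disappear as well.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    " account_id INTEGER PRIMARY KEY,"
    " balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [(1, 1000), (2, 500)])
conn.commit()


def transfer(amount, sender, receiver):
    """Move money between accounts; return True if the transfer committed."""
    try:
        with conn:  # one transaction: commit on success, roll back on any exception
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE account_id = ?",
                (amount, receiver),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE account_id = ?",
                (amount, sender),
            )
        return True
    except sqlite3.IntegrityError:
        return False


def balances():
    return dict(conn.execute("SELECT account_id, balance FROM accounts"))


ok = transfer(300, sender=1, receiver=2)
after_ok = balances()

failed = transfer(5000, sender=1, receiver=2)  # would make the sender negative
after_failed = balances()

print(ok, after_ok)          # True {1: 700, 2: 800}
print(failed, after_failed)  # False {1: 700, 2: 800}
```

The second transfer leaves both balances untouched even though the credit to the receiver had already been executed inside the transaction.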

### **29.7.2 Transaction Testing Scenarios**

```python
class TransactionTesting:
    def test_atomicity_rollback(self, db_connection):
        """
        Verify transaction rolls back completely on error
        """
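        cursor = db_connection.cursor()

        transfer_amount = 500

        # Record the sender's balance before the transfer instead of assuming it
        cursor.execute("SELECT balance FROM accounts WHERE account_id = 1")
        initial_balance = cursor.fetchone()[0]
        db_connection.commit()  # end the read so the transfer runs in its own transaction

        transfer_failed = False
        try:
            # Start transaction
            cursor.execute("START TRANSACTION")

            # Deduct from sender
            cursor.execute(
                "UPDATE accounts SET balance = balance - %s WHERE account_id = 1",
                (transfer_amount,)
            )

            # Attempt to credit a non-existent receiver
            cursor.execute(
                "UPDATE accounts SET balance = balance + %s WHERE account_id = 99999",
                (transfer_amount,)
            )

            # An UPDATE that matches no rows does not raise, so check rowcount
            if cursor.rowcount == 0:
                raise Exception("Receiver account not found")

            cursor.execute("COMMIT")

        except Exception:
            cursor.execute("ROLLBACK")
            transfer_failed = True

        # Without this check the test would pass silently if the transfer committed
        assert transfer_failed, "Transfer to a non-existent account should have failed"

        # Verify sender balance unchanged (atomicity)
        cursor.execute("SELECT balance FROM accounts WHERE account_id = 1")
        current_balance = cursor.fetchone()[0]

        assert current_balance == initial_balance, \
            f"Atomicity violation! Balance changed to {current_balance}"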
    
    def test_isolation_concurrent_access(self, db_connection):
        """
        Test that concurrent transactions are properly isolated
        """
        import threading
        import queue
        import time  # needed for the time.sleep() calls below
        
        results = queue.Queue()
        
        def read_uncommitted_data():
            """Try to read data from uncommitted transaction"""
            try:
                conn = get_new_connection()  # Separate connection
                cursor = conn.cursor()
                
                # Try to read balance while other transaction is modifying
                cursor.execute("SELECT balance FROM accounts WHERE account_id = 1")
                balance = cursor.fetchone()[0]
                results.put(('read', balance))
                
            except Exception as e:
                results.put(('error', str(e)))
        
        def modify_data():
            """Modify data but don't commit immediately"""
            cursor = db_connection.cursor()
            cursor.execute("START TRANSACTION")
            cursor.execute("UPDATE accounts SET balance = 9999 WHERE account_id = 1")
            
            # Hold transaction open
            time.sleep(2)
            
            cursor.execute("ROLLBACK")
            results.put(('modified', True))
        
        # Start modifier thread
        t1 = threading.Thread(target=modify_data)
        t1.start()
        
        time.sleep(0.5)  # Let modification start
        
        # Start reader thread
        t2 = threading.Thread(target=read_uncommitted_data)
        t2.start()
        
        t1.join()
        t2.join()
        
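        # Verify reader didn't see uncommitted data (Isolation)
        # Results arrive in completion order, so inspect every entry
        while not results.empty():
            kind, value = results.get()
            assert kind != 'error', f"Thread failed: {value}"
            if kind == 'read':
                assert value != 9999, "Dirty read occurred - isolation failed"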
    
    def test_durability_after_crash(self, db_connection):
        """
        Verify committed data survives (simulated crash)
        """
        cursor = db_connection.cursor()
        
        # Insert critical data
        cursor.execute(
            "INSERT INTO audit_log (action, details) VALUES (%s, %s) RETURNING log_id",
            ("payment_processed", "Order #12345")
        )
        log_id = cursor.fetchone()[0]
        db_connection.commit()  # Explicit commit
        
        # Close the connection to simulate an application restart.
        # This checks persistence across connections only; a true durability
        # test has to kill the database process or its host mid-workload.
        db_connection.close()

        # Reconnect (simulating application restart)
        new_connection = get_new_connection()
        new_cursor = new_connection.cursor()

        # Verify data persisted (select the column by name, not by position)
        new_cursor.execute("SELECT action FROM audit_log WHERE log_id = %s", (log_id,))
        record = new_cursor.fetchone()

        assert record is not None, "Durability failure - committed data lost"
        assert record[0] == "payment_processed"
```
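
The isolation scenario can also be reproduced without threads or sleeps by driving two connections step by step. The sketch below assumes SQLite with a file-backed database in a temporary directory (an in-memory database is private to one connection). It shows that a second connection does not see an uncommitted change, and does see the value once it is committed.

```python
import os
import sqlite3
import tempfile

# A file-backed database so two independent connections can share it.
db_path = os.path.join(tempfile.mkdtemp(), "isolation_demo.db")

writer = sqlite3.connect(db_path)
writer.execute("CREATE TABLE accounts (account_id INTEGER PRIMARY KEY, balance INTEGER)")
writer.execute("INSERT INTO accounts VALUES (1, 1000)")
writer.commit()

reader = sqlite3.connect(db_path)


def read_balance(conn):
    # fetchall() runs the query to completion so no read lock lingers
    rows = conn.execute("SELECT balance FROM accounts WHERE account_id = 1").fetchall()
    return rows[0][0]


# The writer changes the row but has not committed yet.
writer.execute("UPDATE accounts SET balance = 9999 WHERE account_id = 1")
seen_during = read_balance(reader)

# After a rollback the change never becomes visible.
writer.rollback()
seen_after_rollback = read_balance(reader)

# A committed change is visible to the other connection.
writer.execute("UPDATE accounts SET balance = 1500 WHERE account_id = 1")
writer.commit()
seen_after_commit = read_balance(reader)

reader.close()
writer.close()

print(seen_during, seen_after_rollback, seen_after_commit)  # 1000 1000 1500
```

Server databases offer several isolation levels with different guarantees, so the same step-by-step approach should be repeated against the production engine and its configured level.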

---

## **29.8 Test Data Management**

Effective database testing requires strategies for creating, maintaining, and cleaning test data.

### **29.8.1 Test Data Strategies**

**Factory Pattern for Test Data:**
```python
import faker
from datetime import datetime, timedelta

fake = faker.Faker()

class TestDataFactory:
    """
    Generate realistic test data
    """
    
    @staticmethod
    def create_user(db_connection, **overrides):
        """Generate a test user with realistic data"""
        defaults = {
            'username': fake.user_name(),
            'email': fake.email(),
            'first_name': fake.first_name(),
            'last_name': fake.last_name(),
            'created_at': datetime.now(),
            'status': 'active'
        }
        defaults.update(overrides)
        
        cursor = db_connection.cursor()
        cursor.execute("""
            INSERT INTO users (username, email, first_name, last_name, created_at, status)
            VALUES (%(username)s, %(email)s, %(first_name)s, %(last_name)s, %(created_at)s, %(status)s)
            RETURNING user_id
        """, defaults)
        
        user_id = cursor.fetchone()[0]
        db_connection.commit()
        return user_id
    
    @staticmethod
    def create_order(db_connection, user_id, item_count=3, **overrides):
        """Create an order with items"""
        cursor = db_connection.cursor()
        
        # Create order
        order_data = {
            'user_id': user_id,
            'total': 0,
            'status': 'pending',
            'order_date': datetime.now(),
            **overrides
        }
        
        cursor.execute("""
            INSERT INTO orders (user_id, total, status, order_date)
            VALUES (%(user_id)s, %(total)s, %(status)s, %(order_date)s)
            RETURNING order_id
        """, order_data)
        
        order_id = cursor.fetchone()[0]
        
        # Add order items
        total = 0
        for _ in range(item_count):
            # Round to cents so the stored total matches the sum of the stored line items
            price = round(fake.random.uniform(10, 500), 2)
            quantity = fake.random.randint(1, 5)
            total += price * quantity
            
            # Core Faker has no product_name(); a capitalized random word stands in for one
            cursor.execute("""
                INSERT INTO order_items (order_id, product_name, quantity, unit_price)
                VALUES (%s, %s, %s, %s)
            """, (order_id, fake.word().title(), quantity, price))
        
        # Update order total
        cursor.execute("UPDATE orders SET total = %s WHERE order_id = %s", (round(total, 2), order_id))
        db_connection.commit()
        
        return order_id

# Usage in tests
def test_order_history(db_connection):
    # Setup: Create user with 5 orders
    user_id = TestDataFactory.create_user(db_connection)
    for _ in range(5):
        TestDataFactory.create_order(db_connection, user_id)
    
    # Execute test...
```

**Data Masking for Privacy:**
```python
class DataMasking:
    """
    Anonymize production data for testing
    """
    
    @staticmethod
    def mask_user_data(db_connection):
        """
        Replace sensitive PII with fake data while preserving relationships
        """
        cursor = db_connection.cursor()
        
        # Only the IDs are needed; the real emails are never read into the test process
        cursor.execute("SELECT user_id FROM users")
        users = cursor.fetchall()
        
        for (user_id,) in users:
            # Derive fake values from the ID so masking is repeatable and each row stays unique
            fake_email = f"user{user_id}@test.example.com"
            fake_name = f"TestUser{user_id}"
            
            cursor.execute("""
                UPDATE users 
                SET email = %s, 
                    first_name = %s,
                    last_name = 'Test',
                    phone = '555-0000',
                    ssn = '000-00-0000'
                WHERE user_id = %s
            """, (fake_email, fake_name, user_id))
        
        db_connection.commit()
```
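
After a masking run it is worth confirming that no real address slipped through. The sketch below is one possible check, not part of the routine above: it uses an in-memory `sqlite3` database as a stand-in for the real server, and the helper name `find_unmasked_emails` and the sample rows are illustrative assumptions. It only relies on the `user{user_id}@test.example.com` pattern that the masking code writes.

```python
import sqlite3

MASKED_DOMAIN = "@test.example.com"  # same domain the masking routine writes


def find_unmasked_emails(connection):
    """Return (user_id, email) rows whose email does not follow the masked pattern."""
    rows = connection.execute(
        "SELECT user_id, email FROM users ORDER BY user_id"
    ).fetchall()
    return [
        (user_id, email)
        for user_id, email in rows
        if email != f"user{user_id}{MASKED_DOMAIN}"
    ]


# Stand-in database: one masked row and one row that was missed (made-up data)
connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE users (user_id INTEGER PRIMARY KEY, email TEXT)")
connection.executemany(
    "INSERT INTO users (user_id, email) VALUES (?, ?)",
    [(1, "user1@test.example.com"), (2, "jane.doe@corp.example")],
)

leaks = find_unmasked_emails(connection)
```

A test would typically assert that `leaks` is empty before the masked copy is handed to the test environment.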

### **29.8.2 Database Reset Strategies**

```python
class TestDatabaseManager:
    """
    Manage database state for tests
    """
    __test__ = False  # Name starts with "Test"; tell pytest this is a helper, not a test class
    
    def __init__(self, db_connection):
        self.db_connection = db_connection
        self.savepoint = None
    
    def setup_clean_state(self):
        """
        Ensure clean database before test run
        """
        cursor = self.db_connection.cursor()
        
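        # TRUNCATE is faster than DELETE. In PostgreSQL, RESTART IDENTITY is needed
        # to reset sequences; CASCADE also empties tables that reference these ones.
        tables = ['order_items', 'orders', 'users', 'audit_log']
        cursor.execute(f"TRUNCATE TABLE {', '.join(tables)} RESTART IDENTITY CASCADE")
        
        # Seed with minimal required data. Let the sequence assign the ID (1 after the
        # restart); a hard-coded user_id would leave the sequence behind and collide later.
        cursor.execute("INSERT INTO users (username) VALUES ('admin')")
        self.db_connection.commit()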
    
    def transactional_cleanup(self):
        """
        Mark a savepoint so changes made by the test can be rolled back.
        Test code must not call commit() afterwards: committing ends the
        transaction and discards the savepoint.
        """
        cursor = self.db_connection.cursor()
        cursor.execute("SAVEPOINT test_start")
        self.savepoint = "test_start"
    
    def rollback(self):
        """Rollback to savepoint"""
        if self.savepoint:
            cursor = self.db_connection.cursor()
            cursor.execute(f"ROLLBACK TO SAVEPOINT {self.savepoint}")
            self.db_connection.commit()
            self.savepoint = None
```
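
The rollback strategy can also be wrapped in a context manager so each test undoes its own changes. The following is a minimal sketch under stated assumptions: it uses the standard-library `sqlite3` module with an in-memory database as a stand-in for the real server, and `rolled_back` and `count_users` are hypothetical helper names. It works only if the code inside the block never commits.

```python
import sqlite3
from contextlib import contextmanager


@contextmanager
def rolled_back(connection):
    """Run a block of test code, then undo every uncommitted change it made."""
    try:
        yield connection.cursor()
    finally:
        # Discard everything written since the last commit
        connection.rollback()


def count_users(connection):
    return connection.execute("SELECT COUNT(*) FROM users").fetchone()[0]


connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE users (user_id INTEGER PRIMARY KEY, username TEXT)")
connection.execute("INSERT INTO users (username) VALUES ('admin')")
connection.commit()  # Baseline state shared by all tests

with rolled_back(connection) as cursor:
    cursor.execute("INSERT INTO users (username) VALUES ('temp_user')")
    inside_count = count_users(connection)  # Sees the uncommitted row

after_count = count_users(connection)  # Back to the baseline
```

Rollback is usually quicker than truncating and reseeding, but it cannot isolate code that commits on its own, such as the factory methods shown earlier in this section.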

---

## **29.9 Database vs. Application Testing**

Understanding the boundary between database and application testing prevents duplicate efforts and gaps.

| Aspect | Application Testing | Database Testing |
|--------|---------------------|------------------|
| **Focus** | User workflows, UI interactions | Data storage, relationships, constraints |
| **Tools** | Selenium, Appium, Postman | SQL queries, database clients, profilers |
| **Validation** | "Order appears in my account" | "Order record exists with correct foreign keys, totals calculated correctly" |
| **Performance** | Page load times | Query execution time, index usage |
| **Security** | XSS, CSRF protection | SQL injection, data encryption, access controls |
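
The database-side validation in the table ("correct foreign keys, totals calculated correctly") can be expressed as two queries. This is a sketch only: it runs against an in-memory `sqlite3` database as a stand-in, stores amounts as integer cents, and assumes an `item_id` column on `order_items` that the chapter's schema does not show. The sample rows are made up for illustration.

```python
import sqlite3

# Orders whose stored total differs from the sum of their line items
TOTAL_MISMATCH_SQL = """
    SELECT o.order_id, o.total, COALESCE(SUM(oi.quantity * oi.unit_price), 0) AS computed
    FROM orders o
    LEFT JOIN order_items oi ON oi.order_id = o.order_id
    GROUP BY o.order_id, o.total
    HAVING o.total <> COALESCE(SUM(oi.quantity * oi.unit_price), 0)
    ORDER BY o.order_id
"""

# Line items that point at an order that does not exist
ORPHAN_ITEMS_SQL = """
    SELECT oi.item_id
    FROM order_items oi
    LEFT JOIN orders o ON o.order_id = oi.order_id
    WHERE o.order_id IS NULL
    ORDER BY oi.item_id
"""

connection = sqlite3.connect(":memory:")
connection.executescript("""
    CREATE TABLE orders (order_id INTEGER PRIMARY KEY, total INTEGER);
    CREATE TABLE order_items (
        item_id INTEGER PRIMARY KEY, order_id INTEGER, quantity INTEGER, unit_price INTEGER
    );
    -- Amounts are integer cents to keep the comparison exact
    INSERT INTO orders VALUES (1, 2500), (2, 999);
    INSERT INTO order_items VALUES
        (1, 1, 2, 1000), (2, 1, 1, 500),   -- order 1 sums to 2500 and matches
        (3, 2, 1, 1000),                   -- order 2 sums to 1000 but stores 999
        (4, 99, 1, 100);                   -- order 99 does not exist
""")

mismatched_totals = connection.execute(TOTAL_MISMATCH_SQL).fetchall()
orphaned_items = connection.execute(ORPHAN_ITEMS_SQL).fetchall()
```

Both queries are written so that an empty result means the check passed, which keeps the test assertion to a single comparison against `[]`.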

**Integration Points:**
- **API Testing:** Validates both application logic and database persistence
- **End-to-End Testing:** Implicitly tests database through UI actions
- **Contract Testing:** Ensures application queries match database schema
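
A contract check between application queries and the schema might look like the sketch below. The `EXPECTED_COLUMNS` mapping and the `missing_columns` helper are hypothetical, and the example reads column names through SQLite's `PRAGMA table_info` on an in-memory database; against PostgreSQL the same comparison would normally query `information_schema.columns`.

```python
import sqlite3

# Columns the application's queries are assumed to rely on (hypothetical contract)
EXPECTED_COLUMNS = {
    "users": {"user_id", "username", "email"},
    "orders": {"order_id", "user_id", "total", "status"},
}


def missing_columns(connection, expected):
    """Map each table to the expected columns that the live schema lacks."""
    problems = {}
    for table, columns in expected.items():
        # Table names come from the trusted constant above, never from user input.
        # PRAGMA table_info returns one row per column; index 1 is the column name.
        actual = {row[1] for row in connection.execute(f"PRAGMA table_info({table})")}
        missing = columns - actual
        if missing:
            problems[table] = missing
    return problems


# Stand-in schema where a migration forgot the orders.status column
connection = sqlite3.connect(":memory:")
connection.executescript("""
    CREATE TABLE users (user_id INTEGER PRIMARY KEY, username TEXT, email TEXT);
    CREATE TABLE orders (order_id INTEGER PRIMARY KEY, user_id INTEGER, total INTEGER);
""")

problems = missing_columns(connection, EXPECTED_COLUMNS)
```

Running a check like this after each migration catches a renamed or dropped column before the application hits it at runtime.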

---

## **Chapter Summary**

### **Key Takeaways from Chapter 29:**

**Database Fundamentals:**
- **Relational Databases:** Structured schemas, ACID compliance, SQL-based (PostgreSQL, MySQL, Oracle)
- **NoSQL:** Flexible schemas, horizontal scaling, various models (Document, Key-Value, Column, Graph)
- **Testers must understand:** Both types require validation, but strategies differ (schema vs. document structure)

**SQL for Testers:**
- **SELECT:** Verify data retrieval, filtering, sorting, aggregation (COUNT, AVG, SUM)
- **JOIN:** Validate relationships between tables (INNER, LEFT, verify no orphaned records)
- **INSERT/UPDATE/DELETE:** Test data manipulation with constraint validation
- **Subqueries:** Complex validation scenarios (finding gaps, duplicates, orphans)

**Keys and Relationships:**
- **Primary Keys:** Unique identifiers, never NULL, enforce entity integrity
- **Foreign Keys:** Maintain referential integrity between tables, test CASCADE behaviors
- **Relationships:** 1:1 (unique FK), 1:N (standard FK), M:N (junction tables with composite keys)

**CRUD Testing:**
- **Create:** Verify constraints, defaults, auto-generated fields, relationship validity
- **Read:** Validate search accuracy, report calculations, pagination
- **Update:** Ensure partial updates don't corrupt data, test concurrent modification handling
- **Delete:** Verify cascade rules, soft delete implementations, no orphaned records

**Data Integrity:**
- **Entity Integrity:** No duplicate or NULL primary keys
- **Referential Integrity:** All foreign keys reference valid records
- **Domain Integrity:** Valid data types, ranges, formats (emails, dates)
- **User-Defined:** Business rules enforced (balance ≥ 0, end > start date)

**Transaction Testing (ACID):**
- **Atomicity:** All-or-nothing execution; verify rollback on failure
- **Consistency:** Database constraints maintained after transactions
- **Isolation:** Concurrent transactions don't cause dirty reads or lost updates
- **Durability:** Committed data survives crashes

**Test Data Management:**
- **Factories:** Generate realistic, varied test data programmatically
- **Masking:** Anonymize production data for safe testing (supports privacy requirements such as GDPR)
- **Cleanup:** Truncate vs. transactional rollback strategies for test isolation

**Database vs. Application Testing:**
- Database testing focuses on data correctness, relationships, and constraints
- Application testing validates user-facing functionality
- Both are required: application tests verify "what the user sees," database tests verify "what the system remembers"

---

## **📖 Next Chapter: Chapter 30 - SQL for Testers**

Now that you understand database architecture and CRUD operations, **Chapter 30** will dive deeper into **advanced SQL techniques specifically for testing**.

In **Chapter 30**, you'll master:

- **Complex Joins:** INNER, LEFT, RIGHT, FULL OUTER, CROSS joins, and SELF joins for comprehensive data validation
- **Window Functions:** ROW_NUMBER(), RANK(), LAG(), LEAD() for analyzing data sequences and detecting anomalies
- **Common Table Expressions (CTEs):** Recursive queries for hierarchical data testing (org charts, category trees)
- **Stored Procedures and Functions:** Testing database business logic, input/output validation, and error handling
- **Triggers:** Testing automated actions (audit logs, cascading updates) and verifying trigger execution conditions
- **Indexes and Query Optimization:** Identifying slow queries, testing index effectiveness, and execution plan analysis
- **Data Comparison:** Techniques for comparing data between environments (dev vs. prod), schema diff testing
- **ETL Testing:** Extract, Transform, Load validation for data warehousing and migration projects

**Chapter 30** transforms you from a basic SQL user into a database testing specialist capable of validating complex data scenarios and optimizing database performance.

**Continue to Chapter 30 to master advanced SQL and complete your database testing expertise!**

