# GRIT: Data Manipulation - Day 6

**Learning Objectives**
- Master INSERT statements to add new data
- Use UPDATE to modify existing records
- Apply DELETE to remove data safely
- Create new tables with CREATE TABLE
- Modify table structures with ALTER TABLE
- Understand constraints and data integrity
- Use transactions for data safety

**Why this matters**  
Reading data is important, but real database work involves creating, updating, and managing data. Whether you're building a new application, cleaning up old records, or restructuring your database, these data manipulation skills are essential for any data professional.

Today you'll learn to be the architect of your data - creating, modifying, and managing databases like a pro!

## Setup: Connect to Our Database

Let's connect to our e-commerce database:

In [None]:
# Load the SQL extension
%load_ext sql

# Connect to our sample database
%sql sqlite:///ecommerce.db

print("✅ Connected to database!")

## Theory: Data Manipulation Language (DML)

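### The Four Horsemen of Data Manipulation (CRUD):
- **CREATE**: Add new rows (INSERT)
- **READ**: Query data (SELECT - we learned this!)
- **UPDATE**: Modify existing data
- **DELETE**: Remove data

Building or changing database objects (CREATE TABLE, ALTER TABLE) is technically Data Definition Language (DDL). Today's lesson covers it alongside DML because the two are used together in practice.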

### Data Integrity Rules:
- **Primary Keys**: Unique identifiers for each record
- **Foreign Keys**: Links between related tables
- **NOT NULL**: Fields that must have values
- **UNIQUE**: Values that must be unique
- **CHECK**: Custom validation rules
- **DEFAULT**: Automatic values when none provided
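
To make these rules concrete, here is a minimal sketch using Python's built-in `sqlite3` module and a throwaway in-memory database. The `demo_reviews` table and its columns are made up for illustration and are not part of the course database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database
conn.execute("""
    CREATE TABLE demo_reviews (            -- hypothetical table for illustration
        review_id INTEGER PRIMARY KEY,
        reviewer_email TEXT NOT NULL UNIQUE,
        rating INTEGER CHECK(rating >= 1 AND rating <= 5),
        helpful_votes INTEGER DEFAULT 0
    )
""")

# A valid row: helpful_votes is omitted, so DEFAULT supplies 0.
conn.execute("INSERT INTO demo_reviews (reviewer_email, rating) VALUES ('a@example.com', 5)")
default_votes = conn.execute("SELECT helpful_votes FROM demo_reviews").fetchone()[0]

# Each of these rows violates exactly one constraint and should be rejected.
bad_rows = {
    "CHECK": ("b@example.com", 6),     # rating outside 1-5
    "NOT NULL": (None, 4),             # missing required email
    "UNIQUE": ("a@example.com", 3),    # email already used
}
rejected = []
for rule, row in bad_rows.items():
    try:
        conn.execute("INSERT INTO demo_reviews (reviewer_email, rating) VALUES (?, ?)", row)
    except sqlite3.IntegrityError as exc:
        rejected.append(rule)
        print(f"{rule} violation rejected: {exc}")

row_count = conn.execute("SELECT COUNT(*) FROM demo_reviews").fetchone()[0]
print(default_votes, row_count)  # 0 1
```

Only the valid row survives; every rejected insert raises `sqlite3.IntegrityError` instead of silently storing bad data.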

### Transactions:
Groups of operations that succeed or fail together:
- **BEGIN**: Start a transaction
- **COMMIT**: Save all changes
- **ROLLBACK**: Undo all changes
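
The three commands above can be tried out with a small sketch in Python's built-in `sqlite3` module. The `demo_stock` table is a hypothetical stand-in, and `isolation_level=None` is chosen here so that the explicit `BEGIN`, `COMMIT`, and `ROLLBACK` statements are the only transaction control in play.

```python
import sqlite3

# isolation_level=None puts the connection in autocommit mode, so the
# BEGIN / COMMIT / ROLLBACK statements below are the only transaction control.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE demo_stock (product_id INTEGER PRIMARY KEY, qty INTEGER)")
conn.execute("INSERT INTO demo_stock VALUES (1, 10)")

def current_qty():
    """Read the stock level of the single demo product."""
    return conn.execute("SELECT qty FROM demo_stock WHERE product_id = 1").fetchone()[0]

# ROLLBACK: the change is visible inside the transaction, then undone.
conn.execute("BEGIN")
conn.execute("UPDATE demo_stock SET qty = qty - 3 WHERE product_id = 1")
qty_inside = current_qty()
conn.execute("ROLLBACK")
qty_after_rollback = current_qty()

# COMMIT: the same change is made permanent.
conn.execute("BEGIN")
conn.execute("UPDATE demo_stock SET qty = qty - 3 WHERE product_id = 1")
conn.execute("COMMIT")
qty_after_commit = current_qty()

print(qty_inside, qty_after_rollback, qty_after_commit)  # 7 10 7
```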

### Safety First:
Always back up before major changes! Test on copies first!
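
One possible way to follow this advice from Python is the `Connection.backup()` method in the built-in `sqlite3` module. The sketch below uses in-memory databases and a made-up `demo_products` table so it runs anywhere; the file name in the comment is hypothetical.

```python
import sqlite3

# Stand-in for the real database (in-memory keeps the sketch self-contained).
source = sqlite3.connect(":memory:")
source.execute("CREATE TABLE demo_products (product_id INTEGER PRIMARY KEY, name TEXT)")
source.executemany("INSERT INTO demo_products (name) VALUES (?)", [("Headphones",), ("Mouse",)])
source.commit()

# Copy the whole database into a second connection. With files you would
# connect to a path such as "ecommerce_backup.db" (hypothetical name) instead.
scratch = sqlite3.connect(":memory:")
source.backup(scratch)

# Try the risky change on the copy first.
scratch.execute("DELETE FROM demo_products")
scratch.commit()

source_rows = source.execute("SELECT COUNT(*) FROM demo_products").fetchone()[0]
scratch_rows = scratch.execute("SELECT COUNT(*) FROM demo_products").fetchone()[0]
print(source_rows, scratch_rows)  # 2 0
```

The original keeps both rows while the copy absorbs the destructive test.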

## Examples: CREATE TABLE

Let's create some new tables for our e-commerce system:

In [None]:
-- Example 1: Create a reviews table
CREATE TABLE product_reviews (
    review_id INTEGER PRIMARY KEY,
    product_id INTEGER NOT NULL,
    customer_id INTEGER NOT NULL,
    rating INTEGER CHECK(rating >= 1 AND rating <= 5),
    review_text TEXT,
    review_date DATE DEFAULT CURRENT_DATE,
    helpful_votes INTEGER DEFAULT 0,
    verified_purchase BOOLEAN DEFAULT FALSE,
    FOREIGN KEY (product_id) REFERENCES products(product_id),
    FOREIGN KEY (customer_id) REFERENCES customers(customer_id)
);

SELECT '✅ Product reviews table created!' as status;

In [None]:
-- Example 2: Create an inventory tracking table
CREATE TABLE inventory_log (
    log_id INTEGER PRIMARY KEY,
    product_id INTEGER NOT NULL,
    change_type TEXT CHECK(change_type IN ('restock', 'sale', 'adjustment', 'return')),
    quantity_change INTEGER NOT NULL,
    previous_stock INTEGER NOT NULL,
    new_stock INTEGER NOT NULL,
    change_reason TEXT,
    changed_by TEXT DEFAULT 'system',
    change_date DATETIME DEFAULT CURRENT_TIMESTAMP,
    FOREIGN KEY (product_id) REFERENCES products(product_id)
);

SELECT '✅ Inventory log table created!' as status;

In [None]:
-- Example 3: Create a customer preferences table
CREATE TABLE customer_preferences (
    preference_id INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL UNIQUE,
    email_marketing BOOLEAN DEFAULT TRUE,
    sms_notifications BOOLEAN DEFAULT FALSE,
    favorite_category TEXT,
    preferred_contact_time TEXT CHECK(preferred_contact_time IN ('morning', 'afternoon', 'evening')),
    loyalty_tier TEXT DEFAULT 'bronze' CHECK(loyalty_tier IN ('bronze', 'silver', 'gold', 'platinum')),
    created_at DATETIME DEFAULT CURRENT_TIMESTAMP,
    updated_at DATETIME DEFAULT CURRENT_TIMESTAMP,
    FOREIGN KEY (customer_id) REFERENCES customers(customer_id)
);

SELECT '✅ Customer preferences table created!' as status;

## Examples: INSERT Data

Add new records to our tables:

In [None]:
-- Example 4: Insert a product review
INSERT INTO product_reviews (product_id, customer_id, rating, review_text, verified_purchase)
VALUES (1, 1, 5, 'Amazing wireless headphones! Great sound quality and comfortable for long listening sessions.', TRUE);

SELECT '✅ Review inserted!' as status;

In [None]:
-- Example 5: Insert multiple reviews at once
INSERT INTO product_reviews (product_id, customer_id, rating, review_text, verified_purchase) VALUES
(2, 2, 4, 'Good gaming mouse with RGB lighting. Could be more responsive.', TRUE),
(3, 3, 5, 'Perfect coffee maker! Brews excellent coffee every morning.', TRUE),
(4, 4, 4, 'Great running shoes. Very comfortable and good support.', TRUE),
(6, 5, 5, 'Smart watch is fantastic! Tracks everything I need and looks great.', TRUE);

SELECT '✅ Multiple reviews inserted!' as status;

In [None]:
-- Example 6: Insert customer preferences
INSERT INTO customer_preferences (customer_id, email_marketing, sms_notifications, favorite_category, preferred_contact_time, loyalty_tier) VALUES
(1, TRUE, FALSE, 'Electronics', 'morning', 'gold'),
(2, TRUE, TRUE, 'Sports', 'afternoon', 'silver'),
(3, FALSE, FALSE, 'Appliances', 'evening', 'bronze'),
(4, TRUE, TRUE, 'Sports', 'morning', 'silver');

SELECT '✅ Customer preferences inserted!' as status;

In [None]:
-- Example 7: Insert with subquery (add every customer who has no preferences row yet)
INSERT INTO customer_preferences (customer_id, email_marketing, loyalty_tier)
SELECT customer_id, TRUE, 'bronze'
FROM customers
WHERE customer_id NOT IN (SELECT customer_id FROM customer_preferences);

SELECT '✅ Remaining customers added to preferences!' as status;
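
A useful property of the `NOT IN` filter in Example 7 is that the statement can be re-run without creating duplicates. Here is a small sketch of that behavior, assuming simplified `demo_customers` and `demo_preferences` tables that are not part of the course database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE demo_customers (customer_id INTEGER PRIMARY KEY);
    CREATE TABLE demo_preferences (
        customer_id INTEGER NOT NULL UNIQUE,
        loyalty_tier TEXT DEFAULT 'bronze'
    );
    INSERT INTO demo_customers (customer_id) VALUES (1), (2), (3), (4);
    INSERT INTO demo_preferences (customer_id, loyalty_tier) VALUES (1, 'gold'), (2, 'silver');
""")

# Same shape as Example 7: insert only customers that have no row yet.
fill_missing = """
    INSERT INTO demo_preferences (customer_id)
    SELECT customer_id FROM demo_customers
    WHERE customer_id NOT IN (SELECT customer_id FROM demo_preferences)
"""

def pref_count():
    return conn.execute("SELECT COUNT(*) FROM demo_preferences").fetchone()[0]

before = pref_count()
conn.execute(fill_missing)
after_first = pref_count()    # customers 3 and 4 were added
conn.execute(fill_missing)
after_second = pref_count()   # second run finds nothing left to add

tiers = conn.execute(
    "SELECT customer_id, loyalty_tier FROM demo_preferences ORDER BY customer_id"
).fetchall()
print(before, after_first, after_second)  # 2 4 4
```

Existing rows keep their tiers, and the new rows pick up the `'bronze'` default.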

## Examples: UPDATE Data

Modify existing records:

In [None]:
-- Example 8: Update product stock
UPDATE products
SET stock_quantity = stock_quantity + 10
WHERE product_id = 1;

SELECT product_name, stock_quantity
FROM products
WHERE product_id = 1;

In [None]:
-- Example 9: Upgrade loyalty tier based on total spending
UPDATE customer_preferences
SET loyalty_tier = 'gold'
WHERE customer_id IN (
    SELECT c.customer_id
    FROM customers c
    LEFT JOIN orders o ON c.customer_id = o.customer_id
    GROUP BY c.customer_id
    HAVING COALESCE(SUM(o.total_amount), 0) > 200
);

SELECT '✅ High-value customers upgraded to gold!' as status;

In [None]:
-- Example 10: Update multiple fields with conditions
UPDATE customer_preferences
SET sms_notifications = TRUE,
    preferred_contact_time = 'morning',
    updated_at = CURRENT_TIMESTAMP
WHERE loyalty_tier IN ('gold', 'platinum');

SELECT '✅ Premium customers updated with preferences!' as status;

In [None]:
-- Example 11: Update based on related table data
UPDATE products
SET stock_quantity = stock_quantity - (
    SELECT COALESCE(SUM(oi.quantity), 0)
    FROM order_items oi
    WHERE oi.product_id = products.product_id
)
WHERE product_id IN (SELECT DISTINCT product_id FROM order_items);

SELECT '✅ Product stock updated based on sales!' as status;

## Examples: DELETE Data

Remove records safely:

In [None]:
-- Example 12: Delete old reviews (keep only recent ones)
-- First, let's see what we have
SELECT COUNT(*) as total_reviews FROM product_reviews;

-- Delete reviews older than 1 year (but we just created them, so none will be deleted)
DELETE FROM product_reviews
WHERE review_date < DATE('now', '-1 year');

SELECT COUNT(*) as remaining_reviews FROM product_reviews;

In [None]:
-- Example 13: Delete inactive customer preferences
DELETE FROM customer_preferences
WHERE customer_id IN (
    SELECT c.customer_id
    FROM customers c
    WHERE c.customer_status = 'inactive'
);

SELECT '✅ Inactive customer preferences removed!' as status;

In [None]:
-- Example 14: Clean up low-rated reviews
-- Delete reviews with rating 1 or 2 that have no helpful votes
DELETE FROM product_reviews
WHERE rating <= 2 AND helpful_votes = 0;

SELECT '✅ Low-quality reviews cleaned up!' as status;
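
One way to make a DELETE safer is to preview it with a SELECT that uses the same WHERE clause, then keep the deletion only if the row counts agree. The sketch below shows that idea with Python's built-in `sqlite3` module; the `demo_reviews` table and its rows are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:", isolation_level=None)  # explicit transactions only
conn.execute(
    "CREATE TABLE demo_reviews (review_id INTEGER PRIMARY KEY, rating INTEGER, helpful_votes INTEGER)"
)
conn.executemany(
    "INSERT INTO demo_reviews (rating, helpful_votes) VALUES (?, ?)",
    [(5, 3), (1, 0), (2, 0), (2, 4)],
)

# A fixed, trusted filter shared by the preview and the delete.
condition = "rating <= 2 AND helpful_votes = 0"

# Step 1: preview how many rows the DELETE would remove.
expected = conn.execute(f"SELECT COUNT(*) FROM demo_reviews WHERE {condition}").fetchone()[0]

# Step 2: delete inside a transaction and keep the change only if the count matches.
conn.execute("BEGIN")
deleted = conn.execute(f"DELETE FROM demo_reviews WHERE {condition}").rowcount
if deleted == expected:
    conn.execute("COMMIT")
else:
    conn.execute("ROLLBACK")

remaining = conn.execute("SELECT COUNT(*) FROM demo_reviews").fetchone()[0]
print(expected, deleted, remaining)  # 2 2 2
```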

## Examples: ALTER TABLE

Modify table structures:

In [None]:
-- Example 15: Add a new column to products table
ALTER TABLE products ADD COLUMN discontinued BOOLEAN DEFAULT FALSE;

SELECT '✅ Discontinued column added to products!' as status;

In [None]:
-- Example 16: Add a recommendation column to product_reviews
ALTER TABLE product_reviews ADD COLUMN would_recommend BOOLEAN;

SELECT '✅ Recommendation column added to reviews!' as status;

In [None]:
-- Example 17: Update the new columns with data
UPDATE product_reviews
SET would_recommend = CASE WHEN rating >= 4 THEN TRUE ELSE FALSE END
WHERE would_recommend IS NULL;

SELECT '✅ Review recommendations updated!' as status;
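
It can help to see what existing rows look like right after `ADD COLUMN`. In this sketch (a hypothetical `demo_products` table, Python's built-in `sqlite3` module), a column added with a DEFAULT reads back that default, while a column added without one reads back NULL until it is backfilled.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE demo_products (product_id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO demo_products (name) VALUES ('Headphones')")

# Column with a DEFAULT: existing rows read back the default value.
conn.execute("ALTER TABLE demo_products ADD COLUMN discontinued BOOLEAN DEFAULT 0")
# Column without a DEFAULT: existing rows read back NULL until they are updated.
conn.execute("ALTER TABLE demo_products ADD COLUMN would_recommend BOOLEAN")

before = conn.execute("SELECT discontinued, would_recommend FROM demo_products").fetchone()

# Backfill the NULLs, in the same spirit as Example 17.
conn.execute("UPDATE demo_products SET would_recommend = 1 WHERE would_recommend IS NULL")
after = conn.execute("SELECT discontinued, would_recommend FROM demo_products").fetchone()

# PRAGMA table_info lists one row per column; index 1 is the column name.
columns = [row[1] for row in conn.execute("PRAGMA table_info(demo_products)")]
print(before, after)  # (0, None) (0, 1)
print(columns)        # ['product_id', 'name', 'discontinued', 'would_recommend']
```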

## Examples: Transactions

Group operations that must succeed or fail together:

In [None]:
-- Example 18: Transaction for order processing
-- This is conceptual: SQLite supports transactions, but they are harder to demonstrate in a notebook
-- In real applications, you'd use:

-- BEGIN TRANSACTION;
-- 
-- -- Step 1: Create order
-- INSERT INTO orders (customer_id, order_date, total_amount) 
-- VALUES (1, CURRENT_DATE, 99.99);
-- 
-- -- Step 2: Add order items
-- INSERT INTO order_items (order_id, product_id, quantity, unit_price, total_price)
-- VALUES (LAST_INSERT_ROWID(), 1, 1, 99.99, 99.99);
-- 
-- -- Step 3: Update inventory
-- UPDATE products SET stock_quantity = stock_quantity - 1 WHERE product_id = 1;
-- 
-- COMMIT;

SELECT '✅ Transaction concept demonstrated!' as status;
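
For a version you can actually run, here is a rough sketch of the same idea in Python's built-in `sqlite3` module. It assumes a much simpler schema than the course database (`demo_products` and `demo_orders` are hypothetical) and uses the connection as a context manager, which commits if the block finishes and rolls back if it raises.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE demo_products (
        product_id INTEGER PRIMARY KEY,
        stock_quantity INTEGER CHECK(stock_quantity >= 0)
    );
    CREATE TABLE demo_orders (
        order_id INTEGER PRIMARY KEY,
        product_id INTEGER,
        quantity INTEGER
    );
    INSERT INTO demo_products VALUES (1, 2);
""")

def place_order(product_id, quantity):
    """Record the order and reduce stock as one unit; return True on success."""
    try:
        with conn:  # commits if the block finishes, rolls back if it raises
            conn.execute(
                "INSERT INTO demo_orders (product_id, quantity) VALUES (?, ?)",
                (product_id, quantity),
            )
            conn.execute(
                "UPDATE demo_products SET stock_quantity = stock_quantity - ? WHERE product_id = ?",
                (quantity, product_id),
            )
        return True
    except sqlite3.IntegrityError:
        return False  # the CHECK failed, so the order insert was rolled back too

ok_small = place_order(1, 1)   # stock goes from 2 to 1
ok_large = place_order(1, 5)   # would make stock negative, so nothing is saved
orders = conn.execute("SELECT COUNT(*) FROM demo_orders").fetchone()[0]
stock = conn.execute("SELECT stock_quantity FROM demo_products").fetchone()[0]
print(ok_small, ok_large, orders, stock)  # True False 1 1
```

The failed order leaves no trace: there is no half-finished order row without a matching stock change.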

## Examples: Data Integrity & Constraints

Let's see how constraints protect our data:

In [None]:
-- Example 19: Try to insert invalid data (this will fail)
-- INSERT INTO product_reviews (product_id, customer_id, rating) 
-- VALUES (999, 1, 6);  -- Rating 6 is invalid (max 5)

SELECT '✅ Constraint example: Rating must be 1-5!' as note;

In [None]:
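-- Example 20: Try to delete referenced data (this would fail when foreign keys are enforced)
-- Note: SQLite only enforces foreign keys after running PRAGMA foreign_keys = ON; on the connection
-- DELETE FROM customers WHERE customer_id = 1;  -- Fails if orders reference this customer through a foreign key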

SELECT '✅ Foreign key protection prevents orphaned records!' as note;
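
One SQLite detail matters here: foreign keys are declared in the schema, but enforcement is normally off until `PRAGMA foreign_keys = ON` is run on the connection. The sketch below shows the difference using hypothetical `demo_customers` and `demo_orders` tables; it switches the setting off explicitly first so the behavior does not depend on how SQLite was built.

```python
import sqlite3

conn = sqlite3.connect(":memory:", isolation_level=None)  # autocommit keeps the demo simple
conn.executescript("""
    CREATE TABLE demo_customers (customer_id INTEGER PRIMARY KEY);
    CREATE TABLE demo_orders (
        order_id INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL,
        FOREIGN KEY (customer_id) REFERENCES demo_customers(customer_id)
    );
    INSERT INTO demo_customers VALUES (1), (2);
    INSERT INTO demo_orders (customer_id) VALUES (1), (2);
""")

def try_delete(customer_id):
    """Return True if the customer row was deleted, False if SQLite refused."""
    try:
        conn.execute("DELETE FROM demo_customers WHERE customer_id = ?", (customer_id,))
        return True
    except sqlite3.IntegrityError:
        return False

# Enforcement off (the usual default): the delete succeeds and orphans an order.
conn.execute("PRAGMA foreign_keys = OFF")
deleted_without_fk = try_delete(1)

# Enforcement on: deleting a customer who still has orders is refused.
conn.execute("PRAGMA foreign_keys = ON")
deleted_with_fk = try_delete(2)

orphans = conn.execute("""
    SELECT COUNT(*) FROM demo_orders
    WHERE customer_id NOT IN (SELECT customer_id FROM demo_customers)
""").fetchone()[0]
print(deleted_without_fk, deleted_with_fk, orphans)  # True False 1
```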

## Examples: Advanced Data Management

Complex data operations:

In [None]:
-- Example 21: Bulk update based on complex logic
UPDATE customer_preferences
SET loyalty_tier = CASE
    WHEN customer_id IN (
        SELECT c.customer_id
        FROM customers c
        LEFT JOIN orders o ON c.customer_id = o.customer_id
        GROUP BY c.customer_id
        HAVING COALESCE(SUM(o.total_amount), 0) > 300
    ) THEN 'platinum'
    WHEN customer_id IN (
        SELECT c.customer_id
        FROM customers c
        LEFT JOIN orders o ON c.customer_id = o.customer_id
        GROUP BY c.customer_id
        HAVING COALESCE(SUM(o.total_amount), 0) > 150
    ) THEN 'gold'
    WHEN customer_id IN (
        SELECT c.customer_id
        FROM customers c
        LEFT JOIN orders o ON c.customer_id = o.customer_id
        GROUP BY c.customer_id
        HAVING COALESCE(SUM(o.total_amount), 0) > 50
    ) THEN 'silver'
    ELSE 'bronze'
END;

SELECT '✅ Loyalty tiers updated based on spending!' as status;

In [None]:
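-- Example 22: Create summary table from existing data
-- Orders are aggregated before joining order_items so multi-item orders are not double-counted
CREATE TABLE sales_summary AS
SELECT m.*, COALESCE(p.products_sold, 0) as products_sold
FROM (
    SELECT strftime('%Y-%m', order_date) as month,
           COUNT(*) as orders_count,
           COUNT(DISTINCT customer_id) as customers_count,
           SUM(total_amount) as total_revenue,
           AVG(total_amount) as avg_order_value
    FROM orders
    GROUP BY strftime('%Y-%m', order_date)
) m
LEFT JOIN (
    SELECT strftime('%Y-%m', o.order_date) as month,
           COUNT(DISTINCT oi.product_id) as products_sold
    FROM orders o
    INNER JOIN order_items oi ON o.order_id = oi.order_id
    GROUP BY strftime('%Y-%m', o.order_date)
) p ON m.month = p.month
ORDER BY m.month;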

SELECT '✅ Sales summary table created!' as status;

In [None]:
-- Example 23: View the summary data
SELECT * FROM sales_summary ORDER BY month DESC;
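
A common trap when building summaries like this is join fan-out: joining orders to order_items repeats each order row once per item, so summing an order-level column after the join can overcount. The sketch below demonstrates the effect with two tiny hypothetical tables, `demo_orders` and `demo_order_items`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE demo_orders (order_id INTEGER PRIMARY KEY, total_amount REAL);
    CREATE TABLE demo_order_items (order_id INTEGER, product_id INTEGER);
    INSERT INTO demo_orders VALUES (1, 100.0), (2, 50.0);
    -- Order 1 has three items, order 2 has one.
    INSERT INTO demo_order_items VALUES (1, 10), (1, 11), (1, 12), (2, 10);
""")

# Summing after the join counts order 1's total three times.
joined_revenue = conn.execute("""
    SELECT SUM(o.total_amount)
    FROM demo_orders o
    LEFT JOIN demo_order_items oi ON o.order_id = oi.order_id
""").fetchone()[0]

# Summing the orders table on its own gives the true figure.
true_revenue = conn.execute("SELECT SUM(total_amount) FROM demo_orders").fetchone()[0]

print(joined_revenue, true_revenue)  # 350.0 150.0
```

Comparing a summary's totals against a plain `SUM` on the source table is a quick sanity check before trusting the numbers.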

## Exercises

### Exercise 1: CREATE TABLE
Create a table for product categories with proper constraints

In [None]:
-- Your code here
CREATE TABLE categories (
    category_id INTEGER PRIMARY KEY,
    category_name TEXT NOT NULL UNIQUE,
    description TEXT,
    created_at DATETIME DEFAULT CURRENT_TIMESTAMP,
    active BOOLEAN DEFAULT TRUE
);

SELECT '✅ Categories table created!' as status;

### Exercise 2: INSERT Data
Insert some sample categories into the new categories table

In [None]:
-- Your code here
INSERT INTO categories (category_name, description) VALUES
('Electronics', 'Electronic devices and gadgets'),
('Sports', 'Sports equipment and apparel'),
('Appliances', 'Home appliances'),
('Books', 'Books and publications');

SELECT '✅ Categories inserted!' as status;

### Exercise 3: UPDATE with JOIN
Set each customer's favorite category to the one they order most often, using a correlated subquery that joins orders, order items, and products

In [None]:
-- Your code here
UPDATE customer_preferences
SET favorite_category = (
    SELECT p.category
    FROM orders o
    INNER JOIN order_items oi ON o.order_id = oi.order_id
    INNER JOIN products p ON oi.product_id = p.product_id
    WHERE o.customer_id = customer_preferences.customer_id
    GROUP BY p.category
    ORDER BY COUNT(*) DESC
    LIMIT 1
)
WHERE customer_id IN (SELECT customer_id FROM orders);

SELECT '✅ Customer favorite categories updated!' as status;

### Exercise 4: Safe DELETE
Delete reviews that are low-rated, have no helpful votes, and are more than 30 days old

In [None]:
-- Your code here
DELETE FROM product_reviews
WHERE rating <= 2 
  AND helpful_votes = 0 
  AND review_date < DATE('now', '-30 days');

SELECT '✅ Old low-quality reviews cleaned up!' as status;

### Exercise 5: ALTER TABLE
Add a new column to track product return rates

In [None]:
-- Your code here
ALTER TABLE products ADD COLUMN return_rate DECIMAL(5,2) DEFAULT 0.00;

SELECT '✅ Return rate column added!' as status;

### Exercise 6: Complex UPDATE
Update product return rates based on inventory log data

In [None]:
-- Your code here
UPDATE products
SET return_rate = (
    SELECT ROUND(
        CAST(COUNT(CASE WHEN change_type = 'return' THEN 1 END) AS FLOAT) /
        NULLIF(COUNT(*), 0) * 100, 2
    )
    FROM inventory_log
    WHERE inventory_log.product_id = products.product_id
)
WHERE product_id IN (SELECT DISTINCT product_id FROM inventory_log);

SELECT '✅ Product return rates calculated!' as status;

## Debug-Me Cell

This UPDATE statement has an error. Can you fix it?

The goal: Mark every product with no sales in the last 90 days as discontinued

In [None]:
-- Debug this UPDATE - it should mark products as discontinued
UPDATE products
SET discontinued = TRUE
WHERE product_id NOT IN (
    SELECT DISTINCT oi.product_id
    FROM order_items oi
    INNER JOIN orders o ON oi.order_id = o.order_id
    WHERE o.order_date > DATE('now', '-90 days')
);

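-- This fails with "no such column: discontinued" if Example 15 has not been run yet!
-- Hint: Make sure the column exists before trying to update it!
-- Also check the subquery: NOT IN matches no rows at all if it returns a NULL product_id.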

## Takeaways & Further Reading

### Data Manipulation Commands Mastered:
✅ **CREATE TABLE**: Design database structures with constraints  
✅ **INSERT**: Add new data (single row or multiple rows)  
✅ **UPDATE**: Modify existing data with WHERE conditions  
✅ **DELETE**: Remove data safely with proper WHERE clauses  
✅ **ALTER TABLE**: Modify table structures after creation  

### Data Integrity Features:
✅ **Primary Keys**: Unique record identifiers  
✅ **Foreign Keys**: Maintain relationships between tables  
✅ **NOT NULL**: Required fields  
✅ **UNIQUE**: Unique value constraints  
✅ **CHECK**: Custom validation rules  
✅ **DEFAULT**: Automatic values  

### Key Concepts:
- **Transactions**: Group operations for consistency
- **Constraints**: Protect data integrity
- **Bulk Operations**: Efficient mass updates/deletes
- **Subqueries in DML**: Use SELECT in INSERT/UPDATE/DELETE
- **Safe Operations**: Always test before production changes

### Best Practices:
- **Test First**: Always test operations on copies
- **Backup Data**: Create backups before major changes
- **Use Transactions**: Group related operations
- **Validate Constraints**: Ensure data integrity
- **Document Changes**: Track what you modify

### Tomorrow Preview:
Day 7: **SQL Project** - Apply everything you've learned to a comprehensive e-commerce analytics project. You'll build real business reports and insights!

### Practice Resources:
- [SQL INSERT Tutorial](https://www.w3schools.com/sql/sql_insert.asp)
- [SQL UPDATE Tutorial](https://www.w3schools.com/sql/sql_update.asp)
- [SQL Constraints](https://www.w3schools.com/sql/sql_constraints.asp)

**Congratulations! You're now a database architect who can create, modify, and manage data like a pro! 🏗️**