# Lesson 11: Creating a Database and Working with CSV Data

This lesson covers the complete workflow of database operations:
1. **Creating a database** from scratch
2. **Creating tables** with proper relationships
3. **Importing data** from CSV files
4. **Updating data** in the database
5. **Querying data** to verify changes

We'll build a restaurant ordering system database and learn how to populate and maintain it.

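**CSV File Location:** `/workspaces/Fall2025-MS3083-Base_Template/databases/` (the CSV files we create and export are saved here; the PostgreSQL server manages the database's own storage)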

## Part 1: Setup and Database Creation

Before we can work with a database, we need to:
1. **Import Python libraries** that let us connect to PostgreSQL and work with data
2. **Create a new database** to store our restaurant information
3. **Establish a connection** so we can send SQL commands to the database

### Why These Libraries?

- **psycopg2**: Low-level PostgreSQL connector - talks directly to the database
- **pandas**: Makes working with tables easy - like Excel but in Python
- **SQLAlchemy**: Modern database toolkit - provides the `text()` function, which SQLAlchemy 2.0+ requires for raw SQL strings
- **csv**: Creates and reads CSV files
- **os**: Helps us build file paths that work on any operating system
- **datetime**: Handles date and time data properly

In [2]:
# Import necessary libraries
import psycopg2  # PostgreSQL adapter - allows Python to talk to PostgreSQL databases
import pandas as pd  # Data manipulation library - makes working with tables easy (like Excel in Python)
from sqlalchemy import create_engine, text  # SQLAlchemy - modern database toolkit (text() wraps raw SQL, required in SQLAlchemy 2.0+)
import os  # Operating system interface - helps build file paths that work across different systems
import csv  # CSV file handling - creates and reads comma-separated value files
from datetime import datetime, date  # Date/time handling - ensures dates are stored in correct format

# Database connection parameters
# These variables store the information needed to connect to our PostgreSQL database
DB_NAME = "restaurant_orders_db"  # Name of our new database (we'll create this if it doesn't exist)
DB_USER = "student"  # PostgreSQL username (this user already exists in your dev container)
DB_PASSWORD = ""  # Password (empty string for local dev container - no password needed)
DB_HOST = "localhost"  # Server location ("localhost" means this computer)
DB_PORT = "5432"  # PostgreSQL default port (like a door number for the database service)

# Print confirmation that libraries loaded successfully
print("Libraries imported successfully!")
print(f"Target database: {DB_NAME}")

Libraries imported successfully!
Target database: restaurant_orders_db


### Create the Database

Now we'll create our database. Here's the process:

1. **Connect to the default 'postgres' database** (this database always exists)
2. **Check if our database already exists** (to avoid errors if we run this cell twice)
3. **Create the database** if it doesn't exist
4. **Create a SQLAlchemy engine** to connect to our new database

**Why two connections?**
- First connection (psycopg2): Creates the database itself
- Second connection (SQLAlchemy): Works with tables inside the database

**What is autocommit?**
- Normally, database changes wait for a COMMIT command
- `autocommit = True` means changes happen immediately
- Needed for CREATE DATABASE (can't be inside a transaction)
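
The commit behavior described above can be tried without a PostgreSQL server. Here is a minimal sketch that uses Python's built-in `sqlite3` module as a stand-in (an assumption made so the example runs anywhere; the `notes` table is hypothetical). It shows that an uncommitted change disappears on rollback, while a committed change stays. SQLite has no equivalent of the CREATE DATABASE restriction, so that part is not demonstrated here.

```python
import sqlite3

# In-memory SQLite database, used only as a stand-in for PostgreSQL
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE notes (body TEXT)")
conn.commit()

# This INSERT opens a transaction; nothing is permanent until COMMIT
conn.execute("INSERT INTO notes (body) VALUES ('draft')")
conn.rollback()  # discard the uncommitted row
rows_after_rollback = conn.execute("SELECT COUNT(*) FROM notes").fetchone()[0]

# Same kind of INSERT, but committed this time
conn.execute("INSERT INTO notes (body) VALUES ('final')")
conn.commit()
conn.rollback()  # nothing left to undo; the row is already saved
rows_after_commit = conn.execute("SELECT COUNT(*) FROM notes").fetchone()[0]

print(rows_after_rollback, rows_after_commit)
conn.close()
```

With autocommit turned on, each statement behaves as if it were followed by its own COMMIT, so there is never anything to roll back.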

In [3]:
# Create the database if it doesn't exist
try:
    # Step 1: Connect to the default 'postgres' database first
    # Our new database doesn't exist yet, so we connect to 'postgres' to issue CREATE DATABASE
    conn = psycopg2.connect(
        dbname="postgres",  # Connect to the default database (always exists)
        user=DB_USER,  # Use the student username
        password=DB_PASSWORD,  # Empty password for local dev container
        host=DB_HOST,  # localhost = this computer
        port=DB_PORT  # 5432 = PostgreSQL's default port
    )
    conn.autocommit = True  # Enable autocommit mode (changes happen immediately without COMMIT)
    cursor = conn.cursor()  # Create a cursor object to execute SQL commands
    
    # Step 2: Check if our database already exists
    # Query the system catalog pg_database; %s passes DB_NAME as a bound parameter
    cursor.execute("SELECT 1 FROM pg_database WHERE datname = %s", (DB_NAME,))
    exists = cursor.fetchone()  # fetchone() returns None if no rows found, or a tuple if found
    
    if not exists:
        # Step 3: Create the database (only if it doesn't exist)
        # Identifiers can't be bound parameters, so psycopg2.sql quotes the name safely
        from psycopg2 import sql
        cursor.execute(sql.SQL("CREATE DATABASE {}").format(sql.Identifier(DB_NAME)))
        print(f"✓ Database '{DB_NAME}' created successfully!")
    else:
        # Database already exists - that's okay, we'll just use it
        print(f"✓ Database '{DB_NAME}' already exists.")
    
    # Step 4: Clean up - close cursor and connection
    # Always close database connections when done to free up resources
    cursor.close()
    conn.close()
    
except Exception as e:
    # If any error occurs, print it so we can troubleshoot
    print(f"Error creating database: {e}")

# Step 5: Now create a SQLAlchemy engine that points at our NEW database
# pandas' read_sql() and to_sql() work with a SQLAlchemy engine, so we use one from here on
# Connection string format: postgresql://username:password@host:port/database_name
connection_string = f"postgresql://{DB_USER}:{DB_PASSWORD}@{DB_HOST}:{DB_PORT}/{DB_NAME}"
engine = create_engine(connection_string)  # Create the engine (connection factory); it connects lazily on first use
print(f"\n✓ Connected to database: {DB_NAME}")

✓ Database 'restaurant_orders_db' created successfully!

✓ Connected to database: restaurant_orders_db
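
The connection string above works as written because the username and the (empty) password contain no special characters. As a hedged sketch, assuming a password that does contain characters such as `@` or `/`, those parts can be percent-encoded with the standard library before they go into the URL. The helper name `build_postgres_url` and the sample password are hypothetical, not part of SQLAlchemy or this course's setup.

```python
from urllib.parse import quote_plus


def build_postgres_url(user, password, host, port, dbname):
    """Build a postgresql:// URL, percent-encoding the user and password."""
    return (
        f"postgresql://{quote_plus(user)}:{quote_plus(password)}"
        f"@{host}:{port}/{dbname}"
    )


# Same values as the lesson: an empty password needs no encoding
url_plain = build_postgres_url("student", "", "localhost", "5432", "restaurant_orders_db")

# Hypothetical password containing '@' and '/', which would otherwise confuse the URL parser
url_special = build_postgres_url("student", "p@ss/word", "localhost", "5432", "restaurant_orders_db")

print(url_plain)
print(url_special)
```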


## Part 2: Create Database Tables

Now we'll create the table structure for our restaurant ordering system. Our database will have:

- **menu_items**: Items available for order
- **employees**: Staff members who take orders
- **suppliers**: Vendors who provide inventory
- **orders**: Customer orders (links employees and menu items)
- **inventory**: Stock levels (links menu items and suppliers)

**Important:** Tables must be created in the correct order to satisfy foreign key dependencies!
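
One way to think about "the correct order" is as a dependency graph: each table lists the tables its foreign keys point to, and parents must come first. The sketch below works out a valid order for our five tables with `graphlib` from the standard library (Python 3.9+). It is only an illustration of the idea; the lesson itself simply writes the CREATE TABLE statements in a valid order by hand.

```python
from graphlib import TopologicalSorter

# Each table maps to the set of tables it references through foreign keys
dependencies = {
    "menu_items": set(),
    "employees": set(),
    "suppliers": set(),
    "orders": {"employees", "menu_items"},
    "inventory": {"menu_items", "suppliers"},
}

# static_order() yields each table only after all the tables it depends on
creation_order = list(TopologicalSorter(dependencies).static_order())
print(creation_order)
```

Dropping tables works the other way around: children first, then parents, which is why the DROP statements list `inventory` and `orders` before the tables they reference.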

In [4]:
# Create all tables for the restaurant system
create_tables_sql = """
-- Drop existing tables if they exist (for clean re-runs)
DROP TABLE IF EXISTS inventory CASCADE;
DROP TABLE IF EXISTS orders CASCADE;
DROP TABLE IF EXISTS employees CASCADE;
DROP TABLE IF EXISTS menu_items CASCADE;
DROP TABLE IF EXISTS suppliers CASCADE;

-- Menu Items table (no dependencies)
CREATE TABLE menu_items (
    item_id SERIAL PRIMARY KEY,
    name VARCHAR(100) NOT NULL UNIQUE,
    price DECIMAL(10, 2) NOT NULL CHECK (price >= 0),
    category VARCHAR(50),
    description TEXT
);

-- Employees table (no dependencies)
CREATE TABLE employees (
    employee_id SERIAL PRIMARY KEY,
    first_name VARCHAR(50) NOT NULL,
    last_name VARCHAR(50) NOT NULL,
    contact VARCHAR(100) NOT NULL UNIQUE,
    hire_date DATE DEFAULT CURRENT_DATE,
    position VARCHAR(50)
);

-- Suppliers table (no dependencies)
CREATE TABLE suppliers (
    supplier_id SERIAL PRIMARY KEY,
    name VARCHAR(100) NOT NULL UNIQUE,
    phone VARCHAR(20) NOT NULL,
    email VARCHAR(100) NOT NULL UNIQUE,
    address TEXT
);

-- Orders table (depends on employees and menu_items)
CREATE TABLE orders (
    order_id SERIAL PRIMARY KEY,
    order_date DATE NOT NULL DEFAULT CURRENT_DATE,
    employee_id INTEGER NOT NULL,
    item_id INTEGER NOT NULL,
    quantity INTEGER NOT NULL DEFAULT 1 CHECK (quantity > 0),
    total_price DECIMAL(10, 2) NOT NULL CHECK (total_price >= 0),
    CONSTRAINT FK_orders_employee_id
        FOREIGN KEY (employee_id) 
        REFERENCES employees(employee_id),
    CONSTRAINT FK_orders_item_id
        FOREIGN KEY (item_id) 
        REFERENCES menu_items(item_id)
);

-- Inventory table (depends on suppliers and menu_items)
CREATE TABLE inventory (
    inventory_id SERIAL PRIMARY KEY,
    item_id INTEGER NOT NULL,
    quantity INTEGER NOT NULL CHECK (quantity >= 0),
    supplier_id INTEGER NOT NULL,
    last_restock_date DATE DEFAULT CURRENT_DATE,
    CONSTRAINT FK_inventory_item_id
        FOREIGN KEY (item_id)
        REFERENCES menu_items(item_id),
    CONSTRAINT FK_inventory_supplier_id
        FOREIGN KEY (supplier_id) 
        REFERENCES suppliers(supplier_id)
);
"""

# Execute the table creation SQL
try:
    with engine.connect() as connection:
        connection.execute(text(create_tables_sql))
        connection.commit()
        print("✓ All tables created successfully!")
    
    # Verify tables were created
    tables_query = """
    SELECT table_name 
    FROM information_schema.tables 
    WHERE table_schema = 'public'
    ORDER BY table_name;
    """
    df_tables = pd.read_sql(tables_query, engine)
    print(f"\nTables in database ({len(df_tables)}):")    
    for idx, table in enumerate(df_tables['table_name'], 1):
        print(f"  {idx}. {table}")
    
except Exception as e:
    print(f"Error creating tables: {e}")

✓ All tables created successfully!

Tables in database (5):
  1. employees
  2. inventory
  3. menu_items
  4. orders
  5. suppliers


## Part 3: Create Sample CSV Files

Before we can import data, we need CSV files! Let's create sample data files for our restaurant.
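
A detail worth knowing before writing the files: text fields sometimes contain commas or quotation marks, and the `csv` module quotes them for us. The short sketch below uses a hypothetical menu row and an in-memory buffer (so nothing is written to disk) to show that such a field survives a write and read round trip.

```python
import csv
import io

# A hypothetical menu row whose description contains a comma and double quotes
rows = [
    ["name", "price", "description"],
    ["Club Sandwich", 11.49, 'Turkey, bacon and "house" sauce'],
]

buffer = io.StringIO()  # in-memory file, so nothing is written to disk
csv.writer(buffer).writerows(rows)
csv_text = buffer.getvalue()
print(csv_text)

# Reading the text back recovers the original three fields
buffer.seek(0)
parsed = list(csv.reader(buffer))
print(parsed[1])
```

Note that every value comes back as a string, which is one reason the import step later relies on pandas to work out numeric types.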

In [5]:
# Define the directory for our CSV files
csv_dir = '/workspaces/Fall2025-MS3083-Base_Template/databases'

# Create sample menu items data
menu_items_data = [
    ['name', 'price', 'category', 'description'],
    ['Cheeseburger', 12.99, 'Entree', 'Classic burger with cheddar cheese'],
    ['Caesar Salad', 9.99, 'Salad', 'Romaine lettuce with Caesar dressing'],
    ['Margherita Pizza', 14.99, 'Entree', 'Fresh mozzarella and basil'],
    ['Chicken Wings', 10.99, 'Appetizer', 'Spicy buffalo wings'],
    ['Chocolate Cake', 7.99, 'Dessert', 'Rich chocolate layer cake'],
    ['French Fries', 4.99, 'Side', 'Crispy golden fries'],
    ['Soda', 2.99, 'Beverage', 'Assorted soft drinks'],
    ['Iced Tea', 2.49, 'Beverage', 'Freshly brewed iced tea']
]

menu_csv_path = os.path.join(csv_dir, 'menu_items.csv')
with open(menu_csv_path, 'w', newline='') as f:
    writer = csv.writer(f)
    writer.writerows(menu_items_data)
print(f"✓ Created: {menu_csv_path}")

# Create sample employees data
employees_data = [
    ['first_name', 'last_name', 'contact', 'hire_date', 'position'],
    ['Hillary', 'McAllister', 'hillary.m@restaurant.com', '2024-01-15', 'Server'],
    ['John', 'Smith', 'john.smith@restaurant.com', '2023-06-20', 'Manager'],
    ['Sarah', 'Johnson', 'sarah.j@restaurant.com', '2024-03-10', 'Server'],
    ['Mike', 'Davis', 'mike.davis@restaurant.com', '2023-11-05', 'Cook'],
    ['Emily', 'Brown', 'emily.b@restaurant.com', '2024-02-28', 'Host']
]

employees_csv_path = os.path.join(csv_dir, 'employees.csv')
with open(employees_csv_path, 'w', newline='') as f:
    writer = csv.writer(f)
    writer.writerows(employees_data)
print(f"✓ Created: {employees_csv_path}")

# Create sample suppliers data
suppliers_data = [
    ['name', 'phone', 'email', 'address'],
    ['Fresh Foods Inc', '555-0101', 'orders@freshfoods.com', '123 Farm Road'],
    ['Quality Meats Co', '555-0202', 'sales@qualitymeats.com', '456 Butcher Lane'],
    ['Dairy Delights', '555-0303', 'info@dairydelights.com', '789 Milk Street'],
    ['Veggie Suppliers', '555-0404', 'contact@veggiesuppliers.com', '321 Garden Ave']
]

suppliers_csv_path = os.path.join(csv_dir, 'suppliers.csv')
with open(suppliers_csv_path, 'w', newline='') as f:
    writer = csv.writer(f)
    writer.writerows(suppliers_data)
print(f"✓ Created: {suppliers_csv_path}")

print(f"\n✓ All CSV files created in: {csv_dir}")

✓ Created: /workspaces/Fall2025-MS3083-Base_Template/databases/menu_items.csv
✓ Created: /workspaces/Fall2025-MS3083-Base_Template/databases/employees.csv
✓ Created: /workspaces/Fall2025-MS3083-Base_Template/databases/suppliers.csv

✓ All CSV files created in: /workspaces/Fall2025-MS3083-Base_Template/databases


## Part 4: Import CSV Data into Database

Now we'll import our CSV files into the database tables using pandas. This is much easier than manually inserting each row!

**Key Points:**
- `pd.read_csv()` reads the CSV file into a DataFrame
- `to_sql()` uploads the DataFrame to PostgreSQL
- `if_exists='append'` adds data to the existing table
- `index=False` prevents adding the DataFrame index as a column
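
To make the append step less of a black box, here is a rough standard-library sketch of what loading a CSV into an existing table amounts to: read the rows, then run one parameterized INSERT per row. It uses an in-memory SQLite database and inline CSV text as stand-ins (assumptions for portability), and it is not how pandas implements `to_sql()` internally.

```python
import csv
import io
import sqlite3

# Two rows in the same shape as suppliers.csv
csv_text = """name,phone,email,address
Fresh Foods Inc,555-0101,orders@freshfoods.com,123 Farm Road
Quality Meats Co,555-0202,sales@qualitymeats.com,456 Butcher Lane
"""

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE suppliers (
        supplier_id INTEGER PRIMARY KEY,  -- assigned automatically, similar to SERIAL
        name TEXT NOT NULL UNIQUE,
        phone TEXT NOT NULL,
        email TEXT NOT NULL UNIQUE,
        address TEXT
    )
""")

# DictReader uses the header row as keys and returns one dict per data row
rows = list(csv.DictReader(io.StringIO(csv_text)))

# Named placeholders (:name, :phone, ...) are filled in from each dict
conn.executemany(
    "INSERT INTO suppliers (name, phone, email, address) "
    "VALUES (:name, :phone, :email, :address)",
    rows,
)
conn.commit()

imported = conn.execute(
    "SELECT supplier_id, name FROM suppliers ORDER BY supplier_id"
).fetchall()
print(imported)
conn.close()
```

Because the CSV has no `supplier_id` column, the database assigns the IDs itself, just as the SERIAL columns do in our PostgreSQL tables.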

In [6]:
# Import menu items from CSV
print("Importing menu items...")
df_menu = pd.read_csv(menu_csv_path)
df_menu.to_sql('menu_items', engine, if_exists='append', index=False)
print(f"  ✓ Imported {len(df_menu)} menu items")

# Import employees from CSV
print("\nImporting employees...")
df_employees = pd.read_csv(employees_csv_path)
# Convert hire_date to proper date format
df_employees['hire_date'] = pd.to_datetime(df_employees['hire_date'])
df_employees.to_sql('employees', engine, if_exists='append', index=False)
print(f"  ✓ Imported {len(df_employees)} employees")

# Import suppliers from CSV
print("\nImporting suppliers...")
df_suppliers = pd.read_csv(suppliers_csv_path)
df_suppliers.to_sql('suppliers', engine, if_exists='append', index=False)
print(f"  ✓ Imported {len(df_suppliers)} suppliers")

print("\n" + "="*50)
print("✓ ALL CSV DATA IMPORTED SUCCESSFULLY!")
print("="*50)

Importing menu items...
  ✓ Imported 8 menu items

Importing employees...
  ✓ Imported 5 employees

Importing suppliers...
  ✓ Imported 4 suppliers

✓ ALL CSV DATA IMPORTED SUCCESSFULLY!


### Verify the Imported Data

Let's query our tables to see the data we just imported.

In [7]:
# Query and display menu items
print("MENU ITEMS:")
print("="*80)
df_menu_check = pd.read_sql("SELECT * FROM menu_items ORDER BY category, name;", engine)
print(df_menu_check.to_string(index=False))

# Query and display employees
print("\n\nEMPLOYEES:")
print("="*80)
df_employees_check = pd.read_sql("SELECT * FROM employees ORDER BY hire_date;", engine)
print(df_employees_check.to_string(index=False))

# Query and display suppliers
print("\n\nSUPPLIERS:")
print("="*80)
df_suppliers_check = pd.read_sql("SELECT * FROM suppliers ORDER BY name;", engine)
print(df_suppliers_check.to_string(index=False))

MENU ITEMS:
 item_id             name  price  category                          description
       4    Chicken Wings  10.99 Appetizer                  Spicy buffalo wings
       8         Iced Tea   2.49  Beverage              Freshly brewed iced tea
       7             Soda   2.99  Beverage                 Assorted soft drinks
       5   Chocolate Cake   7.99   Dessert            Rich chocolate layer cake
       1     Cheeseburger  12.99    Entree   Classic burger with cheddar cheese
       3 Margherita Pizza  14.99    Entree           Fresh mozzarella and basil
       2     Caesar Salad   9.99     Salad Romaine lettuce with Caesar dressing
       6     French Fries   4.99      Side                  Crispy golden fries


EMPLOYEES:
 employee_id first_name  last_name                   contact  hire_date position
           2       John      Smith john.smith@restaurant.com 2023-06-20  Manager
           4       Mike      Davis mike.davis@restaurant.com 2023-11-05     Cook
           1

## Part 5: Add More Data Using SQL INSERT

Sometimes you need to add data directly using SQL INSERT statements, especially when:
- Adding a single record
- Working with foreign key relationships
- Calculating values based on other data

Let's add some orders and inventory records.

In [8]:
# Insert sample orders
# Note: We need to use actual employee_id and item_id values from our tables
insert_orders_sql = """
INSERT INTO orders (order_date, employee_id, item_id, quantity, total_price) VALUES
    ('2024-11-01', 1, 1, 2, 25.98),  -- Hillary sold 2 Cheeseburgers
    ('2024-11-01', 1, 7, 2, 5.98),   -- Hillary sold 2 Sodas
    ('2024-11-02', 3, 3, 1, 14.99),  -- Sarah sold 1 Margherita Pizza
    ('2024-11-02', 1, 2, 1, 9.99),   -- Hillary sold 1 Caesar Salad
    ('2024-11-03', 3, 4, 3, 32.97),  -- Sarah sold 3 Chicken Wings
    ('2024-11-03', 1, 5, 2, 15.98),  -- Hillary sold 2 Chocolate Cakes
    ('2024-11-04', 3, 1, 1, 12.99),  -- Sarah sold 1 Cheeseburger
    ('2024-11-04', 3, 6, 1, 4.99);   -- Sarah sold 1 French Fries
"""

try:
    with engine.connect() as connection:
        result = connection.execute(text(insert_orders_sql))
        connection.commit()
        print(f"✓ Inserted {result.rowcount} orders")
except Exception as e:
    print(f"Error inserting orders: {e}")

# Insert sample inventory
insert_inventory_sql = """
INSERT INTO inventory (item_id, quantity, supplier_id, last_restock_date) VALUES
    (1, 50, 2, '2024-10-28'),   -- Cheeseburgers from Quality Meats
    (2, 30, 4, '2024-10-29'),   -- Caesar Salad from Veggie Suppliers
    (3, 25, 3, '2024-10-28'),   -- Margherita Pizza from Dairy Delights
    (4, 40, 2, '2024-10-30'),   -- Chicken Wings from Quality Meats
    (5, 15, 1, '2024-10-27'),   -- Chocolate Cake from Fresh Foods
    (6, 100, 1, '2024-10-30'),  -- French Fries from Fresh Foods
    (7, 200, 1, '2024-11-01'),  -- Soda from Fresh Foods
    (8, 150, 1, '2024-11-01');  -- Iced Tea from Fresh Foods
"""

try:
    with engine.connect() as connection:
        result = connection.execute(text(insert_inventory_sql))
        connection.commit()
        print(f"✓ Inserted {result.rowcount} inventory records")
except Exception as e:
    print(f"Error inserting inventory: {e}")

print("\n✓ Additional data inserted successfully!")

✓ Inserted 8 orders
✓ Inserted 8 inventory records

✓ Additional data inserted successfully!
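
The orders above hard-code both the IDs and `total_price`. As a sketch of the "calculating values based on other data" case, the example below looks the item up by name and computes the total from the menu price with INSERT ... SELECT. It runs against an in-memory SQLite database with simplified tables (an assumption so it works without a server; `REAL` stands in for `DECIMAL`).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE menu_items (item_id INTEGER PRIMARY KEY, name TEXT, price REAL);
    CREATE TABLE orders (
        order_id INTEGER PRIMARY KEY,
        item_id INTEGER REFERENCES menu_items(item_id),
        quantity INTEGER,
        total_price REAL
    );
    INSERT INTO menu_items (name, price) VALUES ('Cheeseburger', 12.99), ('Soda', 2.99);
""")

# INSERT ... SELECT finds the item by name and multiplies its price by the quantity
conn.execute(
    """
    INSERT INTO orders (item_id, quantity, total_price)
    SELECT item_id, :qty, ROUND(price * :qty, 2)
    FROM menu_items
    WHERE name = :name
    """,
    {"name": "Cheeseburger", "qty": 2},
)
conn.commit()

order = conn.execute("SELECT item_id, quantity, total_price FROM orders").fetchone()
print(order)
conn.close()
```

The same statement shape works in PostgreSQL; with SQLAlchemy the parameters would be passed as a dictionary alongside `text(...)`.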


### View the Complete Database

Let's look at our orders and inventory data.

In [9]:
# View orders with employee and item names (using JOIN)
orders_query = """
SELECT 
    o.order_id,
    o.order_date,
    e.first_name || ' ' || e.last_name AS employee_name,
    m.name AS item_name,
    o.quantity,
    o.total_price
FROM orders o
JOIN employees e ON o.employee_id = e.employee_id
JOIN menu_items m ON o.item_id = m.item_id
ORDER BY o.order_date, o.order_id;
"""

print("ORDERS:")
print("="*100)
df_orders = pd.read_sql(orders_query, engine)
print(df_orders.to_string(index=False))

# View inventory with item and supplier names (using JOIN)
inventory_query = """
SELECT 
    i.inventory_id,
    m.name AS item_name,
    i.quantity,
    s.name AS supplier_name,
    i.last_restock_date
FROM inventory i
JOIN menu_items m ON i.item_id = m.item_id
JOIN suppliers s ON i.supplier_id = s.supplier_id
ORDER BY m.name;
"""

print("\n\nINVENTORY:")
print("="*100)
df_inventory = pd.read_sql(inventory_query, engine)
print(df_inventory.to_string(index=False))

ORDERS:
 order_id order_date      employee_name        item_name  quantity  total_price
        1 2024-11-01 Hillary McAllister     Cheeseburger         2        25.98
        2 2024-11-01 Hillary McAllister             Soda         2         5.98
        3 2024-11-02      Sarah Johnson Margherita Pizza         1        14.99
        4 2024-11-02 Hillary McAllister     Caesar Salad         1         9.99
        5 2024-11-03      Sarah Johnson    Chicken Wings         3        32.97
        6 2024-11-03 Hillary McAllister   Chocolate Cake         2        15.98
        7 2024-11-04      Sarah Johnson     Cheeseburger         1        12.99
        8 2024-11-04      Sarah Johnson     French Fries         1         4.99


INVENTORY:
 inventory_id        item_name  quantity    supplier_name last_restock_date
            2     Caesar Salad        30 Veggie Suppliers        2024-10-29
            1     Cheeseburger        50 Quality Meats Co        2024-10-28
            4    Chicken Wings 

## Part 6: Update Data in the Database

Now let's practice updating data. Common scenarios include:
- Updating prices
- Adjusting inventory quantities
- Correcting employee information
- Modifying order details

**⚠️ CRITICAL:** Always use a WHERE clause in UPDATE statements, or you'll update ALL rows!
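
To see what this warning means in practice, the sketch below runs the same UPDATE with and without a WHERE clause and compares how many rows each one touches. It uses an in-memory SQLite table as a stand-in for `menu_items` (an assumption so it runs without PostgreSQL). Checking `rowcount` before committing is one hedged way to catch a missing WHERE while a rollback is still possible.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE menu_items (name TEXT, price REAL, category TEXT);
    INSERT INTO menu_items VALUES
        ('Soda', 2.99, 'Beverage'),
        ('Iced Tea', 2.49, 'Beverage'),
        ('French Fries', 4.99, 'Side');
""")

# Without WHERE, every row in the table matches
cur = conn.execute("UPDATE menu_items SET price = price * 1.10")
rows_without_where = cur.rowcount
conn.rollback()  # undo the mistake before it is committed

# With WHERE, only the beverages match
cur = conn.execute(
    "UPDATE menu_items SET price = price * 1.10 WHERE category = 'Beverage'"
)
rows_with_where = cur.rowcount
conn.commit()

# The fries were restored by the rollback and skipped by the second UPDATE
fries_price = conn.execute(
    "SELECT price FROM menu_items WHERE name = 'French Fries'"
).fetchone()[0]
print(rows_without_where, rows_with_where, fries_price)
conn.close()
```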

### Example 1: Update Menu Item Prices

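Let's increase all beverage prices by 10%. Keep in mind that the update is relative to the current price, so re-running the cell raises the prices by another 10% each time.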

In [10]:
# First, let's see current beverage prices
print("BEFORE UPDATE:")
print("="*60)
df_before = pd.read_sql("SELECT name, price, category FROM menu_items WHERE category = 'Beverage';", engine)
print(df_before.to_string(index=False))

# Update beverage prices (increase by 10%)
update_prices_sql = """
UPDATE menu_items
SET price = price * 1.10
WHERE category = 'Beverage';
"""

with engine.connect() as connection:
    result = connection.execute(text(update_prices_sql))
    connection.commit()
    print(f"\n✓ Updated {result.rowcount} beverage prices (increased by 10%)")

# View the updated prices
print("\nAFTER UPDATE:")
print("="*60)
df_after = pd.read_sql("SELECT name, price, category FROM menu_items WHERE category = 'Beverage';", engine)
print(df_after.to_string(index=False))

BEFORE UPDATE:
    name  price category
    Soda   2.99 Beverage
Iced Tea   2.49 Beverage

✓ Updated 2 beverage prices (increased by 10%)

AFTER UPDATE:
    name  price category
    Soda   3.29 Beverage
Iced Tea   2.74 Beverage


### Example 2: Update Inventory Quantities

When items are sold, we need to reduce inventory. Let's subtract sold quantities from stock.

In [11]:
# First, let's see current inventory for Cheeseburgers
print("BEFORE UPDATE:")
print("="*60)
check_query = """
SELECT i.inventory_id, m.name, i.quantity, i.last_restock_date
FROM inventory i
JOIN menu_items m ON i.item_id = m.item_id
WHERE m.name = 'Cheeseburger';
"""
df_before_inv = pd.read_sql(check_query, engine)
print(df_before_inv.to_string(index=False))

# We sold 3 cheeseburgers total (2 + 1), so reduce inventory by 3
update_inventory_sql = """
UPDATE inventory
SET quantity = quantity - 3
WHERE item_id = (SELECT item_id FROM menu_items WHERE name = 'Cheeseburger');
"""

with engine.connect() as connection:
    result = connection.execute(text(update_inventory_sql))
    connection.commit()
    print("\n✓ Updated inventory (reduced by 3 units)")

# View the updated inventory
print("\nAFTER UPDATE:")
print("="*60)
df_after_inv = pd.read_sql(check_query, engine)
print(df_after_inv.to_string(index=False))
print(f"\nChange: {df_before_inv['quantity'].values[0]} → {df_after_inv['quantity'].values[0]} (-3)")

BEFORE UPDATE:
 inventory_id         name  quantity last_restock_date
            1 Cheeseburger        50        2024-10-28

✓ Updated inventory (reduced by 3 units)

AFTER UPDATE:
 inventory_id         name  quantity last_restock_date
            1 Cheeseburger        47        2024-10-28

Change: 50 → 47 (-3)
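
The cell above hard-codes the 3 units sold. As an alternative sketch, the amount to subtract can be derived from the `orders` table itself, so each item's stock drops by whatever was actually ordered. The example uses simplified in-memory SQLite tables (an assumption for portability) with three sample order rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (item_id INTEGER, quantity INTEGER);
    CREATE TABLE inventory (item_id INTEGER, quantity INTEGER);
    INSERT INTO orders VALUES (1, 2), (1, 1), (7, 2);
    INSERT INTO inventory VALUES (1, 50), (7, 200), (8, 150);
""")

# For each inventory row that has orders, subtract the total quantity ordered.
# The WHERE clause leaves items with no orders (item 8 here) untouched.
conn.execute("""
    UPDATE inventory
    SET quantity = quantity - (
        SELECT SUM(o.quantity) FROM orders o WHERE o.item_id = inventory.item_id
    )
    WHERE item_id IN (SELECT item_id FROM orders)
""")
conn.commit()

stock = dict(conn.execute("SELECT item_id, quantity FROM inventory").fetchall())
print(stock)
conn.close()
```

Like the hard-coded version, this subtracts again every time it runs, so a real system would need some way to mark which orders have already been applied to inventory.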


### Example 3: Update Employee Information

Let's promote Hillary to Manager by updating her `position`.

In [12]:
# Check Hillary's current information
print("BEFORE UPDATE:")
print("="*60)
df_hillary_before = pd.read_sql(
    "SELECT * FROM employees WHERE first_name = 'Hillary' AND last_name = 'McAllister';",
    engine
)
print(df_hillary_before.to_string(index=False))

# Update Hillary's position
update_employee_sql = """
UPDATE employees
SET position = 'Manager'
WHERE first_name = 'Hillary' AND last_name = 'McAllister';
"""

with engine.connect() as connection:
    result = connection.execute(text(update_employee_sql))
    connection.commit()
    print("\n✓ Updated employee record - Hillary promoted to Manager!")

# View the updated record
print("\nAFTER UPDATE:")
print("="*60)
df_hillary_after = pd.read_sql(
    "SELECT * FROM employees WHERE first_name = 'Hillary' AND last_name = 'McAllister';",
    engine
)
print(df_hillary_after.to_string(index=False))

BEFORE UPDATE:
 employee_id first_name  last_name                  contact  hire_date position
           1    Hillary McAllister hillary.m@restaurant.com 2024-01-15   Server

✓ Updated employee record - Hillary promoted to Manager!

AFTER UPDATE:
 employee_id first_name  last_name                  contact  hire_date position
           1    Hillary McAllister hillary.m@restaurant.com 2024-01-15  Manager


### Example 4: Bulk Update with CSV

Sometimes you need to update many records at once. You can:
1. Export data to CSV
2. Edit the CSV in Excel or another tool
3. Re-import the data

Let's demonstrate this with menu item descriptions.
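
One caution when applying CSV edits back to a database: text typed into a spreadsheet often contains apostrophes, and pasting such values straight into an SQL string ends the string literal early. The sketch below shows the safer pattern of passing values as bound parameters. It uses hypothetical edited descriptions and an in-memory SQLite table as a stand-in for `menu_items` (assumptions so it runs anywhere).

```python
import csv
import io
import sqlite3

# Hypothetical edited CSV; note the apostrophe in the first description
edited_csv = """item_id,description
1,Chef's signature burger with cheddar
2,Romaine with house-made Caesar dressing
"""

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE menu_items (item_id INTEGER PRIMARY KEY, name TEXT, description TEXT);
    INSERT INTO menu_items VALUES
        (1, 'Cheeseburger', 'Classic burger with cheddar cheese'),
        (2, 'Caesar Salad', 'Romaine lettuce with Caesar dressing'),
        (3, 'Soda', 'Assorted soft drinks');
""")

# CSV values arrive as strings, so convert item_id to an integer
updates = [
    {"item_id": int(row["item_id"]), "description": row["description"]}
    for row in csv.DictReader(io.StringIO(edited_csv))
]

# Values travel separately from the SQL text, so the apostrophe in
# "Chef's" cannot terminate the string literal
cur = conn.executemany(
    "UPDATE menu_items SET description = :description WHERE item_id = :item_id",
    updates,
)
conn.commit()
updated_rows = cur.rowcount

descriptions = dict(
    conn.execute("SELECT item_id, description FROM menu_items").fetchall()
)
print(updated_rows, descriptions[1])
conn.close()
```

Only the two rows listed in the CSV change; item 3 keeps its original description because no row in the file matches its `item_id`.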

In [13]:
# Export current menu items to CSV
export_csv_path = os.path.join(csv_dir, 'menu_items_export.csv')
df_menu_export = pd.read_sql("SELECT * FROM menu_items;", engine)
df_menu_export.to_csv(export_csv_path, index=False)
print(f"✓ Exported menu items to: {export_csv_path}")

# Simulate editing the CSV - let's add "UPDATED:" prefix to all descriptions
df_menu_export['description'] = 'UPDATED: ' + df_menu_export['description'].astype(str)

# Save the modified CSV
updated_csv_path = os.path.join(csv_dir, 'menu_items_updated.csv')
df_menu_export.to_csv(updated_csv_path, index=False)
print(f"✓ Created updated CSV: {updated_csv_path}")

# Show the changes we're about to make
print("\nChanges to be applied:")
print("="*80)
print(df_menu_export[['name', 'description']].to_string(index=False))

# To apply updates from the CSV, read the edited file back in and run one UPDATE per row.
# Bound parameters (:description, :item_id) keep quotes in the data from breaking the SQL.
print("\nApplying updates...")
df_updates = pd.read_csv(updated_csv_path)
update_sql = text("UPDATE menu_items SET description = :description WHERE item_id = :item_id")
with engine.begin() as connection:  # one transaction for all rows; commits on success
    for _, row in df_updates.iterrows():
        connection.execute(
            update_sql,
            {"description": row["description"], "item_id": int(row["item_id"])},
        )

print(f"✓ Updated {len(df_updates)} menu item descriptions")

# Verify the updates
print("\nVERIFICATION:")
print("="*80)
df_verify = pd.read_sql("SELECT name, description FROM menu_items;", engine)
print(df_verify.to_string(index=False))

✓ Exported menu items to: /workspaces/Fall2025-MS3083-Base_Template/databases/menu_items_export.csv
✓ Created updated CSV: /workspaces/Fall2025-MS3083-Base_Template/databases/menu_items_updated.csv

Changes to be applied:
            name                                   description
    Cheeseburger   UPDATED: Classic burger with cheddar cheese
    Caesar Salad UPDATED: Romaine lettuce with Caesar dressing
Margherita Pizza           UPDATED: Fresh mozzarella and basil
   Chicken Wings                  UPDATED: Spicy buffalo wings
  Chocolate Cake            UPDATED: Rich chocolate layer cake
    French Fries                  UPDATED: Crispy golden fries
            Soda                 UPDATED: Assorted soft drinks
        Iced Tea              UPDATED: Freshly brewed iced tea

Applying updates...
✓ Updated 8 menu item descriptions

VERIFICATION:
            name                                   description
    Cheeseburger   UPDATED: Classic burger with cheddar cheese
    Caesar Sal

## Part 7: Summary Statistics and Reporting

Now that we have data in our database, let's generate some useful reports.
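
Reports built on a LEFT JOIN keep rows that have no matches, and for those rows SUM returns NULL (which pandas displays as NaN). If a zero reads better in a report, COALESCE can supply one. The sketch below shows both versions side by side on a two-employee in-memory SQLite database (simplified stand-in tables, assumed for portability).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employees (employee_id INTEGER PRIMARY KEY, first_name TEXT);
    CREATE TABLE orders (order_id INTEGER PRIMARY KEY, employee_id INTEGER, total_price REAL);
    INSERT INTO employees VALUES (1, 'Hillary'), (2, 'John');
    INSERT INTO orders (employee_id, total_price) VALUES (1, 25.98), (1, 5.98);
""")

report = conn.execute("""
    SELECT e.first_name,
           COUNT(o.order_id)               AS total_orders,
           SUM(o.total_price)              AS raw_sales,    -- NULL when there are no orders
           COALESCE(SUM(o.total_price), 0) AS total_sales   -- 0 when there are no orders
    FROM employees e
    LEFT JOIN orders o ON e.employee_id = o.employee_id
    GROUP BY e.employee_id, e.first_name
    ORDER BY e.employee_id
""").fetchall()

for row in report:
    print(row)
conn.close()
```

COUNT(o.order_id) already returns 0 for employees without orders because COUNT ignores NULLs; only the SUM needs the extra COALESCE.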

In [14]:
# Sales by Employee
sales_by_employee_query = """
SELECT 
    e.first_name || ' ' || e.last_name AS employee_name,
    e.position,
    COUNT(o.order_id) AS total_orders,
    SUM(o.total_price) AS total_sales
FROM employees e
LEFT JOIN orders o ON e.employee_id = o.employee_id
GROUP BY e.employee_id, e.first_name, e.last_name, e.position
ORDER BY total_sales DESC NULLS LAST;
"""

print("SALES BY EMPLOYEE:")
print("="*80)
df_sales = pd.read_sql(sales_by_employee_query, engine)
print(df_sales.to_string(index=False))

# Popular Menu Items
popular_items_query = """
SELECT 
    m.name,
    m.category,
    COUNT(o.order_id) AS times_ordered,
    SUM(o.quantity) AS total_quantity_sold,
    SUM(o.total_price) AS total_revenue
FROM menu_items m
LEFT JOIN orders o ON m.item_id = o.item_id
GROUP BY m.item_id, m.name, m.category
ORDER BY total_quantity_sold DESC NULLS LAST;
"""

print("\n\nPOPULAR MENU ITEMS:")
print("="*80)
df_popular = pd.read_sql(popular_items_query, engine)
print(df_popular.to_string(index=False))

# Inventory Status
inventory_status_query = """
SELECT 
    m.name,
    i.quantity AS current_stock,
    s.name AS supplier,
    i.last_restock_date,
    CASE 
        WHEN i.quantity < 20 THEN 'LOW STOCK'
        WHEN i.quantity < 50 THEN 'MEDIUM'
        ELSE 'GOOD'
    END AS stock_status
FROM inventory i
JOIN menu_items m ON i.item_id = m.item_id
JOIN suppliers s ON i.supplier_id = s.supplier_id
ORDER BY i.quantity ASC;
"""

print("\n\nINVENTORY STATUS:")
print("="*80)
df_inv_status = pd.read_sql(inventory_status_query, engine)
print(df_inv_status.to_string(index=False))

SALES BY EMPLOYEE:
     employee_name position  total_orders  total_sales
     Sarah Johnson   Server             4        65.94
Hillary McAllister  Manager             4        57.93
       Emily Brown     Host             0          NaN
        Mike Davis     Cook             0          NaN
        John Smith  Manager             0          NaN


POPULAR MENU ITEMS:
            name  category  times_ordered  total_quantity_sold  total_revenue
   Chicken Wings Appetizer              1                  3.0          32.97
    Cheeseburger    Entree              2                  3.0          38.97
            Soda  Beverage              1                  2.0           5.98
  Chocolate Cake   Dessert              1                  2.0          15.98
Margherita Pizza    Entree              1                  1.0          14.99
    Caesar Salad     Salad              1                  1.0           9.99
    French Fries      Side              1                  1.0           4.99
     

## Practice Exercise

Try these exercises on your own:

1. **Add a new menu item** from CSV or INSERT statement
2. **Update inventory quantities** after processing more orders
3. **Add a new employee** and assign them some orders
4. **Create a report** showing daily sales totals
5. **Update supplier contact information**
6. **Calculate which menu items need restocking** (quantity < 25)

Use the cells below to practice!

In [15]:
# YOUR PRACTICE CODE HERE
# Exercise 1: Add a new menu item


In [16]:
# YOUR PRACTICE CODE HERE
# Exercise 2: Update inventory after sales


In [17]:
# YOUR PRACTICE CODE HERE
# Exercise 3: Add a new employee


## Summary

In this lesson, you learned:

✓ How to **create a new database** in PostgreSQL
✓ How to **create tables** with proper relationships and foreign keys
✓ How to **import data from CSV files** using pandas
✓ How to **insert data** using SQL INSERT statements
✓ How to **update existing data** with WHERE clauses
✓ How to **bulk update from CSV files**
✓ How to **query and report** on your data

### Key Takeaways:

1. **Always use WHERE clauses** with UPDATE statements
2. **Create tables in dependency order** (parent tables before child tables)
3. **Use pandas for bulk CSV imports** - much easier than manual INSERT statements
4. **Verify your changes** by querying the data after updates
5. **Join tables** to create meaningful reports

### Next Steps:

- Practice with your own database design
- Learn about database transactions and rollback
- Explore more complex queries with aggregations
- Study database normalization and optimization