## Part 1: Setup and Create Car Inventory Database

We'll create a simple car dealership inventory system with one table to track vehicles.

**Table Structure:**
- `car_id` - Unique identifier (Primary Key)
- `make` - Manufacturer (Toyota, Ford, etc.)
- `model` - Model name (Camry, Mustang, etc.)
- `year` - Year manufactured
- `color` - Car color
- `price` - Selling price
- `in_stock` - Whether car is available (TRUE/FALSE)
- `created_at` - Timestamp set automatically when the row is inserted

In [26]:
# Import necessary libraries
import psycopg2
import pandas as pd
from sqlalchemy import create_engine, text
import subprocess

# Database parameters
DB_NAME = "car_inventory_db"
DB_USER = "student"
DB_PASSWORD = ""
DB_HOST = "localhost"
DB_PORT = "5432"

print("=" * 80)
print("STEP 1: Creating car inventory database...")

# Connect to postgres database to create our new database
try:
    conn = psycopg2.connect(dbname="postgres", user=DB_USER, password=DB_PASSWORD,
                           host=DB_HOST, port=DB_PORT)
    conn.autocommit = True
    cursor = conn.cursor()
    
    # Terminate existing connections
    cursor.execute(f"""
        SELECT pg_terminate_backend(pg_stat_activity.pid)
        FROM pg_stat_activity
        WHERE pg_stat_activity.datname = '{DB_NAME}'
          AND pid <> pg_backend_pid();
    """)
    
    # Drop and recreate database
    cursor.execute(f"DROP DATABASE IF EXISTS {DB_NAME}")
    cursor.execute(f"CREATE DATABASE {DB_NAME}")
    print(f"✓ Database '{DB_NAME}' created successfully!")
    
    cursor.close()
    conn.close()
except Exception as e:
    print(f"❌ Error: {e}")

# Create SQLAlchemy engine for our new database
print("\nSTEP 2: Connecting to database...")
connection_string = f"postgresql://{DB_USER}:{DB_PASSWORD}@{DB_HOST}:{DB_PORT}/{DB_NAME}"
engine = create_engine(connection_string)

try:
    with engine.connect() as connection:
        result = connection.execute(text("SELECT version();"))
        version = result.fetchone()[0]
        print("✓ Successfully connected!")
        print(f"✓ PostgreSQL version: {version.split(',')[0]}")
except Exception as e:
    print(f"❌ Error: {e}")

print("\n" + "=" * 80)
print("DATABASE READY! Now we'll create tables and learn INSERT/UPDATE/DELETE.")
print("=" * 80)

STEP 1: Creating car inventory database...
✓ Database 'car_inventory_db' created successfully!

STEP 2: Connecting to database...
✓ Successfully connected!
✓ PostgreSQL version: PostgreSQL 18.0 on x86_64-conda-linux-gnu

DATABASE READY! Now we'll create tables and learn INSERT/UPDATE/DELETE.


## Part 2: CREATE TABLE

Before we can INSERT data, we need a table to store it.

**CREATE TABLE Syntax:**
```sql
CREATE TABLE table_name (
    column1 datatype constraints,
    column2 datatype constraints,
    ...
);
```

**Common Data Types:**
- `INTEGER` or `SERIAL` - Whole numbers (SERIAL auto-increments)
- `VARCHAR(n)` - Text up to n characters
- `DECIMAL(p,s)` - Exact decimal numbers (p = total number of digits, s = digits after the decimal point)
- `BOOLEAN` - TRUE/FALSE
- `DATE`, `TIMESTAMP` - Date and time values

**Common Constraints:**
- `PRIMARY KEY` - Unique identifier for each row
- `NOT NULL` - Column must have a value
- `UNIQUE` - No duplicate values allowed
- `DEFAULT` - Default value if none provided
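
To see these constraints in action without touching the course database, here is a minimal sketch that uses Python's built-in `sqlite3` module as a stand-in for PostgreSQL. That substitution is an assumption made so the example runs anywhere, and SQLite is looser about data types than PostgreSQL. The `demo_cars` table and its `vin` column are hypothetical.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for PostgreSQL (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE demo_cars (
        car_id   INTEGER PRIMARY KEY,      -- unique row identifier
        make     VARCHAR(50) NOT NULL,     -- a value is required
        vin      VARCHAR(17) UNIQUE,       -- hypothetical column: no duplicates
        in_stock BOOLEAN DEFAULT TRUE      -- used when no value is given
    )
""")

# DEFAULT: in_stock is filled in automatically because we did not supply it.
conn.execute("INSERT INTO demo_cars (make, vin) VALUES ('Toyota', 'VIN001')")
in_stock = conn.execute("SELECT in_stock FROM demo_cars").fetchone()[0]


def try_insert(sql):
    """Run an INSERT and report whether a constraint rejected it."""
    try:
        conn.execute(sql)
        return "accepted"
    except sqlite3.IntegrityError as exc:
        return f"rejected: {exc}"


# NOT NULL: make is missing. UNIQUE: the vin value is already taken.
not_null_result = try_insert("INSERT INTO demo_cars (make, vin) VALUES (NULL, 'VIN002')")
unique_result = try_insert("INSERT INTO demo_cars (make, vin) VALUES ('Honda', 'VIN001')")

print(bool(in_stock), not_null_result, unique_result, sep="\n")
```

PostgreSQL rejects the same two inserts with its own error messages, so the wording of the errors will differ from what SQLite prints.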

In [27]:
# Create the car_inventory table
create_table_query = """
CREATE TABLE IF NOT EXISTS car_inventory (
    car_id SERIAL PRIMARY KEY,           -- Auto-incrementing ID
    make VARCHAR(50) NOT NULL,           -- Required field (can't be NULL)
    model VARCHAR(50) NOT NULL,          -- Required field
    year INTEGER NOT NULL,               -- Year manufactured
    color VARCHAR(30),                   -- Optional field (can be NULL)
    price DECIMAL(10,2),                 -- Price with 2 decimal places
    in_stock BOOLEAN DEFAULT TRUE,       -- Default to TRUE if not specified
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP  -- Auto-set creation time
);
"""

# Execute the CREATE TABLE statement
with engine.connect() as conn:
    conn.execute(text(create_table_query))
    conn.commit()
    print("✓ Table 'car_inventory' created successfully!")

# Verify table structure
structure_query = """
SELECT column_name, data_type, character_maximum_length, is_nullable, column_default
FROM information_schema.columns
WHERE table_name = 'car_inventory'
ORDER BY ordinal_position;
"""

df_structure = pd.read_sql(text(structure_query), engine)
print("\nTable Structure:")
df_structure

✓ Table 'car_inventory' created successfully!

Table Structure:


| | column_name | data_type | character_maximum_length | is_nullable | column_default |
|---|---|---|---|---|---|
| 0 | car_id | integer | | NO | nextval('car_inventory_car_id_seq'::regclass) |
| 1 | make | character varying | 50.0 | NO | |
| 2 | model | character varying | 50.0 | NO | |
| 3 | year | integer | | NO | |
| 4 | color | character varying | 30.0 | YES | |
| 5 | price | numeric | | YES | |
| 6 | in_stock | boolean | | YES | true |
| 7 | created_at | timestamp without time zone | | YES | CURRENT_TIMESTAMP |


## Part 3: INSERT Statements - Adding Data

The INSERT statement adds new rows to a table.

**Basic Syntax:**
```sql
INSERT INTO table_name (column1, column2, ...)
VALUES (value1, value2, ...);
```

**Important Notes:**
- Column names are optional if you provide values for ALL columns in order
- Text values must be in single quotes: `'Toyota'`
- Numbers don't need quotes: `2023`
- SERIAL columns auto-increment - don't specify them
- Can insert multiple rows at once
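
The notes above apply to values typed directly into the SQL text. When values come from Python variables, bind parameters are a common way to avoid quoting mistakes (for example, a model name containing an apostrophe). The sketch below uses Python's built-in `sqlite3` as a stand-in so it runs without a PostgreSQL server (an assumption). The `:name` placeholder style shown here is also accepted by SQLAlchemy's `text()`, where the values are passed as a dictionary to `conn.execute()`.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for PostgreSQL (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE car_inventory (
        car_id INTEGER PRIMARY KEY,   -- SQLite's counterpart to an auto-incrementing ID
        make   VARCHAR(50) NOT NULL,
        model  VARCHAR(50) NOT NULL,
        year   INTEGER NOT NULL
    )
""")

# Placeholders instead of literals: the driver handles quoting for us.
insert_sql = """
INSERT INTO car_inventory (make, model, year)
VALUES (:make, :model, :year)
"""

new_cars = [
    {"make": "Toyota", "model": "Camry", "year": 2022},
    {"make": "Kia", "model": "Cee'd", "year": 2021},  # apostrophe needs no escaping
]
conn.executemany(insert_sql, new_cars)
conn.commit()

rows = conn.execute(
    "SELECT car_id, make, model, year FROM car_inventory ORDER BY car_id"
).fetchall()
print(rows)
```

Note that `car_id` is never supplied; the database assigns it, just as a SERIAL column does in PostgreSQL.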

### Example 1: Insert a Single Row

Let's add our first car to the inventory.

In [28]:
# Insert a single car
insert_single_query = """
INSERT INTO car_inventory (make, model, year, color, price, in_stock)
VALUES ('Toyota', 'Camry', 2022, 'Blue', 28500.00, TRUE);
"""

with engine.connect() as conn:
    conn.execute(text(insert_single_query))
    conn.commit()
    print("✓ Inserted 1 car into inventory")

# Verify the insert
verify_query = "SELECT * FROM car_inventory;"
df_verify = pd.read_sql(text(verify_query), engine)
print("\nCurrent Inventory:")
df_verify

✓ Inserted 1 car into inventory

Current Inventory:


| | car_id | make | model | year | color | price | in_stock | created_at |
|---|---|---|---|---|---|---|---|---|
| 0 | 1 | Toyota | Camry | 2022 | Blue | 28500.0 | True | 2025-11-18 03:17:14.730960 |


### Example 2: Insert Multiple Rows at Once

A single multi-row INSERT is more efficient than running a separate INSERT for each row.

**Syntax:**
```sql
INSERT INTO table_name (columns...)
VALUES 
    (row1_values...),
    (row2_values...),
    (row3_values...);
```

In [29]:
# Insert multiple cars at once
insert_multiple_query = """
INSERT INTO car_inventory (make, model, year, color, price, in_stock)
VALUES 
    ('Honda', 'Accord', 2023, 'Silver', 32000.00, TRUE),
    ('Ford', 'Mustang', 2021, 'Red', 45000.00, TRUE),
    ('Chevrolet', 'Corvette', 2023, 'Yellow', 75000.00, FALSE),
    ('Tesla', 'Model 3', 2023, 'White', 48000.00, TRUE);
"""

with engine.connect() as conn:
    result = conn.execute(text(insert_multiple_query))
    conn.commit()
    print(f"✓ Inserted {result.rowcount} more cars into inventory")

# View updated inventory
df_inventory = pd.read_sql(text("SELECT * FROM car_inventory ORDER BY car_id;"), engine)
print(f"\nTotal cars in inventory: {len(df_inventory)}\n")
df_inventory

✓ Inserted 4 more cars into inventory

Total cars in inventory: 5



| | car_id | make | model | year | color | price | in_stock | created_at |
|---|---|---|---|---|---|---|---|---|
| 0 | 1 | Toyota | Camry | 2022 | Blue | 28500.0 | True | 2025-11-18 03:17:14.730960 |
| 1 | 2 | Honda | Accord | 2023 | Silver | 32000.0 | True | 2025-11-18 03:17:14.757405 |
| 2 | 3 | Ford | Mustang | 2021 | Red | 45000.0 | True | 2025-11-18 03:17:14.757405 |
| 3 | 4 | Chevrolet | Corvette | 2023 | Yellow | 75000.0 | False | 2025-11-18 03:17:14.757405 |
| 4 | 5 | Tesla | Model 3 | 2023 | White | 48000.0 | True | 2025-11-18 03:17:14.757405 |


### Example 3: INSERT with RETURNING Clause

PostgreSQL's `RETURNING` clause lets you get values from inserted rows.

**Use Cases:**
- Get the auto-generated ID of the inserted row
- Verify what was actually inserted
- Get default values that were applied

In [30]:
# Insert and return the generated ID and timestamp
insert_returning_query = """
INSERT INTO car_inventory (make, model, year, color, price, in_stock)
VALUES ('BMW', 'X5', 2022, 'Black', 65000.00, TRUE)
RETURNING car_id, make, model, created_at;  -- Return these columns from inserted row
"""

with engine.connect() as conn:
    result = conn.execute(text(insert_returning_query))
    returned_row = result.fetchone()
    conn.commit()
    
    print("✓ Inserted new car!")
    print("\nReturned values:")
    print(f"  Car ID: {returned_row[0]}")
    print(f"  Make: {returned_row[1]}")
    print(f"  Model: {returned_row[2]}")
    print(f"  Created at: {returned_row[3]}")

✓ Inserted new car!

Returned values:
  Car ID: 6
  Make: BMW
  Model: X5
  Created at: 2025-11-18 03:17:14.772993


## Part 4: Bulk Loading with COPY - Importing CSV Files

For large datasets, inserting rows one statement at a time is slow. PostgreSQL's `COPY` command loads many rows in a single operation and is much faster.

**Two Methods:**
1. **COPY FROM** - Native PostgreSQL command; when given a file path, the server reads the file itself, so the database role needs permission to read server-side files
2. **pandas to_sql()** - Python-based, easier for most use cases

**CSV File Location:**
- `/workspaces/Fall2025-MS3083-Base_Template/data/car_inventory.csv`
- `/workspaces/Fall2025-MS3083-Base_Template/data/new_cars.csv`

### Method 1: Using PostgreSQL COPY Command

Fastest method for bulk loading. Uses PostgreSQL's native COPY command.

**Syntax:**
```sql
COPY table_name (columns...)
FROM '/path/to/file.csv'
WITH (FORMAT CSV, HEADER TRUE, DELIMITER ',');
```

In [31]:
# First, let's look at the CSV file
csv_path = "/workspaces/Fall2025-MS3083-Base_Template/data/car_inventory.csv"

print("Preview of car_inventory.csv:")
print("=" * 80)
df_preview = pd.read_csv(csv_path)
print(df_preview.head())
print(f"\nTotal rows in CSV: {len(df_preview)}")

Preview of car_inventory.csv:
   car_id       make     model  year   color  price  in_stock
0       1     Toyota     Camry  2022    Blue  28500      True
1       2      Honda    Accord  2023  Silver  32000      True
2       3       Ford   Mustang  2021     Red  45000      True
3       4  Chevrolet  Corvette  2023  Yellow  75000     False
4       5      Tesla   Model 3  2023   White  48000      True

Total rows in CSV: 10


In [32]:
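# Use the psql command line to execute COPY
# Note: SQLAlchemy has no dedicated COPY API, so we shell out to psql.
# COPY ... FROM '<path>' makes the *server* read the file, which requires
# superuser rights or membership in the pg_read_server_files role.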

copy_command = f"""
COPY car_inventory (car_id, make, model, year, color, price, in_stock)
FROM '{csv_path}'
WITH (FORMAT CSV, HEADER TRUE, DELIMITER ',');
"""

# Execute via psql command line
psql_cmd = f"psql -U {DB_USER} -d {DB_NAME} -c \"{copy_command}\""
result = subprocess.run(psql_cmd, shell=True, capture_output=True, text=True)

if result.returncode == 0:
    print("✓ CSV data loaded successfully using COPY command!")
    print(result.stdout)
else:
    print("Note: COPY command requires specific permissions.")
    print("Let's use the pandas method instead (shown next)...")

# Check total count
count_query = "SELECT COUNT(*) as total FROM car_inventory;"
df_count = pd.read_sql(text(count_query), engine)
print(f"\nTotal cars in database: {df_count['total'][0]}")

Note: COPY command requires specific permissions.
Let's use the pandas method instead (shown next)...

Total cars in database: 6
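
Server-side `COPY ... FROM 'file'` is restricted because the file is opened by the PostgreSQL server process, not by your notebook. A commonly used alternative is `COPY ... FROM STDIN`, where Python reads the file and streams it over the existing connection. The sketch below assumes a psycopg2 connection like the one opened in Part 1; `copy_csv_from_client` is a hypothetical helper name, and the code has not been run against the course database. Note also that the CSV preview shows `car_id` values 1 to 5, which already exist in the table, so loading this file into the populated table would be expected to hit a primary-key conflict regardless of permissions.

```python
# Client-side variant: Python reads the file and streams it to the server.
COPY_SQL = """
COPY car_inventory (car_id, make, model, year, color, price, in_stock)
FROM STDIN WITH (FORMAT CSV, HEADER TRUE)
"""


def copy_csv_from_client(pg_conn, csv_file):
    """Stream an open CSV file into car_inventory.

    `pg_conn` is assumed to be a psycopg2 connection and `csv_file` an open
    text file (or any object with a read() method).
    """
    with pg_conn.cursor() as cur:
        # copy_expert is psycopg2's interface for COPY ... FROM STDIN / TO STDOUT
        cur.copy_expert(COPY_SQL, csv_file)
    pg_conn.commit()


# Example call (needs a running PostgreSQL server, so it is left commented out):
# conn = psycopg2.connect(dbname=DB_NAME, user=DB_USER, password=DB_PASSWORD,
#                         host=DB_HOST, port=DB_PORT)
# with open(csv_path) as f:
#     copy_csv_from_client(conn, f)
```

Because the data travels through the client connection, this variant needs only ordinary INSERT privileges on the table.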


### Method 2: Using pandas to_sql() - Easier Alternative

More portable and easier to use. Works with any database that SQLAlchemy supports.

**Advantages:**
- No file permission issues
- Works with DataFrames from any source
- Can transform data before loading
- Automatic schema inference

In [33]:
# Load CSV with pandas and insert into database
new_cars_path = "/workspaces/Fall2025-MS3083-Base_Template/data/new_cars.csv"

# Read CSV file
df_new_cars = pd.read_csv(new_cars_path)
print("New cars to add:")
print(df_new_cars)

# Insert into database using to_sql()
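# if_exists='append' adds to the existing table
# (caution: 'replace' drops the table and recreates it from the DataFrame,
#  losing the PRIMARY KEY, NOT NULL, and DEFAULT definitions from Part 2)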
df_new_cars.to_sql(
    'car_inventory',           # Table name
    engine,                    # Database connection
    if_exists='append',        # Append to existing table
    index=False,               # Don't write DataFrame index as a column
    method='multi'             # Faster bulk insert
)

print(f"\n✓ Inserted {len(df_new_cars)} cars using pandas!")

# Verify all data
df_all = pd.read_sql(text("SELECT * FROM car_inventory ORDER BY car_id;"), engine)
print(f"\nTotal cars in inventory: {len(df_all)}\n")
df_all

New cars to add:
   car_id        make     model  year   color  price  in_stock
0      11  Volkswagen     Jetta  2022   White  24500      True
1      12     Hyundai    Sonata  2023   Black  29000      True
2      13         Kia    Optima  2021    Blue  25000     False
3      14      Subaru   Outback  2023   Green  36000      True
4      15        Jeep  Wrangler  2022  Orange  42000      True

✓ Inserted 5 cars using pandas!

Total cars in inventory: 11



| | car_id | make | model | year | color | price | in_stock | created_at |
|---|---|---|---|---|---|---|---|---|
| 0 | 1 | Toyota | Camry | 2022 | Blue | 28500.0 | True | 2025-11-18 03:17:14.730960 |
| 1 | 2 | Honda | Accord | 2023 | Silver | 32000.0 | True | 2025-11-18 03:17:14.757405 |
| 2 | 3 | Ford | Mustang | 2021 | Red | 45000.0 | True | 2025-11-18 03:17:14.757405 |
| 3 | 4 | Chevrolet | Corvette | 2023 | Yellow | 75000.0 | False | 2025-11-18 03:17:14.757405 |
| 4 | 5 | Tesla | Model 3 | 2023 | White | 48000.0 | True | 2025-11-18 03:17:14.757405 |
| 5 | 6 | BMW | X5 | 2022 | Black | 65000.0 | True | 2025-11-18 03:17:14.772993 |
| 6 | 11 | Volkswagen | Jetta | 2022 | White | 24500.0 | True | 2025-11-18 03:17:14.808293 |
| 7 | 12 | Hyundai | Sonata | 2023 | Black | 29000.0 | True | 2025-11-18 03:17:14.808293 |
| 8 | 13 | Kia | Optima | 2021 | Blue | 25000.0 | False | 2025-11-18 03:17:14.808293 |
| 9 | 14 | Subaru | Outback | 2023 | Green | 36000.0 | True | 2025-11-18 03:17:14.808293 |
| 10 | 15 | Jeep | Wrangler | 2022 | Orange | 42000.0 | True | 2025-11-18 03:17:14.808293 |
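
One side effect worth knowing: the rows with `car_id` 11 to 15 were inserted with explicit IDs, so the SERIAL sequence behind `car_id` was not advanced and still stands at 6. Later inserts that omit `car_id` continue from 7 and would eventually collide with 11. A possible fix is to move the sequence to the current maximum using PostgreSQL's `setval` and `pg_get_serial_sequence` functions. The sketch below assumes a psycopg2 connection like the one in Part 1; `resync_car_id_sequence` is a hypothetical helper name, and the code has not been run against the course database.

```python
# Move the sequence behind car_inventory.car_id to the largest existing ID,
# so the next auto-generated value is MAX(car_id) + 1.
RESYNC_SQL = """
SELECT setval(
    pg_get_serial_sequence('car_inventory', 'car_id'),
    (SELECT MAX(car_id) FROM car_inventory)
);
"""


def resync_car_id_sequence(pg_conn):
    """Resynchronize the car_id sequence and return its new current value.

    `pg_conn` is assumed to be a psycopg2 connection. If the table is empty,
    MAX(car_id) is NULL, setval returns NULL, and the sequence is unchanged.
    """
    with pg_conn.cursor() as cur:
        cur.execute(RESYNC_SQL)
        current_value = cur.fetchone()[0]
    pg_conn.commit()
    return current_value
```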


## Part 5: UPDATE Statements - Modifying Data

UPDATE changes existing data in a table.

**Syntax:**
```sql
UPDATE table_name
SET column1 = value1, column2 = value2, ...
WHERE condition;
```

**⚠️ CRITICAL WARNING:**
- **Always use WHERE clause** or ALL rows will be updated!
- Test with SELECT first to see what will be changed
- Use transactions in production to allow rollback
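
The three rules above can be combined into a small guard: run the UPDATE inside a transaction, compare the number of affected rows with what you expected, and roll back if they differ. This is a minimal sketch using Python's built-in `sqlite3` as a stand-in for PostgreSQL (an assumption so it runs anywhere); `guarded_update` is a hypothetical helper, not part of any library.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for PostgreSQL (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE car_inventory (car_id INTEGER PRIMARY KEY, make TEXT, price NUMERIC)")
conn.executemany(
    "INSERT INTO car_inventory (car_id, make, price) VALUES (?, ?, ?)",
    [(1, "Toyota", 28500), (2, "Honda", 32000), (3, "Ford", 45000)],
)
conn.commit()


def guarded_update(conn, update_sql, params, expected_rows):
    """Apply an UPDATE, but roll back if it touches an unexpected number of rows."""
    cur = conn.execute(update_sql, params)
    if cur.rowcount != expected_rows:
        conn.rollback()   # undo the change; nothing is saved
        return False
    conn.commit()
    return True


# WHERE clause forgotten: all 3 rows would change, so the guard rolls back.
ok_without_where = guarded_update(
    conn, "UPDATE car_inventory SET price = :p", {"p": 27500}, expected_rows=1
)

# With the WHERE clause exactly one row changes and the update is committed.
ok_with_where = guarded_update(
    conn,
    "UPDATE car_inventory SET price = :p WHERE car_id = :id",
    {"p": 27500, "id": 1},
    expected_rows=1,
)

prices = conn.execute("SELECT price FROM car_inventory ORDER BY car_id").fetchall()
print(ok_without_where, ok_with_where, prices)
```

Only the first car's price ends up changed; the accidental update of every row never reaches the database.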

### Example 1: Update a Single Row

Let's update the price of a specific car.

In [34]:
# First, SELECT to see current value
print("BEFORE UPDATE:")
df_before = pd.read_sql(text("SELECT * FROM car_inventory WHERE car_id = 1;"), engine)
print(df_before)

# Update the price
update_single_query = """
UPDATE car_inventory
SET price = 27500.00          -- New price
WHERE car_id = 1;             -- IMPORTANT: Specify which row to update!
"""

with engine.connect() as conn:
    result = conn.execute(text(update_single_query))
    conn.commit()
    print(f"\n✓ Updated {result.rowcount} row(s)")

# Verify the update
print("\nAFTER UPDATE:")
df_after = pd.read_sql(text("SELECT * FROM car_inventory WHERE car_id = 1;"), engine)
df_after

BEFORE UPDATE:
   car_id    make  model  year color    price  in_stock  \
0       1  Toyota  Camry  2022  Blue  28500.0      True   

                  created_at  
0 2025-11-18 03:17:14.730960  

✓ Updated 1 row(s)

AFTER UPDATE:


| | car_id | make | model | year | color | price | in_stock | created_at |
|---|---|---|---|---|---|---|---|---|
| 0 | 1 | Toyota | Camry | 2022 | Blue | 27500.0 | True | 2025-11-18 03:17:14.730960 |


### Example 2: Update Multiple Columns

You can update multiple columns in one statement.

In [35]:
# Update multiple columns for a specific car
update_multiple_query = """
UPDATE car_inventory
SET 
    price = 70000.00,         -- Update price
    in_stock = TRUE,          -- Mark as in stock
    color = 'Red'             -- Change color
WHERE make = 'Chevrolet' AND model = 'Corvette';  -- Find the Corvette
"""

with engine.connect() as conn:
    result = conn.execute(text(update_multiple_query))
    conn.commit()
    print(f"✓ Updated {result.rowcount} row(s)")

# Verify
df_corvette = pd.read_sql(
    text("SELECT * FROM car_inventory WHERE make = 'Chevrolet' AND model = 'Corvette';"),
    engine
)
print("\nUpdated Corvette:")
df_corvette

✓ Updated 1 row(s)

Updated Corvette:


| | car_id | make | model | year | color | price | in_stock | created_at |
|---|---|---|---|---|---|---|---|---|
| 0 | 4 | Chevrolet | Corvette | 2023 | Red | 70000.0 | True | 2025-11-18 03:17:14.757405 |


### Example 3: Update Multiple Rows with WHERE Clause

Let's give a discount to all cars from 2021.

In [36]:
# See which cars will be affected
print("Cars that will be discounted (year = 2021):")
df_discount_preview = pd.read_sql(
    text("SELECT car_id, make, model, year, price FROM car_inventory WHERE year = 2021;"),
    engine
)
print(df_discount_preview)

# Apply 10% discount to all 2021 models
update_discount_query = """
UPDATE car_inventory
SET price = price * 0.90      -- Reduce price by 10% (multiply by 0.90)
WHERE year = 2021;            -- Only 2021 models
"""

with engine.connect() as conn:
    result = conn.execute(text(update_discount_query))
    conn.commit()
    print(f"\n✓ Applied discount to {result.rowcount} car(s)")

# Show updated prices
print("\nUpdated prices:")
df_after_discount = pd.read_sql(
    text("SELECT car_id, make, model, year, price FROM car_inventory WHERE year = 2021;"),
    engine
)
df_after_discount

Cars that will be discounted (year = 2021):
   car_id  make    model  year    price
0       3  Ford  Mustang  2021  45000.0
1      13   Kia   Optima  2021  25000.0

✓ Applied discount to 2 car(s)

Updated prices:


| | car_id | make | model | year | price |
|---|---|---|---|---|---|
| 0 | 3 | Ford | Mustang | 2021 | 40500.0 |
| 1 | 13 | Kia | Optima | 2021 | 22500.0 |


### Example 4: Convert All Text to Lowercase

Let's standardize our data by converting make and model to lowercase.

**String Functions:**
- `LOWER(text)` - Convert to lowercase
- `UPPER(text)` - Convert to uppercase
- `INITCAP(text)` - Capitalize first letter of each word
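
To get a feel for what these functions produce before running them on the table, here is a rough Python analogy. `str.upper()` and `str.lower()` behave like `UPPER()` and `LOWER()`; `str.title()` only approximates `INITCAP()` and can differ on inputs that mix digits and letters, so treat it as an illustration, not an exact equivalent. The sample values are made up for the demonstration.

```python
# Sample values for illustration only.
makes = ["toyota", "bmw", "land rover"]

for make in makes:
    # upper() ~ UPPER(), title() ~ INITCAP() (approximation)
    print(f"{make!r:14} upper={make.upper()!r:14} title={make.title()!r}")

initcap_like = [make.title() for make in makes]
```

Notice that an acronym such as `bmw` becomes `Bmw`, not `BMW`. PostgreSQL's `INITCAP('bmw')` gives the same result, which is worth keeping in mind for the challenge exercise at the end.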

In [37]:
# Show current mixed case data
print("BEFORE - Mixed case:")
df_before_case = pd.read_sql(
    text("SELECT car_id, make, model FROM car_inventory ORDER BY car_id LIMIT 10;"),
    engine
)
print(df_before_case)

# Convert make and model to lowercase
update_lowercase_query = """
UPDATE car_inventory
SET 
    make = LOWER(make),       -- Convert make to lowercase
    model = LOWER(model);     -- Convert model to lowercase
-- Note: NO WHERE clause means ALL rows are updated!
"""

with engine.connect() as conn:
    result = conn.execute(text(update_lowercase_query))
    conn.commit()
    print(f"\n✓ Converted {result.rowcount} row(s) to lowercase")

# Show updated lowercase data
print("\nAFTER - All lowercase:")
df_after_case = pd.read_sql(
    text("SELECT car_id, make, model FROM car_inventory ORDER BY car_id LIMIT 10;"),
    engine
)
df_after_case

BEFORE - Mixed case:
   car_id        make     model
0       1      Toyota     Camry
1       2       Honda    Accord
2       3        Ford   Mustang
3       4   Chevrolet  Corvette
4       5       Tesla   Model 3
5       6         BMW        X5
6      11  Volkswagen     Jetta
7      12     Hyundai    Sonata
8      13         Kia    Optima
9      14      Subaru   Outback

✓ Converted 11 row(s) to lowercase

AFTER - All lowercase:


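| | car_id | make | model |
|---|---|---|---|
| 0 | 1 | toyota | camry |
| 1 | 2 | honda | accord |
| 2 | 3 | ford | mustang |
| 3 | 4 | chevrolet | corvette |
| 4 | 5 | tesla | model 3 |
| 5 | 6 | bmw | x5 |
| 6 | 11 | volkswagen | jetta |
| 7 | 12 | hyundai | sonata |
| 8 | 13 | kia | optima |
| 9 | 14 | subaru | outback |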


### Example 5: UPDATE with Calculations

You can use calculations in UPDATE statements.

In [38]:
# Increase prices of in-stock cars by 5% for inflation adjustment
update_calculation_query = """
UPDATE car_inventory
SET price = ROUND(price * 1.05, 2)  -- Increase by 5%, round to 2 decimals
WHERE in_stock = TRUE;               -- Only for cars currently in stock
"""

with engine.connect() as conn:
    result = conn.execute(text(update_calculation_query))
    conn.commit()
    print(f"✓ Increased prices for {result.rowcount} car(s) in stock")

# Show updated inventory with new prices
df_new_prices = pd.read_sql(
    text("SELECT make, model, price, in_stock FROM car_inventory ORDER BY price DESC LIMIT 10;"),
    engine
)
print("\nUpdated prices (top 10):")
df_new_prices

✓ Increased prices for 10 car(s) in stock

Updated prices (top 10):


| | make | model | price | in_stock |
|---|---|---|---|---|
| 0 | chevrolet | corvette | 73500.0 | True |
| 1 | bmw | x5 | 68250.0 | True |
| 2 | tesla | model 3 | 50400.0 | True |
| 3 | jeep | wrangler | 44100.0 | True |
| 4 | ford | mustang | 42525.0 | True |
| 5 | subaru | outback | 37800.0 | True |
| 6 | honda | accord | 33600.0 | True |
| 7 | hyundai | sonata | 30450.0 | True |
| 8 | toyota | camry | 28875.0 | True |
| 9 | volkswagen | jetta | 25725.0 | True |
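
As a check on the arithmetic, the Mustang's listed price of 42525.0 follows from its original 45000.00: the 10% discount from Example 3 gives 40500.00, and the 5% increase from this example gives 42525.00. The sketch below reproduces that with Python's `decimal` module, which, like PostgreSQL's `DECIMAL` type, avoids binary floating-point error. `apply_pct` is a hypothetical helper; it rounds after every step, whereas Example 3 did not call `ROUND`, which makes no difference for these values.

```python
from decimal import Decimal, ROUND_HALF_UP


def apply_pct(price, factor):
    """Multiply a price by a factor and round to 2 decimals, like ROUND(price * factor, 2)."""
    return (price * factor).quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


mustang = Decimal("45000.00")
after_discount = apply_pct(mustang, Decimal("0.90"))          # 10% off 2021 models
after_increase = apply_pct(after_discount, Decimal("1.05"))   # +5% for in-stock cars

print(after_discount, after_increase)
```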


## Part 6: DELETE Statements - Removing Data

DELETE removes rows from a table.

**Syntax:**
```sql
DELETE FROM table_name
WHERE condition;
```

**⚠️ CRITICAL WARNING:**
- **Always use WHERE clause** or ALL rows will be deleted!
- DELETE is permanent once committed; only an open (uncommitted) transaction can be rolled back
- Test with SELECT first to see what will be deleted
- Consider soft deletes (`UPDATE ... SET is_deleted = TRUE`) instead
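
A soft delete can be sketched as follows: instead of removing the row, a flag column marks it as deleted and normal queries filter on that flag. The `is_deleted` column is hypothetical (our `car_inventory` table does not have one), and the example uses Python's built-in `sqlite3` as a stand-in for PostgreSQL so it runs without a server.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for PostgreSQL (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE car_inventory (
        car_id     INTEGER PRIMARY KEY,
        make       VARCHAR(50) NOT NULL,
        is_deleted BOOLEAN DEFAULT FALSE   -- hypothetical soft-delete flag
    )
""")
conn.executemany(
    "INSERT INTO car_inventory (car_id, make) VALUES (:car_id, :make)",
    [{"car_id": 1, "make": "Toyota"}, {"car_id": 2, "make": "Honda"}],
)

# "Delete" car 2 by flagging it instead of removing the row.
conn.execute(
    "UPDATE car_inventory SET is_deleted = TRUE WHERE car_id = :car_id",
    {"car_id": 2},
)
conn.commit()

# Everyday queries hide flagged rows; the data itself is still there.
active = conn.execute(
    "SELECT car_id, make FROM car_inventory WHERE is_deleted = FALSE ORDER BY car_id"
).fetchall()
total = conn.execute("SELECT COUNT(*) FROM car_inventory").fetchone()[0]

print(active, total)
```

The trade-off is that every query must remember the `is_deleted = FALSE` filter, in exchange for being able to restore a row by flipping the flag back.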

### Example 1: Delete a Single Row

Remove a specific car by its ID.

In [39]:
# First, see what we're about to delete
print("Car to be deleted:")
df_to_delete = pd.read_sql(text("SELECT * FROM car_inventory WHERE car_id = 15;"), engine)
print(df_to_delete)

# Delete the car
delete_single_query = """
DELETE FROM car_inventory
WHERE car_id = 15;            -- IMPORTANT: Specify which row to delete!
"""

with engine.connect() as conn:
    result = conn.execute(text(delete_single_query))
    conn.commit()
    print(f"\n✓ Deleted {result.rowcount} row(s)")

# Verify deletion
count_query = "SELECT COUNT(*) as total FROM car_inventory;"
df_count = pd.read_sql(text(count_query), engine)
print(f"\nTotal cars remaining: {df_count['total'][0]}")

Car to be deleted:
   car_id  make     model  year   color    price  in_stock  \
0      15  jeep  wrangler  2022  Orange  44100.0      True   

                  created_at  
0 2025-11-18 03:17:14.808293  

✓ Deleted 1 row(s)

Total cars remaining: 10


### Example 2: Delete Multiple Rows with WHERE

Remove all cars that are not in stock and from 2021.

In [40]:
# Preview what will be deleted
print("Cars that will be deleted (NOT in stock AND year = 2021):")
df_delete_preview = pd.read_sql(
    text("SELECT * FROM car_inventory WHERE in_stock = FALSE AND year = 2021;"),
    engine
)
print(df_delete_preview)
print(f"\nTotal to delete: {len(df_delete_preview)}")

# Delete them
delete_multiple_query = """
DELETE FROM car_inventory
WHERE in_stock = FALSE        -- Not in stock
  AND year = 2021;            -- AND from 2021
"""

with engine.connect() as conn:
    result = conn.execute(text(delete_multiple_query))
    conn.commit()
    print(f"\n✓ Deleted {result.rowcount} car(s)")

Cars that will be deleted (NOT in stock AND year = 2021):
   car_id make   model  year color    price  in_stock  \
0      13  kia  optima  2021  Blue  22500.0     False   

                  created_at  
0 2025-11-18 03:17:14.808293  

Total to delete: 1

✓ Deleted 1 car(s)


### Example 3: Delete with Subquery

Delete cars that are priced below the average price.

In [41]:
# First, find the average price
avg_query = "SELECT AVG(price) as avg_price FROM car_inventory;"
df_avg = pd.read_sql(text(avg_query), engine)
avg_price = df_avg['avg_price'][0]
print(f"Average car price: ${avg_price:,.2f}")

# See which cars are below average
print("\nCars below average price:")
df_below_avg = pd.read_sql(
    text(f"SELECT make, model, price FROM car_inventory WHERE price < {avg_price};"),
    engine
)
print(df_below_avg)

# Delete cars below average (using subquery)
delete_subquery = """
DELETE FROM car_inventory
WHERE price < (SELECT AVG(price) FROM car_inventory);  -- Subquery for average
"""

with engine.connect() as conn:
    result = conn.execute(text(delete_subquery))
    conn.commit()
    print(f"\n✓ Deleted {result.rowcount} car(s) below average price")

# Show remaining inventory
df_remaining = pd.read_sql(
    text("SELECT make, model, price FROM car_inventory ORDER BY price;"),
    engine
)
print(f"\nRemaining cars: {len(df_remaining)}")
df_remaining

Average car price: $43,458.33

Cars below average price:
         make    model    price
0       honda   accord  33600.0
1  volkswagen    jetta  25725.0
2     hyundai   sonata  30450.0
3      subaru  outback  37800.0
4      toyota    camry  28875.0
5        ford  mustang  42525.0

✓ Deleted 6 car(s) below average price

Remaining cars: 3


| | make | model | price |
|---|---|---|---|
| 0 | tesla | model 3 | 50400.0 |
| 1 | bmw | x5 | 68250.0 |
| 2 | chevrolet | corvette | 73500.0 |


### Example 4: DELETE with RETURNING

Use the RETURNING clause to see exactly which rows were deleted.

In [42]:
# Delete cars NOT in stock and return details of deleted rows
delete_returning_query = """
DELETE FROM car_inventory
WHERE in_stock = FALSE
RETURNING car_id, make, model, price;  -- Return info about deleted rows
"""

with engine.connect() as conn:
    result = conn.execute(text(delete_returning_query))
    deleted_rows = result.fetchall()
    conn.commit()
    
    print(f"✓ Deleted {len(deleted_rows)} car(s)")
    print("\nDeleted cars:")
    for row in deleted_rows:
        print(f"  ID: {row[0]}, {row[1]} {row[2]}, Price: ${row[3]}")

# Final inventory count
df_final = pd.read_sql(text("SELECT COUNT(*) as total FROM car_inventory;"), engine)
print(f"\nFinal inventory count: {df_final['total'][0]} cars")

✓ Deleted 0 car(s)

Deleted cars:

Final inventory count: 3 cars


## Part 7: Best Practices and Safety

### Always Test Before Modifying Data

In [43]:
# GOOD PRACTICE: Test with SELECT first

# Step 1: SELECT to see what will be affected
test_query = """
SELECT * FROM car_inventory
WHERE year = 2022;
"""
df_test = pd.read_sql(text(test_query), engine)
print(f"Rows that would be affected: {len(df_test)}")
print(df_test)

# Step 2: If the SELECT looks correct, convert to UPDATE/DELETE
# update_query = """
# UPDATE car_inventory
# SET price = price * 0.95
# WHERE year = 2022;
# """

print("\n✓ Always verify with SELECT before running UPDATE or DELETE!")

Rows that would be affected: 1
   car_id make model  year  color    price  in_stock  \
0       6  bmw    x5  2022  Black  68250.0      True   

                  created_at  
0 2025-11-18 03:17:14.772993  

✓ Always verify with SELECT before running UPDATE or DELETE!


### Using Transactions for Safety

Transactions let you roll back changes if something goes wrong.

**Transaction Commands:**
- `BEGIN` - Start transaction
- `COMMIT` - Save changes permanently
- `ROLLBACK` - Undo all changes since BEGIN
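
To show what ROLLBACK actually undoes, the sketch below runs two statements in one transaction where the second one fails; rolling back also reverts the first. It uses Python's built-in `sqlite3` as a stand-in for PostgreSQL (an assumption so it runs without a server), where `conn.commit()` and `conn.rollback()` play the roles of COMMIT and ROLLBACK.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for PostgreSQL (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE car_inventory (car_id INTEGER PRIMARY KEY, make TEXT NOT NULL, price NUMERIC)"
)
conn.execute("INSERT INTO car_inventory (car_id, make, price) VALUES (1, 'Toyota', 28500)")
conn.commit()

try:
    # Both statements belong to the same transaction.
    conn.execute("UPDATE car_inventory SET price = price * 2")
    conn.execute(
        "INSERT INTO car_inventory (car_id, make, price) VALUES (2, NULL, 30000)"
    )  # fails: make is NOT NULL
    conn.commit()
    outcome = "committed"
except sqlite3.IntegrityError:
    conn.rollback()   # undoes the price change as well
    outcome = "rolled back"

price_after = conn.execute(
    "SELECT price FROM car_inventory WHERE car_id = 1"
).fetchone()[0]
print(outcome, price_after)
```

The price is back at its original value because the UPDATE and the failed INSERT were treated as one all-or-nothing unit.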

In [44]:
# Example of transaction with manual commit
print("Using transactions for safety:\n")

# Get current count
df_before = pd.read_sql(text("SELECT COUNT(*) as count FROM car_inventory;"), engine)
print(f"Cars before: {df_before['count'][0]}")

# Start a transaction
with engine.connect() as conn:
    # Begin an explicit transaction; nothing is saved until commit() is called
    trans = conn.begin()
    
    try:
        # Make some changes
        conn.execute(text("UPDATE car_inventory SET price = price * 1.10;"))
        
        # If everything is OK, commit
        trans.commit()
        print("✓ Transaction committed successfully")
        
    except Exception as e:
        # If there's an error, rollback
        trans.rollback()
        print(f"❌ Error occurred, rolled back: {e}")

print("\nTransaction complete!")

Using transactions for safety:

Cars before: 3
✓ Transaction committed successfully

Transaction complete!


## Part 8: Final Inventory Review

Let's see what's left in our inventory after all the INSERT, UPDATE, and DELETE operations.

In [45]:
# Final inventory summary
summary_query = """
SELECT 
    COUNT(*) as total_cars,
    COUNT(CASE WHEN in_stock = TRUE THEN 1 END) as in_stock,
    COUNT(CASE WHEN in_stock = FALSE THEN 1 END) as not_in_stock,
    ROUND(AVG(price), 2) as avg_price,
    MIN(price) as min_price,
    MAX(price) as max_price
FROM car_inventory;
"""

df_summary = pd.read_sql(text(summary_query), engine)
print("Final Inventory Summary:")
print("=" * 60)
print(df_summary.to_string(index=False))

# Show all remaining cars
print("\n" + "=" * 60)
print("All Cars in Inventory:")
print("=" * 60)
df_all_cars = pd.read_sql(
    text("SELECT car_id, make, model, year, color, price, in_stock FROM car_inventory ORDER BY make, model;"),
    engine
)
df_all_cars

Final Inventory Summary:
 total_cars  in_stock  not_in_stock  avg_price  min_price  max_price
          3         3             0    70455.0    55440.0    80850.0

All Cars in Inventory:


| | car_id | make | model | year | color | price | in_stock |
|---|---|---|---|---|---|---|---|
| 0 | 6 | bmw | x5 | 2022 | Black | 75075.0 | True |
| 1 | 4 | chevrolet | corvette | 2023 | Red | 80850.0 | True |
| 2 | 5 | tesla | model 3 | 2023 | White | 55440.0 | True |


## Practice Exercises

Try these on your own:

1. **INSERT**: Add 3 new cars of your choice to the inventory
2. **UPDATE**: Increase the price of all Toyota cars by 8%
3. **UPDATE**: Change the color of all white cars to 'pearl white'
4. **DELETE**: Remove all cars older than 2021
5. **CHALLENGE**: Update all car makes to proper case (first letter uppercase, rest lowercase)

In [46]:
# YOUR CODE HERE - Exercise 1: INSERT 3 new cars


In [47]:
# YOUR CODE HERE - Exercise 2: UPDATE Toyota prices


In [48]:
# YOUR CODE HERE - Exercise 3: UPDATE white cars


In [49]:
# YOUR CODE HERE - Exercise 4: DELETE old cars


In [50]:
# YOUR CODE HERE - Exercise 5: CHALLENGE - Proper case for makes
# Hint: Use INITCAP() function


## Summary

In this lesson, you learned:

### ✓ INSERT Operations
- Single row INSERT with VALUES
- Multiple row INSERT (batch insert)
- INSERT with RETURNING clause
- Understanding SERIAL auto-increment columns

### ✓ Bulk Loading Data
- Using PostgreSQL COPY command for CSV import
- Using pandas to_sql() for easier data loading
- Advantages and use cases for each method

### ✓ UPDATE Operations
- Single row UPDATE with WHERE clause
- Multiple column UPDATE
- Bulk UPDATE affecting multiple rows
- Using string functions (LOWER, UPPER, INITCAP)
- UPDATE with calculations and expressions

### ✓ DELETE Operations
- Single row DELETE
- Multiple row DELETE with conditions
- DELETE with subqueries
- DELETE with RETURNING clause

### ✓ Best Practices
- **Always use WHERE clause** in UPDATE/DELETE
- Test with SELECT before modifying data
- Use transactions for safety (BEGIN/COMMIT/ROLLBACK)
- Consider soft deletes instead of hard deletes
- Preview affected rows before executing

### Key Takeaways:

1. **INSERT** adds new data - use batch inserts for efficiency
2. **UPDATE** modifies existing data - ALWAYS use WHERE clause
3. **DELETE** removes data permanently - test with SELECT first
4. **COPY** and pandas are best for bulk loading CSV files
5. **RETURNING** clause shows what was inserted/updated/deleted
6. **Transactions** provide safety - you can roll back if needed

### ⚠️ Remember:
- UPDATE without WHERE updates ALL rows
- DELETE without WHERE deletes ALL rows
- Always backup important data before mass changes
- Use transactions in production environments

### Next Steps:
- Learn about constraints (FOREIGN KEY, CHECK, etc.)
- Study more complex UPDATE patterns (from other tables)
- Explore UPSERT (INSERT ON CONFLICT)
- Practice with real-world data scenarios