
# DS2002 — SQL + Python → Pandas DataFrames
## Studio — Working with DataFrames

This studio is designed to be completed **independently**.

You will:
- pull structured data from SQL into a DataFrame
- answer questions using Pandas (not SQL)
- practice filtering, sorting, grouping, and feature creation

The goal is not to memorize syntax.
The goal is to become comfortable *thinking with DataFrames*.



### Submission (Canvas)

1. Complete all TODOs below.
2. Run every cell so outputs are visible.
3. Save your Kaggle notebook.
4. Submit the **Kaggle Notebook GitHub URL** in Canvas.



## Part 0 — Setup (Run This Cell)

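This cell imports SQLite and Pandas, opens an in-memory database, and defines two helper functions: `exec_sql` (run SQL statements) and `q` (run a query and return a DataFrame).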


In [None]:

import sqlite3
import pandas as pd

conn = sqlite3.connect(":memory:")

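def exec_sql(sql):
    # Run one or more SQL statements (CREATE, INSERT, ...) and commit them.
    conn.executescript(sql)
    conn.commit()

def q(sql):
    # Run a SELECT query and return the result as a DataFrame.
    return pd.read_sql_query(sql, conn)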

print("Environment ready.")
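To make the two helpers less of a black box, here is a minimal sketch of the calls they wrap. It uses a separate connection and a made-up `pets` table (both are assumptions for illustration, not part of the studio data), so running it does not affect the store database.

```python
import sqlite3
import pandas as pd

# Separate in-memory database so this sketch does not touch the studio's `conn`.
demo_conn = sqlite3.connect(":memory:")

# executescript() runs several semicolon-separated statements in one call.
demo_conn.executescript("""
CREATE TABLE pets (pet_id INTEGER PRIMARY KEY, name TEXT, species TEXT);
INSERT INTO pets VALUES (1, 'Rex', 'dog'), (2, 'Tom', 'cat');
""")
demo_conn.commit()

# read_sql_query() runs a single SELECT and returns the result as a DataFrame.
pets = pd.read_sql_query("SELECT * FROM pets ORDER BY pet_id", demo_conn)
print(pets)
```

In the studio itself, `exec_sql(...)` and `q(...)` do the same two things against the shared `conn`.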



## Part 1 — Create and Load the Store Database (Provided)

Run the next cell to create tables and insert data.


In [None]:

exec_sql('''
CREATE TABLE customers (
    customer_id INTEGER PRIMARY KEY,
    name TEXT,
    region TEXT
);

CREATE TABLE products (
    product_id INTEGER PRIMARY KEY,
    name TEXT,
    category TEXT,
    price REAL
);

CREATE TABLE orders (
    order_id INTEGER PRIMARY KEY,
    customer_id INTEGER,
    order_date TEXT
);

CREATE TABLE order_items (
    order_id INTEGER,
    product_id INTEGER,
    quantity INTEGER
);

INSERT INTO customers VALUES
(1,'Alice','East'),
(2,'Bob','West'),
(3,'Carol','East'),
(4,'David','South');

INSERT INTO products VALUES
(101,'Keyboard','Electronics',50),
(102,'Mouse','Electronics',20),
(103,'Monitor','Electronics',200),
(104,'Desk','Furniture',300),
(105,'Chair','Furniture',150);

INSERT INTO orders VALUES
(1,1,'2026-02-01'),
(2,2,'2026-02-02'),
(3,1,'2026-02-03'),
(4,3,'2026-02-03'),
(5,4,'2026-02-04');

INSERT INTO order_items VALUES
(1,101,1),(1,102,2),
(2,103,1),
(3,104,1),(3,105,1),
(4,101,2),(4,103,1),
(5,105,2);
''')

print("Store data loaded.")



## Part 2 — Pull Data into a DataFrame

You will use **one SQL query** to create a DataFrame.
After this point, **do not use SQL** to answer questions.


In [None]:

df = q('''
SELECT
    c.name AS customer,
    c.region,
    o.order_id,
    o.order_date,
    p.name AS product,
    p.category,
    oi.quantity,
    p.price,
    (oi.quantity * p.price) AS line_total
FROM customers c
JOIN orders o ON c.customer_id = o.customer_id
JOIN order_items oi ON o.order_id = oi.order_id
JOIN products p ON oi.product_id = p.product_id
ORDER BY o.order_date, o.order_id;
''')

df
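One thing worth knowing when pulling SQL results into Pandas: SQLite stores dates as `TEXT`, so a date column arrives in the DataFrame as strings rather than datetimes. The sketch below shows one possible way to convert such a column. The `visits` table and its columns are hypothetical and exist only for this illustration.

```python
import sqlite3
import pandas as pd

# Throwaway database with a hypothetical table; not the studio's store data.
demo_conn = sqlite3.connect(":memory:")
demo_conn.executescript("""
CREATE TABLE visits (visit_id INTEGER PRIMARY KEY, visit_date TEXT);
INSERT INTO visits VALUES (1, '2026-02-01'), (2, '2026-02-03');
""")

visits = pd.read_sql_query("SELECT * FROM visits ORDER BY visit_id", demo_conn)
print(visits.dtypes)  # visit_date is still a string column here

# Convert the text column to real datetimes so date arithmetic works.
visits["visit_date"] = pd.to_datetime(visits["visit_date"])
print(visits.dtypes)
```

None of the questions below require this conversion, but it helps explain the data type you will see for `order_date`.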



## Part 3 — DataFrame Questions

Answer each question using **Pandas operations only**.
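
If the general patterns are unfamiliar, the following sketch shows filtering, sorting, feature creation, and grouping on a small made-up DataFrame. The `pets` data and every column name in it are assumptions for illustration only; adapt the patterns to `df` yourself rather than copying them.

```python
import pandas as pd

# Hypothetical toy data, unrelated to the studio's store tables.
pets = pd.DataFrame({
    "name": ["Rex", "Tom", "Bella", "Milo"],
    "species": ["dog", "cat", "dog", "cat"],
    "weight_kg": [30.0, 4.0, 22.0, 5.0],
})

# Filtering: a boolean mask keeps only the rows where the condition is True.
dogs = pets[pets["species"] == "dog"]

# Sorting: ascending=False puts the largest values first.
heaviest_first = pets.sort_values("weight_kg", ascending=False)

# Feature creation: arithmetic on a column applies to every row at once.
# 2.2 is an approximate kg-to-lb conversion factor.
pets["weight_lb"] = pets["weight_kg"] * 2.2

# Grouping: split the rows by species, then aggregate each group.
mean_weight = pets.groupby("species")["weight_kg"].mean()

print(dogs)
print(heaviest_first)
print(mean_weight)
```

Each operation returns a new object, except the column assignment, which modifies `pets` in place.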


### Q1. How many rows and columns are in the DataFrame?

In [None]:

# TODO


### Q2. List the column names and their data types.

In [None]:

# TODO


### Q3. Show only the rows where the customer is Alice.

In [None]:

# TODO


### Q4. Show all line items with a line total greater than $100.

In [None]:

# TODO


### Q5. Sort the DataFrame by line_total from highest to lowest.

In [None]:

# TODO


### Q6. Create a new column called tax (7% of line_total).

In [None]:

# TODO


### Q7. Create a new column called total_with_tax (line_total plus tax).

In [None]:

# TODO


### Q8. What is the total revenue per customer?

In [None]:

# TODO


### Q9. What is the average line_total per product category?

In [None]:

# TODO


### Q10. Which customer spent the most overall?

In [None]:

# TODO


### Q11. Which region generated the most revenue?

In [None]:

# TODO


### Q12. Create a summary DataFrame with customer and total_spent.

In [None]:

# TODO


### Q13. Write that summary DataFrame back to the database as a table named customer_summary.

In [None]:

# TODO
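
As a rough guide to the round trip from DataFrame to database, here is a sketch using `DataFrame.to_sql` on made-up data. The `species_summary` table, its columns, and the separate `demo_conn` connection are all hypothetical; in the studio you would write to the shared `conn` instead.

```python
import sqlite3
import pandas as pd

# Throwaway database so this sketch does not touch the studio's `conn`.
demo_conn = sqlite3.connect(":memory:")

# Hypothetical summary data for illustration only.
summary = pd.DataFrame({
    "species": ["cat", "dog"],
    "mean_weight_kg": [4.5, 26.0],
})

# index=False skips writing the DataFrame index as an extra column.
# if_exists="replace" lets the cell be re-run without a "table already exists" error.
summary.to_sql("species_summary", demo_conn, index=False, if_exists="replace")

# Read the table back to confirm the write worked.
check = pd.read_sql_query("SELECT * FROM species_summary", demo_conn)
print(check)
```

Reading the table back with a query is a quick way to verify that the write succeeded.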



## Part 4 — Reflection

In 2–3 sentences:
- What DataFrame operation felt most intuitive?
- What still feels confusing?


In [None]:

reflection = """
TODO
"""

print(reflection)



## You Are Done

Make sure:
- all TODOs are completed
- all cells run without error
- outputs are visible

Then submit your Kaggle Notebook GitHub URL in Canvas to me and Daniel.
