# SQL Basics: Querying Databases, Joins, and Aggregations

## Introduction
SQL (Structured Query Language) is a powerful language used for managing and manipulating relational databases. It allows users to query data, insert or update records, and perform complex data analysis through joins and aggregations. This notebook explores SQL fundamentals, including database connections, basic queries, joins, and aggregations, with practical examples using a SQLite database.

By the end of this notebook, you will understand how to:
- Connect to a SQLite database using Python.
- Write and execute basic SQL queries.
- Use different types of joins to combine data from multiple tables.
- Apply aggregate functions and group data for analysis.
- Solve practical SQL challenges.

Let's get started!

## 1. Connecting to a Database

We'll use Python's `sqlite3` library to connect to a SQLite database. SQLite is a lightweight, serverless database ideal for learning and small projects. Below, we create a sample database with two tables: `products` and `sales`.

### Steps:
1. Import the `sqlite3` library.
2. Create a database and tables.
3. Populate the tables with sample data.

Run the code below to set up the database.

In [1]:
import sqlite3

# Connect to a new SQLite database (creates 'store.db' if it doesn't exist)
conn = sqlite3.connect('store.db')
cursor = conn.cursor()

# Create products table
cursor.execute('''
    CREATE TABLE IF NOT EXISTS products (
        product_id INTEGER PRIMARY KEY,
        product_name TEXT NOT NULL,
        category TEXT NOT NULL,
        price REAL NOT NULL
    )
''')

# Create sales table
cursor.execute('''
    CREATE TABLE IF NOT EXISTS sales (
        sale_id INTEGER PRIMARY KEY,
        product_id INTEGER,
        sale_date TEXT NOT NULL,
        quantity INTEGER NOT NULL,
        total_amount REAL NOT NULL,
        FOREIGN KEY (product_id) REFERENCES products (product_id)
    )
''')

# Insert sample data into products
cursor.executemany('''
    INSERT OR REPLACE INTO products (product_id, product_name, category, price)
    VALUES (?, ?, ?, ?)
''', [
    (1, 'Laptop', 'Electronics', 999.99),
    (2, 'Headphones', 'Electronics', 79.99),
    (3, 'Coffee Maker', 'Appliances', 49.99),
    (4, 'Blender', 'Appliances', 29.99),
    (5, 'Smartphone', 'Electronics', 599.99)
])

# Insert sample data into sales
cursor.executemany('''
    INSERT OR REPLACE INTO sales (sale_id, product_id, sale_date, quantity, total_amount)
    VALUES (?, ?, ?, ?, ?)
''', [
    (1, 1, '2025-08-01', 2, 1999.98),
    (2, 2, '2025-08-02', 5, 399.95),
    (3, 3, '2025-08-03', 3, 149.97),
    (4, 1, '2025-08-04', 1, 999.99),
    (5, 4, '2025-08-05', 4, 119.96),
    (6, 5, '2025-08-06', 2, 1199.98)
])

# Commit changes (the connection stays open for the cells that follow)
conn.commit()
print("Database setup complete!")


Database setup complete!


## 2. Basic SQL Queries

SQL queries allow you to interact with the database. Below are examples of basic SQL operations:

- **SELECT**: Retrieves data from a table.
- **INSERT**: Adds new records.
- **UPDATE**: Modifies existing records.
- **DELETE**: Removes records.

### Example: SELECT Query
Let's retrieve the `product_name` and `price` columns from the `products` table, keeping only products priced above 50.

In [2]:
# SELECT query
cursor.execute('''
    SELECT product_name, price
    FROM products
    WHERE price > 50
''')

# Fetch and display results
results = cursor.fetchall()
for row in results:
    print(f"Product: {row[0]}, Price: ${row[1]}")


Product: Laptop, Price: $999.99
Product: Headphones, Price: $79.99
Product: Smartphone, Price: $599.99


### Explanation
- The `SELECT` statement specifies the columns to retrieve (`product_name`, `price`).
- The `WHERE` clause filters rows where the price is greater than 50.
- The other basic statements (`INSERT`, `UPDATE`, `DELETE`) use a similar clause-based syntax but modify the database instead of retrieving data, so call `conn.commit()` afterwards to persist the changes.
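
As a rough sketch of the modifying statements mentioned above, the cell below runs an `INSERT`, an `UPDATE`, and a `DELETE` against a throwaway in-memory database, so `store.db` is left untouched. The `demo` connection and the 'Toaster' row are assumptions made up for illustration; they are not part of the notebook's sample data.

```python
import sqlite3

# Throwaway in-memory database (hypothetical; separate from store.db)
demo = sqlite3.connect(":memory:")
demo.execute("""
    CREATE TABLE products (
        product_id INTEGER PRIMARY KEY,
        product_name TEXT NOT NULL,
        category TEXT NOT NULL,
        price REAL NOT NULL
    )
""")

# INSERT: add a new record (the 'Toaster' row is invented for this example)
demo.execute(
    "INSERT INTO products (product_id, product_name, category, price) VALUES (?, ?, ?, ?)",
    (6, "Toaster", "Appliances", 24.99),
)

# UPDATE: change the price of the row we just inserted
demo.execute("UPDATE products SET price = ? WHERE product_id = ?", (19.99, 6))
after_update = demo.execute("SELECT product_name, price FROM products").fetchall()

# DELETE: remove the row again
demo.execute("DELETE FROM products WHERE product_id = ?", (6,))
after_delete = demo.execute("SELECT COUNT(*) FROM products").fetchone()[0]

print(after_update)  # [('Toaster', 19.99)]
print(after_delete)  # 0
demo.close()
```

The `?` placeholders pass values separately from the SQL text, the same way the setup cell does with `executemany`.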

## 3. Understanding Joins

Joins combine data from multiple tables based on related columns. The main types of joins are:

- **INNER JOIN**: Returns only matching records from both tables.
- **LEFT JOIN**: Returns all records from the left table and matching records from the right (non-matching rows get NULL).
- **RIGHT JOIN**: Returns all records from the right table and matching records from the left.
- **FULL JOIN**: Returns all records from both tables, with NULLs for non-matching rows.

SQLite has supported RIGHT JOIN and FULL JOIN only since version 3.39.0 (released in 2022), and older versions reject them. We'll focus on INNER and LEFT joins, which work in every version.

### Visual Explanation
- INNER JOIN: Only the overlapping section of two tables.
- LEFT JOIN: All records from the left table, plus matching records from the right.
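
A RIGHT JOIN can usually be rewritten as a LEFT JOIN with the table order swapped, which works on any SQLite version. The sketch below assumes a small in-memory database with one sale (`sale_id` 11) whose `product_id` has no matching product; the `demo` connection and its rows are hypothetical and exist only for this illustration.

```python
import sqlite3

# Hypothetical in-memory data, separate from store.db
demo = sqlite3.connect(":memory:")
demo.executescript("""
    CREATE TABLE products (product_id INTEGER PRIMARY KEY, product_name TEXT);
    CREATE TABLE sales (sale_id INTEGER PRIMARY KEY, product_id INTEGER, quantity INTEGER);
    INSERT INTO products VALUES (1, 'Laptop'), (2, 'Headphones');
    -- Sale 11 points at product_id 99, which has no row in products
    INSERT INTO sales VALUES (10, 1, 2), (11, 99, 1);
""")

# Equivalent to "products p RIGHT JOIN sales s": every sale is kept, and the
# product columns are NULL (None in Python) when no product matches.
right_join_rows = demo.execute("""
    SELECT s.sale_id, p.product_name
    FROM sales s
    LEFT JOIN products p ON p.product_id = s.product_id
    ORDER BY s.sale_id
""").fetchall()

print(right_join_rows)  # [(10, 'Laptop'), (11, None)]
demo.close()
```

'Headphones' is absent from the result because it has no sale, and the sales table is the one being preserved here.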

### Example: INNER JOIN
Combine `products` and `sales` to get sales details with product names.

In [3]:
# INNER JOIN query
cursor.execute('''
    SELECT p.product_name, s.sale_date, s.quantity, s.total_amount
    FROM products p
    INNER JOIN sales s ON p.product_id = s.product_id
''')

# Fetch and display results
results = cursor.fetchall()
for row in results:
    print(f"Product: {row[0]}, Date: {row[1]}, Quantity: {row[2]}, Total: ${row[3]}")


Product: Laptop, Date: 2025-08-01, Quantity: 2, Total: $1999.98
Product: Headphones, Date: 2025-08-02, Quantity: 5, Total: $399.95
Product: Coffee Maker, Date: 2025-08-03, Quantity: 3, Total: $149.97
Product: Laptop, Date: 2025-08-04, Quantity: 1, Total: $999.99
Product: Blender, Date: 2025-08-05, Quantity: 4, Total: $119.96
Product: Smartphone, Date: 2025-08-06, Quantity: 2, Total: $1199.98


### Example: LEFT JOIN
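Retrieve all products, even those without sales. In this sample data every product has at least one sale, so the result contains the same rows as the INNER JOIN; a product without sales would appear once, with `None` in the sales columns.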

In [4]:
# LEFT JOIN query
cursor.execute('''
    SELECT p.product_name, s.sale_date, s.quantity
    FROM products p
    LEFT JOIN sales s ON p.product_id = s.product_id
''')

# Fetch and display results
results = cursor.fetchall()
for row in results:
    print(f"Product: {row[0]}, Date: {row[1]}, Quantity: {row[2]}")


Product: Laptop, Date: 2025-08-01, Quantity: 2
Product: Laptop, Date: 2025-08-04, Quantity: 1
Product: Headphones, Date: 2025-08-02, Quantity: 5
Product: Coffee Maker, Date: 2025-08-03, Quantity: 3
Product: Blender, Date: 2025-08-05, Quantity: 4
Product: Smartphone, Date: 2025-08-06, Quantity: 2
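
Every product in the sample data has at least one sale, so the output above contains no unmatched rows. To see the difference between the two join types, here is a minimal sketch that assumes an in-memory database with one extra product ('Toaster', invented for this example) that has never been sold.

```python
import sqlite3

# Hypothetical in-memory data, separate from store.db
demo = sqlite3.connect(":memory:")
demo.executescript("""
    CREATE TABLE products (product_id INTEGER PRIMARY KEY, product_name TEXT);
    CREATE TABLE sales (sale_id INTEGER PRIMARY KEY, product_id INTEGER,
                        sale_date TEXT, quantity INTEGER);
    INSERT INTO products VALUES (1, 'Laptop'), (2, 'Toaster');
    INSERT INTO sales VALUES (1, 1, '2025-08-01', 2);
""")

# LEFT JOIN keeps the unsold product and fills the sales columns with NULL
left_rows = demo.execute("""
    SELECT p.product_name, s.sale_date, s.quantity
    FROM products p
    LEFT JOIN sales s ON p.product_id = s.product_id
    ORDER BY p.product_id
""").fetchall()

# INNER JOIN drops the unsold product entirely
inner_rows = demo.execute("""
    SELECT p.product_name, s.sale_date, s.quantity
    FROM products p
    INNER JOIN sales s ON p.product_id = s.product_id
    ORDER BY p.product_id
""").fetchall()

print(left_rows)   # [('Laptop', '2025-08-01', 2), ('Toaster', None, None)]
print(inner_rows)  # [('Laptop', '2025-08-01', 2)]
demo.close()
```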


## 4. Aggregations and Grouping

Aggregate functions perform calculations on a set of rows. Common functions include:
- **SUM**: Calculates the total.
- **AVG**: Calculates the average.
- **COUNT**: Counts the number of rows.
- **MIN**/**MAX**: Finds the minimum or maximum value.

The `GROUP BY` clause groups rows with the same values into summary rows.
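
To make the effect of `GROUP BY` concrete, the sketch below runs the aggregate functions on a three-row table, first without grouping and then grouped by `product_id`. The in-memory `demo` database and its rows are assumptions for illustration only.

```python
import sqlite3

# Hypothetical in-memory data, separate from store.db
demo = sqlite3.connect(":memory:")
demo.executescript("""
    CREATE TABLE sales (sale_id INTEGER PRIMARY KEY, product_id INTEGER, quantity INTEGER);
    INSERT INTO sales VALUES (1, 1, 2), (2, 2, 5), (3, 1, 1);
""")

# Without GROUP BY, the aggregates collapse the whole table into one row
overall = demo.execute("""
    SELECT COUNT(*), SUM(quantity), MIN(quantity), MAX(quantity)
    FROM sales
""").fetchone()

# With GROUP BY, there is one summary row per distinct product_id
per_product = demo.execute("""
    SELECT product_id, COUNT(*), SUM(quantity), AVG(quantity)
    FROM sales
    GROUP BY product_id
    ORDER BY product_id
""").fetchall()

print(overall)      # (3, 8, 1, 5)
print(per_product)  # [(1, 2, 3, 1.5), (2, 1, 5, 5.0)]
demo.close()
```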

### Task: Calculate Sales Metrics
Calculate the total sales, the average sale amount per transaction, and the number of transactions for each product category.

In [5]:
# Aggregation query
cursor.execute('''
    SELECT p.category,
           COUNT(s.sale_id) AS transaction_count,
           SUM(s.total_amount) AS total_sales,
           AVG(s.total_amount) AS avg_sale_price
    FROM products p
    INNER JOIN sales s ON p.product_id = s.product_id
    GROUP BY p.category
''')

# Fetch and display results
results = cursor.fetchall()
for row in results:
    print(f"Category: {row[0]}, Transactions: {row[1]}, Total Sales: ${row[2]:.2f}, Avg Sale: ${row[3]:.2f}")


Category: Appliances, Transactions: 2, Total Sales: $269.93, Avg Sale: $134.97
Category: Electronics, Transactions: 4, Total Sales: $4599.90, Avg Sale: $1149.97


## 5. Challenges

Test your understanding with these exercises:

1. Write a query to find all products with a price greater than $100.
2. Write a query to list all sales where the quantity sold is at least 3, including the product name and total amount.
3. Write a query to calculate the total quantity sold per product (include product name).

Try writing the queries yourself first, then run the reference solutions below to check your results.

In [6]:
# Challenge 1: Products with price > $100
cursor.execute('''
    SELECT product_name, price
    FROM products
    WHERE price > 100
''')
results = cursor.fetchall()
print("Challenge 1 Results:")
for row in results:
    print(f"Product: {row[0]}, Price: ${row[1]}")

# Challenge 2: Sales with quantity >= 3
cursor.execute('''
    SELECT p.product_name, s.quantity, s.total_amount
    FROM products p
    INNER JOIN sales s ON p.product_id = s.product_id
    WHERE s.quantity >= 3
''')
results = cursor.fetchall()
print("\nChallenge 2 Results:")
for row in results:
    print(f"Product: {row[0]}, Quantity: {row[1]}, Total: ${row[2]}")

# Challenge 3: Total quantity sold per product
cursor.execute('''
    SELECT p.product_name, SUM(s.quantity) AS total_quantity
    FROM products p
    LEFT JOIN sales s ON p.product_id = s.product_id
    GROUP BY p.product_name
''')
results = cursor.fetchall()
print("\nChallenge 3 Results:")
for row in results:
    print(f"Product: {row[0]}, Total Quantity: {row[1]}")


Challenge 1 Results:
Product: Laptop, Price: $999.99
Product: Smartphone, Price: $599.99

Challenge 2 Results:
Product: Headphones, Quantity: 5, Total: $399.95
Product: Coffee Maker, Quantity: 3, Total: $149.97
Product: Blender, Quantity: 4, Total: $119.96

Challenge 3 Results:
Product: Blender, Total Quantity: 4
Product: Coffee Maker, Total Quantity: 3
Product: Headphones, Total Quantity: 5
Product: Laptop, Total Quantity: 3
Product: Smartphone, Total Quantity: 2
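
Challenge 3 uses a LEFT JOIN, so a product with no sales would still be listed, but `SUM` over its missing rows returns NULL (`None` in Python). One possible refinement, sketched below on a hypothetical in-memory database with an unsold 'Toaster', is to wrap the sum in `COALESCE` so that such products report 0 instead.

```python
import sqlite3

# Hypothetical in-memory data, separate from store.db
demo = sqlite3.connect(":memory:")
demo.executescript("""
    CREATE TABLE products (product_id INTEGER PRIMARY KEY, product_name TEXT);
    CREATE TABLE sales (sale_id INTEGER PRIMARY KEY, product_id INTEGER, quantity INTEGER);
    INSERT INTO products VALUES (1, 'Laptop'), (2, 'Toaster');
    INSERT INTO sales VALUES (1, 1, 2), (2, 1, 1);
""")

# raw_total shows the plain SUM; total_quantity replaces NULL with 0
totals = demo.execute("""
    SELECT p.product_name,
           SUM(s.quantity) AS raw_total,
           COALESCE(SUM(s.quantity), 0) AS total_quantity
    FROM products p
    LEFT JOIN sales s ON p.product_id = s.product_id
    GROUP BY p.product_id, p.product_name
    ORDER BY p.product_id
""").fetchall()

print(totals)  # [('Laptop', 3, 3), ('Toaster', None, 0)]
demo.close()
```

Grouping by `product_id` as well as `product_name` keeps two products that happen to share a name from being merged into one row.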


## 6. Conclusion

This notebook covered the essentials of SQL, including:
- Connecting to a SQLite database using Python's `sqlite3`.
- Writing basic SQL queries (SELECT, INSERT, UPDATE, DELETE).
- Understanding and applying INNER and LEFT joins.
- Using aggregate functions and GROUP BY for data analysis.
- Solving practical SQL challenges.

To deepen your SQL knowledge, explore these resources:
- [SQLZoo](https://sqlzoo.net/) for interactive SQL tutorials.
- [W3Schools SQL Tutorial](https://www.w3schools.com/sql/) for beginner-friendly explanations.
- [SQLite Documentation](https://www.sqlite.org/docs.html) for SQLite-specific details.

Keep practicing, and you'll be querying databases like a pro in no time!

In [7]:
# Close the database connection
conn.close()
print("Database connection closed.")


Database connection closed.
