
# 🧩 AWS Redshift SQL Queries Lab

**Objective:** Learn how to execute analytical SQL queries in Amazon Redshift using sample tables such as `sales`, `date`, and `users`.

---

## 🧭 Lab Overview

In this lab, you will:
1. Connect to an existing **Redshift cluster**.
2. Run analytical SQL queries from the **Query Editor v2** or any SQL client.
3. Use joins and aggregations to answer analytical questions about the data.

> 🧠 **Note:** This lab assumes your Redshift cluster is already running and the sample data (tables `sales`, `date`, and `users`) is preloaded.



## ⚙️ Prerequisites

Before beginning this lab, ensure the following:

1. **Amazon Redshift Cluster** is active.  
   - Cluster status: `available`
   - Database name: e.g., `dev`
   - User credentials: `admin`, or any user with `SELECT` permission on the sample tables.

2. **Sample Data Available**  
   Required tables:
   - `sales(dateid, buyerid, qtysold, ...)`
   - `date(dateid, caldate, ...)`
   - `users(userid, firstname, lastname, ...)`

3. **SQL Access Options**
   You can connect and run queries using any of the following:
   - **Redshift Query Editor v2** in the AWS Management Console  
   - **DBeaver**, **SQL Workbench/J**, or **PyCharm Database Tool**  
   - Command-line access via `psql`
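
Before running the lab queries, it can help to confirm that the three sample tables are reachable from your session. The check below is a minimal sketch: it assumes the tables sit in a schema on your search path (for example `public`), which this lab does not state explicitly.

```sql
-- Sanity check: one row per sample table with its row count.
-- Assumes sales, date, and users resolve without a schema prefix.
SELECT 'sales' AS table_name, COUNT(*) AS row_count FROM sales
UNION ALL
SELECT 'date', COUNT(*) FROM date
UNION ALL
SELECT 'users', COUNT(*) FROM users;
```

If any of the three statements fails with a "relation does not exist" error, the sample data is probably not loaded in the database you are connected to, or it lives in a different schema.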



## 🧮 Query 1: Find Total Sales on a Given Calendar Date

This query calculates the **total quantity sold** on a specific calendar date by joining the `sales` and `date` tables.

```sql
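-- Total quantity sold on one calendar date
SELECT SUM(qtysold) AS total_quantity
FROM sales, date
WHERE sales.dateid = date.dateid  -- join sales to the date dimension
  AND date.caldate = '2008-01-06';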
```

### 🔍 Explanation:
- The `sales` table holds transaction data including `qtysold` (quantity sold).  
- The `date` table provides the calendar date dimension.  
- The query joins the two tables on `dateid` and filters on a single `caldate`.  
- The result is a single row: the total number of items sold on January 6, 2008.
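
The same join can also be written with explicit `JOIN ... ON` syntax, which keeps the join condition separate from the filter. This is an equivalent sketch, not a change to the lab's query; the aliases `s` and `d` and the column alias `total_quantity` are chosen here for illustration.

```sql
-- Query 1 rewritten with an explicit inner join.
SELECT SUM(s.qtysold) AS total_quantity
FROM sales AS s
JOIN date AS d
  ON s.dateid = d.dateid          -- join condition
WHERE d.caldate = '2008-01-06';   -- filter on the calendar date
```

Both forms describe the same inner join, so they should return the same total.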



## 🧾 Query 2: Find Top 10 Buyers by Quantity

This query identifies the top 10 buyers based on total quantity purchased and lists their names.

```sql
SELECT firstname, lastname, total_quantity
FROM (
    -- Total quantity per buyer, keeping only the 10 largest totals
    SELECT buyerid, SUM(qtysold) AS total_quantity
    FROM sales
    GROUP BY buyerid
    ORDER BY total_quantity DESC
    LIMIT 10
) Q, users
WHERE Q.buyerid = users.userid  -- attach each buyer's name
ORDER BY Q.total_quantity DESC;
```

### 🔍 Explanation:
- The **subquery** groups sales by `buyerid` and totals each buyer's purchases with `sum(qtysold)`.
- `ORDER BY total_quantity DESC` with `LIMIT 10` keeps only the **top 10 buyers**; buyers tied at the cutoff are picked arbitrarily.  
- The **outer query** joins with the `users` table to fetch buyer names (`firstname`, `lastname`).  
- Final results are sorted in descending order of `total_quantity`.
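
For readers who find nested subqueries hard to scan, the same logic can be sketched with a common table expression (CTE) and an explicit join. The CTE name `top_buyers` and the aliases `t` and `u` are illustrative names, not part of the sample schema.

```sql
-- Query 2 rewritten with a CTE and an explicit join.
WITH top_buyers AS (
    -- Total quantity per buyer, limited to the 10 largest totals
    SELECT buyerid, SUM(qtysold) AS total_quantity
    FROM sales
    GROUP BY buyerid
    ORDER BY total_quantity DESC
    LIMIT 10
)
SELECT u.firstname, u.lastname, t.total_quantity
FROM top_buyers AS t
JOIN users AS u
  ON u.userid = t.buyerid         -- look up each buyer's name
ORDER BY t.total_quantity DESC;
```

The final `ORDER BY` is still needed: the ordering inside the CTE only decides which 10 rows are kept, not the order of the joined result.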



## 🧠 Reflection

In this lab, you learned:
- How to join tables in Redshift using implicit join syntax (`FROM A, B WHERE A.id = B.id`).  
- How to use the `SUM` aggregate function, on its own and with `GROUP BY`, for analytical queries.  
- How to identify top customers and summarize sales for a specific date.

> 🔹 Redshift executes analytical queries like these on a massively parallel processing (MPP) architecture with columnar storage, which is why it suits data warehousing and BI workloads.
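
To see how Redshift plans a query without running it, you can prefix the statement with `EXPLAIN`. The sketch below uses Query 1; the plan text depends on your cluster, table distribution, and statistics, so no output is shown here.

```sql
-- Show the execution plan for Query 1 instead of executing it.
EXPLAIN
SELECT SUM(qtysold) AS total_quantity
FROM sales, date
WHERE sales.dateid = date.dateid
  AND caldate = '2008-01-06';
```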

---

## 🪞 References

- [Amazon Redshift SQL Reference](https://docs.aws.amazon.com/redshift/latest/dg/cm_chap_SQLCommandRef.html)  
- [Redshift Query Editor v2 Guide](https://docs.aws.amazon.com/redshift/latest/mgmt/query-editor-v2.html)
