# 10 — Final Project: Hotel Operations Intelligence

You're the Senior Data Analyst at a hotel group that runs **two booking systems**:
- `hotel_bookings` — the Property Management System (PMS) with 119k records across Resort & City hotels
- `hotel_reservations` — the OTA/Channel Manager feed with 36k records

The CFO, Operations VP, CTO, and GM each have requests. **No hints this time.**  
Use everything you've learned: JOINs, CTEs, window functions, subqueries, pivoting, rolling calculations.

---

### Scoring Yourself

| Level | Criteria |
|-------|---------|
| Pass | Query runs, returns correct results |
| Good | Clean column aliases, proper rounding, sorted output |
| Excellent | Uses CTEs for readability, handles NULLs, adds computed insights |

In [None]:
%load_ext sql
%sql postgresql://admin:password@postgres:5432/mastery_db

In [None]:
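import pandas as pd
from sqlalchemy import create_engine

# Same connection string as the %sql magic above.
engine = create_engine("postgresql://admin:password@postgres:5432/mastery_db")

# PMS export: load the CSV and (re)create the hotel_bookings table.
df1 = pd.read_csv('/app/data/hotel_booking.csv')
df1.to_sql('hotel_bookings', engine, if_exists='replace', index=False)

# OTA feed: lowercase the column names so they can be referenced
# without quoting in PostgreSQL, then (re)create hotel_reservations.
df2 = pd.read_csv('/app/data/hotel_reservation.csv')
df2.columns = df2.columns.str.lower()
df2.to_sql('hotel_reservations', engine, if_exists='replace', index=False)

# Confirm how many rows were loaded into each table.
print(f"hotel_bookings:      {len(df1):,} rows")
print(f"hotel_reservations:  {len(df2):,} rows")
print("Ready.")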

---
## Part I — Revenue Deep Dive

> **From: CFO**  
> *I need a full revenue breakdown for my board presentation.*

---

### Task 1.1

Calculate **estimated total revenue** per hotel per year. Revenue = `adr × total_nights` for non-canceled bookings only. Format revenue in thousands (divide by 1000, round to 1 decimal).

In [None]:
%%sql


### Task 1.2

Using the revenue from 1.1, calculate the **Year-over-Year growth rate (%)** for each hotel using `LAG`. Which hotel grew faster from 2015 to 2016?
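
As a syntax refresher only (not a solution), the sketch below shows the `LAG` growth-rate pattern on a made-up table, run through Python's built-in `sqlite3` so it works without the course database. The table `yearly_revenue`, its columns, and the hotel names are hypothetical. The window syntax used here is assumed to carry over unchanged to PostgreSQL.

```python
import sqlite3

# Made-up yearly revenue (in thousands) for two hypothetical hotels.
rows = [
    ("Hotel A", 2015, 100.0),
    ("Hotel A", 2016, 150.0),
    ("Hotel B", 2015, 200.0),
    ("Hotel B", 2016, 220.0),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE yearly_revenue (hotel TEXT, yr INTEGER, revenue_k REAL)")
conn.executemany("INSERT INTO yearly_revenue VALUES (?, ?, ?)", rows)

# LAG looks one row back within each hotel, ordered by year.
# The first year of each hotel has no previous row, so growth is NULL.
query = """
SELECT hotel,
       yr,
       revenue_k,
       ROUND(
           100.0 * (revenue_k - LAG(revenue_k) OVER w)
                 / LAG(revenue_k) OVER w,
           1
       ) AS yoy_growth_pct
FROM yearly_revenue
WINDOW w AS (PARTITION BY hotel ORDER BY yr)
ORDER BY hotel, yr
"""
yoy = conn.execute(query).fetchall()
for row in yoy:
    print(row)
```

The first year of each hotel comes back as `None` because there is no prior year to compare against; decide deliberately whether to show or filter those rows in your own answer.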

In [None]:
%%sql


### Task 1.3

Create a **monthly revenue pivot table** with `ROLLUP`: one row per year and month, with columns built via `FILTER` for Resort Hotel revenue, City Hotel revenue, and total revenue. Include year subtotals and a grand total.

In [None]:
%%sql


### Task 1.4

Add a **cumulative revenue column** that shows the running total per hotel over months. Also add a **6-month moving average** of monthly revenue.
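
For reference, here is a minimal sketch of the two window frames involved, on made-up data and using Python's built-in `sqlite3`. The table `monthly_revenue` and its values are hypothetical, and this is an illustration of the frame syntax rather than a solution to the task.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE monthly_revenue (hotel TEXT, month TEXT, revenue REAL)")

# Seven made-up months for one hypothetical hotel: 10, 20, ..., 70.
conn.executemany(
    "INSERT INTO monthly_revenue VALUES (?, ?, ?)",
    [("Hotel A", f"2016-{m:02d}", 10.0 * m) for m in range(1, 8)],
)

query = """
SELECT month,
       revenue,
       -- Running total: with ORDER BY, the default frame runs from the
       -- first row of the partition up to the current row.
       SUM(revenue) OVER (PARTITION BY hotel ORDER BY month) AS running_total,
       -- Moving average: current row plus the five rows before it.
       AVG(revenue) OVER (
           PARTITION BY hotel
           ORDER BY month
           ROWS BETWEEN 5 PRECEDING AND CURRENT ROW
       ) AS moving_avg_6m
FROM monthly_revenue
ORDER BY hotel, month
"""
trend = conn.execute(query).fetchall()
for row in trend:
    print(row)
```

Note that the first five rows of each partition average over fewer than six months, because the frame simply shrinks at the start of the series.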

In [None]:
%%sql


---
## Part II — Cancellation Analysis

> **From: Operations VP**  
> *Cancellations are killing our occupancy. I need to understand the patterns.*

---

### Task 2.1

Build a **cancellation risk profile**: segment bookings by `lead_time` buckets (0–7, 8–30, 31–90, 91–180, 181–365, 366+) AND by `deposit_type`. Show: booking count, cancel count, cancel rate %. Label each combination as HIGH (>50%), MEDIUM (25–50%), or LOW (<25%) risk.
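
The bucket-then-label pattern can be sketched on a toy table as follows. This deliberately uses only two simplified buckets and a hypothetical `bookings` table, run through Python's built-in `sqlite3`; it is a reminder of the `CASE` mechanics, not the requested profile.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE bookings (lead_time INTEGER, is_canceled INTEGER)")

# Made-up bookings: (lead_time in days, 1 if canceled else 0).
conn.executemany(
    "INSERT INTO bookings VALUES (?, ?)",
    [(5, 0), (10, 0), (20, 1), (25, 0), (100, 1), (200, 1), (300, 0)],
)

query = """
WITH bucketed AS (
    -- Step 1: assign each booking to a bucket.
    SELECT CASE WHEN lead_time <= 30 THEN '0-30' ELSE '31+' END AS lead_bucket,
           is_canceled
    FROM bookings
),
rates AS (
    -- Step 2: aggregate per bucket.
    SELECT lead_bucket,
           COUNT(*)         AS bookings,
           SUM(is_canceled) AS cancels,
           ROUND(100.0 * SUM(is_canceled) / COUNT(*), 1) AS cancel_rate_pct
    FROM bucketed
    GROUP BY lead_bucket
)
-- Step 3: label the aggregated rate.
SELECT *,
       CASE WHEN cancel_rate_pct > 50  THEN 'HIGH'
            WHEN cancel_rate_pct >= 25 THEN 'MEDIUM'
            ELSE 'LOW'
       END AS risk
FROM rates
ORDER BY lead_bucket
"""
profile = conn.execute(query).fetchall()
for row in profile:
    print(row)
```

Multiplying by `100.0` before dividing keeps the division in floating point; with integer operands the rate would be truncated.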

In [None]:
%%sql


### Task 2.2

For **repeat guests** (`is_repeated_guest = 1`): compare their cancellation rate, average ADR, and average lead time against non-repeat guests. Do repeat guests cancel less?

In [None]:
%%sql


### Task 2.3

Find the **top 10 countries by cancellation rate** (only countries with at least 500 bookings). For each, show the cancellation rate and the average ADR of canceled vs non-canceled bookings.

In [None]:
%%sql


---
## Part III — Cross-System Analysis

> **From: CTO**  
> *We're merging our PMS and OTA data. I need to compare the two systems.*

---

### Task 3.1

Create a **unified booking view** (UNION ALL) from both tables with these standardized columns: `source` (PMS/OTA), `lead_time`, `market_segment`, `is_canceled` (1/0), `price`, `total_nights`, `room_type`. Then query the unified view to compare: total bookings, avg price, avg lead time, and cancel rate per source.
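
As a rough illustration of standardizing two differently shaped tables with `UNION ALL`, the sketch below uses two toy tables in Python's built-in `sqlite3`. The table names (`pms_toy`, `ota_toy`), their columns, and the status strings are all hypothetical; check the real column names and status values in `hotel_bookings` and `hotel_reservations` before writing your own version. A CTE stands in for the view here to keep the example short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE pms_toy (lead_time INTEGER, adr REAL, is_canceled INTEGER)")
conn.execute("CREATE TABLE ota_toy (lead_time INTEGER, avg_price REAL, booking_status TEXT)")

# Made-up rows; the OTA side stores cancellation as text, the PMS side as 1/0.
conn.executemany("INSERT INTO pms_toy VALUES (?, ?, ?)",
                 [(10, 100.0, 0), (20, 200.0, 1)])
conn.executemany("INSERT INTO ota_toy VALUES (?, ?, ?)",
                 [(30, 80.0, "Canceled"), (50, 120.0, "Not_Canceled"),
                  (40, 100.0, "Not_Canceled")])

query = """
WITH unified AS (
    -- Both branches must produce the same columns in the same order.
    SELECT 'PMS' AS source, lead_time, adr AS price, is_canceled
    FROM pms_toy
    UNION ALL
    SELECT 'OTA' AS source, lead_time, avg_price AS price,
           CASE WHEN booking_status = 'Canceled' THEN 1 ELSE 0 END AS is_canceled
    FROM ota_toy
)
SELECT source,
       COUNT(*)                                      AS total_bookings,
       ROUND(AVG(price), 2)                          AS avg_price,
       ROUND(100.0 * SUM(is_canceled) / COUNT(*), 1) AS cancel_rate_pct
FROM unified
GROUP BY source
ORDER BY source
"""
comparison = conn.execute(query).fetchall()
for row in comparison:
    print(row)
```

`UNION ALL` keeps every row from both branches; plain `UNION` would silently drop duplicate rows and distort the booking counts.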

In [None]:
%%sql


### Task 3.2

From the unified view, find the **top 5 market segments by booking volume** and pivot the source (PMS vs OTA) into columns showing booking count, avg price, and cancel rate side by side.

In [None]:
%%sql


### Task 3.3

Using the OTA table (`hotel_reservations`), find room types where the average price is **higher than ALL room type averages in the PMS** (`hotel_bookings`). Use the `ALL` keyword.

In [None]:
%%sql


---
## Part IV — Guest Intelligence & Operational Insights

> **From: GM**  
> *Help me understand our guests and optimize operations.*

---

### Task 4.1

Build a **Guest Segmentation Matrix**: for each `customer_type`, create a pivot showing the count of each `meal` type as columns. Add the average ADR and average total nights. Rank customer types by total revenue.

In [None]:
%%sql


### Task 4.2

Find **room upgrade/downgrade patterns**: bookings where `reserved_room_type != assigned_room_type`. For each combination, show: frequency, whether it's an upgrade (assigned > reserved alphabetically) or downgrade, the average ADR, and the average number of special requests. Do guests with more special requests get more upgrades?
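
The alphabetical comparison mentioned above can be sketched like this on a toy table, using Python's built-in `sqlite3`. The table `room_changes` and its rows are made up, and the sketch assumes single-letter room codes where a later letter means a better room, as the task wording implies.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE room_changes (reserved TEXT, assigned TEXT, special_requests INTEGER)"
)

# Made-up bookings; the last row kept its reserved room and is filtered out.
conn.executemany(
    "INSERT INTO room_changes VALUES (?, ?, ?)",
    [("A", "D", 2), ("A", "D", 0), ("D", "A", 1), ("B", "B", 0)],
)

query = """
SELECT reserved,
       assigned,
       COUNT(*) AS frequency,
       -- String comparison: 'D' > 'A' is true, so A -> D counts as an upgrade.
       CASE WHEN assigned > reserved THEN 'upgrade' ELSE 'downgrade' END AS direction,
       AVG(special_requests) AS avg_special_requests
FROM room_changes
WHERE reserved <> assigned
GROUP BY reserved, assigned
ORDER BY frequency DESC
"""
changes = conn.execute(query).fetchall()
for row in changes:
    print(row)
```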

In [None]:
%%sql


### Task 4.3

**Parking demand forecast**: For each month, calculate the percentage of bookings that requested parking (`required_car_parking_spaces > 0`). Show the trend over time and add a 3-month moving average. Is parking demand seasonal?

In [None]:
%%sql


### Task 4.4

**The Big One**: Write a single query (or CTE chain) that produces an **executive summary** with one row per hotel showing:
- Total bookings
- Non-canceled bookings
- Cancellation rate %
- Total estimated revenue (adr × nights, non-canceled only)
- Average ADR
- Average lead time
- Average stay length
- Top country (by booking count)
- Top market segment (by booking count)
- Repeat guest %

In [None]:
%%sql


---
## Scratch Space

Use these cells for experimentation.

In [None]:
%%sql


In [None]:
%%sql


In [None]:
%%sql


In [None]:
%%sql


---

### Congratulations!

If you've completed all four parts, you've demonstrated mastery of:

| Skill | Where You Used It |
|-------|------------------|
| JOINs | Cross-system analysis, room rate lookups |
| Subqueries & CTEs | Multi-step analysis, risk profiling |
| Window Functions | YoY growth, running totals, moving averages, ranking |
| Pivoting | Revenue by hotel, status by segment, meal breakdown |
| Functions | Date construction, string parsing, NULL handling |
| ROLLUP / GROUPING SETS | Subtotals in revenue reports |
| UNION ALL | Merging two booking systems |
| Performance Thinking | Efficient CTEs, avoiding repeated scans |

You're ready for real hotel analytics work.