# 🏨 OYO Hotel Booking Analysis
This notebook explores booking trends, cancellations, pricing, and revenue breakdowns by region and room type.

In [None]:
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

# Load dataset
df = pd.read_csv('../data/booking_cleaned_data.csv')
df.head()

## 🔍 Summary Statistics

In [None]:
df.describe()

## 📊 Cancellations by Region

In [None]:
plt.figure(figsize=(8, 5))
sns.countplot(x='region', hue='cancellation_flag', data=df)
plt.title('Cancellations by Region')
plt.xlabel('Region')
plt.ylabel('Number of Bookings')
# Relabel the legend; this assumes cancellation_flag is coded 0/1, so the hue order is No, Yes
plt.legend(title='Cancelled', labels=['No', 'Yes'])
plt.tight_layout()
plt.show()
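
The count plot above shows absolute numbers, so a region with more bookings will also tend to show more cancellations. A minimal sketch of a per-region cancellation rate follows, assuming `cancellation_flag` is coded 0/1. The `sample` frame and the helper name `cancellation_rate_by_region` are hypothetical and stand in for `df`; the values are invented for illustration only.

```python
import pandas as pd

# Toy stand-in for df: invented values, NOT taken from the real dataset
sample = pd.DataFrame({
    "region": ["North", "North", "South", "South", "East", "East", "East", "West"],
    "cancellation_flag": [1, 0, 1, 1, 0, 0, 1, 0],
})


def cancellation_rate_by_region(frame: pd.DataFrame) -> pd.Series:
    """Return the share of cancelled bookings per region.

    Assumes cancellation_flag is 0 (kept) or 1 (cancelled), so the
    group mean equals the cancellation rate.
    """
    return (
        frame.groupby("region")["cancellation_flag"]
        .mean()
        .sort_values(ascending=False)
    )


rates = cancellation_rate_by_region(sample)
print(rates)
```

On the real data the same helper could be called as `cancellation_rate_by_region(df)`, provided the flag column uses the 0/1 coding assumed here.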

## 💸 Price Distribution by Room Type

In [None]:
plt.figure(figsize=(8, 5))
sns.boxplot(x='room_type', y='price', data=df)
plt.title('Price Distribution by Room Type')
plt.xlabel('Room Type')
plt.ylabel('Price')
plt.tight_layout()
plt.show()
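
The box plot gives a visual impression of spread; the numbers behind it can be tabulated as well. The sketch below computes the standard deviation, interquartile range, and median of `price` per `room_type`. The `sample` frame and the function name `price_spread` are hypothetical, and the prices are invented for illustration.

```python
import pandas as pd

# Toy stand-in for df: invented prices, NOT taken from the real dataset
sample = pd.DataFrame({
    "room_type": ["Standard"] * 4 + ["Suite"] * 4,
    "price": [1000, 1100, 1200, 1300, 2000, 3000, 5000, 8000],
})


def price_spread(frame: pd.DataFrame) -> pd.DataFrame:
    """Summarise price variability per room type."""
    grouped = frame.groupby("room_type")["price"]
    q1 = grouped.quantile(0.25)  # lower edge of the box in the box plot
    q3 = grouped.quantile(0.75)  # upper edge of the box
    return pd.DataFrame({
        "std": grouped.std(),
        "iqr": q3 - q1,
        "median": grouped.median(),
    })


spread = price_spread(sample)
print(spread)
```

The interquartile range is less sensitive to a few extreme prices than the standard deviation, so comparing the two columns may help show whether a room type's variability comes from outliers.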

## 📅 Booking Timeline

In [None]:
df['checkin_date'] = pd.to_datetime(df['checkin_date'])
df['month'] = df['checkin_date'].dt.to_period('M')
monthly_bookings = df.groupby('month').size()

plt.figure(figsize=(10, 5))
monthly_bookings.plot(kind='line', marker='o')
plt.title('Monthly Booking Volume')
plt.xlabel('Month')
plt.ylabel('Number of Bookings')
plt.grid(True)
plt.tight_layout()
plt.show()
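
To name the peak months rather than read them off the chart, the monthly counts can be queried directly. This is a minimal sketch on invented check-in dates; the `sample` frame and the "above the monthly average" rule are assumptions for illustration, not the notebook's own definition of a peak.

```python
import pandas as pd

# Toy stand-in for df: invented dates, NOT taken from the real dataset
sample = pd.DataFrame({
    "checkin_date": [
        "2023-01-05", "2023-01-20",
        "2023-02-11",
        "2023-03-02", "2023-03-15", "2023-03-28",
    ]
})
sample["checkin_date"] = pd.to_datetime(sample["checkin_date"])

# Bookings per calendar month, same grouping as in the cell above
monthly = sample.groupby(sample["checkin_date"].dt.to_period("M")).size()

# Month with the most bookings
peak_month = monthly.idxmax()

# Months whose volume exceeds the average month (one possible "peak" rule)
above_avg = monthly[monthly > monthly.mean()]

print("Peak month:", peak_month)
print(above_avg)
```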

## ✅ Key Insights
- The South and North regions show higher cancellation rates than the other regions.
- Suite rooms show the widest price spread of any room type.
- Bookings peak in certain months, suggesting a seasonal pattern.
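
The introduction mentions revenue breakdowns by region and room type, which the cells above do not compute. One possible sketch follows, assuming revenue can be approximated by summing `price` over bookings that were not cancelled. That definition, the 0/1 coding of `cancellation_flag`, and the `sample` frame are all assumptions; the values are invented for illustration.

```python
import pandas as pd

# Toy stand-in for df: invented values, NOT taken from the real dataset
sample = pd.DataFrame({
    "region": ["North", "North", "South", "South", "South"],
    "room_type": ["Standard", "Suite", "Standard", "Standard", "Suite"],
    "price": [1000, 3000, 1200, 1100, 4000],
    "cancellation_flag": [0, 0, 0, 1, 0],
})

# Assumption: cancelled bookings generate no revenue
kept = sample[sample["cancellation_flag"] == 0]

# Rows are regions, columns are room types, cells are summed prices
revenue = kept.pivot_table(
    index="region",
    columns="room_type",
    values="price",
    aggfunc="sum",
    fill_value=0,
)
print(revenue)
```

If the real dataset has a dedicated revenue or nights-stayed column, that would likely be a better basis than `price` alone.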

Next steps: integrate this analysis into a Streamlit dashboard or a Tableau visualization.