# Business Analytics Exploration

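This notebook demonstrates interactive exploration of the cleaned sales data in Positron. The analysis cells below are written in Python; the data can be prepared with either the R or the Python cleaning script.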

## Setup

Run one of the data cleaning scripts first. The cells below expect `data/processed/sales_data_cleaned.csv` to exist:
```bash
Rscript scripts/r/data_cleaning.R
# OR
python scripts/python/clean_data.py
```

## Python Analysis

In [None]:
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

# Set style
sns.set_style("whitegrid")
plt.rcParams['figure.figsize'] = (10, 6)

In [None]:
# Load cleaned data
df = pd.read_csv('data/processed/sales_data_cleaned.csv')
df['date'] = pd.to_datetime(df['date'])

print(f"Dataset shape: {df.shape}")
df.head()

In [None]:
# Quick summary statistics
df.describe()
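Before moving on to the grouped summaries, it can help to confirm that the cleaned file really is clean. The sketch below is one possible check, not part of the project scripts: `quality_report` is a hypothetical helper name, and it assumes only the `date` column that the loading cell already parses.

```python
import pandas as pd


def quality_report(frame: pd.DataFrame, date_col: str = "date") -> dict:
    """Summarize basic data-quality facts for a cleaned sales table."""
    missing = frame.isna().sum()
    return {
        "rows": len(frame),
        # Only list columns that actually contain missing values
        "missing_by_column": missing[missing > 0].to_dict(),
        # Count rows that exactly repeat an earlier row
        "duplicate_rows": int(frame.duplicated().sum()),
        "date_range": (frame[date_col].min(), frame[date_col].max()),
    }
```

In this notebook the call would be `quality_report(df)`; an empty `missing_by_column` and zero `duplicate_rows` suggest the cleaning step did its job.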

### Regional Performance

In [None]:
# Regional sales summary
regional_summary = df.groupby('region').agg({
    'sales_amount': ['sum', 'mean', 'count'],
    'units_sold': 'sum'
}).round(2)

# agg() returns two-level column names; replace them with flat, readable labels
regional_summary.columns = ['Total Sales', 'Avg Transaction', 'Transactions', 'Total Units']
regional_summary = regional_summary.sort_values('Total Sales', ascending=False)
regional_summary

In [None]:
# Visualize regional performance
plt.figure(figsize=(10, 6))
regional_summary['Total Sales'].plot(kind='bar', color='#2E86AB', alpha=0.8)
plt.title('Total Sales by Region', fontsize=14, fontweight='bold')
plt.xlabel('Region', fontweight='bold')
plt.ylabel('Total Sales ($)', fontweight='bold')
plt.xticks(rotation=0)
plt.grid(True, alpha=0.3)
plt.tight_layout()
plt.show()
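Absolute totals are easier to compare once each region is expressed as a share of overall revenue. A minimal sketch, assuming a summary table with a `Total Sales` column like the one built above; `add_revenue_share` and the `Share (%)` column are illustrative names, not part of the project.

```python
import pandas as pd


def add_revenue_share(summary: pd.DataFrame, total_col: str = "Total Sales") -> pd.DataFrame:
    """Return a copy of a summary table with each row's share of the total."""
    out = summary.copy()  # leave the caller's table untouched
    out["Share (%)"] = (out[total_col] / out[total_col].sum() * 100).round(1)
    return out
```

Calling `add_revenue_share(regional_summary)` would add the percentage column next to the existing totals; the same helper works for any table that has a `Total Sales` column.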

### Product Category Analysis

In [None]:
# Category performance
category_summary = df.groupby('product_category').agg({
    'sales_amount': ['sum', 'mean', 'count'],
    'units_sold': 'sum'
}).round(2)

# agg() returns two-level column names; replace them with flat, readable labels
category_summary.columns = ['Total Sales', 'Avg Transaction', 'Transactions', 'Total Units']
category_summary = category_summary.sort_values('Total Sales', ascending=False)
category_summary
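The regional and category summaries each look at one dimension at a time. To see which categories sell where, the two can be crossed in a pivot table. This is a sketch under the assumption that the `region`, `product_category`, and `sales_amount` columns used above are present; `category_by_region` is a hypothetical helper name.

```python
import pandas as pd


def category_by_region(frame: pd.DataFrame) -> pd.DataFrame:
    """Total sales for every region/category pair, with row and column totals."""
    return pd.pivot_table(
        frame,
        index="region",
        columns="product_category",
        values="sales_amount",
        aggfunc="sum",
        fill_value=0,          # pairs with no transactions show as 0
        margins=True,          # append a totals row and column
        margins_name="Total",
    ).round(2)
```

`category_by_region(df)` returns one row per region and one column per category, so a weak cell stands out against its row and column totals.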

### Sales Trend

In [None]:
# Daily sales totals over time
daily_sales = df.groupby('date')['sales_amount'].sum()

plt.figure(figsize=(12, 6))
plt.plot(daily_sales.index, daily_sales.values, marker='o', linewidth=2, color='#A23B72')
plt.title('Daily Sales Trend', fontsize=14, fontweight='bold')
plt.xlabel('Date', fontweight='bold')
plt.ylabel('Daily Sales ($)', fontweight='bold')
plt.grid(True, alpha=0.3)
plt.xticks(rotation=45)
plt.tight_layout()
plt.show()
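Daily totals tend to be noisy, so a rolling mean can make the underlying trend easier to read. The sketch below assumes that a calendar day with no rows means zero sales, which may not hold for this dataset; `smoothed_daily_sales` and the 7-day default window are illustrative choices.

```python
import pandas as pd


def smoothed_daily_sales(frame: pd.DataFrame, window: int = 7) -> pd.DataFrame:
    """Daily sales totals plus a trailing rolling mean over `window` days."""
    daily = frame.groupby("date")["sales_amount"].sum().sort_index()
    # Assumption: a calendar day with no rows means zero sales that day
    daily = daily.asfreq("D", fill_value=0)
    # min_periods=1 keeps the first days instead of returning NaN for them
    rolling = daily.rolling(window=window, min_periods=1).mean()
    return pd.DataFrame({"daily_sales": daily, "rolling_mean": rolling})
```

Plotting the `rolling_mean` column of `smoothed_daily_sales(df)` on top of the daily line shows the smoothed trend against the raw values.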

### Interactive Exploration

Try your own queries and visualizations below:

In [None]:
# Your code here


## Key Insights

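1. **Top Region**: The first row of `regional_summary` is the region with the highest total sales.
2. **Product Mix**: `category_summary` shows which product categories drive revenue.
3. **Trends**: The daily sales plot shows how sales change over time.
4. **Opportunities**: The bottom rows of both summaries flag underperforming regions and categories.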
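The four items above can also be read off programmatically rather than by eye. A minimal sketch, assuming the column names used earlier in this notebook; `key_insights` and its dictionary keys are hypothetical names chosen for illustration.

```python
import pandas as pd


def key_insights(frame: pd.DataFrame) -> dict:
    """Pull the headline items from the transaction table."""
    by_region = frame.groupby("region")["sales_amount"].sum()
    by_category = frame.groupby("product_category")["sales_amount"].sum()
    by_day = frame.groupby("date")["sales_amount"].sum()
    return {
        "top_region": by_region.idxmax(),
        "top_category": by_category.idxmax(),
        "peak_day": by_day.idxmax(),
        # Lowest totals are candidates for a closer look, not verdicts
        "weakest_region": by_region.idxmin(),
        "weakest_category": by_category.idxmin(),
    }
```

`key_insights(df)` returns labels only; the summary tables are still the place to check how large the gaps actually are.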

## Next Steps

- Run full analysis: `python scripts/python/data_analysis.py`
- Generate visualizations: `python scripts/python/visualizations.py`
- Create dashboard: `quarto render reports/business_dashboard.qmd`