# 📊 Brazilian E-Commerce Business Review Report

This notebook reviews business performance using the **Brazilian E-Commerce Public Dataset by Olist**. It covers:
- Revenue trends
- Review score analysis
- Delivery performance
- Customer segmentation

Each section is designed to extract actionable insights for business decision-making.

In [None]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import datetime

# Set visualization style (set_theme is the current name for sns.set)
sns.set_theme(style='whitegrid')

## 📥 Load Dataset
Load the CSV files of the Brazilian E-Commerce dataset. The relative paths below assume the files sit in the notebook's working directory.

In [None]:
# Load dataset files
orders = pd.read_csv('olist_orders_dataset.csv')
order_items = pd.read_csv('olist_order_items_dataset.csv')
products = pd.read_csv('olist_products_dataset.csv')
sellers = pd.read_csv('olist_sellers_dataset.csv')
customers = pd.read_csv('olist_customers_dataset.csv')
reviews = pd.read_csv('olist_order_reviews_dataset.csv')
payments = pd.read_csv('olist_order_payments_dataset.csv')
geolocation = pd.read_csv('olist_geolocation_dataset.csv')
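The timestamp columns load as plain strings, which is why later cells call `pd.to_datetime`. As a minimal sketch, the dates could instead be parsed at load time. `load_orders` is a hypothetical helper, and the inline CSV is made-up sample data that only mimics the column layout of `olist_orders_dataset.csv`.

```python
import io
import pandas as pd

# Timestamp columns of the orders file that this notebook uses later
ORDER_DATE_COLS = [
    'order_purchase_timestamp',
    'order_delivered_customer_date',
    'order_estimated_delivery_date',
]

def load_orders(source):
    """Read the orders CSV and parse its date columns in one step (hypothetical helper)."""
    return pd.read_csv(source, parse_dates=ORDER_DATE_COLS)

# Made-up two-row sample standing in for the real file
sample_csv = io.StringIO(
    "order_id,customer_id,order_status,order_purchase_timestamp,"
    "order_delivered_customer_date,order_estimated_delivery_date\n"
    "o1,c1,delivered,2018-01-05 10:00:00,2018-01-12 15:30:00,2018-01-20 00:00:00\n"
    "o2,c2,canceled,2018-02-01 09:00:00,,2018-02-15 00:00:00\n"
)
sample_orders = load_orders(sample_csv)
print(sample_orders.dtypes)
```

With the real data, the call would be `load_orders('olist_orders_dataset.csv')`. Orders that were never delivered keep a missing delivery date (`NaT`).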

## 💰 Revenue Trends
Analyze monthly revenue trends to understand business growth and seasonality.

In [None]:
# Parse purchase timestamps, then merge orders with payments
orders['order_purchase_timestamp'] = pd.to_datetime(orders['order_purchase_timestamp'])
revenue_data = pd.merge(orders, payments, on='order_id')
# Convert each month to its start timestamp; seaborn may fail to plot raw Period values
revenue_data['month'] = revenue_data['order_purchase_timestamp'].dt.to_period('M').dt.to_timestamp()
monthly_revenue = revenue_data.groupby('month')['payment_value'].sum().reset_index()

# Plot revenue trend
plt.figure(figsize=(12,6))
sns.lineplot(data=monthly_revenue, x='month', y='payment_value', marker='o')
plt.title('Monthly Revenue Trend')
plt.xlabel('Month')
plt.ylabel('Revenue (BRL)')
plt.xticks(rotation=45)
plt.tight_layout()
plt.show()
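The trend above sums payments for every order regardless of status. One possible refinement, sketched here under assumptions, keeps only delivered orders and adds month-over-month growth. `monthly_revenue_growth` is a hypothetical helper, the `'delivered'` filter is an assumption about what should count as revenue, and the small frames are made-up sample data.

```python
import pandas as pd

def monthly_revenue_growth(orders, payments):
    """Monthly revenue for delivered orders, with month-over-month growth (hypothetical helper)."""
    # Assumption: only delivered orders count as revenue
    delivered = orders[orders['order_status'] == 'delivered']
    merged = delivered.merge(payments, on='order_id')
    # Month-start timestamps are easier to plot than Period values
    month = merged['order_purchase_timestamp'].dt.to_period('M').dt.to_timestamp()
    out = merged.groupby(month)['payment_value'].sum().rename('revenue').to_frame()
    out.index.name = 'month'
    # Fractional change relative to the previous month (NaN for the first month)
    out['mom_growth'] = out['revenue'].pct_change()
    return out

# Made-up sample: order o1 is paid in two parts, o4 was canceled
sample_orders = pd.DataFrame({
    'order_id': ['o1', 'o2', 'o3', 'o4'],
    'order_status': ['delivered', 'delivered', 'delivered', 'canceled'],
    'order_purchase_timestamp': pd.to_datetime(
        ['2018-01-05', '2018-01-20', '2018-02-03', '2018-02-10']),
})
sample_payments = pd.DataFrame({
    'order_id': ['o1', 'o1', 'o2', 'o3', 'o4'],
    'payment_value': [100.0, 50.0, 50.0, 300.0, 999.0],
})
growth_sample = monthly_revenue_growth(sample_orders, sample_payments)
print(growth_sample)
```

An order can have several payment rows, so summing `payment_value` after the merge keeps split payments in the total.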

## ⭐ Review Score Analysis
Understand customer satisfaction by analyzing review scores and their distribution.

In [None]:
# Review score distribution
plt.figure(figsize=(8,5))
sns.countplot(data=reviews, x='review_score', palette='viridis')
plt.title('Review Score Distribution')
plt.xlabel('Review Score')
plt.ylabel('Number of Reviews')
plt.tight_layout()
plt.show()
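A few summary numbers can complement the distribution plot. The sketch below computes the mean score, the share of each score, and the share of low scores. `review_summary` is a hypothetical helper, the cutoff of 2 for a "low" score is an assumption, and the scores are made-up sample data.

```python
import pandas as pd

def review_summary(reviews, low_cutoff=2):
    """Summarize review scores (hypothetical helper; low_cutoff is an assumed threshold)."""
    scores = reviews['review_score']
    return {
        'mean_score': scores.mean(),
        # Fraction of reviews at or below the cutoff
        'low_score_share': (scores <= low_cutoff).mean(),
        # Fraction of reviews per score, ordered from lowest to highest score
        'share_by_score': scores.value_counts(normalize=True).sort_index(),
    }

# Made-up sample scores
sample_reviews = pd.DataFrame({'review_score': [5, 5, 4, 1, 2, 5, 3, 5]})
summary = review_summary(sample_reviews)
print(summary['mean_score'], summary['low_score_share'])
print(summary['share_by_score'])
```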

## 🚚 Delivery Performance
Compare actual delivery dates with estimated delivery dates to evaluate delays and fulfillment efficiency.

In [None]:
# Convert timestamps
orders['order_delivered_customer_date'] = pd.to_datetime(orders['order_delivered_customer_date'])
orders['order_estimated_delivery_date'] = pd.to_datetime(orders['order_estimated_delivery_date'])

# Calculate delivery delay in whole days (positive = late, negative = early)
orders['delivery_delay'] = (orders['order_delivered_customer_date'] - orders['order_estimated_delivery_date']).dt.days

# Plot delivery delay
plt.figure(figsize=(10,5))
sns.histplot(orders['delivery_delay'].dropna(), bins=30, kde=True, color='salmon')
plt.title('Delivery Delay Distribution')
plt.xlabel('Delivery Delay in Days (negative = early)')
plt.ylabel('Number of Orders')
plt.tight_layout()
plt.show()
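The histogram can be condensed into a few headline metrics. The following is a minimal sketch: `delivery_summary` is a hypothetical helper, treating any delay above zero whole days as late is an assumption, and the dates are made-up sample data.

```python
import pandas as pd

def delivery_summary(orders):
    """On-time rate and delay statistics for delivered orders (hypothetical helper)."""
    # Orders without a delivery date were never delivered, so they are excluded
    delivered = orders.dropna(subset=['order_delivered_customer_date'])
    delay = (delivered['order_delivered_customer_date']
             - delivered['order_estimated_delivery_date']).dt.days
    # Assumption: an order is late when it arrives at least one whole day after the estimate
    late = delay > 0
    return {
        'on_time_rate': (~late).mean(),
        'median_delay_days': delay.median(),
        'mean_days_late_when_late': delay[late].mean(),
    }

# Made-up sample: one order has no delivery date
sample_orders = pd.DataFrame({
    'order_delivered_customer_date': pd.to_datetime(
        ['2018-01-12', '2018-01-25', '2018-02-01', None, '2018-03-04']),
    'order_estimated_delivery_date': pd.to_datetime(
        ['2018-01-20', '2018-01-20', '2018-02-03', '2018-02-15', '2018-03-01']),
})
delivery_stats = delivery_summary(sample_orders)
print(delivery_stats)
```

Because `.dt.days` rounds down to whole days, an order delivered a few hours after the estimated date still counts as on time in this sketch.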

## 👥 Customer Segmentation
Segment customers based on location and order frequency to identify key markets.

In [None]:
# Merge customers with orders
customer_orders = pd.merge(customers, orders, on='customer_id')
customer_freq = customer_orders.groupby(['customer_unique_id', 'customer_state']).size().reset_index(name='order_count')

# Top 10 states by total number of orders
top_states = customer_freq.groupby('customer_state')['order_count'].sum().sort_values(ascending=False).head(10)

# Plot top states
plt.figure(figsize=(10,5))
sns.barplot(x=top_states.index, y=top_states.values, palette='coolwarm')
plt.title('Top 10 States by Total Orders')
plt.xlabel('State')
plt.ylabel('Total Orders')
plt.tight_layout()
plt.show()
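Total orders per state show market size but not how often customers return. As a sketch of a frequency-based segment, the code below computes the share of repeat customers per state. `repeat_customer_share` is a hypothetical helper, defining a repeat customer as one with more than one order is an assumption, and the frames are made-up sample data.

```python
import pandas as pd

def repeat_customer_share(customers, orders):
    """Customers, orders, and repeat-customer share per state (hypothetical helper)."""
    merged = customers.merge(orders, on='customer_id')
    # customer_unique_id identifies a person across orders
    per_customer = (merged.groupby(['customer_state', 'customer_unique_id'])
                    .size().rename('order_count').reset_index())
    # Assumption: a repeat customer has placed more than one order
    per_customer['is_repeat'] = per_customer['order_count'] > 1
    return (per_customer.groupby('customer_state')
            .agg(customers=('customer_unique_id', 'nunique'),
                 orders=('order_count', 'sum'),
                 repeat_share=('is_repeat', 'mean'))
            .sort_values('orders', ascending=False))

# Made-up sample: customer u1 placed two orders under two customer_id values
sample_customers = pd.DataFrame({
    'customer_id': ['c1', 'c2', 'c3', 'c4'],
    'customer_unique_id': ['u1', 'u1', 'u2', 'u3'],
    'customer_state': ['SP', 'SP', 'SP', 'RJ'],
})
sample_orders = pd.DataFrame({
    'order_id': ['o1', 'o2', 'o3', 'o4'],
    'customer_id': ['c1', 'c2', 'c3', 'c4'],
})
segments = repeat_customer_share(sample_customers, sample_orders)
print(segments)
```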

## ✅ Conclusion
This starter notebook provides a foundation for analyzing business performance using the Brazilian E-Commerce dataset. You can extend it by:
- Adding product category analysis
- Mapping geolocation data
- Predicting delivery delays
- Running sentiment analysis on review comments
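
As a starting point for the first extension, product category analysis could look roughly like the sketch below. `revenue_by_category` is a hypothetical helper, using the item `price` as revenue (without freight) is an assumption, and the frames and category names are made-up sample data.

```python
import pandas as pd

def revenue_by_category(order_items, products, top_n=10):
    """Sum item prices per product category (hypothetical helper)."""
    merged = order_items.merge(
        products[['product_id', 'product_category_name']],
        on='product_id', how='left')
    # Products without a category are grouped under a placeholder label
    merged['product_category_name'] = merged['product_category_name'].fillna('unknown')
    return (merged.groupby('product_category_name')['price']
            .sum().sort_values(ascending=False).head(top_n))

# Made-up sample: product p3 has no category
sample_items = pd.DataFrame({
    'order_id': ['o1', 'o1', 'o2', 'o3'],
    'product_id': ['p1', 'p2', 'p1', 'p3'],
    'price': [100.0, 20.0, 100.0, 50.0],
})
sample_products = pd.DataFrame({
    'product_id': ['p1', 'p2', 'p3'],
    'product_category_name': ['cat_a', 'cat_b', None],
})
category_revenue = revenue_by_category(sample_items, sample_products)
print(category_revenue)
```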