
# Process Improvement Case Study: Order Fulfilment

This notebook analyses order fulfilment cycle times using the **100 Sales Records** dataset.  
The dataset contains 100 orders across multiple regions and includes both **Order Date** and **Ship Date**, which lets us calculate the cycle time (in days) for each order. The goals of this case study are to:

* Map the high‑level process flow from order placement to shipping.
* Calculate and visualise order cycle times to identify bottlenecks.
* Provide data‑driven recommendations to improve the process.


In [None]:

import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

# Load data
file_path = 'sales_data.csv'
df = pd.read_csv(file_path)

# Convert date columns to datetime; values that cannot be parsed become NaT
for col in ['Order Date', 'Ship Date']:
    df[col] = pd.to_datetime(df[col], errors='coerce')

# Compute cycle time in days
df['Cycle_Time'] = (df['Ship Date'] - df['Order Date']).dt.days

df[['Region','Order Date','Ship Date','Cycle_Time']].head()
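
Because `errors='coerce'` turns unparseable dates into `NaT` without raising, it can be worth checking how many rows are affected before trusting the statistics. The following is a minimal sketch of such a check, not part of the original analysis: the helper name `audit_cycle_times` is hypothetical, and the sample rows are made up for illustration rather than taken from the dataset.

```python
import pandas as pd

def audit_cycle_times(orders: pd.DataFrame) -> dict:
    """Count rows that would distort cycle-time statistics (hypothetical helper)."""
    order_date = pd.to_datetime(orders["Order Date"], errors="coerce")
    ship_date = pd.to_datetime(orders["Ship Date"], errors="coerce")
    cycle = (ship_date - order_date).dt.days
    return {
        # Rows where at least one date could not be parsed
        "unparseable_dates": int((order_date.isna() | ship_date.isna()).sum()),
        # Rows where the ship date precedes the order date
        "negative_cycle_times": int((cycle < 0).sum()),
    }

# Made-up rows for illustration only, not taken from the dataset
sample = pd.DataFrame({
    "Order Date": ["2010-05-28", "not a date", "2012-06-10"],
    "Ship Date": ["2010-06-27", "2012-09-15", "2012-06-01"],
})
report = audit_cycle_times(sample)
print(report)
```

If either count is non-zero on the real data, those rows probably deserve a look before they are averaged into the regional figures.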


In [None]:

# Summary statistics for cycle time
# (a cell only renders its last bare expression, so print the earlier results)
summary = df['Cycle_Time'].describe()
print(summary)

# Average cycle time by region, slowest first
avg_by_region = df.groupby('Region')['Cycle_Time'].mean().sort_values(ascending=False)
print(avg_by_region)

# Top 5 longest cycle times
top5 = df.nlargest(5, 'Cycle_Time')[['Region', 'Country', 'Order Date', 'Ship Date', 'Cycle_Time']]
top5
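
A fixed top-5 list shows the slowest orders but does not say whether they are unusually slow. One common heuristic, used here as an illustration and not as the method of this case study, is to flag orders whose cycle time exceeds the upper quartile by more than 1.5 times the interquartile range. The helper name `flag_slow_orders`, the multiplier `k`, and the sample values below are all assumptions.

```python
import pandas as pd

def flag_slow_orders(orders: pd.DataFrame, col: str = "Cycle_Time", k: float = 1.5):
    """Return orders above Q3 + k * IQR, plus the threshold used (hypothetical helper)."""
    q1 = orders[col].quantile(0.25)
    q3 = orders[col].quantile(0.75)
    threshold = q3 + k * (q3 - q1)
    return orders[orders[col] > threshold], threshold

# Made-up cycle times for illustration only, not taken from the dataset
sample = pd.DataFrame({
    "Region": ["Europe", "Asia", "Europe", "Asia", "Europe", "Asia"],
    "Cycle_Time": [10, 12, 14, 16, 18, 90],
})
slow, threshold = flag_slow_orders(sample)
print(f"Threshold: {threshold} days")
print(slow)
```

Applied to the notebook's `df`, the same call would return the orders that stand out from the bulk of the distribution, which could then be grouped by region to see where the outliers concentrate.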


In [None]:

# Plot cycle time distribution
plt.figure(figsize=(8,5))
sns.histplot(df['Cycle_Time'], bins=10, kde=False, color='#4C72B0')
plt.title('Order Cycle Time Distribution')
plt.xlabel('Cycle Time (days)')
plt.ylabel('Number of Orders')
plt.show()

# Plot average cycle time by region
plt.figure(figsize=(10,6))
avg_by_region.plot(kind='bar', color='#4C72B0')
plt.title('Average Order Cycle Time by Region')
plt.xlabel('Region')
plt.ylabel('Average Cycle Time (days)')
plt.xticks(rotation=45, ha='right')
plt.tight_layout()  # keep the rotated region labels from being clipped
plt.show()


**Note:** Additional deliverables, such as process flow diagrams, use cases, requirements, and stakeholder analysis, are included in the project repository.