
# **Project Name**    -



##### **Project Type**    - EDA
##### **Contribution**    - Individual


# **Project Summary -**

This dataset provides insights into supply chain operations, focusing on product demand, vendor performance, shipment methods, delivery delays, and financial metrics. It contains over 10,000 orders and 33 columns, offering detailed data for analysis. The goal is to identify key trends and improve logistics, vendor relationships, and revenue generation.

The analysis shows Nigeria as the country with the highest product demand, making it a critical focus area for business growth. SCMS from RDC is identified as the top-performing vendor, contributing the most to total revenue. Strengthening partnerships with such vendors can enhance overall supply chain performance.

Air shipment is the most frequently used mode, accounting for a significant percentage of deliveries. While it ensures faster transit times, it may also lead to higher costs, highlighting the need for shipment optimization strategies. Delivery delays vary across countries, suggesting logistical challenges that must be addressed to improve customer satisfaction and maintain business efficiency.

The ARV product group leads in line item value, indicating its importance to the overall revenue. Understanding its success can help replicate similar strategies for other product groups. Pairplot analysis of the numerical columns reveals relationships such as the link between weight and freight cost and between line item quantity and line item value, providing actionable insights for decision-making.

Visualizations like bar charts, pie charts, stacked bar charts, and line graphs effectively present the data. For instance, bar charts show top vendors by revenue, pie charts display vendor contributions, and stacked bar charts explain shipment modes by country.

In conclusion, this dataset highlights critical trends in supply chain management, offering actionable insights to optimize operations, enhance vendor partnerships, and improve logistics, ultimately driving growth and customer satisfaction.

# **GitHub Link -**

Provide your GitHub Link here.

# **Problem Statement**


The task is to analyze supply chain operations and find ways to optimize them by identifying key trends in product demand, vendor contributions, shipment efficiency, delivery delays, and product performance. The goal is to improve logistics, strengthen vendor relationships, and maximize revenue while minimizing delays.

# ***Let's Begin !***

## ***1. Know Your Data***

### Import Libraries

In [None]:
# Import Libraries

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

### Dataset Loading

In [None]:
# Load Dataset

from google.colab import drive
drive.mount('/content/drive')

df = pd.read_csv(r'/content/drive/MyDrive/SCMS_Delivery_History_Dataset.csv',encoding='latin1')

### Dataset First View

In [None]:
# Dataset First Look
df.head()

### Dataset Rows & Columns count

In [None]:
# Dataset Rows & Columns count
df.shape

### Dataset Information

In [None]:
# Dataset Info
df.info()

#### Duplicate Values

In [None]:
# Dataset Duplicate Value Count
df.duplicated().sum()

#### Missing Values/Null Values

In [None]:
# Missing Values/Null Values Count
print(df.isnull().sum())

In [None]:
# Visualizing the missing values
sns.heatmap(df.isnull(), cbar=False, cmap="viridis")
plt.show()


### What did you know about your dataset?

Answer Here: The dataset covers product demand, vendor contributions, shipment modes, delivery delays, and product group performance, which helps identify key trends and areas for improvement. It contains over 10,000 orders and 33 columns, offering comprehensive data for analysis and decision-making.
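
To put numbers on the missing-value check above, a small helper can report the count and share of missing entries per column. This is only an illustrative sketch: `missing_summary` is a hypothetical helper, and `toy` is a made-up frame standing in for `df`.

```python
import pandas as pd


def missing_summary(frame: pd.DataFrame) -> pd.DataFrame:
    """Count and percentage of missing values, for columns that have any."""
    counts = frame.isnull().sum()
    summary = pd.DataFrame({
        "missing": counts,
        "percent": (counts / len(frame) * 100).round(2),
    })
    # Keep only columns with gaps, worst first
    return summary[summary["missing"] > 0].sort_values("missing", ascending=False)


# Toy frame with made-up values (not taken from the real dataset)
toy = pd.DataFrame({
    "Country": ["A", "B", "C", "D"],
    "Shipment Mode": ["Air", None, "Truck", None],
    "Dosage Form": [None, "Tablet", "Capsule", "Tablet"],
})
report = missing_summary(toy)
print(report)
```

On the real data the call would be `missing_summary(df)`, run before any rows are dropped.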

## ***2. Understanding Your Variables***

In [None]:
# Dataset Columns
df.columns

In [None]:
# Dataset Describe
df.describe()

### Check Unique Values for each variable.

In [None]:
# Check Unique Values for each variable.
for column in df.columns:
    print(f"{column} unique values: {df[column].nunique()}")

## ***3. Data Wrangling***

In [None]:
df.shape

### Data Wrangling Code

In [None]:
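# Make the dataset analysis-ready: drop every row that has at least one missing value
df.dropna(inplace=True)
# A fillna(0) call after this point would have no effect, because dropna() has already removed all missing values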

In [None]:
df.shape

In [None]:
df.columns
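
Dropping every row that has any missing value can discard many records when only a few columns have gaps. One possible alternative, sketched here under assumptions, is to drop rows only when a column essential to the analysis is missing and to label the remaining gaps. The helper `clean_selectively`, the choice of required column, the fill labels, and the toy data are all hypothetical.

```python
import pandas as pd


def clean_selectively(frame, required, fill_values):
    """Drop rows missing any `required` column, then fill other gaps per `fill_values`."""
    cleaned = frame.dropna(subset=required)
    return cleaned.fillna(value=fill_values)


# Toy frame with made-up values (not taken from the real dataset)
toy = pd.DataFrame({
    "Line Item Quantity": [10, None, 30, 40],
    "Shipment Mode": ["Air", "Truck", None, "Air"],
    "Dosage Form": [None, "Tablet", "Capsule", None],
})
cleaned = clean_selectively(
    toy,
    required=["Line Item Quantity"],
    fill_values={"Shipment Mode": "Unknown", "Dosage Form": "Not specified"},
)
rows_after_blanket_drop = len(toy.dropna())
print(rows_after_blanket_drop, len(cleaned))
```

Comparing the two row counts shows how much data each strategy keeps; which columns count as required is a judgment call for the analyst.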

## ***4. Data Visualization, Storytelling & Experimenting with charts : Understand the relationships between variables***

#### Chart - 1

In [None]:
# Chart - 1 visualization code

country_quantity = df.groupby('Country')['Line Item Quantity'].sum()
country_quantity.plot(kind='bar', figsize=(10, 6), color='skyblue')
plt.title('Country-wise Line Item Quantity')
plt.ylabel('Line Item Quantity')
plt.xlabel('Country')
plt.show()

##### 1. Why did you pick the specific chart?

Answer Here: Bar charts are ideal for comparing a numeric total across categories, such as line item quantity per country.

##### 2. What is/are the insight(s) found from the chart?

Answer Here: The chart identifies the countries with the highest and lowest product demand. Nigeria has the highest demand.
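
A ranking like this can also be read off directly instead of from the bar heights. The sketch below uses a hypothetical helper `top_groups` and made-up country labels and quantities to show the idea.

```python
import pandas as pd


def top_groups(frame, group_col, value_col, n=3):
    """Sum `value_col` per `group_col` and return the n largest totals."""
    return frame.groupby(group_col)[value_col].sum().nlargest(n)


# Toy frame with made-up values (not taken from the real dataset)
toy = pd.DataFrame({
    "Country": ["X", "Y", "X", "Z", "Y", "X"],
    "Line Item Quantity": [100, 50, 200, 10, 25, 5],
})
ranking = top_groups(toy, "Country", "Line Item Quantity", n=2)
print(ranking)
```

Plotting `ranking` rather than the unsorted totals would also put the bars in descending order, which makes the highest-demand countries easier to spot.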


##### 3. Will the gained insights help create a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

Answer Here: Helps focus marketing efforts on high-demand regions and optimize supply chains.

#### Chart - 2

In [None]:
# Chart - 2 visualization code
vendor_value = df.groupby('Vendor')['Line Item Value'].sum()
vendor_value.plot(kind='pie', autopct='%1.1f%%', figsize=(8, 8))
plt.title('Vendor Contribution to Total Value')
plt.ylabel('')
plt.show()

##### 1. Why did you pick the specific chart?

Answer Here: Pie charts effectively show proportional contributions.

##### 2. What is/are the insight(s) found from the chart?

Answer Here: Reveals the top vendors driving business value.
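
When there are many vendors, a pie chart tends to fill up with slices too thin to read. A common workaround, sketched here as an assumption rather than something the notebook does, is to keep the largest contributors and merge the rest into one slice. The helper `collapse_small_slices`, the label "Other", and the vendor totals are hypothetical.

```python
import pandas as pd


def collapse_small_slices(totals: pd.Series, top_n: int = 5,
                          other_label: str = "Other") -> pd.Series:
    """Keep the top_n largest entries and sum the remainder into one slice."""
    ordered = totals.sort_values(ascending=False)
    head = ordered.iloc[:top_n]
    remainder = ordered.iloc[top_n:].sum()
    if remainder > 0:
        head = pd.concat([head, pd.Series({other_label: remainder})])
    return head


# Made-up vendor totals (not taken from the real dataset)
toy_totals = pd.Series({"V1": 500.0, "V2": 300.0, "V3": 20.0, "V4": 10.0, "V5": 5.0})
slices = collapse_small_slices(toy_totals, top_n=2)
shares = (slices / slices.sum() * 100).round(1)
print(shares)
```

The collapsed series can be passed to `.plot(kind='pie', ...)` in the same way as the per-vendor totals.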

##### 3. Will the gained insights help create a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

Answer Here: Helps prioritize partnerships with high-contribution vendors.

#### Chart - 3

In [None]:
# Chart - 3 visualization code
df['Delivered to Client Date'] = pd.to_datetime(df['Delivered to Client Date'])
monthly_trend = df.groupby(df['Delivered to Client Date'].dt.to_period('M'))['Line Item Quantity'].sum()
monthly_trend.plot(kind='line', marker='o', figsize=(10, 6), color='green')
plt.title('Monthly Delivery Trends')
plt.ylabel('Line Item Quantity')
plt.xlabel('Month')
plt.show()

##### 1. Why did you pick the specific chart?

Answer Here: Line charts are best for visualizing trends over time.

##### 2. What is/are the insight(s) found from the chart?

Answer Here:  Identifies peak and off-peak delivery periods.
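
The peak period can be pulled out of the monthly totals programmatically. The sketch below assumes the dates are stored as text in a day-month-year style such as `2-Jun-06`, and that some entries may not be dates at all. Both the format string and the toy rows are assumptions for illustration.

```python
import pandas as pd

# Toy frame with made-up values (not taken from the real dataset)
toy = pd.DataFrame({
    "Delivered to Client Date": ["2-Jun-06", "14-Jun-06", "3-Jul-06", "not a date"],
    "Line Item Quantity": [100, 50, 30, 999],
})

# errors="coerce" turns unparseable entries into NaT instead of raising
dates = pd.to_datetime(
    toy["Delivered to Client Date"], format="%d-%b-%y", errors="coerce"
)

# Rows whose date is NaT fall out of the groupby automatically
monthly = toy.groupby(dates.dt.to_period("M"))["Line Item Quantity"].sum()
peak_month = monthly.idxmax()
print(monthly)
print("Peak month:", peak_month)
```

Counting `dates.isna().sum()` before grouping shows how many rows were excluded because their date could not be parsed.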

##### 3. Will the gained insights help create a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

Answer Here: Supports inventory planning and resource allocation.

#### Chart - 4

In [None]:
# Chart - 4 visualization code
df['Pack Price'].plot(kind='hist', bins=20, color='orange', edgecolor='black', figsize=(10, 6))
plt.title('Pack Price Distribution')
plt.xlabel('Pack Price')
plt.ylabel('Frequency')
plt.show()

##### 1. Why did you pick the specific chart?

Answer Here: Histograms show frequency distributions effectively.

##### 2. What is/are the insight(s) found from the chart?

Answer Here: Highlights pricing ranges and outliers.

##### 3. Will the gained insights help create a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

Answer Here: Guides pricing strategy and identifies irregularities.

#### Chart - 5

In [None]:
# Chart - 5 visualization code
shipment_mode_country = df.groupby(['Country', 'Shipment Mode'])['Line Item Quantity'].sum().unstack()
shipment_mode_country.plot(kind='bar', stacked=True, figsize=(12, 8), colormap='viridis')
plt.title('Shipment Mode by Country')
plt.ylabel('Line Item Quantity')
plt.xlabel('Country')
plt.show()

##### 1. Why did you pick the specific chart?

Answer Here: Stacked bars compare categories within groups.

##### 2. What is/are the insight(s) found from the chart?

Answer Here: Shows shipment preferences by country.
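
Raw stacked totals make small countries hard to compare with large ones. One option, sketched here with made-up data, is to convert each country's row to percentages so that every bar sums to 100. The country labels and quantities are hypothetical.

```python
import pandas as pd

# Toy frame with made-up values (not taken from the real dataset)
toy = pd.DataFrame({
    "Country": ["X", "X", "X", "Y", "Y"],
    "Shipment Mode": ["Air", "Air", "Truck", "Ocean", "Air"],
    "Line Item Quantity": [10, 30, 60, 80, 20],
})

# Total quantity per country and shipment mode; absent combinations become 0
totals = toy.pivot_table(
    index="Country", columns="Shipment Mode",
    values="Line Item Quantity", aggfunc="sum", fill_value=0,
)

# Divide each row by its own total to get percentage shares per country
shares = totals.div(totals.sum(axis=1), axis=0) * 100
print(shares.round(1))
```

Plotting `shares` with `kind='bar', stacked=True` would give a 100% stacked bar chart, which shows the mode mix per country independent of volume.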

##### 3. Will the gained insights help create a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

Answer Here: Optimizes shipping strategies and cost.

#### Chart - 6

In [None]:
# Chart - 6 visualization code
df['Delay'] = (pd.to_datetime(df['Delivered to Client Date']) - pd.to_datetime(df['Scheduled Delivery Date'])).dt.days
sns.heatmap(df.pivot_table(values='Line Item Quantity', index='Delay', columns='Country', aggfunc='sum'), cmap='coolwarm')
plt.title('Delivery Delays by Country')
plt.show()

##### 1. Why did you pick the specific chart?

Answer Here: Heatmaps are effective for showing intensity.

##### 2. What is/are the insight(s) found from the chart?

Answer Here: Highlights delay patterns across countries.
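
Delay patterns can also be summarized as numbers per country. The sketch below computes the average delay in days and the share of late deliveries. The toy dates, the ISO date format, and the summary column names `mean_delay` and `late_share` are assumptions made for illustration.

```python
import pandas as pd

# Toy frame with made-up values (not taken from the real dataset)
toy = pd.DataFrame({
    "Country": ["X", "X", "Y", "Y"],
    "Scheduled Delivery Date": ["2006-06-01", "2006-06-10", "2006-07-01", "2006-07-05"],
    "Delivered to Client Date": ["2006-06-01", "2006-06-15", "2006-06-28", "2006-07-05"],
})
scheduled = pd.to_datetime(toy["Scheduled Delivery Date"])
delivered = pd.to_datetime(toy["Delivered to Client Date"])

# Positive = delivered late, negative = delivered early, 0 = on time
toy["Delay"] = (delivered - scheduled).dt.days

delay_summary = toy.groupby("Country")["Delay"].agg(
    mean_delay="mean",
    late_share=lambda d: (d > 0).mean(),  # fraction of deliveries that were late
)
print(delay_summary)
```

A table like this makes it possible to rank countries by lateness, which is harder to do from heatmap colors alone.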

##### 3. Will the gained insights help create a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

Answer Here: Helps improve logistics and reduce delays.



#### Chart - 7

In [None]:
# Chart - 7 visualization code
sns.boxplot(x='Product Group', y='Line Item Value', data=df)
plt.title('Line Item Value Distribution by Product Group')
plt.xticks(rotation=45)
plt.show()


##### 1. Why did you pick the specific chart?

Answer Here: Box plots reveal variability and outliers.
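
The points a box plot draws beyond its whiskers follow the conventional 1.5 × IQR rule by default, so the same outliers can be listed in code. The helper `iqr_outliers` and the sample values below are hypothetical.

```python
import pandas as pd


def iqr_outliers(values: pd.Series, k: float = 1.5) -> pd.Series:
    """Return the values lying more than k * IQR outside the quartiles."""
    q1, q3 = values.quantile(0.25), values.quantile(0.75)
    iqr = q3 - q1
    mask = (values < q1 - k * iqr) | (values > q3 + k * iqr)
    return values[mask]


# Made-up sample with one obviously extreme value
toy_values = pd.Series([10, 12, 11, 13, 12, 11, 100])
flagged = iqr_outliers(toy_values)
print(flagged)
```

Applied per product group, for example inside a `groupby('Product Group')` loop, this would list the specific line items behind each outlier marker.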

##### 2. What is/are the insight(s) found from the chart?

Answer Here: Identifies high-value product groups.

##### 3. Will the gained insights help create a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

Answer Here:  Focuses efforts on profitable categories.

#### Chart - 8

In [None]:
# Chart - 8 visualization code
insurance_trend = df.groupby(df['Delivered to Client Date'].dt.to_period('M'))['Line Item Insurance (USD)'].sum()
insurance_trend.plot(kind='area', color='lightblue', alpha=0.6, figsize=(10, 6))
plt.title('Insurance Costs over Time')
plt.ylabel('Insurance Costs (USD)')
plt.xlabel('Month')
plt.show()


##### 1. Why did you pick the specific chart?

Answer Here: Area charts emphasize the magnitude of a quantity as it changes over time.

##### 2. What is/are the insight(s) found from the chart?

Answer Here: Shows fluctuations in insurance expenses.

##### 3. Will the gained insights help create a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

Answer Here: Assists in budget control and risk management.

#### Chart - 9

In [None]:
# Chart - 9 visualization code
df['Dosage Form'].value_counts().plot(kind='bar', color='coral', figsize=(10, 6))
plt.title('Dosage Form Distribution')
plt.ylabel('Count')
plt.xlabel('Dosage Form')
plt.show()

##### 1. Why did you pick the specific chart?

Answer Here: Simple and clear for category counts.

##### 2. What is/are the insight(s) found from the chart?

Answer Here: Highlights common dosage forms.

##### 3. Will the gained insights help create a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

Answer Here:  Informs production focus on popular forms.

#### Chart - 10

In [None]:
# Chart - 10 visualization code
sns.pairplot(df[['Line Item Quantity', 'Line Item Value', 'Pack Price', 'Weight (Kilograms)']])
plt.show()

##### 1. Why did you pick the specific chart?

Answer Here: Pair plots explore relationships between multiple variables.

##### 2. What is/are the insight(s) found from the chart?

Answer Here: Reveals how line item quantity, line item value, pack price, and weight relate to one another.

##### 3. Will the gained insights help create a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

Answer Here: Guides deeper multivariate analysis.

#### Chart - 11 - Correlation Heatmap

In [None]:
# Correlation Heatmap visualization code

numerical_df = df.select_dtypes(include=['number'])
correlation_matrix = numerical_df.corr()
plt.figure(figsize=(10, 8))
sns.heatmap(correlation_matrix, annot=True, cmap='coolwarm', fmt='.2f', linewidths=0.5)
plt.title('Correlation Heatmap')
plt.show()



##### 1. Why did you pick the specific chart?

Answer Here: Heatmaps are visually intuitive for correlation analysis between numerical variables.

##### 2. What is/are the insight(s) found from the chart?

Answer Here: Highlights strong or weak correlations (e.g., between Line Item Value and Line Item Quantity).
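
With many numerical columns, the strongest relationships are easier to read from a ranked list than from the colored grid. The sketch below extracts each column pair once from the correlation matrix and sorts by absolute correlation. The helper `strongest_pairs` and the toy values are hypothetical.

```python
import numpy as np
import pandas as pd


def strongest_pairs(frame: pd.DataFrame, n: int = 3) -> pd.Series:
    """Return the n column pairs with the largest absolute correlation."""
    corr = frame.select_dtypes(include="number").corr()
    cols = corr.columns
    # Indices of the upper triangle (k=1 skips the diagonal), so each pair appears once
    rows_i, cols_j = np.triu_indices(len(cols), k=1)
    pairs = pd.Series(
        corr.to_numpy()[rows_i, cols_j],
        index=pd.MultiIndex.from_arrays([cols[rows_i], cols[cols_j]]),
    )
    return pairs.sort_values(key=lambda s: s.abs(), ascending=False).head(n)


# Toy frame with made-up values (not taken from the real dataset)
toy = pd.DataFrame({
    "Line Item Quantity": [1, 2, 3, 4, 5],
    "Line Item Value": [2, 4, 6, 8, 10],
    "Pack Price": [5, 3, 4, 1, 2],
})
top = strongest_pairs(toy, n=1)
print(top)
```

Sorting by absolute value keeps strong negative correlations in view alongside strong positive ones.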

#### Chart - 12 - Pair Plot

In [None]:
# Pair Plot visualization code

columns_to_plot = [
    'Line Item Quantity', 'Line Item Value', 'Pack Price',
    'Unit Price', 'Weight (Kilograms)', 'Freight Cost (USD)'
]
sns.pairplot(df[columns_to_plot], diag_kind='kde', corner=True)
plt.show()


##### 1. Why did you pick the specific chart?

Answer Here: Pairplots show relationships and distributions across multiple numerical variables.

##### 2. What is/are the insight(s) found from the chart?

Answer Here:  Provides a holistic view of dependencies (e.g., the impact of Weight on Freight Cost).
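
A relationship such as weight versus freight cost can only be plotted or correlated if both columns are numeric. If either column is stored as text, for example because some rows hold a note instead of a number, `select_dtypes(include=['number'])` will leave it out. The sketch below shows one way to coerce such columns. The placeholder strings and values are assumptions for illustration, not confirmed contents of the dataset.

```python
import pandas as pd

# Toy frame with made-up values; the text placeholders are assumed examples
toy = pd.DataFrame({
    "Weight (Kilograms)": ["13", "358", "Weight Captured Separately"],
    "Freight Cost (USD)": ["780.34", "Freight Included in Commodity Cost", "1653.78"],
})

for column in ["Weight (Kilograms)", "Freight Cost (USD)"]:
    # Entries that are not numbers become NaN instead of raising an error
    toy[column] = pd.to_numeric(toy[column], errors="coerce")

print(toy.dtypes)
print(toy.dropna().shape)
```

Checking `df.dtypes` before plotting confirms whether this step is needed on the real data.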

# **Conclusion**

The dataset reveals that Nigeria has the highest product demand, making it a key focus area for business operations. Vendor contributions show that SCMS from RDC accounts for the maximum value, emphasizing their critical role in the supply chain. Air shipment emerges as the most utilized mode across countries, likely due to its efficiency for time-sensitive deliveries. Delivery delays vary by country, indicating the need for targeted improvements in logistics to ensure timely fulfillment. The ARV product group has the highest line item value distribution, showcasing its significant contribution to overall revenue. Pairplot analysis of numerical values provides insights into relationships and trends within the dataset, aiding in better decision-making. These insights, combined with visualizations, highlight essential areas for optimization and growth opportunities.

### ***Hurrah! You have successfully completed your EDA Capstone Project!!!***