<a href="https://colab.research.google.com/github/pragnya2001/FedEx/blob/main/FedEx.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# **Project Name**    -   FedEx Logistics Delivery Performance Analysis





##### **Project Type**    - Exploratory Data Analysis (EDA)
##### **Contribution**    - Pragnyasmita Chand

# **Project Summary -**

In today's fast-paced global supply chain, timely and cost-effective logistics play a critical role in business success. FedEx, as a global logistics service provider, handles a vast number of shipments daily across multiple countries, vendors and transportation modes. Ensuring that deliveries arrive on time and at optimized costs is essential not only for maintaining customer satisfaction but also for sustaining operational efficiency and profitability.


This project focuses on analysing the FedEx logistics delivery dataset to explore delivery performance trends and uncover patterns that affect shipment delays and freight costs. The dataset includes detailed records of shipment information such as shipment modes, managing teams, delivery dates, scheduled delivery dates, countries involved, vendor INCO terms, weights, freight costs, insurance costs and the dates when purchase orders were sent to vendors.

The key objective of this EDA project is to understand how various features influence delivery outcomes and to identify the factors that contribute to delays or higher costs. By investigating the relationships between variables like shipment mode, country of delivery, vendor terms and delivery lead time, we can extract actionable insights that may help in improving service reliability and cost efficiency.

The analysis begins with an overview and cleaning of the dataset to identify and handle missing or incorrect data. Next, we create a new feature that helps determine whether a shipment was delivered on time or not. We also compute the number of days between when the purchase order was sent and the scheduled delivery date, as this lead time is likely to affect on-time delivery.

The core of this project lies in performing univariate, bivariate and multivariate analyses. We explore how the distributions of shipment mode, costs and weights differ across the dataset. Further, we examine how various combinations of features, such as freight cost by shipment mode or delays by country, interact to reveal patterns and insights. Correlation analysis helps us quantify the strength of relationships between numerical variables, such as weight, freight cost and insurance.

Using visualisation tools like matplotlib, seaborn and plotly, we present insights in a clear and intuitive format. These visualizations aid in understanding which factors are most closely associated with delays and high costs. For example, the analysis may reveal that air shipments, although more expensive, have the highest on-time delivery rates, or that certain countries face more frequent delivery delays due to customs or geographic challenges.

The findings from this analysis will allow stakeholders at FedEx to make more informed decisions regarding shipment strategies, vendor negotiations and performance monitoring. Recommendations may include optimizing transportation modes, adjusting delivery lead times, or focusing on country-specific process improvements.

Ultimately, this project demonstrates how data-driven decisions can improve logistics performance and customer satisfaction while reducing unnecessary costs. It serves as a foundational step towards building more predictive and intelligent logistics systems that adapt to complex supply chain environments.


# **GitHub Link -**

https://github.com/pragnya2001

# **Problem Statement**


FedEx logistics faces challenges in ensuring consistent and timely deliveries across different countries and vendors. Delayed shipments can cause disruption in the supply chain, increase operational costs and lead to dissatisfied customers. There is a need to understand the underlying causes of these delays and the factors affecting freight costs.


From the dataset provided, we observed that several elements, such as shipment mode, country, vendor terms, lead time and shipment weight, may influence delivery timeliness and cost effectiveness. However, the specific impact and interactions of these variables are unclear without a detailed exploratory analysis.

#### **Define Your Business Objective?**

The business objective of this project:

- Analyze delivery data to identify key factors influencing on-time vs delayed shipments.
- Evaluate how vendor terms, shipment modes and countries impact freight costs and delivery reliability.
- Discover patterns and trends that FedEx can act on to improve logistics performance.
- Provide data-driven insights and recommendations to help FedEx optimize operations, reduce delivery delays and lower freight costs.

By achieving these objectives, FedEx will be better equipped to make strategic decisions about vendor management, shipment planning and resource allocation, ultimately enhancing customer satisfaction and profitability.


# ***Let's Begin !***

## ***1. Know Your Data***

### Import Libraries

In [None]:
# Import Libraries

In [None]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import plotly.express as px



### Dataset Loading

In [None]:
# Load Dataset

In [None]:
from google.colab import files
uploaded = files.upload()


### Dataset First View

In [None]:
# Dataset First Look

In [None]:
df = pd.read_csv('SCMS_Delivery_History_Dataset.csv')
df.head()

In [None]:
df.tail()

### Dataset Rows & Columns count

In [None]:
# Dataset Rows & Columns count

In [None]:
df.shape

### Dataset Information

In [None]:
# Dataset Info

In [None]:
df.info()

#### Duplicate Values

In [None]:
# Dataset Duplicate Value Count

In [None]:
duplicate_count = df.duplicated().sum()
print(f"number of duplicate rows: {duplicate_count}")

In [None]:
# Count columns whose values exactly repeat an earlier column
duplicate_col_count = df.T.duplicated().sum()
print(f"number of duplicate columns: {duplicate_col_count}")
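
As a rough illustration of the difference between duplicate rows and duplicate columns, here is a minimal sketch on a toy frame. The frame `toy` and its values are made up for this example and are not taken from the dataset.

```python
import pandas as pd

# Toy frame (hypothetical values): row 2 repeats row 0, and column "b" repeats column "a".
toy = pd.DataFrame({
    "a": [1, 2, 1],
    "b": [1, 2, 1],
    "c": ["x", "y", "x"],
})

duplicate_rows = int(toy.duplicated().sum())    # rows identical to an earlier row
duplicate_cols = int(toy.T.duplicated().sum())  # columns identical to an earlier column

print(f"duplicate rows: {duplicate_rows}, duplicate columns: {duplicate_cols}")

# Keep only the first occurrence of each repeated row.
deduped = toy.drop_duplicates().reset_index(drop=True)
print(deduped)
```

Transposing before calling `duplicated()` compares whole columns instead of whole rows, which is one possible reading of "duplicate columns".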

#### Missing Values/Null Values

In [None]:
# Missing Values/Null Values Count

In [None]:
missing_values = df.isnull().sum()

In [None]:
missing_values = missing_values[missing_values > 0]

In [None]:
missing_values
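
Raw counts are easier to judge next to the share of rows they represent. The sketch below builds such a report on a small made-up frame; the column names are borrowed from the dataset, but the values and the `report` name are assumptions for illustration.

```python
import pandas as pd

# Toy frame with gaps (hypothetical values).
toy = pd.DataFrame({
    "Shipment Mode": ["Air", None, "Truck", "Air"],
    "Dosage": [None, None, "10mg", None],
    "Country": ["A", "B", "C", "D"],
})

missing = toy.isnull().sum()
report = pd.DataFrame({
    "missing": missing,
    "percent": (missing / len(toy) * 100).round(1),
})

# Show only columns that actually have gaps, worst first.
report = report[report["missing"] > 0].sort_values("missing", ascending=False)
print(report)
```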

In [None]:
# Visualizing the missing values

In [None]:
plt.figure(figsize=(12,6))
sns.heatmap(df.isnull(), cbar=False, cmap="YlGnBu")
plt.title("Heatmap of missing values")
plt.show()

In [None]:
#Handling missing values

In [None]:
#shipment mode

In [None]:
most_common_mode = df['Shipment Mode'].mode()[0]
df.fillna({'Shipment Mode': most_common_mode}, inplace=True)

In [None]:
#Dosage

In [None]:
df.fillna({'Dosage':'Unknown'}, inplace = True)
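
A quick way to sanity-check a mode-based fill is to run it on a tiny column first. This is a minimal sketch with made-up values; the point to note is that the variable holding the mode is passed unquoted, otherwise the literal text of the variable name would be written into the column.

```python
import pandas as pd

# Toy column with gaps (hypothetical values, not taken from the dataset).
toy = pd.DataFrame({"Shipment Mode": ["Air", "Truck", None, "Air", None]})

most_common = toy["Shipment Mode"].mode()[0]  # mode() ignores missing values
filled = toy["Shipment Mode"].fillna(most_common)

print(most_common)
print(filled.tolist())
```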

### What did you know about your dataset?

This dataset is a historical delivery record from SCMS (Supply Chain Management System), likely associated with a logistics provider like FedEx. It includes detailed information about shipments, such as:
- Geographic data: country, region
- Logistics data: shipment mode, weight, freight cost, delivery dates, vendors
- Performance metrics: delivered on time, days late, PO processing time and delivery delays
- Financial data: freight cost (USD), line item insurance (USD) and line item value

From initial exploration:
- The dataset contains 10,324 rows and 30 columns, representing individual shipment records.
- There is a mix of categorical, numerical and date fields.
- Some fields will require data type conversion.

This dataset is rich in logistics, cost and performance metrics, making it suitable for answering business questions related to delivery efficiency, cost analysis, vendor performance and regional shipment trends.

## ***2. Understanding Your Variables***

In [None]:
# Dataset Columns

In [None]:
df.columns

In [None]:
# Dataset Describe

In [None]:
df.describe()

### Variables Description

Below is a detailed description of each variable present in the SCMS delivery history dataset. Understanding these variables is crucial for performing effective analysis and deriving actionable insights.

| No. | Variable | Description |
|---|---|---|
| 1 | ID | Unique identifier for each record/shipment. |
| 2 | Country | The destination country for the delivery. |
| 3 | Region | Geographical region where the delivery took place. |
| 4 | PO | Main purchase order number related to the shipment. |
| 5 | Prime_line_no | The line number within the order indicating item sequence. |
| 6 | Vendor | Supplier responsible for delivering the goods. |
| 7 | Shipment_mode | Mode of transport used. |
| 8 | PO sent to vendor date | Date when the purchase order was sent to the vendor. |
| 9 | Scheduled delivery date | Expected delivery date of the item. |
| 10 | Delivery to client date | Actual date when the item was delivered to the client. |
| 11 | Delivery recorded date | Date when the delivery was officially recorded in the system. |
| 12 | Item description | Description of the delivered product. |
| 13 | Line item quantity | Total number of units in the shipment line. |
| 14 | Weight(kg) | Weight of the shipment in kilograms. |
| 15 | Freight cost | Shipping cost in USD. |
| 16 | Line item insurance(USD) | Insurance value for the shipment. |
| 17 | Line item value(USD) | Total declared value of the items shipped. |
| 18 | Delivered on time | Indicates whether the shipment was delivered on time. |
| 19 | Managed by | Entity or organization that managed the shipment process. |
| 20 | Sub classification | Subgroup classification of the item. |
| 21 | Late delivery risk | Binary flag indicating whether the shipment is at risk of delay. |
| 22 | Dosage | Amount value for medical items. |
| 23 | Dosage form | Form in which the dosage is administered. |
| 24 | Product group | High-level grouping of products. |
| 25 | Product category | More specific classification of the item. |
| 26 | Item type | Classification of inventory. |
| 27 | Unit of measure | Unit used for packaging. |
| 28 | Manufacture | Name of the company that manufactured the product. |
| 29 | Brand | Brand under which the product is marketed. |
| 30 | Currency | Currency used for the financial transactions in the dataset. |

### Check Unique Values for each variable.

In [None]:
# Check Unique Values for each variable.

In [None]:
unique_counts = df.nunique().sort_values(ascending=False)
print("unique values in each column:\n")
print(unique_counts)

## 3. ***Data Wrangling***

### Data Wrangling Code

In [None]:
# convert date columns

In [None]:
date_cols = ['PO Sent to Vendor Date','Scheduled Delivery Date','Delivered to Client Date']
for col in date_cols:
    df[col] = pd.to_datetime(df[col],errors='coerce')


In [None]:
# convert numeric columns

In [None]:
num_cols = ['Freight Cost (USD)','Line Item Value','Line Item Insurance (USD)','Weight (Kilograms)']
for col in num_cols:
    df[col]= pd.to_numeric(df[col],errors='coerce')
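
With `errors='coerce'`, any cell that cannot be parsed as a number silently becomes `NaN`, so it can be worth counting how many values are affected. The sketch below assumes, for illustration only, that a cost column holds numbers stored as text next to a free-text note; the sample values are made up.

```python
import pandas as pd

# Hypothetical raw values: numbers stored as text, one free-text note, one true gap.
raw = pd.Series(["120.5", "4521", "see other line", None, "0"])

converted = pd.to_numeric(raw, errors="coerce")

# Cells that held some text but could not be parsed as a number.
lost = raw.notna() & converted.isna()

print(converted.tolist())
print(f"values turned into NaN by coercion: {int(lost.sum())}")
```

Comparing `raw.notna()` with `converted.isna()` separates values lost during conversion from values that were already missing.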


In [None]:
# Rename columns

In [None]:
df.rename(columns={'Freight Cost (USD)':'Freightcost','Weight (Kilograms)':'Weightkg','Line Item Insurance (USD)':'InsuranceUSD'}, inplace=True)

In [None]:
#create new useful columns

In [None]:
df['Delivery Delay (Days)'] = (df['Delivered to Client Date'] - df['Scheduled Delivery Date']).dt.days


In [None]:
df['On Time Delivery'] = df['Delivery Delay (Days)'].apply(lambda x: 'Late' if x > 0 else 'On Time')
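
One thing to keep in mind with a plain `x > 0` comparison is that a missing delay (`NaN`) fails the test and therefore ends up in the on-time group. The sketch below shows one possible alternative that keeps such rows apart; the `"Unknown"` label and the sample delays are assumptions made for this example.

```python
import numpy as np
import pandas as pd

# Hypothetical delays in days; NaN stands for a shipment with a missing date.
delay = pd.Series([-3, 0, 5, np.nan])

# Conditions are checked in order, so missing delays are labeled first.
labels = np.select(
    [delay.isna(), delay > 0],
    ["Unknown", "Late"],
    default="On Time",
)
status = pd.Series(labels, index=delay.index)
print(status.tolist())
```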

### What all manipulations have you done and insights you found?

Before proceeding with the analysis, we performed several essential data wrangling steps to clean and prepare the dataset.
- Checked for missing values and handled them appropriately.
- Checked for duplicate rows to maintain data integrity.
- Converted data types for date columns for accurate time-based analysis.
- Created new features such as delivery delay to assess delivery performance.
- Standardized categorical values to ensure consistency.

**Insights**
- Several columns contained null values, especially in date fields and insurance amounts.
- Shipment mode is dominated by Air, suggesting urgency or value in deliveries.
- Some shipments were delivered earlier than the scheduled date, while others had significant delays; this will be explored further in the upcoming visualizations.
- Weight and freight cost show early signs of correlation, indicating potential for cost optimization.

## ***4. Data Visualization, Storytelling & Experimenting with charts : Understand the relationships between variables***

#### Chart - 1 - Distribution of Shipment Modes

**(Bar chart)**

In [None]:
shipment_counts = df['Shipment Mode'].value_counts()
plt.figure(figsize=(8,5))
shipment_counts.plot(kind='bar', color = 'skyblue', edgecolor = 'black')
plt.title('Distribution of Shipment Mode')
plt.xlabel('Shipment Mode')
plt.ylabel('Number of Shipments')
plt.xticks(rotation=45)
plt.tight_layout()
plt.show()

##### 1. Why did you pick the specific chart?

-  We used a **Bar chart** because shipment mode is a categorical variable, and we want to understand the frequency of each shipment method used in the dataset.

##### 2. What is/are the insight(s) found from the chart?

-  **Insights**
      -  Air is the most frequently used shipment mode by a significant margin.
      -  Other modes like Ocean and Truck are less common.
      -  This indicates that the logistics strategy may be focused on fast delivery, possibly for high-priority items.
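
To put a number on "a significant margin", counts can be shown next to percentage shares. This is a minimal sketch on made-up shipment modes; the values and the `summary` name are illustrative only.

```python
import pandas as pd

# Hypothetical shipment modes, for illustration only.
modes = pd.Series(["Air", "Air", "Air", "Truck", "Ocean", "Air", "Truck", "Air Charter"])

summary = pd.DataFrame({
    "count": modes.value_counts(),
    "share_pct": (modes.value_counts(normalize=True) * 100).round(1),
})
print(summary)
```

On the real data, the same two calls would be applied to `df['Shipment Mode']`.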

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

- **YES**, the gained insights help create a positive business impact.
    -  Air shipments are usually more expensive, so the company can evaluate whether switching some orders to slower but cheaper modes is feasible.
    -  If fast delivery is critical, warehouses might need stock replenishment strategies aligned with the trend.
- The same insight can also point to **negative growth**: over-reliance on air shipments might lead to higher logistics costs, which can reduce profit margins. If this trend isn't aligned with the urgency of the orders, it could negatively impact business finances.


#### Chart - 2 - Relationship Between Weight and Freight cost

**(Scatter plot)**

In [None]:
plt.figure(figsize=(8,5))
plt.scatter(df['Weightkg'], df['Freightcost'], alpha=0.5, color='teal', edgecolor='Purple')
plt.title('Weight vs. Freight Cost')
plt.xlabel('weightkg')
plt.ylabel('Freightcost')
plt.grid(True)
plt.tight_layout()
plt.show()

##### 1. Why did you pick the specific chart?

-  We used a scatter plot because we're analysing the relationship between two numerical variables: Weightkg and Freightcost. A scatter plot is the best way to visually explore a possible correlation or pattern.

##### 2. What is/are the insight(s) found from the chart?

-  **Insights**
    -  There appears to be a positive correlation between weight and freight cost: as weight increases, cost generally increases.
    -  However, there are several outliers where freight costs are unusually high for lower weights, which might be due to shipment mode or special handling charges.
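
The visual impression can be backed by a correlation coefficient and a simple outlier rule. The sketch below uses made-up shipments and flags unusually high cost per kilogram with the common 1.5 × IQR rule; that rule is a choice made for this example, not something prescribed by the analysis above.

```python
import pandas as pd

# Hypothetical shipments; the last one is light but expensive.
toy = pd.DataFrame({
    "Weightkg":    [100, 200, 300, 400, 500, 10],
    "Freightcost": [1000, 2100, 2900, 4100, 5000, 3000],
})

# Pearson correlation between weight and freight cost.
r = toy["Weightkg"].corr(toy["Freightcost"])

# Cost per kilogram, flagged when above Q3 + 1.5 * IQR.
cost_per_kg = toy["Freightcost"] / toy["Weightkg"]
q1, q3 = cost_per_kg.quantile([0.25, 0.75])
outliers = toy[cost_per_kg > q3 + 1.5 * (q3 - q1)]

print(f"Pearson r = {r:.2f}")
print(outliers)
```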

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

-  **YES**, the insight helps create a positive business impact.
    -  Understanding this relationship can help in predicting freight costs more accurately.
    -  It also helps logistics teams in pricing decisions and budgeting for shipments.
    -  By identifying outliers, the company can investigate potential inefficiencies or overcharges.
- If many outliers exist where lightweight shipments cost far more than expected, it could indicate inefficient shipping practices, like using costly express options unnecessarily. This could **negatively** affect margins if not managed.

#### Chart - 3 - Freight cost by Shipment mode

**(Box plot)**

In [None]:
plt.figure(figsize=(8,5))
sns.boxplot(data=df, x='Shipment Mode', y='Freightcost',hue='Shipment Mode', palette='pastel', legend=False)
plt.title('Freight Cost Distribution by Shipment Mode')
plt.xlabel('Shipment Mode')
plt.ylabel('Freightcost')
plt.xticks(rotation=45)
plt.tight_layout()
plt.show()

##### 1. Why did you pick the specific chart?

-  A **Box plot** is ideal for comparing how freight costs vary across different shipment modes. It clearly shows the median, quartiles and outliers.

##### 2. What is/are the insight(s) found from the chart?

-  **Insights**
    -  Air shipments have a higher median freight cost, with a wider range and many high-value outliers.
    -  Ocean and Truck shipments are relatively cheaper and more consistent in cost.
    -  This shows that the mode of shipment significantly impacts cost.
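
The quantities a box plot draws can also be tabulated per shipment mode. The following is a small sketch with invented costs; `per_mode` is a hypothetical name, and the median is listed next to the mean because a few very expensive shipments can pull the mean upward.

```python
import pandas as pd

# Hypothetical freight costs per shipment mode.
toy = pd.DataFrame({
    "Shipment Mode": ["Air", "Air", "Air", "Truck", "Truck", "Ocean"],
    "Freightcost":   [5000, 7000, 30000, 1500, 2500, 1200],
})

per_mode = (
    toy.groupby("Shipment Mode")["Freightcost"]
       .agg(shipments="count", median="median", mean="mean")
       .sort_values("median", ascending=False)
)
print(per_mode)
```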

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

-  **YES**, this creates a positive business impact.
    -  By knowing which shipment modes are costlier, decision-makers can optimize for cost efficiency depending on delivery urgency.
    -  Policy suggestions like using air only when necessary can be derived from the chart.
-  If costly shipment modes like Air are used unnecessarily for non-urgent deliveries, it could increase logistics costs and reduce profitability. This could indicate inefficient shipment planning.

#### Chart - 4 - Delivery Time Analysis(PO sent to vendor vs. Scheduled Delivery)

**(Histogram + KDE plot)**

In [None]:
df['Lead Time (Days)'] = (df['Scheduled Delivery Date'] - df['PO Sent to Vendor Date']).dt.days

In [None]:
plt.figure(figsize=(10,6))
sns.histplot(df['Lead Time (Days)'].dropna(), kde=True, bins=30, color='skyblue')
plt.title('Distribution of Delivery Lead Time')
plt.xlabel('Lead Time (Days)')
plt.ylabel('Frequency')
plt.tight_layout()
plt.show()

##### 1. Why did you pick the specific chart?

-  This **Histogram with a KDE** line helps visualize the distribution of lead times across all shipments, showing central tendency, skewness and spread.

##### 2. What is/are the insight(s) found from the chart?

-  **Insight**
      -  Most shipments have a lead time between 20 and 60 days.
      -  A few shipments take over 100 days, indicating possible delays or planning inefficiencies.
      -  The curve may be right-skewed, suggesting occasional long delays.
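
Whether the distribution is right-skewed can be checked numerically rather than by eye. This sketch uses made-up lead times; a positive skewness and a mean above the median both point to a long right tail. The 100-day cut-off simply mirrors the figure quoted above.

```python
import pandas as pd

# Hypothetical lead times in days, with one very long case.
lead_time = pd.Series([20, 25, 30, 35, 40, 45, 60, 150])

skew = lead_time.skew()
print(lead_time.describe())
print(f"skewness: {skew:.2f}")

# Shipments in the long tail.
long_tail = lead_time[lead_time > 100]
print(long_tail.tolist())
```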

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

-  **YES**, a positive impact:
      -  Identifying a typical lead time helps in forecasting delivery schedules and setting realistic customer expectations.
      -  Long-tail delays can be flagged for further process audits or vendor evaluation.
-  Long or inconsistent lead times can hurt customer satisfaction and operational efficiency. If these aren't addressed, they could lead to **supply chain bottlenecks or increased costs**.

#### Chart - 5 - Line Item Insurance vs Freight Cost

**(Scatter plot with Trend line)**

In [None]:
plt.figure(figsize=(10,6))
sns.scatterplot(x= 'InsuranceUSD', y= 'Freightcost', data=df, color= 'purple', alpha=0.6)
sns.regplot(x='InsuranceUSD', y='Freightcost', data=df, scatter=False, color='black')
plt.title('Line Item Insurance vs Freight Cost')
plt.xlabel('InsuranceUSD')
plt.ylabel('Freightcost')
plt.tight_layout()
plt.show()

In [None]:
print(df[['InsuranceUSD', 'Freightcost']].dtypes)
print(df[['InsuranceUSD', 'Freightcost']].isnull().sum())


In [None]:
df.loc[:, 'InsuranceUSD'] = pd.to_numeric(df['InsuranceUSD'], errors='coerce')
df.loc[:, 'Freightcost'] = pd.to_numeric(df['Freightcost'], errors='coerce')


In [None]:
df.dropna(subset=['InsuranceUSD', 'Freightcost'], inplace=True)


##### 1. Why did you pick the specific chart?

- A scatter plot is ideal for showing the relationship between two continuous variables. Adding a regression line helps visualize any linear trend between them.

##### 2. What is/are the insight(s) found from the chart?

-  **Insight**
      -  We can observe that higher insurance values tend to correspond with higher freight costs. This may indicate that more expensive or risky items also cost more to ship.

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

-  **YES**, this insight can help logistics and pricing teams understand how insurance impacts total shipment costs and possibly optimize package grouping or shipping methods for cost savings.
-  If insurance costs rise disproportionately compared to freight costs, it could signal inefficient packaging or shipment risk, which may lead to increased operational costs if not addressed.

#### Chart - 6 - Delivery Status Across Shipment Mode

**(Count plot)**

In [None]:
plt.figure(figsize=(10,6))
sns.countplot(data=df, x='Shipment Mode', hue='On Time Delivery', palette='Set2')
plt.title('Delivery Status by Shipment Mode')
plt.xlabel('Shipment Mode')
plt.ylabel('Count')
plt.xticks(rotation=45)
plt.tight_layout()
plt.show()

##### 1. Why did you pick the specific chart?

-  A **Countplot** is effective for comparing categorical variables. Here, we're interested in how frequently each shipment mode is used and how successful it is in terms of on-time vs late delivery.

##### 2. What is/are the insight(s) found from the chart?

-  **Insight**
    -  Air shipments tend to have higher on-time delivery.
    -  Some shipment modes might have disproportionately more late deliveries.
    -  Certain modes might be more prone to delays.
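
Raw counts favor the most heavily used mode, so a rate per mode is a fairer comparison. The sketch below computes row-wise shares with `pd.crosstab` on invented records; the labels and values are illustrative only.

```python
import pandas as pd

# Hypothetical delivery records.
toy = pd.DataFrame({
    "Shipment Mode":    ["Air", "Air", "Air", "Air", "Truck", "Truck", "Ocean", "Ocean"],
    "On Time Delivery": ["On Time", "On Time", "On Time", "Late",
                         "On Time", "Late", "Late", "Late"],
})

# normalize="index" turns each row into shares that sum to 1.
rates = pd.crosstab(toy["Shipment Mode"], toy["On Time Delivery"], normalize="index")
print((rates * 100).round(1))
```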

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

-  **YES**, understanding which shipment modes are more reliable helps in
    -  Optimizing logistics planning.
    -  Choosing efficient shipment methods.
    -  Improving customer satisfaction by reducing late deliveries.

-  If a frequently used shipment mode consistently results in delays, it could **negatively** affect operations and customer trust. This insight can drive a switch to more reliable alternatives.

#### Chart - 7 - Delivery Delay by Country

**(Bar chart)**

In [None]:
plt.figure(figsize=(12,6))
delay_by_country= df.groupby('Country')['Delivery Delay (Days)'].mean().reset_index()
# legend=False avoids a redundant legend entry for every country
sns.barplot(x='Country', y='Delivery Delay (Days)', data=delay_by_country, hue='Country', palette='coolwarm', legend=False)
plt.title('Average Delivery Delays by Country', fontsize=16)
plt.xlabel('Country', fontsize=12)
plt.ylabel('Average Delivery Delay (Days)', fontsize=12)
plt.xticks(rotation=45, ha='right')
plt.tight_layout()
plt.show()

##### 1. Why did you pick the specific chart?

-  A **Bar chart** is ideal for comparing average delivery delays across countries. It clearly shows performance differences and helps identify regions with potential logistics issues.

##### 2. What is/are the insight(s) found from the chart?

-  **Insights**
      -  Some countries experience significantly higher delivery delays.
      -  This variation may be due to customs delays, inefficient logistics or supplier issues.
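
An average can be dominated by a single extreme shipment, especially for countries with few records. As a rough sketch on made-up data, the mean can be reported together with the median and the shipment count; the minimum of three shipments is an arbitrary threshold chosen for this example.

```python
import pandas as pd

# Hypothetical delays (days) for three placeholder countries.
toy = pd.DataFrame({
    "Country": ["A", "A", "A", "A", "B", "B", "B", "C"],
    "Delivery Delay (Days)": [-2, 0, 1, 1, 0, 0, 90, 45],
})

by_country = (
    toy.groupby("Country")["Delivery Delay (Days)"]
       .agg(shipments="count", mean_delay="mean", median_delay="median")
)

# Ignore countries with too few shipments to say anything reliable.
enough_data = by_country[by_country["shipments"] >= 3].sort_values("mean_delay", ascending=False)
print(enough_data)
```

Here country B has a high mean delay but a median of zero, which suggests one extreme shipment rather than a systematic problem.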

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

-  **YES**, it has a positive business impact. The insights help identify where to improve logistics and optimize delivery processes.
-  High delays in some countries may hurt customer satisfaction and lead to potential revenue loss if not addressed.

#### Chart - 8 - Top 10 Vendors by Total Freight Cost

**(Horizontal bar chart)**

In [None]:
top_vendors = df.groupby('Vendor')['Freightcost'].sum().nlargest(10).reset_index()

plt.figure(figsize=(12,6))
sns.barplot(data=top_vendors, y='Vendor', x='Freightcost', hue= 'Vendor', palette='Blues_r', legend=False)
plt.title('Top 10 Vendors by Total Freight Cost')
plt.xlabel('Total Freight Cost (USD)')
plt.ylabel('Vendor')
plt.tight_layout()
plt.show()

##### 1. Why did you pick the specific chart?

-  This chart highlights which vendors generate the highest freight costs. It helps in tracking major cost contributors and evaluating vendor efficiency.

##### 2. What is/are the insight(s) found from the chart?

-  **Insights**
    -  Some vendors account for consistently higher freight costs.
    -  This may be due to larger shipment volumes, heavier goods or less optimized shipping methods.
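
A high total can come from many ordinary shipments or from a few expensive ones. The sketch below, on invented vendors, puts the total next to the shipment count, the average per shipment and the share of overall freight spend; all names and figures are hypothetical.

```python
import pandas as pd

# Hypothetical freight costs per vendor.
toy = pd.DataFrame({
    "Vendor":      ["V1", "V1", "V1", "V1", "V2", "V3", "V3"],
    "Freightcost": [1000, 1000, 1000, 1000, 3000, 500, 500],
})

per_vendor = (
    toy.groupby("Vendor")["Freightcost"]
       .agg(total="sum", shipments="count", avg_per_shipment="mean")
       .sort_values("total", ascending=False)
)
per_vendor["share_pct"] = (per_vendor["total"] / per_vendor["total"].sum() * 100).round(1)
print(per_vendor)
```

In this toy data V1 has the largest total because of volume, while V2 is the most expensive per shipment.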

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

-  It has a positive impact, as it helps identify vendors for cost optimization and negotiation.
-  As for negative impact, unchecked high freight costs may reduce profitability or signal inefficiency.

#### Chart - 9 - On Time Delivery Rate by Vendor

**(Lollipop chart)**

In [None]:
on_time_rate = df.groupby('Vendor')['On Time Delivery'].value_counts(normalize=True).unstack().fillna(0).reset_index()

on_time_rate = on_time_rate[['Vendor','On Time']].rename(columns={'On Time': 'Ontimerate'})
on_time_rate = on_time_rate.sort_values(by='Ontimerate', ascending=True)

In [None]:
plt.figure(figsize=(12,8))
plt.hlines(y=on_time_rate['Vendor'], xmin=0, xmax=on_time_rate['Ontimerate'], color='skyblue')
plt.plot(on_time_rate['Ontimerate'], on_time_rate['Vendor'], "o", color='blue')
plt.title('On-Time Delivery Rate by Vendor')
plt.xlabel('On-Time Delivery rate')
plt.ylabel('Vendor')
plt.tight_layout()
plt.show()

##### 1. Why did you pick the specific chart?

-  This chart visualises vendor delivery performance in a clean and engaging way, helping identify top-performing vendors.

##### 2. What is/are the insight(s) found from the chart?

-  **Insights** Some vendors consistently deliver on time, while others lag behind, highlighting improvement opportunities.

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

-  Yes, reliable vendors support smooth operations.
-  Poor performers might cause delays, leading to negative customer experiences.

#### Chart - 10 - Lead Time vs Delivery Delay

**(scatterplot)**

In [None]:
plt.figure(figsize=(10,6))
sns.scatterplot(data=df, x='Lead Time (Days)', y='Delivery Delay (Days)', hue= 'Shipment Mode', palette='Set2')
plt.title('Lead Time vs Delivery Delay by Shipment Mode', fontsize=16)
plt.xlabel('Lead Time (Days)')
plt.ylabel('Delivery Delay (Days)')
plt.legend(title='Shipment Mode', bbox_to_anchor=(1.05,1), loc='upper left')
plt.tight_layout()
plt.show()

##### 1. Why did you pick the specific chart?

-  This scatter plot helps identify whether longer lead times help reduce delivery delays, and whether any shipment mode stands out.

##### 2. What is/are the insight(s) found from the chart?

-  **Insight** In some cases, longer lead times don't guarantee fewer delays. Certain shipment modes show more consistent delays regardless of lead time.

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

-  Yes, this insight can lead to optimizing lead time estimates and evaluating shipment modes for better accuracy in delivery planning.
  

#### Chart - 11 - Correlation Heatmap of Numeric Features

**(Heatmap)**

In [None]:
plt.figure(figsize=(10,6))
numeric_cols = ['Weightkg','Freightcost','InsuranceUSD','Lead Time (Days)','Delivery Delay (Days)','Line Item Value','Line Item Quantity']
corr = df[numeric_cols].corr()

sns.heatmap(corr, annot=True, cmap='YlGnBu', linewidths=0.5)
plt.title('Correlation heatmap of key metrics', fontsize=16)
plt.show()


##### 1. Why did you pick the specific chart?

-  This heatmap provides a compact view of how different numeric factors are related, helping spot potential drivers of cost and delays.

##### 2. What is/are the insight(s) found from the chart?

-  **Insight**
      -  Freight cost is positively correlated with weight, as expected.
      -  Delivery delay shows mild correlation with lead time.
      -  Insurance value also slightly correlates with item value and weight.
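
Reading the strongest relationships off a heatmap gets harder as the number of columns grows. One possible way to list them is to keep the upper triangle of the correlation matrix and sort the pairs by absolute value. This is a sketch on made-up numbers; the column names follow the renamed columns used above.

```python
import numpy as np
import pandas as pd

# Hypothetical numeric features.
toy = pd.DataFrame({
    "Weightkg":     [10, 20, 30, 40, 50, 60],
    "Freightcost":  [110, 190, 320, 390, 510, 600],
    "InsuranceUSD": [5, 3, 8, 2, 9, 4],
})

corr = toy.corr()

# Keep only the upper triangle so each pair appears once and the diagonal is dropped.
mask = np.triu(np.ones(corr.shape, dtype=bool), k=1)
pairs = corr.where(mask).stack().dropna().sort_values(key=abs, ascending=False)
print(pairs)
```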

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

-  Yes, by identifying strong correlations, the company can optimize freight planning and set realistic lead times, reducing costs and improving delivery efficiency.
-  If ignored, the correlation between higher weight and freight cost can lead to profit shrinkage. Without optimization, logistics may become unnecessarily expensive.

#### Chart - 12 - Pie Chart of Shipment Mode Distribution

**(Pie chart)**

In [None]:
shipment_mode_counts = df['Shipment Mode'].value_counts()

In [None]:
plt.figure(figsize=(8,8))
plt.pie(shipment_mode_counts, labels=shipment_mode_counts.index, autopct='%1.1f%%', startangle=140, colors=plt.cm.Paired.colors)
plt.title('Shipment mode distribution', fontsize=16)
plt.show()

##### 1. Why did you pick the specific chart?

-  Pie charts are well suited to visualizing categorical distributions with a small number of categories. Here, it clearly shows what percentage of shipments rely on each mode of transport.

##### 2. What is/are the insight(s) found from the chart?

-  **Insight**
      -  The chart shows whether the supply chain is heavily reliant on one mode.
      -  It helps identify whether cost efficiency and speed are being balanced.

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

-  Absolutely. Understanding the shipment mode split allows better logistics planning, cost control and risk management.
-  An over-dependence on high-cost shipment methods like air could hurt margins. A balanced distribution is often more sustainable.

## **5. Solution to Business Objective**

To enhance the overall efficiency and reliability of the supply chain, the following solutions are proposed based on data insights:

 1. **Optimize shipment mode usage:** Analyze cost vs benefit for each shipment mode. Prioritize sea or road transport for non-urgent orders to reduce freight costs without compromising delivery timelines.
 2. **Improve vendor performance monitoring:** Implement a vendor performance dashboard that tracks on-time delivery rates. Reward reliable vendors and renegotiate terms with underperforming ones.
 3. **Reduce lead time variability:** Standardize internal procurement and communication processes to minimize delays between PO issuance and vendor confirmation.
 4. **Address country-specific delays:** Identify countries with high delivery delays. Collaborate with logistics providers or local partners to mitigate region-specific issues.
 5. **Cost efficiency through bundling:** Combine smaller orders into bulk shipments to optimize freight and insurance costs, reducing unnecessary expense.
 6. **Predictive delay management:** Utilize historical data to develop predictive models for anticipating delivery delays and proactively taking corrective actions.
    

# **Conclusion**

The analysis of the FedEx logistics dataset has revealed several critical insights that can significantly enhance supply chain efficiency and overall operational performance.

Firstly, the data indicates that shipment mode plays a pivotal role in determining freight costs and delivery timelines. By evaluating the performance of various shipment methods, it becomes clear that aligning shipment modes with delivery urgency and cost constraints can help reduce unnecessary expenses and delays.

Secondly, the exploration of vendor performance through metrics like on-time delivery rates highlights a need for tighter vendor management. Vendors with consistently high delay rates or low punctuality can be flagged for performance reviews or alternative sourcing.

Additionally, analysis of freight and insurance costs in relation to shipment weight and delivery timeliness helps identify cost outliers. These insights can guide targeted interventions, such as improving customs processes or building local warehousing capacity.

Finally, the lead time and delay relationship chart shows that even planned lead times don't always guarantee on-time delivery. This suggests the need to refine lead time calculations and buffer allocations.

In summary, this data-driven approach provides a solid foundation for strategic decision-making. It equips the organization to:

-  Optimize supply chain planning.
-  Enhance customer satisfaction through timely deliveries.
-  Reduce costs.
-  Build stronger vendor partnerships.

Regularly updating and analyzing this data will help FedEx and its stakeholders remain agile, competitive and responsive to changing logistics demands.