Techwithabhi/data_analysis_project-02


🚀 A comprehensive data analysis project on Nassau shipping data that uncovers trade patterns, cargo trends, port performance, and key logistics insights through interactive visualizations and a live Streamlit dashboard.


🌊 Live Dashboard

🚀 Launch Dashboard

Fully interactive • No installation required • Real-time exploration


🎯 Project Overview

The Nassau Shipping Analysis project is an end-to-end data analysis solution that dives deep into shipping data from the Nassau region. Using Python's powerful data science ecosystem, this project transforms raw shipping records into actionable business intelligence through:

  • 🔍 Exploratory Data Analysis (EDA): Uncovering hidden patterns in shipping data
  • 📊 Statistical Summaries: Descriptive statistics, correlations, and distributions
  • 🗺️ Visual Storytelling: Rich charts and graphs for every key metric
  • 🌐 Interactive Dashboard: A deployed Streamlit app for live data exploration
  • 💡 Business Insights: Actionable recommendations from data findings

📂 Repository Structure

📦 data_analysis_project/
├── 📁 Nassau_Shipping_Analysis/
│   ├── 📓 Nassau_Shipping_Analysis.ipynb   # Main Jupyter Notebook (EDA + Analysis)
│   ├── 🐍 app.py                           # Streamlit Dashboard Application
│   ├── 📊 data/                            # Raw & Processed Datasets
│   └── 📸 assets/                          # Charts & Visualizations
├── 📁 .devcontainer/                       # Dev Container Configuration
└── 📄 README.md                            # Project Documentation

📊 Dataset Overview

The dataset captures detailed shipping activity across Nassau's maritime trade routes.

| Feature | Description | Type |
|---|---|---|
| Ship Name | Name/ID of the vessel | string |
| Route | Origin → Destination | string |
| Cargo Type | Type of goods transported | categorical |
| Cargo Weight (tons) | Weight of the shipment | float |
| Departure Date | Date of departure | datetime |
| Arrival Date | Date of arrival | datetime |
| Transit Time (days) | Duration of voyage | int |
| Port of Loading | Source port | string |
| Port of Discharge | Destination port | string |
| Freight Cost ($) | Cost of shipping | float |
| Ship Type | Vessel classification | categorical |
| Status | Delivered / In Transit / Delayed | categorical |
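
As a rough illustration of how these types might be applied when loading the data with Pandas, here is a minimal sketch. The inline two-row sample, its values, and the `load_shipments` helper are assumptions made for this example; only the column names come from the table above, and the actual file name and layout inside `data/` are not documented here.

```python
import io

import pandas as pd

# Hypothetical two-row sample standing in for the real file in data/.
# Only the column names are taken from the feature table; values are made up.
SAMPLE_CSV = """Ship Name,Route,Cargo Type,Cargo Weight (tons),Departure Date,Arrival Date,Port of Loading,Port of Discharge,Freight Cost ($),Ship Type,Status
Vessel A,Nassau-Miami,Bulk,1200.5,2023-07-01,2023-07-04,Nassau,Miami,15000.0,Bulk Carrier,Delivered
Vessel B,Nassau-Freeport,Perishable,300.0,2023-07-02,2023-07-03,Nassau,Freeport,4200.0,Reefer,Delayed
"""


def load_shipments(source):
    """Read shipping records and apply the types listed in the feature table."""
    df = pd.read_csv(
        source,
        parse_dates=["Departure Date", "Arrival Date"],
        dtype={
            "Cargo Type": "category",
            "Ship Type": "category",
            "Status": "category",
        },
    )
    # Derive transit time in whole days from the two date columns,
    # in case the raw file does not already carry it.
    df["Transit Time (days)"] = (df["Arrival Date"] - df["Departure Date"]).dt.days
    return df


shipments = load_shipments(io.StringIO(SAMPLE_CSV))
print(shipments.dtypes)
```

Passing a file path instead of the `io.StringIO` buffer would read from disk in the same way.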

🔍 Key Metrics & KPIs

| 📦 Metric | 📈 Value | 🔎 Description |
|---|---|---|
| Total Shipments | Analyzed across full dataset | Volume of all recorded voyages |
| Avg. Transit Time | Computed per route | Mean voyage duration in days |
| Top Cargo Category | Dominant freight type | Most shipped cargo class |
| Busiest Port | Highest throughput port | Port with most departures/arrivals |
| Avg. Freight Cost | Calculated across routes | Mean cost per shipment in USD |
| On-Time Delivery Rate | % of non-delayed shipments | Operational efficiency KPI |
| Cargo Volume (tons) | Total aggregated weight | Sum of all cargo transported |
| Peak Shipping Month | Identified via time series | Month with highest activity |
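
The KPIs above could be computed along the following lines. This is a sketch under stated assumptions, not the notebook's actual code: the `compute_kpis` function, the dictionary keys, and the four sample rows are hypothetical, and "busiest port" is interpreted here as the port appearing most often as either loading or discharge port.

```python
import pandas as pd

# Hypothetical sample records; the values are illustrative only.
shipments = pd.DataFrame(
    {
        "Route": ["Nassau-Miami", "Nassau-Miami", "Nassau-Freeport", "Nassau-Miami"],
        "Cargo Type": ["Bulk", "Bulk", "Perishable", "Container"],
        "Cargo Weight (tons)": [1000.0, 1500.0, 300.0, 700.0],
        "Departure Date": pd.to_datetime(
            ["2023-07-01", "2023-07-15", "2023-08-02", "2023-07-20"]
        ),
        "Transit Time (days)": [3, 5, 1, 4],
        "Port of Loading": ["Nassau", "Nassau", "Nassau", "Nassau"],
        "Port of Discharge": ["Miami", "Miami", "Freeport", "Miami"],
        "Freight Cost ($)": [10000.0, 14000.0, 4000.0, 12000.0],
        "Status": ["Delivered", "Delayed", "Delivered", "In Transit"],
    }
)


def compute_kpis(df):
    """Summarise the headline KPIs as a plain dictionary."""
    # Count each port once per appearance as loading or discharge port.
    port_calls = pd.concat([df["Port of Loading"], df["Port of Discharge"]])
    months = df["Departure Date"].dt.to_period("M")
    return {
        "total_shipments": len(df),
        "avg_transit_days_by_route": df.groupby("Route")["Transit Time (days)"]
        .mean()
        .to_dict(),
        "top_cargo_category": df["Cargo Type"].mode().iloc[0],
        "busiest_port": port_calls.value_counts().idxmax(),
        "avg_freight_cost": df["Freight Cost ($)"].mean(),
        # Share of shipments whose status is anything other than "Delayed".
        "on_time_rate_pct": df["Status"].ne("Delayed").mean() * 100,
        "total_cargo_tons": df["Cargo Weight (tons)"].sum(),
        "peak_month": str(months.value_counts().idxmax()),
    }


kpis = compute_kpis(shipments)
for name, value in kpis.items():
    print(f"{name}: {value}")
```

Note that this on-time rate counts "In Transit" shipments as non-delayed, which follows the table's wording; a stricter definition would restrict the denominator to completed voyages.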

📈 Analysis & Charts

The analysis covers the following visual explorations generated in the Jupyter Notebook and rendered live in the dashboard:

📊 Distribution Analysis

✅ Cargo weight distribution (histogram + KDE)
✅ Freight cost distribution by ship type
✅ Transit time spread across routes
✅ Shipment volume by month (time series)

🔗 Correlation & Relationships

✅ Heatmap: Correlation matrix of numeric features
✅ Scatter plot: Freight cost vs. cargo weight
✅ Box plot: Transit time per cargo type
✅ Pair plot: Multi-variable relationship matrix
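
The correlation matrix behind a heatmap like this can be produced with Pandas alone. The sketch below is illustrative: `numeric_correlations` is a hypothetical helper and the sample values are invented, chosen so that freight cost is an exact multiple of cargo weight.

```python
import pandas as pd

# Hypothetical sample; freight cost is deliberately 10x the cargo weight.
shipments = pd.DataFrame(
    {
        "Cargo Type": ["Bulk", "Container", "Bulk", "Perishable"],
        "Cargo Weight (tons)": [100.0, 200.0, 300.0, 400.0],
        "Freight Cost ($)": [1000.0, 2000.0, 3000.0, 4000.0],
        "Transit Time (days)": [2, 3, 5, 4],
    }
)


def numeric_correlations(df):
    """Pearson correlation matrix over the numeric columns only."""
    # Text and categorical columns are dropped before correlating.
    return df.select_dtypes(include="number").corr()


corr = numeric_correlations(shipments)
print(corr.round(2))
```

The resulting square DataFrame is the kind of input a heatmap function such as `seaborn.heatmap` accepts.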

🚢 Port & Route Intelligence

✅ Bar chart: Top 10 busiest ports (loading & discharge)
✅ Sankey diagram: Route flow visualization
✅ Grouped bar: Route-wise average freight cost
✅ Donut chart: Cargo type distribution

📅 Time Series Trends

✅ Monthly shipment volume trend
✅ Seasonal freight cost fluctuation
✅ Year-over-year cargo growth
✅ Rolling average transit time
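
One possible way to derive the monthly volume and rolling transit-time series is sketched below. The `monthly_trends` helper, the 3-month default window, and the sample rows are assumptions for illustration; the notebook may aggregate differently.

```python
import pandas as pd

# Hypothetical sample spanning three months; values are illustrative only.
shipments = pd.DataFrame(
    {
        "Departure Date": pd.to_datetime(
            ["2023-01-05", "2023-01-20", "2023-02-10", "2023-03-03"]
        ),
        "Transit Time (days)": [2, 4, 6, 3],
    }
)


def monthly_trends(df, window=3):
    """Shipment count and mean transit time per month, plus a rolling mean."""
    month = df["Departure Date"].dt.to_period("M")
    monthly = df.groupby(month).agg(
        shipments=("Transit Time (days)", "size"),
        avg_transit_days=("Transit Time (days)", "mean"),
    )
    # The window counts rows, so months with no shipments are skipped
    # rather than treated as gaps in the calendar.
    monthly["rolling_avg_transit"] = (
        monthly["avg_transit_days"].rolling(window, min_periods=1).mean()
    )
    return monthly


monthly = monthly_trends(shipments)
print(monthly)
```

Setting `min_periods=1` keeps the first months in the output instead of leaving them empty until the window fills.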

🚦 Operational Metrics

✅ Delivery status breakdown (pie chart)
✅ Delay rate per route (horizontal bar)
✅ Ship type utilization (stacked bar)
✅ Cost efficiency ratio per vessel class
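
The delay rate and cost efficiency figures can be sketched as two small group-by computations. The README does not define the cost efficiency ratio, so this example assumes it means freight cost per ton of cargo; the function names and sample rows are likewise hypothetical.

```python
import pandas as pd

# Hypothetical sample; values are illustrative only.
shipments = pd.DataFrame(
    {
        "Route": ["R1", "R1", "R2", "R2"],
        "Status": ["Delayed", "Delivered", "Delivered", "Delivered"],
        "Ship Type": ["Bulk Carrier", "Bulk Carrier", "Reefer", "Reefer"],
        "Freight Cost ($)": [1000.0, 3000.0, 2000.0, 2000.0],
        "Cargo Weight (tons)": [100.0, 300.0, 50.0, 150.0],
    }
)


def delay_rate_by_route(df):
    """Percentage of shipments marked "Delayed" on each route, worst first."""
    delayed = df["Status"].eq("Delayed")
    return delayed.groupby(df["Route"]).mean().mul(100).sort_values(ascending=False)


def cost_per_ton_by_ship_type(df):
    """Assumed efficiency ratio: total freight cost divided by total tons."""
    totals = df.groupby("Ship Type")[["Freight Cost ($)", "Cargo Weight (tons)"]].sum()
    return totals["Freight Cost ($)"] / totals["Cargo Weight (tons)"]


delay_rates = delay_rate_by_route(shipments)
cost_per_ton = cost_per_ton_by_ship_type(shipments)
print(delay_rates)
print(cost_per_ton)
```

Dividing the summed totals, rather than averaging per-shipment ratios, weights each vessel class by the tonnage it actually carried.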

🛠️ Tech Stack

| Layer | Technology | Purpose |
|---|---|---|
| Language | Python | Core programming language |
| Notebook | Jupyter | Interactive EDA environment |
| Data Wrangling | Pandas | Data manipulation & analysis |
| Numerical | NumPy | Numerical computations |
| Visualization | Matplotlib | Static charting library |
| Visualization | Seaborn | Statistical visualizations |
| Interactive Charts | Plotly | Interactive chart rendering |
| Dashboard | Streamlit | Web app & dashboard framework |
| Deployment | Streamlit Cloud | Live cloud deployment |

⚡ Quick Start

Follow these steps to run the project locally:

1️⃣ Clone the Repository

git clone https://github.com/Techwithabhi/data_analysis_project.git
cd data_analysis_project

2️⃣ Create a Virtual Environment

python -m venv venv
source venv/bin/activate        # On macOS/Linux
venv\Scripts\activate           # On Windows

3️⃣ Install Dependencies

pip install -r requirements.txt

4️⃣ Run the Jupyter Notebook

cd Nassau_Shipping_Analysis
jupyter notebook Nassau_Shipping_Analysis.ipynb

5️⃣ Launch the Streamlit Dashboard Locally

streamlit run app.py

🌐 Or simply visit the Live Dashboard, no setup required!


🖥️ Dashboard Features

The live Streamlit app offers:

  • 🎛️ Interactive Filters: Filter by date range, cargo type, ship type, and port
  • 📊 Dynamic Charts: All plots update in real time based on filter selections
  • 📋 Raw Data View: Explore the underlying dataset with search & sort
  • 📥 Download Options: Export filtered data as CSV
  • 📱 Responsive Layout: Works seamlessly across desktop and mobile
  • 🌙 Dark/Light Mode: Adapts to system theme preference
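
The filtering and CSV export behind features like these might look roughly as follows. This is a sketch of the data logic only, with Streamlit widgets left out; `filter_shipments`, its parameters, and the sample rows are hypothetical and are not taken from `app.py`.

```python
import pandas as pd

# Hypothetical sample; values are illustrative only.
shipments = pd.DataFrame(
    {
        "Departure Date": pd.to_datetime(["2023-07-01", "2023-08-15", "2023-09-10"]),
        "Cargo Type": ["Bulk", "Perishable", "Bulk"],
        "Ship Type": ["Bulk Carrier", "Reefer", "Bulk Carrier"],
        "Port of Loading": ["Nassau", "Nassau", "Freeport"],
        "Port of Discharge": ["Miami", "Freeport", "Nassau"],
        "Freight Cost ($)": [10000.0, 4000.0, 9000.0],
    }
)


def filter_shipments(df, start=None, end=None, cargo_types=None,
                     ship_types=None, ports=None):
    """Return the rows matching every filter that was supplied."""
    mask = pd.Series(True, index=df.index)
    if start is not None:
        mask &= df["Departure Date"] >= pd.Timestamp(start)
    if end is not None:
        mask &= df["Departure Date"] <= pd.Timestamp(end)
    if cargo_types:
        mask &= df["Cargo Type"].isin(cargo_types)
    if ship_types:
        mask &= df["Ship Type"].isin(ship_types)
    if ports:
        # A port matches if it is either the loading or the discharge port.
        mask &= df["Port of Loading"].isin(ports) | df["Port of Discharge"].isin(ports)
    return df.loc[mask]


filtered = filter_shipments(shipments, start="2023-08-01", cargo_types=["Bulk"])
# With no path argument, to_csv returns the CSV content as a string.
csv_text = filtered.to_csv(index=False)
print(csv_text)
```

In a Streamlit app, the filter arguments would typically come from sidebar widgets, and a string like `csv_text` could be handed to `st.download_button` for the export.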

💡 Key Insights

Derived from the Nassau Shipping Analysis:

📌 INSIGHT 1: Bulk cargo constitutes the largest share of shipments
   → Dominant freight type driving port throughput volumes

📌 INSIGHT 2: Transit time shows a strong positive correlation with freight cost
   → Longer routes yield proportionally higher operational expenses

📌 INSIGHT 3: Q3 (Jul–Sep) records peak shipping activity
   → Seasonal demand surge affects port congestion and pricing

📌 INSIGHT 4: Top 3 routes account for 60%+ of total cargo volume
   → High route concentration presents both efficiency and risk factors

📌 INSIGHT 5: Delay rates vary significantly by cargo type
   → Perishable goods experience 2x higher delay rates vs. dry bulk
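
A concentration figure like the one in Insight 4 can be checked with a short calculation. The sketch below uses invented tonnages purely to demonstrate the arithmetic; it does not reproduce the project's data or its reported share, and `top_route_share` is a hypothetical helper.

```python
import pandas as pd

# Hypothetical tonnages across five routes; not the project's actual data.
shipments = pd.DataFrame(
    {
        "Route": ["R1", "R2", "R3", "R4", "R5"],
        "Cargo Weight (tons)": [400.0, 300.0, 200.0, 60.0, 40.0],
    }
)


def top_route_share(df, n=3):
    """Percentage of total cargo weight carried by the n heaviest routes."""
    by_route = df.groupby("Route")["Cargo Weight (tons)"].sum()
    return by_route.nlargest(n).sum() / by_route.sum() * 100


share = top_route_share(shipments)
print(f"Top 3 routes carry {share:.1f}% of cargo weight")
```

Varying `n` shows how quickly the cumulative share saturates, which is one way to gauge the concentration risk the insight mentions.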

🤝 Contributing

Contributions, issues, and feature requests are welcome!

  1. Fork the project
  2. Create your feature branch: git checkout -b feature/AmazingFeature
  3. Commit your changes: git commit -m 'Add some AmazingFeature'
  4. Push to the branch: git push origin feature/AmazingFeature
  5. Open a Pull Request


👤 Connect With Me


Abhi Sarkar, Data Analyst | Python Developer | Tech Enthusiast



"Turning raw data into meaningful stories, one analysis at a time."
