Larousse2001/SkyInsight

✈️ SkyInsight — Airlines Data Engineering Project

📌 Overview

SkyInsight is an end-to-end Data Engineering platform designed to help airlines improve:

  • Passenger satisfaction
  • Customer loyalty
  • Operational efficiency
  • Environmental sustainability

The project integrates BI, Machine Learning, NLP, and Computer Vision into a single pipeline that runs from raw data ingestion to analytical and predictive insights.


🏗️ Project Architecture

Raw Data → Data Cleaning → Data Warehouse → BI → ML / NLP / CV

  • Data Sources: CSV datasets (loyalty, flights, satisfaction)
  • ETL: Python / SQL transformations
  • Storage: Structured datasets (cleaned & modeled)
  • Analytics: Power BI dashboards
  • AI Layer:
    • Machine Learning
    • Natural Language Processing (NLP)
    • Computer Vision
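
To make the flow above concrete, here is a minimal sketch of the ingest, clean, and load stages using only the Python standard library. The column names, the `fact_satisfaction` table, and the in-memory SQLite database are assumptions for illustration; the project's actual schemas and scripts live in `03_Data_Raw/` and `05_Scripts/` and may differ.

```python
import csv
import io
import sqlite3

# Hypothetical raw extract; real column names are not documented here.
RAW_CSV = """passenger_id,flight_distance_km,satisfaction
1, 1200 ,satisfied
2,,dissatisfied
3,850,Satisfied
"""

def clean_rows(raw_text):
    """Drop rows with a missing distance and normalize types and casing."""
    cleaned = []
    for row in csv.DictReader(io.StringIO(raw_text)):
        distance = row["flight_distance_km"].strip()
        if not distance:
            continue  # skip incomplete records
        cleaned.append((
            int(row["passenger_id"]),
            float(distance),
            row["satisfaction"].strip().lower(),
        ))
    return cleaned

def load_to_warehouse(rows):
    """Load cleaned rows into an in-memory SQLite table (warehouse stand-in)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE fact_satisfaction ("
        "passenger_id INTEGER, flight_distance_km REAL, satisfaction TEXT)"
    )
    conn.executemany("INSERT INTO fact_satisfaction VALUES (?, ?, ?)", rows)
    conn.commit()
    return conn

conn = load_to_warehouse(clean_rows(RAW_CSV))
# A BI-style aggregate that a dashboard could consume.
summary = conn.execute(
    "SELECT satisfaction, COUNT(*), AVG(flight_distance_km) "
    "FROM fact_satisfaction GROUP BY satisfaction"
).fetchall()
print(summary)
```

In a larger setup the same steps would typically be written with Pandas or PySpark and the table would sit in a proper warehouse; the structure of the stages stays the same.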

📂 Repository Structure

Airlines_Project/
│
├── 01_Presentations/      # Project presentations (slides, demos)
├── 02_Documentation/      # Project documentation & architecture
├── 03_Data_Raw/           # Raw datasets (CSV files)
├── 04_Data_Cleaned/       # Processed & cleaned datasets
├── 05_Scripts/            # ETL scripts (Python / SQL)
├── 06_Dashboards/         # Power BI dashboards & exports
├── 07_ML/                 # Machine Learning models & notebooks
├── 08_CNN/                # Computer Vision (CNN / YOLO models)
├── 09_Reports/            # Final reports & analysis
├── 10_NLP/                # (Upcoming) NLP pipelines & text analysis
│
├── .gitignore
├── README.md
└── yolo11n.pt             # Pretrained YOLO model

📊 Business Intelligence Objectives

  • Loyalty Analytics
    • Customer Lifetime Value (CLV)
    • Loyalty segmentation & churn tracking
  • Flight Performance
    • Distance, revenue, and utilization metrics
  • Passenger Satisfaction
    • Analysis across multiple service features
  • Sustainability Reporting
    • CO₂ emissions tracking
    • Fuel efficiency per route
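
As an illustration of the sustainability metrics listed above, the following sketch derives CO₂ and fuel-efficiency figures for a single flight. The function name, its inputs, and the emission factor of roughly 3.16 kg of CO₂ per kg of jet fuel are assumptions chosen for the example; they are not taken from the project's dashboards.

```python
# Assumed emission factor: about 3.16 kg of CO2 per kg of jet fuel burned.
# Replace it with the factor used by your own reporting standard.
CO2_PER_KG_FUEL = 3.16

def route_metrics(fuel_kg, distance_km, passengers):
    """Return CO2 and fuel-efficiency figures for one flight on a route."""
    if distance_km <= 0 or passengers <= 0:
        raise ValueError("distance_km and passengers must be positive")
    co2_kg = fuel_kg * CO2_PER_KG_FUEL
    passenger_km = distance_km * passengers
    return {
        "co2_kg": co2_kg,
        "co2_kg_per_passenger": co2_kg / passengers,
        "fuel_kg_per_100_pkm": fuel_kg / passenger_km * 100,
    }

# Hypothetical flight: 5,000 kg of fuel, 1,000 km, 150 passengers.
metrics = route_metrics(fuel_kg=5000, distance_km=1000, passengers=150)
print(metrics)
```

Normalizing by passenger-kilometres makes routes of different lengths and aircraft sizes comparable, which is why the per-100-pkm figure is usually more useful on a dashboard than total fuel burn.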

🤖 Machine Learning Use Cases

  • Satisfaction Classification
    • Predict satisfied vs dissatisfied passengers
  • Churn Prediction
    • Identify customers likely to leave loyalty programs
  • Customer Segmentation
    • Clustering (K-Means / DBSCAN)
  • Route Optimization
    • Improve load factor and efficiency
  • Carbon Emission Prediction
    • Estimate CO₂ per passenger
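
For the customer segmentation use case, the sketch below shows the two alternating steps of K-Means (assignment and centroid update) in plain Python. It is a teaching version under stated assumptions: the two features per customer and the choice of initial centroids are hypothetical, and the notebooks in `07_ML/` would more likely rely on scikit-learn's `KMeans`.

```python
import math

def kmeans(points, centroids, iterations=10):
    """Plain K-Means on tuples of floats, starting from the given centroids."""
    centroids = [tuple(c) for c in centroids]
    labels = []
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        labels = [
            min(range(len(centroids)), key=lambda k: math.dist(p, centroids[k]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for k, old in enumerate(centroids):
            members = [p for p, label in zip(points, labels) if label == k]
            if not members:
                new_centroids.append(old)  # keep an empty cluster in place
                continue
            new_centroids.append(
                tuple(sum(dim) / len(members) for dim in zip(*members))
            )
        if new_centroids == centroids:
            break  # converged: centroids no longer move
        centroids = new_centroids
    return labels, centroids

# Hypothetical features: (flights per year, loyalty points in thousands).
customers = [(2, 1.0), (3, 1.5), (4, 2.0), (20, 15.0), (22, 18.0), (25, 21.0)]
labels, centroids = kmeans(customers, centroids=[customers[0], customers[-1]])
print(labels, centroids)
```

Passing the initial centroids explicitly keeps the example deterministic. In practice features should be scaled first, since K-Means is distance based and a large-valued column would otherwise dominate the clusters.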

🧠 NLP (Upcoming - 10_NLP/)

  • Sentiment Analysis (reviews & feedback)
  • Topic Modeling (LDA / BERTopic)
  • Aspect-Based Sentiment Analysis
  • Keyword Extraction
  • Automated report generation
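
Since the NLP module is still planned, the following is only a rough sketch of what the keyword extraction step could look like: a frequency count over review tokens after stopword removal. The sample reviews, the tiny stopword set, and the `extract_keywords` helper are all hypothetical.

```python
import re
from collections import Counter

# Tiny illustrative stopword list; a real pipeline would use a fuller one,
# for example from NLTK or spaCy.
STOPWORDS = {"the", "was", "and", "a", "but", "were", "is", "to", "of", "very"}

def extract_keywords(reviews, top_n=3):
    """Return the top_n most frequent non-stopword tokens across reviews."""
    counts = Counter()
    for text in reviews:
        tokens = re.findall(r"[a-z']+", text.lower())
        counts.update(t for t in tokens if t not in STOPWORDS)
    return counts.most_common(top_n)

# Hypothetical passenger feedback.
reviews = [
    "The crew was friendly and the seat was comfortable.",
    "Seat was cramped but the crew were very helpful.",
    "Friendly crew, delayed flight.",
]
keywords = extract_keywords(reviews)
print(keywords)
```

Raw frequency is a baseline only. TF-IDF weighting or topic models such as LDA and BERTopic would separate frequent but uninformative words from terms that actually characterize a group of reviews.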

👁️ Computer Vision (CNN)

Located in 08_CNN/:

  • Cabin cleanliness classification
  • Passenger crowd detection (YOLO)
  • Aircraft anomaly detection
  • Baggage handling quality control
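
For passenger crowd detection, a detector's raw output still has to be turned into a head count. The sketch below shows that post-processing step only. It assumes detections arrive as `(class_id, confidence)` pairs and that class 0 means "person", as in the COCO label set used by pretrained YOLO weights; the input format, the helper names, and the occupancy thresholds are illustrative assumptions rather than the code in `08_CNN/`.

```python
PERSON_CLASS_ID = 0  # "person" in the COCO label set

def count_people(detections, min_confidence=0.5):
    """Count 'person' detections at or above a confidence threshold.

    `detections` is a hypothetical list of (class_id, confidence) pairs
    extracted from a detector's output for one frame.
    """
    return sum(
        1
        for class_id, confidence in detections
        if class_id == PERSON_CLASS_ID and confidence >= min_confidence
    )

def crowd_level(people, capacity):
    """Map a head count to a coarse occupancy label (assumed thresholds)."""
    ratio = people / capacity
    if ratio >= 0.8:
        return "crowded"
    if ratio >= 0.4:
        return "moderate"
    return "light"

# One hypothetical frame: three person boxes (one low confidence)
# and two boxes of other classes.
frame = [(0, 0.91), (0, 0.42), (24, 0.88), (0, 0.77), (28, 0.65)]
people = count_people(frame)
level = crowd_level(people, capacity=4)
print(people, level)
```

Keeping the counting logic separate from the model call makes it easy to unit test and to tune the confidence threshold without rerunning inference.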

🌍 Sustainability Goals Alignment

This project supports:

  • SDG 9 → Industry, Innovation and Infrastructure
  • SDG 12 → Responsible Consumption and Production
  • SDG 13 → Climate Action

⚙️ Tech Stack

  • Languages: Python, SQL
  • Data Processing: Pandas, PySpark
  • Visualization: Power BI
  • ML/DL: scikit-learn, XGBoost, TensorFlow / PyTorch
  • NLP: NLTK, spaCy, BERTopic
  • Computer Vision: OpenCV, YOLO
  • Version Control: Git & GitHub

🚀 Getting Started

Clone the repository

git clone https://github.com/Larousse2001/Airlines_Project.git
cd Airlines_Project

📅 Project Roadmap

  • ✅ Data Engineering Pipeline
  • ✅ Data Warehouse & BI
  • ✅ Machine Learning Models
  • 🔄 NLP Module (10_NLP)
  • 🔄 Advanced Computer Vision
  • 🔄 Deployment (API / Dashboard)

👨‍💻 Contributors

  • Achref AROUS
  • Oumaima ROUIS
  • Iheb KOUKI
  • Mehdi JOUDI
  • Eya CHIIBNI

📜 License

This project is for academic and educational purposes.
