SkyInsight is an end-to-end Data Engineering platform designed to help airlines improve:
- Passenger satisfaction
- Customer loyalty
- Operational efficiency
- Environmental sustainability
The project integrates BI, Machine Learning, NLP, and Computer Vision into a unified pipeline — from raw data ingestion to intelligent insights.
Raw Data → Data Cleaning → Data Warehouse → BI → ML / NLP / CV
- Data Sources: CSV datasets (loyalty, flights, satisfaction)
- ETL: Python / SQL transformations
- Storage: Structured datasets (cleaned & modeled)
- Analytics: Power BI dashboards
- AI Layer:
  - Machine Learning
  - Natural Language Processing (NLP)
  - Computer Vision
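
The cleaning stage of the pipeline above can be sketched as follows. This is a minimal illustration using only the Python standard library, not the project's actual ETL code: the function name `clean_rows`, the column names, and the sample data are all hypothetical, and it assumes every row has the same columns as the header.

```python
import csv
import io


def clean_rows(raw_csv: str) -> list[dict]:
    """Parse CSV text, trim whitespace, drop incomplete rows and exact duplicates."""
    reader = csv.DictReader(io.StringIO(raw_csv))
    seen = set()
    cleaned = []
    for row in reader:
        # Normalise: strip stray whitespace around every header and value.
        row = {key.strip(): (value or "").strip() for key, value in row.items()}
        # Drop rows with any missing value.
        if any(value == "" for value in row.values()):
            continue
        # Drop exact duplicates of a row already kept.
        fingerprint = tuple(row.items())
        if fingerprint in seen:
            continue
        seen.add(fingerprint)
        cleaned.append(row)
    return cleaned


# Hypothetical sample resembling a satisfaction dataset.
RAW = """passenger_id,satisfaction,flight_distance
1, satisfied ,460
2,dissatisfied,235
2,dissatisfied,235
3,,1142
"""

rows = clean_rows(RAW)
for row in rows:
    print(row)
```

In a real run, the cleaned rows would be written to 04_Data_Cleaned/ and loaded into the warehouse. With larger files, Pandas or PySpark (both listed in the tech stack) would be the more practical choice.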
Airlines_Project/
│
├── 01_Presentations/ # Project presentations (slides, demos)
├── 02_Documentation/ # Project documentation & architecture
├── 03_Data_Raw/ # Raw datasets (CSV files)
├── 04_Data_Cleaned/ # Processed & cleaned datasets
├── 05_Scripts/ # ETL scripts (Python / SQL)
├── 06_Dashboards/ # Power BI dashboards & exports
├── 07_ML/ # Machine Learning models & notebooks
├── 08_CNN/ # Computer Vision (CNN / YOLO models)
├── 09_Reports/ # Final reports & analysis
├── 10_NLP/ # (Upcoming) NLP pipelines & text analysis
│
├── .gitignore
├── README.md
└── yolo11n.pt # Pretrained YOLO model

- Loyalty Analytics
  - Customer Lifetime Value (CLV)
  - Loyalty segmentation & churn tracking
- Flight Performance
  - Distance, revenue, and utilization metrics
- Passenger Satisfaction
  - Analysis across multiple service features
- Sustainability Reporting
  - CO₂ emissions tracking
  - Fuel efficiency per route
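
Two of the metrics listed above can be illustrated with a short sketch. The exact definitions used in the dashboards are not stated here, so the formulas below are assumptions: CLV is taken as a common simplified product (average spend × flights per year × expected years as a member), and fuel efficiency as fuel burned per passenger-kilometre. Function names and inputs are hypothetical.

```python
def customer_lifetime_value(avg_spend_per_flight: float,
                            flights_per_year: float,
                            expected_years: float) -> float:
    """Simplified CLV: spend per flight x flights per year x years retained."""
    return avg_spend_per_flight * flights_per_year * expected_years


def fuel_per_passenger_km(fuel_kg: float, passengers: int, distance_km: float) -> float:
    """Fuel efficiency of a route, in kg of fuel per passenger-kilometre."""
    if passengers <= 0 or distance_km <= 0:
        raise ValueError("passengers and distance_km must be positive")
    return fuel_kg / (passengers * distance_km)


# Hypothetical values for illustration only.
clv = customer_lifetime_value(avg_spend_per_flight=250.0, flights_per_year=4, expected_years=5)
efficiency = fuel_per_passenger_km(fuel_kg=6000, passengers=150, distance_km=1000)
print(f"CLV: {clv:.2f}")
print(f"Fuel per passenger-km: {efficiency:.3f} kg")
```

Lower values of the fuel metric indicate a more efficient route. Comparing it across routes only makes sense when the fuel and distance figures come from the same source.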
- Satisfaction Classification
  - Predict satisfied vs dissatisfied passengers
- Churn Prediction
  - Identify customers likely to leave loyalty programs
- Customer Segmentation
  - Clustering (K-Means / DBSCAN)
- Route Optimization
  - Improve load factor and efficiency
- Carbon Emission Prediction
  - Estimate CO₂ per passenger
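
To make the customer segmentation idea concrete, here is a dependency-free sketch of K-Means (Lloyd's algorithm). It is an illustration rather than the project's model: the features (flights per year, average spend), the sample points, and the fixed starting centroids are hypothetical. In practice Scikit-learn's `KMeans`, with scaled features, would be the usual choice.

```python
import math


def kmeans(points, centroids, iterations=10):
    """Lloyd's algorithm: alternate between assigning points and updating centroids."""
    centroids = [tuple(c) for c in centroids]
    labels = []
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid (Euclidean distance).
        labels = [
            min(range(len(centroids)), key=lambda k: math.dist(p, centroids[k]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for k, old in enumerate(centroids):
            members = [p for p, label in zip(points, labels) if label == k]
            if members:
                new_centroids.append(tuple(sum(axis) / len(members) for axis in zip(*members)))
            else:
                new_centroids.append(old)  # keep an empty cluster's centroid in place
        if new_centroids == centroids:
            break  # converged: no centroid moved
        centroids = new_centroids
    return labels, centroids


# Hypothetical loyalty members: (flights per year, average spend per flight).
members = [(2, 200), (3, 250), (2, 220), (20, 900), (22, 950), (18, 880)]
labels, centroids = kmeans(members, centroids=[members[0], members[3]])
print(labels)
print(centroids)
```

The features are left unscaled here to keep the example short, so the spend column dominates the distance. Standardising each feature first avoids that.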
- Sentiment Analysis (reviews & feedback)
- Topic Modeling (LDA / BERTopic)
- Aspect-Based Sentiment Analysis
- Keyword Extraction
- Automated report generation
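
As a taste of the keyword extraction task above, the sketch below ranks words by raw frequency after removing stopwords. It is a simple baseline under stated assumptions: the stopword list and the sample reviews are hypothetical, and the planned NLP module may well use NLTK, SpaCy, or BERTopic instead.

```python
import re
from collections import Counter

# Tiny illustrative stopword list; a real pipeline would use a fuller one.
STOPWORDS = {"the", "a", "an", "and", "was", "were", "is", "but",
             "to", "of", "in", "on", "very", "our", "for", "it"}


def extract_keywords(reviews, top_n=4):
    """Return the top_n most frequent non-stopword tokens across all reviews."""
    counts = Counter()
    for review in reviews:
        tokens = re.findall(r"[a-z']+", review.lower())
        counts.update(t for t in tokens if t not in STOPWORDS and len(t) > 2)
    return [word for word, _ in counts.most_common(top_n)]


# Hypothetical passenger feedback.
REVIEWS = [
    "The crew was friendly but the seat was cramped.",
    "Friendly crew and a smooth flight.",
    "Our flight was delayed and the seat was broken.",
]

keywords = extract_keywords(REVIEWS)
print(keywords)
```

Frequency counts favour words that are common everywhere. Weighting schemes such as TF-IDF are a natural next step once there is a larger review corpus.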
Located in 08_CNN/:
- Cabin cleanliness classification
- Passenger crowd detection (YOLO)
- Aircraft anomaly detection
- Baggage handling quality control
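
Passenger crowd detection can be sketched as counting "person" detections from the pretrained model. The example assumes the `yolo11n.pt` weights are loaded with the Ultralytics package, which is not confirmed here, and that they use the COCO label set, where class 0 is "person". The function names and the confidence threshold are hypothetical.

```python
PERSON_CLASS_ID = 0  # "person" in the COCO label set (assumed for yolo11n.pt)


def count_people(class_ids, confidences, min_confidence=0.5):
    """Count detections labelled as a person with sufficient confidence."""
    return sum(
        1
        for cls, conf in zip(class_ids, confidences)
        if int(cls) == PERSON_CLASS_ID and conf >= min_confidence
    )


def detect_crowd(image_path, weights="yolo11n.pt", min_confidence=0.5):
    """Run YOLO on one image and return the number of people detected."""
    # Imported lazily so count_people stays usable without the package installed.
    from ultralytics import YOLO

    model = YOLO(weights)
    result = model(image_path)[0]  # one Results object per input image
    return count_people(
        result.boxes.cls.tolist(),
        result.boxes.conf.tolist(),
        min_confidence,
    )


# Counting logic on hypothetical detections (no model needed).
print(count_people([0, 0, 2, 0], [0.9, 0.4, 0.8, 0.6]))
```

Calling `detect_crowd("cabin.jpg")` (a hypothetical image path) would return the head count for that frame. The threshold trades missed passengers against false detections and would need tuning on real cabin or gate images.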
This project supports:
- SDG 9 → Industry & Innovation
- SDG 12 → Responsible Consumption
- SDG 13 → Climate Action
- Languages: Python, SQL
- Data Processing: Pandas, PySpark
- Visualization: Power BI
- ML/DL: Scikit-learn, XGBoost, TensorFlow / PyTorch
- NLP: NLTK, SpaCy, BERTopic
- Computer Vision: OpenCV, YOLO
- Version Control: Git & GitHub
git clone https://github.com/Larousse2001/Airlines_Project.git
cd Airlines_Project

- ✅ Data Engineering Pipeline
- ✅ Data Warehouse & BI
- ✅ Machine Learning Models
- 🔄 NLP Module (10_NLP)
- 🔄 Advanced Computer Vision
- 🔄 Deployment (API / Dashboard)
- Achref AROUS
- Oumaima ROUIS
- Iheb KOUKI
- Mehdi JOUDI
- Eya CHIIBNI
This project is for academic and educational purposes.