
This project analyzes airline passenger satisfaction data using Python for EDA, visualizations, and predictive modeling, integrated with SQL for data management and querying. It includes data cleaning, feature engineering, and machine learning to classify passenger satisfaction levels.


nujoomzmn/ML-Model-for-Airline-Data-Analysis-Using-Python-SQL


Airline Data: Machine Learning & Data Analysis (Python + SQL)

This repository contains a comprehensive data analysis and machine learning project built around airline passenger data.
The notebook combines Python and SQL techniques to explore, clean, analyze, and model customer satisfaction with the airline service.


📊 Project Overview

The aim of this project is to:

  • Analyze airline customer satisfaction and behavior.
  • Perform Exploratory Data Analysis (EDA) to identify trends.
  • Preprocess and clean raw data for better model performance.
  • Build and evaluate predictive machine learning models.
  • Integrate SQL queries for data handling and reporting.
  • Visualize insights with clear and interactive charts.

🗂️ Contents of Repository

| File/Folder | Description |
| --- | --- |
| Airline Data ML and Data Analysis using Python,Sql.ipynb | Main Jupyter notebook with analysis and ML models. |
| README.md | This file – full project description. |
| requirements.txt (optional) | Python dependencies for easy setup. |

📚 Dataset

  • Source: Airline passenger satisfaction dataset (Kaggle or company-provided).
  • Key features include:
    Gender, Customer Type, Age, Type of Travel, Class, Flight Distance,
    Inflight wifi service, Seat comfort, Inflight entertainment, On-board service,
    Leg room service, Baggage handling, etc.
  • Target variable: Satisfaction (Satisfied / Neutral or Dissatisfied).
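
As a rough illustration of how a table with these columns might be loaded and cleaned, here is a minimal sketch. It is not taken from the notebook: the sample rows are invented, the exact column spellings and label values are assumptions based on the feature list above, and median imputation is just one common way to handle missing numeric values.

```python
import io

import pandas as pd

# Hypothetical sample rows standing in for the real CSV file.
# Column names and label values are assumptions based on the feature list.
RAW_CSV = """Gender,Customer Type,Age,Class,Flight Distance,Seat comfort,Satisfaction
Female,Loyal Customer,34,Business,2100,4,Satisfied
Male,Disloyal Customer,,Eco,450,2,Neutral or Dissatisfied
Male,Loyal Customer,52,Eco Plus,,3,Neutral or Dissatisfied
Female,Loyal Customer,27,Business,3200,5,Satisfied
"""


def load_and_clean(source):
    """Read the CSV, impute missing numeric values, and add a binary target."""
    df = pd.read_csv(source)

    # Fill missing numeric values with each column's median (one possible choice).
    numeric_cols = df.select_dtypes(include="number").columns
    df[numeric_cols] = df[numeric_cols].fillna(df[numeric_cols].median())

    # Map the text label to 1 (satisfied) / 0 (neutral or dissatisfied).
    labels = df["Satisfaction"].str.strip().str.lower()
    df["satisfied"] = (labels == "satisfied").astype(int)
    return df


df = load_and_clean(io.StringIO(RAW_CSV))
print(df[["Age", "Flight Distance", "satisfied"]])
```

For the real data, a file path can be passed to `load_and_clean` in place of the `io.StringIO` object.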

🛠️ Technologies Used

  • Programming Languages: Python, SQL
  • Python Libraries:
    • Data analysis: pandas, numpy
    • Visualization: matplotlib, seaborn
    • Machine Learning: scikit-learn
    • Web scraping/ETL (if included): BeautifulSoup, scrapy
  • SQL Integration: sqlite3 / MySQL connector (for queries within the notebook)
  • Jupyter Notebook for interactive analysis.
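
To give a feel for the SQL integration, the sketch below runs an aggregation query from Python using the standard-library `sqlite3` module. The table name `passengers`, its two columns, and the sample rows are hypothetical and chosen only for illustration; the notebook's actual schema and queries may differ.

```python
import sqlite3

# Hypothetical sample rows: (travel class, satisfaction label).
ROWS = [
    ("Business", "satisfied"),
    ("Business", "satisfied"),
    ("Business", "neutral or dissatisfied"),
    ("Eco", "neutral or dissatisfied"),
    ("Eco", "satisfied"),
    ("Eco", "neutral or dissatisfied"),
    ("Eco", "neutral or dissatisfied"),
]


def satisfaction_rate_by_class(rows):
    """Load rows into an in-memory SQLite table and aggregate by class."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE passengers (class TEXT, satisfaction TEXT)")
        conn.executemany("INSERT INTO passengers VALUES (?, ?)", rows)
        query = """
            SELECT class,
                   COUNT(*) AS passengers,
                   AVG(CASE WHEN satisfaction = 'satisfied'
                            THEN 1.0 ELSE 0.0 END) AS satisfied_share
            FROM passengers
            GROUP BY class
            ORDER BY class
        """
        return conn.execute(query).fetchall()
    finally:
        conn.close()


result = satisfaction_rate_by_class(ROWS)
for travel_class, count, share in result:
    print(f"{travel_class}: {count} passengers, {share:.2f} satisfied")
```

An in-memory database keeps the example self-contained; pointing `sqlite3.connect` at a file path would persist the table between sessions.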

🔎 Steps Performed in the Notebook

  1. Data Loading & Cleaning

    • Import CSV or database table.
    • Handle missing values, outliers, and data types.
  2. Exploratory Data Analysis (EDA)

    • Univariate & bivariate analysis.
    • Count plots, pie charts, and correlation heatmaps.
    • Grouping & aggregation using SQL and pandas.
  3. Feature Engineering

    • Encoding categorical variables.
    • Scaling numerical features.
    • Creating flight distance and age groups.
  4. Model Building

    • Train-test split.
    • Model selection (e.g., RandomForestClassifier).
    • Evaluation metrics (accuracy, classification report, confusion matrix).
  5. Visualization

    • Professional charts with seaborn/matplotlib.
    • Clear labeling and annotations.
  6. SQL Queries

    • Example queries for insights.
    • Integration of SQL outputs into Python workflows.
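
The feature engineering and model building steps above can be sketched roughly as follows. This is an illustrative outline rather than the notebook's code: the data is synthetic, the target rule is made up so the example runs on its own, and the bin edges, group labels, and hyperparameters are all assumptions.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, classification_report, confusion_matrix
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the airline dataset (values are random, not real).
rng = np.random.default_rng(0)
n = 400
df = pd.DataFrame({
    "Age": rng.integers(18, 80, size=n),
    "Flight Distance": rng.integers(100, 4000, size=n),
    "Class": rng.choice(["Business", "Eco", "Eco Plus"], size=n),
    "Seat comfort": rng.integers(1, 6, size=n),
})
# Made-up target rule so the example has something learnable.
df["satisfied"] = ((df["Seat comfort"] >= 4) | (df["Class"] == "Business")).astype(int)

# Feature engineering: age and flight-distance groups (bin edges are assumptions).
df["Age group"] = pd.cut(df["Age"], bins=[0, 30, 50, 120],
                         labels=["young", "middle", "senior"])
df["Distance group"] = pd.cut(df["Flight Distance"], bins=[0, 1000, 2500, 10000],
                              labels=["short", "medium", "long"])

# Encode categorical variables as one-hot columns.
X = pd.get_dummies(df.drop(columns=["satisfied"]),
                   columns=["Class", "Age group", "Distance group"])
y = df["satisfied"]

# Train-test split, stratified so both sets keep the same class balance.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=42
)

model = RandomForestClassifier(n_estimators=100, random_state=42)
model.fit(X_train, y_train)
y_pred = model.predict(X_test)

# Evaluation metrics named in step 4.
accuracy = accuracy_score(y_test, y_pred)
cm = confusion_matrix(y_test, y_pred)
print(f"Accuracy: {accuracy:.3f}")
print(cm)
print(classification_report(y_test, y_pred))
```

Scaling is left out here because tree-based models such as random forests are insensitive to feature scale; a scale-sensitive model would usually add a step such as `StandardScaler` fitted on the training split only.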

📈 Results

  • Built a high-accuracy predictive model for customer satisfaction.
  • Created dashboards and charts showing key service drivers.
  • Automated repetitive analysis tasks with Python & SQL integration.
