This repository contains a Jupyter notebook that builds a machine learning model for predicting flight prices. The project uses Python and several popular data science libraries: pandas, numpy, scikit-learn, matplotlib, and seaborn.
To run this project, you will need to have Jupyter Notebook installed on your machine. You can download and install it from the official website: https://jupyter.org/install.
You will also need to install the following libraries using pip: pandas, numpy, scikit-learn, matplotlib, and seaborn. For example: pip install pandas numpy scikit-learn matplotlib seaborn
Once you have installed these dependencies, open the Predicting_Flights_Prices.ipynb file in Jupyter Notebook and run the cells to reproduce the analysis.
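Before opening the notebook, it can help to confirm that the dependencies listed above are importable. The snippet below is a minimal sketch, not part of the repository: the helper name missing_packages is hypothetical, and it relies on the fact that scikit-learn is imported under the name sklearn.

```python
import importlib.util

# Import names for the libraries listed above.
# Note: the scikit-learn package is imported as "sklearn".
REQUIRED = ["pandas", "numpy", "sklearn", "matplotlib", "seaborn"]


def missing_packages(names):
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_packages(REQUIRED)
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All required packages are installed.")
```

If anything is reported as missing, install it with pip and run the check again.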
The data for this project was obtained from Kaggle's "Predicting Flight Prices" competition. The dataset contains information on flight routes, airlines, and prices for flights departing from major Indian cities.
The notebook walks through the entire machine learning pipeline: data cleaning, exploratory data analysis, feature engineering, model selection, and model evaluation. The final model is a Random Forest Regressor, which achieved an R² score (coefficient of determination) of 0.8 on the test set.
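The modeling and evaluation steps described above can be sketched roughly as follows. This is an illustrative example under stated assumptions, not the notebook's code: the data is synthetic, the column names (airline, duration_minutes, stops, price) are hypothetical and not the Kaggle schema, and the hyperparameters are arbitrary. The score it prints says nothing about the 0.8 reported for the real dataset.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n_rows = 500

# Synthetic stand-in data; column names are hypothetical, not the Kaggle schema.
flights = pd.DataFrame({
    "airline": rng.choice(["A", "B", "C"], size=n_rows),
    "duration_minutes": rng.integers(60, 600, size=n_rows),
    "stops": rng.integers(0, 3, size=n_rows),
})
airline_offset = flights["airline"].map({"A": 0, "B": 500, "C": 1000})
flights["price"] = (
    3000
    + 20 * flights["duration_minutes"]
    + 1500 * flights["stops"]
    + airline_offset
    + rng.normal(0, 300, size=n_rows)
)

# Feature engineering: one-hot encode the categorical airline column.
X = pd.get_dummies(flights.drop(columns="price"), columns=["airline"])
y = flights["price"]

# Hold out 20% of the rows as a test set.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Fit a random forest and score it on the held-out rows.
model = RandomForestRegressor(n_estimators=100, random_state=0)
model.fit(X_train, y_train)
r2 = r2_score(y_test, model.predict(X_test))
print(f"R2 on held-out synthetic data: {r2:.2f}")
```

Scoring on rows the model never saw during fitting is what makes the R² figure a measure of generalization rather than memorization.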
The notebook also includes visualizations of the data and model performance, as well as explanations of the thought process behind feature selection and engineering.
This project demonstrates how machine learning can be used to predict flight prices, which can be useful for both consumers and airlines. The notebook's step-by-step walkthrough can also serve as a reference for anyone looking to improve their skills in data analysis and modeling.