This project analyzes a flight booking dataset obtained from a platform used to book flight tickets. A thorough study of the data can uncover insights that are valuable to passengers. We apply EDA, statistical methods, and machine learning algorithms to extract meaningful information from it.
The flight booking price prediction dataset contains around 300,000 (3 lakh) records with 11 attributes.
We work through this case study step by step:
- Importing the libraries
- Loading the dataset (Flight_Booking.csv)
- Data Visualization
- Data Preprocessing
- Feature Selection
- Model Building
- Model Evaluation
- Conclusion
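The steps above can be sketched end to end as follows. This is a minimal illustration, not the project's actual notebook: the column names (`airline`, `travel_class`, `duration`, `days_left`, `price`), the synthetic placeholder data standing in for `Flight_Booking.csv`, the linear regression baseline, and the choice of R² and MAE as metrics are all assumptions made for the example. Visualization and feature selection are omitted to keep it short.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error, r2_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder


def make_placeholder_data(n_rows=500, seed=0):
    """Synthetic stand-in for the real dataset; all columns are hypothetical."""
    rng = np.random.default_rng(seed)
    airline = rng.choice(["A", "B", "C"], size=n_rows)
    travel_class = rng.choice(["Economy", "Business"], size=n_rows)
    duration = rng.uniform(1.0, 12.0, size=n_rows)
    days_left = rng.integers(1, 50, size=n_rows)
    # Made-up price rule, used only so the models have a signal to learn.
    price = (
        3000
        + 400 * duration
        - 30 * days_left
        + 20000 * (travel_class == "Business")
        + rng.normal(0, 500, size=n_rows)
    )
    return pd.DataFrame({
        "airline": airline,
        "travel_class": travel_class,
        "duration": duration,
        "days_left": days_left,
        "price": price,
    })


def build_and_evaluate(df, categorical, target="price"):
    """Preprocess, train each candidate model, and score it on held-out data."""
    X = df.drop(columns=[target])
    y = df[target]
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, random_state=42
    )
    models = {
        "linear_regression": LinearRegression(),
        "random_forest": RandomForestRegressor(n_estimators=50, random_state=42),
    }
    scores = {}
    for name, model in models.items():
        # One-hot encode the categorical columns, pass numeric ones through.
        preprocess = ColumnTransformer(
            [("onehot", OneHotEncoder(handle_unknown="ignore"), categorical)],
            remainder="passthrough",
        )
        pipe = Pipeline([("preprocess", preprocess), ("model", model)])
        pipe.fit(X_train, y_train)
        pred = pipe.predict(X_test)
        scores[name] = {
            "r2": r2_score(y_test, pred),
            "mae": mean_absolute_error(y_test, pred),
        }
    return scores


# With the real data, replace the placeholder with
# pd.read_csv("Flight_Booking.csv") and adjust the column names.
df = make_placeholder_data()
scores = build_and_evaluate(df, categorical=["airline", "travel_class"])
for name, metrics in scores.items():
    print(name, metrics)
```

Wrapping the encoder and the model in a single `Pipeline` means the encoder is fitted on the training split only, which avoids leaking information from the test split into preprocessing.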
After performing all these steps, we obtain the following results:
Our best-performing model turned out to be Random Forest. However, because random forests built from fully grown trees can overfit the training data, we can tune the hyperparameters further so that the model generalizes better, and then check whether it also performs well on unseen data.
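One way to approach this tuning is a randomized search with cross-validation, followed by a comparison of training and test scores. The sketch below is only an illustration: it runs on synthetic data from `make_regression` rather than the flight dataset, and the search space, `n_iter`, and `cv` values are arbitrary assumptions, not settings taken from this project.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import RandomizedSearchCV, train_test_split

# Placeholder regression data standing in for the preprocessed flight features.
X, y = make_regression(n_samples=400, n_features=8, noise=10.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Hypothetical search space: shallower trees and larger leaves limit overfitting.
param_distributions = {
    "n_estimators": [50, 100],
    "max_depth": [4, 8, None],
    "min_samples_leaf": [1, 5, 10],
    "max_features": [0.5, 1.0],
}

search = RandomizedSearchCV(
    RandomForestRegressor(random_state=42),
    param_distributions=param_distributions,
    n_iter=6,          # number of sampled parameter combinations
    cv=3,              # 3-fold cross-validation on the training split
    scoring="r2",
    random_state=42,
)
search.fit(X_train, y_train)

# A large gap between these two scores is a sign of overfitting.
best_model = search.best_estimator_
train_r2 = best_model.score(X_train, y_train)
test_r2 = best_model.score(X_test, y_test)
print("best params:", search.best_params_)
print("train R2:", train_r2, "test R2:", test_r2)
```

The test split is held out of the search entirely, so `test_r2` is an estimate of performance on data the tuned model has never seen.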