Repository name: final_project_Data_Science

This repository contains all the files needed to run the project. The project focuses on predicting flight delays from operational and environmental factors, among others.

The main contents of the repository are:

onpremises.ipynb - A Jupyter notebook containing the entire workflow: data uploading and preprocessing, feature engineering, one-hot encoding, training of a logistic regression model, and evaluation of its performance metrics.

requirements.txt - Lists the project's Python dependencies.

README.md - Provides an overview of the project and instructions for running it.

The data is excluded from the repository because of its large file size.
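The one-hot encoding and metric evaluation steps listed for the notebook can be sketched in a few lines of Python. This is only an illustration using the standard library, not the notebook's actual code: the column names (`carrier`, `distance_km`), the toy rows, and the helper names `one_hot_encode` and `classification_metrics` are all assumptions made for the example.

```python
# Illustrative sketch only: column names, toy rows, and helper names are
# assumptions, not taken from the project's notebook or dataset.


def one_hot_encode(rows, column):
    """Replace one categorical column with a 0/1 column per category."""
    categories = sorted({row[column] for row in rows})
    encoded = []
    for row in rows:
        new_row = {key: value for key, value in row.items() if key != column}
        for category in categories:
            new_row[f"{column}_{category}"] = 1 if row[column] == category else 0
        encoded.append(new_row)
    return encoded


def classification_metrics(y_true, y_pred):
    """Compute accuracy, precision, recall, and F1 for binary labels (1 = delayed)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Hypothetical flight records with one categorical feature.
flights = [
    {"carrier": "AA", "distance_km": 500},
    {"carrier": "DL", "distance_km": 300},
]
encoded = one_hot_encode(flights, "carrier")

# Hypothetical true labels and model predictions.
metrics = classification_metrics([1, 0, 1, 1, 0], [1, 0, 0, 1, 1])
print(encoded)
print(metrics)
```

In practice, a notebook like this would more likely rely on library routines for both steps; the sketch only shows what those steps compute.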
How to run:

Clone the repository:

    git clone https://github.com/ARAVIND341/final_project_Data_Science.git

Navigate to the project folder and install the dependencies:

    cd final_project_Data_Science
    pip install -r requirements.txt
Then open Jupyter Notebook and run the cells of onpremises.ipynb one by one, in order.
The main objective of the project is to develop a baseline classifier and an improved logistic regression model that predict flight delays, with weather and holiday features added to the dataset at a later stage.
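To make the baseline-versus-model comparison concrete, here is a minimal sketch that pits a majority-class baseline against a logistic regression trained by plain gradient descent. It is an assumption-laden toy, not the project's implementation: the two binary features (`bad_weather`, `is_holiday`), the tiny hand-made dataset, the learning rate, and every function name are hypothetical, and only the standard library is used.

```python
import math

# Illustrative sketch only: features, data, hyperparameters, and function
# names are assumptions, not taken from the project's notebook.


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def train_logistic_regression(X, y, lr=0.5, epochs=2000):
    """Fit weights and bias by batch gradient descent on the mean log loss."""
    n_samples, n_features = len(X), len(X[0])
    weights, bias = [0.0] * n_features, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n_features, 0.0
        for xi, yi in zip(X, y):
            error = sigmoid(sum(w * x for w, x in zip(weights, xi)) + bias) - yi
            for j, x in enumerate(xi):
                grad_w[j] += error * x
            grad_b += error
        weights = [w - lr * g / n_samples for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n_samples
    return weights, bias


def predict(X, weights, bias, threshold=0.5):
    """Label a flight as delayed (1) when the predicted probability meets the threshold."""
    return [
        1 if sigmoid(sum(w * x for w, x in zip(weights, xi)) + bias) >= threshold else 0
        for xi in X
    ]


def majority_baseline(y_train, n_predictions):
    """Always predict the most frequent training label."""
    majority = 1 if 2 * sum(y_train) > len(y_train) else 0
    return [majority] * n_predictions


def accuracy(y_true, y_pred):
    return sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)


# Hypothetical features per flight: [bad_weather, is_holiday]; label 1 = delayed.
X_train = [[0, 0], [0, 1], [1, 0], [1, 1], [0, 0], [0, 1], [1, 0], [0, 0]]
y_train = [0, 0, 1, 1, 0, 0, 1, 0]
X_test = [[1, 0], [0, 1], [0, 0], [1, 1]]
y_test = [1, 0, 0, 1]

weights, bias = train_logistic_regression(X_train, y_train)
baseline_accuracy = accuracy(y_test, majority_baseline(y_train, len(X_test)))
model_accuracy = accuracy(y_test, predict(X_test, weights, bias))
print("baseline accuracy:", baseline_accuracy)
print("logistic regression accuracy:", model_accuracy)
```

The toy labels depend only on the weather flag, so the trained model separates them easily while the baseline cannot; real flight data would be far noisier, and the gap between the two would need to be measured on a proper held-out split.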