This is a complete end-to-end machine learning project built on the well-known Titanic dataset.
The machine learning model is developed in five stages, and the project is deployed using Streamlit.
- Data Ingestion Stage: Ingests the dataset from the provided source URL.
- Data Validation Stage: Checks the structure of the ingested dataset and validates it. Based on the validation, a validation file is generated that records the result.
- Data Transformation Stage: Transforms the dataset so that it can be fed into the model.
- Model Training Stage: Sets up various machine learning algorithms and trains them on the transformed dataset.
- Model Evaluation Stage: After training, evaluates the model on unseen data and measures its accuracy.
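As an illustration of the validation stage described above, here is a minimal sketch that checks a CSV header against an expected set of columns and writes the result to a status file. It is not taken from this repository: the function name `validate_columns`, the status file layout, and the column list (the columns of the public Kaggle Titanic training file) are all assumptions, and only the standard library is used.

```python
import csv
import io
from pathlib import Path

# Assumed schema: column names of the public Kaggle Titanic training file.
# The project's actual schema may differ.
EXPECTED_COLUMNS = {
    "PassengerId", "Survived", "Pclass", "Name", "Sex", "Age",
    "SibSp", "Parch", "Ticket", "Fare", "Cabin", "Embarked",
}


def validate_columns(csv_text: str, status_file: Path) -> bool:
    """Compare the CSV header with EXPECTED_COLUMNS and record the result.

    Hypothetical helper: returns True only when the header matches exactly.
    """
    reader = csv.reader(io.StringIO(csv_text))
    header = set(next(reader, []))  # an empty input yields an empty header

    missing = sorted(EXPECTED_COLUMNS - header)
    unexpected = sorted(header - EXPECTED_COLUMNS)
    is_valid = not missing and not unexpected

    # Save the outcome so later stages can check it before running.
    status_file.parent.mkdir(parents=True, exist_ok=True)
    status_file.write_text(
        f"Validation status: {is_valid}\n"
        f"Missing columns: {missing}\n"
        f"Unexpected columns: {unexpected}\n"
    )
    return is_valid
```

A later stage, such as data transformation, could read the status file first and refuse to run when validation failed.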
- Update config.yaml
- Update params.yaml
- Update entity
- Update the configuration manager in the src config
- Update the components
- Update the pipeline
- Update the main.py
- Update the app.py
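The workflow above can be sketched for a single stage as follows. This is a hedged illustration of how the pieces typically fit together, not the repository's actual code: the `CONFIG` dictionary stands in for the parsed config.yaml, and every key, class name, method name, and the example URL are hypothetical.

```python
from dataclasses import dataclass
from pathlib import Path

# Stand-in for the parsed contents of config.yaml (keys and values are hypothetical).
CONFIG = {
    "data_ingestion": {
        "root_dir": "artifacts/data_ingestion",
        "source_url": "https://example.com/titanic.csv",
    }
}


@dataclass(frozen=True)
class DataIngestionConfig:
    """Entity: a typed, read-only view of one section of the configuration."""
    root_dir: Path
    source_url: str


class ConfigurationManager:
    """Turns raw configuration values into entity objects."""

    def __init__(self, config: dict = CONFIG) -> None:
        self.config = config

    def get_data_ingestion_config(self) -> DataIngestionConfig:
        section = self.config["data_ingestion"]
        return DataIngestionConfig(
            root_dir=Path(section["root_dir"]),
            source_url=section["source_url"],
        )


class DataIngestion:
    """Component: does the stage's work using only its own entity."""

    def __init__(self, config: DataIngestionConfig) -> None:
        self.config = config

    def describe(self) -> str:
        # A real component would download the file; this sketch only reports the plan.
        return f"download {self.config.source_url} into {self.config.root_dir}"


def run_data_ingestion_pipeline() -> str:
    """Pipeline: wires the configuration manager to the component."""
    config = ConfigurationManager().get_data_ingestion_config()
    return DataIngestion(config).describe()


if __name__ == "__main__":
    # main.py would call one such pipeline function per stage, in order.
    print(run_data_ingestion_pipeline())
```

Keeping configuration, entities, components, and pipelines separate means a new stage is added by repeating the same steps without touching the existing stages.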
Programming Language: Python
Packages used: scikit-learn, DVC (Data Version Control), pandas, NumPy, Matplotlib, seaborn, Streamlit, SciPy, Flask, PyYAML, CatBoost, ensure, python-box
Step 1: Clone this repository
https://github.com/krishanwalia30/Titanic_Pipeline_Project
Step 2: Create a conda environment inside the repository
conda create -n titanic python=3.9 -y
conda activate titanic
Step 3: Install the requirements
pip install -r requirements.txt
# Finally, run the following command
streamlit run streamlit_app.py
Step 4: See the project running
Open the link shown in the terminal (usually on port 8501).
- Learned to create an end-to-end machine learning pipeline.
- Created separate modules and components for the different stages of pipeline development.
- Learned about Data Version Control (DVC)
- Integrated DVC with the project to make it more efficient.
- Learned to create a Flask webapp for the model.
- Discovered the functionalities of Streamlit.
- Deployed the webapp successfully on Streamlit.
- Developed a Continuous Integration and Continuous Deployment (CI/CD) pipeline.
- Machine learning model with an accuracy of 93.475%.
- Live prediction in minimal time.
- Train endpoint integrated into the web app.
The project is deployed as a web app using Streamlit. Link: https://titanicpipelineproject123.streamlit.app/
Run the following commands from the project root directory:
dvc init
dvc repro