- The problem statement was presented at ZS Data Science Challenge - 2019
- Dataset Link
- Problem Statement & Feature Description
- `conda create -p venv python==3.10 -y`
- `conda activate venv/`
- `python -m pip install --upgrade pip`
- `pip install -r requirements.txt`
- `conda install jupyter` (to run the Jupyter notebook)
- Run `python src/engine.py` to train/predict/deploy
- To introduce the deep neural network and its implementation.
- The project aims to predict whether a customer's license should be issued, renewed, or cancelled, based on various parameters in the Business License Dataset.

The dataset is a business license dataset. It contains information about 86K businesses across various features. The target variable is the license status, which has five categories.
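Because the target has five categories, it is usually mapped to integer class ids (and optionally one-hot vectors) before a classifier is trained. The following is a minimal sketch using `numpy`; the status labels `S1` to `S5` are made-up placeholders, not the category names from the dataset.

```python
import numpy as np

# Hypothetical status labels; the real category names are not listed here.
statuses = np.array(["S1", "S3", "S2", "S1", "S5", "S4", "S3"])

# np.unique returns the sorted distinct labels and, with return_inverse=True,
# the index of each original label within that sorted array.
classes, class_ids = np.unique(statuses, return_inverse=True)

# One-hot encode by selecting rows of an identity matrix.
one_hot = np.eye(len(classes), dtype=np.float32)[class_ids]

# Per-class frequencies, useful for spotting imbalance in the target.
counts = np.bincount(class_ids, minlength=len(classes))

for label, count in zip(classes, counts):
    print(label, count)
```

Integer ids pair naturally with a sparse categorical cross-entropy loss, while the one-hot form suits the plain categorical variant.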
- ➔ Language: `Python`
- ➔ Libraries: `pandas`, `seaborn`, `numpy`, `matplotlib`, `scikit-learn`, `h2o`, `tensorflow`, `flask`, `gunicorn`
- Data Description
- Exploratory Data Analysis
- Data Cleaning & Feature Engineering
- Preparing data for analysis
- Base model (Random Forest) building using `h2o`
- Building deep neural network model
- Predictions on test data
- Model deployment using `flask` and `gunicorn`
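The neural network and prediction steps above can be sketched with the Keras API in `tensorflow`. This is only an illustration under stated assumptions: the function name `build_classifier`, the layer sizes, the dropout rate, and the random stand-in data are all hypothetical and do not describe the project's actual architecture or features.

```python
import numpy as np
import tensorflow as tf


def build_classifier(n_features, n_classes=5, dropout_rate=0.3):
    """Small feedforward classifier; all sizes are illustrative assumptions."""
    model = tf.keras.Sequential([
        tf.keras.Input(shape=(n_features,)),
        tf.keras.layers.Dense(64, activation="relu"),
        tf.keras.layers.Dropout(dropout_rate),  # only active during training
        tf.keras.layers.Dense(32, activation="relu"),
        tf.keras.layers.Dense(n_classes, activation="softmax"),
    ])
    model.compile(
        optimizer="adam",
        loss="sparse_categorical_crossentropy",  # expects integer class ids
        metrics=["accuracy"],
    )
    return model


# Random stand-in data: 64 rows, 10 numeric features, 5 classes.
rng = np.random.default_rng(0)
X_train = rng.normal(size=(64, 10)).astype("float32")
y_train = rng.integers(0, 5, size=64)

model = build_classifier(n_features=10)
model.fit(X_train, y_train, epochs=1, batch_size=16, verbose=0)

# Predict class probabilities for new rows, then take the most likely class.
X_test = rng.normal(size=(4, 10)).astype("float32")
probs = model.predict(X_test, verbose=0)
pred_classes = probs.argmax(axis=1)
```

The softmax output layer yields one probability per license status, so each row of `probs` sums to 1 and `argmax` picks the predicted category.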
input
  |__License_Data.csv
  |__preprocessed_License_Data.csv (saved the preprocessed data)
  |__test_data.csv
notebooks
  |__utils
    |__helper_functions.py
  |__Features_Description.ipynb
  |__model_api.ipynb (for testing model deployment)
  |__model_notebook.ipynb (Main Notebook)
output
  |__saved_models
    |__h2o_models
    |__neural_network_models
src
  |__ML_Pipeline
    |__`__init__.py`
    |__constants.py
    |__deploy.py
    |__predict.py
    |__preprocessing.py
    |__save_and_load_model.py
    |__train_model.py
    |__wsgi.py
    |__wsgi.sh
  |__`__init__.py`
  |__engine.py
requirements.txt
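The layout above lists `deploy.py` and `wsgi.py`, but their contents are not shown here. As a rough sketch of how a `flask` app served by `gunicorn` is commonly structured, the example below exposes a prediction endpoint. The `create_app` factory, the `/predict` route, the JSON shape, and the placeholder model are all assumptions made for illustration, not the project's actual code.

```python
from flask import Flask, jsonify, request


def create_app(predict_fn):
    """Build a Flask app that serves predict_fn at POST /predict (assumed route)."""
    app = Flask(__name__)

    @app.route("/predict", methods=["POST"])
    def predict():
        # silent=True returns None instead of raising on a bad or missing body.
        payload = request.get_json(silent=True) or {}
        features = payload.get("features")
        if features is None:
            return jsonify({"error": "missing 'features'"}), 400
        return jsonify({"prediction": predict_fn(features)})

    return app


# Placeholder model that always predicts class 0.
# A real app would load a saved model and call it here instead.
app = create_app(lambda features: 0)

# Gunicorn imports the app object from a module, for example
# (module and variable names are assumed):
#   gunicorn --bind 0.0.0.0:5000 wsgi:app
```

Passing the prediction function into a factory keeps the web layer separate from model loading, which makes the endpoint easy to exercise with Flask's test client before putting it behind Gunicorn.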
- What is a Deep Neural Network?
- Building blocks of Deep Neural Network
- What is an activation function?
- What is Feedforward?
- What is Backpropagation?
- Loss function and its examples
- What is Dropout regularization?
- Deep learning libraries such as TensorFlow, PyTorch, PyTorch Lightning, and Horovod
- Understanding the Business context and objective
- Data Cleaning
- How to prepare data for modeling?
- How to use the h2o framework for baseline modeling?
- How to build a Deep Neural Network Model?
- Hyperparameter tuning
- Model predictions on test data
- Model deployment using Flask and Gunicorn
- Predictions using the deployed model on the server
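Several of the concepts listed above (activation function, feedforward, loss function, backpropagation, dropout) fit into one short `numpy` sketch of a single-hidden-layer classifier. The data is synthetic, and the layer width, learning rate, and dropout rate are arbitrary assumptions chosen only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)


def relu(z):
    return np.maximum(z, 0.0)


def softmax(z):
    z = z - z.max(axis=1, keepdims=True)  # subtract row max for stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)


def cross_entropy(probs, y):
    # Mean negative log-probability assigned to the true class.
    return -np.mean(np.log(probs[np.arange(len(y)), y] + 1e-12))


def forward(X, params, drop_rate=0.0, training=False):
    """Feedforward pass: linear -> ReLU -> (inverted dropout) -> linear -> softmax."""
    W1, b1, W2, b2 = params
    a = relu(X @ W1 + b1)
    mask = np.ones_like(a)
    if training and drop_rate > 0.0:
        # Zero out units at random and rescale the survivors.
        mask = (rng.random(a.shape) >= drop_rate) / (1.0 - drop_rate)
    h = a * mask
    return softmax(h @ W2 + b2), (a, mask, h)


def backward(X, y, probs, cache, params):
    """Backpropagation of the mean cross-entropy loss through both layers."""
    _, _, W2, _ = params
    a, mask, h = cache
    n = len(y)
    dlogits = probs.copy()
    dlogits[np.arange(n), y] -= 1.0  # gradient of softmax + cross-entropy
    dlogits /= n
    dW2, db2 = h.T @ dlogits, dlogits.sum(axis=0)
    dz1 = (dlogits @ W2.T) * mask * (a > 0)  # through dropout, then ReLU
    dW1, db1 = X.T @ dz1, dz1.sum(axis=0)
    return dW1, db1, dW2, db2


# Synthetic data: 100 rows, 4 features, 5 classes from a random linear rule.
X = rng.normal(size=(100, 4))
y = (X @ rng.normal(size=(4, 5))).argmax(axis=1)

params = [rng.normal(scale=0.5, size=(4, 16)), np.zeros(16),
          rng.normal(scale=0.5, size=(16, 5)), np.zeros(5)]

loss_before = cross_entropy(forward(X, params)[0], y)
for _ in range(300):
    probs, cache = forward(X, params, drop_rate=0.2, training=True)
    grads = backward(X, y, probs, cache, params)
    params = [p - 0.2 * g for p, g in zip(params, grads)]  # gradient descent
loss_after = cross_entropy(forward(X, params)[0], y)

print(f"loss before: {loss_before:.3f}, after: {loss_after:.3f}")
```

Dropout is applied only when `training=True`; the losses before and after training are computed with dropout disabled, which mirrors how frameworks such as TensorFlow treat dropout at inference time.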