Build-a-Simple-Machine-Learning-Pipeline

This project aims to build a simple, modular, and extensible machine learning pipeline in Python.

I recommend viewing my notebook first to get a sense of the pipeline as a whole. You can view it directly on GitHub; if it fails to open (it contains many graphs), reload the page. Alternatively, download it and open it in a local Jupyter Notebook. After that, you can jump to my code to see how I designed the functions used in the notebook.

0. Introduction

The project builds a complete machine learning pipeline. Specifically, it uses a decision tree classifier, with financial distress prediction as the example. The goal of the example is to predict whether an individual will experience financial distress in the next two years.

The project is organized as follows:

code: The functions used to build the pipeline. It consists of four Python files, which are described in the sections below.
notebook: An implementation of the pipeline.
data: The data from the financial distress example.
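
To give a feel for what "modular" can mean here, the sketch below chains the first three stages as plain functions. It is only an illustration and not the code in this repository: the function names (`prepare`, `explore`, `engineer`, `run_pipeline`), the column names, and the sample records are all hypothetical, and the actual files may be organized differently.

```python
from statistics import median

# Hypothetical records; None marks a missing value. Not the project's data.
RAW = [
    {"age": 45, "monthly_income": 5200.0, "distress": 0},
    {"age": 31, "monthly_income": None, "distress": 1},
    {"age": 58, "monthly_income": 8100.0, "distress": 0},
    {"age": 27, "monthly_income": 1900.0, "distress": 1},
]


def prepare(rows, column):
    """Data preparation: fill missing values in `column` with the median."""
    fill = median(r[column] for r in rows if r[column] is not None)
    return [{**r, column: fill if r[column] is None else r[column]} for r in rows]


def explore(rows, column):
    """Data exploration: return simple summary statistics for `column`."""
    values = [r[column] for r in rows]
    return {"min": min(values), "max": max(values), "median": median(values)}


def engineer(rows, column, threshold):
    """Feature engineering: add a binary flag for values below `threshold`."""
    return [{**r, f"low_{column}": int(r[column] < threshold)} for r in rows]


def run_pipeline(rows):
    """Chain the stages; each stage returns new rows or a summary."""
    prepared = prepare(rows, "monthly_income")
    summary = explore(prepared, "monthly_income")
    features = engineer(prepared, "monthly_income", summary["median"])
    return summary, features


summary, features = run_pipeline(RAW)
print(summary)
print([row["low_monthly_income"] for row in features])
```

Because each stage takes rows in and hands new rows (or a summary) back without mutating its input, a stage can be swapped or extended without touching the others.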

1. Data Preparation

Please refer to prep.py and the corresponding part in the notebook.

2. Data Exploration

Please refer to explore.py and the corresponding part in the notebook.

3. Feature Engineering

Please refer to feature.py and the corresponding part in the notebook.

4. Model Training, Testing and Evaluation

Please refer to model.py and the corresponding part in the notebook.
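
As a rough picture of the train, test, and evaluate steps, here is a minimal sketch using only the standard library. It is an assumption-laden illustration, not the contents of model.py: a one-split "decision stump" stands in for the decision tree classifier, the data is made up, and the choice of metrics (accuracy, precision, recall) is mine rather than something the project specifies.

```python
def fit_stump(xs, ys):
    """Pick the threshold on one feature that minimises training errors.

    The stump predicts 1 when x < threshold, else 0. This is a one-split
    stand-in for a decision tree, used here for illustration only.
    """
    best_threshold, best_errors = None, len(ys) + 1
    for t in sorted(set(xs)):
        errors = sum(int(x < t) != y for x, y in zip(xs, ys))
        if errors < best_errors:
            best_threshold, best_errors = t, errors
    return best_threshold


def predict(threshold, xs):
    """Apply the learned split to new feature values."""
    return [int(x < threshold) for x in xs]


def evaluate(y_true, y_pred):
    """Compute accuracy, precision, and recall for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == 1 and p == 1 for t, p in pairs)
    fp = sum(t == 0 and p == 1 for t, p in pairs)
    fn = sum(t == 1 and p == 0 for t, p in pairs)
    return {
        "accuracy": sum(t == p for t, p in pairs) / len(pairs),
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
    }


# Hypothetical monthly incomes and distress labels, already split by hand.
train_x = [1500.0, 2200.0, 2500.0, 4800.0, 6000.0, 7500.0]
train_y = [1, 1, 1, 0, 0, 0]
test_x = [1800.0, 5000.0, 5200.0, 9000.0]
test_y = [1, 1, 0, 0]

threshold = fit_stump(train_x, train_y)
metrics = evaluate(test_y, predict(threshold, test_x))
print(threshold, metrics)
```

The model is fitted on the training rows only and scored on held-out rows, so the metrics reflect how the split generalizes rather than how well it memorizes the training data.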
