Uses pandas-profiling and PyCaret for a faster model selection process

https://github.com/pycaret/pycaret
PyCaret documentation: https://pycaret.gitbook.io/docs/get-started/modules
Streamlit app deployed on Streamlit Community Cloud: https://automlapp-pycaret.streamlit.app/

This repository showcases the simplest integration of PyCaret into a Streamlit app (using only PyCaret's default settings).

Refer to the last section below for an example of running PyCaret in a notebook with additional features such as fine-tuning, model evaluation and model saving.

What does pandas-profiling do?

Pandas-Profiling is a Python library that provides a simple and efficient way to perform exploratory data analysis (EDA) on a Pandas DataFrame. The library generates a comprehensive HTML report with various statistical and visual insights into the structure and characteristics of the dataset.
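As a rough illustration of the report generation described above, the sketch below wraps the library's `ProfileReport` class. The helper names `build_profile_report` and `save_profile_html` and the default file name are hypothetical and are not taken from this repository's app.py. The package has since been renamed to ydata-profiling, so the sketch tries that import first and falls back to the old name.

```python
def build_profile_report(df, title="Dataset profile", minimal=False):
    """Return a ProfileReport for a pandas DataFrame."""
    # Imported lazily: pandas-profiling was renamed to ydata-profiling,
    # so try the new package name first and fall back to the old one.
    try:
        from ydata_profiling import ProfileReport
    except ImportError:
        from pandas_profiling import ProfileReport
    return ProfileReport(df, title=title, minimal=minimal)


def save_profile_html(df, path="profile_report.html"):
    """Write the HTML report to `path` (hypothetical default) and return the path."""
    report = build_profile_report(df)
    report.to_file(path)
    return path
```

Calling `save_profile_html(df)` on an uploaded DataFrame should produce a standalone HTML file. Passing `minimal=True` to `build_profile_report` skips the more expensive computations, which may help on large datasets.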

What does PyCaret do?

PyCaret is an open-source, low-code machine learning library in Python that automates machine learning workflows. It is an end-to-end machine learning and model management tool that aims to speed up the experiment cycle and make you more productive.

App features

Uses pandas-profiling to generate an HTML report with insights from a dataset.
Uses PyCaret to quickly compare different algorithms and generate a table that ranks each algorithm by its metrics.
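The comparison feature can be sketched with PyCaret's functional API (`setup`, `compare_models`, `pull`). This is a minimal sketch under assumptions, not the code in app.py: the function name `rank_models` and the `PYCARET_MODULES` mapping are hypothetical, and only default settings are used, as in the app.

```python
import importlib

# Hypothetical mapping from the app's two problem types to PyCaret modules.
PYCARET_MODULES = {
    "Classification": "pycaret.classification",
    "Regression": "pycaret.regression",
}


def rank_models(df, target, problem="Classification"):
    """Run PyCaret with default settings and return (best_model, leaderboard)."""
    if problem not in PYCARET_MODULES:
        raise ValueError(f"Unsupported problem type: {problem!r}")
    # Imported lazily so only the selected PyCaret module is loaded.
    module = importlib.import_module(PYCARET_MODULES[problem])
    module.setup(df, target=target, verbose=False)  # default pre-processing only
    best_model = module.compare_models()  # trains and ranks the candidate models
    leaderboard = module.pull()  # the ranking table as a DataFrame
    return best_model, leaderboard
```

The returned leaderboard is the table that a Streamlit app could display, for example with `st.dataframe(leaderboard)`.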

Important Note:

The app was created for classification and regression problems.
The purpose of the app is to quickly gauge the performance of different types of models on your dataset, which allows for a quicker model selection process. (It only uses PyCaret's basic pre-processing steps and does not fine-tune the models.)
To further modify PyCaret's settings, refer to PyCaret's documentation.
The app might take a long time to run on Streamlit Community Cloud due to the limited resources available.
PyCaret is CPU intensive, so make sure your CPU is fast enough. Otherwise, you can run PyCaret on a GPU for faster performance.
Simply add use_gpu=True to the setup call, i.e. classification_setup(df, target=chosen_target, verbose=False, use_gpu=True), to use the GPU instead of the CPU.
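One way to keep a single code path for both the cloud deployment (CPU only) and a local GPU machine is to toggle `use_gpu` from the environment. The sketch below is an assumption-laden illustration: the `AUTOML_USE_GPU` variable and both helper names are hypothetical, and `classification_setup` is assumed to be `pycaret.classification.setup` imported under an alias.

```python
import os


def gpu_requested(env=None):
    """Return True when the hypothetical AUTOML_USE_GPU variable is truthy."""
    env = os.environ if env is None else env
    return str(env.get("AUTOML_USE_GPU", "")).strip().lower() in {"1", "true", "yes"}


def classification_setup_with_gpu(df, chosen_target, env=None):
    """Call PyCaret's classification setup, enabling the GPU only on request."""
    # Assumed alias: the app's `classification_setup` is taken to be
    # `pycaret.classification.setup` imported under another name.
    from pycaret.classification import setup as classification_setup

    return classification_setup(
        df, target=chosen_target, verbose=False, use_gpu=gpu_requested(env)
    )
```

Note that `use_gpu=True` only helps for estimators that have GPU support, and it may require additional GPU-enabled libraries to be installed.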

Other alternatives:

Run the code locally on your computer for faster performance.
Deploy the Streamlit app on a paid cloud service for faster performance.

Further improvement:

Add more of PyCaret's functionality to the app, such as additional options for data pre-processing and model fine-tuning.

Docker Image

Pull command: docker pull ongaunjie1/automl-app:latest
Run command: docker run -d -p 8501:8501 ongaunjie1/automl-app:latest

How to use the AutoML app

Step 1: Upload your dataset

Step 2: Select Profiling

Step 3: Select ML problem (Classification or Regression) and select target variable

Step 4: Run the modelling and review the output

A more in-depth use-case of PyCaret

List of modules for different machine learning problems

Example: Predicting employee churn (the Colab notebook, employee_churn.ipynb, is included in this repository)

Model comparison table

Create the best performing model from the comparison table

Fine-tuning the best model

Show best model's params
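The three steps above (create the model, tune it, inspect its parameters) could look roughly like the sketch below. It assumes `setup()` from `pycaret.classification` has already been run in the session. The helper name `create_and_tune` and the default model ID `"rf"` (random forest) are placeholders; the notebook's actual best model is not stated here.

```python
def create_and_tune(model_id="rf"):
    """Train one model, tune it, and return (tuned_model, hyperparameters)."""
    # Assumes setup() from pycaret.classification has already been called.
    from pycaret.classification import create_model, tune_model

    base_model = create_model(model_id)  # cross-validated fit of one estimator
    tuned_model = tune_model(base_model)  # hyperparameter search on that estimator
    return tuned_model, tuned_model.get_params()
```

In practice, `model_id` would be the ID of the top row in the comparison table, and the returned dictionary is what scikit-learn's `get_params()` reports for the tuned estimator.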

Evaluate model: Click on the buttons to see different evaluation plots

Or plot them individually
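For individual plots, PyCaret provides `plot_model`, while `evaluate_model(model)` shows the interactive button view in a notebook. The sketch below loops over a couple of plot names; the helper `plot_evaluation` and the chosen default plots are illustrative assumptions, and some plots are only available for certain estimators (for example, the AUC plot needs a model that can output probabilities).

```python
def plot_evaluation(model, plots=("auc", "confusion_matrix")):
    """Render each named evaluation plot for a trained classifier."""
    # Assumes setup() from pycaret.classification has already been called.
    from pycaret.classification import plot_model

    for plot_name in plots:
        plot_model(model, plot=plot_name)
    return list(plots)
```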

Perform prediction on test dataset generated by PyCaret

Saving the model
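The last two steps (predicting on the hold-out set and saving the model) might be sketched as follows. The helper name and the file name `employee_churn_model` are hypothetical. Calling `predict_model` without a `data` argument scores the hold-out split that PyCaret created during `setup()`.

```python
def predict_and_save(model, model_name="employee_churn_model"):
    """Score the hold-out set, persist the model, and reload it from disk."""
    # Assumes setup() from pycaret.classification has already been called.
    from pycaret.classification import load_model, predict_model, save_model

    holdout_predictions = predict_model(model)  # no data= means the hold-out set
    save_model(model, model_name)  # writes <model_name>.pkl
    reloaded = load_model(model_name)  # reads the saved pipeline back
    return holdout_predictions, reloaded
```

The reloaded object can later be passed to `predict_model(reloaded, data=new_df)` to score unseen data.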

Note: More data preprocessing and transformation functions are available in PyCaret:

Check them out at https://pycaret.gitbook.io/docs/get-started/preprocessing
