DartML: Machine Learning for Everyone (AutoML app for supervised learning on tabular data)

milosz-l/DartML

Python 3.9

🎯 DartML

Welcome to DartML: Machine Learning for Everyone! This app lets you build Machine Learning models without writing a single line of code.

  • Step 1: Upload your data
    • Specify sample size
    • Choose between train/test split and cross validation
    • Define shuffle and stratify options
  • Step 2: Explore the data
    • Numerical data exploration
    • Categorical data exploration
  • Step 3: Build the model
    • Choose the target column
    • Select the problem type (it can also be detected automatically)
    • Define the metric (e.g. accuracy, F1 score)
    • Choose algorithms
    • Specify total training time
  • Step 4: Evaluate the trained models
    • NOTE: All the tables and plots listed below are updated in real time during training!
    • Leaderboard
    • Performance Boxplot
    • Feature Importance (eXplainable AI)
    • Spearman Correlation of Models
    • Logs
  • Finally, you can download a report with all trained models and more detailed information about them (e.g. SHAP or dtreeviz visualizations).
    • SHAP values for 10 worst predictions
    • dtreeviz visualization
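
For readers unfamiliar with the "Spearman Correlation of Models" view in Step 4, here is a minimal sketch of how a Spearman rank correlation between two models' predictions could be computed. How DartML computes it internally is not described here, so treat this as an illustration only. The function names `average_ranks` and `spearman` and the sample predictions are hypothetical.

```python
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Return 1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the group while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1  # average of positions start+1 .. end+1
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks


def spearman(a: Sequence[float], b: Sequence[float]) -> float:
    """Spearman correlation: the Pearson correlation of the two rank vectors."""
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("inputs must have the same length (at least 2)")
    ranks_a, ranks_b = average_ranks(a), average_ranks(b)
    mean_a = sum(ranks_a) / len(ranks_a)
    mean_b = sum(ranks_b) / len(ranks_b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(ranks_a, ranks_b))
    var_a = sum((x - mean_a) ** 2 for x in ranks_a)
    var_b = sum((y - mean_b) ** 2 for y in ranks_b)
    if var_a == 0 or var_b == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (var_a * var_b) ** 0.5


# Hypothetical predictions of two models on the same four rows.
model_a = [0.10, 0.40, 0.35, 0.80]
model_b = [0.20, 0.50, 0.30, 0.90]
print(spearman(model_a, model_b))
```

A value close to 1 means the two models order the rows almost identically, while a value close to 0 means their rankings are largely unrelated.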

Demo video

[Video thumbnail: DartML Demo] Click the thumbnail to watch the demo!

How to install and run the app

Install requirements

Using pip

pip install -r requirements.txt

Using conda

You can change new_env_name to any name you like.

conda create --name new_env_name python=3.9
conda activate new_env_name
pip install -r requirements.txt

Run the app

streamlit run 0_🏠_Home.py

Project structure

.
├── 0_🏠_Home.py                    # Home page streamlit view.
├── pages                           # Streamlit views for other pages of the app.
│   ├── 1_🧪_Sample.py              # Sample page streamlit view.
│   ├── 2_🔍_Explore.py             # Explore page streamlit view.
│   ├── 4_🛠️_Modify_&_Model.py      # Modify & Model page streamlit view.
│   └── 5_📊_Assess.py              # Assess page streamlit view.
├── src                             # Source code of the app.
│   ├── config.py                   # Configurations for the app.
│   ├── sample                      # Code used specifically in the Sample page.
│   ├── explore                     # Code used specifically in the Explore page.
│   ├── modify_and_model            # Code used specifically in the Modify & Model page.
│   ├── assess                      # Code used specifically in the Assess page.
│   ├── general_views               # Smaller streamlit views used in multiple pages.
│   └── session_state               # Functions related to handling app's session state.
├── tests                           # Tests for the app.
│   ├── functional_tests            # Functional tests.
│   ├── load_tests                  # Load tests.
│   └── unit_tests                  # Unit tests.
├── temp_dirs                       # Temporary directories used to store training results.
│   └── .gitkeep                    # Empty file to make sure the directory is tracked by git.
├── docs                            # Documentation for the app.
├── example_data                    # Example data used in the app.
├── README.md                       # project description you are reading right now
├── .pre-commit-config.yaml         # pre-commit configuration
├── .flake8                         # flake8 configuration (run by pre-commit)
├── .isort.cfg                      # isort configuration (run by pre-commit)
├── requirements.txt                # dependencies for pip
└── .streamlit                      # configurations for streamlit (theme)
    └── config.toml                 # configurations for streamlit (theme)

Testing

Load tests

  • These tests check how the app behaves under heavy load.
  • Used package: locust.

Run simple load tests

Perform a simple load test that only visits pages, without clicking any buttons or uploading any files.

locust -f tests/load_tests/simple_load_tests.py

Remember to enter the Host without a trailing slash, for example:

  • http://localhost:8501 <- this is correct
  • http://localhost:8501/ <- this is incorrect
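
The README does not say why the slash matters. One plausible reason, stated here as an assumption, is that request paths starting with `/` are appended directly to the host, so a trailing slash produces a double slash in the final URL. The helpers `normalize_host` and `build_url` and the `/Explore` path below are hypothetical and only illustrate the idea.

```python
def normalize_host(host: str) -> str:
    """Strip any trailing slashes so paths can be appended safely."""
    return host.rstrip("/")


def build_url(host: str, path: str) -> str:
    """Join a host and a path with exactly one slash between them."""
    return normalize_host(host) + "/" + path.lstrip("/")


# Naive concatenation with a trailing slash yields a double slash.
naive = "http://localhost:8501/" + "/Explore"
print(naive)

# Normalizing the host first avoids the problem for both spellings.
print(build_url("http://localhost:8501/", "/Explore"))
print(build_url("http://localhost:8501", "/Explore"))
```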

You can start locust and use the app yourself at the same time (or run the functional tests) to see how the response time changes and to confirm that there are no failures.

Functional tests

  • These tests check whether the app looks and behaves as expected.
  • Used package: seleniumbase.

Run functional tests

First, set HOST_URL in the tests/functional_tests/config.py file. By default it is set to http://localhost:8501.

Run all tests:

pytest tests/functional_tests/functional_tests.py --chrome --headless

Run a single test (test_explore_page in this example):

pytest tests/functional_tests/functional_tests.py --chrome --headless -k test_explore_page

  • You can specify the number of concurrent users by adding the -n=<number_of_users> flag.
  • You can remove the --headless flag if you want to make the testing browser visible.
  • You can change --chrome to any browser you like, for example --edge or --firefox.
  • You can make it slower by adding the --slow flag.
  • You can highlight assertions by adding the --demo flag.
  • You can add the -k <test_name> flag to run only a specific test.
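
To show how the flags above combine, here is a minimal sketch of a helper that assembles the pytest command as an argument list. The helper `build_pytest_args` and its parameter names are hypothetical and not part of the repository. Only the flags themselves come from this README.

```python
from typing import List, Optional

TEST_FILE = "tests/functional_tests/functional_tests.py"


def build_pytest_args(
    browser: str = "chrome",
    headless: bool = True,
    users: Optional[int] = None,
    slow: bool = False,
    demo: bool = False,
    test_name: Optional[str] = None,
) -> List[str]:
    """Assemble the functional-test command from the options listed above."""
    args = ["pytest", TEST_FILE, f"--{browser}"]
    if headless:
        args.append("--headless")  # hide the testing browser
    if users is not None:
        args.append(f"-n={users}")  # number of concurrent users
    if slow:
        args.append("--slow")  # slow the run down
    if demo:
        args.append("--demo")  # highlight assertions
    if test_name:
        args += ["-k", test_name]  # run only the named test
    return args


# Visible Firefox, slowed down, running a single test.
print(" ".join(build_pytest_args("firefox", headless=False, slow=True,
                                 test_name="test_explore_page")))
```

The returned list can be passed to `subprocess.run(args, check=True)` from the repository root to launch the run.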

Unit tests

  • These tests check whether individual functions work as expected.

Run unit tests

pytest tests/unit_tests

Generate documentation from docstrings

Using doxygen

doxygen

Using pdoc

pdoc src

pre-commit

Install the pre-commit Git hook to run it automatically before each commit

pre-commit install

Manually run autoformat and code quality check

pre-commit run --all-files

The command above runs the following:

  1. black - general code autoformatting
  2. flake8 - code quality check
  3. isort - imports autoformatting (alphabetical order)
  4. interrogate - check code for missing docstrings
