davegrays/ds-ml-template

ds-ml-template

An opinionated template repo for data science / ML pipelines in Python.

Read more about it in this blog post

Philosophy && Practice

  • Modularity
    • Separate directories for data, notebooks, base package, and tests.
    • Different workflow components are separated from each other and from workflow management (i.e. Pipeline), to allow flexible recomposition.
    • Configurations are further separated via config files.
  • Readability
    • Pipeline manager enables low-code notebooks designed for data exploration and workflow management.
    • Code formatting, linting, and testing are enforced.
    • Docstrings and type hints in function definitions.
  • Repeatability
    • Dependencies / environment managed through code (i.e. setup.sh), including installation of the base package.
    • Workflows are instantiated through low-code scripts or notebooks, which could in turn be managed by a dedicated workflow management system (WMS).
  • Robustness
    • Unit tests and code coverage requirements are strictly enforced through:
      • pre-commit hooks
      • a CI pipeline on GitHub Actions
      • branch management.
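
The separation described above (workflow components kept apart from the pipeline manager, with settings held in config) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the template's actual code: the `Pipeline` class shown here, its `add` and `run` methods, the `drop_missing` and `scale` steps, and the layout of `config` are all hypothetical names invented for the example.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List, Tuple

# Hypothetical convention: a step takes the running data plus its own
# config section and returns the transformed data.
Rows = List[Dict[str, Any]]
Step = Callable[[Rows, Dict[str, Any]], Rows]


@dataclass
class Pipeline:
    """Illustrative pipeline manager (not the template's actual class).

    Holds an ordered list of named steps and a config keyed by step name,
    so steps can be recombined without touching their implementations.
    """

    config: Dict[str, Dict[str, Any]] = field(default_factory=dict)
    steps: List[Tuple[str, Step]] = field(default_factory=list)

    def add(self, name: str, step: Step) -> "Pipeline":
        """Register a step under a name and return self for chaining."""
        self.steps.append((name, step))
        return self

    def run(self, data: Rows) -> Rows:
        """Apply each step in order, passing it its section of the config."""
        for name, step in self.steps:
            data = step(data, self.config.get(name, {}))
        return data


def drop_missing(rows: Rows, cfg: Dict[str, Any]) -> Rows:
    """Drop rows in which any configured column is None."""
    cols = cfg.get("columns", [])
    return [r for r in rows if all(r.get(c) is not None for c in cols)]


def scale(rows: Rows, cfg: Dict[str, Any]) -> Rows:
    """Multiply one configured column by a constant factor."""
    col, factor = cfg["column"], cfg["factor"]
    return [{**r, col: r[col] * factor} for r in rows]


# In a real project this mapping would be loaded from a config file.
config = {
    "drop_missing": {"columns": ["x"]},
    "scale": {"column": "x", "factor": 10},
}

rows = [{"x": 1}, {"x": None}, {"x": 3}]
result = (
    Pipeline(config=config)
    .add("drop_missing", drop_missing)
    .add("scale", scale)
    .run(rows)
)
print(result)
```

Because each step in this sketch is a plain typed function with a docstring, it can be unit tested in isolation, and a notebook or script only needs the few lines that build and run the pipeline.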

Summary diagram

[Figure: summary diagram]
