Labs helps define, create, execute and save experiments.

Brillone/labs

Introduction

Labs helps define experiments (via a config file), execute them (scaling with Dask) and save their outputs (artifacts, results, metadata). Its main purpose is running ML experiments, but it can be used for other use cases.

Labs uses Dask's lazy delayed API for distributed computation. Scikit-Learn is also used in the [Searcher] module.
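To illustrate what a lazy delayed API means in practice, here is a minimal sketch using `dask.delayed` directly. It does not show Labs' own API: the `run_experiment` function and the parameter names are hypothetical, and the small fallback exists only so the snippet still runs where Dask is not installed.

```python
try:
    from dask import compute, delayed
except ImportError:
    # Eager stand-ins so the sketch still runs where Dask is not installed.
    def delayed(func):
        return func

    def compute(*values):
        return values


def run_experiment(learning_rate, depth):
    """Hypothetical experiment: return a score for one parameter combination."""
    return -((learning_rate - 0.1) ** 2) - 0.01 * abs(depth - 3)


# Hypothetical hyperparameter combinations to evaluate.
params = [
    {"learning_rate": lr, "depth": d}
    for lr in (0.01, 0.1, 1.0)
    for d in (2, 3)
]

# With Dask, delayed() only records the call; nothing is executed yet.
tasks = [delayed(run_experiment)(**p) for p in params]

# compute() runs all recorded tasks, potentially in parallel.
scores = compute(*tasks)

# Pick the combination with the highest score.
best_score, best_params = max(zip(scores, params), key=lambda pair: pair[0])
print(best_params, best_score)
```

Because the task list is built before anything runs, the same code can be scheduled on threads, processes or a cluster without changing the experiment function.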

Disclaimer: Labs is currently experimental and for my own personal use.

Key concepts:

  • [Experiment Design] - a user-defined experiment. The [Experiment Design] is expressed as a function, which is executed by one or more [Experimenters].

  • [Experiment] - a combination of hyperparameters to be tested while running an [Experiment Design].

  • [Experiment Run] - using the [Experiment Configuration] and the [Experiment Design], numerous [Experiments] are executed. The [Experiment Run] outputs the best [Experiment] (the best hyperparameter combination).

  • [Experiment Configuration] - a set of configurations that defines the [Experiments] to be executed in an [Experiment Run].

  • [LabManager] - runs all the [Experiment Configurations] defined in a config file. A [LabManager] can handle numerous [Experiment Configurations] and [Experiment Designs].

  • [Experimenter] - an entity that performs the tuning/experimenting process.

  • [Searcher] - an entity used by an [Experimenter] to create the [Experiments] in an [Experiment Run]. Example Searchers: Grid Search, Random Sampling, Bayesian Search (with the great skopt package). The [Searcher] uses the space defined in the [Experiment Configuration].
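The concepts above can be sketched in plain Python. This is an illustration under stated assumptions, not Labs' actual API: `grid_searcher`, `random_searcher`, `experiment_run`, `design` and the `space` dictionary are all hypothetical names chosen for this example.

```python
import itertools
import random


def grid_searcher(space):
    """Hypothetical searcher: yield every combination in the space."""
    names = list(space)
    for values in itertools.product(*(space[name] for name in names)):
        yield dict(zip(names, values))


def random_searcher(space, n_experiments, seed=0):
    """Hypothetical searcher: yield randomly sampled combinations."""
    rng = random.Random(seed)
    for _ in range(n_experiments):
        yield {name: rng.choice(values) for name, values in space.items()}


def experiment_run(design, experiments):
    """Run the design on each experiment; return (best_params, best_score)."""
    results = [(params, design(**params)) for params in experiments]
    return max(results, key=lambda result: result[1])


def design(alpha, n_estimators):
    """Hypothetical experiment design: a function returning a score."""
    return -abs(alpha - 0.5) - abs(n_estimators - 100) / 1000


# Plays the role of the search space in an experiment configuration.
space = {"alpha": [0.1, 0.5, 0.9], "n_estimators": [50, 100]}

best_params, best_score = experiment_run(design, grid_searcher(space))
print(best_params, best_score)

# The same run with a different searcher: four random samples.
sampled_params, sampled_score = experiment_run(design, random_searcher(space, 4))
print(sampled_params, sampled_score)
```

Keeping the searcher separate from the run loop means the search strategy can be swapped without touching the experiment design.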

1. Installation process

pip install labs

2. Docs

(The documentation is not complete yet.)

  1. Quick Start
  2. Experimenters
  3. Searchers
  4. LabManager
  5. Live Reporting
  6. Configs
  7. Suggested Steps

3. Future

The project is currently very new and incomplete.

The project needs more development to support distributed computation options. The plan is to build on Dask's rich, mature ecosystem to develop these options simply and quickly.

Future developments:

  • pytest testing.
  • Flow options - checkpoint saving, time caps, delta improvement and more.
  • Docker support.
  • Kubernetes support.
  • Cloud storage options for saving experiment artifacts.
  • MLflow integration.