kuro

Kuro is a Python-based experiment management system developed by Pedro Rodriguez, a PhD candidate in machine learning and natural language processing at the University of Maryland, College Park.

Kuro is focused on:

  • Providing a client library that can be used from experiment code to log metrics such as accuracy and loss.
  • Providing a Django web application to store information sent from the client library.
  • Providing a Dash app that visualizes the results of experiments.

This is all aimed at making it easy to:

  • Collect results from experiments running on heterogeneous resources (local workstations, university computing clusters, AWS, etc.) into one central server
  • Be agnostic to the learning library used (PyTorch, scikit-learn, TensorFlow, etc.)
  • Analyze these results with the goal of:
    • Comparing many hyperparameter settings
    • Collecting results for multiple trials of the same hyperparameter settings
    • Handling either time series data (accuracy by epoch) or summary data (best accuracy)
    • Analyzing results either with a web visualization based on Dash, or as raw JSON for scripts (e.g. scripts converting data into publication-ready plots)

Installation

$ pip install -r requirements.txt

Example Usage

Below is a sample of what using the client library looks like, taken from demo.py:

Run Kuro Server

Start the Kuro server:

$ python manage.py migrate
$ python manage.py runserver

Run a dummy experiment

from kuro import Worker
import random


def run(guesser_name, hyper_parameters):
    worker = Worker('nibel')
    # Experiments are used to group models together and compare them
    experiment = worker.experiment(
        'guesser', guesser_name,
        metrics=[('test_acc', 'max'), 'test_loss'],
        hyper_parameters=hyper_parameters,
        n_trials=3
    )

    # Attempt 5 trials of the same parameters; only n_trials of them will actually run
    for _ in range(5):
        # If the worker doesn't have a trial for this experiment, make one; otherwise fetch it
        trial = experiment.trial()
        # If there is nothing more to run, skip running a trial
        if trial is None:
            continue
        for step in range(10):  # Steps can be epochs, batch steps, etc.
            acc, loss = 9 + step, -1 - step  # Model results on test data would come from here
            # Explicitly pass step; no need for mode since it was passed in metrics
            trial.report_metric('test_acc', acc + random.uniform(0, 5), step=step)
            # For common metrics such as loss/accuracy, the mode is inferred automatically if not given
            trial.report_metric('test_loss', loss + random.uniform(0, 10), step=step)
            # Allow step to be auto-computed; mode is needed since this is a new metric on its first report
            trial.report_metric('optimality', 0, mode='min', step=step)

        # Similarly, new metrics need a mode
        trial.report_metric('final_metric', 98 + random.uniform(-3, 0), mode='max')
        trial.report_metric('final_nums', 93 + random.uniform(-10, 10), mode='min')
        trial.report_metric('final_digits', random.randint(0, 100), mode='min')
        trial.end()  # Mark the trial as complete


if __name__ == '__main__':
    run('qanta.guesser.dan.DanGuesser', {'lr': .001, 'dropout': .5})
    run('qanta.guesser.rnn.RnnGuesser', {'lr': .1, 'dropout': .1})

Viewing Results

The results are displayed at http://localhost:8000/dash-index

(Screenshot: dashboard, top half)

(Screenshot: dashboard, bottom half)

Configuration

The host used by the client can be changed at worker initialization with Worker('nibel', server='http://myserver.com/') or by setting the environment variable KURO_SERVER=http://myserver.com/.

On the server you also need to set KURO_HOST=myserver.com (the hostname), which is passed to Django's ALLOWED_HOSTS. This is needed if you intend to run the server with something like python manage.py runserver 0.0.0.0:8000 to expose it to the open web.
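
Both client-side options look like the following in practice. This is a minimal sketch that assumes the client reads KURO_SERVER when the Worker is constructed; http://myserver.com/ is a placeholder URL:

import os

from kuro import Worker

# Option 1: pass the server URL explicitly at worker initialization
worker = Worker('nibel', server='http://myserver.com/')

# Option 2: rely on the KURO_SERVER environment variable
# (normally exported in the shell before launching the experiment)
os.environ['KURO_SERVER'] = 'http://myserver.com/'
worker = Worker('nibel')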

Developer Notes

Kuro uses the Django REST Framework for most of its heavy lifting. Its endpoints, which scripts can also query directly for raw JSON (see the sketch after the list), are at:

  • Schema: /schema/
  • API Explorer: /api/v1.0/
  • Django Admin: /admin/
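
Because these endpoints serve plain JSON, experiment results can be pulled straight into analysis scripts. A minimal sketch, assuming the server runs locally; the experiments/ path is a hypothetical resource name, so browse the API explorer at /api/v1.0/ for the actual ones:

import requests

# 'experiments/' is a hypothetical resource name; check /api/v1.0/ for the real endpoints
resp = requests.get('http://localhost:8000/api/v1.0/experiments/')
resp.raise_for_status()
print(resp.json())  # Raw JSON, ready for scripting (e.g. publication-ready plots)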

PostgreSQL

At some point the concurrency supported by SQLite may not be enough to handle multiple workers submitting experiment results at once. To fix this you will need to switch to PostgreSQL (a sketch of the settings toggle follows the steps below):

  1. Install docker and docker-compose
  2. In the repository root run docker-compose up
  3. Set the environment variable export KURO_USE_POSTGRES=1 on the command running the web server
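
For reference, the KURO_USE_POSTGRES flag implies a database toggle in the Django settings along these lines. This is a hypothetical sketch of the common pattern, not necessarily the exact contents of Kuro's settings module, and the connection details are placeholders that would need to match the docker-compose Postgres service:

import os

if os.environ.get('KURO_USE_POSTGRES') == '1':
    DATABASES = {
        'default': {
            'ENGINE': 'django.db.backends.postgresql',
            'NAME': 'kuro',        # placeholder database name
            'USER': 'postgres',    # placeholder credentials
            'PASSWORD': 'postgres',
            'HOST': 'localhost',
            'PORT': '5432',
        }
    }
else:
    DATABASES = {
        'default': {
            'ENGINE': 'django.db.backends.sqlite3',
            'NAME': 'db.sqlite3',
        }
    }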

Future Features

  • A central server maintaining a queue of work, with workers able to request work from it
  • Paired with this, automatic hyperparameter tuning
