Repository for benchmarking

kubebench

The goal of Kubebench is to make it easy to run benchmark jobs on Kubeflow with various system and model settings. Kubebench enables benchmarks by leveraging Kubeflow's ability to manage TFJobs, as well as Argo-based workflows.

Quick Start

NOTE: The quick start guide is a demo that walks you through a Kubebench Job. The components it installs may not be suitable for production use. Please refer to the detailed user guide for proper configuration of Kubebench Jobs.

Prerequisites

  • Kubernetes >= 1.8
  • Ksonnet >= 0.11
  • Kubeflow >= 0.3
    • Required modules: argo, tf-operator
  • For the quick-starter installation, Kubernetes nodes need to support NFS mounting
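
Before installing, it can help to confirm that the client tools are present and to compare their versions against the minimums above. The following is a minimal sketch, not part of Kubebench: the helper names version_ge and check_tool are hypothetical, and the comparison assumes a sort that supports -V (as in GNU coreutils).

```shell
# Sketch only: version_ge and check_tool are illustrative helpers,
# not part of Kubebench or Kubeflow.

# Succeeds when version $1 is greater than or equal to version $2.
# Assumes `sort -V` (version sort) is available.
version_ge() {
  [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# Reports whether a required command is on PATH.
check_tool() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "found: $1"
  else
    echo "missing: $1"
    return 1
  fi
}

# kubectl and ks are the client tools used throughout this guide.
check_tool kubectl || true
check_tool ks || true

# Example comparison against the Ksonnet minimum listed above.
# The version string 0.13.1 is a made-up input for illustration.
if version_ge "0.13.1" "0.11"; then
  echo "0.13.1 satisfies >= 0.11"
fi
```

Version sort is used here because a plain string comparison would rank 1.10 below 1.8.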

Installation

  • Install dependencies (Kubebench depends on an existing Kubeflow deployment. For details about using Kubeflow, please refer to the Kubeflow documentation.)

  • Install Kubebench quick-starter package

    In your Ksonnet app root, run the following:

    export KB_VERSION=master
    export KB_ENV=default
    
    curl https://raw.githubusercontent.com/kubeflow/kubebench/master/scripts/install_quickstarter.sh | bash

  • View the Kubebench directory contents

    The installer comes with a simple file server that lets you view the contents of the Kubebench directory in a browser. You can find the details of the file server service with:

    kubectl get svc kubebench-nfs-file-server-svc -o wide

    Alternatively, you can access the deployed NFS service directly. You can find the details of the NFS service with:

    kubectl get svc kubebench-nfs-svc -o wide
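
If you only need the address of a service rather than the full table, kubectl's JSONPath output can extract it. This is a sketch under assumptions: svc_endpoint is a hypothetical helper, it reads the first port listed on the service, and the cluster IP it prints is normally reachable only from inside the cluster.

```shell
# Sketch only: svc_endpoint is an illustrative helper, not part of Kubebench.

# Prints "<cluster IP>:<first service port>" for the named service in the
# current namespace. Assumes the service defines at least one port.
svc_endpoint() {
  kubectl get svc "$1" -o jsonpath='{.spec.clusterIP}:{.spec.ports[0].port}'
}

# Usage, with the service names created by the quick-starter installer:
#   svc_endpoint kubebench-nfs-file-server-svc
#   svc_endpoint kubebench-nfs-svc
```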

Run a Kubebench Job

  • Create a kubebench-job

    JOB_NAME="my-benchmark"
    
    ks pkg install kubebench/kubebench-job@${KB_VERSION}
    ks generate kubebench-job ${JOB_NAME}
    
    ks apply ${KB_ENV} -c ${JOB_NAME}

  • Track the status of your job

    The Kubebench Job is deployed as an Argo Workflow, so you can track its progress in the Argo dashboard.

    Alternatively, you can use the following on the command line:

    kubectl get -o yaml workflows ${JOB_NAME}
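
To wait for the job from a script rather than reading the YAML by hand, you can poll the workflow's status.phase field. The following is a sketch, not something Kubebench provides: wait_for_workflow and the 10-second default interval are illustrative, and it relies on Argo reporting the phases Succeeded, Failed, and Error on finished workflows.

```shell
# Sketch only: wait_for_workflow is an illustrative helper, not part of
# Kubebench or Argo.

# Polls the named workflow until it reaches a terminal phase.
#   $1: workflow name
#   $2: seconds between polls (defaults to 10)
# Returns 0 on Succeeded, 1 on Failed or Error, 2 if kubectl itself fails.
wait_for_workflow() {
  wf_name="$1"
  wf_interval="${2:-10}"
  while true; do
    wf_phase="$(kubectl get workflows "$wf_name" -o jsonpath='{.status.phase}')" || return 2
    echo "workflow ${wf_name}: ${wf_phase:-<no phase yet>}"
    case "$wf_phase" in
      Succeeded) return 0 ;;
      Failed|Error) return 1 ;;
    esac
    sleep "$wf_interval"
  done
}

# Usage, assuming JOB_NAME is set as in the step above:
#   wait_for_workflow "${JOB_NAME}"
```

The loop has no overall timeout, so you may want to add one before using it in automation.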

View results

  • Once the job finishes, you can find the results under the experiment directory on the NFS. The details of a particular experiment are located at /experiments/<EXPERIMENT_UID>. You may also see a CSV file at /experiments/report.csv; if you run multiple experiments, the aggregated results are recorded there.
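
If you have mounted the NFS share on your machine, the layout described above can be inspected with a few lines of shell. This is a sketch under assumptions: the mount point passed as the first argument is yours to choose, list_experiments and count_report_rows are hypothetical helpers, and the column layout of report.csv is not specified here, so the sketch only counts its non-empty lines.

```shell
# Sketch only: these helpers are illustrative and not part of Kubebench.
# $1 is the local directory where the Kubebench NFS share is mounted
# (an assumption; choose whatever mount point you use).

# Prints one experiment directory name per line.
list_experiments() {
  for exp_dir in "$1"/experiments/*/; do
    [ -d "$exp_dir" ] && basename "$exp_dir"
  done
  return 0
}

# Prints the number of non-empty lines in the aggregated report,
# including any header line it may contain.
count_report_rows() {
  awk 'NF { n++ } END { print n + 0 }' "$1/experiments/report.csv"
}

# Usage, with /mnt/kubebench as a hypothetical mount point:
#   list_experiments /mnt/kubebench
#   count_report_rows /mnt/kubebench
```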

Cleanups

  • Delete the kubebench-job

    ks delete ${KB_ENV} -c ${JOB_NAME}

  • Uninstall quickstarter

    curl https://raw.githubusercontent.com/kubeflow/kubebench/master/scripts/uninstall_quickstarter.sh | bash

Design Document

For more on the motivation and design of this project, please refer to kubebench_design.md.

Development

Run make verify before submitting PRs.

// TODO post detailed development guide.