Update test infrastructure to use repo tensorflow/k8s (#87)
* Update test infrastructure to use repo tensorflow/k8s

* Update README to refer to new repo.

* Rename go packages from jlewi/mlkube.io > tensorflow/k8s.

* Update repo location in the helm e2e test.

* Update links.

* Update the readme.md
jlewi committed Oct 25, 2017
1 parent 1332e8a commit bcf2f08
Showing 26 changed files with 71 additions and 69 deletions.
4 changes: 3 additions & 1 deletion .gitignore
Original file line number Diff line number Diff line change
@@ -27,4 +27,6 @@ vendor/
**/bazel-*
# Examples egg
examples/tf_sample/tf_sample.egg-info/
examples/.ipynb_checkpoints/
examples/.ipynb_checkpoints/

**/.ipynb_checkpoints
18 changes: 9 additions & 9 deletions README.md
@@ -1,10 +1,10 @@
# K8s Custom Resource and Operator For TensorFlow jobs

[![Build Status](https://travis-ci.org/jlewi/mlkube.io.svg?branch=master)](https://travis-ci.org/jlewi/mlkube.io)
[![Build Status](https://travis-ci.org/tensorflow/k8s.svg?branch=master)](https://travis-ci.org/tensorflow/k8s)

[Prow Test Dashboard](https://k8s-testgrid.appspot.com/sig-big-data)

[Prow Jobs](https://prow.k8s.io/?repo=jlewi%2Fmlkube.io)
[Prow Jobs](https://prow.k8s.io/?repo=tensorflow%2Fk8s)

## Overview

@@ -85,15 +85,15 @@ Custom Resources require Kubernetes >= 1.7
### Configuring the CRD

The CRD can be configured via a [ConfigMap](https://kubernetes.io/docs/api-reference/v1.8/#configmap-v1-core)
that provides a [ControllerConfig](https://github.com/jlewi/mlkube.io/blob/master/pkg/spec/controller.go) serialized
that provides a [ControllerConfig](https://github.com/tensorflow/k8s/blob/master/pkg/spec/controller.go) serialized
as YAML. The config controls how the CRD manages TensorFlow jobs.

Currently, the most important use for [ControllerConfig](https://github.com/jlewi/mlkube.io/blob/master/pkg/spec/controller.go)
Currently, the most important use for [ControllerConfig](https://github.com/tensorflow/k8s/blob/master/pkg/spec/controller.go)
is specifying environment variables and volumes that must be mounted from the
host into containers to configure GPUS.

The TfJob controller can be configured with a list of volumes that should be mounted from the host into the container
to make GPUs work. Here's an example [ControllerConfig](https://github.com/jlewi/mlkube.io/blob/master/pkg/spec/controller.go):
to make GPUs work. Here's an example [ControllerConfig](https://github.com/tensorflow/k8s/blob/master/pkg/spec/controller.go):

```
accelerators:
@@ -119,7 +119,7 @@ The helm package for the controller includes a config map suitable for GKE.
This ConfigMap may need to be modified for your cluster if you aren't using
GKE.

There's an open [issue](https://github.com/jlewi/mlkube.io/issues/71) to
There's an open [issue](https://github.com/tensorflow/k8s/issues/71) to
better support non GKE clusters


@@ -128,7 +128,7 @@ better support non GKE clusters
You create a job by defining a TfJob and then creating it with.

```
kubectl create -f https://raw.githubusercontent.com/jlewi/mlkube.io/master/examples/tf_job.yaml
kubectl create -f https://raw.githubusercontent.com/tensorflow/k8s/master/examples/tf_job.yaml
```

In this case the job spec looks like the following
@@ -234,9 +234,9 @@ for using GPUs.
### Requesting a TensorBoard instance

You can also ask the `TfJob` operator to create a TensorBoard instance
by including a [TensorBoardSpec](https://github.com/jlewi/mlkube.io/blob/master/pkg/spec/tf_job.go#L103)
by including a [TensorBoardSpec](https://github.com/tensorflow/k8s/blob/master/pkg/spec/tf_job.go#L103)
in your job. The table below describes the important fields in
[TensorBoardSpec](https://github.com/jlewi/mlkube.io/blob/master/pkg/spec/tf_job.go#L103).
[TensorBoardSpec](https://github.com/tensorflow/k8s/blob/master/pkg/spec/tf_job.go#L103).

| Name | Description | Required | Default |
|---|---|---|---|
16 changes: 8 additions & 8 deletions cmd/tf_operator/main.go
@@ -10,19 +10,19 @@ import (

"github.com/ghodss/yaml"

"github.com/jlewi/mlkube.io/pkg/controller"
"github.com/jlewi/mlkube.io/pkg/garbagecollection"
"github.com/jlewi/mlkube.io/pkg/util"
"github.com/jlewi/mlkube.io/pkg/util/k8sutil"
"github.com/jlewi/mlkube.io/pkg/util/k8sutil/election"
"github.com/jlewi/mlkube.io/pkg/util/k8sutil/election/resourcelock"
"github.com/jlewi/mlkube.io/version"
"github.com/tensorflow/k8s/pkg/controller"
"github.com/tensorflow/k8s/pkg/garbagecollection"
"github.com/tensorflow/k8s/pkg/util"
"github.com/tensorflow/k8s/pkg/util/k8sutil"
"github.com/tensorflow/k8s/pkg/util/k8sutil/election"
"github.com/tensorflow/k8s/pkg/util/k8sutil/election/resourcelock"
"github.com/tensorflow/k8s/version"

log "github.com/golang/glog"

"io/ioutil"

"github.com/jlewi/mlkube.io/pkg/spec"
"github.com/tensorflow/k8s/pkg/spec"
"k8s.io/client-go/kubernetes"
"k8s.io/client-go/pkg/api/v1"
"k8s.io/client-go/tools/record"
4 changes: 2 additions & 2 deletions developer_guide.md
@@ -8,7 +8,7 @@ Create a symbolic link inside your GOPATH to the location you checked out the co
ln -sf ${GIT_TRAINING} ${GOPATH}/src/mlkube.io
```

* GIT_TRAINING should be the location where you checked out https://github.com/jlewi/mlkube.io
* GIT_TRAINING should be the location where you checked out https://github.com/tensorflow/k8s

Resolve dependencies (if you don't have glide install, check how to do it [here](https://github.com/Masterminds/glide/blob/master/README.md#install))

@@ -24,7 +24,7 @@ rm -rf vendor/k8s.io/apiextensions-apiserver/vendor
Build it

```
go install github.com/jlewi/mlkube.io/cmd/tf_operator
go install github.com/tensorflow/k8s/cmd/tf_operator
```

## Runing the Operator Locally
4 changes: 2 additions & 2 deletions images/tf_operator/build_and_push.py
@@ -95,8 +95,8 @@ def run_and_output(command, cwd=None):
go_path = os.environ["GOPATH"]

targets = [
"github.com/jlewi/mlkube.io/cmd/tf_operator",
"github.com/jlewi/mlkube.io/test/e2e",
"github.com/tensorflow/k8s/cmd/tf_operator",
"github.com/tensorflow/k8s/test/e2e",
]
for t in targets:
subprocess.check_call(["go", "install", t])
8 changes: 4 additions & 4 deletions pkg/controller/controller.go
@@ -11,13 +11,13 @@ import (
"sync"
"time"

"github.com/jlewi/mlkube.io/pkg/spec"
"github.com/jlewi/mlkube.io/pkg/trainer"
"github.com/jlewi/mlkube.io/pkg/util/k8sutil"
"github.com/tensorflow/k8s/pkg/spec"
"github.com/tensorflow/k8s/pkg/trainer"
"github.com/tensorflow/k8s/pkg/util/k8sutil"
"k8s.io/client-go/kubernetes"

log "github.com/golang/glog"
"github.com/jlewi/mlkube.io/pkg/util"
"github.com/tensorflow/k8s/pkg/util"
v1beta1 "k8s.io/apiextensions-apiserver/pkg/apis/apiextensions/v1beta1"
apiextensionsclient "k8s.io/apiextensions-apiserver/pkg/client/clientset/clientset"
apierrors "k8s.io/apimachinery/pkg/api/errors"
2 changes: 1 addition & 1 deletion pkg/controller/util.go
@@ -6,7 +6,7 @@ import (
"io"
"time"

"github.com/jlewi/mlkube.io/pkg/spec"
"github.com/tensorflow/k8s/pkg/spec"

metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
kwatch "k8s.io/apimachinery/pkg/watch"
4 changes: 2 additions & 2 deletions pkg/garbagecollection/gc.go
@@ -13,8 +13,8 @@
package garbagecollection

import (
"github.com/jlewi/mlkube.io/pkg/spec"
"github.com/jlewi/mlkube.io/pkg/util/k8sutil"
"github.com/tensorflow/k8s/pkg/spec"
"github.com/tensorflow/k8s/pkg/util/k8sutil"

log "github.com/golang/glog"
metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
2 changes: 1 addition & 1 deletion pkg/spec/tf_job.go
@@ -7,7 +7,7 @@ import (
"time"

"github.com/golang/protobuf/proto"
"github.com/jlewi/mlkube.io/pkg/util"
"github.com/tensorflow/k8s/pkg/util"
metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
"k8s.io/client-go/pkg/api/v1"
)
2 changes: 1 addition & 1 deletion pkg/spec/tf_job_test.go
@@ -5,7 +5,7 @@ import (
"testing"

"github.com/gogo/protobuf/proto"
"github.com/jlewi/mlkube.io/pkg/util"
"github.com/tensorflow/k8s/pkg/util"
"k8s.io/apimachinery/pkg/api/resource"
"k8s.io/client-go/pkg/api/v1"
)
6 changes: 3 additions & 3 deletions pkg/trainer/replicas.go
@@ -8,14 +8,14 @@ import (
"sort"
"strings"

"github.com/jlewi/mlkube.io/pkg/util/k8sutil"
"github.com/tensorflow/k8s/pkg/util/k8sutil"

"github.com/jlewi/mlkube.io/pkg/spec"
"github.com/tensorflow/k8s/pkg/spec"

log "github.com/golang/glog"
"github.com/golang/protobuf/proto"
// TOOO(jlewi): Rename to apiErrors
"github.com/jlewi/mlkube.io/pkg/util"
"github.com/tensorflow/k8s/pkg/util"
k8s_errors "k8s.io/apimachinery/pkg/api/errors"
meta_v1 "k8s.io/apimachinery/pkg/apis/meta/v1"
k8sErrors "k8s.io/apimachinery/pkg/util/errors"
4 changes: 2 additions & 2 deletions pkg/trainer/replicas_test.go
@@ -10,8 +10,8 @@ import (
"sync"
"time"

"github.com/jlewi/mlkube.io/pkg/spec"
tfJobFake "github.com/jlewi/mlkube.io/pkg/util/k8sutil/fake"
"github.com/tensorflow/k8s/pkg/spec"
tfJobFake "github.com/tensorflow/k8s/pkg/util/k8sutil/fake"
meta_v1 "k8s.io/apimachinery/pkg/apis/meta/v1"
"k8s.io/client-go/kubernetes/fake"
"k8s.io/client-go/pkg/api/v1"
2 changes: 1 addition & 1 deletion pkg/trainer/tensorboard.go
@@ -7,7 +7,7 @@ import (

log "github.com/golang/glog"
"github.com/golang/protobuf/proto"
"github.com/jlewi/mlkube.io/pkg/spec"
"github.com/tensorflow/k8s/pkg/spec"
k8s_errors "k8s.io/apimachinery/pkg/api/errors"
meta_v1 "k8s.io/apimachinery/pkg/apis/meta/v1"
"k8s.io/apimachinery/pkg/util/intstr"
4 changes: 2 additions & 2 deletions pkg/trainer/tensorboard_test.go
@@ -8,8 +8,8 @@ import (
"reflect"
"sync"

"github.com/jlewi/mlkube.io/pkg/spec"
tfJobFake "github.com/jlewi/mlkube.io/pkg/util/k8sutil/fake"
"github.com/tensorflow/k8s/pkg/spec"
tfJobFake "github.com/tensorflow/k8s/pkg/util/k8sutil/fake"
meta_v1 "k8s.io/apimachinery/pkg/apis/meta/v1"
"k8s.io/client-go/kubernetes/fake"
"k8s.io/client-go/pkg/api/v1"
10 changes: 5 additions & 5 deletions pkg/trainer/training.go
@@ -7,17 +7,17 @@ import (
"reflect"

log "github.com/golang/glog"
"github.com/jlewi/mlkube.io/pkg/spec"
"github.com/jlewi/mlkube.io/pkg/util"
"github.com/jlewi/mlkube.io/pkg/util/k8sutil"
"github.com/jlewi/mlkube.io/pkg/util/retryutil"
"github.com/tensorflow/k8s/pkg/spec"
"github.com/tensorflow/k8s/pkg/util"
"github.com/tensorflow/k8s/pkg/util/k8sutil"
"github.com/tensorflow/k8s/pkg/util/retryutil"

"math"
"strings"
"sync"
"time"

"github.com/jlewi/mlkube.io/pkg/garbagecollection"
"github.com/tensorflow/k8s/pkg/garbagecollection"

apierrors "k8s.io/apimachinery/pkg/api/errors"
"k8s.io/client-go/kubernetes"
4 changes: 2 additions & 2 deletions pkg/trainer/training_test.go
@@ -8,8 +8,8 @@ import (
"k8s.io/apimachinery/pkg/api/resource"
"k8s.io/client-go/kubernetes/fake"
"k8s.io/client-go/pkg/api/v1"
"github.com/jlewi/mlkube.io/pkg/spec"
tfJobFake "github.com/jlewi/mlkube.io/pkg/util/k8sutil/fake"
"github.com/tensorflow/k8s/pkg/spec"
tfJobFake "github.com/tensorflow/k8s/pkg/util/k8sutil/fake"
"sync"
)

2 changes: 1 addition & 1 deletion pkg/util/k8sutil/election/election.go
@@ -59,7 +59,7 @@ import (
"k8s.io/apimachinery/pkg/util/wait"

"github.com/golang/glog"
rl "github.com/jlewi/mlkube.io/pkg/util/k8sutil/election/resourcelock"
rl "github.com/tensorflow/k8s/pkg/util/k8sutil/election/resourcelock"
)

const (
2 changes: 1 addition & 1 deletion pkg/util/k8sutil/fake/fake.go
@@ -2,7 +2,7 @@
package fake

import (
"github.com/jlewi/mlkube.io/pkg/spec"
"github.com/tensorflow/k8s/pkg/spec"
"net/http"
"time"
)
2 changes: 1 addition & 1 deletion pkg/util/k8sutil/k8sutil.go
@@ -4,7 +4,7 @@ import (
"net"
"os"

"github.com/jlewi/mlkube.io/pkg/spec"
"github.com/tensorflow/k8s/pkg/spec"

apiextensionsclient "k8s.io/apiextensions-apiserver/pkg/client/clientset/clientset"
apierrors "k8s.io/apimachinery/pkg/api/errors"
4 changes: 2 additions & 2 deletions pkg/util/k8sutil/tpr_util.go
@@ -19,12 +19,12 @@ import (
"fmt"
"net/http"

"github.com/jlewi/mlkube.io/pkg/spec"
"github.com/tensorflow/k8s/pkg/spec"
"k8s.io/apimachinery/pkg/runtime"
"k8s.io/apimachinery/pkg/runtime/serializer"
"k8s.io/client-go/pkg/api"
"k8s.io/client-go/rest"
"github.com/jlewi/mlkube.io/pkg/util"
"github.com/tensorflow/k8s/pkg/util"
log "github.com/golang/glog"
)

2 changes: 1 addition & 1 deletion rename.sh
@@ -3,5 +3,5 @@
# Rewrite some imports
files=`find ./ -name *.go`
for f in $files; do
sed -i "s/mlkube.io\//github.com\/jlewi\/mlkube.io\//" ${f}
sed -i "s/github.com\/jlewi\/mlkube.io\//github.com\/tensorflow\/k8s\//" ${f}
done
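
Reviewer note: the `sed` substitution above can also be sketched in Python, which is easier to dry-run before touching a working tree. This is only an illustration under stated assumptions, not part of the commit: `rewrite_imports`, `OLD_PREFIX`, and `NEW_PREFIX` are hypothetical names. Unlike the `sed` pattern, where each `.` matches any character, the sketch matches the prefix literally.

```python
import os

# Hypothetical constants mirroring the old and new import prefixes in rename.sh.
OLD_PREFIX = "github.com/jlewi/mlkube.io/"
NEW_PREFIX = "github.com/tensorflow/k8s/"


def rewrite_imports(root, old=OLD_PREFIX, new=NEW_PREFIX):
    """Rewrite `old` to `new` in every .go file under `root`.

    Returns the list of files that were modified.
    """
    changed = []
    for dirpath, _, filenames in os.walk(root):
        for name in filenames:
            if not name.endswith(".go"):
                continue
            path = os.path.join(dirpath, name)
            with open(path, encoding="utf-8") as f:
                lines = f.readlines()
            # sed's s/// without the g flag replaces only the first match
            # on each line, so replace at most once per line here too.
            updated = [line.replace(old, new, 1) for line in lines]
            if updated != lines:
                with open(path, "w", encoding="utf-8") as f:
                    f.writelines(updated)
                changed.append(path)
    return changed
```

Calling `rewrite_imports(".")` from the repository root would roughly correspond to running the script. Walking the tree in Python also sidesteps the unquoted `*.go` in the `find` call, which the shell expands before `find` sees it if a `.go` file exists in the current directory.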
14 changes: 7 additions & 7 deletions test-infra/README.md
@@ -1,4 +1,4 @@
# mlkube Test Infrastructure
# Test Infrastructure

We use [Prow](https://github.com/kubernetes/test-infra/tree/master/prow),
K8s' continuous integration tool.
@@ -15,9 +15,9 @@ We use Prow to run:
Quick Links
* [config.yaml](https://github.com/kubernetes/test-infra/blob/master/prow/config.yaml)
defines the ProwJobs.
* Search for mlkube to find mlkube related jobs
* [mlkube Test Results Dashboard](https://k8s-testgrid.appspot.com/sig-big-data)
* [mlkube Prow Jobs dashboard](https://prow.k8s.io/?repo=jlewi%2Fmlkube.io)
* Search for tf-k8s to find tf-k8s related jobs
* [tf-k8s Test Results Dashboard](https://k8s-testgrid.appspot.com/sig-big-data)
* [tf-k8s Prow Jobs dashboard](https://prow.k8s.io/?repo=tensorflow%2Fk8s)

## Anatomy of our Prow Jobs

@@ -63,7 +63,7 @@ You can also run the tests inside the Docker image,
* This can be useful for debugging or testing changes

```
docker run -ti -v ${REPO_PATH}:/go/src/github.com/jlewi/mlkube.io \
docker run -ti -v ${REPO_PATH}:/go/src/github.com/tensorflow/k8s \
-v /var/run/docker.sock:/var/run/docker.sock \
--entrypoint=/bin/bash gcr.io/mlkube-testing/builder:latest
gcloud auth login
@@ -130,7 +130,7 @@ the results.
Our jobs should be added to
[K8s config](https://github.com/kubernetes/test-infra/blob/master/prow/config.yaml)

## Notes adding mlkube.io to K8s Prow Instance
## Notes adding tensorflow/k8s to K8s Prow Instance

Below is some notes on what it took to integrate with K8s Prow instance.

@@ -141,7 +141,7 @@ Below is some notes on what it took to integrate with K8s Prow instance.
* Add test dashboards to [testgrid/config/config.yaml](https://github.com/kubernetes/test-infra/pull/4951/files#diff-49f154cd90facc43fda49a99885e6d17)
* Modify [testgrid/jenkins_verify/jenkins_validat.go](https://github.com/kubernetes/test-infra/pull/4951/files#diff-7fb4731a02dd681bbd0daada8dd2f908)
to allow presubmits for the new repo.
1. For mlkube.io configure webhooks by following these [instructions](https://github.com/kubernetes/test-infra/blob/master/prow/getting_started.md#add-the-webhook-to-github)
1. For tensorflow/k8s configure webhooks by following these [instructions](https://github.com/kubernetes/test-infra/blob/master/prow/getting_started.md#add-the-webhook-to-github)
* Use https://prow.k8s.io/hook as the target
* Get HMAC token from k8s test team
1. Add the k8s bot account, k8s-ci-robot, as an admin on the repository
2 changes: 1 addition & 1 deletion test-infra/helm-test/main.go
@@ -107,7 +107,7 @@ var (
TEST_FAILURE_CODE = 2

// File path constants
chartsBasePath = path.Join(os.Getenv("GOPATH"), "src", "/github.com/jlewi/mlkube.io")
chartsBasePath = path.Join(os.Getenv("GOPATH"), "src", "/github.com/tensorflow/k8s")

image = flag.String("image", "", "The Docker image for Tfjob to use.")
outputPath = flag.String("output_dir", "", "The directory where test output should be written.")
6 changes: 3 additions & 3 deletions test-infra/image/bootstrap.py
@@ -28,13 +28,13 @@

# Default name for the repo organization and name.
# This should match the values used in Go imports.
GO_REPO_OWNER = "jlewi"
GO_REPO_NAME = "mlkube.io"
GO_REPO_OWNER = "tensorflow"
GO_REPO_NAME = "k8s"


def run(command, cwd=None):
logging.info("Running: %s", " ".join(command))
subprocess.check_call(command, cwd=cwd).decode("utf-8")
subprocess.check_call(command, cwd=cwd)

def run_and_output(command, cwd=None):
logging.info("Running: %s", " ".join(command))
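
Reviewer note: the change to `run` drops a `.decode("utf-8")` call, which fits how `subprocess` behaves. `check_call` returns the exit status as an integer, so there is nothing to decode, whereas `check_output` returns the captured stdout as bytes. A minimal sketch, assuming Python 3 (for `subprocess.DEVNULL`) and a stand-in command that just prints a word:

```python
import subprocess
import sys

# Stand-in command (an assumption for illustration): have the current
# interpreter print a single word.
command = [sys.executable, "-c", "print('ok')"]

# check_call waits for the command and returns its exit status, an int.
# Chaining .decode("utf-8") onto that int would raise AttributeError.
status = subprocess.check_call(command, stdout=subprocess.DEVNULL)

# check_output returns the command's stdout as bytes, which can be decoded.
output = subprocess.check_output(command).decode("utf-8")
```

Here `status` is the integer exit code and `output` is a decoded string, which is why decoding only makes sense for a helper that captures output.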
4 changes: 2 additions & 2 deletions test-infra/runner.py
@@ -54,8 +54,8 @@

# Default repository organization and name.
# This should match the values used in Go imports.
GO_REPO_OWNER = "jlewi"
GO_REPO_NAME = "mlkube.io"
GO_REPO_OWNER = "tensorflow"
GO_REPO_NAME = "k8s"

GCS_REGEX = re.compile("gs://([^/]*)/(.*)")
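
Reviewer note: for readers unfamiliar with the `GCS_REGEX` context line, the pattern captures the bucket name and the object path of a `gs://` URI. The sketch below shows one way it could be used; `split_gcs_uri` and the example URI are hypothetical and do not come from `runner.py`.

```python
import re

# Same pattern as the context line above, written as a raw string.
GCS_REGEX = re.compile(r"gs://([^/]*)/(.*)")


def split_gcs_uri(uri):
    """Split a gs:// URI into (bucket, object path).

    Hypothetical helper for illustration only.
    """
    match = GCS_REGEX.match(uri)
    if match is None:
        raise ValueError("Not a GCS URI: %s" % uri)
    return match.group(1), match.group(2)


# Made-up bucket and path:
bucket, path = split_gcs_uri("gs://some-bucket/logs/build/1")
# bucket == "some-bucket", path == "logs/build/1"
```

A URI with no slash after the bucket name, such as `gs://some-bucket`, does not match the pattern, so the sketch raises `ValueError` for it.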
