Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Workflow for building and pushing the TF serving image. #196

Merged
merged 12 commits into from Feb 6, 2018

Conversation

lluunn
Copy link
Contributor

@lluunn lluunn commented Feb 2, 2018

We want to have a workflow that automatically build and push the TF serving image from source, and run some test against it to verify it's working.
Argo is suitable for this kind of workflows, and it supports docker in docker.

In this PR, the workflows contains building+pushing part. (will add testing in followup PR).
It's using the Dockerfile in components/k8s-model-server/docker, which is a TF1.4 CPU image.

Related #50

@jlewi
Copy link
Contributor

jlewi commented Feb 2, 2018

Please reference #50 in the PR description.

Which version of TF does this end up building a Docker image for?

Copy link
Contributor

@jlewi jlewi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here's where I think we should go with this:

We should create a push to build pipeline, so everytime push to GitHub we end up building a Docker image from the source at that revision.

We can do this by having prow trigger the workflow.

We should tag the images to be something like

:1.0-<job_kind>-<pr #> - -

All of which are prow environment variables:
https://github.com/kubernetes/test-infra/tree/master/prow

commit: "master",
name: "new4",
namespace: "kubeflow-test-infra",
prow_env: "REPO_OWNER=google,REPO_NAME=kubeflow,PULL_BASE_SHA=master",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the REPO_OWNER should be kubeflow.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@@ -0,0 +1,31 @@
# This pod is useful for starting a shell that you can use to interactively debug our tests
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Delete this please.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

local artifactsDir = outputDir + "/artifacts";
local srcDir = testDir + "/src";
// local testing_image = "gcr.io/mlkube-testing/kubeflow-testing";
local testing_image = "gcr.io/kai-test2/kubeflow-testing:1.0";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this point at kai-cloud?
If you need to override these values make the parameters and just override them per environment.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I made some change to the Dockerfile (install docker).
Made it a param now.

@jlewi
Copy link
Contributor

jlewi commented Feb 4, 2018

Please update the PR description. Explain the motivation for this PR and why we are using Argo.

@lluunn
Copy link
Contributor Author

lluunn commented Feb 5, 2018

Thanks, PTAL

@jlewi
Copy link
Contributor

jlewi commented Feb 5, 2018

Lets change the directory to

testing/test-serving-container  -> images/releaser

local serving_image = params.serving_image;
local testing_image = params.testing_image;

// TODO(jlewi): Can we make name default so some random unique value?
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Delete this TODO please.

@@ -0,0 +1,3 @@
# Building and testing the TF serving image.

work in progress.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The README should explain how I can trigger this workflow to build a TF Serving image. The README should explain the following

1. How to set which version of the Kubeflow source repository is used
2. How to set the GCR registry where the image is published to
3. Which version of TF serving is used.

@lluunn
Copy link
Contributor Author

lluunn commented Feb 5, 2018

PTAL

Copy link
Contributor

@jlewi jlewi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a couple nit picky things.

## Build and Push
Set the gcr registry to be published to by `ks set param workflows serving_image gcr.io/xxx/xxx:xx`.

The TF serving image that the workflow is currently using is TF1.4 CPU, located at `components/k8s-model-server/docker`.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The text seems wrong. I assume you mean "TF version" not TF serving image?


## Checkout
The version of kubeflow is determined by the parameters such as `PULL_NUMBER` or `PULL_BASE_SHA`.
See [checkout.sh](https://github.com/kubeflow/testing/blob/master/images/checkout.sh) for more detail.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@lluunn
Copy link
Contributor Author

lluunn commented Feb 6, 2018

done

@jlewi jlewi merged commit e686cf7 into kubeflow:master Feb 6, 2018
elenzio9 pushed a commit to arrikto/kubeflow that referenced this pull request Oct 31, 2022
I have adde myself in the org.yaml. I have previously some small contributions and hope to continue do so :)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants