-
Notifications
You must be signed in to change notification settings - Fork 4
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add Katib Pipeline Step #77
base: master
Are you sure you want to change the base?
Conversation
mbu93
commented
May 14, 2020
Hey @mbu93 👋, I predicted that this release note qualifies as kind/bug to 16.12%. A release note with the kind/bug needs a prediction rate with at least 60%. |
Hi @mbu93. Thanks for your PR. I'm waiting for a kubernetes-analysis member to verify that this patch is reasonable to test. If it is, they should reply with Once the patch is verified, the new status will be reflected by the I understand the commands that are listed here. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: mbu93 The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
Thank you for the PR, please rebase your changes on top of the latest master branch. |
/hold |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let me check if the trigger plugin is correctly configured. This is the first time an external PR comes in.
@@ -201,6 +227,7 @@ def container( | |||
file_outputs[k] = out | |||
output_artifact_copy_args += dedent(""" | |||
mkdir -p {d} | |||
echo copying outputs |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
echo copying outputs |
src/pipeline.py
Outdated
# ssh_key = "ssh-key" | ||
# ctr.add_volume( | ||
# k8s.V1Volume(name=ssh_key, | ||
# secret=k8s.V1SecretVolumeSource(default_mode=0o600, | ||
# secret_name=ssh_key))) | ||
# ctr.container.add_volume_mount( | ||
# k8s.V1VolumeMount(name=ssh_key, | ||
# read_only=True, | ||
# mount_path="/root/.ssh")) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This will be needed to work on the master branch
/ok-to-test |
Prow is not able to apply the patch, please rebase and squash your commits into a single one via |
d367d02
to
cbad958
Compare
/ok-to-test |
The pipeline within this commit is not up to date. Please run |
Should I take care of the linting errors as well? What standard do you use? Pylint pep8? |
@mbu93: The following test failed, say
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here. |
|
Awesome! |
/wip |
/hold cancel |
/hold |
def create(self): | ||
self.op = dedent(""" | ||
rm -rf {outdir}/{repo} | ||
git clone https://github.com/saschagrunert/{repo} {outdir}/{repo} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I recommend to speed up the clone with --depth=1
and just apply the patch (see pipeline.py)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We also have to clone into a unique directory on the PV to make parallel runs possible.
rm -rf {outdir}/{repo} | ||
git clone https://github.com/saschagrunert/{repo} {outdir}/{repo} | ||
python /ml/launch_experiment.py \ | ||
--name {} \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you can still indent here.
self.output = output | ||
self.name = name | ||
self.image = image | ||
self.controller_img = "mbu93/katib-launcher:latest" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please commit the container file (Dockerfile) into the PR as well as the source code of the ml_runner.py.
def trial_template(name, image, replicas, command, mount_path, pr=None): | ||
target = "pull/{pr}/head:{pr}".format(pr=pr) if pr else "master" | ||
revision = pr if pr else "master" | ||
args = ["cd data/kubernetes-analysis ;", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You have to &&
here instead of ;
to fail early.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I recommend to combine the clone and checkout to have them in one place from a source code perspective.
@@ -85,6 +97,19 @@ def __run(pr: str = "", commit: str = ""): | |||
update_data.after(update_api) | |||
data = update_data_outputs["data"] | |||
|
|||
katib_op = KatibOp(image=Pipeline.IMAGE, name="katib-experiment", | |||
output=Pipeline.PV_DIR, repo="kubernetes-analysis", pr="") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please use Pipeline.REPO
for repo
and pass the pr
) | ||
katib.add_pvolumes({Pipeline.PV_DIR: PipelineVolume(pvc="pipeline-pv")}) | ||
katib.after(update_data) | ||
tune_params = func_to_container_op(to_args)(katib.outputs["params"]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please add a TODO comment here that we have to find a better solution for this.
layers, units) | ||
accuracy, _ = self.__train( | ||
layers=layers, units=units, epochs=epochs, learning_rate=learning_rate) | ||
print( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please use the logger here.
@@ -9,6 +10,9 @@ | |||
from sklearn.feature_extraction.text import TfidfVectorizer | |||
from sklearn.feature_selection import SelectKBest, f_classif | |||
|
|||
TuneParams = namedtuple("TuneParams", ["layers", "units"]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we move those into the Nlp class?
for _ in range(layers - 1): | ||
model.add(tf.keras.layers.Dense(units=units, activation="relu")) | ||
for scale in range(1, units // 2): | ||
model.add(tf.keras.layers.Dense(units=units // (2**scale), activation="relu")) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we need this change?
Please fixup the linter as well. 😀 Give me a hint if we can go for a retest, I’ll create the pvc in the meanwhile. |
Beside that, let’s cleanup the PV after the Katib run (deletion of the repository) that it does not endlessly fill-up. |