Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Don't install TFMA for TF versions < 1.9 #1579

Merged
merged 4 commits into from Sep 22, 2018

Conversation

richardsliu
Copy link
Contributor

@richardsliu richardsliu commented Sep 19, 2018

Fixes #1576

According to https://github.com/tensorflow/model-analysis#compatible-versions, TFMA package 0.9.0 (which we use) is only compatible with TensorFlow 1.9+. This was causing the Jupyter notebook images to fail to build due to compatibility issues.


This change is Reviewable

@richardsliu
Copy link
Contributor Author

/assign @jlewi
/assign @pdmack

@pdmack
Copy link
Member

pdmack commented Sep 19, 2018

/lgtm

@pdmack
Copy link
Member

pdmack commented Sep 19, 2018

/hold

for @jlewi

@jlewi
Copy link
Contributor

jlewi commented Sep 20, 2018

Can you enable building the Docker images in presubmit so we can verify this fixes the issue?
Just uncomment this line
https://github.com/kubeflow/kubeflow/blob/master/prow_config.yaml#L85

@k8s-ci-robot k8s-ci-robot removed the lgtm label Sep 20, 2018
@richardsliu
Copy link
Contributor Author

/retest

@pdmack
Copy link
Member

pdmack commented Sep 20, 2018

/test all

@jlewi
Copy link
Contributor

jlewi commented Sep 20, 2018

Most recent test flake is

ERROR|2018-09-20T03:41:37|/src/kubeflow/testing/py/kubeflow/testing/run_e2e_workflow.py|257| Exception occurred: (401)
Reason: Unauthorized
HTTP response headers: HTTPHeaderDict({'Date': 'Thu, 20 Sep 2018 03:41:37 GMT', 'Audit-Id': '565e8792-b2a6-4df3-86ec-9a0e0b46ba16', 'Content-Length': '129', 'Content-Type': 'application/json', 'Www-Authenticate': 'Basic realm="kubernetes-master"'})
HTTP response body: {"kind":"Status","apiVersion":"v1","metadata":{},"status":"Failure","message":"Unauthorized","reason":"Unauthorized","code":401}

Traceback (most recent call last):
  File "/src/kubeflow/testing/py/kubeflow/testing/run_e2e_workflow.py", line 243, in run
    status_callback=argo_client.log_status)
  File "/usr/local/lib/python2.7/dist-packages/retrying.py", line 49, in wrapped_f
    return Retrying(*dargs, **dkw).call(f, *args, **kw)
  File "/usr/local/lib/python2.7/dist-packages/retrying.py", line 212, in call
    raise attempt.get()
  File "/usr/local/lib/python2.7/dist-packages/retrying.py", line 247, in get
    six.reraise(self.value[0], self.value[1], self.value[2])
  File "/usr/local/lib/python2.7/dist-packages/retrying.py", line 200, in call
    attempt = Attempt(fn(*args, **kwargs), attempt_number, False)
  File "/src/kubeflow/testing/py/kubeflow/testing/argo_client.py", line 70, in wait_for_workflows
    GROUP, VERSION, namespace, PLURAL, n)
  File "/usr/local/lib/python2.7/dist-packages/kubernetes/client/apis/custom_objects_api.py", line 697, in get_namespaced_custom_object
    (data) = self.get_namespaced_custom_object_with_http_info(group, version, namespace, plural, name, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/kubernetes/client/apis/custom_objects_api.py", line 797, in get_namespaced_custom_object_with_http_info
    collection_formats=collection_formats)
  File "/usr/local/lib/python2.7/dist-packages/kubernetes/client/api_client.py", line 321, in call_api
    _return_http_data_only, collection_formats, _preload_content, _request_timeout)
  File "/usr/local/lib/python2.7/dist-packages/kubernetes/client/api_client.py", line 155, in __call_api
    _request_timeout=_request_timeout)
  File "/usr/local/lib/python2.7/dist-packages/kubernetes/client/api_client.py", line 342, in request
    headers=headers)
  File "/usr/local/lib/python2.7/dist-packages/kubernetes/client/rest.py", line 231, in GET
    query_params=query_params)
  File "/usr/local/lib/python2.7/dist-packages/kubernetes/client/rest.py", line 222, in request
    raise ApiException(http_resp=r)
ApiException: (401)

That seems like a retryable error.

@richardsliu
Copy link
Contributor Author

/test all

@jlewi
Copy link
Contributor

jlewi commented Sep 20, 2018

I see what the problem with the retries is; Filed kubeflow/testing#204

@richardsliu
Copy link
Contributor Author

/retest

@jlewi
Copy link
Contributor

jlewi commented Sep 21, 2018

/lgtm
/approve

I assume this test will be fixed by kubeflow/testing#208?

@jlewi
Copy link
Contributor

jlewi commented Sep 21, 2018

/test all
Since
kubeflow/testing#204 is submitted.

@richardsliu
Copy link
Contributor Author

/retest

@jlewi
Copy link
Contributor

jlewi commented Sep 22, 2018

@richardsliu Can you update the PR description please? It would be good to explain why TFMA worked in earlier versions of TF. It would also be good to capture the fact that previously we were installing it in versions of TF for 1.6 and later but now we are only doing it in 1.9

In particular, it looks like we are installing the latest version of TFMA so we end up picking up TFMA 0.9.0 which requires TF 1.9

It looks like there is a compatibility matrix here:
https://github.com/tensorflow/model-analysis#compatible-versions

Can we open up a follow on issue to allow us to install different versions of TFMA in different TF versions of our notebook.

@jlewi
Copy link
Contributor

jlewi commented Sep 22, 2018

/lgtm
/approve

@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: jlewi

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@richardsliu
Copy link
Contributor Author

/hold cancel

@k8s-ci-robot k8s-ci-robot merged commit 71b01e8 into kubeflow:master Sep 22, 2018
saffaalvi pushed a commit to StatCan/kubeflow that referenced this pull request Feb 11, 2021
* Don't install TFMA for TF versions < 1.9

* Enable presubmit

* Add fix to libsonnet
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants