Periodically run KF ready tests against auto deployments #52

Open
jlewi opened this issue Jun 16, 2020 · 10 comments

Comments

@jlewi
Contributor

jlewi commented Jun 16, 2020

Follow-up to #42.

We should set up a periodic test that verifies the Kubeflow applications were deployed correctly on the auto-deployed clusters.

@issue-label-bot

Issue-Label Bot is automatically applying the labels:

Label Probability
area/engprod 0.74
kind/feature 0.87


@kubeflow-bot kubeflow-bot added this to To Do in Needs Triage Jun 16, 2020
jlewi pushed a commit to jlewi/kfctl that referenced this issue Jun 17, 2020
* kf_is_ready_test.py currently assumes we have access to the KFDef and
  can use that to determine the platform and trigger platform specific tests.

* With blueprints and when running against auto deployed clusters we won't
  have the KFDef; so in this case just set platform = "" and don't trigger
  platform specific logic.

* Related to GoogleCloudPlatform/kubeflow-distribution#52
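
A minimal sketch of the fallback described in the commit message above, not the actual kf_is_ready_test.py code. The names `resolve_platform` and `checks_to_run`, the dict-shaped KFDef, and its `platform` key are all hypothetical and only illustrate "no KFDef means platform = "" and no platform-specific checks".

```python
from typing import List, Optional


def resolve_platform(kfdef: Optional[dict]) -> str:
    """Return the platform named in a KFDef-like dict, or "" if there is no KFDef.

    The "platform" key is a hypothetical stand-in for however the real
    KFDef encodes the platform.
    """
    if kfdef is None:
        return ""
    return kfdef.get("platform", "")


def checks_to_run(platform: str) -> List[str]:
    """Always run the common checks; add platform checks only when a platform is known."""
    checks = ["common"]
    if platform:
        checks.append(f"{platform}-specific")
    return checks


# Auto-deployed cluster: no KFDef available, so only common checks run.
print(checks_to_run(resolve_platform(None)))
# KFDef available: platform-specific checks are added.
print(checks_to_run(resolve_platform({"platform": "gcp"})))
```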
@jlewi
Contributor Author

jlewi commented Jun 17, 2020

Getting a weird Kubernetes client error when running the tests:

self = <kubernetes.config.kube_config.KubeConfigLoader object at 0x7f12cc3197c0>

    def _refresh_gcp_token(self):
        if 'config' not in self._user['auth-provider']:
            self._user['auth-provider'].value['config'] = {}
        provider = self._user['auth-provider']['config']
        credentials = self._get_google_credentials()
        provider.value['access-token'] = credentials.token
        provider.value['expiry'] = format_rfc3339(credentials.expiry)
        if self._config_persister:
>           self._config_persister()
E           TypeError: _save_kube_config() missing 1 required positional argument: 'config_map'

/usr/local/lib/python3.8/dist-packages/kubernetes/config/kube_config.py:363: TypeError
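
The traceback shows the persister callback being invoked with no arguments while the registered function expects a `config_map`. The following is a toy reproduction of that arity mismatch and one possible workaround. `Loader` and `save_config` are hypothetical stand-ins, not the kubernetes client or kubeflow/testing code, and binding the argument with `functools.partial` is only one way such a mismatch could be bridged.

```python
import functools


class Loader:
    """Toy stand-in for a config loader; not the kubernetes client class."""

    def __init__(self, config, persister=None):
        self.config = config
        self._persister = persister

    def refresh(self):
        self.config["access-token"] = "new-token"
        if self._persister:
            # Called with no arguments, as in the traceback above.
            self._persister()


saved = []


def save_config(config_map):
    """Hypothetical persister that needs the config passed in."""
    saved.append(dict(config_map))


config = {"access-token": "old-token"}

# Registering save_config directly fails: the loader passes no arguments.
try:
    Loader(config, persister=save_config).refresh()
    direct_call_failed = False
except TypeError:
    direct_call_failed = True

# Binding the argument up front yields a zero-argument callable the loader can call.
Loader(config, persister=functools.partial(save_config, config)).refresh()
```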

@jlewi
Contributor Author

jlewi commented Jun 17, 2020

Here's the full stack trace:

kf_is_ready_test.py:70: in check_deployments_ready
    util.load_kube_config()
/srcCache/kubeflow/testing/py/kubeflow/testing/util.py:814: in load_kube_config
    loader.load_and_set(config) # pylint: disable=too-many-function-args
/usr/local/lib/python3.8/dist-packages/kubernetes/config/kube_config.py:550: in load_and_set
    self._load_authentication()
/usr/local/lib/python3.8/dist-packages/kubernetes/config/kube_config.py:278: in _load_authentication
    if self._load_auth_provider_token():
/usr/local/lib/python3.8/dist-packages/kubernetes/config/kube_config.py:293: in _load_auth_provider_token
    return self._load_gcp_token(provider)
/usr/local/lib/python3.8/dist-packages/kubernetes/config/kube_config.py:350: in _load_gcp_token
    self._refresh_gcp_token()

It looks like the version of the kubernetes library in the container is "11.0.0", whereas locally I have "9.0.0".
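
One way to confirm a mismatch like this is to print the installed client version in both environments. This is a small sketch using the standard library's `importlib.metadata` (Python 3.8+). The helper names are illustrative, and comparing only the major version is an assumption about what counts as a meaningful difference.

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def major(version: str) -> int:
    """Extract the major component from a dotted version string such as "11.0.0"."""
    return int(version.split(".")[0])


def same_major(a: str, b: str) -> bool:
    """True when two dotted version strings share the same major version."""
    return major(a) == major(b)


# Run this in the container and locally, then compare the output.
print("kubernetes client:", installed_version("kubernetes"))
print("same major as 9.0.0 and 11.0.0:", same_major("9.0.0", "11.0.0"))
```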

@issue-label-bot

Issue-Label Bot is automatically applying the labels:

Label Probability
platform/gcp 0.73


k8s-ci-robot pushed a commit to kubeflow/kfctl that referenced this issue Jun 17, 2020
* kf_is_ready_test.py currently assumes we have access to the KFDef and
  can use that to determine the platform and trigger platform specific tests.

* With blueprints and when running against auto deployed clusters we won't
  have the KFDef; so in this case just set platform = "" and don't trigger
  platform specific logic.

* Related to GoogleCloudPlatform/kubeflow-distribution#52
@jlewi
Contributor Author

jlewi commented Jun 23, 2020

Dashboard:
https://k8s-testgrid.appspot.com/sig-big-data#kubeflow-gcp-blueprints-master-periodic

The tests are running regularly, but some of them are failing.

@jlewi
Contributor Author

jlewi commented Jul 8, 2020

@Bobgy
Contributor

Bobgy commented Jul 8, 2020

It looks like the kf_is_ready and metadata_is_ready tests are not passing yet. Did you see that?

@Bobgy
Contributor

Bobgy commented Jul 8, 2020

Or is this issue just for setting up the test? If so, I have no objection.

vpavlin pushed a commit to vpavlin/kfctl that referenced this issue Jul 10, 2020
* kf_is_ready_test.py currently assumes we have access to the KFDef and
  can use that to determine the platform and trigger platform specific tests.

* With blueprints and when running against auto deployed clusters we won't
  have the KFDef; so in this case just set platform = "" and don't trigger
  platform specific logic.

* Related to GoogleCloudPlatform/kubeflow-distribution#52
jlewi pushed a commit to jlewi/kfctl that referenced this issue Jul 10, 2020
* For workload identity binding tests the permissions were all wrong
* We no longer download service account keys
* cloud-endpoints-controller should be in the kubeflow namespace not
  istio-system namespace

Related to GoogleCloudPlatform/kubeflow-distribution#52
Related to GoogleCloudPlatform/kubeflow-distribution#73
k8s-ci-robot pushed a commit to kubeflow/kfctl that referenced this issue Jul 10, 2020
* For workload identity binding tests the permissions were all wrong
* We no longer download service account keys
* cloud-endpoints-controller should be in the kubeflow namespace not
  istio-system namespace

Related to GoogleCloudPlatform/kubeflow-distribution#52
Related to GoogleCloudPlatform/kubeflow-distribution#73
vpavlin pushed a commit to vpavlin/kfctl that referenced this issue Jul 20, 2020
* kf_is_ready_test.py currently assumes we have access to the KFDef and
  can use that to determine the platform and trigger platform specific tests.

* With blueprints and when running against auto deployed clusters we won't
  have the KFDef; so in this case just set platform = "" and don't trigger
  platform specific logic.

* Related to GoogleCloudPlatform/kubeflow-distribution#52
vpavlin pushed a commit to vpavlin/kfctl that referenced this issue Jul 22, 2020
* kf_is_ready_test.py currently assumes we have access to the KFDef and
  can use that to determine the platform and trigger platform specific tests.

* With blueprints and when running against auto deployed clusters we won't
  have the KFDef; so in this case just set platform = "" and don't trigger
  platform specific logic.

* Related to GoogleCloudPlatform/kubeflow-distribution#52
vpavlin pushed a commit to vpavlin/kfctl that referenced this issue Jul 22, 2020
* For workload identity binding tests the permissions were all wrong
* We no longer download service account keys
* cloud-endpoints-controller should be in the kubeflow namespace not
  istio-system namespace

Related to GoogleCloudPlatform/kubeflow-distribution#52
Related to GoogleCloudPlatform/kubeflow-distribution#73
@jtfogarty jtfogarty moved this from To Do to Assigned to Area Owner For Triage in Needs Triage Sep 8, 2020
@jtfogarty

jtfogarty commented Sep 8, 2020

/area gcp-blueprints
/priority p2

@kubeflow-bot kubeflow-bot removed this from Assigned to Area Owner For Triage in Needs Triage Sep 8, 2020
crobby pushed a commit to crobby/kfctl that referenced this issue Feb 25, 2021
* kf_is_ready_test.py currently assumes we have access to the KFDef and
  can use that to determine the platform and trigger platform specific tests.

* With blueprints and when running against auto deployed clusters we won't
  have the KFDef; so in this case just set platform = "" and don't trigger
  platform specific logic.

* Related to GoogleCloudPlatform/kubeflow-distribution#52
crobby pushed a commit to crobby/kfctl that referenced this issue Feb 25, 2021
* For workload identity binding tests the permissions were all wrong
* We no longer download service account keys
* cloud-endpoints-controller should be in the kubeflow namespace not
  istio-system namespace

Related to GoogleCloudPlatform/kubeflow-distribution#52
Related to GoogleCloudPlatform/kubeflow-distribution#73
4 participants