Add retries for V2 protocol tests #2650

Merged: 1 commit merged into SeldonIO:master on Nov 19, 2020

adriangonz
Contributor

What this PR does / why we need it:

Tests against the "classic Seldon" API retry their initial requests a couple of times. This is apparently done to work around a delay while Istio and Ambassador finish setting up the services.

This PR adds a similar mechanism to the integration tests against the V2 API.
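As a rough illustration of the retry idea described above (not the code this PR adds), the sketch below retries a request until it returns HTTP 200 or the attempts run out. The names `predict_with_retries`, `FakeResponse`, and `flaky_request` are hypothetical, and the gateway is simulated so the example runs without a cluster.

```python
import time
from typing import Callable


class FakeResponse:
    """Stand-in for an HTTP response; only the status code matters here."""

    def __init__(self, status_code: int):
        self.status_code = status_code


def predict_with_retries(
    send_request: Callable[[], FakeResponse],
    attempts: int = 3,
    sleep: float = 1.0,
) -> FakeResponse:
    """Call send_request until it returns HTTP 200 or attempts run out.

    Hypothetical helper: returns the last response either way, so the
    caller can still assert on the final status code.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    response = None
    for attempt in range(attempts):
        response = send_request()
        if response.status_code == 200:
            break
        # Only sleep when another attempt is still to come.
        if attempt < attempts - 1:
            time.sleep(sleep)
    return response


# Simulate an ingress that answers 503 twice before its routes are ready.
statuses = iter([503, 503, 200])
calls = []


def flaky_request() -> FakeResponse:
    status = next(statuses)
    calls.append(status)
    return FakeResponse(status)


result = predict_with_retries(flaky_request, attempts=5, sleep=0.0)
print(result.status_code, len(calls))
```

In a real test, `send_request` would wrap the actual call to the V2 inference endpoint; the retry budget and sleep interval here are assumptions, not values taken from the PR.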

Which issue(s) this PR fixes:

Fixes #2589

Special notes for your reviewer:

This PR also includes an unrelated formatting change in od_model.py.

Does this PR introduce a user-facing change?:

@adriangonz
Contributor Author

/test integration

@seldondev
Collaborator

Mon Nov 16 14:52:13 UTC 2020
The logs for [pr-build] [1] will show after the pipeline context has finished.
https://github.com/SeldonIO/seldon-core/blob/gh-pages/jenkins-x/logs/SeldonIO/seldon-core/PR-2650/1.log

impatient try
jx get build logs SeldonIO/seldon-core/PR-2650 --build=1

@seldondev
Collaborator

Mon Nov 16 14:52:21 UTC 2020
The logs for [lint] [2] will show after the pipeline context has finished.
https://github.com/SeldonIO/seldon-core/blob/gh-pages/jenkins-x/logs/SeldonIO/seldon-core/PR-2650/2.log

impatient try
jx get build logs SeldonIO/seldon-core/PR-2650 --build=2

@seldondev
Collaborator

Mon Nov 16 14:52:52 UTC 2020
The logs for [integration] [3] will show after the pipeline context has finished.
https://github.com/SeldonIO/seldon-core/blob/gh-pages/jenkins-x/logs/SeldonIO/seldon-core/PR-2650/3.log

impatient try
jx get build logs SeldonIO/seldon-core/PR-2650 --build=3

@adriangonz
Contributor Author

The first integration error seems unrelated:

___________________________ TestPrepack.test_mlflow ____________________________
[gw0] linux -- Python 3.6.10 /usr/local/bin/python

self = <test_prepackaged_servers.TestPrepack object at 0x7f20ba2adb00>
namespace = 'test-mlflow'

    @skipif_engine
    def test_mlflow(self, namespace):
        spec = "../../servers/mlflowserver/samples/elasticnet_wine.yaml"
        retry_run(f"kubectl apply -f {spec} -n {namespace}")
        wait_for_status("mlflow", namespace)
>       wait_for_rollout("mlflow", namespace)

test_prepackaged_servers.py:177: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

sdep_name = 'mlflow', namespace = 'test-mlflow', attempts = 50, sleep = 5
expected_deployments = 1

    def wait_for_rollout(
        sdep_name, namespace, attempts=50, sleep=5, expected_deployments=1
    ):
        deployment_names = []
        for _ in range(attempts):
            deployment_names = get_deployment_names(sdep_name, namespace)
            deployments = len(deployment_names)
    
            if deployments == expected_deployments:
                break
            time.sleep(sleep)
    
        error_msg = (
            f"Expected {expected_deployments} deployment(s) but got {len(deployment_names)}"
        )
        assert len(deployment_names) == expected_deployments, error_msg
    
        for deployment_name in deployment_names:
            logging.info(f"Waiting for deployment {deployment_name}")
            for _ in range(attempts):
                ret = run(
                    f"kubectl rollout status -n {namespace} deploy/{deployment_name}",
                    shell=True,
                )
                if ret.returncode == 0:
                    logging.info(f"Successfully waited for deployment {deployment_name}")
                    break
                logging.warning(
                    f"Unsuccessful wait command but retrying for {deployment_name}"
                )
                time.sleep(sleep)
            assert (
                ret.returncode == 0
>           ), f"Wait for rollout of {deployment_name} failed: non-zero return code"
E           AssertionError: Wait for rollout of mlflow-default-0-classifier failed: non-zero return code

seldon_e2e_utils.py:158: AssertionError
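The traceback above shows `wait_for_rollout` using the same poll-then-sleep loop twice. A minimal sketch of how that pattern could be factored into one helper follows; `wait_until` and `rollout_ready` are hypothetical names, not part of `seldon_e2e_utils.py`, and the rollout is simulated instead of calling `kubectl`.

```python
import time
from typing import Callable


def wait_until(
    check: Callable[[], bool], attempts: int = 50, sleep: float = 5.0
) -> bool:
    """Poll check() up to `attempts` times, sleeping between polls.

    Returns True as soon as check() succeeds, False if every poll fails.
    """
    for attempt in range(attempts):
        if check():
            return True
        # Skip the final sleep: there is nothing left to wait for.
        if attempt < attempts - 1:
            time.sleep(sleep)
    return False


# Simulated rollout that becomes ready on the third poll.
polls = {"count": 0}


def rollout_ready() -> bool:
    polls["count"] += 1
    return polls["count"] >= 3


ok = wait_until(rollout_ready, attempts=5, sleep=0.0)
never = wait_until(lambda: False, attempts=2, sleep=0.0)
print(ok, polls["count"], never)
```

Returning a boolean leaves the assertion and its error message to the caller, which matches how the traceback's code asserts after each loop.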

@adriangonz
Contributor Author

/test integration

@seldondev
Collaborator

Mon Nov 16 16:02:12 UTC 2020
The logs for [integration] [4] will show after the pipeline context has finished.
https://github.com/SeldonIO/seldon-core/blob/gh-pages/jenkins-x/logs/SeldonIO/seldon-core/PR-2650/4.log

impatient try
jx get build logs SeldonIO/seldon-core/PR-2650 --build=4

@adriangonz
Contributor Author

/test integration

@seldondev
Collaborator

Tue Nov 17 09:39:22 UTC 2020
The logs for [integration] [5] will show after the pipeline context has finished.
https://github.com/SeldonIO/seldon-core/blob/gh-pages/jenkins-x/logs/SeldonIO/seldon-core/PR-2650/5.log

impatient try
jx get build logs SeldonIO/seldon-core/PR-2650 --build=5

@adriangonz
Contributor Author

/test integration

@seldondev
Collaborator

Tue Nov 17 14:53:34 UTC 2020
The logs for [integration] [6] will show after the pipeline context has finished.
https://github.com/SeldonIO/seldon-core/blob/gh-pages/jenkins-x/logs/SeldonIO/seldon-core/PR-2650/6.log

impatient try
jx get build logs SeldonIO/seldon-core/PR-2650 --build=6

@adriangonz changed the title from "WIP : Add retries for V2 protocol tests" to "Add retries for V2 protocol tests" on Nov 17, 2020
@adriangonz
Contributor Author

/test integration

@seldondev
Collaborator

Tue Nov 17 17:37:26 UTC 2020
The logs for [integration] [7] will show after the pipeline context has finished.
https://github.com/SeldonIO/seldon-core/blob/gh-pages/jenkins-x/logs/SeldonIO/seldon-core/PR-2650/7.log

impatient try
jx get build logs SeldonIO/seldon-core/PR-2650 --build=7

@adriangonz
Contributor Author

/test integration

@seldondev
Collaborator

Wed Nov 18 09:47:09 UTC 2020
The logs for [integration] [8] will show after the pipeline context has finished.
https://github.com/SeldonIO/seldon-core/blob/gh-pages/jenkins-x/logs/SeldonIO/seldon-core/PR-2650/8.log

impatient try
jx get build logs SeldonIO/seldon-core/PR-2650 --build=8

@adriangonz
Contributor Author

/test integration

@adriangonz
Contributor Author

/cc @cliveseldon @RafalSkolasinski

@ukclivecox
Contributor

/approve

@seldondev
Collaborator

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: cliveseldon

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@seldondev
Collaborator

Thu Nov 19 14:48:29 UTC 2020
The logs for [lint] [10] will show after the pipeline context has finished.
https://github.com/SeldonIO/seldon-core/blob/gh-pages/jenkins-x/logs/SeldonIO/seldon-core/PR-2650/10.log

impatient try
jx get build logs SeldonIO/seldon-core/PR-2650 --build=10

@seldondev
Collaborator

Thu Nov 19 14:48:39 UTC 2020
The logs for [pr-build] [9] will show after the pipeline context has finished.
https://github.com/SeldonIO/seldon-core/blob/gh-pages/jenkins-x/logs/SeldonIO/seldon-core/PR-2650/9.log

impatient try
jx get build logs SeldonIO/seldon-core/PR-2650 --build=9

@seldondev seldondev merged commit 207ad72 into SeldonIO:master Nov 19, 2020
Successfully merging this pull request may close these issues.

Fix notebook failing integration tests for sklearn and xgboost V2
3 participants