Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add option to disable retries & disable boto retries for CI tests #9074

Merged
merged 5 commits into from Sep 7, 2023

Conversation

dominikschubert
Copy link
Member

@dominikschubert dominikschubert commented Sep 5, 2023

Motivation

These uncontrolled boto-client level retries are an inherent source of unpredictable behavior, so let's see if disabling them causes any unexpected issues.

Retries also cause unnecessary time waiting when we actually want to test a failure case.

Changes

  • disables retries for all clients created via our client factories in our test suite (CI)
  • adds config variable DISABLE_BOTO_RETRIES. Set DISABLE_BOTO_RETRIES=1 to opt-tin of the new behavior.

Discussion

Might also want to go through cases where we use SDKs in lambda functions etc. to disable any retries.

@dominikschubert dominikschubert self-assigned this Sep 5, 2023
@dominikschubert dominikschubert added the semver: patch Non-breaking changes which can be included in patch releases label Sep 5, 2023
@github-actions
Copy link

github-actions bot commented Sep 5, 2023

LocalStack Community integration with Pro

       2 files         2 suites   1h 21m 24s ⏱️
2 169 tests 1 671 ✔️ 498 💤 0
2 170 runs  1 671 ✔️ 499 💤 0

Results for commit 8846daa.

♻️ This comment has been updated with latest results.

@joe4dev
Copy link
Member

joe4dev commented Sep 6, 2023

This should speed up exception testing considerably because most with pytest.raises calls were invoked 4 times 😱

It also shows good progress at mitigating flaky tests. 5 of the 9 failing tests are marked by CircleCI as flaky: https://app.circleci.com/pipelines/github/localstack/localstack/17875/workflows/93965461-12f5-4dff-9821-1fabbe7ea8ba/jobs/136694/tests

Screenshot 2023-09-06 at 11 07 06

Copy link
Member

@dfangl dfangl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@@ -333,7 +333,7 @@ def _get_client(
aws_access_key_id=aws_access_key_id,
aws_secret_access_key=aws_secret_access_key,
aws_session_token=aws_session_token,
config=config,
config=config.merge(Config(retries={"max_attempts": 0})),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would this be reverted? Are we trying to address an issue with the test suite? I would rather set this in the fixtures.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No this is also intended to be set for internal communication, though we should observe this a bit. Generally I think it makes more sense to fail fast if there's an issue instead of retrying countless times.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should not hide potential issues through retries for internal calls as long as we assume that LocalStack runs on the same machine (i.e., without a potentially problematic network connection).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've now also added a config option to opt-out of this behavior

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To keep this a bit less controversion for now, I've switched the default back to normal retries & explicitly set DISABLE_BOTO_RETRIES in our CI tests. This means you'll need to set this in your environment locally though if you want to execute your tests without retries.

@dominikschubert dominikschubert removed the request for review from thrau September 6, 2023 12:23
Copy link
Member

@joe4dev joe4dev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM 👍

Did the rebase fix the other flaky tests or did you patch b1a17823afc5cd2a147b5c5d83ab6a02aa660e9b ?

@@ -333,7 +333,7 @@ def _get_client(
aws_access_key_id=aws_access_key_id,
aws_secret_access_key=aws_secret_access_key,
aws_session_token=aws_session_token,
config=config,
config=config.merge(Config(retries={"max_attempts": 0})),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should not hide potential issues through retries for internal calls as long as we assume that LocalStack runs on the same machine (i.e., without a potentially problematic network connection).

FunctionName=function_name, Qualifier=function_version
)["Status"]
if status == "FAILED":
raise ShortCircuitWaitException("terminal fail state")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice pattern we should remember and use more often in tests 👍

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FYI: also a better solution than the regular retry from the Lambda invocation loop 👍 (favored this one in rebase)

@dominikschubert
Copy link
Member Author

Did the rebase fix the other flaky tests or did you patch b1a1782 ?

just a normal re-run 🤷‍♂️ but they're gone now on master anyway.

@dominikschubert dominikschubert changed the title disable retries for boto clients add option to disable retries & disable boto retries for CI tests Sep 7, 2023
@dominikschubert dominikschubert merged commit 4953c48 into master Sep 7, 2023
30 checks passed
@dominikschubert dominikschubert deleted the disable-boto-retries branch September 7, 2023 15:34
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
semver: patch Non-breaking changes which can be included in patch releases
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants