Introduce buggify integration test mode #3552

Merged: nickva merged 1 commit into main from add-buggify-test-mode on May 10, 2021
@nickva nickva commented May 10, 2021

Add a `buggify-elixir-suite` target to run the Elixir integration tests under FoundationDB's client buggify mode [1]. In this mode, the FDB C client in the `erlfdb` application will periodically throw mostly retryable errors (`1009`, `1007`, etc). Transaction closures should handle retryable errors properly, without side effects such as re-sending response data to the user more than once or attempting to re-read data from the socket after it has already been read.

To avoid false positives, provide a custom `.ini` settings file which disables transaction timeouts (`1031` errors). Those are not retryable by default, as far as the `on_error` callback is concerned. If we do have timeouts set ( = 60000), it signals to the FoundationDB client that we expect to handle timeouts in buggify mode, so it starts throwing them [2]. Since we don't handle those everywhere, we get quite a few false positive errors.
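To illustrate the kind of bug this mode is meant to surface, here is a minimal Python sketch of a transaction retry loop. It is a simulation only and does not use the `erlfdb` or FoundationDB APIs: `transact`, `FdbError`, the two handlers and the scripted `faults` list are all hypothetical names made up for the example. The only details taken from the description above are that `1007` and `1009` are treated as retryable and `1031` is not.

```python
# Simulation only: none of these names come from erlfdb or FoundationDB.
RETRYABLE = {1007, 1009}  # codes named above, assumed retryable
TIMEOUT = 1031            # transaction timeout, assumed not retryable


class FdbError(Exception):
    """Hypothetical stand-in for an error raised by the client."""

    def __init__(self, code):
        super().__init__(f"fdb error {code}")
        self.code = code


def transact(closure, faults):
    """Run closure once per scripted attempt.

    faults lists, per attempt, the error code to inject after the closure
    has run (None means the attempt commits cleanly).
    """
    for fault in faults:
        try:
            result = closure()
            if fault is not None:
                raise FdbError(fault)  # injected failure, as buggify would do
            return result
        except FdbError as err:
            if err.code not in RETRYABLE:
                raise  # e.g. 1031: surfaces to the caller
            # retryable: loop around and run the closure again
    raise RuntimeError("ran out of scripted attempts")


def unsafe_handler(faults):
    sent = []

    def closure():
        sent.append("doc")  # side effect inside the closure, repeated on retry
        return "ok"

    transact(closure, faults)
    return sent


def safe_handler(faults):
    sent = []

    def closure():
        return "doc"  # only computes the value, safe to rerun

    sent.append(transact(closure, faults))  # side effect happens once, after commit
    return sent
```

With two injected retryable errors followed by a clean attempt, `unsafe_handler([1009, 1007, None])` records the response three times, while `safe_handler` with the same faults records it once. Passing `[1031]` makes either handler raise, which is roughly why unhandled timeouts show up as false positives.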

The buggify settings are, I believe, the defaults: a 25% chance to activate an error, and a 25% chance of firing the error when the code passes over that section. Most test runs should pass, but sometimes, due to lingering bugs, there will be timeouts, 409 conflicts and other failures, so we cannot yet turn this into a reliable integration test step.
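As a rough worked example of what those two percentages imply, the sketch below models them as two independent checks: a section is activated (or not) once, and an activated section then fires on each pass with the second probability. This is an assumed reading of the description above, not the FoundationDB implementation, and `BuggifySim` and `expected_fire_rate` are hypothetical names.

```python
import random

ACTIVATE_P = 0.25  # chance that a given section is activated at all
FIRE_P = 0.25      # chance that an activated section fires on one pass


class BuggifySim:
    """Toy model: activation is decided once per section, firing per pass."""

    def __init__(self, rng):
        self.rng = rng
        self.active = {}

    def fires(self, section):
        if section not in self.active:
            # First time this section is seen: decide whether it is activated.
            self.active[section] = self.rng.random() < ACTIVATE_P
        # Only activated sections can fire.
        return self.active[section] and self.rng.random() < FIRE_P


def expected_fire_rate():
    """Average chance that a first pass over an arbitrary section fires."""
    return ACTIVATE_P * FIRE_P
```

Under this model `expected_fire_rate()` is 0.0625, so on average about 1 in 16 first passes over a section injects an error, while a section that happens to be activated keeps firing on about 1 in 4 passes for the rest of the run.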

[1] https://apple.github.io/foundationdb/client-testing.html

[2] https://github.com/apple/foundationdb/blob/master/fdbclient/ReadYourWrites.actor.cpp#L1191-L1194

@iilyak iilyak left a comment

+1

@nickva nickva force-pushed the add-buggify-test-mode branch from 38cf6e6 to 680f842 Compare May 10, 2021 15:29
@nickva nickva merged commit 96d3860 into main May 10, 2021
@nickva nickva deleted the add-buggify-test-mode branch May 10, 2021 15:40