New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Configurable health check threshold for triggerer #33089
Configurable health check threshold for triggerer #33089
Conversation
|
||
|
||
@pytest.mark.parametrize("check_triggerer_heartrate", [10, 60, 300]) | ||
def test_assign_unassigned_missing_heartbeat(session, create_task_instance, check_triggerer_heartrate): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A new assert was added to the above test that covers the usecase this tests was trying to cover. This test was quite long and more complicated than I think required to test this particular code branch.
Recently the Triggerer was forced to a health check threshold of `trigger_heartbeat * 2.1` with a default of 5s for the heartbeat, this generates a threshold of 10.5s, the previous threshold was 30s, this leads to a very unstable situation where Triggerers are not given a reasonable amount of time to heartbeat and their triggers are taken from them. This change allows the user to configure this threshold the same way we do for the scheduler.
c7185b7
to
3f65b7a
Compare
cc: @hussein-awala @ephraimbuddy -> i think this one should go to RC1/ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice one, LGTM
+1 |
Recently the Triggerer was forced to a health check threshold of `trigger_heartbeat * 2.1` with a default of 5s for the heartbeat, this generates a threshold of 10.5s, the previous threshold was 30s, this leads to a very unstable situation where Triggerers are not given a reasonable amount of time to heartbeat and their triggers are taken from them. This change allows the user to configure this threshold the same way we do for the scheduler. (cherry picked from commit 6ec3b9a)
Recently the Triggerer was forced to a health check threshold of
trigger_heartbeat * 2.1
with a default of 5s for the heartbeat (as of #32123), this generates a threshold of 10.5s, the previous threshold was 30s, this leads to us observing a very unstable situation in many Airflow environments where Triggerers are not given a reasonable amount of time to heartbeat and their triggers are taken from them and given to another "healthy" Triggerer. This repeats back and fourth between multiple Triggerers.This change allows the user to configure this threshold the same way we do for the scheduler.
^ Add meaningful description above
Read the Pull Request Guidelines for more information.
In case of fundamental code changes, an Airflow Improvement Proposal (AIP) is needed.
In case of a new dependency, check compliance with the ASF 3rd Party License Policy.
In case of backwards incompatible changes please leave a note in a newsfragment file, named
{pr_number}.significant.rst
or{issue_number}.significant.rst
, in newsfragments.