control-service: add counter to track data job watching task executions #692

Merged: 1 commit, Feb 2, 2022

Commits on Feb 2, 2022

  1. control-service: add counter to track data job watching task executions

    Currently, the data job watching task has no monitoring. This is the task
    that watches the K8s namespace for data job changes and updates the
    execution and termination statuses of the data jobs, along with the
    metrics exposed by the control service.
    
    We have experienced cases where this task stops running. Considering the
    importance of this task, it is essential that we get an early alert when
    this happens. This commit introduces a new metric (a counter) that exposes
    the number of executions of this task. The counter can then be used in
    dashboards to alert when the task stops executing for a period of time.
    
    Testing done: added unit tests; started the service manually and
    observed the new metric gradually increasing.
    
    Signed-off-by: Tsvetomir Palashki <tpalashki@vmware.com>
    tpalashki committed Feb 2, 2022
    Commit aa36522
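
The commit message describes a counter that increases each time the watching task executes, so that a dashboard can alert when it stops growing. A minimal sketch of that idea follows. It is not the code from this commit: `ExecutionCounter`, `run_watch_task`, and `is_stalled` are hypothetical names, the in-process counter stands in for whatever metrics library the control service actually uses, and incrementing even when an iteration fails is an assumption about what "executions" means.

```python
import threading


class ExecutionCounter:
    """Thread-safe, monotonically increasing counter.

    Hypothetical stand-in for a metrics-library counter.
    """

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._count = 0

    def increment(self) -> None:
        with self._lock:
            self._count += 1

    @property
    def count(self) -> int:
        with self._lock:
            return self._count


def run_watch_task(watch_once, counter: ExecutionCounter) -> None:
    """Run one iteration of the watching task and record that it executed.

    Assumption: the metric tracks executions, not successes, so the
    counter is incremented even if the iteration raises.
    """
    try:
        watch_once()
    finally:
        counter.increment()


def is_stalled(previous_count: int, current_count: int) -> bool:
    """Return True when the counter did not grow between two samples.

    An alert rule would evaluate this over a chosen time window.
    """
    return current_count <= previous_count
```

An alerting rule built on this would sample the counter at a fixed interval and fire when `is_stalled` holds for longer than the task's expected period.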