Skip to content

fix host cancelation reporting on canceled future batches#32563

Merged
sgress454 merged 5 commits intorc-minor-fleet-v4.73.0from
sgress454/show-canceled-scheduled-hosts-correctly
Sep 3, 2025
Merged

fix host cancelation reporting on canceled future batches#32563
sgress454 merged 5 commits intorc-minor-fleet-v4.73.0from
sgress454/show-canceled-scheduled-hosts-correctly

Conversation

@sgress454
Copy link
Copy Markdown
Contributor

@sgress454 sgress454 commented Sep 3, 2025

for #32468

Details

Previously, if you canceled a script that was scheduled for the future, the hosts in that script were shown as "pending" in the batch summary and in the filtered hosts list. This PR fixes the displays so that those hosts are correctly shown as "canceled".

Checklist for submitter

  • Input data is properly validated, SELECT * is avoided, SQL injection is prevented (using placeholders for values in statements)

Testing

Scheduled script

  1. Created a new batch script run scheduled for a future date
  2. Canceled that batch run
  3. Clicked on the batch run in the "finished" tab of the batch scripts list
  4. Verified that the number of canceled hosts = the number of targeted hosts for that batch, and all other numbers were 0.
  5. Verified that clicking the canceled hosts number navigated to the correct list of canceled hosts

"Run now" script

  1. Created a new batch script run with "run now"
  2. Waited for at least one host to run.
  3. Canceled that batch
  4. Clicked on the batch run in the "finished" tab of the batch scripts list
  5. Verified that the number of canceled hosts = the number that were still pending when I canceled the script, and that the # of pending hosts was 0
  6. Verified that clicking the canceled hosts number navigated to the correct list of canceled hosts

One more batch

  1. Created another batch script with the same hosts, scheduled for the future
  2. Verified that the "pending" host list is correct

For unreleased bug fixes in a release candidate, one of:

  • Confirmed that the fix is not expected to adversely impact load test results

Comment thread server/datastore/mysql/hosts.go
Comment thread server/datastore/mysql/hosts.go Outdated
@codecov
Copy link
Copy Markdown

codecov Bot commented Sep 3, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
⚠️ Please upload report for BASE (rc-minor-fleet-v4.73.0@92fa2c3). Learn more about missing BASE report.

Additional details and impacted files
@@                    Coverage Diff                    @@
##             rc-minor-fleet-v4.73.0   #32563   +/-   ##
=========================================================
  Coverage                          ?   62.08%           
=========================================================
  Files                             ?     1984           
  Lines                             ?   194211           
  Branches                          ?     6519           
=========================================================
  Hits                              ?   120569           
  Misses                            ?    64062           
  Partials                          ?     9580           
Flag Coverage Δ
backend 63.17% <100.00%> (?)

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

batchScriptExecutionJoin = `LEFT JOIN batch_activity_host_results bsehr ON h.id = bsehr.host_id`
batchScriptExecutionIDFilter = `bsehr.batch_execution_id = ?`
batchScriptExecutionJoin = `LEFT JOIN batch_activity_host_results bsehr ON h.id = bsehr.host_id JOIN batch_activities ba ON bsehr.batch_execution_id = ba.execution_id`
batchScriptExecutionIDFilter = `ba.execution_id = ?`
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We're filtering by batch execution ID here, so batch_activities will always have a matching row. We need that row in order to determine if the batch was canceled.

Comment on lines -1226 to +1225
batchScriptExecutionIDFilter += ` AND (hsr.exit_code IS NULL AND (hsr.canceled IS NULL OR hsr.canceled = 0) AND bsehr.error IS NULL)`
batchScriptExecutionIDFilter += ` AND ((hsr.host_id AND (hsr.exit_code IS NULL AND (hsr.canceled IS NULL OR hsr.canceled = 0) AND bsehr.error IS NULL)) OR (hsr.host_id is NULL AND ba.canceled = 0 AND bsehr.error IS NULL))`
Copy link
Copy Markdown
Contributor Author

@sgress454 sgress454 Sep 3, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A host can be pending if:

  1. The batch is running, so the host has a host_script_results record (hsr.host_id is not null) with no exit code (so it hasn't run yet)
  2. The batch hasn't started yet, so the host has no host_script_results record (hsr.host_is is null), and the batch isn't canceled (ba.canceled = 0) and we haven't already marked the host as incompatible (bsehr.error IS NULL)

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(probably have a couple more parentheses in here than are strictly necessary)

@sgress454 sgress454 marked this pull request as ready for review September 3, 2025 22:15
@sgress454 sgress454 requested a review from a team as a code owner September 3, 2025 22:15
// TODO -- remove this when status is set automatically
ExecAdhocSQL(t, ds, func(tx sqlx.ExtContext) error {
_, err := tx.ExecContext(ctx, "UPDATE batch_activities SET status = 'scheduled' WHERE execution_id = ?", execID)
_, err := tx.ExecContext(ctx, "UPDATE batch_activities SET status = 'started' WHERE execution_id = ?", execID)
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nothing functional here, just made the test confusing

Comment thread server/datastore/mysql/scripts_test.go Outdated
Comment on lines +2356 to +2357
// The summary should have two pending hosts and two errored ones, because
// the script is not compatible with the hostNoScripts and hostWindows.
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove comment?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

updated comment!

Comment thread server/datastore/mysql/scripts_test.go Outdated
@lucasmrod lucasmrod self-assigned this Sep 3, 2025
@sgress454 sgress454 merged commit 611f8e8 into rc-minor-fleet-v4.73.0 Sep 3, 2025
34 of 37 checks passed
@sgress454 sgress454 deleted the sgress454/show-canceled-scheduled-hosts-correctly branch September 3, 2025 22:42
sgress454 added a commit that referenced this pull request Sep 4, 2025
# Details

Applying the patches from 
#32516 (comment) and
#32563
onto `main`.

This fixes:
* Batches where all remaining hosts are deleted will be correctly marked
as "finished"
* Batches scheduled for the future, and then canceled, will have al
hosts marked as "canceled" rather than pending

# Checklist for submitter

If some of the following don't apply, delete the relevant line.

- [X] Input data is properly validated, `SELECT *` is avoided, SQL
injection is prevented (using placeholders for values in statements)
- [X] If paths of existing endpoints are modified without backwards
compatibility, checked the frontend/CLI for any necessary changes

## Testing

- [X] Added/updated automated tests
- [X] Where appropriate, [automated tests simulate multiple hosts and
test for host
isolation](https://github.com/fleetdm/fleet/blob/main/docs/Contributing/reference/patterns-backend.md#unit-testing)
(updates to one hosts's records do not affect another)

- [X] QA'd all new/changed functionality manually
- [X] Started a new batch script run with one host, deleted that host,
and triggered the batch_activity_completion_checker schedule, and
verified that the batch was moved to "finished".
  - [X] Scheduled script
    - Created a new batch script run scheduled for a future date
    - Canceled that batch run
- Clicked on the batch run in the "finished" tab of the batch scripts
list
- Verified that the number of canceled hosts = the number of targeted
hosts for that batch, and all other numbers were 0.
- Verified that clicking the canceled hosts number navigated to the
correct list of canceled hosts
  - [X] "Run now" script
    - Created a new batch script run with "run now"
    - Waited for at least one host to run.
    - Canceled that batch
- Clicked on the batch run in the "finished" tab of the batch scripts
list
- Verified that the number of canceled hosts = the number that were
still pending when I canceled the script, and that the # of pending
hosts was 0
- Verified that clicking the canceled hosts number navigated to the
correct list of canceled hosts
  - [X] Multiple batches with the same hosts don't bleed into each other
- Created another batch script with the same hosts, scheduled for the
future
    - Verified that the "pending" host list is correct
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants