Add "batch script host results" API #32174

Merged
sgress454 merged 20 commits into main from sgress454/31536-batch-exec-host-details-api
Aug 27, 2025

@sgress454 sgress454 commented Aug 21, 2025

for #31536

Details

This PR adds a new API as specced in the API PR for scheduled scripts.

Checklist for submitter

If some of the following don't apply, delete the relevant line.

  • Changes file added for user-visible changes in changes/, orbit/changes/ or ee/fleetd-chrome/changes.
    See Changes files for more information.
  • Input data is properly validated, SELECT * is avoided, SQL injection is prevented (using placeholders for values in statements)

Testing

  • Added/updated automated tests

  • Where appropriate, automated tests simulate multiple hosts and test for host isolation (updates to one host's records do not affect another)

  • QA'd all new/changed functionality manually
    Ran a batch script on 100 hosts and called the API in Postman for each status, then canceled the batch and called the API again to check the canceled status.

@sgress454 sgress454 requested a review from a team as a code owner August 21, 2025 20:32
codecov Bot commented Aug 21, 2025

Codecov Report

❌ Patch coverage is 74.12587% with 37 lines in your changes missing coverage. Please review.
✅ Project coverage is 64.02%. Comparing base (dc4fd67) to head (c80ca48).
⚠️ Report is 26 commits behind head on main.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| server/service/scripts.go | 40.42% | 20 Missing and 8 partials ⚠️ |
| server/datastore/mysql/hosts.go | 90.52% | 6 Missing and 3 partials ⚠️ |

Additional details and impacted files
@@            Coverage Diff             @@
##             main   #32174      +/-   ##
==========================================
- Coverage   64.03%   64.02%   -0.01%     
==========================================
  Files        1987     1987              
  Lines      194532   194701     +169     
  Branches     6436     6436              
==========================================
+ Hits       124562   124667     +105     
- Misses      60255    60306      +51     
- Partials     9715     9728      +13     
| Flag | Coverage Δ |
|---|---|
| backend | 65.29% <74.12%> (-0.01%) ⬇️ |

@lucasmrod lucasmrod left a comment

Left some comments/questions.

Comment thread server/service/scripts.go
Comment thread server/datastore/mysql/hosts.go Outdated
Comment on lines -1209 to +1271
- batchScriptExecutionIDFilter := "TRUE"
+ batchScriptExecutionFilter := "TRUE"
Contributor Author

Renamed since this filter can include more than just a batch execution ID.

Comment on lines -1211 to +1273
- batchScriptExecutionJoin = `LEFT JOIN batch_activity_host_results bsehr ON h.id = bsehr.host_id`
- batchScriptExecutionIDFilter = `bsehr.batch_execution_id = ?`
- whereParams = append(whereParams, *opt.BatchScriptExecutionIDFilter)
- if opt.BatchScriptExecutionStatusFilter.IsValid() {
-     batchScriptExecutionJoin += ` LEFT JOIN host_script_results hsr ON bsehr.host_execution_id = hsr.execution_id`
-     switch opt.BatchScriptExecutionStatusFilter {
-     case fleet.BatchScriptExecutionRan:
-         batchScriptExecutionIDFilter += ` AND hsr.exit_code = 0`
-     case fleet.BatchScriptExecutionPending:
-         // Pending can mean "waiting for execution" or "waiting for results".
-         batchScriptExecutionJoin += ` LEFT JOIN upcoming_activities ua ON ua.execution_id = bsehr.host_execution_id`
-         batchScriptExecutionIDFilter += ` AND ((ua.execution_id IS NOT NULL) OR (hsr.host_id is NOT NULL AND hsr.exit_code IS NULL AND hsr.canceled = 0 AND bsehr.error IS NULL))`
-     case fleet.BatchScriptExecutionErrored:
-         // TODO - remove exit code condition when we split up "errored" and "failed"
-         batchScriptExecutionIDFilter += ` AND hsr.exit_code > 0`
-     case fleet.BatchScriptExecutionIncompatible:
-         batchScriptExecutionIDFilter += ` AND bsehr.error IS NOT NULL`
-     case fleet.BatchScriptExecutionCanceled:
-         batchScriptExecutionIDFilter += ` AND hsr.exit_code IS NULL AND hsr.canceled = 1`
-     }
- }
+ batchScriptExecutionJoin, batchScriptExecutionFilter, whereParams = ds.getBatchExecutionFilters(whereParams, opt)
Contributor Author

Moved the code that generates the batch script filters into a shared function so that both ListHosts() and the new ListBatchScriptHosts() can use it.
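
For readers following the refactor, here is a minimal, self-contained sketch of what such a shared filter builder might look like. It is not the merged implementation: `batchExecutionFilters`, `ListOptions` and `BatchStatus` are hypothetical stand-ins for `ds.getBatchExecutionFilters`, the host list options and the fleet status type. The SQL fragments are copied from the snippets quoted in this review (the pending fragment is the revised one quoted later in the thread), and the merged code may differ.

```go
package main

import "fmt"

// BatchStatus is a hypothetical stand-in for the fleet batch status type.
type BatchStatus string

const (
	StatusRan          BatchStatus = "ran"
	StatusPending      BatchStatus = "pending"
	StatusErrored      BatchStatus = "errored"
	StatusIncompatible BatchStatus = "incompatible"
	StatusCanceled     BatchStatus = "canceled"
)

// ListOptions is a hypothetical, trimmed-down stand-in for the host list options.
type ListOptions struct {
	BatchExecutionID *string
	Status           BatchStatus // empty means "no status filter"
}

// batchExecutionFilters returns the JOIN clause, the WHERE fragment and the
// updated placeholder params. Without a batch execution ID it returns a
// no-op filter ("TRUE") so callers can always AND it into their query.
func batchExecutionFilters(params []any, opt ListOptions) (string, string, []any) {
	if opt.BatchExecutionID == nil {
		return "", "TRUE", params
	}
	join := `LEFT JOIN batch_activity_host_results bsehr ON h.id = bsehr.host_id`
	filter := `bsehr.batch_execution_id = ?`
	// The ID is passed as a placeholder value, never interpolated into SQL.
	params = append(params, *opt.BatchExecutionID)
	if opt.Status == "" {
		return join, filter, params
	}
	join += ` LEFT JOIN host_script_results hsr ON bsehr.host_execution_id = hsr.execution_id`
	switch opt.Status {
	case StatusRan:
		filter += ` AND hsr.exit_code = 0`
	case StatusPending:
		filter += ` AND (hsr.exit_code IS NULL AND (hsr.canceled IS NULL OR hsr.canceled = 0) AND bsehr.error IS NULL)`
	case StatusErrored:
		filter += ` AND hsr.exit_code > 0`
	case StatusIncompatible:
		filter += ` AND bsehr.error IS NOT NULL`
	case StatusCanceled:
		filter += ` AND hsr.exit_code IS NULL AND hsr.canceled = 1`
	}
	return join, filter, params
}

func main() {
	id := "example-batch-id" // hypothetical value
	join, filter, params := batchExecutionFilters(nil, ListOptions{BatchExecutionID: &id, Status: StatusRan})
	fmt.Println(join)
	fmt.Println(filter)
	fmt.Println(len(params))
}
```

Returning the three values together lets both callers append the same join and filter to their own base query without duplicating the status switch.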

Comment thread server/datastore/mysql/hosts.go Outdated
lucasmrod previously approved these changes Aug 26, 2025

@sgress454 sgress454 left a comment

@lucasmrod a couple of updates after @jacobshandling tested against the front end; please take another look, and thank you.

Comment thread server/fleet/scripts.go
// if no result was received yet.
ScriptOutput string `json:"script_output_preview,omitempty" db:"output"`
// Executed at is the time the script was executed on the host (if at all).
ScriptExecutedAt *time.Time `json:"script_executed_at,omitempty" db:"updated_at"`
Contributor Author

This field is not referenced anywhere in code (so no new nil checks are needed), but handling it as a string, as I was doing, meant it did not come out as an ISO timestamp, which caused confusion on the frontend.

ELSE NULL
END as updated_at,
COALESCE(LEFT(hsr.output, 100), '') as output,
COALESCE(hsr.execution_id, '') as execution_id
Contributor Author

This was just wrong previously: I had it returning the batch execution ID rather than the host execution ID, due to misreading the API doc. I updated the tests to check this as well.

@sgress454
Contributor Author

@lucasmrod stand by, found another issue during testing.

@sgress454 sgress454 marked this pull request as draft August 27, 2025 15:44
Comment on lines +1388 to +1394
// Pending can mean "waiting for execution" or "waiting for results".
// hsr.exit_code IS NULL <- this means the script has not reported back
// (hsr.canceled IS NULL OR hsr.canceled = 0) <- this can mean the script is running, or that it hasn't been activated yet,
// but either way we haven't canceled it.
// bsehr.error IS NULL <- this means the batch script framework didn't mark this host as incompatible
// with this script run.
batchScriptExecutionFilter += ` AND (hsr.exit_code IS NULL AND (hsr.canceled IS NULL OR hsr.canceled = 0) AND bsehr.error IS NULL)`
Contributor Author

This was also wrong in previous iterations, because "pending" can mean:

  • scheduled for the future (so that there's no activity record for it at all), or
  • part of a started batch, but not the next activity for the host, or
  • actively ready to be run (or currently running) on the host

and for any of those scenarios, it can also be canceled, in which case we want to return it as "canceled" and not "pending".
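
The same rules can be read as a per-host classification. The sketch below is a hypothetical in-memory equivalent of the SQL conditions quoted in this review, using nil pointers to model SQL NULL (including the case where the LEFT JOIN finds no `host_script_results` row). The `hostRow` type, the `classify` function and the order in which the cases are checked are assumptions for illustration, not code from the PR.

```go
package main

import "fmt"

// hostRow models the nullable columns the filters look at.
type hostRow struct {
	ExitCode *int    // hsr.exit_code; nil until the script reports back
	Canceled *bool   // hsr.canceled; nil when no hsr row exists yet
	Error    *string // bsehr.error; set when the host is incompatible
}

// classify maps a row to a status string. The precedence is an assumption.
func classify(r hostRow) string {
	switch {
	case r.Error != nil:
		return "incompatible"
	case r.ExitCode != nil && *r.ExitCode == 0:
		return "ran"
	case r.ExitCode != nil && *r.ExitCode > 0:
		return "errored"
	case r.ExitCode == nil && r.Canceled != nil && *r.Canceled:
		// Canceled wins over pending in every pending scenario.
		return "canceled"
	case r.ExitCode == nil:
		// Scheduled, queued behind another activity, or running.
		return "pending"
	default:
		// Negative exit codes are not covered by the quoted fragments.
		return ""
	}
}

func main() {
	yes := true
	fmt.Println(classify(hostRow{}))               // pending
	fmt.Println(classify(hostRow{Canceled: &yes})) // canceled
}
```

The key case is the first line of `main`: a host with no result row at all has every column NULL, so a check such as `canceled = 0` alone would miss it, which is why the revised filter also accepts `hsr.canceled IS NULL`.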

Comment thread server/service/scripts.go
BatchScriptExecutions []fleet.BatchActivity `json:"batch_executions"`
Count uint `json:"count"`
Err error `json:"error,omitempty"`
Meta fleet.PaginationMetadata `json:"meta"`
Contributor Author

fixed this to align with API spec

Comment thread server/service/scripts.go
Comment on lines 1149 to 1176
  Count: uint(count), //nolint:gosec // dismiss G115
- PaginationMetadata: fleet.PaginationMetadata{
+ Meta: fleet.PaginationMetadata{
      HasNextResults:     hasNextResults,
      HasPreviousResults: hasPreviousResults,
  },
Contributor Author

fixed this to match API spec
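
As a rough illustration of the resulting response shape, the sketch below marshals a trimmed-down response with the pagination block under the `meta` key. The `paginationMetadata` and `listResponse` types and the `marshalList` helper are hypothetical, and the `has_next_results` / `has_previous_results` tags are assumed rather than taken from this thread.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// paginationMetadata is a hypothetical stand-in for fleet.PaginationMetadata;
// the JSON tags are an assumption.
type paginationMetadata struct {
	HasNextResults     bool `json:"has_next_results"`
	HasPreviousResults bool `json:"has_previous_results"`
}

// listResponse keeps only the count and pagination fields of the response.
type listResponse struct {
	Count uint               `json:"count"`
	Meta  paginationMetadata `json:"meta"`
}

// marshalList builds and serializes a response for the demonstration.
func marshalList(count uint, hasNext, hasPrev bool) string {
	b, err := json.Marshal(listResponse{
		Count: count,
		Meta: paginationMetadata{
			HasNextResults:     hasNext,
			HasPreviousResults: hasPrev,
		},
	})
	if err != nil {
		panic(err)
	}
	return string(b)
}

func main() {
	fmt.Println(marshalList(100, true, false))
	// {"count":100,"meta":{"has_next_results":true,"has_previous_results":false}}
}
```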

@sgress454 sgress454 marked this pull request as ready for review August 27, 2025 19:03
sgress454 added a commit that referenced this pull request Aug 27, 2025
for #32231

# Details

This PR adjusts the queries for listing batch scripts slightly to count
_every_ row in `batch_activities` matching the filters, regardless of
whether any `batch_activity_host_results` rows exist for it. This
handles the edge case of a batch script where all the hosts have been
deleted.

# Checklist for submitter

If some of the following don't apply, delete the relevant line.

- [X] Input data is properly validated, `SELECT *` is avoided, SQL
injection is prevented (using placeholders for values in statements)

## Testing

- [ ] Added/updated automated tests
I didn't add tests for this because these tests have already changed
quite a bit in #32174. I can add
tests in there when this merges.

- [X] QA'd all new/changed functionality manually

* Select a host in Manage Hosts, click Run Script, select a script and
do Run Now
* Delete that host
* Go to the batch scripts list (Controls -> Scripts -> Batch Progress)
* Verify that the batch script is still listed.

We don't have clear expectations for what numbers should be displayed
for the progress of a batch like this, but this PR at least ensures the
batch doesn't disappear.

For unreleased bug fixes in a release candidate, one of:

- [X] Confirmed that the fix is not expected to adversely impact load
test results
Comment thread server/datastore/mysql/hosts.go
Comment thread server/datastore/mysql/hosts.go
Co-authored-by: Lucas Manuel Rodriguez <lucas@fleetdm.com>
Co-authored-by: Lucas Manuel Rodriguez <lucas@fleetdm.com>
@sgress454 sgress454 merged commit a874984 into main Aug 27, 2025
42 checks passed
@sgress454 sgress454 deleted the sgress454/31536-batch-exec-host-details-api branch August 27, 2025 21:39
sgress454 added a commit that referenced this pull request Aug 27, 2025
> # Cherry pick from main to rc 4.73.0

