Skip to content

High MySQL reader load on GET /hosts with device_mapping=true + search query #47722

Description

@lucasmrod

Fleet versions

  • Discovered: 4.86.0
  • Reproduced: 4.86.0 (code paths unchanged since 4.84.0)

Web browser and operating system: N/A (server-side / API)


💥 Actual behavior

Requests of the form:

GET /api/v1/fleet/hosts?device_mapping=true&page=1&per_page=100&query=john.doe%40example.com

cause high load on the MySQL reader on instances with ~10k hosts. Each page load runs an expensive aggregation over the entire host_emails table, even though only 100 rows are returned.

Root cause — in server/datastore/mysql/hosts.go, when device_mapping=true, applyHostFilters adds a LEFT JOIN on a derived table with GROUP BY (hosts.go:1289):

LEFT JOIN (
    SELECT host_id,
           CONCAT('[', GROUP_CONCAT(JSON_OBJECT('email', email, 'source', ...)), ']') AS device_mapping
    FROM host_emails
    GROUP BY host_id
) dm ON dm.host_id = h.id

Because this derived table has a GROUP BY, MySQL must materialize it in full — aggregating every row of host_emails for all hosts into a temp table — before the join and the outer LIMIT 100 can be applied. The outer WHERE/LIMIT cannot be pushed into it. The full cost is paid on every page request regardless of result size.

Two factors compound it:

  1. The count query pays the same cost. CountHosts (hosts.go:2048) reuses the same options with DeviceMapping=true, so the derived join is added to the SELECT count(*) statement too, even though device_mapping is never selected for counting. With per_page set (pagination metadata included), each page load runs both the list query and the count query, materializing the full aggregation twice.

  2. The search term triggers a second pass over host_emails. hostSearchLike (server/datastore/mysql/mysql.go:1166) appends, for any non-empty query:

    OR EXISTS (SELECT 1 FROM host_emails he WHERE he.host_id = h.id AND he.email LIKE '%john.doe@example.com%')

    The leading % wildcard prevents use of idx_host_emails_email, falling back to a per-host lookup on host_id. (Broadened from "only when the term looks like an email" to "always" in PR Improve host search by end user email address #40197, shipped 4.84.0.)

The materialized GROUP_CONCAT derived table is the dominant load source.

🛠️ Expected behavior

Listing hosts with device_mapping=true and a search query should scale with the size of the returned page (~100 rows), not with the total number of hosts/emails in the instance. It should not place disproportionate load on the MySQL reader.

🧑‍💻 Steps to reproduce

These steps:

  • Have been confirmed to consistently lead to reproduction in multiple Fleet instances.
  • Describe the workflow that led to the error, but have not yet been reproduced in multiple Fleet instances.
  1. Have a Fleet instance with ~10k hosts and populated host_emails (e.g. IdP / end-user email mapping).
  2. Call GET /api/v1/fleet/hosts?device_mapping=true&page=1&per_page=100&query=<some-email> (e.g. from the Hosts page with a search term while device mapping is requested).
  3. Observe elevated MySQL reader load / slow query. EXPLAIN shows materialization of the host_emails GROUP BY derived table.

🕯️ More info (optional)

Proposed fix (two independent, low-risk changes):

  1. Skip the device_mapping join in the count query — set opt.DeviceMapping = false in CountHosts (the column is never counted), mirroring the existing DisableIssues handling.

  2. Replace the derived-table join with a SELECT-list correlated subquery (only when opt.DeviceMapping), so it is evaluated only for the ~100 rows actually returned, each as an indexed lookup on idx_host_emails_host_id_email:

    COALESCE((SELECT CONCAT('[', GROUP_CONCAT(JSON_OBJECT('email', email, 'source', ...)), ']')
              FROM host_emails he WHERE he.host_id = h.id), 'null') AS device_mapping

    This matches the existing pattern already used for host_additional in the same query (hosts.go:1164).

Validate with EXPLAIN ANALYZE on a ~10k-host dataset before/after.

Metadata

Metadata

Assignees

Labels

#g-orchestrationOrchestration product group:releaseReady to write code. Scheduled in a release. See "Making changes" in handbook.bugSomething isn't working as documentedcustomer-shackleton

Type

No type

Fields

No fields configured for issues without a type.

Projects

Status
✔️Awaiting QA

Milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions