Cherry-pick #45709: Implement roaring bitmaps for historical data collection #45826

Merged: sgress454 merged 1 commit into rc-minor-fleet-v4.86.0 from sgress454/roaring-bitmaps-cp on May 19, 2026
@sgress454 (Contributor)

Cherry-pick of #45709 into the RC branch.

**Related issue:** Resolves #45715 

# Details

This PR refactors the way the charts module stores historical data to
use the [roaring bitmap](https://github.com/RoaringBitmap/roaring)
package instead of saving raw bitmaps. See [this
blurb](https://github.com/RoaringBitmap/roaring#how-does-roaring-compares-with-the-alternatives)
to learn how roaring compresses data. TL;DR: for our purposes it is a
huge improvement, especially for larger deployments where host ID
numbers may be very large. In testing, storage for some data shrank by
about 96%.
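
To illustrate why large host IDs hurt a dense bitmap, here is a minimal sketch. It assumes the dense format reserves one bit for every ID from 0 up to the largest host ID, which this description implies but does not spell out. `denseSizeBytes` and `sparseSizeEstimate` are hypothetical helpers. The estimate only models the idea behind roaring's small "array" containers (IDs grouped by their high 16 bits), not the library's real serialized size.

```go
package main

import "fmt"

// denseSizeBytes returns the size of a dense bitmap that reserves one bit
// for every ID from 0 through maxID (assumed layout, not taken from the PR).
func denseSizeBytes(maxID uint32) int {
	return int(maxID/8) + 1
}

// sparseSizeEstimate gives a rough size for a chunked representation:
// IDs are grouped by their high 16 bits, and each group stores the low
// 16 bits of its members as 2-byte values plus a small per-group header.
// IDs are assumed to be unique. This is an estimate of the general idea,
// not roaring's actual on-disk size.
func sparseSizeEstimate(ids []uint32) int {
	const headerBytes = 4 // hypothetical per-group overhead
	groups := make(map[uint32]int)
	for _, id := range ids {
		groups[id>>16]++
	}
	total := 0
	for _, n := range groups {
		total += headerBytes + 2*n
	}
	return total
}

func main() {
	// Three hosts, one of which has a large ID.
	ids := []uint32{5, 42, 1_000_000}
	fmt.Println("dense bytes:", denseSizeBytes(1_000_000))
	fmt.Println("sparse estimate:", sparseSizeEstimate(ids))
}
```

The dense size depends only on the largest ID, while the chunked estimate depends on how many IDs are present, which is why sparse sets with large IDs benefit the most.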

The majority of the changes in this PR are straightforward type swaps
from `[]byte` to `*roaring.Bitmap` in vars and function signatures, and
updates to the internals of our bit math helpers to use roaring methods
instead of native bitwise AND and OR operations. I've tried to comment
on all functional changes.
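
For context, helpers that do "native" AND and OR over dense `[]byte` bitmaps might look like the sketch below. The names `denseAnd` and `denseOr` are hypothetical, and the actual helpers in the charts module are not shown in this description. With roaring, the library supplies the equivalent operations (for example `roaring.And` and `roaring.Or`, which each return a new `*roaring.Bitmap`), so the helpers no longer need to handle differing lengths by hand.

```go
package main

import "fmt"

// denseAnd returns the byte-wise AND of two dense bitmaps. Bytes past the
// end of the shorter input are treated as zero, so the result has the
// length of the shorter input.
func denseAnd(a, b []byte) []byte {
	n := len(a)
	if len(b) < n {
		n = len(b)
	}
	out := make([]byte, n)
	for i := 0; i < n; i++ {
		out[i] = a[i] & b[i]
	}
	return out
}

// denseOr returns the byte-wise OR of two dense bitmaps. The result has
// the length of the longer input.
func denseOr(a, b []byte) []byte {
	if len(a) < len(b) {
		a, b = b, a
	}
	out := make([]byte, len(a))
	copy(out, a)
	for i := range b {
		out[i] |= b[i]
	}
	return out
}

func main() {
	a := []byte{0x0F}
	b := []byte{0x3C, 0x01}
	fmt.Printf("and: %x\n", denseAnd(a, b))
	fmt.Printf("or:  %x\n", denseOr(a, b))
}
```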

Since the charts have already shipped, there will be data in the wild
in the prior "dense" format, so the code still handles dense bitmaps on
_read_ but always _writes_ roaring bitmaps. Most of the data will
therefore turn over on its own within 30 days, but I plan a follow-up PR
that transforms open rows when the cron runs, so that the turnover is
guaranteed to complete within 30 days.
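
The "read both formats, write only roaring" approach can be sketched roughly as follows. Everything here is an assumption made for illustration: the encoding constants, the function names, and the dense bit order (least significant bit first) are hypothetical and not taken from the PR. The roaring decoder is passed in as a function so the sketch needs only the standard library. In real code it would wrap the roaring package's deserialization, such as `(*roaring.Bitmap).UnmarshalBinary`.

```go
package main

import (
	"errors"
	"fmt"
)

// Hypothetical encoding markers; the real column name and values are not
// shown in this PR description.
const (
	encodingDense   = 0
	encodingRoaring = 1
)

// decodeDense lists the IDs set in a dense bitmap, assuming bit (id % 8)
// of byte (id / 8) marks host `id` (least significant bit first).
func decodeDense(data []byte) []uint32 {
	var ids []uint32
	for i, b := range data {
		for bit := 0; bit < 8; bit++ {
			if b&(1<<bit) != 0 {
				ids = append(ids, uint32(i*8+bit))
			}
		}
	}
	return ids
}

// decodeHostIDs dispatches on the stored encoding so that legacy dense
// rows stay readable while new rows use the roaring format.
func decodeHostIDs(encoding int, data []byte, decodeRoaring func([]byte) ([]uint32, error)) ([]uint32, error) {
	switch encoding {
	case encodingDense:
		return decodeDense(data), nil
	case encodingRoaring:
		return decodeRoaring(data)
	default:
		return nil, errors.New("unknown bitmap encoding")
	}
}

func main() {
	// 0x21 sets bits 0 and 5 of the first byte; 0x01 sets bit 0 of the
	// second byte, which corresponds to ID 8.
	ids, err := decodeHostIDs(encodingDense, []byte{0x21, 0x01}, nil)
	fmt.Println(ids, err)
}
```

Keeping the dispatch in one place means the write path never needs to know about the dense format, which matches the goal of letting old rows age out.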

# Checklist for submitter

If some of the following don't apply, delete the relevant line.

- [X] Changes file added for user-visible changes in `changes/`,
`orbit/changes/` or `ee/fleetd-chrome/changes`.
See [Changes
files](https://github.com/fleetdm/fleet/blob/main/docs/Contributing/guides/committing-changes.md#changes-files)
for more information.

- [X] Input data is properly validated, `SELECT *` is avoided, SQL
injection is prevented (using placeholders for values in statements), JS
inline code is prevented especially for url redirects, and untrusted
data interpolated into shell scripts/commands is validated against shell
metacharacters.

## Testing

- [X] Added/updated automated tests
  - Tests updated to accommodate the new format; existing unchanged
    tests guard against regression
- [X] QA'd all new/changed functionality manually
  - Using a tool that dumps the `host_scd_data` rows into a JSON file
    (with the keys being entity_id+date and the values being the host IDs
    on that date), compared the data from the main branch and this
    branch and confirmed they're identical
  - With a host count of ~9000, some of which have IDs over 1,000,000,
    the data storage requirements were:
     * 82,558,976 bytes for dense
     * 2,867,200 bytes for roaring (a 96% decrease)

For unreleased bug fixes in a release candidate, one of:

- [X] Confirmed that the fix is not expected to adversely impact load
test results
  - should hugely improve
- [X] Alerted the release DRI if additional load testing is needed

## Database migrations

- [X] Checked schema for all modified tables for columns that will
auto-update timestamps during migration.



## Summary by CodeRabbit

* **New Features**
  * Implemented roaring bitmaps in historical data collection to optimize
    bitmap handling for chart data aggregation
  * Added encoding support to bitmap storage schema for flexible data
    representation

@sgress454 requested a review from a team as a code owner on May 19, 2026 17:57
codecov Bot commented May 19, 2026

Codecov Report

❌ Patch coverage is 68.75000% with 70 lines in your changes missing coverage. Please review.
⚠️ Please upload report for BASE (rc-minor-fleet-v4.86.0@0f21172). Learn more about missing BASE report.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| server/chart/internal/mysql/data.go | 44.56% | 45 Missing and 6 partials ⚠️ |
| server/chart/blob.go | 89.47% | 6 Missing and 2 partials ⚠️ |
| ...les/20260518194422_AddEncodingTypeToHostSCDData.go | 57.14% | 4 Missing and 2 partials ⚠️ |
| server/chart/bootstrap/bootstrap.go | 0.00% | 3 Missing ⚠️ |
| server/chart/internal/testutils/testutils.go | 93.33% | 1 Missing and 1 partial ⚠️ |

Additional details and impacted files
@@                    Coverage Diff                    @@
##             rc-minor-fleet-v4.86.0   #45826   +/-   ##
=========================================================
  Coverage                          ?   66.78%           
=========================================================
  Files                             ?     2748           
  Lines                             ?   219689           
  Branches                          ?    10848           
=========================================================
  Hits                              ?   146716           
  Misses                            ?    59715           
  Partials                          ?    13258           

| Flag | Coverage Δ |
|---|---|
| backend | 68.60% <68.75%> (?) |
| backend-activity | 86.35% <ø> (?) |

@sgress454 merged commit 874b403 into rc-minor-fleet-v4.86.0 on May 19, 2026 (40 checks passed)
@sgress454 deleted the sgress454/roaring-bitmaps-cp branch on May 19, 2026 18:41