Test impact of turning on host_async_processing for CIS benchmarks #12697

Closed
14 tasks
zhumo opened this issue Jul 10, 2023 · 5 comments
Assignees
Labels
customer-ufa #g-endpoint-ops Endpoint ops product group :product Product Design department (shows up on 🦢 Drafting board) story A user story defining an entire feature
Milestone

Comments

@zhumo
Contributor

zhumo commented Jul 10, 2023

This issue's remaining effort can be completed in ≤1 sprint. It will be valuable even if nothing else ships.

It is planned and ready to implement. It is on the proper kanban board.

Goal

User story
As a fleet admin,
I want to know that turning on FLEET_OSQUERY_ENABLE_ASYNC_HOST_PROCESSING=policy_membership=true to enable my CIS benchmark policies will not affect the rest of my fleet experience,
so that I can protect my endpoints without affecting the stability of my fleet instance or impacting end user experience.

How to run the test

Run Fleet locally with the flag enabled. On startup you should see a debug log line containing "async enabled, starting collectors" with name=policy_membership, as in the output below:

./build/fleet serve --dev --logging_debug \
  --dev_license \
  --osquery_enable_async_host_processing "policy_membership=true" 2>&1 | tee ~/fleet.txt
[...]
level=debug ts=2023-07-10T18:36:46.115005Z cron=async_task
task="async enabled, starting collectors" name=policy_membership interval=30s jitter=10
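Since the command above tees server output to ~/fleet.txt, the startup check can be scripted rather than eyeballed. The following is a minimal sketch, not part of Fleet: the function name `async_collector_started` is hypothetical, and it assumes the task message and the `name=` field appear on the same log line, as they do in the sample output.

```shell
# Hypothetical helper (not a Fleet or fleetctl command).
# Succeeds (exit 0) if the log file contains the "async enabled" startup
# message for the given collector name, and fails (exit 1) otherwise.
# Assumes the message and the name= field are on the same line.
async_collector_started() {
  log_file="$1"
  task_name="$2"
  grep -Eq "async enabled, starting collectors.*name=${task_name}( |\$)" "$log_file"
}

# Example usage against the log captured by the command above:
#   async_collector_started ~/fleet.txt policy_membership \
#     && echo "policy_membership async collector is running"
```

If the function fails, the flag was probably not picked up, or the server was started without --logging_debug, since the line is logged at debug level.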

Metrics to capture

All features related to policies (automations, host refetching, the policies tab, and the policies tab in host details) should work exactly as they do when the flag is off, which is the default.

There shouldn't be any noticeable server-side performance change when the number of hosts is low.

What potential qualitative issues to watch out for

Pay particular attention to policy automations and host refetch, because of this note in https://github.com/fleetdm/fleet/blob/main/docs/Deploying/Configuration.md#osquery_enable_async_host_processing:

Note that currently, if both the failing policies webhook and this osquery.enable_async_host_processing option are set, some failing policies webhooks could be missing (some transitions from succeeding to failing or vice-versa could happen without triggering a webhook request).

Changes

This issue's estimation includes completing:

  • UI changes: TODO
  • CLI usage changes: TODO
  • REST API changes: TODO
  • Permissions changes: TODO
  • Database schema migrations: TODO
  • Outdated documentation changes: TODO
  • Scope transparency changes? TODO
  • Breaking changes requiring major version bump? TODO
  • Changes to paid features or tiers? TODO
  • QA complete?
  • ...

ℹ️  Please read this issue carefully and understand it. Pay special attention to UI wireframes, especially "dev notes".

Context

  • Requestor(s): _________________________

QA

Risk assessment

  • Requires load testing TODO

Risk level: Low / High TODO

Risk description: TODO

Automated:

  • Fleet: Cover / Will not cover
  • QAWolf: Cover / Will not cover

Manual testing steps

  1. Step 1
  2. Step 2
  3. Step 3

Testing notes

Confirmation

  1. Engineer (@____): Added comment to user story confirming successful completion of QA.
  2. QA (@____): Added comment to user story confirming successful completion of QA.
@zhumo zhumo added story A user story defining an entire feature customer-ufa :product Product Design department (shows up on 🦢 Drafting board) #g-endpoint-ops Endpoint ops product group labels Jul 10, 2023
@lucasmrod lucasmrod assigned xpkoala and unassigned lucasmrod Jul 10, 2023
@zhumo
Contributor Author

zhumo commented Jul 24, 2023

Hi @zayhanlon, unfortunately, we were not able to get to this work in our 6-week timeframe. Please bring this back to Feature Fest if it's still desired. Thanks!

@zhumo zhumo removed the :product Product Design department (shows up on 🦢 Drafting board) label Jul 24, 2023
@zhumo zhumo changed the title Test impact of turning on host_async_processing Test impact of turning on host_async_processing for CIS benchmarks Jul 26, 2023
@sharon-fdm sharon-fdm added :product Product Design department (shows up on 🦢 Drafting board) :release Ready to write code. Scheduled in a release. See "Making changes" in handbook. and removed :product Product Design department (shows up on 🦢 Drafting board) labels Aug 15, 2023
@xpkoala
Contributor

xpkoala commented Aug 28, 2023

This flag was enabled for over 48 hours and no issues were found with the use of policies or other functionality in the app.

@lukeheath lukeheath added this to the 4.37.0 milestone Sep 1, 2023
@lukeheath lukeheath added :product Product Design department (shows up on 🦢 Drafting board) and removed :release Ready to write code. Scheduled in a release. See "Making changes" in handbook. labels Sep 8, 2023
@noahtalerman
Member

Confirm and celebrate: @zhumo do we want docs for this? Does this PR cover it? https://github.com/fleetdm/fleet/pull/13799/files

@zhumo
Contributor Author

zhumo commented Sep 15, 2023

No, I don't think it does. Documented here: #13966

@zhumo zhumo closed this as completed Sep 20, 2023
@fleet-release
Contributor

CIS benchmarks on,
Stability remains strong,
Fleet in harmony.


7 participants