Find hosts with the most issues #18115
Add sort to the "Issues" column on the Hosts page. Update issues count = # failing critical policies + # of vulnerabilities w/ known exploits (CISA KEV)
Hey @cjwalton, this story covers the next iteration of Fleet's version of a host "risk score." Our understanding is that y'all are looking for a way to prioritize hosts that need fixing/updating/patching.

The plan is to allow y'all to sort hosts by "issues" in Fleet: # critical policies failed + # vulns w/ known exploits (from CISA KEV)

Jason: EPSS takes CISA KEV into account. Maybe let's start w/ EPSS > 70% and/or CVSS > 8

We want to start simple so that we move quickly for y'all while leaving the door open for future iterations.

I recorded a Loom video that walks through the improvement in more detail here: https://www.loom.com/share/d594151980ec47298efafb159f0e91b1?sid=c4465470-cc23-47d9-9d86-b2542898774f

What do you think?
Hey @cjwalton, based on your feedback (above) we tweaked the "Issues" count to include critical vulns (CVEs w/ CVSS score > 8.9): The plan is to start with this. In future iterations we can add the ability to customize the "Issues" count. For example:
Does that work for you?
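To make the proposed count concrete, here is a minimal sketch of how an "Issues" value could be computed and used for sorting. It assumes the count is failing critical policies plus vulnerabilities with a CVSS score above 8.9, as described in the comment above; whether CISA KEV or EPSS still factor in is not stated here. The `Host` shape and function names are hypothetical and are not Fleet's actual data model or API.

```python
from __future__ import annotations

from dataclasses import dataclass, field

# Threshold from the thread: CVEs with a CVSS score > 8.9 count as critical.
CRITICAL_CVSS_THRESHOLD = 8.9


@dataclass
class Host:
    # Hypothetical shape for illustration only; not Fleet's host model.
    hostname: str
    failing_critical_policies: int = 0
    cvss_scores: list[float] = field(default_factory=list)  # one score per vuln


def issues_count(host: Host) -> int:
    """Failing critical policies plus vulnerabilities above the CVSS threshold."""
    critical_vulns = sum(1 for s in host.cvss_scores if s > CRITICAL_CVSS_THRESHOLD)
    return host.failing_critical_policies + critical_vulns


def sort_by_issues(hosts: list[Host]) -> list[Host]:
    """Return hosts ordered from most to fewest issues."""
    return sorted(hosts, key=issues_count, reverse=True)


hosts = [
    Host("web-1", failing_critical_policies=2, cvss_scores=[9.8, 7.5]),
    Host("db-1", failing_critical_policies=0, cvss_scores=[9.1, 9.0, 8.9]),
    Host("laptop-7", failing_critical_policies=4),
    Host("kiosk-2"),  # no failing policies or vulns: count is 0, not missing
]

for host in sort_by_issues(hosts):
    print(host.hostname, issues_count(host))
```

Note that a score of exactly 8.9 is not counted, since the stated rule is strictly greater than 8.9, and that a host with nothing wrong reports `0` rather than no value.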
BE 5
@jacobshandling mentioned the scope of this for FE is probably larger than anticipated. TLDR: Looks like the device user page and the host details page use the same code. To be thorough, this might be a 5.
## Issue
Unreleased fix for #18115

## Description
- BE shows `0` count for empty state so FE needs to account for `0` instead of `undefined`

## Screenshot of fix
<img width="1219" alt="Screenshot 2024-06-18 at 5 00 04 PM" src="https://github.com/fleetdm/fleet/assets/71795832/cd6ec944-ce99-4f8e-a630-9bf037abd0b9">

# Checklist for submitter
If some of the following don't apply, delete the relevant line.
<!-- Note that API documentation changes are now addressed by the product design team. -->
- [x] Manual QA for all new/changed functionality
@xpkoala The regular QA was done by @RachelElysia. I added a few test scenarios for load testing in the description, and moved the issue back to Awaiting QA.
Thanks @getvictor!
Found an issue when modifying a policy that affects 50k+ hosts with 100k+ hosts enrolled.
Reproduce:
A 422 HTTP error is recorded in the web console.
Remaining work reset to 1 point.
Docker image being used to test this fix is 4530loadtestA
#18115 Fixing unreleased bug found when load testing host issues update.
#18115 Fixing issue seen in load test:
```
level=error ts=2024-06-25T17:09:08.230514976Z cron=vulnerabilities schedule=vulnerabilities instanceID="5boTc/PamsSp8Jsh4kiEOpECmPu+bmOAJaVX4XV7ZOG4vgO4U6peHyxH8mFQhBXYJt+roRpwNuGmUoEI8n/otg==" err="running job" details="get critical vulnerabilities count: Error 1114 (HY000): The table '/rdsdbdata/tmp/#sql127_6b4b_ad107' is full" jobID=update_host_issues_vulnerabilities_counts
```
#18115 Fixing issue seen in load test:
```
level=error ts=2024-06-25T17:09:08.230514976Z cron=vulnerabilities schedule=vulnerabilities instanceID="5boTc/PamsSp8Jsh4kiEOpECmPu+bmOAJaVX4XV7ZOG4vgO4U6peHyxH8mFQhBXYJt+roRpwNuGmUoEI8n/otg==" err="running job" details="get critical vulnerabilities count: Error 1114 (HY000): The table '/rdsdbdata/tmp/#sql127_6b4b_ad107' is full" jobID=update_host_issues_vulnerabilities_counts
```
(cherry picked from commit 918773b)
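The log above shows the database running out of temporary table space while counting critical vulnerabilities, but the thread does not say how the fix works. One common way to keep an aggregate like this from building a single large temporary table is to run it over bounded ranges of host IDs. The sketch below illustrates that general technique only. It uses SQLite so it runs standalone, and the `host_vulns` table, its columns, and the function name are all hypothetical, not Fleet's schema or code.

```python
import sqlite3


def critical_vuln_counts_in_batches(conn, batch_size=1000, min_cvss=8.9):
    """Count critical vulns per host, aggregating one host_id range at a time.

    Hypothetical schema: host_vulns(host_id, cve, cvss_score).
    Hosts with no critical vulns are simply absent from the result.
    """
    max_id = conn.execute(
        "SELECT COALESCE(MAX(host_id), 0) FROM host_vulns"
    ).fetchone()[0]

    counts = {}
    start = 1
    while start <= max_id:
        end = start + batch_size - 1
        rows = conn.execute(
            "SELECT host_id, COUNT(*) FROM host_vulns "
            "WHERE host_id BETWEEN ? AND ? AND cvss_score > ? "
            "GROUP BY host_id",
            (start, end, min_cvss),
        ).fetchall()
        counts.update(rows)  # each row is a (host_id, count) pair
        start = end + 1
    return counts


# Small in-memory demo with placeholder CVE identifiers.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE host_vulns (host_id INTEGER, cve TEXT, cvss_score REAL)")
conn.executemany(
    "INSERT INTO host_vulns VALUES (?, ?, ?)",
    [
        (1, "CVE-EXAMPLE-1", 9.8),
        (1, "CVE-EXAMPLE-2", 9.1),
        (2, "CVE-EXAMPLE-3", 5.0),
        (3, "CVE-EXAMPLE-4", 9.0),
        (5, "CVE-EXAMPLE-5", 10.0),
        (5, "CVE-EXAMPLE-6", 8.9),
    ],
)
counts = critical_vuln_counts_in_batches(conn, batch_size=2)
print(counts)
```

Each query only groups the rows for one slice of hosts, so the working set stays proportional to the batch size rather than to the whole fleet. The trade-off is more round trips to the database.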
Hey @pintomi1989, this story has shipped.
@noahtalerman There are TODOs in the issue description to resolve before moving this to closed.
API changes for the "Find hosts with the most issues" story - #18115
Docs are merged!
Sorting hosts by flaws, |
Goal
Context
Changes
Product
Engineering
QA
Risk assessment
Load testing plan
For the below scenarios, monitor latency and DB performance.
Start with 100K hosts failing a policy. Modify the SQL of that policy.
Start with 100K hosts failing a policy. Modify the platforms of that policy (like uncheck "windows").
Start with 100K hosts failing a policy. Transfer them to a different team.
Start with 100K hosts failing a policy. Delete that policy.
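When monitoring latency for the scenarios above, it can help to record per-request timings and compare percentiles before and after each change. The following is a minimal sketch of such a helper, not part of Fleet's load testing tooling. The `action` callable is a hypothetical stand-in for whatever request is being exercised (for example, the call that modifies a policy).

```python
import math
import time


def measure_latency(action, runs=20):
    """Call `action` repeatedly and return each call's duration in seconds."""
    durations = []
    for _ in range(runs):
        start = time.perf_counter()
        action()
        durations.append(time.perf_counter() - start)
    return durations


def percentile(values, pct):
    """Nearest-rank percentile: smallest value with at least pct% of data at or below it."""
    if not values:
        raise ValueError("percentile() needs at least one value")
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def summarize(durations):
    """Summarize a list of durations as p50, p95, and max."""
    return {
        "p50": percentile(durations, 50),
        "p95": percentile(durations, 95),
        "max": max(durations),
    }


# Stand-in action for demonstration; replace with the real request under test.
summary = summarize(measure_latency(lambda: sum(range(1000)), runs=10))
print(summary)
```

Tail percentiles such as p95 tend to show regressions from large batch updates more clearly than the average does.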
Testing notes
Confirmation