-
Notifications
You must be signed in to change notification settings - Fork 6
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Optimize snapshot and report generation threads in create_snapshots_reports_scorecard.py
#106
Optimize snapshot and report generation threads in create_snapshots_reports_scorecard.py
#106
Conversation
We now pull individual org IDs from the global list of snapshots_to_generate and pass them into the create_snapshot function instead of passing in the entire list of orgs. Each thread checks to see if the global list is empty; if it isn't, it grabs an org ID and generates a snapshot for it. If the global list is empty, the thread exits.
Also, add a new debug log statement.
This was overlooked in the past so I figured I'd fix it while I'm here.
This means using a single global list of reports_to_generate that each thread pulls from, rather than assigning a pre-determined list of reports to each thread.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Requesting a few minor changes.
It's probably also worth at least creating an issue for more gracefully handling snapshot or report generation failures. Right now I think the corresponding thread will crash and exit its thread loop and we will lose that bit of concurrency. At a minimum we should do some error handling to ensure that the thread does not exit, and eventually we probably want to put the item back onto the queue for another chance before abandoning all hope.
Co-authored-by: Shane Frasier <jeremy.frasier@gwe.cisa.dhs.gov>
Co-authored-by: Shane Frasier <jeremy.frasier@gwe.cisa.dhs.gov>
I don't believe we will lose the thread if a failure occurs during snapshot or report generation because those commands ( I think that if a failure occurs, we will end up in the same position as we are now (before this PR). Namely, we will see in the output that there was one or more failed snapshots/reports and we will have to manually attempt to generate them later. I will still create an issue to document this where we can discuss options for automatically re-attempting the failed snapshots/reports before giving up. |
Nope, I think you are correct. IGNORE ME!!! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice work!
See #107 for this. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm going to do one last pass on the logic but I wanted to throw out this feedback instead of sitting on it until I finish my review. Mostly around manual black
ening and formatting with some questions/requests interspersed.
Co-authored-by: Nick <50747025+mcdonnnj@users.noreply.github.com>
This only needs to be done once before the loop starts, not on every iteration of the loop. Co-authored-by: Nick <50747025+mcdonnnj@users.noreply.github.com>
Co-authored-by: Nick <50747025+mcdonnnj@users.noreply.github.com>
Co-authored-by: Nick <50747025+mcdonnnj@users.noreply.github.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One minor formatting thang.
Co-authored-by: Shane Frasier <jeremy.frasier@gwe.cisa.dhs.gov>
…ningful This also makes them named and structured similarly to the snapshot functions. While I was here, I tidied up the code that formulates the report generation command since it was a bit bulky before. Co-authored-by: Nick <50747025+mcdonnnj@users.noreply.github.com> Co-authored-by: Shane Frasier <jeremy.frasier@gwe.cisa.dhs.gov>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for clearing up my feedback. This LGTM with one minor quibble around a docstring.
Co-authored-by: Nick <50747025+mcdonnnj@users.noreply.github.com> Co-authored-by: Shane Frasier <jeremy.frasier@gwe.cisa.dhs.gov>
🗣 Description
This PR gives work to threads more efficiently when creating snapshots and reports in
extras/create_snapshots_reports_scorecard.py
.Note that this does not include any changes to the way third-party snapshots and reports are generated. That should be done in a later PR (see issue #60).
I also took this opportunity to improve a few logging statements that did not include the thread name. This will make the log output easier to understand.
💭 Motivation and context
The current code splits the list of snapshots/reports to be generating into a group of sub-lists and assigns each thread one of those sub-lists to work on. This PR improves upon this by enabling each thread to pull one snapshot or report (to be generated) from a single list and generating it, repeating the process until the list is empty, at which point each thread exits. This is a more efficient way to utilize the threads. It should result in a faster overall processing time by preventing some threads from having to generate multiple long-running snapshots/reports, while other threads are sitting idle because they have already completed their work.
This PR resolves #62. Since I was making related changes here, I also included changes to resolve #59.
🧪 Testing
I tested this by commenting out or not executing (via switches) parts of the script that were not relevant (e.g. pausing/resuming the commander, creating the sample report, creating third-party snapshots/reports, etc). Then I temporarily modified the script so that it ran a simple
sleep
command instead ofcyhy-snapshot
andcyhy-report
. This allowed me to run the entire script (e.g../create_snapshots_reports_scorecard.py --no-dock --no-log --no-pause cyhy-read-only scan-read-only
) and I confirmed that the threading and list consumption worked as intended.Sample test run output (sanitized and snipped for clarity; note I also limited the number of reports here to generate to 100 for a quicker test run):
I also ran a test against the entire Production set of orgs and confirmed that everything worked fine (from a thread and work consumption standpoint) with a larger set of snapshots and reports.
✅ Pre-approval checklist
✅ Post-merge checklist