FAC-D-2023-009 - Scanning at rest #2693

Closed
3 tasks done
Tracked by #2649
jadudm opened this issue Nov 2, 2023 · 12 comments
Assignees
Labels
compliance Stuff which may relate to a specific requirement or timelines for resolution eng

Comments

@jadudm
Contributor

jadudm commented Nov 2, 2023

Tasks

@jadudm jadudm mentioned this issue Nov 2, 2023
@jadudm
Contributor Author

jadudm commented Nov 2, 2023

@jadudm jadudm added the compliance Stuff which may relate to a specific requirement or timelines for resolution label Nov 2, 2023
@danswick
Contributor

@asteel-gsa will test in preview, monitor performance, and check results against a collection of historical test data.

@jadudm jadudm changed the title FAC-D-2023-009 - 2023-11-14 FAC-D-2023-009 - Scanning at rest Jan 2, 2024
@jadudm jadudm added the eng label Jan 3, 2024
@danswick danswick assigned danswick and unassigned jadudm Feb 21, 2024
@jadudm
Contributor Author

jadudm commented Feb 26, 2024

@danswick, @asteel-gsa let's put in a GH Action/cron that runs this every six months, scheduling it to run in April and... whatever six months from April is.

We're going to have to discuss the strategy as a whole w/ GSA, but for now, we need to complete the ticket.
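For the "every six months" schedule, here is a minimal sketch of how the cron expression could be derived. The helper name `semiannual_cron` and its defaults (first day of the month, midnight) are assumptions for illustration, not anything in the repo; the output is a standard five-field cron string of the kind a scheduled workflow accepts.

```python
def semiannual_cron(first_month: int, day: int = 1, hour: int = 0, minute: int = 0) -> str:
    """Build a 5-field cron expression that fires twice a year, six months apart.

    Hypothetical helper: the name and defaults are illustrative only.
    """
    if not 1 <= first_month <= 12:
        raise ValueError("first_month must be in 1..12")
    # The month six months after first_month, wrapping around the year.
    second_month = (first_month - 1 + 6) % 12 + 1
    months = ",".join(str(m) for m in sorted((first_month, second_month)))
    # Fields: minute hour day-of-month month day-of-week
    return f"{minute} {hour} {day} {months} *"


print(semiannual_cron(4))  # 0 0 1 4,10 *
```

So starting from April, the second run lands in October.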

@jadudm
Contributor Author

jadudm commented Mar 4, 2024

@asteel-gsa, do you know if this has merged to main yet, or is this in a branch?

@asteel-gsa
Contributor

asteel-gsa commented Mar 4, 2024

@jadudm

Depends on #3448

1912 scan_at_rest FAILED Mon, 26 Feb 2024 17:45:57 UTC python manage.py scan_bucket_files_for_viruses --bucket fac-private-s3 --paths excel singleauditreport

Not sure why it failed or when it ended, so the logs are likely gone for this. Because it ran in preview for two days, we may need to run this in dev so we can actually see what happens at the very end of the scan.

Alex Steel@DESKTOP-NL4DO24 MINGW64 ~/Code/FAC (terraform-updates)
$ cf run-task gsa-fac -k 2G -m 2G --name scan_at_rest --command "python manage.py scan_bucket_files_for_viruses --bucket fac-private-s3 --paths excel"
Creating task for app gsa-fac in org gsa-tts-oros-fac / space preview as alexander.steel@gsa.gov...
Task has been submitted successfully for execution.
OK

task name:   scan_at_rest
task id:     1923

Alex Steel@DESKTOP-NL4DO24 MINGW64 ~/Code/FAC (terraform-updates)
$ cf logs gsa-fac | grep "scan_at_rest"
   2024-03-04T08:49:55.28-0500 [APP/TASK/scan_at_rest/0] OUT Invoking pre-start scripts.
   2024-03-04T08:49:55.50-0500 [APP/TASK/scan_at_rest/0] OUT STARTUP LOCAL_ENV Environment set as: PREVIEW
   2024-03-04T08:49:55.80-0500 [APP/TASK/scan_at_rest/0] OUT STARTUP STARTUP_CHECK setup_env PASS
   2024-03-04T08:49:55.80-0500 [APP/TASK/scan_at_rest/0] OUT Invoking start command.
   2024-03-04T08:51:07.80-0500 [APP/TASK/scan_at_rest/0] ERR {"message": "SCAN OK: PATH excel COUNT passed: 310, failed: 0"}
   2024-03-04T08:51:08.07-0500 [APP/TASK/scan_at_rest/0] OUT Exit status 0

Will update with the SAR scan as well when it completes.

@jadudm
Contributor Author

jadudm commented Mar 4, 2024

Hm. We may also want to think about how it reports things, how it keeps track of what it has run, etc... :/

@asteel-gsa
Contributor

asteel-gsa commented Mar 4, 2024

I am fine merging it as is, just to say it is running. However, I am not keen on the idea of watching it for x months 😆

It would probably be nice to have a logger.info(f"SCAN PASS: {object}") statement as well, though that may clutter the logs, so...

The excel bucket is fine; however, I see a lot of "Read timed out. (read timeout=15)" errors. I don't see anything in the code to exit 1 either, so I am rerunning the singleauditreport scan in preview atm. Considering it will run for 2 days, we probably shouldn't deploy anything else to preview, so that I can check the logs to see why the task failed. (At this point, my only guess is that the task runner crashed.)
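To make the two ideas above concrete (a per-object pass line that doesn't clutter the logs, and a non-zero exit when anything fails), here is a rough sketch. It is not the existing management command: `scan_keys`, `exit_code`, and the `scan_one` callable are hypothetical names, and `scan_one` stands in for whatever actually sends a file to the scanner.

```python
import logging
from typing import Callable, Iterable

logger = logging.getLogger("scan_at_rest")


def scan_keys(
    keys: Iterable[str],
    scan_one: Callable[[str], bool],
    verbose: bool = False,
) -> tuple[int, int]:
    """Scan each key and return (passed, failed).

    scan_one is a hypothetical callable that returns True for a clean file.
    Any exception it raises (for example a read timeout) counts as a failure.
    """
    passed = failed = 0
    for key in keys:
        try:
            clean = scan_one(key)
        except Exception as exc:
            logger.error("SCAN ERROR: %s (%s)", key, exc)
            failed += 1
            continue
        if clean:
            passed += 1
            # Pass lines go to DEBUG unless verbose is set, to limit log clutter.
            logger.log(logging.INFO if verbose else logging.DEBUG, "SCAN PASS: %s", key)
        else:
            failed += 1
            logger.error("SCAN FAIL: %s", key)
    return passed, failed


def exit_code(failed: int) -> int:
    """Return 1 when anything failed so the task shows up as FAILED, else 0."""
    return 1 if failed else 0


# Tiny demo with made-up keys and results; iterating a dict yields its keys.
demo = {"excel/a.xlsx": True, "excel/b.xlsx": False}
passed, failed = scan_keys(demo, demo.__getitem__)
logger.info("SCAN DONE: passed: %d, failed: %d", passed, failed)
```

A command built this way could finish with `sys.exit(exit_code(failed))`, so a scan full of timeouts would not report exit status 0.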

@asteel-gsa
Contributor

@jadudm I think we need to look at this a bit further. Even in preview this operation is so long-running that by the time I realize the task has failed and check the logs, they are gone.

It has a lot of trouble combing through ${bucket}/singleauditreport/, with an incredible number of failed scans, so we might want to raise the read timeout to 30s or more.
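A sketch of what a longer timeout plus a couple of retries might look like. Everything here is an assumption: `scan_with_retry` is a made-up name, and `scan_one(key, timeout)` is a hypothetical callable standing in for the real scanner call. The "Read timed out. (read timeout=15)" text looks like the message urllib3 produces; if the command goes through `requests`, the exception to catch would be `requests.exceptions.Timeout` rather than the built-in `TimeoutError` used below.

```python
import time
from typing import Callable


def scan_with_retry(
    key: str,
    scan_one: Callable[[str, float], bool],
    timeout: float = 30.0,
    attempts: int = 3,
    backoff: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Call scan_one(key, timeout), retrying when it raises TimeoutError.

    The timeout doubles after each timed-out attempt. Once the attempts
    are used up, the last TimeoutError is re-raised to the caller.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return scan_one(key, timeout)
        except TimeoutError:
            if attempt == attempts:
                raise
            # Wait a little longer before each retry, then allow more time.
            sleep(backoff * attempt)
            timeout *= 2
    raise AssertionError("unreachable")  # the loop always returns or raises
```

The `sleep` parameter exists only so the wait can be swapped out in tests.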

@jadudm
Contributor Author

jadudm commented Mar 7, 2024

It is probably worse than it appears.

It may be that the smart thing is to just have the convo about scanning inbound and outbound, as opposed to trying to solve this problem. Otherwise, we're going to need a whole database/logging framework for tracking how often we've scanned all 700K+ files.

:/

More TBD.
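For a sense of scale, the smallest version of that tracking might look something like the sketch below: one table keyed by object path with the time and result of the last scan, plus a query for what is due. This is only an illustration using SQLite from the standard library; the table layout and the function names (`open_tracker`, `record_scan`, `keys_due`) are all hypothetical, and a real version would presumably live in the app's own database.

```python
import sqlite3
from datetime import datetime, timedelta, timezone


def open_tracker(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the tracking table. Defaults to an in-memory database."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS scans ("
        "key TEXT PRIMARY KEY, last_scanned REAL NOT NULL, result TEXT NOT NULL)"
    )
    return conn


def record_scan(conn: sqlite3.Connection, key: str, result: str, when: datetime) -> None:
    """Store the latest scan for a key, replacing any earlier row."""
    conn.execute(
        "INSERT OR REPLACE INTO scans (key, last_scanned, result) VALUES (?, ?, ?)",
        (key, when.timestamp(), result),
    )
    conn.commit()


def keys_due(
    conn: sqlite3.Connection, all_keys: list[str], now: datetime, max_age: timedelta
) -> list[str]:
    """Return keys never scanned, or last scanned more than max_age ago."""
    cutoff = (now - max_age).timestamp()
    fresh = {
        row[0]
        for row in conn.execute("SELECT key FROM scans WHERE last_scanned >= ?", (cutoff,))
    }
    return [k for k in all_keys if k not in fresh]


# Demo with made-up keys: "a" was scanned recently, "b" long ago, "c" never.
now = datetime(2024, 3, 7, tzinfo=timezone.utc)
conn = open_tracker()
record_scan(conn, "a", "CLEAN", now - timedelta(days=10))
record_scan(conn, "b", "CLEAN", now - timedelta(days=200))
due = keys_due(conn, ["a", "b", "c"], now, timedelta(days=180))
```

Even this toy version says nothing about retries, partial runs, or reporting, which is part of why it may not be worth building.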

@danswick
Contributor

danswick commented May 3, 2024

MVP work is being tracked here: #3758. I'm going to move this issue to "Blocked" until the MVP is done, mostly so it's clear which of these issues is active.

@asteel-gsa
Contributor

This should be satisfied by the completion of #3758.
@timoballard will be configuring New Relic alerts, and I can support if needed to wrap this up.

@timoballard timoballard self-assigned this May 16, 2024
@asteel-gsa
Contributor

As of the 2024-05-16 release, production files are now being scanned, with the logs being sent to New Relic.

$ cf logs fac-file-scanner
Retrieving logs for app fac-file-scanner in org gsa-tts-oros-fac / space production as alexander.steel@gsa.gov...

   2024-05-16T14:15:35.80-0400 [APP/PROC/WEB/0] OUT [2024-05-16 18:15:35,806] INFO in app: singleauditreport/2016-01-CENSUS-0000003259.pdf scan result: ScanResult.CLEAN
   2024-05-16T14:15:37.09-0400 [APP/PROC/WEB/0] OUT [2024-05-16 18:15:37,089] INFO in app: singleauditreport/2016-01-CENSUS-0000019833.pdf scan result: ScanResult.CLEAN
   2024-05-16T14:15:38.42-0400 [APP/PROC/WEB/0] OUT [2024-05-16 18:15:38,426] INFO in app: singleauditreport/2016-01-CENSUS-0000019890.pdf scan result: ScanResult.CLEAN
   2024-05-16T14:15:39.66-0400 [APP/PROC/WEB/0] OUT [2024-05-16 18:15:39,663] INFO in app: singleauditreport/2016-01-CENSUS-0000020971.pdf scan result: ScanResult.CLEAN
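
If it helps with the alerting side, log lines in the format above can be tallied with a small parser like this one. It is a sketch only: the regex is written against the four lines shown here, `tally` is a hypothetical helper, and `ScanResult.CLEAN` is the only result value that appears in this thread, so anything else is simply treated as "not clean".

```python
import re
from collections import Counter
from typing import Iterable

# Matches e.g. "... INFO in app: <key> scan result: ScanResult.CLEAN"
LINE_RE = re.compile(r"INFO in app: (?P<key>.+?) scan result: ScanResult\.(?P<result>\w+)")


def tally(lines: Iterable[str]) -> tuple[Counter, list[str]]:
    """Count scan results per value and collect keys whose result is not CLEAN."""
    counts: Counter = Counter()
    flagged: list[str] = []
    for line in lines:
        match = LINE_RE.search(line)
        if match is None:
            continue  # not a scan-result line
        counts[match["result"]] += 1
        if match["result"] != "CLEAN":
            flagged.append(match["key"])
    return counts, flagged


sample = [
    "2024-05-16T14:15:35.80-0400 [APP/PROC/WEB/0] OUT [2024-05-16 18:15:35,806] "
    "INFO in app: singleauditreport/2016-01-CENSUS-0000003259.pdf scan result: ScanResult.CLEAN",
    "2024-05-16T14:15:37.09-0400 [APP/PROC/WEB/0] OUT [2024-05-16 18:15:37,089] "
    "INFO in app: singleauditreport/2016-01-CENSUS-0000019833.pdf scan result: ScanResult.CLEAN",
]
counts, flagged = tally(sample)
```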

Projects
Status: Done

4 participants