A Django command that scans files at rest #3285

Merged
merged 6 commits into from
Jan 25, 2024

Conversation

tadhg-ohiggins

The same as #2705 but with signed commits.


#2693

This is a Django command that can do one of two things:

  1. Given a --bucket and --object, it scans that object.
  2. Given a --bucket and --path, it scans all the files at that path.

Common calling patterns will be:

  • fac scan_bucket_files_for_viruses --bucket gsa-fac-private-s3 --path singleauditreport
  • fac scan_bucket_files_for_viruses --bucket gsa-fac-private-s3 --path excel
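
For reviewers, the two modes can be sketched with plain argparse, which Django management commands also use for argument parsing. This is an illustration only, not the code in this PR: `build_parser`, `describe_mode`, the `object_key` destination, and the choice to make `--object` and `--path` mutually exclusive are all assumptions.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical stand-in for the command's argument setup.
    parser = argparse.ArgumentParser(prog="scan_bucket_files_for_viruses")
    parser.add_argument("--bucket", required=True, help="S3 bucket name")
    # Assumption: exactly one of --object / --path must be given.
    target = parser.add_mutually_exclusive_group(required=True)
    target.add_argument("--object", dest="object_key", help="scan this one object")
    target.add_argument("--path", help="scan every file under this path")
    return parser


def describe_mode(argv: list[str]) -> tuple[str, str, str]:
    """Return (mode, bucket, target) for the given command-line arguments."""
    args = build_parser().parse_args(argv)
    if args.object_key is not None:
        return ("object", args.bucket, args.object_key)
    return ("path", args.bucket, args.path)


if __name__ == "__main__":
    print(describe_mode(["--bucket", "gsa-fac-private-s3", "--path", "excel"]))
```

With this shape, passing both `--object` and `--path` is rejected by argparse before any scanning starts.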

This should be testable locally, and once merged, we can test it in dev. If it can be run manually in dev, we can then build an associated GH Action to trigger it.

This command does not write to any files, so it poses no risk to data. It does generate log messages.
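
As a rough sketch of the read-only, log-as-you-go behavior (per the commits in this PR, results are not collected into a list and the listing is paginated), the loop could look something like the following. The names `scan_pages` and `scan_one` are hypothetical, and the page shape is assumed to match what boto3's `client.get_paginator("list_objects_v2").paginate(Bucket=..., Prefix=...)` yields: dicts with an optional `"Contents"` list of `{"Key": ...}` entries. The actual scanner call is left as a callable because its interface is not shown here.

```python
import logging
from typing import Callable, Iterable, Mapping

logger = logging.getLogger(__name__)


def scan_pages(
    pages: Iterable[Mapping],
    scan_one: Callable[[str], bool],
) -> tuple[int, int]:
    """Scan every key in every page; return (scanned, failed) counts.

    `scan_one` is a hypothetical callable that returns True when the
    object is clean. Nothing is written back to the bucket.
    """
    scanned = 0
    failed = 0
    for page in pages:
        # A listing page with no objects has no "Contents" key.
        for entry in page.get("Contents", []):
            key = entry["Key"]
            scanned += 1
            try:
                clean = scan_one(key)
            except Exception:
                # Log at the point of the scan instead of collecting results.
                logger.exception("Scan errored for %s", key)
                failed += 1
                continue
            if not clean:
                logger.error("Scan failed for %s", key)
                failed += 1
    return scanned, failed
```

Keeping only two counters means memory use stays flat no matter how many objects sit under the path.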

sfv.mp4

PR checklist: submitters

  • Link to an issue if possible. If there’s no issue, describe what your branch does. Even if there is an issue, a brief description in the PR is still useful.
  • List any special steps reviewers have to follow to test the PR. For example, adding a local environment variable, creating a local test file, etc.
  • For extra credit, submit a screen recording like this one.
  • Make sure you’ve merged main into your branch shortly before creating the PR. (You should also be merging main into your branch regularly during development.)
  • Make sure you’ve accounted for any migrations. When you’re about to create the PR, bring up the application locally and then run git status | grep migrations. If there are any results, you probably need to add them to the branch for the PR. Your PR should have only one new migration file for each of the component apps, except in rare circumstances; you may need to delete some and re-run python manage.py makemigrations to reduce the number to one. (Also, unless in exceptional circumstances, your PR should not delete any migration files.)
  • Make sure that whatever feature you’re adding has tests that cover the feature. This includes test coverage to make sure that the previous workflow still works, if applicable.
  • Make sure the full-submission.cy.js Cypress test passes, if applicable.
  • Do manual testing locally. Our tests are not good enough yet to allow us to skip this step. If that’s not applicable for some reason, check this box.
  • Verify that no Git surgery was necessary, or, if it was necessary at any point, repeat the testing after it’s finished.
  • Once a PR is merged, keep an eye on it until it’s deployed to dev, and do enough testing on dev to verify that it deployed successfully, the feature works as expected, and the happy path for the broad feature area (such as submission) still works.

PR checklist: reviewers

  • Pull the branch to your local environment and run make docker-clean; make docker-first-run && docker compose up; then run docker compose exec web /bin/bash -c "python manage.py test"
  • Manually test out the changes locally, or check this box to verify that it wasn’t applicable in this case.
  • Check that the PR has appropriate tests. Look out for changes in HTML/JS/JSON Schema logic that may need to be captured in Python tests even though the logic isn’t in Python.
  • Verify that no Git surgery is necessary at any point (such as during a merge party), or, if it was, repeat the testing after it’s finished.

The larger the PR, the stricter we should be about these points.

jadudm and others added 6 commits January 25, 2024 10:18
Given a path, it will scan everything.

Given a single object, it will scan one thing
I was passing params wrong to the SimpleUploadedFile creation.

Fixed.
…o instead log errors at the point of each scan.

github-actions bot commented Jan 25, 2024

Terraform plan for meta

No changes. Your infrastructure matches the configuration.

Terraform has compared your real infrastructure against your configuration
and found no differences, so no changes are needed.

Warning: Argument is deprecated

  with module.s3-backups.cloudfoundry_service_instance.bucket,
  on /tmp/terraform-data-dir/modules/s3-backups/s3/main.tf line 14, in resource "cloudfoundry_service_instance" "bucket":
  14:   recursive_delete = var.recursive_delete

Since CF API v3, recursive delete is always done on the cloudcontroller side.
This will be removed in future releases

✅ Plan applied in Deploy to Development and Management Environment #514


github-actions bot commented Jan 25, 2024

Terraform plan for dev

Plan: 0 to add, 1 to change, 0 to destroy.
Terraform used the selected providers to generate the following execution
plan. Resource actions are indicated with the following symbols:
!~  update in-place

Terraform will perform the following actions:

  # module.dev.module.https-proxy.cloudfoundry_app.egress_app will be updated in-place
!~  resource "cloudfoundry_app" "egress_app" {
        id                              = "a6eb3870-2769-4aa9-a8f2-f1a31939f563"
        name                            = "https-proxy"
!~      source_code_hash                = "cfddd8df06cdb61c6b5a5fe8b879a9ad5b8eeb0752a9f7d73bf4ae050184e237" -> "48b2f5eba8d1f35c0fbc2f90c2ec87cab180f5f19e7e34b16b9190fe5c5ca09a"
#        (19 unchanged attributes hidden)

#        (1 unchanged block hidden)
    }

Plan: 0 to add, 1 to change, 0 to destroy.

Warning: Argument is deprecated

  with module.dev.module.cg-logshipper.module.s3-logshipper-storage.cloudfoundry_service_instance.bucket,
  on /tmp/terraform-data-dir/modules/dev.cg-logshipper.s3-logshipper-storage/s3/main.tf line 14, in resource "cloudfoundry_service_instance" "bucket":
  14:   recursive_delete = var.recursive_delete

Since CF API v3, recursive delete is always done on the cloudcontroller side.
This will be removed in future releases

(and 3 more similar warnings elsewhere)

✅ Plan applied in Deploy to Development and Management Environment #514


File Coverage Missing
All files 87%
api/serializers.py 88% 177-178 183 188
api/test_views.py 95% 103
api/uei.py 88% 87 118-119 163 167-168
api/views.py 98% 198-199 337-338
audit/forms.py 60% 31-38 108-115
audit/intake_to_dissemination.py 87% 71-72 205-211 261 305-313
audit/test_commands.py 87%
audit/test_manage_submission_access_view.py 98% 15 19
audit/test_mixins.py 90% 112-113 117-119 184-185 189-191
audit/test_validators.py 95% 436 440 608-609 848 855 862 869
audit/test_views.py 95% 410-442 451-482 491-519
audit/test_workbooks_should_fail.py 88% 58 87-88 92
audit/test_workbooks_should_pass.py 87% 59 74-76
audit/utils.py 91% 33-35 38
audit/validators.py 94% 137 189 288-289 304-305 486-490 495-499 515-524
audit/cross_validation/additional_ueis.py 93% 33
audit/cross_validation/check_award_ref_declaration.py 90%
audit/cross_validation/check_award_reference_uniqueness.py 93%
audit/cross_validation/check_certifying_contacts.py 87%
audit/cross_validation/check_findings_count_consistency.py 91%
audit/cross_validation/check_ref_number_in_cap.py 90%
audit/cross_validation/check_ref_number_in_findings_text.py 90%
audit/cross_validation/errors.py 78% 30 69
audit/cross_validation/naming.py 93% 197
audit/cross_validation/submission_progress_check.py 95% 80
audit/cross_validation/tribal_data_sharing_consent.py 81% 33 36 40
audit/cross_validation/validate_general_information.py 93% 28-29
audit/fixtures/single_audit_checklist.py 55% 146-183 229-238
audit/intakelib/exceptions.py 71% 7-9 12
audit/intakelib/intermediate_representation.py 91% 27-28 73 91 129 200-203 212-213
audit/intakelib/mapping_audit_findings.py 97% 55
audit/intakelib/mapping_audit_findings_text.py 97% 52
audit/intakelib/mapping_federal_awards.py 93% 90
audit/intakelib/mapping_util.py 79% 21 25 29 63 99 104-105 114-120 130 145 150
audit/intakelib/checks/check_all_unique_award_numbers.py 79% 24
audit/intakelib/checks/check_cluster_names.py 75% 21-26
audit/intakelib/checks/check_cluster_total.py 95% 73
audit/intakelib/checks/check_has_all_the_named_ranges.py 95% 66
audit/intakelib/checks/check_is_a_workbook.py 69% 20
audit/intakelib/checks/check_loan_balance_entries.py 83% 29
audit/intakelib/checks/check_loan_balance_present.py 88% 30
audit/intakelib/checks/check_look_for_empty_rows.py 91% 18
audit/intakelib/checks/check_no_major_program_no_type.py 76% 18 27
audit/intakelib/checks/check_no_repeat_findings.py 76% 21 30
audit/intakelib/checks/check_other_cluster_names.py 81% 23 33
audit/intakelib/checks/check_passthrough_name_when_no_direct.py 88% 9 47
audit/intakelib/checks/check_sequential_award_numbers.py 76% 14 22
audit/intakelib/checks/check_start_and_end_rows_of_all_columns_are_same.py 89% 14
audit/intakelib/checks/check_state_cluster_names.py 81% 23 33
audit/intakelib/checks/check_version_number.py 73% 21 31-32
audit/intakelib/checks/runners.py 96% 147
audit/intakelib/common/util.py 90% 22 39
audit/intakelib/transforms/xform_rename_additional_notes_sheet.py 81% 14
audit/management/commands/load_fixtures.py 46% 39-45
audit/models/models.py 83% 57 59 64 66 216 228-231 249 426 444-445 453 475 573-574 578 586 595 601
audit/views/audit_info_form_view.py 27% 25-74 77-117 120-137
audit/views/manage_submission.py 86% 73-80
audit/views/manage_submission_access.py 98% 113-114
audit/views/pre_dissemination_download_view.py 76% 15-18 23-26 33-43
audit/views/submission_progress_view.py 96% 176-177
audit/views/tribal_data_consent.py 34% 23-41 44-79
audit/views/unlock_after_certification.py 57% 28-51 73-87
audit/views/upload_report_view.py 26% 32-35 44 91-117 120-170 178-209
audit/views/views.py 38% 71 78-97 120-121 195-196 217-229 258-259 270-271 273-277 319-332 335-349 354-367 384-390 395-415 418-446 451-480 483-527 532-552 555-583 588-617 620-664 669-681 684-694 699-711 738-739
census_historical_migration/change_record.py 95% 30 34 46
census_historical_migration/test_federal_awards_xforms.py 99% 194-195
census_historical_migration/sac_general_lib/audit_information.py 91% 24 78
census_historical_migration/sac_general_lib/cognizant_oversight.py 68% 11
census_historical_migration/sac_general_lib/general_information.py 91% 150-151 159-160 204 275
census_historical_migration/sac_general_lib/sac_creator.py 90% 34
census_historical_migration/sac_general_lib/utils.py 83% 33 60-69
census_historical_migration/transforms/xform_retrieve_uei.py 67% 10
census_historical_migration/workbooklib/additional_eins.py 84% 58-60 67-77
census_historical_migration/workbooklib/additional_ueis.py 77% 27-29 36-46
census_historical_migration/workbooklib/excel_creation_utils.py 61% 98 107-112 117-124 128-146 159-163 177-180
census_historical_migration/workbooklib/federal_awards.py 62% 139 180-188 198-223 326-408
census_historical_migration/workbooklib/findings.py 26% 41-46 92-112 119-121 132-152 165-167 174-195
census_historical_migration/workbooklib/notes_to_sefa.py 57% 34-38 96-97 99-100 102-103 125-131 139-147 155-157 164-198
config/test_settings.py 92% 33-34 49-50
config/urls.py 71% 83
dissemination/file_downloads.py 73% 35-53 83-85
dissemination/models.py 99% 469
dissemination/search.py 84% 58 88 94 123 125 128-136
dissemination/summary_reports.py 68% 268-270 274-278 355-399 424 460-462 477-484
dissemination/views.py 85% 170 174 204 252 254 256 339-343
dissemination/migrations/0002_general_fac_accepted_date.py 47% 10-12
djangooidc/backends.py 78% 32 57-63
djangooidc/exceptions.py 66% 19 21 23 28
djangooidc/oidc.py 16% 32-35 45-51 64-70 92-149 153-199 203-226 230-275 280-281 286
djangooidc/views.py 80% 22 43 114
djangooidc/tests/common.py 96%
report_submission/forms.py 92% 35
report_submission/views.py 75% 86 218-219 221 243-244 263-264 290-436 439-449
report_submission/templatetags/get_attr.py 76% 8 11-14 18
support/admin.py 88% 76 79 84 91-97 100-102
support/cog_over.py 91% 30-33 93 145
support/test_admin_api.py 80% 22 146-147 236-237 316-317
support/test_cog_over.py 98% 134-135 224
support/management/commands/seed_cog_baseline.py 98% 20-21
support/models/cog_over.py 89% 103-104
tools/update_program_data.py 89% 96
users/admin.py 99% 27
users/auth.py 96% 58-59
users/models.py 96% 18 74-75
users/fixtures/user_fixtures.py 91%

Minimum allowed coverage is 85%

Generated by 🐒 cobertura-action against 2ec5bfb

danswick added this pull request to the merge queue Jan 25, 2024
Merged via the queue into main with commit 7298e1b Jan 25, 2024
13 checks passed
danswick deleted the tadhg/jadudm-scan-files-at-rest-for-viruses-signed branch January 25, 2024 18:35
purvinptl pushed a commit that referenced this pull request Jan 25, 2024
* A Django command that scans files at rest

Given a path, it will scan everything.

Given a single object, it will scan one thing

* Passed parameters wrong

I was passing params wrong to the SimpleUploadedFile creation.

Fixed.

* Fix the empty case.

* Linted.

* Change AV scan code path to not assemble a list of all results, but to instead log errors at the point of each scan.

* Switch Django command for AV scan of S3 objects to use pagination.

---------

Co-authored-by: Matt Jadud <matthew.jadud@gsa.gov>
3 participants