Add ECR report allowlist uploading to image-data-storage bucket #4006

Annaa3 · 2024-06-14T22:10:34Z

GitHub Issue #, if available:

Note:

If merging this PR should also close the associated Issue, please also add that Issue # to the Linked Issues section on the right.
All PRs are checked weekly for staleness. This PR will be closed if not updated in 30 days.

Description

Add uploading of the ECR report allowlist to an S3 bucket in the ECR scanning portion of the sanity tests. Scanning dashboards use the uploaded allowlist to identify allowlisted ECR vulnerabilities. Checks are added so that the upload runs only during the test phase and only in the build pipelines.

Tests run

NOTE: By default, docker builds are disabled. In order to build your container, please update dlc_developer_config.toml and specify the framework to build in "build_frameworks"

I have run builds/tests on commit for my changes.

NOTE: If you are creating a PR for a new framework version, please ensure success of the standard, rc, and efa sagemaker remote tests by updating the dlc_developer_config.toml file:

sagemaker_remote_tests = true
sagemaker_efa_tests = true
sagemaker_rc_tests = true

Additionally, please run the sagemaker local tests in at least one revision:

sagemaker_local_tests = true

Formatting

I have run black -l 100 on my code (formatting tool: https://black.readthedocs.io/en/stable/getting_started.html)

DLC image/dockerfile

Builds to Execute

Fill out the template and click the checkbox of the builds you'd like to execute

Note: Replace <X.Y> with the major.minor framework version (e.g. 2.2) you would like to start.

Additional context

PR Checklist

I've prepended PR tag with frameworks/job this applies to : [mxnet, tensorflow, pytorch] | [ei/neuron/graviton] | [build] | [test] | [benchmark] | [ec2, ecs, eks, sagemaker]
If the PR changes affect SM tests, I've modified dlc_developer_config.toml in my PR branch by setting sagemaker_tests = true and efa_tests = true
If this PR changes existing code, the change is fully backward compatible with pre-existing code. (Non-backward-compatible changes need special approval.)
(If applicable) I've documented below the DLC image/dockerfile this relates to
(If applicable) I've documented below the tests I've run on the DLC image
(If applicable) I've reviewed the licenses of updated and new binaries and their dependencies to make sure all licenses are on the Apache Software Foundation Third Party License Policy Category A or Category B license list. See https://www.apache.org/legal/resolved.html.
(If applicable) I've scanned the updated and new binaries to make sure they do not have vulnerabilities associated with them.

NEURON/GRAVITON Testing Checklist

When creating a PR:

I've modified dlc_developer_config.toml in my PR branch by setting neuron_mode = true or graviton_mode = true

Benchmark Testing Checklist

When creating a PR:

I've modified dlc_developer_config.toml in my PR branch by setting ec2_benchmark_tests = true or sagemaker_benchmark_tests = true

Pytest Marker Checklist

(If applicable) I have added the marker @pytest.mark.model("<model-type>") to the new tests which I have added, to specify the Deep Learning model that is used in the test (use "N/A" if the test doesn't use a model)
(If applicable) I have added the marker @pytest.mark.integration("<feature-being-tested>") to the new tests which I have added, to specify the feature that will be tested
(If applicable) I have added the marker @pytest.mark.multinode(<integer-num-nodes>) to the new tests which I have added, to specify the number of nodes used on a multi-node test
(If applicable) I have added the marker @pytest.mark.processor(<"cpu"/"gpu"/"eia"/"neuron">) to the new tests which I have added, if a test is specifically applicable to only one processor type

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license. I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

test/dlc_tests/sanity/test_ecr_scan.py

shantanutrip · 2024-06-21T00:03:22Z

test/dlc_tests/sanity/test_ecr_scan.py

@@ -272,6 +276,29 @@ def helper_function_for_leftover_vulnerabilities_from_enhanced_scanning(
        LOGGER.info(
            f"[NonPatchableVulns] [image_uri:{ecr_enhanced_repo_uri}] {json.dumps(non_patchable_vulnerabilities.vulnerability_list, cls= test_utils.EnhancedJSONEncoder)}"
        )
+
+    if is_mainline_context() and not is_generic_image():


On second thought, this helper function is also called during the build phase of the image. We want to make sure that we do not upload the data to the S3 bucket during the build phase.

The environment variable TEST_TYPE is already set when the job is a test CB job. You can check for the presence of this environment variable to detect whether it's the test phase or the build phase. You can create a method for checking whether it's the test phase over here: https://github.com/aws/deep-learning-containers/blob/master/test/test_utils/__init__.py#L681

Upload to S3 only if it's the test phase; otherwise, skip the upload.
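
A minimal sketch of the check suggested above, assuming only what the comment states: that TEST_TYPE is present in the environment for test jobs and absent during the build phase. The function name `is_test_phase` and its optional `environ` parameter are hypothetical and chosen for illustration; the helper that was actually merged is not shown in this thread.

```python
import os


def is_test_phase(environ=None):
    """Return True when TEST_TYPE is present in the environment.

    Hypothetical helper: `environ` is an illustrative parameter that lets
    callers pass a plain dict instead of the real process environment.
    """
    env = os.environ if environ is None else environ
    # Only the presence of the variable is checked, not its value.
    return "TEST_TYPE" in env
```

With a helper like this, the upload branch could be guarded by `is_test_phase()` in addition to the existing `is_mainline_context()` check.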

shantanutrip · 2024-06-21T00:04:25Z

test/dlc_tests/sanity/test_ecr_scan.py

+        image_sha = get_sha_of_an_image_from_ecr(
+            ecr_client_for_enhanced_scanning_repo, ecr_enhanced_repo_uri
+        )
+        s3_resource = boto3.resource("s3")
+        sts_client = boto3.client("sts")
+        account_id = sts_client.get_caller_identity().get("Account")
+        s3object = s3_resource.Object(
+            f"image-data-storage-{account_id}", image_sha + "/ecr_allowlist.json"
+        )
+        s3object.put(
+            Body=(
+                bytes(
+                    json.dumps(
+                        allowlist_for_daily_scans.vulnerability_list,
+                        cls=test_utils.EnhancedJSONEncoder,
+                    ).encode("UTF-8")
+                )
+            )
+        )
+        LOGGER.info(f"ECR allowlist uploaded to S3 Bucket")
+


Let's create a small function within this file to handle this.
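
One way such a function could look, as a sketch rather than the code that was merged. The bucket name pattern and object key come from the diff above; the function name `upload_allowlist_to_s3` and its parameters are hypothetical. The S3 resource is passed in as an argument so the function can be exercised without AWS credentials.

```python
import json


def upload_allowlist_to_s3(s3_resource, account_id, image_sha, vulnerability_list, encoder_cls=None):
    """Upload the allowlist as JSON and return the (bucket, key) that was written.

    Hypothetical helper: the name and signature are illustrative only.
    `s3_resource` is expected to behave like boto3.resource("s3").
    """
    bucket = f"image-data-storage-{account_id}"
    key = f"{image_sha}/ecr_allowlist.json"
    # json.dumps falls back to the default encoder when cls is None.
    body = json.dumps(vulnerability_list, cls=encoder_cls).encode("utf-8")
    s3_resource.Object(bucket, key).put(Body=body)
    return bucket, key
```

In the test file, the caller would presumably pass `boto3.resource("s3")`, the account ID from `sts_client.get_caller_identity()`, and `test_utils.EnhancedJSONEncoder` as `encoder_cls`, mirroring the inline code in the diff.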

…4006)

* testing build
* upload allowlist for daily scans to s3 bucket
* testing allowlist to s3
* fix err
* change s3 upload method, fix err
* change to ecr_enhanced_repo_uri
* change s3 file to json type
* change to building tensorflow
* PT1.13ec2 build
* PT1.13ec2 build
* PT1.13ec2 test
* change to using image-data-storage- bucket
* restored the config file
* add is_mainline_context check
* test resolve comments
* test resolve comments
* resolve comments
* add docstring and indent
* testing build & test
* testing test
* restore config file
* add mainline context check back

---------

Co-authored-by: Anna Liu <ziqili@amazon.com>

Annaa3 added the test label (Reflects file change in test folder) Jun 14, 2024

Annaa3 requested a review from a team as a code owner June 14, 2024 22:10

aws-deep-learning-containers-ci bot added the sanity (Reflects file change in dlc_tests/sanity folder) and Size:XS (Determines the size of the PR) labels Jun 14, 2024

Annaa3 force-pushed the ecr-allowlist branch from b42484c to bfd1677 June 14, 2024 23:07

aws-deep-learning-containers-ci bot added the Size:S label (Determines the size of the PR) Jun 14, 2024

Annaa3 closed this Jun 17, 2024

Annaa3 force-pushed the ecr-allowlist branch from a3f1149 to 4ace9e0 June 17, 2024 17:19

testing build

ee453dc

Annaa3 reopened this Jun 17, 2024

Anna Liu and others added 8 commits June 17, 2024 15:03

upload allowlist for daily scans to s3 bucket

98f73fc

testing allowlist to s3

a695ad9

fix err

fd978e1

change s3 upload method, fix err

ebc1f1e

change to ecr_enhanced_repo_uri

ca3ca76

Merge branch 'aws:master' into ecr-allowlist

ff20c8c

change s3 file to json type

4096f08

change to building tensorflow

d442e95

shantanutrip reviewed Jun 20, 2024

Anna Liu and others added 6 commits June 20, 2024 13:31

PT1.13ec2 build

0bbd937

PT1.13ec2 build

af17f65

PT1.13ec2 test

fea1d5e

change to using image-data-storage- bucket

747a785

Merge branch 'master' into ecr-allowlist

621ed04

restored the config file

ef313f7

Annaa3 changed the title ~~Testing ecr report allowlist upload to s3 bucket~~ Upload ecr report allowlist upload to image-data-storage bucket Jun 20, 2024

Annaa3 changed the title ~~Upload ecr report allowlist upload to image-data-storage bucket~~ Upload ECR report allowlist upload to image-data-storage bucket Jun 20, 2024

Annaa3 changed the title ~~Upload ECR report allowlist upload to image-data-storage bucket~~ Upload ECR report allowlist to image-data-storage bucket Jun 20, 2024

Annaa3 changed the title ~~Upload ECR report allowlist to image-data-storage bucket~~ Add ECR report allowlist uploading to image-data-storage bucket Jun 20, 2024

shantanutrip reviewed Jun 20, 2024

Anna Liu and others added 2 commits June 20, 2024 16:19

add is_mainline_context check

dd1ee97

Merge branch 'master' into ecr-allowlist

0fd2cb8

shantanutrip reviewed Jun 21, 2024

Anna Liu added 7 commits June 21, 2024 09:19

test resolve comments

167f1e9

test resolve comments

3a7752a

resolve comments

b909893

add docstring and indent

9f49743

testing build & test

da25746

testing test

c0953dc

restore config file

93c470b

shantanutrip previously approved these changes Jun 21, 2024

add mainline context check back

9d0e087

Annaa3 dismissed shantanutrip’s stale review via 9d0e087 June 21, 2024 22:18

shantanutrip approved these changes Jun 21, 2024

shantanutrip merged commit 3e469d2 into aws:master Jun 21, 2024
28 checks passed
