-
Notifications
You must be signed in to change notification settings - Fork 1.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix: packaging of built artifacts #1789
Conversation
f64f1c8
to
f2e104d
Compare
Is there a way we can integ test this? It will help guard against regressions in the future (especially as this has been a pain point for customers for some time now). |
d495cfa
to
8dc4b44
Compare
|
||
def file_checksum(file_name): | ||
""" | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you explain here it is specifically used with sam deploy
and might potentially be back-compat issues if we break this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Makes sense, I can add a series of comments there.
md5 checksum of the directory. | ||
|
||
""" | ||
md5_dir = hashlib.md5() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is md5 the right one? I am not sure if md5 is stable enough or has issues.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
for checksum purposes it should be fine (I dont think its cryptographically secure though), we can also look at sha256 as well.
""" | ||
md5_dir = hashlib.md5() | ||
# Walk through given directory and find all directories and files. | ||
for dirpath, _, filenames in os.walk(directory, followlinks=followlinks): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you have to catch circular links?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure how os.walk handles that, can look into it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tried to create a series of files that had a circular symlink reference.
~/w/ap/py37/s/.aws-sam/build/HelloWorldFunction ▓▒░ ls -lrth a ░▒▓ ✔ | py3.7 Py | 12:46:18
lrwxr-xr-x 1 srirammv ANT\Domain Users 1B Feb 12 12:46 a -> c
~/w/ap/py37/s/.aws-sam/build/HelloWorldFunction ▓▒░ ls -lrth b ░▒▓ ✔ | py3.7 Py | 12:46:23
lrwxr-xr-x 1 srirammv ANT\Domain Users 1B Feb 12 12:46 b -> a
~/w/ap/py37/s/.aws-sam/build/HelloWorldFunction ▓▒░ ls -lrth c ░▒▓ ✔ | py3.7 Py | 12:46:25
lrwxr-xr-x 1 srirammv ANT\Domain Users 1B Feb 12 12:46 c -> b
Get this error:
Error: Unable to upload artifact HelloWorldFunction referenced by CodeUri parameter of HelloWorldFunction resource.
[Errno 62] Too many levels of symbolic links: '/Users/srirammv/workspace/apps/py37_app/sam-app/.aws-sam/build/HelloWorldFunction/a'
So Looks like circular references are taken care of.
samcli/lib/utils/hash.py
Outdated
# Go through every file in the directory and sub-directory. | ||
for filepath in [os.path.join(dirpath, filename) for filename in filenames]: | ||
# Encode file's checksum to be utf-8 and bytes. | ||
md5_dir.update(file_checksum(filepath).encode("utf-8")) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hang on, you are just checksumming the file path and not the content of the files. Is this intentional? If so, Why is this sufficient?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, I'm passing it through to the file_checksum method.
Why is this change necessary? * sam deploy of the same built artifacts can result in new uploads to s3 during the package step. How does it address the issue? * Hash of the directory is constructed rather than on the packaged zip file. What side effects does this change have? * Potentially None
* Remove dependency on 3rd party library. * Integration test to check if repeated builds dont trigger new deployments.
Why is this change necessary? * Just changing a file name should trigger a new package/deploy operation. How does it address the issue? * Checksumming on both file name and contents generates a new checksum, which will not match with a checksum where the file name was the same as before. What side effects does this change have? * None
7851c5a
to
793e0f5
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Feedback was handled and approved by jfuss
Why is this change necessary?
during the package step.
How does it address the issue?
file.
What side effects does this change have?
Checklist:
make pr
passesBy submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.