-
Notifications
You must be signed in to change notification settings - Fork 295
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add a flag to artifact upload
to allow skipping symlinks when uploading files
#2072
Add a flag to artifact upload
to allow skipping symlinks when uploading files
#2072
Conversation
f344ae1
to
43a32d5
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
everything here LGTM, but maybe while we're here it'd be worth adding a flag that does the same thing as what --follow-symlinks
does, but with a better name, and using the existing deprecated-and-renamed-to
gear in our CLI lib. maybe calling it --glob-resolve-symlinks
or similar?
f4ba6ba
to
334f2e3
Compare
As far as I can tell, it is not used anywhere.
334f2e3
to
83994d6
Compare
…ks`; rename `upload-symlinks` to `upload-skip-symlinks`
83994d6
to
5651c4d
Compare
Thanks for fixing this! IMO the documentation could be a bit more clear on the fact that symlinks are skipped entirely, i.e. it’s not just that the symlinked-to file won’t be uploaded, but the symlink itself won’t be uploaded. Maybe something like: « Note that symlinks themselves are never uploaded because some object storages can’t represent them. —upload-skip-symlinks only controls whether the symlink targets are uploaded under the symlink’s name. If you want to upload a full directory hierarchy and preserve its structure, a tarball can be used instead. » This might be a bit long winded, but you get the idea. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
love your work 💚
TL;DR: There is a new flag
--upload-skip-symlinks
that prevents files that are symlinks from being uploaded. The flag--follow-symlinks
has been deprecated and renamed to--glob-resolve-follow-symlinks
.Description
Some users may misinterpret specifying
false
to the--follow-symlinks
flag to mean that the agent will not upload symlinks.However, this is not the case. Until this PR, the agent will always upload all files that match the glob provided to the
artifact upload
subcommand. The--follow-symlinks
flag only affected whether symlinked directories would be followed when resolving the glob.However, I think it is a reasonable expectation that
--follow-symlinks=false
WOULD prevent symlinked files from being uploaded. Indeed, on theaws s3 sync
command, that is the case.However, I think we are constrained from changing the behaviour of the existing flag. Some customers may have been uploading artifacts unaware that they are actually symlinks. As the flag is confusing, we should deprecate it. However, it is not straightforward if we should replace it with one flag that both does not follow symlinks when resolving the glob, and skips uploading files that are symlinks, or create a flag for each of these.
I decided on the former so that there will be an exact replacement for the deprecated
--follow-symlinks
flag,--glob-resolve-follow-symlinks
. The new feature of not uploading symlinks is enabled by the--upload-skip-symlink
flag.This gives the follow table
As all four possibilities are meaningfully different, I think it's justified to have 2 flags even if the backward compatibility need not be maintained.
Uploading just the symlink
The binary nature of the
--upload-skip-symlinks
flag precludes a third behaviour where instead of uploading the file behind the symlink, the agent could endeavour to preserve the symlink.While this is an interesting use case, S3 does not support symlinks. An implementation of the feature would have to choose a representation of the symlink in the object storage, and on downloading such a representation, reconstruct the symlink on the target file system. Also, the upload and download could occur on different file systems and by agents on different operating systems, each with differing levels of implementations of symlinks, exposing us to many edge cases. I think that if users want to preserve what's on the file system to that extent, they are better served by creating a tarball.
Implementation Note:
The implementation uses a call to
lstat
to determine if the file is a link. There is a previous call tostat
to determine if the file is a directory. I thought about ways to combine these, but it seems both calls need to be made to determine if a path is a symlink to a directory.