get: implement --show-url to display only url/path to remote #3156
Some subtle difference that I noticed with
So, this is currently failing:
def test_get_url_with_rev(tmp_dir, erepo_dir):
    with erepo_dir.chdir():
        erepo_dir.dvc_gen("foo", "foo")
        assert main(["get", ".", "foo", "--rev", "HEAD", "--show-url"]) == 0
dvc/command/get.py
Outdated
I guess good question here is whether or not --show-url should also become a part of repo.get. Hm... For now, we can leave it as is, since api has get_url anyways. 👍
Yes, I wanted to refactor get_url and make that part of the internals, but, I left it for now.
Though, we could consider splitting it into _get_url and _show_url here, so that we could test them separately without invoking main each time. Kinda like we do in pipeline show.
I refactored it a bit to make it like pipeline show, but, I didn't see any tests where we directly make use of Cmd* commands (we do check for isinstance in few places though). pipeline show tests also calls main.
@skshetry , looks good!
I don't like the idea of having --show-url instead of a separate command.
Especially because get is pretty ambiguous 😓 (get what?)
It could be like get --url, but colliding two verbs like get + --show feels weird.
(I know that you didn't make the decision, just advocating for something else.) (cc: @shcheklein)
Is get --show-url | wget the same as get?
@MrOutis I think the short version
@shcheklein, indeed! let's roll with
Only for HTTP remotes.
So, if I use it for S3 it won't give me a usable URL?
@MrOutis It will, but not usable with wget 🙂 I think would work though 😄 Checkmate, wget!
@efiop, do we support reading from stdin? Couldn't find any, and above command does not work, because we don't read from stdin (and, I need to change format of the output). Workaround for now is following: dvc get .. --show-url | xargs dvc get-url
The above also does not seem to work unless we use But, I have to change the output from
Good point @skshetry ! So
You mean, introducing
@skshetry I meant just leaving it as is for now, indeed 🙂
Left a few comments under the tests.
        remote_obj = _repo.cloud.get_remote(remote)
        return str(remote_obj.checksum_to_path_info(out.checksum))
    except NotDvcRepoError:
        raise UrlNotDvcRepoError(repo)
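For illustration: the URLs in this PR's example output (e.g. `https://remote.dvc.org/dataset-registry/a3/04afb96060aad90176268345e10355`) appear to place a checksum under a directory named after its first two characters. A minimal sketch of that mapping, assuming this layout holds. `checksum_to_url` is a hypothetical helper, not DVC's `checksum_to_path_info`, which may behave differently for other remote types.

```python
def checksum_to_url(remote_url, checksum):
    """Join a remote base URL and a checksum using a two-character prefix directory."""
    # Assumption: the first two hex characters name a directory and
    # the remaining characters name the file inside it.
    return "{}/{}/{}".format(remote_url.rstrip("/"), checksum[:2], checksum[2:])


# Base URL and checksum taken from the example output in the PR description.
url = checksum_to_url(
    "https://remote.dvc.org/dataset-registry",
    "a304afb96060aad90176268345e10355",
)
print(url)
```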
I'll raise this same error in _make_repo in a separate PR. Just keeping it here for now.
On Windows, `str(exc_info)` returns an escaped path, e.g.:
'<ExceptionInfo UrlNotDvcRepoError("URL \'C:\\\\Users\\\\travis\\\\AppData\\\\Local\\\\Temp\\\\pytest-of-travis\\\\pytest-0\\\\popen-gw2\\\\test_get_url_git_only_repo_thr0\' is not a dvc repository.") tblen=2>'
This commit just replaces the doubled backslashes with single ones. Bit of a hack, I know.
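The doubling can be reproduced without pytest: `repr()` of an exception whose message contains backslashes escapes each one, while `str()` does not. A minimal sketch, assuming a stand-in exception class (the real `UrlNotDvcRepoError` lives in DVC and may differ) and a made-up path:

```python
class UrlNotDvcRepoError(Exception):
    """Hypothetical stand-in for DVC's exception of the same name."""

    def __init__(self, url):
        super().__init__("URL '{}' is not a dvc repository.".format(url))


# Made-up Windows-style path, used only for illustration.
path = r"C:\Users\travis\repo"
exc = UrlNotDvcRepoError(path)

plain = str(exc)     # the message itself: single backslashes
escaped = repr(exc)  # repr() of the message: every backslash is doubled

print(plain)
print(escaped)
```

Comparing against `str(exc_info.value)` rather than `str(exc_info)` would be one way to avoid the escaping entirely, though whether that fits the test in question is not shown here.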
Minor suggestions, not required, besides that LGTM.
Co-Authored-By: Paweł Redzyński <pawelredzynski@gmail.com>
    get_parser.add_argument(
        "--show-url",
        action="store_true",
        help="Returns path/url to the location in remote for specified path",
    )
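As a rough sketch of how such a flag behaves, the snippet below builds a tiny standalone parser with `argparse`. The `prog` name and the positional arguments `url` and `path` are assumptions for illustration; this is not DVC's actual parser setup.

```python
import argparse

# Hypothetical, cut-down parser: only enough structure to show the flag.
parser = argparse.ArgumentParser(prog="dvc")
subparsers = parser.add_subparsers(dest="cmd")

get_parser = subparsers.add_parser("get")
get_parser.add_argument("url")
get_parser.add_argument("path")
get_parser.add_argument(
    "--show-url",
    action="store_true",  # flag takes no value; defaults to False when absent
    help="Returns path/url to the location in remote for specified path",
)

args = parser.parse_args(["get", ".", "foo", "--show-url"])
default_args = parser.parse_args(["get", ".", "foo"])
print(args.show_url, default_args.show_url)
```

With `action="store_true"`, the command handler can branch on `args.show_url` to print the URL instead of downloading.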
Could use the description from https://github.com/iterative/dvc/pull/3130/files
Though, looks like it is still up for discussion, so i'm sure we will modify it while working on the docs. Let's keep it as is for now 👍
Ah yeah I would change this description, its confusing. Sorry, didn't notice before merge...
Looks great! Please see one minor comment above.
@skshetry Also, need to not forget about docs 🙂
LGTM
Thanks! 🎉
@skshetry sorry didn't get to review this before merge but there are some language improvements from my side, tied to iterative/dvc.org/pull/936. Thanks
        )
        logger.info(url)
    except DvcException:
        logger.exception("failed to show url")
Capitalization: "Failed to show URL" Maybe end in period . ?
Depends on whether it's printed as part of a larger message.
@jorgeorpinel, the lowercase seems to be used everywhere in the codebase. And, because this is a logger.exception, our formatter handles adding extra information from raised exception besides this particular logging message. Notice the messages on following examples:
➜ dvc get https://github.com/schacon/cowsay install.sh --show-url
ERROR: failed to show url - URL 'https://github.com/schacon/cowsay' is not a dvc repository.
➜ dvc get . dir/file --show-url
ERROR: failed to show url - unable to find DVC-file with output 'dir/file'
`failed to show url - URL` looks strange though :) First, it shows that "the lowercase seems to be used everywhere" is not true (and it should not be: acronyms and initialisms are typically written in all capital letters to distinguish them from ordinary words, e.g. NASA, URL, DVC). Second, you have a repetition here. Repetition is probably fine if reasons vary and you can't actually control them.
@shcheklein, it's not something that I add, but, I'm utilizing the fact that something will be there. So, it's more like "(what failed) - (the reason)".
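To make the "(what failed) - (the reason)" shape concrete, here is a minimal sketch of a formatter that appends the raised exception's text to the logged message. `ReasonFormatter` and the logger name are hypothetical; DVC's real formatter is not shown in this thread and likely does more (colors, tracebacks in verbose mode).

```python
import io
import logging


class ReasonFormatter(logging.Formatter):
    """Hypothetical formatter: joins the log message and the exception text."""

    def format(self, record):
        message = record.getMessage()
        if record.exc_info:
            # exc_info is (type, value, traceback); the value carries the reason.
            reason = str(record.exc_info[1])
            message = "{} - {}".format(message, reason)
        return "{}: {}".format(record.levelname, message)


stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(ReasonFormatter())

logger = logging.getLogger("show_url_sketch")
logger.addHandler(handler)
logger.propagate = False  # keep the sketch's output out of the root logger

try:
    raise ValueError("unable to find DVC-file with output 'dir/file'")
except ValueError:
    # logger.exception logs at ERROR level and attaches the active exception.
    logger.exception("failed to show url")

output = stream.getvalue().strip()
print(output)
```

Because the reason is supplied by whatever exception was raised, the logging call itself only needs to name what failed.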
And, yes, regarding lowercase, I just looked into it and it's not everywhere for other cases of logging, but for exceptions, lowercase is used everywhere.
because this is a logger.exception, our formatter handles adding extra information
Thanks, good to know this.
lowercase seems to be used everywhere is not true...
➕ 1️⃣ No need to use bad grammar on purpose because of existing bad grammar.
failed to show url - URL it looks strange though :) ...
it's more like "(what failed) - (the reason)".
So then the reason part in this case should start directly with http://... – but that's not part of this PR...
the reason part in this case should start directly with http://... – but that's not part of this PR
OK I changed this in ec0c39a#diff-3b2745249f7139953ab4b45b382e69e9 but this whole conversation kind of opened a can of worms and that PR (#3220) grew quite a bit 😅 Hoping core team doesn't flip out.
but for exceptions, lowercase is used everywhere
Thanks for the tip. I may review some of these too 😋
UPDATE: I'm addressing my own feedback in #3220.
Closes #2994.
From the product requirements, it deviates on the following issue:
For now, it does not support showing URLs for multiple paths.
So, we can only do `dvc get <url> <path> --show-url` for a single path, so as to make it compatible with `dvc get`.
It also does not support specifying remote at the moment, so as to not introduce new flags. (get: --show-url does not support specifying remotes #3183)
It does not work with the file inside of a directory (i.e. `dir/file`). Limitation of `dvc.api.get_url()`. Requires adding directory granularity support. (api: get_url does not support links for granular file #3180)
How to show path/URL to the directory? Currently, it only shows `.dir` file. (api: get_url() returns path to `.dir` for directory #3182)
➜ dvc get . dir --show-url
../tmp.fKEyzBMWge/12/11325135bf45fe5f67efa975110a57.dir
Same is the case with `dvc get` (cc @jorgeorpinel for docs regarding `dvc.api.get_url()`).
It does not work with non-dvc repos (i.e. git only repos). So, the following won't work (api/get: cannot show url for git only repos #3181):
➜ dvc get https://github.com/schacon/cowsay install.sh --show-url
ERROR: failed to show url - URL 'https://github.com/schacon/cowsay' is not a dvc repository.
Custom revision does not work for the local repo.
Not fixing at the moment. api/get: local repo with custom rev is not supported #3179 tracks this.
Example Usage
➜ dvc get . foo --show-url
../tmp.uJVvn5ZaLY/d3/b07384d113edec49eaa6238ad5ff00
➜ dvc get https://github.com/iterative/dataset-registry.git get-started/data.xml --rev=HEAD@{5.days.ago} --show-url
https://remote.dvc.org/dataset-registry/a3/04afb96060aad90176268345e10355
EDIT: Made changes to format from `<path> <url>` previously to just `<url>`.
❗ Have you followed the guidelines in the Contributing to DVC list?
📖 Check this box if this PR does not require documentation updates, or if it does and you have created a separate PR in dvc.org with such updates (or at least opened an issue about it in that repo). Please link below to your PR (or issue) in the dvc.org repo.
cmd ref: document `get --show-url` option dvc.org#930
❌ Have you checked DeepSource, CodeClimate, and other sanity checks below? We consider their findings recommendatory and don't expect everything to be addressed. Please review them carefully and fix those that actually improve code or fix bugs.
Thank you for the contribution - we'll try to review it as soon as possible. 🙏
TODO
Edgecases