tool_operate: for -O, use a default filename when the URL has none #13988

Closed
wants to merge 2 commits into from


bagder
Member

@bagder bagder commented Jun 23, 2024

When the URL has no filename part, curl now sets the output filename to the last directory part of the URL or, if that is also missing, to default (without extension), instead of returning an error.

Add tests 690 and 691 to verify.

  • code
  • documentation
  • test cases
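
A minimal sketch of the name selection described above, for illustration only. curl's real implementation is C code in tool_operate; the function name `output_name_for_url` and the `fallback` parameter are hypothetical, and the sketch assumes that "last directory part" means the last non-empty path segment of the URL.

```python
from urllib.parse import urlsplit


def output_name_for_url(url, fallback="default"):
    """Pick an output name from the URL alone (hypothetical helper).

    Returns the last non-empty path segment, which is the filename for
    ".../dir/file.txt" and the last directory for ".../dir/". If the
    path has no segments at all, returns the fallback name.
    """
    path = urlsplit(url).path
    segments = [segment for segment in path.split("/") if segment]
    if segments:
        return segments[-1]
    return fallback


if __name__ == "__main__":
    for url in ("https://example.com/dir/file.txt",
                "https://example.com/dir/",
                "https://example.com/"):
        print(url, "->", output_name_for_url(url))
```

Because the result depends only on the URL string, the same command line always produces the same filename, whatever the server responds with.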

@github-actions github-actions bot added the tests label Jun 23, 2024
@bagder bagder added the feature-window A merge of this requires an open feature window label Jun 23, 2024
@bagder
Member Author

bagder commented Jun 23, 2024

Alternatively, we could consider picking the filename from the last directory part in the URL or similar, but that is harder to explain (also: there is not always a directory).

It cannot use the content-type or similar since it needs to be 100% deterministic based on the given URL.

Note: --no-clobber can be used fine even when the file name is picked like this.

@njdoyle

njdoyle commented Jun 24, 2024

If it were up to me, I would prefer a filename other than default. The word default, to me, is a property of something, not the thing itself. I would choose a filename like response or response_body or curl_response or something rather than default. I think these filenames would also help alleviate confusion when a user unfamiliar with this behaviour encounters this for the first time.

@bagder
Member Author

bagder commented Jun 24, 2024

Yeah, default is definitely not the best. I think I prefer curl_response out of the ones so far because:

  1. mentions curl (answers who saved it)
  2. says it is a response
  3. does not use an extension (because people will think the extension means something)

@dfandrich
Contributor

As a point of reference, wget uses the name index.html in this case, regardless of whether the Content-Type: is text/html or anything else. But if an index.html file already exists, the new file gets a name with a numeral appended, so the existing index.html is not overwritten. That is the downside of curl using a common name like index.html: the contents would get overwritten by default, which might be unexpected. Something like curl_response seems better in the curl case for this reason, because it is a name that is unlikely to exist unless written by curl, and it is clear afterward where it came from, even if it is a little ugly.
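
The numeral-appending approach mentioned above can be sketched roughly as follows. This is not wget's or curl's actual code: the helper name `first_free_name`, the `.N` suffix format, and the `limit` parameter are assumptions made for illustration, and the set of existing names stands in for a real directory lookup.

```python
def first_free_name(name, existing, limit=100):
    """Return a name that does not collide with `existing` (hypothetical helper).

    Tries `name` first, then `name.1`, `name.2`, ... up to `name.<limit>`.
    Returns None when every candidate is already taken.
    """
    if name not in existing:
        return name
    for number in range(1, limit + 1):
        candidate = f"{name}.{number}"
        if candidate not in existing:
            return candidate
    return None


if __name__ == "__main__":
    # Stand-in for the files already present in the output directory.
    present = {"index.html", "index.html.1"}
    print(first_free_name("index.html", present))
    print(first_free_name("curl_response", present))
```

With a distinctive name such as curl_response, the first candidate is usually free, which is the point made above about choosing a name that is unlikely to exist already.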

@bagder bagder changed the title tool_operate: for -O, use "default" as filename when the URL has none tool_operate: for -O, use a default filename when the URL has none Jun 25, 2024
@bagder
Member Author

bagder commented Jun 25, 2024

It was suggested that we should effectively enable --no-clobber when we pick a name this way.

I'm not sure that's a good idea even if I like the thinking.

@samueloph
Member

> It was suggested that we should effectively enable --no-clobber when we pick a name this way.

Enabling --no-clobber only for the default filename case can confuse users. For example, consider how the current manpage entry for -O would have to be rewritten to accommodate this:

The remote filename to use for saving is extracted from the given URL, nothing else,
and if it already exists it is overwritten. If you want the server to be able to
choose the filename refer to -J, --remote-header-name which can be used in addition
to this option. If the server chooses a filename and that name already exists it is
not overwritten.

This also means that if curl were to enable --no-clobber for the default filename case, it might be better to enable it in all cases, but that would be a behavior change, which could be unexpected too.

I believe users overall would still prefer --no-clobber enabled in all cases, but if that's not possible, it becomes a matter of consistent behavior.
Will it be too much for a user to remember that --no-clobber is set if -J is used OR when curl goes with the default filename? Will the user struggle to understand in which cases curl picks the default filename and fail to set --no-clobber when they want to?
This might force the user to always set either --no-clobber or --clobber explicitly in order to avoid any uncertainty.

I don't really know which one is best; I'm commenting just to show how I see the issue and hopefully help.
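
To make the consistency concern above concrete, here is a small sketch of the overwrite rule under discussion. It models only what the quoted manpage text says (a URL-derived name is overwritten, a server-chosen name is not) plus the proposed special case for the default name. The names `NameSource`, `overwrites_existing`, and `protect_default` are hypothetical and do not correspond to anything in curl's source.

```python
from enum import Enum


class NameSource(Enum):
    URL = "url"          # name extracted from the URL (-O)
    SERVER = "server"    # name chosen by the server (-J)
    DEFAULT = "default"  # fallback name when the URL has none


def overwrites_existing(source, clobber=None, protect_default=False):
    """Return True if an existing file with the chosen name is overwritten.

    clobber: True for --clobber, False for --no-clobber, None if unset.
    protect_default: models the proposal to imply --no-clobber whenever
    the default name is picked (an assumption, not merged behavior).
    """
    if clobber is not None:
        return clobber  # an explicit flag always decides
    if source is NameSource.SERVER:
        return False    # per the manpage text quoted above
    if source is NameSource.DEFAULT and protect_default:
        return False    # the special case being debated
    return True


if __name__ == "__main__":
    for source in NameSource:
        print(source.name, overwrites_existing(source, protect_default=True))
```

With `protect_default=True` the result depends on two special cases instead of one, which is the extra rule a user would have to remember unless they always pass an explicit flag.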

... or pick the last directory part from the path if available.

Instead of returning error.

Add test 690 to verify. Tests 76 and 2036 no longer apply.

Closes #13988
@BrianInglis
Contributor

Could we please use curl-response rather than curl_response to make it easier for users to type, or better, the host name, e.g. curl.se, curl-se, or curl[-.]se[-.]index, and append a media-type/MIME-type suffix, e.g. .html?
Added similar comment to @wCurl #4

Labels
cmdline tool feature-window A merge of this requires an open feature window tests

5 participants