-
-
Notifications
You must be signed in to change notification settings - Fork 7k
tool_doswin: remove the max length check #20143
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
This PR removes maximum length validation for file names in the DOS/Windows sanitization function. The rationale is that excessively long file names will likely cause errors later in the process when actually used, and length limits are not enforced for other operating systems, making this check inconsistent.
- Removes the
max_sanitized_lenvariable and its platform-specific initialization logic - Eliminates three separate length validation checks that would return
SANITIZE_ERR_INVALID_PATHfor overly long names - Simplifies the sanitization logic by deferring length-related errors to the actual file operations
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
|
Analysis of PR #20143 at 3e6ee7b5: Test 1604 failed, which has NOT been flaky recently, so there could be a real issue in this PR. Note that this test has failed in 70 different CI jobs (the link just goes to one of them). Generated by Testclutch |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed to leave this to the OS, as does curl on other platforms. It's tricky to do this (length may depend on OS settings, FS, etc), it's adding an extra failure mode, and couldn't think of a case when catching it early would have benefits.
This may fix the 1604 test fail:
--- a/tests/tunit/tool1604.c
+++ b/tests/tunit/tool1604.c
@@ -179,15 +179,6 @@ static CURLcode test_tool1604(const char *arg)
{ "COM56", 0,
"COM56", SANITIZE_ERR_OK
},
- { "AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA"
- "BBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBB"
- "CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC"
- "DDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDD"
- "EEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEE"
- "FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF",
- 0,
- NULL, SANITIZE_ERR_INVALID_PATH
- },
{ NULL, 0,
NULL, SANITIZE_ERR_BAD_ARGUMENT
},(though it'd also remove the only test for SANITIZE_ERR_INVALID_PATH which may still happen in other cases.
edit: on MS-DOS mostly. On Windows there is just one remaining case, when (len == bufsize - 1). Can't grok how it can happen, and maybe returning SANITIZE_ERR_OUT_OF_MEMORY would also be OK?)
Ah right, that can't happen (anymore), since |
|
This is probably ok. I think it is easier than the other PRs related to this issue.
It's rare but it's possible that two reserved names have an underscore appended. If there's more than two reserved names (which should never happen) |
My PR proposal also fixes the confusing messaging for OOM, TL;DR: I think both this and the other PR are useful. |
How? Line 445: len = strlen(file_name);
bufsize = len + 2 + 1;Then nothing changes neither |
Ouch I missed that you did that. That check cannot be removed. You can change it to an OOM check. len is incremented whenever an underscore is prepended. If more than two underscores need to be prepended to reserved names (one in the basename and one in the entire path) that should not happen, and so before prepending an underscore that check determines if two underscores are prepended (ie len == bufsize-1 because ++len ++len already happened. See code: Lines 525 to 533 in 7e08d56
So what I was saying is to Viktor's question you could do it like below if you don't want the PATH error code: diff --git a/src/tool_doswin.c b/src/tool_doswin.c
index a2d6bb8..9fd369d 100644
--- a/src/tool_doswin.c
+++ b/src/tool_doswin.c
@@ -525,7 +525,7 @@ static SANITIZEcode rename_if_reserved_dos(char ** const sanitized,
/* Prepend a '_' */
if(len == bufsize - 1) {
curlx_free(buffer);
- return SANITIZE_ERR_INVALID_PATH;
+ return SANITIZE_ERR_OUT_OF_MEMORY;
}
memmove(p + 1, p, p_len + 1);
p[0] = '_';but the check is needed, even though it should not happen that more than two underscores could be prepended, I couldn't see how someone could devise a path to do that but as a sanity check we should leave it. |
This comment was marked as resolved.
This comment was marked as resolved.
Shouldn't it just allocate enough room to handle that as well? |
There's no reason to have more than two underscores. Two is already improbable. Take this improbable example from test 1604 : Lines 153 to 155 in d057b70
Neither com1 will now be treated by Windows as a reserved name because of the prepended underscores. There's no way a third underscore is needed anywhere. As long as com1 is _com1 it is not a reserved name. Whether Windows would even accept this path is unlikely but Windows does make exceptions for reserved names for legacy reasons. Sometimes it's almost like Windows tries to compensate for software mistakes (or what I would call a mistake..) rather than doing what is documented. The only way I see a third underscore being needed is if an attacker somehow manipulates the sanitizer in a way I can't think of. It's just not needed by Windows, and we shouldn't accept it. We can't just keep prepending underscores. In the worst case the path has two underscores added. This is why len + 2 + nul. Anything else we should consider malformed when it comes to reserved names. So yes the check should be there we don't want to write outside the buffer and I don't see the point in making the buffer bigger. |
So the check is there to bail out if this is the third lap in the loop? |
That should not happen but yes. |
We could simplify the loop condition to make it clear that it can only loop twice. |
I already wrote it so it only happens twice, as I mentioned earlier that removed check it is more of a sanity check. There's no problem returning SANITIZE_ERR_OUT_OF_MEMORY if you don't want the path error for(p = buffer; p; p = (p == buffer && buffer != base ? base : NULL)) { |
I don't mind the error code. I'm not happy with an error check that can't happen. |
Fair, if it bothers you that much I will see if I can do it differently by adding a check to break after the prepend. |
|
how about this diff --git a/src/tool_doswin.c b/src/tool_doswin.c
index a2d6bb8..0c790eb 100644
--- a/src/tool_doswin.c
+++ b/src/tool_doswin.c
@@ -523,10 +523,6 @@ static SANITIZEcode rename_if_reserved_dos(char ** const sanitized,
p_len = strlen(p);
/* Prepend a '_' */
- if(len == bufsize - 1) {
- curlx_free(buffer);
- return SANITIZE_ERR_INVALID_PATH;
- }
memmove(p + 1, p, p_len + 1);
p[0] = '_';
++p_len;
@@ -535,6 +531,8 @@ static SANITIZEcode rename_if_reserved_dos(char ** const sanitized,
/* the basename pointer must be updated since the path has expanded */
if(p == buffer)
base = basename(buffer);
+ else
+ break;
}
/* This is the legacy portion from rename_if_dos_device_name that checks for |
|
Alright I took a closer look and I think it's impossible for an attacker to do anything in the loop that would make it run more than twice. The basename pointer is only updated in a way that it can be used for the second iteration of the loop. So you are right. The check can be removed and does not need a different change like my last suggestion. |
|
Lovely, thanks for the extra depth check! |
A too long name is likely to cause a problem later anyway and get reported there. We don't enforce file name lengths for any other systems. Assisted-by: Jay Satiro Assisted-by: Viktor Szakats Closes #20143
28303a9 to
39a98e8
Compare
|
augment review |
🤖 Augment PR SummarySummary: Removes the explicit maximum-length enforcement from Windows/MS-DOS filename sanitization. Changes:
Why: Overly-long names are expected to fail later at the OS/filesystem layer, and curl does not enforce name-length limits on other platforms. 🤖 Was this summary useful? React with 👍 or 👎 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Review completed. No suggestions at this time.
Comment augment review to trigger a new review at any time.
…n fails On MS-DOS (OOM and bad filename) and Windows (OOM only). Given the rarity of both platform and error, we make a compromise and return an unrelated libcurl error (43) in case of a bad output filename on MS-DOS. After: ``` $ CURL_FN_SANITIZE_OOM=1 wine curl.exe https://curl.se/ --output out.txt curl: (27) Out of memory $ CURL_FN_SANITIZE_BAD=1 wine curl.exe https://curl.se/ --output out.txt Warning: bad output filename curl: (43) A libcurl function was given a bad argument $ CURL_FN_SANITIZE_OOM=1 wine curl.exe https://curl.se/index.html --globoff -O curl: (27) Out of memory $ CURL_FN_SANITIZE_BAD=1 wine curl.exe https://curl.se/index.html --globoff -O curl: bad output filename curl: (43) A libcurl function was given a bad argument ``` Before: ``` $ CURL_FN_SANITIZE_OOM=1 wine curl.exe https://curl.se/ --output out.txt Warning: bad output glob curl: (27) Out of memory $ CURL_FN_SANITIZE_BAD=1 wine curl.exe https://curl.se/ --output out.txt Warning: bad output glob curl: (3) URL using bad/illegal format or missing URL $ CURL_FN_SANITIZE_OOM=1 wine curl.exe https://curl.se/index.html --globoff -O curl: Failed to extract a filename from the URL to use for storage curl: (27) Out of memory $ CURL_FN_SANITIZE_BAD=1 wine curl.exe https://curl.se/index.html --globoff -O curl: Failed to extract a filename from the URL to use for storage curl: (3) URL using bad/illegal format or missing URL ``` Ref: #20116 (simpler reboot of) Ref: #20113 #20121 Ref: 40c1748 #20198 Ref: eb7f5b7 #20143 Ref: 8c02407 #20125 Fixes #20044 Closes #20199
A too long name is likely to cause a problem later anyway and get reported there. We don't enforce file name lengths for any other systems.