Skip to content

Conversation

@bagder
Copy link
Member

@bagder bagder commented Jan 1, 2026

A too long name is likely to cause a problem later anyway and get reported there. We don't enforce file name lengths for any other systems.

@bagder bagder added cmdline tool Windows Windows-specific labels Jan 1, 2026
@bagder bagder marked this pull request as ready for review January 1, 2026 17:25
@bagder bagder requested a review from Copilot January 1, 2026 17:25
@bagder bagder requested review from jay and vszakats January 1, 2026 17:26
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR removes maximum length validation for file names in the DOS/Windows sanitization function. The rationale is that excessively long file names will likely cause errors later in the process when actually used, and length limits are not enforced for other operating systems, making this check inconsistent.

  • Removes the max_sanitized_len variable and its platform-specific initialization logic
  • Eliminates three separate length validation checks that would return SANITIZE_ERR_INVALID_PATH for overly long names
  • Simplifies the sanitization logic by deferring length-related errors to the actual file operations

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

@testclutch
Copy link

Analysis of PR #20143 at 3e6ee7b5:

Test 1604 failed, which has NOT been flaky recently, so there could be a real issue in this PR. Note that this test has failed in 70 different CI jobs (the link just goes to one of them).

Generated by Testclutch

Copy link
Member

@vszakats vszakats left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed to leave this to the OS, as does curl on other platforms. It's tricky to do this (length may depend on OS settings, FS, etc), it's adding an extra failure mode, and couldn't think of a case when catching it early would have benefits.

This may fix the 1604 test fail:

--- a/tests/tunit/tool1604.c
+++ b/tests/tunit/tool1604.c
@@ -179,15 +179,6 @@ static CURLcode test_tool1604(const char *arg)
     { "COM56", 0,
       "COM56", SANITIZE_ERR_OK
     },
-    { "AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA"
-      "BBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBB"
-      "CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC"
-      "DDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDD"
-      "EEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEE"
-      "FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF",
-        0,
-      NULL, SANITIZE_ERR_INVALID_PATH
-    },
     { NULL, 0,
       NULL, SANITIZE_ERR_BAD_ARGUMENT
     },

(though it'd also remove the only test for SANITIZE_ERR_INVALID_PATH which may still happen in other cases.
edit: on MS-DOS mostly. On Windows there is just one remaining case, when (len == bufsize - 1). Can't grok how it can happen, and maybe returning SANITIZE_ERR_OUT_OF_MEMORY would also be OK?)

@bagder
Copy link
Member Author

bagder commented Jan 1, 2026

there is just one remaining case, when (len == bufsize - 1).

Ah right, that can't happen (anymore), since bufsize = len + 2 + 1...

@github-actions github-actions bot added the tests label Jan 1, 2026
@jay
Copy link
Member

jay commented Jan 2, 2026

This is probably ok. I think it is easier than the other PRs related to this issue.

On Windows there is just one remaining case, when (len == bufsize - 1). Can't grok how it can happen, and maybe returning SANITIZE_ERR_OUT_OF_MEMORY would also be OK?)

It's rare but it's possible that two reserved names have an underscore appended. If there's more than two reserved names (which should never happen) then len == bufsize-1 (edit: oops, len already equals bufsize-1 when two reserved names have the underscore appended. that check is only true before the third append, which should fail and the function aborts). You could call this OOM, it would be acceptable.

@vszakats
Copy link
Member

vszakats commented Jan 2, 2026

This is probably ok. I think it is easier than the other PRs related to this issue.

My PR proposal also fixes the confusing messaging for OOM,
and still applies to MS-DOS that keeps returning bad filename
errors. It's also meant as a solution to show non-libcurl errors
from other places in the code, and I think it'd be useful to keep
(I've iterated on it yesterday). Not saying it can't be improved,
but most of the code is not specific to the sanitization case.

TL;DR: I think both this and the other PR are useful.

@bagder
Copy link
Member Author

bagder commented Jan 2, 2026

len already equals bufsize-1 when two reserved names have the underscore appended

How?

Line 445:

  len = strlen(file_name);
  bufsize = len + 2 + 1;

Then nothing changes neither len nor bufsize before the check I remove in 40c8546.

@jay
Copy link
Member

jay commented Jan 2, 2026

len already equals bufsize-1 when two reserved names have the underscore appended

How?

Line 445:

  len = strlen(file_name);
  bufsize = len + 2 + 1;

Then nothing changes neither len nor bufsize before the check I remove in 40c8546.

Ouch I missed that you did that. That check cannot be removed. You can change it to an OOM check. len is incremented whenever an underscore is prepended. If more than two underscores need to be prepended to reserved names (one in the basename and one in the entire path) that should not happen, and so before prepending an underscore that check determines if two underscores are prepended (ie len == bufsize-1 because ++len ++len already happened. See code:

curl/src/tool_doswin.c

Lines 525 to 533 in 7e08d56

/* Prepend a '_' */
if(len == bufsize - 1) {
curlx_free(buffer);
return SANITIZE_ERR_INVALID_PATH;
}
memmove(p + 1, p, p_len + 1);
p[0] = '_';
++p_len;
++len;

So what I was saying is to Viktor's question you could do it like below if you don't want the PATH error code:

diff --git a/src/tool_doswin.c b/src/tool_doswin.c
index a2d6bb8..9fd369d 100644
--- a/src/tool_doswin.c
+++ b/src/tool_doswin.c
@@ -525,7 +525,7 @@ static SANITIZEcode rename_if_reserved_dos(char ** const sanitized,
     /* Prepend a '_' */
     if(len == bufsize - 1) {
       curlx_free(buffer);
-      return SANITIZE_ERR_INVALID_PATH;
+      return SANITIZE_ERR_OUT_OF_MEMORY;
     }
     memmove(p + 1, p, p_len + 1);
     p[0] = '_';

but the check is needed, even though it should not happen that more than two underscores could be prepended, I couldn't see how someone could devise a path to do that but as a sanity check we should leave it.

@bagder

This comment was marked as resolved.

@bagder
Copy link
Member Author

bagder commented Jan 2, 2026

but the check is needed

Shouldn't it just allocate enough room to handle that as well?

@jay
Copy link
Member

jay commented Jan 2, 2026

but the check is needed

Shouldn't it just allocate enough room to handle that as well?

There's no reason to have more than two underscores. Two is already improbable. Take this improbable example from test 1604 :

curl/tests/tunit/tool1604.c

Lines 153 to 155 in d057b70

{ "com1:\\com1", SANITIZE_ALLOW_PATH,
"_com1:\\_com1", SANITIZE_ERR_OK
},

Neither com1 will now be treated by Windows as a reserved name because of the prepended underscores. There's no way a third underscore is needed anywhere. As long as com1 is _com1 it is not a reserved name. Whether Windows would even accept this path is unlikely but Windows does make exceptions for reserved names for legacy reasons. Sometimes it's almost like Windows tries to compensate for software mistakes (or what I would call a mistake..) rather than doing what is documented.

The only way I see a third underscore being needed is if an attacker somehow manipulates the sanitizer in a way I can't think of. It's just not needed by Windows, and we shouldn't accept it. We can't just keep prepending underscores. In the worst case the path has two underscores added. This is why len + 2 + nul. Anything else we should consider malformed when it comes to reserved names. So yes the check should be there we don't want to write outside the buffer and I don't see the point in making the buffer bigger.

@bagder
Copy link
Member Author

bagder commented Jan 2, 2026

So yes the check should be there

So the check is there to bail out if this is the third lap in the loop?

@jay
Copy link
Member

jay commented Jan 2, 2026

So the check is there to bail out if this is the third lap in the loop?

That should not happen but yes.

@bagder
Copy link
Member Author

bagder commented Jan 2, 2026

That should not happen but yes.

We could simplify the loop condition to make it clear that it can only loop twice.

@jay
Copy link
Member

jay commented Jan 2, 2026

We could simplify the loop condition to make it clear that it can only loop twice.

I already wrote it so it only happens twice, as I mentioned earlier that removed check it is more of a sanity check. There's no problem returning SANITIZE_ERR_OUT_OF_MEMORY if you don't want the path error

  for(p = buffer; p; p = (p == buffer && buffer != base ? base : NULL)) {

@bagder
Copy link
Member Author

bagder commented Jan 2, 2026

There's no problem returning SANITIZE_ERR_OUT_OF_MEMORY if you don't want the path error

I don't mind the error code. I'm not happy with an error check that can't happen.

@jay
Copy link
Member

jay commented Jan 2, 2026

I don't mind the error code. I'm not happy with an error check that can't happen.

Fair, if it bothers you that much I will see if I can do it differently by adding a check to break after the prepend.

@jay
Copy link
Member

jay commented Jan 2, 2026

how about this

diff --git a/src/tool_doswin.c b/src/tool_doswin.c
index a2d6bb8..0c790eb 100644
--- a/src/tool_doswin.c
+++ b/src/tool_doswin.c
@@ -523,10 +523,6 @@ static SANITIZEcode rename_if_reserved_dos(char ** const sanitized,
     p_len = strlen(p);
 
     /* Prepend a '_' */
-    if(len == bufsize - 1) {
-      curlx_free(buffer);
-      return SANITIZE_ERR_INVALID_PATH;
-    }
     memmove(p + 1, p, p_len + 1);
     p[0] = '_';
     ++p_len;
@@ -535,6 +531,8 @@ static SANITIZEcode rename_if_reserved_dos(char ** const sanitized,
     /* the basename pointer must be updated since the path has expanded */
     if(p == buffer)
       base = basename(buffer);
+    else
+      break;
   }
 
   /* This is the legacy portion from rename_if_dos_device_name that checks for

@jay
Copy link
Member

jay commented Jan 2, 2026

Alright I took a closer look and I think it's impossible for an attacker to do anything in the loop that would make it run more than twice. The basename pointer is only updated in a way that it can be used for the second iteration of the loop. So you are right. The check can be removed and does not need a different change like my last suggestion.

@bagder
Copy link
Member Author

bagder commented Jan 3, 2026

Lovely, thanks for the extra depth check!

A too long name is likely to cause a problem later anyway and get
reported there. We don't enforce file name lengths for any other
systems.

Assisted-by: Jay Satiro
Assisted-by: Viktor Szakats

Closes #20143
@bagder bagder force-pushed the bagder/doswin-remove-maxlen branch from 28303a9 to 39a98e8 Compare January 3, 2026 09:56
@bagder
Copy link
Member Author

bagder commented Jan 3, 2026

augment review

@augmentcode
Copy link

augmentcode bot commented Jan 3, 2026

🤖 Augment PR Summary

Summary: Removes the explicit maximum-length enforcement from Windows/MS-DOS filename sanitization.

Changes:

  • Drops the computed max_sanitized_len and the associated pre-/post-sanitization length checks in sanitize_file_name().
  • Removes the buffer-length guard around prepending _ in rename_if_reserved_dos() (reserved-name rewrite path).
  • Updates tunit tool1604 by deleting the “too long name” test case that previously expected SANITIZE_ERR_INVALID_PATH.

Why: Overly-long names are expected to fail later at the OS/filesystem layer, and curl does not enforce name-length limits on other platforms.

🤖 Was this summary useful? React with 👍 or 👎

Copy link

@augmentcode augmentcode bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review completed. No suggestions at this time.

Comment augment review to trigger a new review at any time.

@bagder bagder closed this in eb7f5b7 Jan 4, 2026
@bagder bagder deleted the bagder/doswin-remove-maxlen branch January 4, 2026 10:23
vszakats added a commit that referenced this pull request Jan 10, 2026
…n fails

On MS-DOS (OOM and bad filename) and Windows (OOM only).

Given the rarity of both platform and error, we make a compromise and
return an unrelated libcurl error (43) in case of a bad output filename
on MS-DOS.

After:
```
$ CURL_FN_SANITIZE_OOM=1 wine curl.exe https://curl.se/ --output out.txt
curl: (27) Out of memory

$ CURL_FN_SANITIZE_BAD=1 wine curl.exe https://curl.se/ --output out.txt
Warning: bad output filename
curl: (43) A libcurl function was given a bad argument

$ CURL_FN_SANITIZE_OOM=1 wine curl.exe https://curl.se/index.html --globoff -O
curl: (27) Out of memory

$ CURL_FN_SANITIZE_BAD=1 wine curl.exe https://curl.se/index.html --globoff -O
curl: bad output filename
curl: (43) A libcurl function was given a bad argument
```

Before:
```
$ CURL_FN_SANITIZE_OOM=1 wine curl.exe https://curl.se/ --output out.txt
Warning: bad output glob
curl: (27) Out of memory

$ CURL_FN_SANITIZE_BAD=1 wine curl.exe https://curl.se/ --output out.txt
Warning: bad output glob
curl: (3) URL using bad/illegal format or missing URL

$ CURL_FN_SANITIZE_OOM=1 wine curl.exe https://curl.se/index.html --globoff -O
curl: Failed to extract a filename from the URL to use for storage
curl: (27) Out of memory

$ CURL_FN_SANITIZE_BAD=1 wine curl.exe https://curl.se/index.html --globoff -O
curl: Failed to extract a filename from the URL to use for storage
curl: (3) URL using bad/illegal format or missing URL
```

Ref: #20116 (simpler reboot of)
Ref: #20113 #20121
Ref: 40c1748 #20198
Ref: eb7f5b7 #20143
Ref: 8c02407 #20125
Fixes #20044
Closes #20199
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Development

Successfully merging this pull request may close these issues.

4 participants