Skip to content

Conversation

@bagder
Copy link
Member

@bagder bagder commented Nov 28, 2025

Let's see what fallout this makes

@icing
Copy link
Contributor

icing commented Nov 28, 2025

CI is so good, you can see what you break!

@vszakats
Copy link
Member

I guess this would break debug builds because these calls are
used from the memdebug wrappers. I think it was original reason
for using the raw allocators here.

@bagder
Copy link
Member Author

bagder commented Nov 28, 2025

So why do we need to fopen() magic in the memdebug code? It's just debug code, we can keep it simple can we not?

@vszakats
Copy link
Member

So why do we need to fopen() magic in the memdebug code? It's just debug code, we can keep it simple can we not?

It's saving files to disk, and saving files to disk involves dealing with
long filenames, unicode. On Windows simple things are not simple.

@bagder
Copy link
Member Author

bagder commented Nov 28, 2025

So why do we need to fopen() magic in the memdebug code? It's just debug code, we can keep it simple can we not?

It's saving files to disk, and saving files to disk involves dealing with long filenames, unicode. On Windows simple things are not simple.

This is debug code - we control the entire thing. We don't need to accept unicode and weird stuff here.

@vszakats
Copy link
Member

Then is creates and exception, which needs to be kept in mind and dealt all the time.

What would be the goal of using the debug/curl allocators in these low-level functions?
They have never been used there before.

@bagder
Copy link
Member Author

bagder commented Nov 28, 2025

I think my main point is that the malloc() replacement feels more important since they are used in production code, while the memdebug code is primarily for us to debug curl with and we use those in more controlled environments and it feels easy enough to apply a few restrictions there, if the benefit is that we can do the mallocs correctly.

@bagder
Copy link
Member Author

bagder commented Nov 28, 2025

They have never been used there before.

That was a bug. We offer users the ability to replace all libcurl mallocs to instead use their allocation functions.

@bagder
Copy link
Member Author

bagder commented Nov 28, 2025

Then is creates and exception

This just reverses the exception. Instead of accepting malloc() in production code this has to accept fopen and freopen in debug code.

@bagder
Copy link
Member Author

bagder commented Nov 28, 2025

@vszakats are these windows CI failures due to this change? They look odd!

@vszakats
Copy link
Member

vszakats commented Nov 28, 2025

They seem to be crashing on curl -V.

Forgot to say thet memdebug functions also wrap fopen/open functions
and it'd be unfortunate if they'd change behavior compared to the original,
in addition to doing their memdebug things.

These allocators are fine. They could probably be updated to use
win32-specific allocators, if the optics of this exception is not good, but
we still cannot use the curlx_ ones if we want to avoid recursion.

@bagder
Copy link
Member Author

bagder commented Nov 28, 2025

They could probably be updated to use win32-specific allocators,

That would not help though. I want us to

A) honor the curl_global_init_mem() function as well as possible. The user then provides their set of allocation functions to replace the ones libcurl uses. We therefore need to use the correct set to have this work.

B) for debug builds, I want the memdebug system to trace all the memory calls we do in curl + libcurl, which then also requires us to use the correct set

@vszakats
Copy link
Member

vszakats commented Nov 28, 2025

I'd argue these goals aren't worth the mess it may need to support
them, though this is of course subjective. I'm also silently
celebrating that I could untangle allocators after many false-starts
and a lot of work, and not so keen today to re-add even a small bit
of a mess.

For fopen, these Windows-specific allocations are self-contained,
the code is mature and not expected to change. We saw nobody
complaining they aren't using custom allocators.

For the two multibyte ones, and their unicodefree pair, memdebug
may theoretically discover issues. As for custom allocators, nobody
complained either, and these pointers (AFAIU) should never be
returned by a libcurl API. They are private to libcurl, thus unlikely to
bother most API users how they are allocated. Unless for stats or
rare use-cases.

But, I also have no idea how to achieve these goals without making
more of a mess to counteract the benefits.

fopen/open can be dumbed down for debug-enabled builds, which
in turn will cause great pain down the line, because they'll behave
differently than normal builds. They would also expose the messy
Windows open functions to more code, also something I just cleaned
up.

For the unicode functions: Unicode barely works in curl, so perhaps
it could all be just deleted and replaced with the unobstrusive solution
that requires Win10. Breaking or disallowing Unicode for debug-enabled
builds, I'd find super problematic, to wrap one's head around when
things start breaking.

A duplicate copy of these functions could be kept that's exclusively
used from memdebug, and keep using standard allocators. Which
would mean to support 2 copies of this in slightly different variants
from now on. I'm not a fan of that.

Finally, just to make it clear because I feel this is now suddenly on
me: This had always worked liked this, and I merely kept it up to not
break things.

And as always: it's more than likely I'm missing something.

@bagder
Copy link
Member Author

bagder commented Nov 28, 2025

I feel this is now suddenly on me

I don't think it is on you nor "your problem". I think it is just like always. Iteratively taking steps forward. You took this a big step forward which then enabled me to spot and realize this particular detail that I would like to fix.

@vszakats
Copy link
Member

vszakats commented Nov 28, 2025

I feel this is now suddenly on me

I don't think it is on you nor "your problem". I think it is just like always. Iteratively taking steps forward. You took this a big step forward which then enabled me to spot and realize this particular detail that I would like to fix.

Phew, thanks for clarifying! It's good it's easier to oversee what's
happening.

IMO it'd be worth considering to declare defeat for the "WIDE"
Unicode interface, and focus on UTF-8.

But! According to comments, GetFullPathNameW() is required
for long-filename support, raising the question of how to support
long-filenames in UTF-8 mode. Maybe it's automatically done by
GetFullPathNameA() in that case?

For non-Windows multibye, I'm trying to map them to the curlx
ones in #19751.

@vszakats vszakats added the Windows Windows-specific label Nov 29, 2025
@vszakats
Copy link
Member

curlx_convert_UTF8_to_wchar and curlx_convert_wchar_to_UTF8 are
used only in lib/idn.c outside curlx. Since the max input string length is
255 chars, it seems feasible to replace them with MultiByteToWideChar
and WideCharToMultiByte and stack buffers.

Let's see what fallout this makes
@bagder bagder force-pushed the bagder/curlx-malloc branch from 7c4f5b9 to 2613bd9 Compare December 1, 2025 13:52
@bagder
Copy link
Member Author

bagder commented Dec 1, 2025

Since the max input string length is 255 chars

In idn.c it might have that limit, but the function is also used for filenames in fopen.c and I figure they can be (much) larger than 255.

@vszakats
Copy link
Member

vszakats commented Dec 1, 2025

Since the max input string length is 255 chars

In idn.c it might have that limit, but the function is also used for filenames in fopen.c and I figure they can be (much) larger than 255.

Sorry, I meant to say to replace the curlx calls in idn.c with direct MultiByte*
functions.

This would reduce raw memallocs, a bit. It'd also remove the last uses of the
two curlx multibyte functions outside curlx, and allow to define them for
Windows Unicode builds only.

@bagder
Copy link
Member Author

bagder commented Dec 1, 2025

Sorry, I meant to say to replace the curlx calls in idn.c with direct MultiByte* functions.

Ah right. Yeah that would simplify things a little!

vszakats added a commit that referenced this pull request Dec 3, 2025
Eliminate a heap buffer in both `win32_idn_to_ascii()` and
`win32_ascii_to_idn()`, by replacing it with stack buffer. The maximum
size is fixed in these cases, and small enough to fit there.

Also reuse length returned by the UTF-8 to wchar conversion, allowing
to drop `wcslen()` call in both functions, and allowing to call
the wchar to UTF-8 conversion API `WideCharToMultiByte()` with the known
length, saving length calculations within that API too.

Ref: #19748 (comment)

Closes #19798
@vszakats
Copy link
Member

vszakats commented Dec 5, 2025

Made #19845 to fix this. Reduces system allocator use to Windows
TrackMemory builds in the 4 fopen/stat wrappers. Buffers are freed
within these 4 functions.

@vszakats vszakats closed this in 4e051ff Dec 5, 2025
@bagder bagder deleted the bagder/curlx-malloc branch December 12, 2025 13:52
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Windows Windows-specific

Development

Successfully merging this pull request may close these issues.

3 participants