Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

curl_easy_perform() fails if a nested mime_handle was already used in a transfer #15842

Closed
Qriist opened this issue Dec 28, 2024 · 24 comments
Closed
Assignees

Comments

@Qriist
Copy link

Qriist commented Dec 28, 2024

I did this

To clarify on the title, nesting a mime_handle via curl_mime_subparts() technically works (returns 0) but the subsequent curl_easy_perform() fails.

Bascially, In the process of writing a usage example, I discovered that I cannot use a mime_handle if that handle had a newly nested mime_part that had previously been used in a curl_easy_perform() (and presumably in curl_multi_perform() as well but I didn't test).

Attempting to perform under this condition would always return CURLE_READ_ERROR (26). As I can freely reuse the mime_handle in other ways and embed it as a subpart if freshly made, this seems to suggest something internal to curl.

I recognize that this may be intended behavior, but I didn't see a note about it on the documentation so I figured I'd raise the flag. If it's intended please feel free to close the ticket. :)

I expected the following

to add the mime_handle as a mime_part

curl/libcurl version

custom built libcurl from vcpkg with all supported features enabled: libcurl/8.11.1-DEV wolfSSL/5.7.4 (GnuTLS/3.8.7) (mbedTLS/3.6.1) (LibreSSL/4.0.0) (Schannel) zlib/1.3.1 brotli/1.1.0 zstd/1.5.6 c-ares/1.34.3 WinIDN libpsl/0.21.5 libssh2/1.11.1_DEV nghttp2/1.64.0

operating system

Win10

@Qriist Qriist changed the title curl_mime_subparts() fails if the input mime_handle as already been used in a transfer curl_easy_perform() fails if a nested mime_handle was already used in a transfer Dec 28, 2024
@bagder
Copy link
Member

bagder commented Jan 2, 2025

Any chance you can provide an example code (in C) that shows this happening? Maybe by adjusting https://curl.se/libcurl/c/smtp-mime.html ? This way we would be sure we understand your issue exactly.

@Qriist
Copy link
Author

Qriist commented Jan 2, 2025

Any chance you can provide an example code (in C) that shows this happening? Maybe by adjusting https://curl.se/libcurl/c/smtp-mime.html ? This way we would be sure we understand your issue exactly.

I'm kind of embarassed to say that I'm not really fluent in C at all (wrapping libcurl has featured a steep learning curve for me, lol)... but I think I can get there since that mime example you kindly linked looks fairly complete and consists mainly of curl functions. It might be a couple days for my thick head to figure out the compiler process.

If I can't figure it out reasonably quickly I'll at least come back with the equivalent raw DLL calls in AHK, heavily commented so you can follow.

@bagder
Copy link
Member

bagder commented Jan 2, 2025

yeah, you can probably start with a rough psedu-code style version and I can try to convert that for us and then we can take it step by step

@Qriist
Copy link
Author

Qriist commented Jan 2, 2025

yeah, you can probably start with a rough psedu-code style version and I can try to convert that for us and then we can take it step by step

Whew, cuz I'm struggling lol
image

I'll type something up real quick.

@Qriist
Copy link
Author

Qriist commented Jan 2, 2025

I apologize for taking so long, I had to tend to something. ^^;

Anyways, the simplified instructions to replicate:

  • initialize libcurl (curl_global_sslset / curl_global_init)
  • initialize easy_handle
    • +relevant setopts
  • initialize mime_handle1
    • curl_easy_setopt(easy_handle,CURLOPT_MIMEPOST,mime_handle1)
    • +add any random data as mime_parts
  • execute the transfer via curl_easy_perform
    • (this uploads as expected, and returns 0)
  • initialize mime_handle2
    • curl_easy_setopt(easy_handle,CURLOPT_MIMEPOST,mime_handle2)
    • mime_part = curl_mime_addpart(mime_handle2)
    • curl_mime_subparts(mime_part,mime_handle1)
      • (this returns 0)
  • execute the transfer via curl_easy_perform
    • (this returns 26)

Notably, if we do everything the same except execute the first transfer, the now-only transfer will execute as normal.

Using my class to partially intialize variables, I have written a lengthy breakdown of the functions I've invoked to get to the error:
https://pastebin.com/HW9YWNx5

Using that script, my class' error handling function captured the following info:

[1]
    [error code] => 26
    [error family] => Curlcode
    [error string1] => Failed to open/read local data from file/application
    [error string2] => client mime read EOF fail, only 226/426 of needed bytes read
    [invoked LibQurl method] => Sync
    [invoked curl function] => curl_easy_perform
    [options snapshot]
        [1]
            [ACCEPT_ENCODING] =>
            [CAINFO] => C:\Projects\troubleshooting libcurl MIME\Lib\Aris\Qriist\LibQurl@v0.90.0\bin\curl-ca-bundle.crt
            [ERRORBUFFER] => [BUFFER]
            [FOLLOWLOCATION] => 1
            [HEADERDATA] => 142004192
            [HEADERFUNCTION] => 11309248
            [MAXREDIRS] => 30
            [MIMEPOST] => 11092016
            [WRITEDATA] => 150974528
            [WRITEFUNCTION] => 11309120
    [timestamp] => 20250102205126

Finally, for sake of completion, here is the returned body from that script's first transfer to demonstrate that the mime functions are working as expected under most circumstances.

{
  "args": {}, 
  "data": "", 
  "files": {}, 
  "form": {
    "sample part 1": "sample data 1"
  }, 
  "headers": {
    "Accept": "*/*", 
    "Accept-Encoding": "deflate, gzip, br, zstd", 
    "Content-Length": "199", 
    "Content-Type": "multipart/form-data; boundary=------------------------rvUfRysNoCwUdgNjvgcrC4", 
    "Host": "httpbin.org", 
    "X-Amzn-Trace-Id": "Root=1-6776fc4e-0e58cc64596234e6450f3dfc"
  }, 
  "json": null, 
  "method": "POST", 
  "origin": "73.134.67.54", 
  "url": "https://httpbin.org/anything"
}

Assuming it's not user error on my part, I do think this "locking" could be intended behavior, if for no other reason than the way mime_handles are tightly integrated with their respective easy_handles.

It's not really a problem that subparting previously used mime_handles creates a failure, I was just suprised since I didn't see it in the documentation.

If this is indeed a bug, then I think a check (+appropriate error code) at the curl_mime_subparts function would be a good idea.

@bagder

This comment was marked as outdated.

@Qriist
Copy link
Author

Qriist commented Jan 2, 2025

If I understand the description, this should be a C version of the code?

Just to make sure I'm grokking things correctly:

  • Variables with an asterisk (ie, curl_mime *mime1;) are pointers to that kind of struct, while those without (ie, CURLcode res;) are enums of that type?
  • And this line...
    • CURLcode rc = curl_mime_subparts(part2, mime1);
    • ...assigns rc as a CURLcode type integer that can still be used later on?

If my limited C managed to understand that correctly, then yep, that looks good!

(Now here is the part where you tell me it works on your end, haha. ^^; )

@bagder
Copy link
Member

bagder commented Jan 3, 2025

Your understanding of the code seems correct, yes.

When I run this, I get a rc = 43 in the output, meaning that this line fails:

      CURLcode rc = curl_mime_subparts(part2, mime1);

43 means CURLE_BAD_FUNCTION_ARGUMENT.

Is this perhaps the problem you see as well? It seems to come from here:

https://github.com/curl/curl/blob/abf8062449f0a5000284bebdcea3c692cc6e3de4/lib/mime.c#L1550C1-L1551C1

I'll take a deeper look.

@bagder
Copy link
Member

bagder commented Jan 3, 2025

I think this minor tweak reproduces the original report:

#include <stdio.h>
#include <curl/curl.h>

/* write callback that does nothing */
size_t write_it(char *ptr, size_t size, size_t nmemb, void *userdata)
{
  return size * nmemb;
}

int main(void)
{
  CURL *curl;
  CURLcode res;

  curl_global_init(CURL_GLOBAL_ALL);

  curl = curl_easy_init();
  if(curl) {
    curl_mime *mime1;
    curl_mimepart *part1;
    curl_easy_setopt(curl, CURLOPT_WRITEFUNCTION, write_it);
    curl_easy_setopt(curl, CURLOPT_URL, "http://localhost/");

    mime1 = curl_mime_init(curl);
    part1 = curl_mime_addpart(mime1);
    curl_mime_data(part1, "<title>hello</title>", CURL_ZERO_TERMINATED);
    curl_mime_type(part1, "text/html");
    curl_mime_name(part1, "data");

    curl_easy_setopt(curl, CURLOPT_MIMEPOST, mime1);
    curl_easy_setopt(curl, CURLOPT_VERBOSE, 1L);

    /* Perform the request, res gets the return code */
    res = curl_easy_perform(curl);
    /* Check for errors */
    if(res != CURLE_OK)
      fprintf(stderr, "curl_easy_perform() 1 failed: %s\n",
              curl_easy_strerror(res));

    /* phase two, create a mime struct using the mime1 handle */
    {
      curl_mime *mime2 = curl_mime_init(curl);
      curl_mimepart *part2 = curl_mime_addpart(mime2);

      /* use the new mime setup */
      curl_easy_setopt(curl, CURLOPT_MIMEPOST, mime2);
      CURLcode rc = curl_mime_subparts(part2, mime1);

      printf("rc = %d\n", (int)rc);


      /* Perform the request, res gets the return code */
      res = curl_easy_perform(curl);
      /* Check for errors */
      if(res != CURLE_OK)
        fprintf(stderr, "curl_easy_perform() 2 failed: %s\n",
                curl_easy_strerror(res));
    }
    /* always cleanup */
    curl_easy_cleanup(curl);
  }
  return 0;
}

@bagder bagder self-assigned this Jan 3, 2025
@bagder
Copy link
Member

bagder commented Jan 3, 2025

/cc @monnerat for info.

There seems to be some state that is not reset properly when the same mime part is used again like this. The problem is easily reproducible but I have yet to find the exact culprit.

@Qriist
Copy link
Author

Qriist commented Jan 3, 2025

Your understanding of the code seems correct, yes.

Good, seems we're communicating effectively then. :D

I think this minor tweak reproduces the original report:

I apologize I didn't catch the difference at a glance, but at least you caught it yourself pretty quickly.

A thought had occurred to me earlier that the error might have been something relating to the DLL interface, like maybe Windows was butchering a buffer or whatever. That you were able to reproduce the issue without using a DLL cancels that idea, though, which is good since that would likely be harder to troubleshoot.

Now that you've been able to reproduce the bug on your end I think that my usefulness to the conversation has ended. If there's anything more you need from me (to test a fixed build, perhaps) by all means ask! ^_^

@bagder
Copy link
Member

bagder commented Jan 3, 2025

Yes, I'll get back. Just need some time to get to the bottom of this while juggling some other things at the same time.

@monnerat
Copy link
Contributor

monnerat commented Jan 3, 2025

There seems to be some state that is not reset properly when the same mime part is used again like this.

First, you can reassign mime handle 1 because you still own it after using it within a CURLOPT_MIMEPOST setopt, while you could'nt do that if it were previously used in a curl_mime_subparts, as the latter takes ownership of the handle. That said, I don't know why you get error 43 since calling again setopt(CURLOPT_MIMEPOST) clears the parent pointer in the previous mime struct (last line of mime_subparts_unbind).

Next, I don't think you reproduced the initial error, which has code 26 (caused by a premature EOF).
Since we do not know how the parts data are initilalized, I can imagine it is with curl_mime_data_cb and there is a problem with the seek/read callback. Just a guess...

@bagder
Copy link
Member

bagder commented Jan 3, 2025

I don't think you reproduced the initial error, which has code 26 (caused by a premature EOF).

Yes I do. Try the last code snippet I show you here a few comments up and you should see a 26.

Since we do not know how the parts data are initilalized

My example reproduces it and we know how they are initialized!

@monnerat
Copy link
Contributor

monnerat commented Jan 3, 2025

Yes I do. Try the last code I show you here and you should see a 26.

Sorry, I did not realized code erroring 43 was not the one listed.

@bagder
Copy link
Member

bagder commented Jan 3, 2025

Nah, that was a bit unclear on my behalf. My first one was not doing the same thing, the second one does.

@bagder
Copy link
Member

bagder commented Jan 3, 2025

My code example can also, exactly as @Qriist mentioned, be modified so that the first transfer is skipped and then magically the second transfer works (which then is the only transfer of course). Indicating that it is the transfer itself and use of the mime data that somehow makes the second use of it break.

@bagder
Copy link
Member

bagder commented Jan 3, 2025

Here's a third version of mime-15842.c that also has a debug callback to display the sent formpost data.

@Qriist
Copy link
Author

Qriist commented Jan 3, 2025

Yes, I'll get back. Just need some time to get to the bottom of this while juggling some other things at the same time.

No rush at all! :)

Next, I don't think you reproduced the initial error, which has code 26 (caused by a premature EOF). Since we do not know how the parts data are initilalized, I can imagine it is with curl_mime_data_cb and there is a problem with the seek/read callback. Just a guess...

No, I was definitely using the non-callback version, curl_mime_data().

That said, I just tested curl_mime_filedata(). It reported the same issue.

On a lark I also tried without adding any data to mime1 and it, too, failed.

Literally just: init mime1 -> perform -> init mime2 -> subparts -> perform returns 26

One thing I noticed was that the easy handle error buffer string is reporting seemingly very high buffer sizes compared to the data passed. My error handler I posted above reported 226/426 bytes read, when I passed 26 total bytes (13 each for title/data), and the 168 byte binary file I attempted just now reported 316/766 bytes read. (The size discrepency could just be encoding, though.)

monnerat added a commit to monnerat/curl that referenced this issue Jan 4, 2025
Subparts may have been previouly used as a top-level mime structure and
thus not rewound.

New test 695 checks the proper functioning in these particular conditions.

Fixes curl#15842
monnerat added a commit to monnerat/curl that referenced this issue Jan 4, 2025
Subparts may have been previously used as a top-level mime structure and
thus not rewound.

New test 695 checks the proper functioning in these particular conditions.

Fixes curl#15842
monnerat added a commit to monnerat/curl that referenced this issue Jan 4, 2025
Subparts may have been previously used as a top-level mime structure and
thus not rewound.

New test 695 checks the proper functioning in these particular conditions.

Reported-by: Qriist on github
Fixes curl#15842
@monnerat
Copy link
Contributor

monnerat commented Jan 4, 2025

I found the problem:

  • After the first structure is used in curl_easy_perform(), it is not at beginning of data anymore.
  • When attached to the second structure, it is left in its state.
  • At 2nd curl_easy_perform(), the 2nd structure (which has not yet been used) is rewound, but not descended recursively because it is already positioned at BOD, being freshly created.

As a result, the first structure is not rewound.
PR #15911 is a fix for this, consisting of explicitly rewinding subparts when attached by curl_mime_subparts().

monnerat added a commit to monnerat/curl that referenced this issue Jan 4, 2025
Subparts may have been previously used as a top-level mime structure and
thus not rewound.

New test 695 checks the proper functioning in these particular conditions.

Reported-by: Qriist on github
Fixes curl#15842
monnerat added a commit to monnerat/curl that referenced this issue Jan 4, 2025
Subparts may have been previously used as a top-level mime structure and
thus not rewound.

New test 695 checks the proper functioning in these particular conditions.

Reported-by: Qriist on github
Fixes curl#15842
monnerat added a commit to monnerat/curl that referenced this issue Jan 4, 2025
Subparts may have been previously used as a top-level mime structure and
thus not rewound.

New test 695 checks the proper functioning in these particular conditions.

Reported-by: Qriist on github
Fixes curl#15842
monnerat added a commit to monnerat/curl that referenced this issue Jan 4, 2025
Subparts may have been previously used as a top-level mime structure and
thus not rewound.

New test 695 checks the proper functioning in these particular conditions.

Reported-by: Qriist on github
Fixes curl#15842
@bagder
Copy link
Member

bagder commented Jan 4, 2025

@Qriist are you able to try out the PR against your case to check if it fixes your problem the same way it fixed my reproducer? If so, then we can wait for your confirmation before we proceed.

@Qriist
Copy link
Author

Qriist commented Jan 4, 2025

@Qriist are you able to try out the PR against your case to check if it fixes your problem the same way it fixed my reproducer? If so, then we can wait for your confirmation before we proceed.

I use vcpkg to build since it basically automates the process, but I'll try to manually build the PR later tonight after work. I'll let you know if I'm successful, and if the successful build solves the problem.

Also, side note, thank you very much for your patience with my C inexperience!

@Qriist
Copy link
Author

Qriist commented Jan 5, 2025

I return to report confusing mixed results: I managed to get a seemingly succesful build with mingw following the indepth instructions from the curl.dev site, but the output dll contained no functions whatsoever.

I went about it in this order:

  1. Installed MSYS2 and got the gcc toolchain running
  2. Installed the "default" build as instructed by https://everything.curl.dev/install/windows/win-msys2.html to get the dependencies together
  3. Cloned https://github.com/monnerat/curl.git, checked out rewind-reused-mime to get the PR included
  4. ran autoreconf -fi
    4b. Incidentally needed to install libtool to resolve the undefined macro/please use m4_pattern_allow issue
  5. ran ./configure --with-openssl, make, and make install, which all completed without issue:
    image
  6. pointed my script at the newly generated dll, which immediately failed:
    image

Some quick internet searching suggests OpenSSL is incorrectly configured within the MSYS2 enviornment so I tried bypassing curl_global_sslset to just call curl_global_init and I got a similar function-not-found error.

I'm kind of lost on where to go from here.

@bagder
Copy link
Member

bagder commented Jan 6, 2025

I'm kind of lost on where to go from here.

I wouldn't know either. Getting things built and used on Windows goes beyond what I know.

I do however believe that the fix is correct so I will go ahead and merge that and close this issue as fixed. If you, after getting this fix tested, find that there is something left to fix, this bug or something else, then feel free to create a new issue.

@bagder bagder closed this as completed in 1b3f00f Jan 6, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Development

Successfully merging a pull request may close this issue.

3 participants