Skip to content

SMTP binary file transfer using a callback results in corrupted file #4860

@RuurdBeerstra

Description

@RuurdBeerstra

I did this

Transfer a binary file using SMTP. Use the curl_mime_data_cb function to obtain
the binary data. When the file exceeds some 64KB in size, the resulting MIME attachment
at the receiving end is exactly the right size, but the contents is corrupted.

When other means are used to pass the data to curl (name of a file, text string)
the problem does NOT occur, only when using curl_mime_data_cb.

The internal buffering in cURL seems to send a few buffered bytes twice, omitting others.
I have written a small reproduction program, using an example program as a start.
It fails predictably, every time, using the latests cURL version.
I have used a test file produced with this:

    $ awk 'BEGIN {for(i = 0; i < 50000; i++) printf("Small line %06d\n",i);}' > many_lines.txt

That produces a file like this :
-rw-rw-r-- 1 rbeerstra bea-tools 900000 Jan 27 16:14 many_lines.txt

The attached mime.c example program mails that to me using SMTP. The received mail has an
attachment, which is exactly 900000 bytes (which is correct). But when opened with a text editor, looking at line 3615:

Small line 003613
Small line 003614
Small line 00361ne 003612
Small line 003613
Small line 003614
Small line 003615
Small line 003616
Small line 003617
Small line 003618
Small line 003619
Small line 003620
Small line 003621
Small line 003622
Smal03626
Small line 003627
Small line 003628

There is corruption at that point, always. Part is repeated, another part is missing.
When I delete all proper lines like so (in VI): :g/^Small line ....../d
it should have deleted all lines, but I end up with:
Smal03626
Sm 007267
ne 010908
Small lineall line 014535
Small liSmall line 018176
Small 6
Smal457
Small lineSmall line 025472
Sm29098
Small li2
032739
Small 753
Smal36394
Sm 040035
ne 043676
Small lineall line 047303

So every 3600-ish lines the problem repeats, showing a buffering problem.
Obviously, when the binary data is less predictable, errors are harder to spot or, worse,
end up being undetected. It was spotted in our case by a failing 'gunzip' of the attachment.

When the "binary" in the test program is changed into "text" (see lines 101/102), the problem does NOT occur.

I expected the following

Binary files should be transferred correctly :-)

curl/libcurl version

I tested with 2 versions: The first is the one we currently incorporate in our producte: curl-7.65.1.

curl -V output:
curl 7.65.1 (x86_64-pc-linux-gnu) libcurl/7.65.1 OpenSSL/1.1.1c zlib/1.2.7
Release-Date: 2019-06-05
Protocols: dict file ftp ftps gopher http https imap imaps pop3 pop3s rtsp smb smbs smtp smtps telnet tftp
Features: AsynchDNS HTTPS-proxy IPv6 Largefile libz NTLM NTLM_WB SSL TLS-SRP UnixSockets

The 2nd is the latest one from https://curl.haxx.se/download.html: curl-7.68.0
curl -V output:
curl 7.68.0 (x86_64-pc-linux-gnu) libcurl/7.68.0 OpenSSL/1.1.1c zlib/1.2.7
Release-Date: 2020-01-08
Protocols: dict file ftp ftps gopher http https imap imaps pop3 pop3s rtsp smb smbs smtp smtps telnet tftp
Features: AsynchDNS HTTPS-proxy IPv6 Largefile libz NTLM NTLM_WB SSL TLS-SRP UnixSockets

Both fail in exactly the same way.

operating system

Bug was spotted on Windows, reproduced on Linux. I think it is OS independent.
mime.c.txt

Metadata

Metadata

Assignees

No one assigned

    Labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions