parallel downloaded files with some of them to be 0 byte #4937

Closed
flashercs opened this issue Feb 17, 2020 · 9 comments

@flashercs flashercs commented Feb 17, 2020

I did this

When I download some HTML files with the command curl --config config.txt, the last 4 downloaded files are 0 bytes, while the other files are fine.
However, everything works correctly without --parallel-max 3 in config.txt.
The config.txt content:

--remote-name-all
--location
--compressed
--user-agent "Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.0.3) Gecko/2008092416 Firefox/3.0.3"
--url https://xing.911cha.com/xing_%E3%91%8E.html
--url https://xing.911cha.com/xing_%E3%92%92.html
--url https://xing.911cha.com/xing_%E3%94%A9.html
--url https://xing.911cha.com/xing_%E3%9C%A8.html
--url https://xing.911cha.com/xing_%E3%A1%93.html
--url https://xing.911cha.com/xing_%E3%A4%A9.html
--url https://xing.911cha.com/xing_%E3%A8%85.html
--url https://xing.911cha.com/xing_%E3%AC%A5.html
--url https://xing.911cha.com/xing_%E3%AF%81.html
--url https://xing.911cha.com/xing_%E3%B2%BB.html
--url https://xing.911cha.com/xing_%E3%B5%9F.html
--parallel
--parallel-max 3

I expected the following

I expect all files to download correctly in parallel mode, with no 0-byte files.
curl 7.66 does not have this problem.

curl/libcurl version

curl 7.68.0 (x86_64-pc-win32) libcurl/7.68.0 OpenSSL/1.1.1d (Schannel) zlib/1.2.11 brotli/1.0.7 WinIDN libssh2/1.9.0 nghttp2/1.40.0
Release-Date: 2020-01-08
Protocols: dict file ftp ftps gopher http https imap imaps ldap ldaps pop3 pop3s rtsp scp sftp smb smbs smtp smtps telnet tftp
Features: AsynchDNS HTTP2 HTTPS-proxy IDN IPv6 Kerberos Largefile MultiSSL NTLM SPNEGO SSL SSPI TLS-SRP brotli libz

operating system

Windows 10 17134

@bagder bagder added the cmdline tool label Feb 17, 2020

@emilengler emilengler commented Mar 3, 2020

I can confirm the bug on

curl 7.64.0 (x86_64-pc-linux-gnu) libcurl/7.64.0 OpenSSL/1.1.1d zlib/1.2.11 libidn2/2.0.5 libpsl/0.20.2 (+libidn2/2.0.5) libssh2/1.8.0 nghttp2/1.36.0 librtmp/2.3
Release-Date: 2019-02-06
Protocols: dict file ftp ftps gopher http https imap imaps ldap ldaps pop3 pop3s rtmp rtsp scp sftp smb smbs smtp smtps telnet tftp 
Features: AsynchDNS IDN IPv6 Largefile GSS-API Kerberos SPNEGO NTLM NTLM_WB SSL libz TLS-SRP HTTP2 UnixSockets HTTPS-proxy PSL 

Also, this problem is not caused by the --compressed flag.


@jay jay commented Mar 3, 2020

> I can confirm the bug on

What happens if you do multiple runs (delete the files before each iteration), is it timing sensitive? Is it the server, perhaps? Did you check the verbose output to see what was actually being received?


@emilengler emilengler commented Mar 4, 2020

@jay It is not timing sensitive and has nothing to do with the server.
I also downloaded the files that came out as 0 bytes using the easy interface instead of the multi interface, and it worked just fine there.


@flashercs flashercs commented Mar 4, 2020

@jay @emilengler Thanks for the reply.
I found that if the value of --parallel-max is less than the total number of URLs, some downloaded files will be 0 bytes, and the output looks like this:
DL% UL%  Dled  Uled  Xfers  Live   Qd Total     Current  Left    Speed
--  --  75818     0    11     0     3 --:--:-- --:--:-- --:--:--  189k

Note that the Qd value is equal to the count of files with 0 bytes.

Otherwise, when --parallel-max is equal to or greater than the total number of URLs, all downloaded files are OK, and the output looks like this:
DL% UL%  Dled  Uled  Xfers  Live   Qd Total     Current  Left    Speed
--  --   106k     0    11     0     0 --:--:-- --:--:-- --:--:--  151k

I hope this information would be helpful.


@jay jay commented Mar 7, 2020

> I hope this information would be helpful.

Thanks, it is.

Bisected to e59371a. @bagder queued handles seem to be indefinitely skipped at the beginning of parallel transfers, was there a reason for that?

curl/src/tool_operate.c

Lines 2041 to 2061 in b8d1366

/*
 * add_parallel_transfers() sets 'morep' to TRUE if there are more transfers
 * to add even after this call returns. sets 'addedp' to TRUE if one or more
 * transfers were added.
 */
static CURLcode add_parallel_transfers(struct GlobalConfig *global,
                                       CURLM *multi,
                                       CURLSH *share,
                                       bool *morep,
                                       bool *addedp)
{
  struct per_transfer *per;
  CURLcode result = CURLE_OK;
  CURLMcode mcode;
  *addedp = FALSE;
  *morep = FALSE;
  result = create_transfer(global, share, addedp);
  if(result || !*addedp)
    return result;
  for(per = transfers; per && (all_added < global->parallel_max);
      per = per->next) {

diff --git a/src/tool_operate.c b/src/tool_operate.c
index 4b3caa8..ab06b71 100644
--- a/src/tool_operate.c
+++ b/src/tool_operate.c
@@ -2055,7 +2055,7 @@ static CURLcode add_parallel_transfers(struct GlobalConfig
   *addedp = FALSE;
   *morep = FALSE;
   result = create_transfer(global, share, addedp);
-  if(result || !*addedp)
+  if(result)
     return result;
   for(per = transfers; per && (all_added < global->parallel_max);
       per = per->next) {
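
For readers following along, a toy model of the control flow may make the effect of this one-line change easier to see. The sketch below is not curl's code. `struct sim`, `run()`, and the stand-in versions of `create_transfer()` and `add_parallel()` are hypothetical, and it assumes that one new transfer is created at the top of each call and one more after each transfer that gets started. It only illustrates how an early return on "nothing new created" can strand transfers that were already queued.

```c
#include <stdbool.h>

#define MAX_XFERS 64  /* arbitrary cap for this toy model */

/* Toy stand-ins; none of these are curl's real data structures. */
enum state { QUEUED, ACTIVE, DONE };

struct sim {
  int total_urls;     /* URLs given to the tool */
  int created;        /* transfers created so far */
  int active;         /* transfers currently running */
  int parallel_max;   /* cap on simultaneously running transfers */
  bool early_return;  /* true: mimic the "|| !*addedp" check */
  enum state st[MAX_XFERS];
};

/* Create one more queued transfer if any URL is left. */
static void create_transfer(struct sim *s, bool *added)
{
  *added = false;
  if(s->created < s->total_urls) {
    s->st[s->created++] = QUEUED;
    *added = true;
  }
}

/* Start queued transfers up to parallel_max; set *more if some stay queued. */
static void add_parallel(struct sim *s, bool *more, bool *added)
{
  int i;
  *more = false;
  create_transfer(s, added);
  if(s->early_return && !*added)
    return;  /* already-queued transfers are skipped here */
  for(i = 0; i < s->created && s->active < s->parallel_max; i++) {
    bool created_more;
    if(s->st[i] != QUEUED)
      continue;
    s->st[i] = ACTIVE;
    s->active++;
    create_transfer(s, &created_more);  /* assumed: one created per start */
    *added = true;
  }
  for(; i < s->created; i++)
    if(s->st[i] == QUEUED)
      *more = true;
}

/* Simulate all transfers; return how many were never started (-1 on bad input). */
int run(int urls, int max, bool early_return)
{
  struct sim s = {0};
  bool more, added;
  int i, never_started = 0;

  if(urls < 0 || urls > MAX_XFERS || max < 1)
    return -1;
  s.total_urls = urls;
  s.parallel_max = max;
  s.early_return = early_return;

  add_parallel(&s, &more, &added);
  while(s.active > 0) {
    /* finish the oldest running transfer, then try to start more */
    for(i = 0; i < s.created; i++) {
      if(s.st[i] == ACTIVE) {
        s.st[i] = DONE;
        s.active--;
        break;
      }
    }
    add_parallel(&s, &more, &added);
  }
  for(i = 0; i < s.created; i++)
    if(s.st[i] == QUEUED)
      never_started++;
  return never_started + (urls - s.created);
}
```

Under these assumptions, `run(11, 3, true)` leaves 4 transfers unstarted, while `run(11, 3, false)` and `run(11, 11, true)` leave none. That is consistent with the reports above: 11 URLs with --parallel-max 3 produced four 0-byte files, and raising --parallel-max to the URL count avoided the problem.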

@bagder bagder commented Mar 7, 2020

I really can't recall my thinking with that logic off the top of my head...


@bagder bagder commented Mar 15, 2020

@flashercs most/all of those links give 404s with no body content so I presume they no longer reproduce the problem? Can you provide a set that does repro the problem again?


@flashercs flashercs commented Mar 15, 2020

> @flashercs most/all of those links give 404s with no body content so I presume they no longer reproduce the problem? Can you provide a set that does repro the problem again?

The 404s may be because it's a Chinese website, but the problem is easy to reproduce. For example, with several pictures from different websites:

--remote-name-all
--location
--compressed
--user-agent "Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.0.3) Gecko/2008092416 Firefox/3.0.3"
--url https://assets3.thrillist.com/v1/image/1679696/size/tmg-facebook_social.jpg
--url https://weneedfun.com/wp-content/uploads/2016/09/Amazing-Photos-Of-Nature-15.jpg
--url https://cdn.lolwot.com/wp-content/uploads/2016/04/10-scenic-valleys-that-you-need-to-visit-1.jpg
--url https://www.iliketowastemytime.com/sites/default/files/south-africa-hougaard-malan12.jpg
--url https://weneedfun.com/wp-content/uploads/2016/07/Most-Beautiful-Sunset-Pictures-15.jpg
--url https://s2.best-wallpaper.net/wallpaper/1280x1024/1209/Latvian-autumn-forest-river-mist-in-the-morning_1280x1024.jpg
--url https://weneedfun.com/wp-content/uploads/2016/07/Most-Beautiful-Sunset-Pictures-3-1024x631.jpg
--url https://cdn.thecrazytourist.com/wp-content/uploads/2018/07/ccimage-shutterstock_178412015.jpg
--url https://weneedfun.com/wp-content/uploads/2016/08/Beautiful-Ocean-Sunrise-Sunset-Photos-10.jpg
--url http://beautyharmonylife.com/wp-content/uploads/2014/04/distant-rain-cloud-on-highway.jpg
--url http://wallpapersdsc.net/wp-content/uploads/2016/09/Rio-De-Janeiro-High-Definition-Wallpapers.jpg
--parallel
--parallel-max 3

@bagder bagder commented Mar 16, 2020

Thanks

bagder added a commit that referenced this issue Mar 16, 2020
Trying to return early from the function if no new transfers were added
would break the "morep" argument and cause issues. This could lead to zero
content "transfers" (within quotes since they would never be started)
when parallel-max was reduced.

Reported-by: Gavin Wong
Analyzed-by: Jay Satiro
Fixes #4937
@bagder bagder closed this in 95c36ff Mar 16, 2020
@lock lock bot locked as resolved and limited conversation to collaborators Jun 24, 2020