md5_file: Too many open files #1175
Comments
Commented by Nicolas on 12 May 42647863 23:38 UTC |
Commented by davea on 7 Oct 42647966 16:43 UTC |
Commented by Nicolas on 25 Jun 42647991 02:28 UTC |
Commented by smoe on 11 Oct 42714773 09:42 UTC
where the only bad tool is indeed iceweasel, for all the images etc. While googling about it, I came across
which pointed to shared memory that is never released. Is this what is happening? Some polling/pushing on shared memory in a threaded environment that has gone wild? Please kindly review the respective communication code for any such evidence.
Steffen
Commented by smoe on 11 Oct 42714780 15:24 UTC Cheers, Steffen Replying to comment:4 Nicolas:
Commented by davea on 23 Feb 42726792 15:27 UTC |
Related to #1114, but this time BOINC is causing the problem on its own.
--- client/app_control.cpp (revision 26057)
+++ client/app_control.cpp (working copy)
@@ -818,6 +818,7 @@
int ACTIVE_TASK::read_stderr_file() {
char* buf1, *buf2;
char path[MAXPATHLEN];
+ int retval;
// truncate stderr output to the last 63KB;
// it's unlikely that more than that will be useful
@@ -825,9 +826,9 @@
int max_len = 63*1024;
sprintf(path, "%s/%s", slot_dir, STDERR_FILE);
if (!boinc_file_exists(path)) return 0;
- if (read_file_malloc(path, buf1, max_len, !config.stderr_head)) {
- return ERR_MALLOC;
- }
+
+ retval = read_file_malloc(path, buf1, max_len, !config.stderr_head);
+ if (retval) return retval;
// if it's a vbox app, check for string in stderr saying
// the job failed because CPU VM extensions disabled
I made that change, but it doesn't involve leaking file descriptors.
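To illustrate what the patch above changes: the stdoutdae.txt excerpt in the report shows `read_stderr_file(): malloc() failed`, although the underlying failure may have been an `fopen()` that hit the descriptor limit. The following is a minimal sketch of that masking effect, not BOINC's code. The error codes (`kErrMalloc`, `kErrFopen`) and the helper `read_whole_file` are hypothetical stand-ins invented for this example.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Hypothetical error codes for this sketch; these are NOT BOINC's actual values.
enum { kOk = 0, kErrMalloc = -1, kErrFopen = -2 };

// Stand-in for a read_file_malloc()-style helper: it reports why it failed.
int read_whole_file(const char* path, std::string& out) {
    FILE* f = std::fopen(path, "rb");
    if (!f) return kErrFopen;  // e.g. ENOENT, or EMFILE when descriptors run out
    char buf[4096];
    size_t n;
    out.clear();
    while ((n = std::fread(buf, 1, sizeof buf, f)) > 0) {
        out.append(buf, n);
    }
    std::fclose(f);
    return kOk;
}

// Pre-patch pattern: every failure is reported as a malloc failure.
int read_masking_errors(const char* path, std::string& out) {
    if (read_whole_file(path, out)) return kErrMalloc;
    return kOk;
}

// Post-patch pattern: the helper's own return value is passed to the caller.
int read_propagating_errors(const char* path, std::string& out) {
    int retval = read_whole_file(path, out);
    if (retval) return retval;
    return kOk;
}
```

Calling both wrappers on a path that cannot be opened returns `kErrMalloc` from the first and `kErrFopen` from the second, so only the second lets a log message name the real cause.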
Reported by smoe on 8 Jan 42640604 02:59 UTC
I found the boinc-client to have stopped for no apparent reason. It was working only with a local self-built SETI client. I had seen this once a long time before, though, back then with the WCG.
From stderrdae.txt:
No protocol specified
No protocol specified
No protocol specified
No protocol specified
...
dir_open: Could not open directory 'slots/0'.
dir_open: Could not open directory 'slots/18'.
dir_open: Could not open directory 'slots/17'.
dir_open: Could not open directory 'slots/12'.
dir_open: Could not open directory 'slots/7'.
dir_open: Could not open directory 'slots/4'.
dir_open: Could not open directory 'slots/22'.
dir_open: Could not open directory 'slots/19'.
dir_open: Could not open directory 'slots/9'.
dir_open: Could not open directory 'slots/16'.
dir_open: Could not open directory 'slots/14'.
dir_open: Could not open directory 'slots/20'.
dir_open: Could not open directory 'slots/8'.
dir_open: Could not open directory 'slots/3'.
dir_open: Could not open directory 'slots/23'.
dir_open: Could not open directory 'slots/11'.
...
dir_open: Could not open directory 'slots/7'.
dir_open: Could not open directory 'slots/7'.
dir_open: Could not open directory 'slots/7'.
md5_file: can't open projects/einstein.phys.uwm.edu/einstein_S6LV1_1.10_i686-pc-linux-gnu!__SSE2
md5_file: Too many open files
dir_open: Could not open directory 'projects/setiathome.berkeley.edu'.
dir_open: Could not open directory 'slots/24'.
md5_file: can't open projects/setiathome.berkeley.edu/14ja12ac.18155.67.4.10.61_1_0
md5_file: Too many open files
dir_open: Could not open directory 'slots/14'.
dir_open: Could not open directory 'slots/14'.
md5_file: can't open projects/einstein.phys.uwm.edu/hsgamma_FGRP1_0.23_i686-pc-linux-gnu
md5_file: Too many open files
md5_file: can't open projects/boinc.bakerlab.org_rosetta/minirosetta_3.26_x86_64-pc-linux-gnu
md5_file: Too many open files
dir_open: Could not open directory 'projects/docking.cis.udel.edu'.
dir_open: Could not open directory 'projects/spin.fh-bielefeld.de'.
dir_open: Could not open directory 'projects/boinc.fzk.de_poem'.
dir_open: Could not open directory 'projects/qah.uni-muenster.de'.
dir_open: Could not open directory 'projects/www.rechenkraft.net_yoyo'.
dir_open: Could not open directory 'projects/www.worldcommunitygrid.org'.
dir_open: Could not open directory 'slots/21'.
dir_open: Could not open directory 'slots/21'.
dir_open: Could not open directory 'slots/21'.
md5_file: can't open projects/www.worldcommunitygrid.org/wcg_faah_autodock_6.40_i686-pc-linux-gnu
md5_file: Too many open files
dir_open: Could not open directory 'projects/lhcathomeclassic.cern.ch_sixtrack'.
dir_open: Could not open directory 'slots/0'.
dir_open: Could not open directory 'slots/1'.
dir_open: Could not open directory 'slots/2'.
....
dir_open: Could not open directory 'slots/21'.
dir_open: Could not open directory 'slots/22'.
dir_open: Could not open directory 'slots/23'.
dir_open: Could not open directory 'slots/4'.
md5_file: can't open projects/setiathome.berkeley.edu/30dc09aj.1678.25025.13.10.226_2_0
md5_file: Too many open files
From stdoutdae.txt:
21-Aug-2012 10:08:37 Temporarily failed download of 23jn11ad.13583.17249.14.10.205: transient HTTP error
21-Aug-2012 10:08:37 Backing off 4 min 36 sec on download of 23jn11ad.13583.17249.14.10.205
21-Aug-2012 10:08:37 Temporarily failed download of 31oc10ac.1632.15183.4.10.58: transient HTTP error
21-Aug-2012 10:08:37 Backing off 5 min 27 sec on download of 31oc10ac.1632.15183.4.10.58
21-Aug-2012 10:09:01 Project communication failed: attempting access to reference site
21-Aug-2012 10:09:02 Internet access OK - project servers may be temporarily down.
21-Aug-2012 10:13:40 Started download of 05my12ad.31349.14382.3.10.249
21-Aug-2012 10:13:40 Started download of 05my12ad.31349.14382.3.10.255
21-Aug-2012 10:13:53 Finished download of 05my12ad.31349.14382.3.10.249
21-Aug-2012 10:13:53 Started download of 23jn11ad.13583.17249.14.10.241
21-Aug-2012 10:13:54 Finished download of 05my12ad.31349.14382.3.10.255
21-Aug-2012 10:13:54 Started download of 23jn11ad.13583.17249.14.10.205
21-Aug-2012 10:14:02 Finished download of 23jn11ad.13583.17249.14.10.241
21-Aug-2012 10:14:02 Started download of 05my12ad.31349.14382.3.10.224
21-Aug-2012 10:14:12 Finished download of 23jn11ad.13583.17249.14.10.205
21-Aug-2012 10:14:12 Finished download of 05my12ad.31349.14382.3.10.224
21-Aug-2012 10:14:12 Started download of 31oc10ac.1632.15183.4.10.58
21-Aug-2012 10:14:12 Started download of 30dc09aj.1678.25025.13.10.226
21-Aug-2012 10:14:29 Finished download of 30dc09aj.1678.25025.13.10.226
21-Aug-2012 10:14:29 Started download of 31oc10ac.1632.15183.4.10.64
21-Aug-2012 10:14:30 Finished download of 31oc10ac.1632.15183.4.10.58
21-Aug-2012 10:14:34 Finished download of 31oc10ac.1632.15183.4.10.64
21-Aug-2012 10:17:46 Started download of 30jn10ab.1159.23777.7.10.6.vlar
21-Aug-2012 10:17:55 Starting task 23jn11ad.13583.17249.14.10.241_1 using setiathome_enhanced version 612 in slot 0
21-Aug-2012 10:17:55 Starting task 05my12ad.31349.14382.3.10.247_0 using setiathome_enhanced version 612 in slot 1
21-Aug-2012 10:17:55 Starting task 23jn11ad.13583.17249.14.10.229_1 using setiathome_enhanced version 612 in slot 2
21-Aug-2012 10:17:55 Starting task 05my12ad.31349.14382.3.10.224_1 using setiathome_enhanced version 612 in slot 3
21-Aug-2012 10:17:55 Starting task 30dc09aj.1678.25025.13.10.226_2 using setiathome_enhanced version 612 in slot 4
21-Aug-2012 10:17:55 Starting task 23jn11ad.13583.17249.14.10.228_0 using setiathome_enhanced version 612 in slot 5
21-Aug-2012 10:17:55 Starting task 23jn11ad.13583.17249.14.10.248_0 using setiathome_enhanced version 612 in slot 6
21-Aug-2012 10:17:55 Starting task 30dc09aj.1678.25025.13.10.220_2 using setiathome_enhanced version 612 in slot 7
21-Aug-2012 10:17:55 Starting task 05my12ad.31349.14382.3.10.255_0 using setiathome_enhanced version 612 in slot 8
21-Aug-2012 10:17:55 Starting task 05my12ad.31349.14382.3.10.249_0 using setiathome_enhanced version 612 in slot 9
21-Aug-2012 10:17:55 Starting task 23jn11ad.13583.17249.14.10.205_1 using setiathome_enhanced version 612 in slot 10
21-Aug-2012 10:17:55 Starting task 05my12ad.31349.14382.3.10.246_0 using setiathome_enhanced version 612 in slot 11
21-Aug-2012 10:17:55 Starting task 27my10ac.18052.55637.5.10.1_2 using setiathome_enhanced version 612 in slot 12
21-Aug-2012 10:17:55 Starting task 31oc10ac.1632.15183.4.10.53_0 using setiathome_enhanced version 612 in slot 13
21-Aug-2012 10:17:55 Starting task 31oc10ac.1632.15183.4.10.41_1 using setiathome_enhanced version 612 in slot 14
21-Aug-2012 10:17:55 Starting task 31oc10ac.1632.15183.4.10.58_0 using setiathome_enhanced version 612 in slot 15
21-Aug-2012 10:17:55 Starting task 31oc10ac.1632.15183.4.10.49_0 using setiathome_enhanced version 612 in slot 16
21-Aug-2012 10:17:55 Starting task 31oc10ac.1632.15183.4.10.52_0 using setiathome_enhanced version 612 in slot 17
21-Aug-2012 10:17:55 Starting task 31oc10ac.1632.15183.4.10.64_0 using setiathome_enhanced version 612 in slot 18
21-Aug-2012 10:17:55 Starting task 31oc10ac.1632.15183.4.10.36_1 using setiathome_enhanced version 612 in slot 19
21-Aug-2012 10:17:55 Starting task 31oc10ac.1632.15183.4.10.28_1 using setiathome_enhanced version 612 in slot 20
21-Aug-2012 10:17:55 Starting task 31oc10ac.1632.15183.4.10.47_0 using setiathome_enhanced version 612 in slot 21
21-Aug-2012 10:17:59 Finished download of 30jn10ab.1159.23777.7.10.6.vlar
21-Aug-2012 10:17:59 Starting task 30jn10ab.1159.23777.7.10.6.vlar_3 using setiathome_enhanced version 612 in slot 22
21-Aug-2012 10:18:26 Started download of 19se10ac.457.271346.15.10.37.vlar
21-Aug-2012 10:18:35 Finished download of 19se10ac.457.271346.15.10.37.vlar
21-Aug-2012 10:18:35 Starting task 19se10ac.457.271346.15.10.37.vlar_3 using setiathome_enhanced version 612 in slot 23
21-Aug-2012 10:48:33 Can't get task disk usage: opendir() failed
21-Aug-2012 10:48:33 Can't get task disk usage: opendir() failed
21-Aug-2012 10:48:33 Can't get task disk usage: opendir() failed
21-Aug-2012 10:48:33 Can't get task disk usage: opendir() failed
21-Aug-2012 10:48:33 Can't get task disk usage: opendir() failed
....
1-Aug-2012 11:38:37 Can't get task disk usage: opendir() failed
21-Aug-2012 11:38:37 Can't get task disk usage: opendir() failed
21-Aug-2012 11:38:37 Can't get task disk usage: opendir() failed
21-Aug-2012 11:38:37 Can't get task disk usage: opendir() failed
21-Aug-2012 11:45:55 read_stderr_file(): malloc() failed
21-Aug-2012 11:45:55 Computation for task 30dc09aj.1678.25025.13.10.226_2 finished
21-Aug-2012 11:45:55 Can't open client_state_next.xml: fopen() failed
21-Aug-2012 11:45:55 Couldn't write state file: fopen() failed; giving up
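The `opendir() failed` and `Too many open files` messages above are what a process sees once it reaches its per-process descriptor limit. One way to check whether a process is approaching that limit is sketched below. It assumes Linux (it relies on `/proc/self/fd`), and the function names `count_open_fds` and `fd_soft_limit` are made up for this example; neither is part of BOINC.

```cpp
#include <cassert>
#include <cstdio>
#include <dirent.h>
#include <sys/resource.h>

// Count this process's open file descriptors by listing /proc/self/fd
// (Linux-specific assumption). Returns -1 if the directory cannot be opened,
// which is itself what happens once the descriptor limit has been reached.
int count_open_fds() {
    DIR* d = opendir("/proc/self/fd");
    if (!d) return -1;
    int n = 0;
    while (struct dirent* e = readdir(d)) {
        if (e->d_name[0] != '.') n++;  // skip "." and ".."
    }
    closedir(d);
    return n - 1;  // do not count the descriptor opendir() itself was holding
}

// Soft limit on open descriptors (RLIMIT_NOFILE), or -1 if it cannot be read.
// Note: an unlimited setting (RLIM_INFINITY) is not handled specially here.
long fd_soft_limit() {
    struct rlimit rl;
    if (getrlimit(RLIMIT_NOFILE, &rl) != 0) return -1;
    return static_cast<long>(rl.rlim_cur);
}
```

Logging `count_open_fds()` next to `fd_soft_limit()` at intervals would show whether the count climbs steadily over time, which would point to a descriptor leak rather than a one-off spike.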
Migrated-From: http://boinc.berkeley.edu/trac/ticket/1203