SignalP jobs failing with "error running HOW" #24

peterjc · 2017-09-21T13:41:05Z

We had SignalP working nicely in Galaxy on our old instance, which ran jobs on the cluster as the Galaxy user. On our new Galaxy instance, which runs jobs on the same cluster under the associated user's Linux account, this can happen:

Fatal error: Exit code 1 ()
open: can't stat file
apparent state: unit 3 named input.how
lately reading sequential formatted external IO
One or more tasks failed, e.g. 1 from 'signalp -short -t euk /tmp/86666.1.ln.q/tmpJo4GgL/signalp.0.tmp > /tmp/86666.1.ln.q/tmpJo4GgL/signalp.0.tmp.out' gave:
error running HOW
 
Error 256 from SignalP:
python /mnt/du-synology/v1shr1/galaxy/shed_tools/toolshed.g2.bx.psu.edu/repos/peterjc/tmhmm_and_signalp/7de64c8b258d/tmhmm_and_signalp/tools/protein_analysis/signalp3.py euk 0 8 /mnt/du-synology/v1shr1/galaxy/galaxy-dist/database/job_working_directory/003/3599/galaxy_dataset_7109.dat.fasta.tmp /mnt/du-synology/v1shr1/galaxy/galaxy-dist/database/job_working_directory/003/3599/galaxy_dataset_7109.dat.tabular.tmp

Error is being raised here:

pico_galaxy/tools/protein_analysis/signalp3.py

Line 206 in 37d5b47

    
           sys.exit("One or more tasks failed, e.g. %i from %r gave:\n%s" % (error_level, cmd, output),

My script signalp3.py breaks up the input FASTA file into chunks of 500 sequences and by default uses four worker threads at once calling SignalP (which is single threaded).

This is on top of the optional Galaxy parallelisation setting which breaks up the parent FASTA input file into chunks of 2000 sequences (i.e. 4 times 500):

pico_galaxy/tools/protein_analysis/signalp3.xml

Line 5 in 37d5b47

    
           <parallelism method="basic" split_inputs="fasta_file" split_mode="to_size" split_size="2000" merge_outputs="tabular_file"></parallelism>
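For illustration, the chunk-and-thread approach described above can be sketched roughly as follows. This is a minimal sketch assuming Python 3, not the actual code in signalp3.py; the names `iter_fasta`, `chunked` and `run_chunks` are hypothetical, and `worker` stands in for whatever function writes a chunk to disk and calls SignalP on it.

```python
from itertools import islice
from multiprocessing.pool import ThreadPool


def iter_fasta(lines):
    """Yield (title, sequence) pairs from an iterable of FASTA lines."""
    title, seq = None, []
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith(">"):
            if title is not None:
                yield title, "".join(seq)
            title, seq = line[1:], []
        elif title is not None:
            seq.append(line.strip())
    if title is not None:
        yield title, "".join(seq)


def chunked(records, size=500):
    """Yield lists of up to `size` records (500 mirrors the chunk size above)."""
    records = iter(records)
    while True:
        batch = list(islice(records, size))
        if not batch:
            return
        yield batch


def run_chunks(batches, worker, threads=4):
    """Apply `worker` to each batch using a small pool of threads.

    `worker` is a hypothetical callable; in the real tool it would launch
    the single-threaded SignalP on one chunk and collect its output.
    """
    with ThreadPool(threads) as pool:
        return pool.map(worker, batches)
```

With Galaxy splitting the parent file into 2000-sequence pieces, each piece would yield four 500-sequence batches, which is why four SignalP processes tend to start together on one node.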

I've not pinned it down, but I think SignalP uses predictable temp file names which clash when several child processes run on the same cluster node (and we expect sets of four jobs to get started around the same time on the same nodes).

CC @peterthorpe5

peterjc · 2017-09-21T14:10:16Z

Testing with the latest code showed another error, my mistake from an earlier sys.exit(...) clean-up:

$ python signalp3.py euk 70 4 /mnt/du-synology/v1shr1/galaxy/galaxy-dist/database/files/005/dataset_5286.dat /mnt/du-synology/v1shr1/galaxy/galaxy-dist/database/job_working_directory/003/3600/galaxy_dataset_7110.dat
/bin/sh: signalp: command not found
Traceback (most recent call last):
  File "signalp3.py", line 207, in <module>
    error_level)
TypeError: exit expected at most 1 arguments, got 2

That's fixed in e1a43a4 and this now gives:

$ python signalp3.py euk 70 4 /mnt/du-synology/v1shr1/galaxy/galaxy-dist/database/files/005/dataset_5286.dat /mnt/du-synology/v1shr1/galaxy/galaxy-dist/database/job_working_directory/003/3600/galaxy_dataset_7110.dat
/bin/sh: signalp: command not found
One or more tasks failed, e.g. 127 from 'signalp -short -t euk /tmp/tmpNw4EKX/signalp.0.tmp > /tmp/tmpNw4EKX/signalp.0.tmp.out' with no output

I should perhaps special case not being able to find the signalp binary on the $PATH for a more helpful error, but this is not the problem here.
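One possible shape for that special case, as a minimal sketch assuming Python 3 (where `shutil.which` is available). The function name `require_binary` and the wording of the message are hypothetical and not taken from signalp3.py.

```python
import shutil
import sys


def require_binary(name):
    """Return the full path to `name`, or exit with a clear message.

    Checking up front avoids the less obvious "exit code 127 ... with no
    output" failure that the shell reports for a missing command.
    """
    path = shutil.which(name)
    if path is None:
        sys.exit("Could not find %r on the $PATH, is it installed?" % name)
    return path
```

Calling `require_binary("signalp")` once before splitting the input would fail fast, rather than after the worker threads have started.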

peterjc · 2017-10-18T13:14:20Z

This also affects the RXLR Galaxy tool, which calls SignalP via this signalp3.py script, as reported by @peterthorpe5.

peterjc · 2017-10-18T15:43:54Z

I now strongly suspect this is a file system issue, where the temporary FASTA file I have created is not ready for reading when SignalP is launched.

peterjc · 2017-10-19T15:58:52Z

The temporary FASTA files are probably not involved. Running this single threaded and testing with a single temporary FASTA file (with sleeps after creating it), I am currently seeing about 80% failure to 20% success for the same job via Galaxy.

Adding debugging to the signalp bash script itself has narrowed this down:

$ diff signalp signalp.backup
361,368d360
< 	echo "DEBUG: Will run HOW step p=$p, TYPE=$TYPE, SYN=$SYN and HOWFILE=$HOWFILE, PWD=$PWD" 1>&2;
< 	echo "DEBUG: Checking CS file, $SYN/CS.$TYPE.$p.syn" 1>&2;
< 	stat $SYN/CS.$TYPE.$p.syn 1>&2;
< 	echo "DEBUG: Checking SP file, $SYN/SP.$TYPE.$p.syn" 1>&2;
< 	stat $SYN/SP.$TYPE.$p.syn 1>&2;
< 	echo "DEBUG: Checking HOW file, $HOWFILE" 1>&2;
< 	stat $HOWFILE 1>&2;
< 	echo "DEBUG: $TESTHOW -w $SYN/CS.$TYPE.$p.syn $HOWFILE >$NNOUTRAW.C.$p && $TESTHOW -w $SYN/SP.$TYPE.$p.syn $HOWFILE >$NNOUTRAW.S.$p || ..." 1>&2;

The problem is inside bin/testhow, which is also a bash script, so time for more debugging:

$ diff testhow testhow.backup
2,3d1
< echo "DEBUG starting testhow..." 1>&2;
< 
124,125d121
< echo "DEBUG: Setting HOW variable to $HOW" 1>&2;
< 
158,159d153
< echo "DEBUG: About to read vars from synaps-file" 1>&2;
< 
223,228d216
< echo "DEBUG: About to run: $HOW <<END_OF_HOW ... END_OF_HOW" 1>&2;
< echo "DEBUG: SYNFIL = $SYNFIL, stat:" 1>&2;
< stat $SYNFIL 1>&2;
< echo "DEBUG: DATA = $DATA, stat:" 1>&2;
< stat $DATA 1>&2;
< 
440,441d427
< echo "DEBUG: Finished HOW. Data cleanup? RMDATA=$RMDATA" 1>&2;
<

The failing step is this multi-line command using a bash here-document to pipe text into the black box binary how:

$HOW <<END_OF_HOW | $AWK -v head=$HEAD '
        BEGIN {if (head) out=1}         # Get everything
        /^ T\*SAMPLE\*/ {out=1}         # Get default output
        /^ #/ {out=1}                   # Get -w or -s output
        /^ *\*\**[^*]/ {out=1;error=1}  # Get error messages always!
        out==1
        END { if (!out) error=1         # No output = error
                exit(error)
        }
' || exit 1
...
END_OF_HOW

Both files, $SYNFIL (a model-specific static file under syn/...) and $DATA (input.how in the working directory), exist and can be stat'ed.

It is unclear if the problem is one of these, the stdin to the how binary, or something else.

ibebio · 2024-04-03T15:45:47Z

We ran into this error message in a Nextflow pipeline, predector (https://github.com/ccdmb/predect/).
The file system guess from @peterjc helped me solve this: setting the working directory of the SignalP process to a local /tmp file system, instead of a networked one, made the error disappear.
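For anyone calling SignalP from Python rather than Nextflow, the same workaround might look roughly like this. This is a sketch under assumptions, not what signalp3.py or predector actually do: the helper name `run_in_local_tmp` is hypothetical, it assumes Python 3, and it assumes the default temporary directory (often /tmp) is node-local storage, which depends on the cluster. Input paths in `cmd` would need to be absolute, since the child process starts in a different directory.

```python
import shutil
import subprocess
import tempfile


def run_in_local_tmp(cmd, local_root=None):
    """Run `cmd` (a list of arguments) from a fresh scratch directory.

    local_root=None uses tempfile's default location; pass an explicit
    path if the node-local disk lives elsewhere. Working files such as
    input.how, which are written to the current directory, then land
    in the scratch directory instead of on a networked file system.
    """
    workdir = tempfile.mkdtemp(prefix="signalp_", dir=local_root)
    try:
        return subprocess.run(
            cmd,
            cwd=workdir,
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
            universal_newlines=True,
        )
    finally:
        # Remove the scratch directory whether or not the command worked.
        shutil.rmtree(workdir, ignore_errors=True)
```

The returned object carries `returncode`, `stdout` and `stderr`, so the caller can still report failures in the same way as before.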

xref: peterjc/pico_galaxy#24

peterjc mentioned this issue Oct 11, 2017

Update signalp to 4.1 #25

Open

peterjc added the bug label Jul 9, 2018

Neato-Nick mentioned this issue Jun 17, 2021

rxlr_motifs or rxlr_venn_workflow? #37

Closed

bgruening added a commit to usegalaxy-eu/infrastructure-playbook that referenced this issue Jul 27, 2024

Provide a local filesystem for singnal P

90d05da

xref: peterjc/pico_galaxy#24

bgruening mentioned this issue Jul 27, 2024

Provide a local filesystem for singnal P usegalaxy-eu/infrastructure-playbook#1282

Merged
