
more processes than available cores #42

Open
EricDeveaud opened this issue Jan 24, 2023 · 9 comments

Comments

@EricDeveaud

Hello

Using ClinSV-1.0.0 we noticed a potential problem when ClinSV is run through taskset, a batch system like Slurm, etc.

ClinSV has some hardcoded process counts for xargs:

xargs -P 15 and xargs -P 12, which may exceed the number of procs available to the process.

E.g. if I run ClinSV through our standard Slurm allocation (4 procs), at least 15 processes will compete for 4 cores, leading to oversubscription.

I don't know how to get the number of procs available to a process (as opposed to the number of physical processors on the host) in Perl, like len(os.sched_getaffinity(0)) in Python or omp_get_max_threads() in C++.

It would be nice to have the xargs -P value limited to the number of available procs.

Anyway, a shell solution would be to compute the number of available CPUs (nproc is adequate for that) and replace the hardcoded values with the computed number, something like:

compute ncpus

(( ncpus = `nproc` < 15 ?  `nproc` : 15 ))

then replace xargs -P 15 with xargs -P ${ncpus}
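For reference, the clamp above as a standalone sketch (note that (( )) arithmetic is a bashism, so this assumes bash rather than a strict POSIX sh):

```shell
#!/bin/bash
# nproc reports the procs available to *this* process, honoring
# taskset / cgroup CPU affinity, not the host's physical core count.
avail=$(nproc)

# Clamp each hardcoded xargs -P value to what is actually available.
(( ncpus15 = avail < 15 ? avail : 15 ))
(( ncpus12 = avail < 12 ? avail : 12 ))

echo "xargs -P ${ncpus15} and xargs -P ${ncpus12}"
```

Run under e.g. taskset -c 0-3, nproc reports 4 and both values clamp to 4; on an unrestricted host they stay at 15 and 12.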

regards

Eric

@EricDeveaud
Author

Here's how I patched my copy of ClinSV.

ncpus15, ncpus12 and ncpus1 (not really useful in the case of 1, but anyway) limit the number of parallel jobs when submitted in a Slurm allocation or in a taskset context.
NB: I also used $^X in place of a plain perl call, just to be sure to use the same interpreter.
See: https://perldoc.perl.org/perlvar#$%5EX

@@ -319,20 +316,25 @@
                open(OUT,">$$cJ{rShStem}.sh");
                print OUT "$r_head\n\n";
                print OUT "cd $r_OutDir/\n";
+               # avoid overcommit: be sure to limit to the available number of procs
+               print OUT "(( ncpus15 = `nproc` < 15 ? `nproc` : 15 ))\n";
+               print OUT "(( ncpus12 = `nproc` < 12 ? `nproc` : 12 ))\n";
+               print OUT "(( ncpus1 = `nproc` < 1 ? `nproc` : 1 ))\n";

                foreach $cT (("q0","q20","mq")){ # create bw of s1 for MQ>=0 and MQ>=20
                        if($cT eq "mq"){
-                               print OUT "\n\nawk '\$2>=100000 {print \$1\":1-\"\$2}' $refFasta.chrom.sizes | xargs -P 15 -t -i{} perl $S_SV_scriptDir/bam2wigMQ.pl ";
+                               # use the same perl as the one clinsv runs with
+                               print OUT "\n\nawk '\$2>=100000 {print \$1\":1-\"\$2}' $refFasta.chrom.sizes | xargs -P \${ncpus15} -t -i{} $^X $S_SV_scriptDir/bam2wigMQ.pl ";
                                print OUT "-s 1 -r \"{}\" -o $r_TmpDir/$cSample.$cT -f $refFasta -b $rAlnDir/$cSample.bam\n";

-                               print OUT "\n\nawk '\$2<100000 {print \$1\":1-\"\$2}' $refFasta.chrom.sizes | xargs -P 1 -t -i{} perl $S_SV_scriptDir/bam2wigMQ.pl ";
+                               print OUT "\n\nawk '\$2<100000 {print \$1\":1-\"\$2}' $refFasta.chrom.sizes | xargs -P \${ncpus1} -t -i{} $^X $S_SV_scriptDir/bam2wigMQ.pl ";
                                print OUT "-s 1 -r \"{}\" -o $r_TmpDir/$cSample.$cT.small_contigs -f $refFasta -b $rAlnDir/$cSample.bam -a \n";

                        }else{
-                               print OUT "\n\nawk '\$2>=100000 {print \$1\":1-\"\$2}' $refFasta.chrom.sizes | xargs -P 15 -t -i{} perl $S_SV_scriptDir/bam2wig.pl ";
+                               print OUT "\n\nawk '\$2>=100000 {print \$1\":1-\"\$2}' $refFasta.chrom.sizes | xargs -P \${ncpus15} -t -i{} $^X $S_SV_scriptDir/bam2wig.pl ";
                                print OUT "-s 1 -q $cT -r \"{}\" -o $r_TmpDir/$cSample.$cT -f $refFasta -b $rAlnDir/$cSample.bam\n";

-                               print OUT "\n\nawk '\$2<100000 {print \$1\":1-\"\$2}' $refFasta.chrom.sizes | xargs -P 1 -t -i{} perl $S_SV_scriptDir/bam2wig.pl ";
+                               print OUT "\n\nawk '\$2<100000 {print \$1\":1-\"\$2}' $refFasta.chrom.sizes | xargs -P \${ncpus1} -t -i{} $^X $S_SV_scriptDir/bam2wig.pl ";
                                print OUT "-s 1 -q $cT -r \"{}\" -o $r_TmpDir/$cSample.$cT.small_contigs -f $refFasta -b $rAlnDir/$cSample.bam -a\n";
                        }
                        print OUT "cat $r_TmpDir/$cSample.$cT.*.wig > $r_TmpDir/$cSample.$cT.wig\n\n";

Illustration:

rpm_maker://tmp > cat no_overcommit.sh
#!/bin/sh
(( ncpus15 = `nproc` < 15 ? `nproc` : 15 ))
(( ncpus12 = `nproc` < 12 ? `nproc` : 12 ))
(( ncpus1 = `nproc` < 1 ? `nproc` : 1 ))

echo "ncpus15: ${ncpus15} - ncpus12: ${ncpus12} - ncpus1: ${ncpus1}"

And execution in various conditions:

rpm_maker://tmp > nproc 
64
rpm_maker://tmp > sh no_overcommit.sh 
ncpus15: 15 - ncpus12: 12 - ncpus1: 1

Now limited to 12 available procs:

rpm_maker://tmp > taskset -c 0-11 sh no_overcommit.sh 
ncpus15: 12 - ncpus12: 12 - ncpus1: 1

Even fewer:

rpm_maker://tmp > taskset -c 0-3 sh no_overcommit.sh 
ncpus15: 4 - ncpus12: 4 - ncpus1: 1

regards

Eric

@drmjc
Member

drmjc commented Feb 9, 2023 via email

@EricDeveaud
Author

As testing with NA12878_b38.bam is very time-consuming, I did not test with fewer than the requested resources.
I ran it using the full 64 CPUs, i.e. xargs -P 15 and -P 12 as initially.
So sorry, I won't be able to comment on performance.

It would be nice to have a small dataset to test with.

@drmjc
Member

drmjc commented Feb 11, 2023 via email

@J-Bradlee
Collaborator

@EricDeveaud
Author

Thanks,

as soon as I can get my hands on my computer I'll try with these ones.

Stay tuned.

@EricDeveaud
Author

Maybe you can change the bam file permissions to "Anyone with the link".
Currently I can't get it from the cluster (different id) using gdown.

@J-Bradlee
Collaborator

J-Bradlee commented Feb 14, 2023

Apologies I thought I had set it to anyone with link. Try again now

@EricDeveaud
Author

It's OK now, thanks.
