error in bcbio structural variant calling #653
Shangqian; I pushed a fix which resolves the issue by ensuring we use the Python installed with bcbio, which does contain numpy. If you upgrade with:
it will grab the latest code and should work cleanly now. Thanks again. |
Hi Brad,
Shangqian; |
Sorry for my typing mistake with the three callers :). Thanks again for your helpful suggestions and contributions. bcbio is great and very useful for me.
Hi Brad, when I run the code above, I get the following error: I know the problem is in the VCF files, so I opened the two VCF files and found that the sample names in the headers are in a different order: sample10/sample8/sample9 in gatk-hc, but sample8/sample10/sample9 in samtools. The problem was solved once I put them in the same order. However, I don't think manually modifying the files every time is a good approach. Is there an automatic way to get the same header order, just by modifying the input YAML file or bcbio-nextgen? Kind regards,
…with identical file names. Thanks to Severine Catreux. Resort multisample inputs that have inconsistent sample orders. Fixes bcbio/bcbio-nextgen#653
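The re-sorting mentioned in the commit message above can be illustrated with a minimal sketch. This is not the bcbio implementation: the function name `reorder_vcf_samples` is hypothetical, and the sketch assumes an uncompressed, tab-delimited VCF in which sample columns start at the tenth field.

```python
def reorder_vcf_samples(lines, sample_order):
    """Yield VCF lines with sample columns rearranged to match sample_order.

    Hypothetical helper for illustration; assumes tab-delimited text VCF.
    """
    indices = None
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith("##"):
            # Meta-information lines carry no per-sample columns.
            yield line
            continue
        fields = line.split("\t")
        if line.startswith("#CHROM"):
            samples = fields[9:]
            if sorted(samples) != sorted(sample_order):
                raise ValueError("sample sets differ: %s vs %s"
                                 % (samples, list(sample_order)))
            # Position of each requested sample in the existing header.
            indices = [samples.index(name) for name in sample_order]
            yield "\t".join(fields[:9] + list(sample_order))
        else:
            if indices is None:
                raise ValueError("variant record seen before #CHROM header")
            calls = fields[9:]
            yield "\t".join(fields[:9] + [calls[i] for i in indices])
```

Applied to both caller outputs with the same `sample_order`, this would give the two files identical header columns, so no manual editing is needed before combining them.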
Shangqian;
and re-run it should hopefully work cleanly now. Thanks again for the reports. |
Hi Brad, Thanks for your response. I have updated bcbio-nextgen. Thanks a lot.
[2014-11-10 17:06] ##### ERROR MESSAGE: There was a failure because you did not provide enough memory to run this program. See the -Xmx JVM argument to adjust the maximum heap size provided to Java
Kind regards,
Shangqian; |
Thank you, Brad. The cancer pipeline ran well. Many thanks for your big help every time. By the way, does bcbio require Read1 and Read2 of paired-end data to have the same length for bwa-mem alignment? Read1 and Read2 of different lengths, trimmed with trimmomatic, produced the error: "paired reads have different names". As far as I know, bwa-mem should handle reads of different lengths. So is there some special setting or parameter in bcbio that I missed? Thanks again. :)
Shangqian; http://www.usadellab.org/cms/index.php?page=trimmomatic It sounds like you may have trimmed the two files separately or added the unpaired reads back in, which creates non-identical pair names in your fastq files. Hope this helps.
Hi Brad,
[2014-11-20 18:16] [mem_sam_pe] paired reads have different names: "FFECCFHG>DEGCGGABGBCGEIDCFGGH:DF######", "AF@?EEFDB>B3<>FCD?BFBCGGGGFEHGEHE7GHHHHDEIHGE=>FECDE=AECCH/7?>DC@IH8DCFCFDC"
I think the error occurs because the paired reads have different names, so I grepped for "FFECCFHG>DEGCGGABGBCGEIDCFGGH:DF######" and "AF@?EEFDB>B3<>FCD?BFBCGGGGFEHGEHE7GHHHHDEIHGE=>FECDE=AECCH/7?>DC@IH8DCFCFDC", which are both at line 20000000 of the NA12891 R1 and R2 fastq files. The results are:
The results show that R1 and R2 sequences of the same length also produce the error, so I think it may be a data problem. I then used awk to extract 1M lines (lines 19999997-20999996) into test NA12891_test R1 and R2 files. The test files contain only lines 19999997-20999996 of the original files, including the @206B4ABXX100825:6:61:6782:130154/1 and /2 reads, and they run normally without any error. So I am not sure where the problem is. Any advice would be appreciated. Thanks again. Shangqian
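One way to narrow down a "paired reads have different names" error is to walk both fastq files in lockstep and report the first record where the names disagree. The sketch below is only an illustration, not part of bcbio or bwa: the function names are hypothetical, and it assumes plain 4-line fastq records whose pair members differ at most by a trailing `/1` or `/2`. It also fails early if a header line does not start with `@`, which would suggest the file has drifted out of its 4-line frame.

```python
from itertools import zip_longest

def pair_key(header):
    """Return the read name without '@', comments, or a /1 or /2 suffix."""
    parts = header[1:].split()
    name = parts[0] if parts else ""
    if name.endswith("/1") or name.endswith("/2"):
        name = name[:-2]
    return name

def fastq_headers(lines):
    """Yield the header line of every 4-line fastq record."""
    for i, line in enumerate(lines):
        if i % 4 == 0:
            header = line.rstrip("\r\n")
            if not header.startswith("@"):
                raise ValueError("record %d: header does not start with '@': %r"
                                 % (i // 4 + 1, header))
            yield header

def first_mismatch(r1_lines, r2_lines):
    """Return (record_number, r1_header, r2_header) for the first bad pair.

    Returns None when every record pairs up; a header of None means one
    file ended before the other.
    """
    pairs = zip_longest(fastq_headers(r1_lines), fastq_headers(r2_lines))
    for record, (h1, h2) in enumerate(pairs, start=1):
        if h1 is None or h2 is None or pair_key(h1) != pair_key(h2):
            return record, h1, h2
    return None
```

The functions accept any iterable of lines, so open file handles work directly; for gzipped input, `gzip.open(path, "rt")` gives a suitable text handle.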
Shangqian;
You may need to also remove |
Brad,
my yaml file is :
Thanks again for your helpful advice. Best, |
Hi Shangqian, Sorry about the DEXSeq issue; linking will fix it, our pre-built indices have the wrong extension. For the scheduler and queue, on your HPC, is there a job scheduler that you submit your jobs to that distributes the jobs over the nodes? There are a bunch of different types of scheduler, LSF is one, there are others like SLURM and SGE. If you can find out what scheduler your HPC has running then you put that as the scheduler, and then the queue you are allowed to submit jobs to as the queue. |
If you explicitly set a get_x or set_x function, it will not be overwritten. This lets special complicated cases get handled inside there without changing the function signature everywhere.
Fix to allow for .gff and .gff3 extensions for the DEXseq file. Addresses an issue raised in #653.
Shangqian, I fixed this DEXSeq behavior so now it will find either .dexseq.gff or .dexseq.gff3 files here: 0e9c746. |
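The two ideas in this fix (generated get_x/set_x accessors that do not overwrite explicit ones, and accepting either file extension) can be sketched together. This is an illustration under stated assumptions, not the code from the commit: `LOOKUPS`, its key names, `build_accessors`, and `first_existing` are all hypothetical.

```python
import os

def get_in(data, keys, default=None):
    """Walk nested dictionaries along keys, returning default on a miss."""
    for key in keys:
        if not isinstance(data, dict) or key not in data:
            return default
        data = data[key]
    return data

def set_in(data, keys, value):
    """Set a value in nested dictionaries, creating levels as needed."""
    node = data
    for key in keys[:-1]:
        node = node.setdefault(key, {})
    node[keys[-1]] = value
    return data

def first_existing(base, extensions=(".gff", ".gff3")):
    """Return base plus the first extension that exists on disk, else None."""
    for ext in extensions:
        if os.path.exists(base + ext):
            return base + ext
    return None

def build_accessors(lookups, explicit):
    """Generate get_x/set_x for each lookup unless one was given explicitly."""
    accessors = dict(explicit)
    for name, keys in lookups.items():
        # setdefault leaves explicitly supplied functions untouched.
        accessors.setdefault(
            "get_" + name, lambda data, keys=keys: get_in(data, keys))
        accessors.setdefault(
            "set_" + name,
            lambda data, value, keys=keys: set_in(data, keys, value))
    return accessors

# Hypothetical lookup table and key names, for illustration only.
LOOKUPS = {
    "gtf_file": ["genome_resources", "rnaseq", "transcripts"],
    "dexseq_gff": ["genome_resources", "rnaseq", "dexseq_base"],
}

def get_dexseq_gff(data):
    """Special case: accept .dexseq.gff or .dexseq.gff3 for the same base."""
    base = get_in(data, LOOKUPS["dexseq_gff"])
    return first_existing(base + ".dexseq") if base else None

ACCESSORS = build_accessors(LOOKUPS, {"get_dexseq_gff": get_dexseq_gff})
```

The design point is that callers always use the same accessor name; only the special case changes when a file naming convention turns out to vary.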
Hi Roryk, Thanks for your prompt response. It helps me so much. Thanks a lot. Best,
RoryK, The gff problem has been fixed, but another issue remains. The error showed: So how can I install the DEXSeq package in bcbio? Thanks.
Hi Shangqian, Hm-- it should be getting installed automatically. If you fire up R and do:
it should install it. |
Hi Roryk, I installed DEXSeq on the node with R. But when I run bcbio, it still can't find the package. I think the DEXSeq package may not be visible to bcbio when it runs. Shangqian
Hi Shangqian, I agree, that seems like it should work, thanks for helping to debug this. Hmm-- if you type:
Does it output a directory or say the package cannot be found? If it works, does it work also with R_LIBRARY_PATH unset? |
…ndle GFF retrieval when no GTF file in DEXSeq unit tests. Fixes #653
Shangqian;
it should hopefully work cleanly now. Thanks for the bug report and hope this fixes it for you. |
Brad and Roryk, By the way, in my earlier whole-genome SV test, bcbio ran normally. But five days ago I submitted real lung cancer data for SV analysis, and this error happened this morning:
[2014-12-03 09:01] Index BAM file: 1_2014-11-28_ceu-sort-7_78680799_94537032-prep.bam
I can't find the cause of this error, because the same command worked for another BAM file. Can you help me? Thanks. Shangqian
…ut bai file) since not supported in samtools 1.1 (samtools/samtools#199). Fixes #653
Shangqian; |
Brad, I ran it 5 times, and every time it gave the same error. Can you check this?
Shangqian;
Hi Brad, Is this problem caused by the sambamba package? How can I fix this?
Shangqian;
which should pull it in. Hope this helps. |
Brad,
[root@compute-0-15 bin]# /public/software/bcbio-nextgen/tools/bin/brew update
Should I rm homebrew/science and re-upgrade?
Shangqian;
then you should be able to re-run the updater and find everything working. Hope this helps figure it out. |
Brad, Thanks for your advice. I have upgraded bcbio, and DEXSeq is working now. But when I test the exome pipeline with the command: it yields the following info: and this command has taken at least one day (it is still running now). Shangqian
Shangqian; If you upgrade with |
Thanks, Brad. The upgrade solved the problem. Thanks again.
Brad and Roryk, Our HPC has 32 nodes (each node has 20 cores) and uses the PBS scheduler for submitting jobs. I previously submitted a PBS file to run bcbio on just one node and it worked well. But now I need to analyse many samples in bcbio, so I think I can use parallel execution across multiple nodes, as Roryk told me. The following is my test PBS file for node13 and node17:
#PBS -N exome_s10
When I qsub this file, it yields an error like this:
[2014-12-10 11:45] compute-0-13.local: Resource requests: bwa, sambamba, samtools; memory: 2.0; cores: 16, 1, 16
So I am a little confused: did I understand your advice correctly? Do I need to modify my PBS file or the bcbio system? By the way, after I qsub the PBS file, the job runs only on node13 and not on node17. Thanks again for your kind response. Shangqian
Hi Shangqian, Is your HPC busy? If you have to wait a long time to get a job, bcbio-nextgen will time out. When you tried to submit it, were the jobs pending for a long time or did they move to running status? If they are pending and bcbio-nextgen is timing out, you can have bcbio-nextgen wait for longer by adding --timeout time-in-minutes to the bcbio-nextgen command, so it won't time out while it is waiting. Hope that helps, let us know how it goes. Best, Rory
Hi Rory, |
Hi Shangqian, If the nodes were idle then it might be an issue running on Torque. When you submit the job does everything get to the running state and it still times out, or are the jobs pending? If the jobs are in the running state but it still times out that would be very helpful to know. |
Hi Rory, |
Great, when the jobs are in the running state, is there a controller job and an engine job both running too? There should be three jobs running, one that is the bcbio_nextgen job in the script you wrote to submit to the scheduler. The other two should be a controller and a set of engines. Were all of those running, or just the one bcbio_nextgen job? If the controller and engine jobs were running too, are there files that have engine and ipcontroller in them that are in your directory? I think if you look at those, you should see some errors talking about heartbeats between the engine and controller. Do you see something like that? |
Thanks, Rory. top shows that "bcbio_nextgen.py" is running.
They should be appearing on there if they are running, so it seems like they aren't starting. Are there engine and ipcontroller files in the directory? There should be job submission scripts for each of them. |
Yes, they are in the lib and pkgs folders: /public/software/bcbio-nextgen/anaconda/pkgs/ipython-2.3.1-py27_0/bin/ipcontroller
Hm-- nothing in the work directory? They should look like torque_controller with a bunch of letters and numbers after them. If we can track down those files we can try to figure out why they aren't getting run. |
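A quick way to look for those submission scripts is to glob the work directory by prefix. This is a minimal sketch, not a bcbio utility: `find_cluster_scripts` is a hypothetical name, the `torque_controller` prefix comes from the comment above, and the `torque_engine` prefix is an assumption made by analogy.

```python
import glob
import os

def find_cluster_scripts(work_dir,
                         prefixes=("torque_controller", "torque_engine")):
    """Map each prefix to the sorted list of matching files in work_dir.

    The torque_engine prefix is assumed by analogy with torque_controller.
    """
    found = {}
    for prefix in prefixes:
        pattern = os.path.join(work_dir, prefix + "*")
        found[prefix] = sorted(glob.glob(pattern))
    return found
```

An empty list for a prefix would suggest that the corresponding submission script was never written, which points at a problem before the scheduler is even involved.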
Yes, I found them in the work directory. The contents are: Engines:
Shangqian;
does it provide any useful error messages? It sounds like something with the submission is problematic with your setup and maybe this will provide a clue. Thanks again. |
Hi, I would try to submit one of these files. Normally they don't start It happened to me in one queue where you should submit a job with more
On 12/10/2014 03:18 AM, shang-qian wrote:
Hi lpantano, |
Hi, if you run it just a file with the same header but with another thing,
#!/bin/sh
cd /SOME/PATH/WITH/FILES
On 12/10/2014 09:57 AM, shang-qian wrote:
Hi Brad,
Thanks for your help. I want to call structural variants, but get an error: the parallel, svtyper, cnvnator_wrapper.py, cnvnator-multi, annotate_rd.py are not found in PATH, like this:
[2014-10-27 23:05] Uncaught exception occurred
Traceback (most recent call last):
File "/public/software/bcbio-nextgen/anaconda/lib/python2.7/site-packages/bcbio/provenance/do.py", line 20, in run
_do_run(cmd, checks, log_stdout)
File "/public/software/bcbio-nextgen/anaconda/lib/python2.7/site-packages/bcbio/provenance/do.py", line 93, in _do_run
raise subprocess.CalledProcessError(exitcode, error_msg)
CalledProcessError: Command 'set -o pipefail; speedseq sv -v -B ......
Sourcing executables from /public/software/bcbio-nextgen/tools/bin/speedseq.config ...
which: no parallel in (/public/software/bcbio-nextgen/tools/bin:/public/software/bcbio-nextgen/anaconda/bin:.....)
which: no svtyper in (/public/software/bcbio-nextgen/tools/bin:/public/software/bcbio....
which: no cnvnator_wrapper.py in (/public/software/bcbio-nextgen/tools/bin:/public/software/bcbio....
which: no cnvnator-multi in (/public/software/bcbio-nextgen/tools/bin:/public/software/bcbio-....
which: no annotate_rd.py in((/public/software/bcbio-nextgen/tools/bin:/....)
Calculating alignment stats... sambamba-view: (Broken pipe)
Traceback (most recent call last):
File "/public/software/bcbio-nextgen/tools/share/lumpy-sv/pairend_distro.py", line 12, in <module>
import numpy as np
ImportError: No module named numpy
How can I fix this? Thanks again.
Shangqian
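The log above shows two separate failures: executables missing from PATH and a Python interpreter without numpy. A small check along the following lines can confirm both before re-running. It is a sketch only, with hypothetical function names, and it assumes Python 3 (`shutil.which` and `importlib.util.find_spec` are standard library calls there); it should be run with the same interpreter and PATH that the pipeline uses.

```python
import importlib.util
import shutil

def missing_executables(names, path=None):
    """Return the names that cannot be found on PATH (or on the given path)."""
    return [name for name in names if shutil.which(name, path=path) is None]

def module_available(name):
    """Return True if the current interpreter can locate the module."""
    return importlib.util.find_spec(name) is not None

if __name__ == "__main__":
    # Tool names taken from the "which: no ..." lines in the log above.
    tools = ["parallel", "svtyper", "cnvnator_wrapper.py",
             "cnvnator-multi", "annotate_rd.py"]
    print("missing executables:", missing_executables(tools))
    print("numpy importable:", module_available("numpy"))
```

If numpy is reported as unavailable, the script is probably running under a different Python than the one installed with bcbio, which matches the cause identified in the fix for this issue.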