Failed to complete command task: 'rm_individual_seg_files' launched from master workflow, #6

Open
yu052 opened this issue Nov 18, 2020 · 11 comments

@yu052

yu052 commented Nov 18, 2020

Dear author,

I recently built the Singularity environment from the Docker image. When I ran it, I got the error mentioned in the title. The detailed error information is as follows:
[2020-11-17T17:07:15.734560] [node127.cm.cluster] [68840_1] [WorkflowRunner] [ERROR] Worklow terminated due to the following task errors:
[2020-11-17T17:07:15.735768] [node127.cm.cluster] [68840_1] [WorkflowRunner] [ERROR] Failed to complete command task: 'rm_individual_seg_files' launched from master workflow, error code: 1, command: 'rm'
[2020-11-17T17:07:15.736237] [node127.cm.cluster] [68840_1] [WorkflowRunner] [ERROR] [rm_individual_seg_files] Error Message:
[2020-11-17T17:07:15.736848] [node127.cm.cluster] [68840_1] [WorkflowRunner] [ERROR] [rm_individual_seg_files] Last 2 stderr lines from task (of 2 total lines):
[2020-11-17T17:07:15.736848] [node127.cm.cluster] [68840_1] [WorkflowRunner] [ERROR] [2020-11-17T15:50:23.975587] [node127.cm.cluster] [68840_1] [rm_individual_seg_files] rm: missing operand
[2020-11-17T17:07:15.736848] [node127.cm.cluster] [68840_1] [WorkflowRunner] [ERROR] [2020-11-17T15:50:23.975877] [node127.cm.cluster] [68840_1] [rm_individual_seg_files] Try 'rm --help' for more information.
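
For reference, "rm: missing operand" is what rm prints when it is called without any file arguments, so the task most likely had an empty list of files to delete. A minimal sketch of a guard against that case follows; `rm_args` and its `files` parameter are hypothetical names for illustration and are not taken from the pipeline's code:

```rust
/// Build the argument list for an `rm` call, or `None` when there is nothing to delete.
/// `files` stands in for whatever per-segment files the task would remove
/// (hypothetical; the real task's inputs are not shown in this thread).
fn rm_args(files: &[String]) -> Option<Vec<String>> {
    if files.is_empty() {
        // Running `rm` with no operands is what produces "rm: missing operand".
        return None;
    }
    // "--" marks the end of options so file names starting with '-' are safe.
    let mut args = vec!["rm".to_string(), "--".to_string()];
    args.extend(files.iter().cloned());
    Some(args)
}
```

An empty list here would usually point to an earlier step that produced no output files, which is worth checking first.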

Do you have any clue why this error happened? Can you please help me to solve it?

Additional information: I was running the program on a pair of WGS samples from canine tumour and normal tissue. The required reference files were properly prepared, except for the snp_sites.gz file. However, I removed the --callRegions option from the Strelka SNP-calling step in main.py, so the program can still run without the snp_sites file.

Regards,
Yun

@polyactis
Owner

polyactis commented Nov 18, 2020 via email

@yu052
Author

yu052 commented Nov 18, 2020

pyflow_log.zip

Thanks for your response!
I attach the pyflow log files.

I have a SNP-sites file for dogs, but I decided to disable --callRegions because that file gave me a strange error: chr33 in snp_sites.gz was not found in the reference genome. Maybe disabling that option was a bad idea. Do you have any clue why chr33 in particular cannot be found in the reference genome? I checked the reference genome file, and I think chr33 is there.

Thanks for your help in advance!

Regards,
Yun

@polyactis
Owner

Did you check whether chr33 is in your genome index files (e.g. genome.dict)?
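
One way to run that check is sketched below. It assumes the usual text layouts: a `.fai` index with the contig name in the first tab-separated column, and a `.dict` file with `@SQ` lines carrying an `SN:` tag. The function names are illustrative only, and the file contents are passed in as strings so the sketch stays self-contained:

```rust
use std::collections::HashSet;

/// Contig names from the text of a `.fai` index: first tab-separated field of each line.
fn fai_contigs(fai_text: &str) -> HashSet<String> {
    fai_text
        .lines()
        .filter_map(|line| line.split('\t').next())
        .filter(|name| !name.is_empty())
        .map(String::from)
        .collect()
}

/// Contig names from the text of a `.dict` file: the `SN:` tag of each `@SQ` line.
fn dict_contigs(dict_text: &str) -> HashSet<String> {
    dict_text
        .lines()
        .filter(|line| line.starts_with("@SQ"))
        .filter_map(|line| line.split('\t').find_map(|field| field.strip_prefix("SN:")))
        .map(String::from)
        .collect()
}

/// Names from `wanted` that are absent from at least one of the two indexes.
fn missing_contigs(wanted: &[&str], fai_text: &str, dict_text: &str) -> Vec<String> {
    let fai = fai_contigs(fai_text);
    let dict = dict_contigs(dict_text);
    wanted
        .iter()
        .filter(|name| !(fai.contains(**name) && dict.contains(**name)))
        .map(|name| String::from(*name))
        .collect()
}
```

Reading genome.fa.fai and genome.dict with `std::fs::read_to_string` and calling `missing_contigs(&["chr33"], &fai, &dict)` would then report whether the name is absent from either file.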

@fanxinping
Collaborator

We saw this in pyflow_tasks_stdout_log.txt, which suggests that all reads in your BAM files fail to pass our filters:

Reading in genome coverage from "/home/WUR/yu052/DogWUR108_rh.dedup_st.reA.bam" ...

Reading and smoothing of coverage from "/home/WUR/yu052/DogWUR108_rh.dedup_st.reA.bam" is Done. 0 unique chromosomes, 902274498 reads.
Genome wide mean coverage is NaN
Reading in genome coverage from "/home/WUR/yu052/DogWUR115_rh.dedup_st.reA.bam" ...
Reading and smoothing of coverage from "/home/WUR/yu052/DogWUR115_rh.dedup_st.reA.bam" is Done. 0 unique chromosomes, 174036637 reads.
Genome wide mean coverage is NaN

Our Rust program contains these filters. Can you check your BAM files to see why all reads fail to pass them?

            if record.mapq()<30 {
                continue
            }
            if record.is_paired() && ( record.insert_size()<0 || record.insert_size()>self.max_fragment_len as i64 ||
                !record.is_proper_pair() || record.is_mate_unmapped() || !record.is_first_in_template() ||
                record.is_secondary() || record.is_duplicate() || record.is_supplementary() ) {
                continue;
            }
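
To see which condition is responsible, it can help to tally the rejection reason per read. The sketch below mirrors the conditions of the snippet above on a plain stand-in struct; `Read`, `rejection_reason`, and `tally` are hypothetical names for illustration and are not part of our program or of any BAM library:

```rust
use std::collections::BTreeMap;

/// Stand-in for the record fields the filter reads (hypothetical, for illustration).
#[derive(Clone, Copy)]
struct Read {
    mapq: u8,
    is_paired: bool,
    insert_size: i64,
    is_proper_pair: bool,
    is_mate_unmapped: bool,
    is_first_in_template: bool,
    is_secondary: bool,
    is_duplicate: bool,
    is_supplementary: bool,
}

/// First condition that would reject the read, or `None` if it passes.
/// The checks run in the same order as in the filter snippet.
fn rejection_reason(r: &Read, max_fragment_len: i64) -> Option<&'static str> {
    if r.mapq < 30 { return Some("mapq < 30"); }
    // As in the snippet, the remaining checks apply only to paired reads.
    if !r.is_paired { return None; }
    if r.insert_size < 0 { return Some("negative insert size"); }
    if r.insert_size > max_fragment_len { return Some("insert size too large"); }
    if !r.is_proper_pair { return Some("not a proper pair"); }
    if r.is_mate_unmapped { return Some("mate unmapped"); }
    if !r.is_first_in_template { return Some("not first in template"); }
    if r.is_secondary { return Some("secondary"); }
    if r.is_duplicate { return Some("duplicate"); }
    if r.is_supplementary { return Some("supplementary"); }
    None
}

/// Count reads per rejection reason; reads that pass are counted under "passed".
fn tally(reads: &[Read], max_fragment_len: i64) -> BTreeMap<&'static str, usize> {
    let mut counts = BTreeMap::new();
    for r in reads {
        let key = rejection_reason(r, max_fragment_len).unwrap_or("passed");
        *counts.entry(key).or_insert(0) += 1;
    }
    counts
}
```

If a sample of reads from the BAM file mostly lands under "passed" in such a tally, the filters themselves are unlikely to explain the "0 unique chromosomes" line in the log.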

@yu052
Author

yu052 commented Nov 24, 2020

I checked that chromosome 33 is indeed in the genome.fa, genome.dict, and genome.fa.fai.
It is weird that all the reads in the bam failed to pass the filters, isn't it? I confirmed that most reads have mapq 60.
Do you have any other clue why I got these errors?

Kind regards

@polyactis
Owner

polyactis commented Nov 25, 2020 via email

@yu052
Author

yu052 commented Nov 25, 2020

Thanks for your response!
Actually, I am pretty sure I have good BAM files. I checked them in JBrowse and IGV. Of course, there are mismapped reads, but not many. It makes no sense that all the reads failed the filters, right?
Strelka didn't work at the first step of your accucopy pipeline, but I succeeded in running Strelka independently. This also confuses me.

@polyactis
Owner

You can copy your independent Strelka into the Docker container, overwrite the bundled version, and run it inside the container to see whether anything strange happens.

Your BAM files identify chromosomes as "chr1", not "1", right? Our program assumes "chr1", not "1".
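
A quick way to test for this kind of naming mismatch is sketched below, assuming the contig names have already been extracted from the BAM header into a list. Both helper names are made up for illustration and are not part of our program:

```rust
/// Add the "chr" prefix when it is missing, e.g. "1" -> "chr1".
/// Caveat: this handles only the plain prefix difference. Names that differ in
/// other ways between naming schemes (for example mitochondrial or unplaced
/// contigs) need an explicit mapping.
fn with_chr_prefix(name: &str) -> String {
    if name.starts_with("chr") {
        name.to_string()
    } else {
        format!("chr{}", name)
    }
}

/// True when the list is non-empty and every name already starts with "chr".
fn all_chr_prefixed(names: &[&str]) -> bool {
    !names.is_empty() && names.iter().all(|n| n.starts_with("chr"))
}
```

If `all_chr_prefixed` returns false for the names in your BAM header, the BAM and the reference would need to be brought to the same naming scheme before running the pipeline.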

@yu052
Author

yu052 commented Dec 5, 2020

Is it possible to make your program compatible with both formats?

@polyactis
Owner

polyactis commented Dec 5, 2020 via email

@yu052
Author

yu052 commented Dec 5, 2020

Sorry that I didn't make it clear. I mean the format of the chromosome names, "chr1" versus "1".
