Count process never ends #19

Open
ferrasa opened this issue Jul 23, 2018 · 20 comments

@ferrasa

ferrasa commented Jul 23, 2018

I'm trying to process the public sample SRR6325536 (https://www.ncbi.nlm.nih.gov/sra/?term=SRR6325536). The Count process never ends (the stringtie subprocess has now been running for more than 80 hours). I have rerun the process 3 times (on different machines), but the issue persists. No log messages were reported by SQuIRE. Thanks.

@wyang17
Owner

wyang17 commented Jul 23, 2018

  • Are you running SQuIRE on a cluster or on a computer?
  • What version of Linux are you using? (you can use this command: cat /proc/version)
  • How much memory and how many cores have you allocated?
This will help us replicate the issue. Thank you!

@ferrasa
Author

ferrasa commented Jul 23, 2018 via email

@mariuswalter

Hi, it seems that I have the same issue: Count is taking forever. Is that normal?
I am running on a cluster, with -p 10.
Monitoring with top, it looks like stringtie and samtools/bedtools run just fine, but then it has been stuck on awk for 2 days.
Also, it really looks like it's only using one CPU.

Thank you!

@cpacyna
Collaborator

cpacyna commented Feb 15, 2019

Could you share the arguments you're using with Count? I can troubleshoot from there. Thanks!

@mariuswalter

Thanks for helping!

My commands were:
For Map:
squire Map -1 sample_1.fastq.gz -2 sample_2.fastq.gz -b hg38 -g -p 10 -r 125
and then, for Count:
squire Count -b hg38 -n sample -p 10 -r 125

Is there something wrong here?
Thanks again! I was excited when this new tool came out, and very impatient to see what the result looks like!

@mariuswalter

After letting the script run for about a week, it looks like it stopped, but I don't see anything that looks like an output file...

Any thoughts?

@cpacyna
Collaborator

cpacyna commented Feb 21, 2019

Hi, so sorry for the delay in responding! Strange that you're not getting any output; your commands look OK. Did Map run well? Could you try running Count with the '--verbosity' argument so we can see what is going wrong? Thanks for your patience!

Regards,
Chloe

@mariuswalter

Hi Chloe,

Thanks a lot for your help!

The results from Map looked good to me. The files I'm getting from Map are:

6.0G Feb 11 18:09 SRR6515353_1.fastq.bam
3.3M Feb 11 18:13 SRR6515353_1.fastq.bam.bai
0 Feb 11 17:25 SRR6515353_1.fastqChimeric.out.junction
7.6K Feb 11 17:25 SRR6515353_1.fastqChimeric.out.sam
1.9K Feb 11 17:52 SRR6515353_1.fastq.log
7.9M Feb 11 17:52 SRR6515353_1.fastqSJ.out.tab
172 Feb 11 17:00 SRR6515353_1.fastq_STARgenome
45 Feb 11 17:23 SRR6515353_1.fastq_STARpass1

One of the Count processes finally ended with an error. Here is what --verbosity reported:

Creating temporary files2019-02-14 09:48:22.953155

Creating unique and multiple alignment bedfiles 2019-02-14 09:48:22.953420

Identifying properly paired reads 2019-02-14 09:48:22.953438

Intersecting bam files with TE bedfile 2019-02-14 10:32:20.962550

Splitting into read1 and read 2 2019-02-14 10:43:05.599882

Combining adjacent TEs with same read alignment 2019-02-14 10:43:32.953759

Getting genomic coordinates of read2019-02-14 10:48:22.120913

Identifying and labeling unique and multi reads2019-02-14 10:49:38.834429

Matching paired-end mates and merging coordinates2019-02-20 00:14:14.806573

join: multi-character tab '$\\t'
Traceback (most recent call last):
  File "/home/mwalter/miniconda3/envs/squire/bin/squire", line 11, in <module>
    load_entry_point('SQuIRE', 'console_scripts', 'squire')()
  File "/home/mwalter/SQuIRE/squire/cli.py", line 156, in main
    subargs.func(args = subargs)
  File "/home/mwalter/SQuIRE/squire/Count.py", line 1743, in main
    match_reads(paired_tempfile1_ulabeled,paired_tempfile2_ulabeled,strandedness,paired_matched_tempfile,paired_unmatched1, paired_unmatched2,debug) #match pairs between paired files
  File "/home/mwalter/SQuIRE/squire/Count.py", line 494, in match_reads
    sp.check_call(["/bin/sh","-c",joincommand])
  File "/home/mwalter/miniconda3/envs/squire/lib/python2.7/subprocess.py", line 186, in check_call
    raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['/bin/sh', '-c', "join -j 12 -t $'\\t' -o 1.1,1.2,1.3,1.4,1.5,1.6,1.7,1.8,1.9,1.10,2.1,2.2,2.3,2.4,2.5,2.6,2.7,2.8,2.9,2.10 /work/mwalter/temp/SRR6515349_paired_ulabeled_1.tmpR0EGvB_newread_v1 /work/mwalter/temp/SRR6515349_paired_ulabeled_2.tmpNlSS3Q_newread_v1 > /work/mwalter/temp/SRR6515349_paired_matched.tmp9QGnUZ_10k_v1"]' returned non-zero exit status 1

I hope we can find a way to make it work, I’m really excited about using it!!

marius

@mariuswalter

So I (well, my roommate) managed to solve the issue, at least on my platform.
It looks like there was a problem in the way the join command was read by the shell.
In lines 492 and 536 of Count.py, we replaced "$'\\t'" with '"$(printf \'\t\')"', and that seems to have worked.
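
Another way to sidestep the quoting problem is to run join without any shell, so the tab delimiter never has to survive shell parsing. The following is only a minimal sketch, assuming Python 3 and two tab-delimited inputs already sorted on the join field; the helper name join_on_field and its parameters are hypothetical and not part of SQuIRE, and the -o output list from the original command is omitted for brevity.

```python
import subprocess


def join_on_field(left_path, right_path, out_path, field=1):
    # Hypothetical helper, not SQuIRE code.
    # The command is an argument list (no shell), so "\t" reaches `join`
    # as one literal tab character regardless of what /bin/sh supports.
    cmd = ["join", "-j", str(field), "-t", "\t", left_path, right_path]
    with open(out_path, "w") as out:
        # check_call raises CalledProcessError if join exits non-zero.
        subprocess.check_call(cmd, stdout=out)
```

Called as join_on_field("reads_1.tsv", "reads_2.tsv", "matched.tsv", field=12), this would mirror the -j 12 join in the traceback, with the caveat that the file names here are placeholders.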

@savytskanatalia

Hi, Marius!
Thank you and your roommate very much for sharing your solution! I had the same problem, which is now solved thanks to you.
Best regards,
Natalia.

@cpacyna
Collaborator

cpacyna commented Feb 27, 2019

Yes, thanks to you and your roommate! We'll take a look at the issue — sorry it held up your results. Let us know if you have other problems!

Regards,
Chloe

@cpacyna
Collaborator

cpacyna commented Mar 1, 2019

Hi Marius and Natalia, hope you've had smooth sailing after fixing the bug. Could you let me know what version of bash you're using? We haven't come across this issue in beta testing and want to replicate/fix it!

@mariuswalter

mariuswalter commented Mar 1, 2019

marius:~$ bash --version
GNU bash, version 4.4.12(1)-release (x86_64-pc-linux-gnu)
marius:~$ join --version
join (GNU coreutils) 8.26

Smooth sailing so far! From what we've read, the solution we came up with should be platform independent, so I'm curious to see if it works on yours at least!

@savytskanatalia

I have the same bash and join versions as Marius.
I tested the bug fix with only 10 EM iterations, and it worked perfectly.

@MaxwellShih

Works here! Thank you so very much~
Ubuntu 18.04
GNU bash, version 4.4.20(1)-release (x86_64-pc-linux-gnu)
Copyright (C) 2016 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later http://gnu.org/licenses/gpl.html

@alfonsosaera

alfonsosaera commented Jan 22, 2021

Same problem here. However, the solution posted above did not work for me.

$ cat /proc/version
Linux version 4.15.0-130-generic (buildd@lcy01-amd64-018) (gcc version 7.5.0 (Ubuntu 7.5.0-3ubuntu1~18.04)) #134-Ubuntu SMP Tue Jan 5 20:46:26 UTC 2021
$ bash --version
GNU bash, version 4.4.20(1)-release (x86_64-pc-linux-gnu)
$ join --version
join (GNU coreutils) 8.28

I was able to fix it by changing the shell in lines 494 and 538 from

sp.check_call(["/bin/sh","-c",joincommand])

to

sp.check_call(["/bin/bash","-c",joincommand])
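
Some context on why switching the shell helps: on Ubuntu, /bin/sh is typically dash, which does not implement the $'...' quoting that bash does, so join receives the literal characters $\t, which matches the "multi-character tab" error above. A roughly equivalent way to request bash from Python is the executable argument of subprocess, sketched below under the assumption that /bin/bash exists on the system; run_in_bash is a hypothetical name, not a SQuIRE function.

```python
import subprocess


def run_in_bash(command):
    # Hypothetical helper, not SQuIRE code.
    # With shell=True, subprocess uses /bin/sh by default on POSIX;
    # `executable` swaps in bash so that $'\t' is expanded to a real tab.
    return subprocess.check_output(command, shell=True, executable="/bin/bash")
```

For example, run_in_bash("printf '%s' $'a\\tb'") returns the bytes for a, a tab, and b, whereas a shell without $'...' support would pass the backslash sequence through unexpanded.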

@TDLewin

TDLewin commented Mar 2, 2021

Thanks a lot for posting this. This fixed it for me too.

@RRebo

RRebo commented May 26, 2021

Hello, my question is related to the subject "count process never ends": did the awk step run faster with your fixes? I have a genome with ~70% TEs, and many are transcribed. The "Identifying and labeling unique and multi reads" step has been running for a long time now (more than 100 h) and is stuck on the awk step with 1 CPU. Any ideas on how to increase the speed? I did not apply the "solutions" posted here since I don't actually get an error message.

@savytskanatalia

No, the solution does not speed things up. Some of my runs literally lasted for months. TElocal from the Hammell lab is much faster, though, unlike SQuIRE, it does not take unique-mapper counts into account when calculating the fractions for multimappers.

@RRebo

RRebo commented May 26, 2021

Thank you for the suggestions!
