SeqSero2 "freezes" on particular sample #48

Open
eam12 opened this issue Jul 31, 2023 · 10 comments

Comments

@eam12

eam12 commented Jul 31, 2023

Thank you for creating SeqSero2. I use it almost every day! I'm currently running SeqSero2 (v.1.1.1) on thousands of samples, but for one particular sample SeqSero2 freezes every time at the "assembling..." step. I've waited up to 12 hours for the process to finish, but nothing happens. All other samples have run successfully within minutes, so I'm not sure what it is about this particular one. The FASTQ files look completely normal, as does the assembly. In the data_log.txt file, the last few lines printed are as follows:

  0:00:03.150    17M / 32M   INFO    General                 (kmer_coverage_model.cpp   : 259)   Fitting coverage model
  0:00:03.154    17M / 32M   INFO    General                 (kmer_coverage_model.cpp   : 295)   ... iteration 2
  0:00:03.161    17M / 32M   INFO    General                 (kmer_coverage_model.cpp   : 295)   ... iteration 4
  0:00:03.171    17M / 32M   INFO    General                 (kmer_coverage_model.cpp   : 295)   ... iteration 8
  0:00:03.191    17M / 32M   INFO    General                 (kmer_coverage_model.cpp   : 295)   ... iteration 16
  0:00:03.223    17M / 32M   INFO    General                 (kmer_coverage_model.cpp   : 295)   ... iteration 32
  0:00:03.285    17M / 32M   INFO    General                 (kmer_coverage_model.cpp   : 295)   ... iteration 64

I can send the paired FASTQ files upon request.

@tongzhouxu

Hi,

Could you please provide the command you used and share the FASTQ files so we can try reproducing the error?

Thanks!

@eam12
Author

eam12 commented Jul 31, 2023

The two FASTQ files are 70MB each so GitHub won't let me upload them. Is there an email I could send them to?

Command I used:

SeqSero2_package.py -s -t 2 -p 12 -i SRR17736741_trim_R1_paired.fastq.gz SRR17736741_trim_R2_paired.fastq.gz -d SRR17736741_seqsero2 -n SRR17736741

A couple more things:

  • I just updated to SeqSero2 v.1.2.1 and I'm still running into the same issue.
  • When I use these same two FASTQ files and assemble using Shovill with the Spades assembler, I have no issues.
  • Serovar should come back as Thompson.
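When a batch covers thousands of samples and one of them hangs, it can help to put a time limit around each run so the batch keeps moving and the stuck sample is flagged. The following is a minimal sketch, not part of SeqSero2: `run_with_timeout` is a hypothetical helper name, and the stand-in command at the bottom only exists so the example runs without SeqSero2 installed.

```python
import subprocess
import sys


def run_with_timeout(cmd, timeout_s):
    """Run cmd (a list of arguments) and classify the outcome.

    Returns ("ok", 0), ("failed", returncode) or ("timeout", None).
    """
    try:
        proc = subprocess.run(
            cmd,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout_s,
        )
    except subprocess.TimeoutExpired:
        # subprocess.run kills the direct child before raising; processes
        # that child started itself (e.g. an assembler) may need separate cleanup.
        return "timeout", None
    if proc.returncode == 0:
        return "ok", 0
    return "failed", proc.returncode


if __name__ == "__main__":
    # Stand-in command so the sketch runs anywhere; replace with the real call.
    print(run_with_timeout([sys.executable, "-c", "print('done')"], timeout_s=30))
```

For the command above, the argument list would start with `"SeqSero2_package.py", "-s", "-t", "2", "-p", "12", "-i", ...`, and the timeout would be chosen generously relative to the few minutes a normal sample takes. The exact limit is a judgment call.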

@tongzhouxu

It might be a bug with Spades. Could you please try updating spades to V3.9.0 and see if you still have the same issue? If the problem persists, please send the raw reads to tongzhou.xu@uga.edu and we will try to reproduce the error.

Thanks!

@eam12
Author

eam12 commented Aug 1, 2023

It worked! Well, it didn't give me a serovar prediction (No serotype antigens were detected. This is an atypical result that should be further investigated.), but at least it didn't freeze! Many, many thanks for the suggestion.

@denglab
Owner

denglab commented Aug 1, 2023 via email

@eam12
Author

eam12 commented Aug 22, 2023

To follow up on this a few weeks later, I'm now receiving the following message for every sample I try to run:

Note:	No serotype antigens were detected. This is an atypical result that should be further investigated. 

When I look at the log file I see the error message:

== Error ==  python version 3.7 is not supported!
Supported versions are 2.4, 2.5, 2.6, 2.7, 3.2, 3.3, 3.4, 3.5
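Because the final report only says that no antigens were detected, the underlying failure is easy to miss unless the log is searched. A minimal sketch of such a search follows. It assumes the `== Error ==` marker seen in the excerpt above. `find_error_lines` is a hypothetical helper, and other tools in the pipeline may mark errors differently.

```python
def find_error_lines(log_text, marker="== Error =="):
    """Return (line_number, text) pairs for log lines containing marker."""
    hits = []
    for number, line in enumerate(log_text.splitlines(), start=1):
        if marker in line:
            hits.append((number, line.strip()))
    return hits


if __name__ == "__main__":
    # Excerpt copied from the log quoted in this thread.
    sample = (
        "== Error ==  python version 3.7 is not supported!\n"
        "Supported versions are 2.4, 2.5, 2.6, 2.7, 3.2, 3.3, 3.4, 3.5\n"
    )
    for number, text in find_error_lines(sample):
        print(number, text)
```

In practice the text would be read from the run's log file (data_log.txt in the first comment) before calling the function.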

So it looks like version 3.9.0 of spades (the version you suggested I downgrade to) isn't compatible with the version of python I'm using (v.3.8). To fix this, I tried downgrading the version of python to 3.5, but that led to a number of additional package conflicts that couldn't be solved:

Problem: package biopython-1.73-py37h14c3975_0 requires python >=3.7,<3.8.0a0, but none of the providers can be installed
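The mismatch described here, where the assembler's launcher rejects the interpreter version, can be checked before starting a run. The sketch below parses the "Supported versions are ..." line quoted from the log and compares it with a given interpreter version. Both function names are hypothetical, and the message format is assumed to match the excerpt in this comment.

```python
import re
import sys


def parse_supported_versions(message):
    """Extract (major, minor) tuples from a 'Supported versions are ...' line."""
    match = re.search(r"Supported versions are (.+)", message)
    if match is None:
        return []
    versions = re.findall(r"\d+\.\d+", match.group(1))
    return [tuple(int(part) for part in v.split(".")) for v in versions]


def is_supported(version, supported):
    """Compare only major.minor, e.g. (3, 7, 12) is treated as (3, 7)."""
    return tuple(version[:2]) in supported


if __name__ == "__main__":
    line = "Supported versions are 2.4, 2.5, 2.6, 2.7, 3.2, 3.3, 3.4, 3.5"
    supported = parse_supported_versions(line)
    # sys.version_info is the interpreter running this script.
    print(is_supported(sys.version_info, supported))
```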

I did try running standalone spades (v.3.15.5) on the troublesome sample and it ran to completion, so could the issue be with how SeqSero2 interacts with spades?

At this point, I'm not really sure how to fix this issue. Do you have any further suggestions?
Thanks so much!

@tongzhouxu

Hi,
Downgrading python could lead to conflicts. I would suggest creating a new conda environment and reinstalling seqsero2 in it using:
conda create -n seqsero2 python=3.6
conda activate seqsero2
conda install -c bioconda seqsero2=1.2.1
This should install seqsero2 along with the latest compatible version of spades. Please let me know if the problem persists.

Thanks,
Tongzhou

@eam12
Author

eam12 commented Aug 23, 2023

Hi Tongzhou,
Thanks so much for your reply. Unfortunately, it still seems to be "freezing" during assembly. The version of spades being used is v.3.14.1. Perhaps an older version of spades is required? I'm now trying out spades versions newer than v.3.9 but older than v.3.14.1.

@tongzhouxu

tongzhouxu commented Aug 23, 2023

Hi,
I downloaded SRR17736741_1.fastq and SRR17736741_2.fastq from NCBI and ran SeqSero2 with no issue. I am testing with spades.py v3.14.1 and v3.15.2 on ubuntu. I noticed a similar bug reported here: ablab/spades#372. So it might be a bug that is specific to certain spades versions. I would suggest updating spades.py using conda instead of downgrading.

Thanks,
Tongzhou

@eam12
Author

eam12 commented Aug 23, 2023

Hi Tongzhou,

I did process the SRR17736741 FASTQ files through Trimmomatic so perhaps that is why you have been able to successfully run SRR17736741 through SeqSero2 and I have not.

I've already tried SeqSero2 with the most recent spades release (v.3.15.5), in addition to v.3.14.1, and it still hangs for SRR17736741:

% SeqSero2_package.py -s -t 2 -p 12 -i ~/SRR17736741_trim_R1_paired.fastq.gz ~/SRR17736741_trim_R2_paired.fastq.gz -d ~/SRR17736741_seqsero2 -n SRR17736741
building database...
mapping...
check samtools version: 1.17
[bam_sort_core] merging from 0 files and 12 in-memory blocks...
assembling...

When I run spades.py by itself (both v.3.14.1 and v.3.15.5, independently of SeqSero2), SRR17736741 runs with no problems:

% spades.py -1 ~/SRR17736741_trim_R1_paired.fastq.gz -2 ~/SRR17736741_trim_R2_paired.fastq.gz -o ~/SRR17736741_spades -t 12
======= SPAdes pipeline finished.

SPAdes log can be found here: ~/SRR17736741_spades/spades.log

Thank you for using SPAdes!

I guess I'm confused as to why spades would run perfectly on its own, but not within the context of SeqSero2.
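One thing that may be worth ruling out, and which this thread does not confirm, is that the tools resolved inside the SeqSero2 environment differ from the ones used in the standalone run. The sketch below reports which executable a name resolves to on `PATH` and what it prints for its version flag. `describe_tool` is a hypothetical helper, and the `--version` flag is assumed to be accepted by the tool being queried.

```python
import shutil
import subprocess
import sys


def describe_tool(name, version_flag="--version"):
    """Report where a tool resolves on PATH and what its version flag prints."""
    path = shutil.which(name)
    if path is None:
        return {"name": name, "path": None, "version": None}
    proc = subprocess.run(
        [path, version_flag], capture_output=True, text=True, timeout=60
    )
    # Some tools print their version to stderr rather than stdout.
    output = (proc.stdout or proc.stderr).strip()
    return {"name": name, "path": path, "version": output}


if __name__ == "__main__":
    # Tool names taken from the commands and logs in this thread.
    for tool in ("spades.py", "samtools", sys.executable):
        print(describe_tool(tool))
```

Running this once in the shell used for the standalone spades.py call and once in the environment that launches SeqSero2_package.py would show whether both resolve to the same installation.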


3 participants