Error message: Whitespace is not allowed in the sequence. #3

Closed
mokrobial opened this issue Feb 25, 2024 · 1 comment

Comments

@mokrobial

Steps:

  1. install mTags via conda
  2. download mTags database
  3. run profile

Error message:
2024-02-24 17:35:26,237 INFO: Processed reads: 36000000
Traceback (most recent call last):
  File "/Applications/miniconda3/envs/mtags/bin/mtags", line 8, in <module>
    sys.exit(main())
  File "/Applications/miniconda3/envs/mtags/lib/python3.7/site-packages/mTAGs/mtags.py", line 1284, in main
    execute_mtags_profile(sys.argv[2:])
  File "/Applications/miniconda3/envs/mtags/lib/python3.7/site-packages/mTAGs/mtags.py", line 1219, in execute_mtags_profile
    ssu_files = _mtags_extract_grouped(input_seqfiles_r1, input_seqfiles_r2, input_seqfiles_s, output_folder, threads)
  File "/Applications/miniconda3/envs/mtags/lib/python3.7/site-packages/mTAGs/mtags.py", line 536, in _mtags_extract_grouped
    (reads_s, ssu_files_s, lsu_file_s) = mtags_extract(pathlib.Path(input_seqfile_s), output_folder, readnames, threads=threads)
  File "/Applications/miniconda3/envs/mtags/lib/python3.7/site-packages/mTAGs/mtags.py", line 375, in mtags_extract
    for number_of_sequences, fasta in enumerate(stream_fa(input_seq_file), 1):
  File "/Applications/miniconda3/envs/mtags/lib/python3.7/site-packages/mTAGs/mtags.py", line 162, in stream_fa
    for header, sequence, qual in Bio.SeqIO.QualityIO.FastqGeneralIterator(handle):
  File "/Applications/miniconda3/envs/mtags/lib/python3.7/site-packages/Bio/SeqIO/QualityIO.py", line 955, in FastqGeneralIterator
    raise ValueError("Whitespace is not allowed in the sequence.")
ValueError: Whitespace is not allowed in the sequence.

This is the exact command I used:
% mtags profile -s /Volumes/CanalData2021/METAGENOME.fastq.gz -t 16 -o /Volumes/CanalData2021/test-run-one -n Metagenome_1_Mcd

Note: Two duplicate FASTA files (F and R) were generated in the output folder.
Note: The filtered raw metagenomic FASTQ file was downloaded from JGI IMG.

@hjruscheweyh
Member

Hi @mokrobial

We use Biopython to read sequence files, and Biopython seems to be complaining that your input reads contain whitespace in the sequence (not the header). This suggests that the input file is somewhat corrupted. I suggest you redownload the files from JGI.
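To see which reads trigger the error before redownloading, something like the following could help. It is only a rough sketch, not part of mTAGs or Biopython: the function name `find_whitespace_in_sequences` is made up for illustration, it assumes strict four-line FASTQ records (no line-wrapped sequences), and it only looks for spaces or tabs inside the sequence line after trailing whitespace is stripped.

```python
import gzip


def find_whitespace_in_sequences(handle, max_hits=5):
    """Return up to max_hits tuples (record_number, header, sequence)
    for sequence lines that contain a space or a tab.

    Assumption: every record is exactly four lines
    (header, sequence, '+', quality).
    """
    hits = []
    header = ""
    for line_no, line in enumerate(handle):
        position = line_no % 4
        if position == 0:
            # First line of a record: remember the header for reporting.
            header = line.rstrip()
        elif position == 1:
            # Second line of a record: the sequence itself.
            seq = line.rstrip()
            if " " in seq or "\t" in seq:
                hits.append((line_no // 4 + 1, header, seq))
                if len(hits) >= max_hits:
                    break
    return hits


# Possible usage on the file from the command above (path taken from the issue):
#
# with gzip.open("/Volumes/CanalData2021/METAGENOME.fastq.gz", "rt") as handle:
#     for record_number, header, seq in find_whitespace_in_sequences(handle):
#         print(record_number, header, repr(seq))
```

If this reports hits only near the end of the file, or if reading the gzip file itself fails partway through, a truncated or damaged download is the likely cause.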

Note 2: HMMER requires uncompressed FASTA files in forward and reverse orientation, which can create huge temporary files. These should be deleted once mTAGs has finished successfully.

Best
Hans
