GraftM graft on big metagenome error #277
Comments
Hi, I can't tell exactly since I don't have the command you used or the data, but the error message (found 446 reads when expected 223) suggests to me that the read sets are interleaved, since 223*2=446. Does that help?
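To make the 223*2=446 reasoning concrete, here is a rough Python sketch that checks whether consecutive FASTQ records share a read name, which is what would make each requested name match two reads. The function names `read_names` and `looks_interleaved` are made up for illustration and are not part of GraftM or mfqe. It assumes plain 4-line FASTQ records and compares only the first whitespace-delimited token of each header, so it may need adjusting if the pairs carry suffixes such as /1 and /2.

```python
import itertools


def read_names(fastq_lines):
    """Yield the read name of each 4-line FASTQ record.

    The name is taken as the header line without the leading '@',
    up to the first whitespace (an assumption of this sketch).
    """
    for i, line in enumerate(fastq_lines):
        if i % 4 == 0 and line.strip():
            yield line[1:].split()[0]


def looks_interleaved(fastq_lines, max_pairs=1000):
    """Return True if the first records come in same-named consecutive pairs."""
    names = list(itertools.islice(read_names(fastq_lines), 2 * max_pairs))
    pairs = list(zip(names[0::2], names[1::2]))
    return bool(pairs) and all(a == b for a, b in pairs)


# Tiny in-memory example: two pairs, each pair sharing one name.
demo = [
    "@read1\n", "ACGT\n", "+\n", "IIII\n",
    "@read1\n", "TGCA\n", "+\n", "IIII\n",
    "@read2\n", "GGCC\n", "+\n", "IIII\n",
    "@read2\n", "CCGG\n", "+\n", "IIII\n",
]
print(looks_interleaved(demo))
```

For a real file, an open file handle can be passed in place of the list, for example `with open(path) as fh: looks_interleaved(fh)`; only the first `2 * max_pairs` records are inspected, so this should stay quick even on a very large metagenome.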
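Thank you very much for the quick response.
The command I used was:
graftM graft --threads 8 --evalue 0.000000001 --forward 11774.2.218915.CGAACTG-ACAGTTC.filter-METAGENOME.fastq --graftm_package 500PSI_mcrAs_refined.gpkg --output_directory GraftM_output_11774.2.218915.CGAACTG-ACAGTTC_500PSI_mcrAs_refined_package --force
If the reads are interleaved, what can I do to make them compatible with the graft command?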
Hi,
You can use the --interleaved flag instead of --forward. You can easily tell whether the reads are interleaved by looking at the head of the file: there will be 2 reads with the same name. Alternatively, you can split the file up; there are plenty of tools out there for doing that.
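If splitting is preferred over the --interleaved flag, the idea can be sketched in a few lines of Python. This is only an illustration, not one of the existing tools alluded to above: the function name `split_interleaved` and the output file names are hypothetical, and it assumes an uncompressed FASTQ file with strict 4-line records in which forward and reverse reads alternate.

```python
def split_interleaved(in_path, out_forward, out_reverse):
    """Write alternating 4-line FASTQ records to two files.

    Returns the number of read pairs written. Raises ValueError if the
    input ends mid-record or holds an odd number of records.
    """
    records = 0
    with open(in_path) as src, \
            open(out_forward, "w") as fwd, \
            open(out_reverse, "w") as rev:
        while True:
            record = [src.readline() for _ in range(4)]
            if not record[0]:
                break  # clean end of file
            if not record[3]:
                raise ValueError("truncated FASTQ record at end of input")
            # Even-numbered records go to the forward file, odd to reverse.
            target = fwd if records % 2 == 0 else rev
            target.writelines(record)
            records += 1
    if records % 2:
        raise ValueError("odd number of records; input is not cleanly interleaved")
    return records // 2
```

A call would look like `split_interleaved("reads.fastq", "reads_R1.fastq", "reads_R2.fastq")`, with all three paths being placeholders. The file is streamed one record at a time, so memory use should stay small even for a 45 GB input.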
ben
Ben Woodcroft, Microbial informatics group leader, ARC Future Fellow
(+617) 3443 7334
Centre for Microbiome Research, Level 3, Translational Research Institute, School of Biomedical Sciences, Faculty of Health, Queensland University of Technology
https://research.qut.edu.au/cmr/team/ben-woodcroft
On Apr 27 2022, at 11:26 am, steff1088 wrote:
Thank you very much for the quick response.
The command I used was:
graftM graft --threads 8 --evalue 0.000000001 --forward 11774.2.218915.CGAACTG-ACAGTTC.filter-METAGENOME.fastq --graftm_package 500PSI_mcrAs_refined.gpkg --output_directory GraftM_output_11774.2.218915.CGAACTG-ACAGTTC_500PSI_mcrAs_refined_package --force
If the reads are interleaved, what can I do to make them compatible with the graft command?
Thanks Ben, that did the trick! -steffen
Hi all,
I ran into issues running my mcrA package on a big 45 GB metagenome in fastq format. I can't really interpret the error message so I was wondering if you had any ideas. The package runs fine on other metagenomes in fasta and fastq format. @wwood @geronimp
GraftM 0.13.1
04/23/2022 01:38:19 PM INFO: Working on 11774.2.218915.CGAACTG-ACAGTTC.filter-METAGENOME
Traceback (most recent call last):
File "/home/users/sbuessec/.local/bin/graftM", line 415, in <module>
Run(args).main()
File "/home/users/sbuessec/.local/lib/python3.6/site-packages/graftm/run.py", line 613, in main
self.graft()
File "/home/users/sbuessec/.local/lib/python3.6/site-packages/graftm/run.py", line 388, in graft
diamond_db
File "/home/users/sbuessec/.local/lib/python3.6/site-packages/graftm/timeit.py", line 10, in timed
result = method(*args, **kw)
File "/home/users/sbuessec/.local/lib/python3.6/site-packages/graftm/sequence_searcher.py", line 851, in aa_db_search
hit_reads_orfs_fasta)
File "/home/users/sbuessec/.local/lib/python3.6/site-packages/graftm/sequence_searcher.py", line 943, in search_and_extract_orfs_matching_protein_database
hits
File "/home/users/sbuessec/.local/lib/python3.6/site-packages/graftm/sequence_searcher.py", line 534, in _extract_from_raw_reads
extern.run(extract_cmd, stdin='\n'.join(input_reads))
File "/home/users/sbuessec/.local/lib/python3.6/site-packages/extern/__init__.py", line 41, in run
raise ExternCalledProcessError(process, command)
extern.ExternCalledProcessError: Command mfqe --output-uncompressed --fasta-read-name-lists /dev/stdin --input-fasta <(awk '{print ">" substr($0,2);getline;print;getline;getline}' '11774.2.218915.CGAACTG-ACAGTTC.filter-METAGENOME.fastq') --output-fasta-files '/tmp/_raw_extracted_reads.famb1zbzrb' returned non-zero exit status 101.
STDERR was: b"[2022-04-23T20:45:46Z INFO mfqe] Read in 223 read names from /dev/stdin\n[2022-04-23T20:45:46Z INFO mfqe] Iterating input FASTQ file\n[2022-04-23T20:47:38Z INFO mfqe] Extracted 446 reads from 120829412 total\nthread 'main' panicked at 'Mismatching numbers of read names were observed. Expected:\n[223]\nbut found\n[446]', src/main.rs:333:9\nnote: run with `RUST_BACKTRACE=1` environment variable to display a backtrace\n"
STDOUT was: b''