vg map errors #4280

Closed
santhanakrishnanb opened this issue Apr 30, 2024 · 5 comments

@santhanakrishnanb

1. What were you trying to do?
2. What did you want to happen?

I was trying to map a new fastq file to a graph generated using a few genome sequences.

Step 1: vg construct -r Ref_genes.fasta > output.vg
was successful.

Step 2: vg index -x output.xg -g output.gcsa output.vg
was successful.

Step 3: vg map -x output.xg -g output.gcsa -f CP019206.1.fastq > mapped_reads.gam
is where it got stuck.

3. What actually happened?

4. If you got a line like Stack trace path: /somewhere/on/your/computer/stacktrace.txt, please copy-paste the contents of that file here:

sb3700@cvm-Lambda-Vector: vg map -x output.xg -g output.gcsa -f trimmed_CP045063.1.fastq > mapped_reads.gam

terminate called after throwing an instance of 'std::runtime_error'
what(): Found unexpected delimiter in fastq/fasta input
━━━━━━━━━━━━━━━━━━━━
Crash report for vg v1.56.0 "Collalto"
Stack trace (most recent call last) in thread 2077364:
#14 Object "", at 0xffffffffffffffff, in
#13 Object "/usr/lib/x86_64-linux-gnu/libc.so.6", at 0x7f719a52684f, in __clone3
Source "../sysdeps/unix/sysv/linux/x86_64/clone3.S", line 81, in __clone3 [0x7f719a52684f]
#12 Object "/usr/lib/x86_64-linux-gnu/libc.so.6", at 0x7f719a494ac2, in start_thread
Source "./nptl/pthread_create.c", line 442, in start_thread [0x7f719a494ac2]
#11 Object "/usr/lib/x86_64-linux-gnu/libgomp.so.1.0.0", at 0x7f719abdbc0d, in
#10 Object "/home/sb3700/Adam_Data/pggb/vg/bin/vg", at 0x56289717fb29, in unsigned long vg::io::unpaired_for_each_parallel<vg::Alignment>(std::function<bool (vg::Alignment&)>, std::function<void (vg::Alignment&)>, unsigned long) [clone ._omp_fn.0]
| Source "/home/sb3700/Adam_Data/pggb/vg/include/vg/io/alignment_io.hpp", line 146, in operator()
| 144: for (int i = 0; i < batch_size; i++) {
| 145:
| > 146: more_data = get_read_if_available(aln);
| 147:
| 148: if (more_data) {
Source "/usr/include/c++/11/bits/std_function.h", line 590, in _ZN2vg2io26unpaired_for_each_parallelINS_9AlignmentEEEmSt8functionIFbRT_EES3_IFvS5_EEm._omp_fn.0 [0x56289717fb29]
587: {
588: if (_M_empty())
589: __throw_bad_function_call();
> 590: return _M_invoker(_M_functor, std::forward<_ArgTypes>(__args)...);
591: }
592:
593: #if __cpp_rtti
#9 Object "/home/sb3700/Adam_Data/pggb/vg/bin/vg", at 0x5628967de82c, in vg::get_next_alignment_from_fastq(gzFile_s*, char*, unsigned long, vg::Alignment&) [clone .cold]
| Source "src/alignment.cpp", line 220, in ~basic_string
| Source "/usr/include/c++/11/bits/basic_string.h", line 672, in ~_Alloc_hider
| 670: */
| 671: ~basic_string()
| > 672: { _M_dispose(); }
| 673:
| 674: /**
| Source "/usr/include/c++/11/bits/basic_string.h", line 158, in ~allocator
| 157: // Use empty-base optimization: http://www.cantrip.org/emptyopt.html
| > 158: struct _Alloc_hider : allocator_type // TODO check __is_final
| 159: {
| 160: #if __cplusplus < 201103L
| Source "/usr/include/c++/11/bits/allocator.h", line 174, in ~new_allocator
| 172: constexpr
| 173: #endif
| > 174: ~allocator() _GLIBCXX_NOTHROW { }
| 175:
| 176: #if __cplusplus > 201703L
Source "/usr/include/c++/11/ext/new_allocator.h", line 89, in get_next_alignment_from_fastq [0x5628967de82c]
86: new_allocator(const new_allocator<_Tp1>&) _GLIBCXX_USE_NOEXCEPT { }
87:
88: #if __cplusplus <= 201703L
> 89: ~new_allocator() _GLIBCXX_USE_NOEXCEPT { }
90:
91: pointer
92: address(reference __x) const _GLIBCXX_NOEXCEPT
#8 Object "/usr/lib/x86_64-linux-gnu/libgcc_s.so.1", at 0x7f719abb52dc, in _Unwind_Resume
#7 Object "/usr/lib/x86_64-linux-gnu/libgcc_s.so.1", at 0x7f719abb4883, in
#6 Object "/usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.30", at 0x7f719a8ad958, in __gxx_personality_v0
#5 Object "/usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.30", at 0x7f719a8ad1e8, in
#4 Object "/usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.30", at 0x7f719a8ae20b, in
#3 Object "/usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.30", at 0x7f719a8a2b9d, in
#2 Object "/usr/lib/x86_64-linux-gnu/libc.so.6", at 0x7f719a4287f2, in abort
Source "./stdlib/abort.c", line 79, in abort [0x7f719a4287f2]
#1 Object "/usr/lib/x86_64-linux-gnu/libc.so.6", at 0x7f719a442475, in raise
Source "../sysdeps/posix/raise.c", line 26, in raise [0x7f719a442475]
#0 Object "/usr/lib/x86_64-linux-gnu/libc.so.6", at 0x7f719a4969fc, in pthread_kill@@GLIBC_2.34
| Source "./nptl/pthread_kill.c", line 89, in __pthread_kill_internal
| Source "./nptl/pthread_kill.c", line 78, in __pthread_kill_implementation
Source "./nptl/pthread_kill.c", line 44, in __pthread_kill [0x7f719a4969fc]
ERROR: Signal 6 occurred. VG has crashed. Visit https://github.com/vgteam/vg/issues/new/choose to report a bug.
Please include this entire error log in your bug report!


5. What data and command can the vg dev team use to make the problem happen?
The reference genomes used were: CP019405.1, CP019409.1, CP019410.1, CP019412.1, CP019413.1, CP019414.1, CP019417.1

6. What does running vg version say?

vg version v1.56.0 "Collalto"
Compiled with g++ (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0 on Linux
Linked against libstd++ 20230528
Built by sb3700@cvm-Lambda-Vector
@jeizenga
Contributor

That looks like a FASTQ formatting error. Can you share the results of head -n 20 trimmed_CP045063.1.fastq?
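For a quick first check, the first non-empty line of a file already says a lot about its format. The sketch below is a hypothetical helper, not part of vg: the name `sniff_sequence_format` and its return labels are made up for illustration, and it only looks at plain-text input (a gzipped file would need to be decompressed first). It relies on three conventions: FASTQ records start with `@`, FASTA records start with `>`, and GenBank flat files start with `LOCUS`.

```python
def sniff_sequence_format(first_line: str) -> str:
    """Guess a sequence file's format from its first non-empty line.

    Hypothetical helper for illustration; the labels returned are arbitrary.
    """
    line = first_line.strip()
    if line.startswith("@"):
        return "fastq"      # FASTQ header lines begin with '@'
    if line.startswith(">"):
        return "fasta"      # FASTA header lines begin with '>'
    if line.startswith("LOCUS"):
        return "genbank"    # GenBank flat files begin with a LOCUS line
    return "unknown"


def sniff_file(path: str) -> str:
    """Apply sniff_sequence_format to the first non-empty line of a text file."""
    with open(path, "r", errors="replace") as handle:
        for line in handle:
            if line.strip():
                return sniff_sequence_format(line)
    return "unknown"
```

Calling `sniff_file("trimmed_CP045063.1.fastq")` before running vg map would be one way to catch a mislabeled file early. A first line starting with `@` is necessary but not sufficient for a valid FASTQ, so this is only a coarse filter.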

@santhanakrishnanb
Author

LOCUS CP045063 4930420 bp DNA circular BCT 12-NOV-2019
DEFINITION Salmonella enterica subsp. enterica serovar Muenchen strain LG26
chromosome, complete genome.
ACCESSION CP045063
VERSION CP045063.1
DBLINK BioProject: PRJNA576706
BioSample: SAMN13002973
KEYWORDS .
SOURCE Salmonella enterica subsp. enterica serovar Muenchen
ORGANISM Salmonella enterica subsp. enterica serovar Muenchen
Bacteria; Pseudomonadota; Gammaproteobacteria; Enterobacterales;
Enterobacteriaceae; Salmonella.
REFERENCE 1 (bases 1 to 4930420)
AUTHORS Tran,T.D., McGarvey,J.A., Huynh,S. and Parker,C.T.
TITLE Genome sequence of Salmonella Muenchen str. LG26
JOURNAL Unpublished
REFERENCE 2 (bases 1 to 4930420)
AUTHORS Tran,T.D., McGarvey,J.A., Huynh,S. and Parker,C.T.
TITLE Direct Submission
JOURNAL Submitted (09-OCT-2019) Foodborne Toxin Detection and Prevention

@jeizenga
Contributor

Okay, yeah, this is not at all a FASTQ file. It looks like maybe you saved a request for a FASTQ file instead of the file itself?

@santhanakrishnanb
Author

> yeah, this is not at all a FASTQ file. It looks like maybe you saved a request for a FASTQ file instead of the file itself?

The output above is the header of the file; the complete file is attached.
I have tried other FASTQ files downloaded from NCBI, with similar results.

CP045063.1.zip

@jeizenga
Contributor

jeizenga commented May 1, 2024

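That isn't a FASTQ file either. These seem to be raw NCBI record pages (the LOCUS/DEFINITION header is GenBank flat-file format), not sequencing reads. Check out the Wikipedia article to see some examples of what FASTQ files look like: https://en.wikipedia.org/wiki/FASTQ_format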
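As a rough illustration of that layout, here is a minimal sketch of a structural check. It assumes the common four-lines-per-record form (an `@` header, the sequence, a `+` separator, and a quality string of the same length as the sequence) and does not handle wrapped multi-line records. The function name `first_fastq_problem` is hypothetical and unrelated to vg's own parser.

```python
from typing import Iterable, Optional


def first_fastq_problem(lines: Iterable[str]) -> Optional[str]:
    """Return a description of the first malformed record, or None if none is found.

    Assumes four lines per record; wrapped sequences are not supported.
    """
    record = []
    count = 0
    for raw in lines:
        line = raw.rstrip("\r\n")
        if not record and not line:
            continue  # tolerate blank lines between records or at end of file
        record.append(line)
        if len(record) < 4:
            continue
        count += 1
        header, sequence, separator, quality = record
        record = []
        if not header.startswith("@"):
            return f"record {count}: header does not start with '@'"
        if not separator.startswith("+"):
            return f"record {count}: separator line does not start with '+'"
        if len(sequence) != len(quality):
            return f"record {count}: sequence and quality lengths differ"
    if record:
        return f"record {count + 1}: incomplete record ({len(record)} of 4 lines)"
    return None
```

Passing an open text file handle (or the output of `head -n 20`, split into lines) to this function gives a quick yes/no answer. A GenBank record such as the one posted above fails at the very first line, because `LOCUS ...` does not start with `@`.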

jeizenga closed this as completed May 1, 2024