IndexError: list index out of range #4

Closed
yi1873 opened this issue Apr 2, 2022 · 8 comments

@yi1873

yi1873 commented Apr 2, 2022

Error in ".../anaconda3/envs/SuperPang/lib/python3.8/site-packages/SuperPang/lib/Assembler.py", line 829, in reconstruct_sequence
if k[:-1] == kmers[-1][1:]:
IndexError: list index out of range
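
For context: slicing a string or list never raises `IndexError` in Python, so in the failing line the only expression that can raise it is `kmers[-1]`, which happens when `kmers` is an empty list. The sketch below reproduces that behaviour in isolation. The helper name `extends_path` and the k-mer values are hypothetical and only mirror the comparison from the traceback; this is not SuperPang's actual code.

```python
def extends_path(k, kmers):
    """Hypothetical helper mirroring the comparison in the traceback:
    does the prefix of k match the suffix of the last k-mer collected so far?"""
    return k[:-1] == kmers[-1][1:]


# With at least one k-mer already collected, the comparison works.
print(extends_path("CGTA", ["ACGT"]))

# With an empty list, kmers[-1] raises the error reported in this issue.
try:
    extends_path("CGTA", [])
except IndexError as err:
    print("IndexError:", err)
```

This only shows where the exception can come from; it says nothing about why the list ends up empty in the real run.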

@fpusan
Owner

fpusan commented Apr 2, 2022

Hi!
Can you share the input genomes and the command you used so I can try to reproduce the error?

@yi1873
Author

yi1873 commented Apr 7, 2022

Species: Mycobacterium tuberculosis
Taxid: 1773
Genomes: 316 complete genomes in GenBank (https://ftp.ncbi.nlm.nih.gov/genomes/genbank/assembly_summary_genbank.txt)
Command: SuperPang.py --fasta genome/*.fa --output-dir 1773/pangenome --force-overwrite -t 20 --assume-complete -b 0.95 -i 0.95 -k 301

@yi1873
Author

yi1873 commented Apr 7, 2022

You can also test E. coli with all complete genomes in GenBank; I get the same error.

@fpusan
Owner

fpusan commented Apr 8, 2022

I can reproduce it with the Mycobacterium tuberculosis dataset.

@yi1873
Author

yi1873 commented Apr 11, 2022

Full errors:

multiprocessing.pool.RemoteTraceback:
"""
Traceback (most recent call last):
  File ".../anaconda3/envs/SuperPang/lib/python3.8/multiprocessing/pool.py", line 125, in worker
    result = (True, func(*args, **kwds))
  File ".../anaconda3/envs/SuperPang/lib/python3.8/multiprocessing/pool.py", line 48, in mapstar
    return list(map(*args))
  File ".../anaconda3/envs/SuperPang/lib/python3.8/site-packages/SuperPang/lib/Assembler.py", line 846, in reconstruct_sequence
    if k[:-1] == kmers[-1][1:]:
IndexError: list index out of range
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File ".../anaconda3/envs/SuperPang/bin/SuperPang.py", line 291, in <module>
    main(parse_args())
  File ".../anaconda3/envs/SuperPang/bin/SuperPang.py", line 144, in main
    contigs = Assembler(input_minimap2, args.ksize, args.threads).run(args.minlen, args.mincov, args.bubble_identity_threshold, args.genome_assignment_threshold, args.threads)
  File ".../anaconda3/envs/SuperPang/lib/python3.8/site-packages/SuperPang/lib/Assembler.py", line 259, in run
    NBP2seq = dict(zip(NBPs, self.multimap(self.reconstruct_sequence, threads, NBPs, self.vertex2hash, self.compressor)))
  File ".../anaconda3/envs/SuperPang/lib/python3.8/site-packages/SuperPang/lib/Assembler.py", line 55, in multimap
    res = pool.map(fun, range(len(iterable)))
  File ".../anaconda3/envs/SuperPang/lib/python3.8/multiprocessing/pool.py", line 364, in map
    return self._map_async(func, iterable, mapstar, chunksize).get()
  File ".../anaconda3/envs/SuperPang/lib/python3.8/multiprocessing/pool.py", line 771, in get
    raise self._value
IndexError: list index out of range

@fpusan
Owner

fpusan commented Apr 11, 2022

Thanks! I'm running it right now on my computers, adding some traces to see what could be happening.
Also, wild guess here, but does this also happen if you call it like
SuperPang.py --fasta genome/*.fa --output-dir 1773/pangenome --force-overwrite -t 20 --assume-complete -b 0.95 -i 0.95 -k 301 -n 20 ?

@fpusan
Owner

fpusan commented Apr 11, 2022

Never mind, I now think that the problem is due to the presence of ambiguous bases in the input fasta files
(e.g. "M" to represent either "A" or "C"). This happens for these genomes and would definitely break my code in a few places. I will think of a fix.
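
For anyone who wants to check their own inputs for this, a minimal sketch of a standalone scan for non-ACGT characters is shown below. It is not part of SuperPang and is not the fix that was applied; the function name `count_ambiguous_bases` and the example record are made up for illustration.

```python
from collections import Counter


def count_ambiguous_bases(fasta_lines):
    """Count every sequence character that is not A, C, G or T.

    fasta_lines is any iterable of lines in FASTA format (an open file
    handle works). Header lines starting with '>' are skipped.
    """
    counts = Counter()
    for line in fasta_lines:
        line = line.strip()
        if not line or line.startswith(">"):
            continue
        for base in line.upper():
            if base not in "ACGT":
                counts[base] += 1
    return counts


# Made-up example record containing one "M" and two "N" characters.
example = [">genome1 hypothetical record", "ACGTMACGT", "acgtnnac"]
print(count_ambiguous_bases(example))
```

To scan a real file, pass an open handle instead of a list, for example `with open(path) as fh: print(count_ambiguous_bases(fh))`. An empty result means the file contains only A, C, G and T.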

@fpusan fpusan closed this as completed in fa0e57b Apr 12, 2022
@fpusan
Owner

fpusan commented Apr 13, 2022

Hi again! This issue is hopefully fixed in the new version 0.9.3.
