Description
I want to use the latest NanoSim v3.0.1 (which supports reading .gz sequence files and BAM files), but read_analysis.py does not complete. The primary.bam file it writes is not indexed, and that might be an issue for pysam, but there seems to be more to it than that. This might be related to #129.
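To tell the two index messages in the log below apart, a quick check of the index state can help. This is only a minimal sketch using the standard library: `index_status` is a hypothetical helper of mine, not part of NanoSim or pysam, and it assumes the index sits next to the BAM file as either `<name>.bam.bai` or `<name>.bai`.

```python
import os


def index_status(bam_path):
    """Return 'missing', 'stale', or 'ok' for the .bai index of bam_path.

    Hypothetical helper: only looks at the two conventional index
    locations and compares modification times.
    """
    candidates = [
        bam_path + ".bai",                        # e.g. reads.bam.bai
        os.path.splitext(bam_path)[0] + ".bai",   # e.g. reads.bai
    ]
    existing = [p for p in candidates if os.path.exists(p)]
    if not existing:
        return "missing"
    # An index older than its data file triggers the htslib warning.
    newest_index = max(os.path.getmtime(p) for p in existing)
    if newest_index < os.path.getmtime(bam_path):
        return "stale"
    return "ok"
```

If the status is "missing" or "stale", rebuilding the index with `samtools index` (or `pysam.index(path)`) should clear those messages, although that alone would not explain the traceback.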
Error
2021-07-14 12:59:14: Processing alignment file: bam
[W::hts_idx_load3] The index file is older than the data file: analysis/Nanopore.bam.bai
2021-07-14 12:59:16: Aligned reads analysis
[E::idx_find_and_load] Could not retrieve index file for 'analysis/nanosim_model/sim_primary.bam'
and further down
2021-07-14 12:59:17: match and error models
[E::idx_find_and_load] Could not retrieve index file for 'analysis/nanosim_model/sim_primary.bam'
Traceback (most recent call last):
File "/beegfs/homes/eboileau/.miniconda3/envs/scNapBar-dev/bin/besthit_to_histogram.py", line 318, in hist
cs_string = alnm.get_tag('cs')
File "pysam/libcalignedsegment.pyx", line 2399, in pysam.libcalignedsegment.AlignedSegment.get_tag
File "pysam/libcalignedsegment.pyx", line 2438, in pysam.libcalignedsegment.AlignedSegment.get_tag
KeyError: "tag 'cs' not present"
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/home/eboileau/.miniconda3/envs/scNapBar-dev/bin/read_analysis.py", line 720, in <module>
main()
File "/home/eboileau/.miniconda3/envs/scNapBar-dev/bin/read_analysis.py", line 710, in main
error_model.hist(prefix, alnm_ext)
File "/beegfs/homes/eboileau/.miniconda3/envs/scNapBar-dev/bin/besthit_to_histogram.py", line 320, in hist
cs_string = get_cs(alnm.original_sam_line.split()[5], alnm.get_tag('MD'))
AttributeError: 'pysam.libcalignedsegment.AlignedSegment' object has no attribute 'original_sam_line'
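From the traceback, the fallback for a missing `cs` tag reads the CIGAR from `original_sam_line`, which a pysam `AlignedSegment` does not have. Field 6 of a SAM line is the CIGAR string, and pysam exposes that as the `cigarstring` attribute, so something along these lines might work. This is an untested sketch, not the NanoSim code: `cs_or_fallback` is a hypothetical name, and `build_cs` stands in for the `get_cs(cigar, md)` function seen in the traceback.

```python
def cs_or_fallback(alnm, build_cs):
    """Return the 'cs' tag of an alignment, or rebuild it from CIGAR and MD.

    alnm     -- a pysam.AlignedSegment (or any object with get_tag and
                cigarstring)
    build_cs -- callable taking (cigar, md); a stand-in for NanoSim's
                get_cs, passed in so this sketch stays self-contained
    """
    try:
        return alnm.get_tag("cs")
    except KeyError:
        # cigarstring replaces original_sam_line.split()[5].
        # get_tag("MD") raises KeyError itself if MD is also absent.
        return build_cs(alnm.cigarstring, alnm.get_tag("MD"))
```

This still fails if the alignment carries neither `cs` nor `MD`, so it only addresses the AttributeError, not the missing tag itself.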
Expected behavior
read_analysis.py completes successfully, and generates the model to be used as input for simulator.py.
To reproduce
read_analysis.py genome -i Nanopore.fq.gz -ga Nanopore.bam -o nanosim_model/sim
The same error also occurs with uncompressed input files (I guess this is expected, since NanoSim now outputs compressed files anyway):
read_analysis.py genome -i Nanopore.fq -ga Nanopore.sam -o nanosim_model/sim
However, with NanoSim 2.5.0, the latter command succeeds.
Environment
Python 3.7.6
conda 4.9.2
NanoSim 3.0.1 (note that the version has not been updated in some scripts, e.g. read_analysis.py --version returns NanoSim 3.0.0, although I am using 3.0.1)
pysam 0.16.0.1 (samtools 1.10, using htslib 1.10.2)