VADR breaks processing AY394999.1: "Use of uninitialized value $uapos" #21

Closed
taltman opened this issue Jul 17, 2020 · 4 comments

Comments

@taltman

taltman commented Jul 17, 2020

Hi Eric,

I was trying to process a GenBank virus genome using VADR, and I got an odd break. Output below. Please let me know if I can provide any further information to help you debug this issue.

Thanks!

~Tomer

(base) ubuntu@ip-172-31-65-128:~/repos/darth$ time make test-taxon-prot-gen
mkdir -p test/taxon-prots
cd test/taxon-prots
for acc in AY394999.1
do
echo "Processing genome $acc:"
mkdir -p $acc
pushd $acc
if esl-sfetch /home/ubuntu/repos/darth/data/cov3ma.fa $acc > $acc.fa
then
sudo docker run -it --rm -m 13GB -v `pwd`:/output taltman/darth:maul \
                        darth.sh \
                                $acc \
                                /output/$acc.fa \
                                none \
                                /root/data \
                                /output \
                                2
fi
popd
echo "... complete!"
done
Processing genome AY394999.1:
~/repos/darth/test/taxon-prots/AY394999.1 ~/repos/darth/test/taxon-prots
WARNING: Your kernel does not support swap limit capabilities or the cgroup is not mounted. Memory limited without swap.
# v-annotate.pl :: classify and annotate sequences using a CM library
# VADR 1.1 (May 2020)
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# date:              Fri Jul 17 08:06:40 2020
# $VADRBIOEASELDIR:  /root/vadr/Bio-Easel
# $VADRBLASTDIR:     /root/vadr/ncbi-blast/bin
# $VADREASELDIR:     /root/vadr/infernal/binaries
# $VADRINFERNALDIR:  /root/vadr/infernal/binaries
# $VADRMODELDIR:     /root/vadr/vadr-models
# $VADRSCRIPTSDIR:   /root/vadr/vadr
#
# sequence file:                                                                  /output/AY394999.1.fa
# output directory:                                                               /output/AY394999.1
# force directory overwrite:                                                      yes [-f]
# .cm, .minfo, blastn .fa files in $VADRMODELDIR start with key <s>, not 'vadr':  corona [--mkey]
# model files are in directory <s>, not in $VADRMODELDIR:                         /root/data/vadr-models-corona-1.1-1 [--mdir]
# set max allowed memory for cmalign to <n> Mb:                                   64000 [--mxsize]
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# Validating input                                                                        ... done. [    6.6 seconds]
# Classifying sequences (1 seq)                                                           ... done. [  129.4 seconds]
# Determining sequence coverage (NC_004718: 1 seq)                                        ... done. [  255.4 seconds]
# Aligning sequences (NC_004718: 1 seq)                                                   ... done. [  586.9 seconds]
# Determining annotation                                                                  ... Use of uninitialized value $uapos in concatenation (.) or string at /root/vadr/vadr/v-annotate.pl line 3757.
done. [    0.3 seconds]
# Validating proteins with blastx (NC_004718: 1 seq)                                      ...
ERROR in seq_Overlap start1 > end1 (27702 > 27701)

Makefile:278: recipe for target 'test-taxon-prot-gen' failed
make: *** [test-taxon-prot-gen] Error 1

real    16m21.518s
user    0m0.107s
sys     0m0.053s
@taltman
Author

taltman commented Jul 19, 2020

Another accession with the same issue: AY395000.1

@nawrockie
Member

Thanks, Tomer. I'm working on a fix and will update you when I have more to report.

@nawrockie
Member

Thanks for reporting this, @taltman. This was due to a bug that is triggered when an entire internal CDS feature is deleted, as is the case with AY394999 and AY395000 (the hypothetical protein sars8a CDS is deleted).

I've fixed this in the develop branch, and the fix will go into the next release, v1.1.1, which should be out within the next few weeks. As a temporary workaround, use --skip_pv with v-annotate.pl to keep VADR from exiting in error, but note that the sars8a CDS will be incorrectly annotated in the .ftr table with a length of '0' (the start coordinate is 1 more than the stop coordinate, e.g., for AY394999: 27702 to 27701). With the fix, deleted features are not annotated in the .ftr file, but a new DELETION_OF_FEATURE error is reported.
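To make the coordinate arithmetic above concrete, here is a minimal Python sketch of how a fully deleted feature ends up with a start one past its stop, and why an interval-overlap check then rejects it. This is not VADR's code (VADR is written in Perl): the `Feature` class and the `overlap` and `annotate` helpers are hypothetical names used only for illustration, and the sketch assumes 1-based inclusive coordinates on the positive strand.

```python
from dataclasses import dataclass


@dataclass
class Feature:
    """Hypothetical feature record; coordinates are 1-based and inclusive."""
    name: str
    start: int
    stop: int

    @property
    def length(self) -> int:
        # An inclusive interval [start, stop] covers stop - start + 1 positions.
        return self.stop - self.start + 1

    @property
    def is_deleted(self) -> bool:
        # A fully deleted feature maps to start == stop + 1, i.e. length 0.
        return self.length <= 0


def overlap(start1: int, end1: int, start2: int, end2: int) -> int:
    """Count positions shared by two inclusive intervals.

    Like the check that produced the error in this issue, it refuses
    intervals whose start lies after their end.
    """
    if start1 > end1:
        raise ValueError(f"start1 > end1 ({start1} > {end1})")
    if start2 > end2:
        raise ValueError(f"start2 > end2 ({start2} > {end2})")
    return max(0, min(end1, end2) - max(start1, start2) + 1)


def annotate(features):
    """Keep real features; report deleted ones instead of annotating them."""
    kept, alerts = [], []
    for feature in features:
        if feature.is_deleted:
            alerts.append(f"DELETION_OF_FEATURE: {feature.name}")
        else:
            kept.append(feature)
    return kept, alerts
```

With the coordinates from AY394999, `Feature("sars8a", 27702, 27701).length` is 0, and passing that interval to `overlap` raises an error of the same shape as the one in the report. Filtering such features out before any overlap computation, as `annotate` does here, mirrors the behavior described for the fix: the feature is dropped from the annotation and reported separately.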

The iss21-delcds test in testfiles/github-issues/ exercises the fix for this bug.

@taltman
Author

taltman commented Jul 21, 2020

Thanks for looking into this! I appreciate the work-around, and look forward to the upcoming version.
