Skip to content

Commit

Permalink
newreference: If gene is given but not found in genbank, error out
Browse files Browse the repository at this point in the history
Currently, if the --gene option is used and the gene name is not found,
the script will use the entire genome which may cause some silent undesired
behaviors.

This commit changes that such that the script will error out if the gene
is not found in the GenBank file as this indicates the gene name may be
misspelled or the user may be using the wrong GenBank file.

If the --gene option is not used, the script will continue to process the
entire genome as expected.
  • Loading branch information
j23414 committed Mar 8, 2024
1 parent 570daa6 commit 3400636
Showing 1 changed file with 7 additions and 0 deletions.
7 changes: 7 additions & 0 deletions scripts/newreference.py
Original file line number Diff line number Diff line change
Expand Up @@ -3,6 +3,7 @@
from Bio.SeqFeature import SeqFeature, FeatureLocation, Seq
import shutil
import argparse
import sys

def new_reference(referencefile, outgenbank, outfasta, gene):
ref = SeqIO.read(referencefile, "genbank")
Expand All @@ -22,6 +23,12 @@ def new_reference(referencefile, outgenbank, outfasta, gene):
qualifiers=ref_source_feature.qualifiers)
record.features.append(source_feature)

# If user provides a --gene 'some name' is not found, print a warning and use the entire genome.
# Otherwise do not print a warning.
if(gene != None and startofgene==None and endofgene==None):

This comment has been minimized.

Copy link
@huddlej

huddlej Mar 8, 2024

Thanks for adding this, @j23414! I'd move this conditional immediately after the for loop above, just to avoid trying to extract the sequence feature when you're going to exit with an error anyway.

print(f"ERROR: No '{gene}' was found under 'gene' or 'CDS' features in the GenBank file.", file=sys.stderr)
sys.exit(1)

SeqIO.write(record, outgenbank, 'genbank')
SeqIO.write(record, outfasta, "fasta")

Expand Down

0 comments on commit 3400636

Please sign in to comment.