ValueError: missing molecule_type in annotations #41

Closed
tauqeer9 opened this issue Sep 12, 2020 · 5 comments

@tauqeer9

Hi,
I get the following error with --output_choice 4 or --output_choice 8. I don't get bacteria.fasta, bacteria.gbk, or phage.gbk, although I do get phage.fasta. What could be the reason? All other options work fine.

PhiSpy.py Streptococcus_pyogenes_M1_GAS.gbk -o M.phages -p M1 --threads 4 --log M1.log --output_choice 4

ValueError: missing molecule_type in annotations

Thank you so much.

@pdec
Collaborator

pdec commented Sep 16, 2020

Hi,

thanks for using PhiSpy!
Can you tell us with which version of PhiSpy you got this error?

Simply run:
PhiSpy.py --version

Also, can you provide the whole error message or a log file?
Is Streptococcus_pyogenes_M1_GAS.gbk the file we provide in the tests directory?
Do you get this error with any other GenBank file?

According to the code table, --output_choice 8 should produce only the prophage_information.tsv file. While checking that, I found a typo in the code that specifically affects --output_choice 8; it is fixed in version 4.2.4 on the master branch.

Thanks,
Przemek

@tauqeer9
Author

tauqeer9 commented Sep 18, 2020

Thank you so much.

$ PhiSpy.py --version
4.1.22

$ PhiSpy.py Streptococcus_pyogenes_M1_GAS.gbk -o M1.phages -p M1 --threads 4 --log M1_choice4.log --output_choice 4

Processing 1 contigs
Making Testing Set...
Start Classification Algorithm...
Using the following metric(s): {'at_skew', 'gc_skew', 'orf_length_med', 'shannon_slope', 'max_direction'}.
Running the random forest classifier with 500 trees and 4 threads
As the training flag is zero, down-weighting unknown functions
Evaluating...
Checking prophages we might have found
Potential prophages (sorted highest to lowest)
Contig Start Stop Number of potential genes Status
NC_002737 778642 820599 54 Kept
NC_002737 1191309 1222549 47 Kept
NC_002737 529631 569288 45 Kept
NC_002737 1770150 1785658 22 Kept
NC_002737 892723 893805 4 Dropped. Not enough genes
NC_002737 176004 177861 2 Dropped. Not enough genes
NC_002737 1808123 1810396 1 Dropped. Not enough genes
NC_002737 1665186 1665887 1 Dropped. Not enough genes
NC_002737 1544661 1545035 1 Dropped. Not enough genes
NC_002737 980732 981802 1 Dropped. Not enough genes
NC_002737 711922 712041 1 Dropped. Not enough genes
NC_002737 449392 449844 1 Dropped. Not enough genes
NC_002737 361432 362130 1 Dropped. Not enough genes
NC_002737 315251 315997 1 Dropped. Not enough genes
NC_002737 190526 191230 1 Dropped. Not enough genes
NC_002737 49621 51264 1 Dropped. Not enough genes
PROPHAGE: 1 Contig: NC_002737 Start: 529631 Stop: 569288
PROPHAGE: 2 Contig: NC_002737 Start: 778642 Stop: 820599
PROPHAGE: 3 Contig: NC_002737 Start: 1191309 Stop: 1222549
There were 3 repeats with the same length as the best. One chosen somewhat randomly!
PROPHAGE: 4 Contig: NC_002737 Start: 1770150 Stop: 1785658
Creating output files
Writing bacterial and phage DNA as fasta
Traceback (most recent call last):
  File "/opt/anaconda3/bin/PhiSpy.py", line 125, in <module>
    main(sys.argv)
  File "/opt/anaconda3/bin/PhiSpy.py", line 117, in main
    PhiSpyModules.write_all_outputs(**vars(args_parser))
  File "/opt/anaconda3/lib/python3.8/site-packages/PhiSpyModules/writers.py", line 361, in write_all_outputs
    write_phage_and_bact(self)
  File "/opt/anaconda3/lib/python3.8/site-packages/PhiSpyModules/writers.py", line 151, in write_phage_and_bact
    SeqIO.write(pp_gbk, phage_genbank, "genbank")
  File "/opt/anaconda3/lib/python3.8/site-packages/Bio/SeqIO/__init__.py", line 530, in write
    count = writer_class(handle).write_file(sequences)
  File "/opt/anaconda3/lib/python3.8/site-packages/Bio/SeqIO/Interfaces.py", line 244, in write_file
    count = self.write_records(records, maxcount)
  File "/opt/anaconda3/lib/python3.8/site-packages/Bio/SeqIO/Interfaces.py", line 218, in write_records
    self.write_record(record)
  File "/opt/anaconda3/lib/python3.8/site-packages/Bio/SeqIO/InsdcIO.py", line 981, in write_record
    self._write_the_first_line(record)
  File "/opt/anaconda3/lib/python3.8/site-packages/Bio/SeqIO/InsdcIO.py", line 744, in _write_the_first_line
    raise ValueError("missing molecule_type in annotations")
ValueError: missing molecule_type in annotations

The following two examples also give the same error:

PhiSpy.py CP015626.gbk -o CP015626.phages -p CP015626 --threads 4 --log CP015626_choice4.log --output_choice 4
PhiSpy.py CP016072.gbk -o CP016072.phages -p CP016072 --threads 4 --log CP016072_choice4.log --output_choice 4

--output_choice 4: does not work as documented; it only generates M1_prophage.fasta, and the other 3 files are empty
--output_choice 8: does not work as documented; it only generates M1_prophage.fasta, and the other 3 files are empty
--output_choice 11: interestingly, it works and generates M1_prophage_coordinates.tsv, M1_prophage_information.tsv and M1_Streptococcus_pyogenes_M1_GAS.gbk

@pdec
Collaborator

pdec commented Sep 23, 2020

Hey,

thanks for the note!

I could reproduce your error after updating Biopython to v1.78. Starting with this version, Bio.Alphabet has been removed, which requires some changes in our code. More about that here.
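
For anyone hitting the same ValueError in their own scripts, here is a minimal sketch of the usual workaround: make sure every record carries a "molecule_type" annotation before it is written as GenBank. The helper name ensure_molecule_type and the "DNA" default are assumptions made for illustration; this is not the change that went into PhiSpy. The demo uses stand-in objects so that it runs without Biopython installed.

```python
from types import SimpleNamespace


def ensure_molecule_type(records, default="DNA"):
    """Yield each record, adding annotations['molecule_type'] if it is absent.

    Works on any object with a dict-like ``annotations`` attribute, which is
    what Biopython's SeqRecord provides. Existing values are left untouched.
    """
    for record in records:
        record.annotations.setdefault("molecule_type", default)
        yield record


# Stand-ins for SeqRecord objects (hypothetical data, for illustration only).
demo = [
    SimpleNamespace(annotations={}),                        # would fail on write
    SimpleNamespace(annotations={"molecule_type": "RNA"}),  # already annotated
]
fixed = list(ensure_molecule_type(demo))
print([r.annotations["molecule_type"] for r in fixed])
```

With real records, the generator can be passed straight to the writer, e.g. `SeqIO.write(ensure_molecule_type(records), handle, "genbank")`.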

To avoid your error, you can either switch to a previous Biopython version (e.g. conda install biopython=1.77) or use the newest PhiSpy version, v4.2.5.
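
To see which of the two situations applies on a given machine, a small check along these lines may help. It is only a sketch: the function names are made up for this example, and the 1.78 threshold is simply the version mentioned above.

```python
import re
from importlib import metadata


def parse_major_minor(version):
    """Return (major, minor) from a version string such as '1.78'."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognised version string: {version!r}")
    return int(match.group(1)), int(match.group(2))


def alphabet_removed(biopython_version):
    """True if this Biopython version is 1.78 or later (no Bio.Alphabet)."""
    return parse_major_minor(biopython_version) >= (1, 78)


def installed_biopython_version():
    """Return the installed Biopython version string, or None if not installed."""
    try:
        return metadata.version("biopython")
    except metadata.PackageNotFoundError:
        return None


version = installed_biopython_version()
if version is None:
    print("Biopython is not installed")
else:
    print(version, "affected" if alphabet_removed(version) else "not affected")
```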

I recommend the second option, as we also fixed --output_choice 8. In version 4.1.22, code 8 behaves like code 7 because of a typo (">" instead of ">="). Note that if you want several different output files, you must add the code numbers together, e.g. code 11 will provide the output of codes 8, 2 and 1.
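
The additive codes, and one plausible way a ">" typo could turn code 8 into code 7, can be sketched as follows. This is an illustration written for this thread, not PhiSpy's actual implementation: the function name, the buggy_eight switch and the restriction to codes 8, 4, 2 and 1 are all assumptions.

```python
# Hypothetical subset of the output codes, highest first.
OUTPUT_CODES = (8, 4, 2, 1)


def decode_output_choice(choice, buggy_eight=False):
    """Split an additive output choice into its individual codes.

    With buggy_eight=True the comparison for code 8 uses '>' instead of '>=',
    which mimics the kind of typo described above.
    """
    selected = []
    remaining = choice
    for code in OUTPUT_CODES:
        if code == 8 and buggy_eight:
            hit = remaining > code   # typo: misses the case remaining == 8
        else:
            hit = remaining >= code  # intended comparison
        if hit:
            selected.append(code)
            remaining -= code
    return selected


print(decode_output_choice(11))                   # codes 8, 2 and 1
print(decode_output_choice(8))                    # code 8 only
print(decode_output_choice(8, buggy_eight=True))  # falls through to 4, 2 and 1
```

Under these assumptions, a choice of exactly 8 slips past the strict comparison and is decoded as 4 + 2 + 1 = 7, while 11 still decodes correctly, which would be consistent with the behaviour reported earlier in this thread.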

Let us know whether it fixes your error.

Przemek

@tauqeer9
Author

Thank you very much. It is working perfectly fine now. I installed the latest version using git and will eventually install it through bioconda when the latest version is available there.

Thanks again for fixing those errors.

Tauqeer

@pdec
Collaborator

pdec commented Sep 25, 2020

That's great!
Thanks for letting us know about the error.

Przemek

@pdec pdec closed this as completed Sep 25, 2020
@pdec pdec added the bug label Sep 25, 2020
@pdec pdec pinned this issue Sep 25, 2020
@pdec pdec unpinned this issue Sep 25, 2020