'NoneType' object has no attribute 'strand' #5

Open
housw opened this issue Apr 21, 2019 · 6 comments

Comments

@housw

housw commented Apr 21, 2019

Hi,

I got the following error while testing viga using my own viral contigs:

... 
Detected 8 ORFs in CONTIG_996.fna
Detected 12 ORFs in CONTIG_997.fna
Detected 8 ORFs in CONTIG_998.fna
Detected 28 ORFs in CONTIG_999.fna
Done: protein prediction took 1270.81176209 seconds

Running DIAMOND to predict the protein function according to homology inference using default parameters
Done: function prediction based on DIAMOND took 1106.41461515 seconds

Running BLAST to predict the genes according to homology inference using default parameters
Done: function prediction based on BLAST took 13182.612757 seconds

Running HMMER to enrich the annotations for all viral proteins using PVOGs
Done: function prediction based on HMMER took 2681.79227614 seconds

Creating all output files
Cleaning all intermediate files
/opt/conda/lib/python2.7/site-packages/Bio/GenBank/__init__.py:1181: BiopythonParserWarning: Couldn't parse feature location: '-1..86'
  % location_line))
Traceback (most recent call last):
  File "/viga/VIGA.py", line 1121, in <module>
    if line.strand == 1:
  File "/opt/conda/lib/python2.7/site-packages/Bio/SeqFeature.py", line 167, in _get_strand
    return self.location.strand
AttributeError: 'NoneType' object has no attribute 'strand'

The genbank, csv and tbl files look great, though. Any idea?

Thanks,
Shengwei
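
The traceback above fails on `line.strand` because the feature's `location` is `None`, which is consistent with the parser warning about `'-1..86'` printed just before it. A minimal sketch of a guard follows, assuming the loop around line 1121 of VIGA.py iterates over Biopython `SeqFeature`-like objects. The helper `safe_strand` and the stand-in objects are hypothetical and exist only so the snippet runs without Biopython installed.

```python
from types import SimpleNamespace


def safe_strand(feature):
    """Return the feature's strand, or None when its location is missing.

    Reads feature.location directly instead of feature.strand, so a feature
    whose location could not be parsed does not raise AttributeError.
    """
    location = getattr(feature, "location", None)
    if location is None:
        return None
    return getattr(location, "strand", None)


# Hypothetical stand-ins for parsed features (not real Biopython objects).
forward = SimpleNamespace(location=SimpleNamespace(strand=1))
reverse = SimpleNamespace(location=SimpleNamespace(strand=-1))
unparsed = SimpleNamespace(location=None)  # e.g. a rejected location string

for feature in (forward, reverse, unparsed):
    strand = safe_strand(feature)
    if strand is None:
        print("skipping feature without a usable location")
    elif strand == 1:
        print("forward strand")
    else:
        print("reverse strand")
```

Skipping such features only hides the symptom; the entry with the unparseable location would still need to be found and corrected in the GenBank output.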

@EGTortuero
Owner

Hi Shengwei,

First of all, sorry for not replying to your message sooner. Unfortunately, I did not receive the notification by email and only discovered your issue concerning VIGA today. Thank you for using this Python script to annotate your viral sequences.

Now, concerning the issue: this error means that one of the sequences was "empty". However, you said that the files looked great. Could you please post the last lines of the new fasta and tbl files? That may help to see what went wrong.

Best,

Enrique
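
The `BiopythonParserWarning` in the log names the location string `'-1..86'`, so one way to narrow the problem down is to search the GenBank output for feature locations with a coordinate below 1. The sketch below assumes the standard GenBank flat-file layout (feature key indented by five spaces, followed by the location). The function name `suspicious_locations` and the sample lines are hypothetical and are not taken from the reporter's files.

```python
import re

# A feature line in a GenBank flat file: five spaces, the feature key,
# padding, then the location string.
FEATURE_LINE = re.compile(r"^ {5}(\S+)\s+(\S+)\s*$")


def suspicious_locations(lines):
    """Yield (line_number, key, location) for locations with a coordinate < 1."""
    for number, line in enumerate(lines, start=1):
        match = FEATURE_LINE.match(line.rstrip("\n"))
        if not match:
            continue
        key, location = match.groups()
        # Pull out every integer, keeping a leading minus sign if present.
        coordinates = [int(c) for c in re.findall(r"-?\d+", location)]
        if any(c < 1 for c in coordinates):
            yield number, key, location


# Hypothetical sample; for a real file, pass an open file handle instead.
sample = [
    "FEATURES             Location/Qualifiers",
    "     source          1..4000",
    "     CDS             complement(1034..3655)",
    '                     /locus_tag="EXAMPLE_1"',
    "     misc_feature    -1..86",
]
for number, key, location in suspicious_locations(sample):
    print(f"line {number}: {key} {location}")
```

GenBank coordinates are 1-based, so a value of 0 or below cannot be a valid position, which is why the check uses `< 1`.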

@housw
Author

housw commented Jun 21, 2019

Hi Enrique,

sure, here are the last 20 lines of the tbl output:

$ tail -n 20 test_output.tbl
			note	PVOGs: VOG4548
>Feature PierDTS_4244
486	656	gene
			locus_tag	PierDTS_4244_1
486	656	CDS
			locus_tag	PierDTS_4244_1
			product	Hypothetical protein
			protein_id	PierDTS_4244_1
1004	837	gene
			locus_tag	PierDTS_4244_2
1004	837	CDS
			locus_tag	PierDTS_4244_2
			product	Hypothetical protein
			protein_id	PierDTS_4244_2
3655	1034	gene
			locus_tag	PierDTS_4244_3
3655	1034	CDS
			locus_tag	PierDTS_4244_3
			product	Hypothetical protein
			protein_id	PierDTS_4244_3

and here are the last 20 lines of the output fasta:

$ tail -n 20 test_output.fasta
TATAAACAGTAGATGAAGTAGCAGCAAAACTACTTAATGCTCCACCAGAAACTGTAATGT
CAGCTTCTGCAAAGTCAGTTGTTGCTTCACTCGAGGTGAATGTGAGGGATAGTGTGCCAT
CATTAGATGTATCTCCATCACTGACTTCAGTTGCAGTGATTGCCATCGTTGGTGCAGTGG
TATCTACCACAGGGCTAGTGTATTCAGCCACACCAGTATCAGACACATTTGCAGTGGTGT
TTGTAGTAAAGATAACCACTTGCTCATCTGCTGCCGTAGCAGTGTACCAGTCGTTTTGAA
GAATAACACCAGACGTAAGAGTAGGTGCAGTAGCGCCTGTACCCGCTATTGCCATGGGAG
TTCCGTGTAGTTTGCCTGAACCTGGAAAATCGTGGAATAGTACGACATCACGGTCAAAGT
TAGTAGAGAAACTAGCATTTTGAACGGTATAGCCGTTAACAATGGTACTAACATACTTAT
AGTGAGTACCACTTATGACTTCTTTCCAGAACTTATAACCAAACTTAAGATGATGTGCAC
CTGTAGTTGCCCAGTTTTGACCAGTCTTTTGAAAACCAATATAGACAGAGTTGTTAGAGG
TTCCAGGACTTCCCATTGCATTAAAGATATTATCGTTAATCCAGTCTTTGGTGAATACAA
CCCGTTTACCGGACTCTGTTACATCGTCAAAGGTCATCGCTGAACCAGCATCCATTGTAT
CGCTATCAACCATAGGTGCTGAAGCCGCTATGTGAGTAAAGTTAGCATAAGCCTTTGTGC
TTCCAAAAATGAAAACTGTGATTATAAAAATGACTTTAATAATAAGCTGCTGCATCTTTT
TTCTTACTAAATTACTAAAAGTTTATAAACACACGAATTAGCTTTTCGCCGCATTGTAAG
GTACTCAATATCAATATAAAATCAATNNTAAATGAATGACTTTTACGCAACGTTCATCTA
TTTCTAGTTATATTAAAGTAATGCATTCTAGAATGACTCTCAGACTCGAAATAAGCGTGT
CTGGAGTCTACCTGGCTGAGCATATACCCAAGCATAGGGAATACCGAAAACCCATTTAAG
TACTTCAAAACTTCGCATACTTGCAAAATCTGATGCTTGATCTACTTAGTCAGTCGACAC
CGCCACTTAAATAAGTTAGTTTCG

Any idea?

Thanks,
Shengwei

@EGTortuero
Owner

Hi Shengwei,

Sorry, I meant the last headers in both the fasta file and the tbl file. I want to check whether the names in both files are the same. If they are, I would say that the files are OK and I need to check why this error appears.

Best,

Enrique
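
The header comparison requested above can also be done programmatically. The following is a minimal sketch, assuming FASTA headers of the form `>ID description` and feature-table headers of the form `>Feature ID`. The function names and the shortened sample lines are illustrative and are not part of VIGA.

```python
def fasta_ids(lines):
    """Collect sequence IDs from FASTA header lines ('>ID description')."""
    ids = []
    for line in lines:
        if line.startswith(">"):
            parts = line[1:].split()
            if parts:
                ids.append(parts[0])
    return ids


def tbl_ids(lines):
    """Collect sequence IDs from feature-table headers ('>Feature ID')."""
    ids = []
    for line in lines:
        if line.startswith(">Feature"):
            parts = line.split()
            if len(parts) > 1:
                ids.append(parts[1])
    return ids


def compare_ids(fasta_lines, tbl_lines):
    """Return (IDs only in the FASTA, IDs only in the tbl), both sorted."""
    in_fasta = set(fasta_ids(fasta_lines))
    in_tbl = set(tbl_ids(tbl_lines))
    return sorted(in_fasta - in_tbl), sorted(in_tbl - in_fasta)


# Small inline sample; for real files, pass open file handles instead.
sample_fasta = [
    ">PierDTS_4243 [organism=Serratia marcescens subsp. marcescens]",
    "ACGT",
    ">PierDTS_4244 [organism=Serratia marcescens subsp. marcescens]",
    "ACGT",
]
sample_tbl = [
    ">Feature PierDTS_4243",
    "357\t205\tgene",
    ">Feature PierDTS_4244",
    "486\t656\tgene",
]
only_fasta, only_tbl = compare_ids(sample_fasta, sample_tbl)
print("only in fasta:", only_fasta)
print("only in tbl:", only_tbl)
```

Two empty lists mean that every ID appears in both files; any ID listed on one side only points to a sequence worth inspecting by hand.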

@housw
Author

housw commented Jul 2, 2019

Hi Enrique,

yes, here it is:

tail -n 95  test_output.tbl 
548	587	repeat_region
>Feature PierDTS_4242
3672	1123	gene
			locus_tag	PierDTS_4242_1
3672	1123	CDS
			locus_tag	PierDTS_4242_1
			product	Hypothetical protein
			protein_id	PierDTS_4242_1
			note	PVOGs: VOG9296
3599	3985	repeat_region
>Feature PierDTS_4243
357	205	gene
			locus_tag	PierDTS_4243_1
357	205	CDS
			locus_tag	PierDTS_4243_1
			product	Hypothetical protein
			protein_id	PierDTS_4243_1
615	439	gene
			locus_tag	PierDTS_4243_2
615	439	CDS
			locus_tag	PierDTS_4243_2
			product	Hypothetical protein
			protein_id	PierDTS_4243_2
852	643	gene
			locus_tag	PierDTS_4243_3
852	643	CDS
			locus_tag	PierDTS_4243_3
			product	Hypothetical protein
			protein_id	PierDTS_4243_3
983	852	gene
			locus_tag	PierDTS_4243_4
983	852	CDS
			locus_tag	PierDTS_4243_4
			product	Hypothetical protein
			protein_id	PierDTS_4243_4
1729	1292	gene
			locus_tag	PierDTS_4243_5
1729	1292	CDS
			locus_tag	PierDTS_4243_5
			product	Hypothetical-Protein | belonging to T4-LIKE GC: 66 [Synechococcus phage S-PM2]
			protein_id	PierDTS_4243_5
			note	PVOGs: VOG1085
2341	2045	gene
			locus_tag	PierDTS_4243_6
2341	2045	CDS
			locus_tag	PierDTS_4243_6
			product	Hypothetical protein
			protein_id	PierDTS_4243_6
			note	PVOGs: VOG4505
2504	2334	gene
			locus_tag	PierDTS_4243_7
2504	2334	CDS
			locus_tag	PierDTS_4243_7
			product	Hypothetical protein
			protein_id	PierDTS_4243_7
			note	PVOGs: VOG9879
2653	2504	gene
			locus_tag	PierDTS_4243_8
2653	2504	CDS
			locus_tag	PierDTS_4243_8
			product	Hypothetical protein
			protein_id	PierDTS_4243_8
			note	PVOGs: VOG4043
2965	2708	gene
			locus_tag	PierDTS_4243_9
2965	2708	CDS
			locus_tag	PierDTS_4243_9
			product	helicase [Synechococcus phage S-CAM7]
			protein_id	PierDTS_4243_9
3888	2965	gene
			locus_tag	PierDTS_4243_10
3888	2965	CDS
			locus_tag	PierDTS_4243_10
			product	helicase [Synechococcus phage S-CAM7]
			protein_id	PierDTS_4243_10
			note	PVOGs: VOG4548
>Feature PierDTS_4244
486	656	gene
			locus_tag	PierDTS_4244_1
486	656	CDS
			locus_tag	PierDTS_4244_1
			product	Hypothetical protein
			protein_id	PierDTS_4244_1
1004	837	gene
			locus_tag	PierDTS_4244_2
1004	837	CDS
			locus_tag	PierDTS_4244_2
			product	Hypothetical protein
			protein_id	PierDTS_4244_2
3655	1034	gene
			locus_tag	PierDTS_4244_3
3655	1034	CDS
			locus_tag	PierDTS_4244_3
			product	Hypothetical protein
			protein_id	PierDTS_4244_3

and the fasta file:

grep '^>' test_output.fasta  | tail -n 10 
>PierDTS_4236 [organism=Serratia marcescens subsp. marcescens] [sub-species=marcescens] [strain=AH0650_Sm1] [moltype=DNA] [tech=wgs] [gcode=11] [country=Australia] 
>PierDTS_4237 [organism=Serratia marcescens subsp. marcescens] [sub-species=marcescens] [strain=AH0650_Sm1] [moltype=DNA] [tech=wgs] [gcode=11] [country=Australia] 
>PierDTS_4238 [organism=Serratia marcescens subsp. marcescens] [sub-species=marcescens] [strain=AH0650_Sm1] [moltype=DNA] [tech=wgs] [gcode=11] [country=Australia] 
>PierDTS_4239 [organism=Serratia marcescens subsp. marcescens] [sub-species=marcescens] [strain=AH0650_Sm1] [moltype=DNA] [tech=wgs] [gcode=11] [country=Australia] 
>PierDTS_424 [organism=Serratia marcescens subsp. marcescens] [sub-species=marcescens] [strain=AH0650_Sm1] [moltype=DNA] [tech=wgs] [gcode=11] [country=Australia] 
>PierDTS_4240 [organism=Serratia marcescens subsp. marcescens] [sub-species=marcescens] [strain=AH0650_Sm1] [moltype=DNA] [tech=wgs] [gcode=11] [country=Australia] 
>PierDTS_4241 [organism=Serratia marcescens subsp. marcescens] [sub-species=marcescens] [strain=AH0650_Sm1] [moltype=DNA] [tech=wgs] [gcode=11] [country=Australia] 
>PierDTS_4242 [organism=Serratia marcescens subsp. marcescens] [sub-species=marcescens] [strain=AH0650_Sm1] [moltype=DNA] [tech=wgs] [gcode=11] [country=Australia] 
>PierDTS_4243 [organism=Serratia marcescens subsp. marcescens] [sub-species=marcescens] [strain=AH0650_Sm1] [moltype=DNA] [tech=wgs] [gcode=11] [country=Australia] 
>PierDTS_4244 [organism=Serratia marcescens subsp. marcescens] [sub-species=marcescens] [strain=AH0650_Sm1] [moltype=DNA] [tech=wgs] [gcode=11] [country=Australia]

Cheers,
Shengwei

@EGTortuero
Owner

Hi Shengwei,

In principle, I would say that the results are fine, as you indicated in the first comment. I need to polish the source code a little so that it does not print spurious errors when everything went fine (I may need to check the last lines of the program's source code to see why this error was printed).

Best,

Enrique

@housw
Author

housw commented Jul 8, 2019

Hi Enrique,

great, thanks for confirming it.

Cheers,
Shengwei
