Hi, I'm correcting a query assembly based on a reference assembly and updating the query GFF to reflect the corrections, but I'm hitting the "Inconsistent input files" error.
Here's my command and the error from updategff:
ragtag.py updategff PHRA102.gff PR-102_v3.1_cor-PA-u/ragtag.correct.agp
Thu Sep 29 09:15:16 2022 --- VERSION: RagTag v2.1.0
Thu Sep 29 09:15:16 2022 --- CMD: ragtag.py updategff PHRA102.RXLRs.CRNs.gff PR-102_v3.1_cor-PA-u/ragtag.correct.agp
##gff-version 3
Phyram_PR-102_s0001 AUGUSTUS gene 133386 134116 . + . ID=PHRA102_1;Name=hprt1_2;locus_tag=KRP23_1
Traceback (most recent call last):
File "/nfs5/BPP/Grunwald_Lab/home/carleson/opt/conda/envs/ragtag/bin/ragtag_update_gff.py", line 162, in <module>
main()
File "/nfs5/BPP/Grunwald_Lab/home/carleson/opt/conda/envs/ragtag/bin/ragtag_update_gff.py", line 156, in main
sup_update(gff_file, agp_file)
File "/nfs5/BPP/Grunwald_Lab/home/carleson/opt/conda/envs/ragtag/bin/ragtag_update_gff.py", line 114, in sup_update
raise ValueError("Inconsistent input files.")
ValueError: Inconsistent input files.
And here are the first 10 lines of my GFF:
##gff-version 3
Phyram_PR-102_s0001 AUGUSTUS gene 133386 134116 . + . ID=PHRA102_1;Name=hprt1_2;locus_tag=KRP23_1
Phyram_PR-102_s0001 AUGUSTUS mRNA 133386 134116 . + . ID=PHRA102_1.1;Parent=PHRA102_1;Dbxref=CDD:cd06223,InterPro:IPR000836,InterPro:IPR005904,PFAM:PF00156,TIGRFAM:TIGR01203,UniProtKB/Swiss-Prot:Q6WIT9;Name=hprt1_2;Ontology_term=GO:0009116,GO:0004422,GO:0006166;locus_tag=KRP23_1;product=Hypoxanthine-guanine phosphoribosyltransferase
Phyram_PR-102_s0001 AUGUSTUS exon 133386 133391 . + . ID=PHRA102_1.1-exon1;Parent=PHRA102_1.1;locus_tag=KRP23_1
Phyram_PR-102_s0001 AUGUSTUS exon 133454 134116 . + . ID=PHRA102_1.1-exon2;Parent=PHRA102_1.1;locus_tag=KRP23_1
Phyram_PR-102_s0001 AUGUSTUS CDS 133386 133391 1 + 0 ID=PHRA102_1.1-cds1;Parent=PHRA102_1.1;locus_tag=KRP23_1;product=Hypoxanthine-guanine phosphoribosyltransferase
Phyram_PR-102_s0001 AUGUSTUS CDS 133454 134116 1 + 0 ID=PHRA102_1.1-cds2;Parent=PHRA102_1.1;locus_tag=KRP23_1;product=Hypoxanthine-guanine phosphoribosyltransferase
Phyram_PR-102_s0001 AUGUSTUS intron 133392 133453 . + . ID=PHRA102_1.1-intron1;Parent=PHRA102_1.1;locus_tag=KRP23_1
Phyram_PR-102_s0001 AUGUSTUS start_codon 133386 133388 . + 0 ID=PHRA102_1.1-start_codon1;Parent=PHRA102_1.1;locus_tag=KRP23_1
Phyram_PR-102_s0001 AUGUSTUS stop_codon 134114 134116 . + 0 ID=PHRA102_1.1-stop_codon1;Parent=PHRA102_1.1;locus_tag=KRP23_1
And the first 10 lines of the AGP:
## agp-version 2.1
# AGP created by RagTag v2.1.0
Phyram_PR-102_s0001 1 152786 1 W Phyram_PR-102_s0001_1_152786_+ 1 152786 +
Phyram_PR-102_s0001 152787 381574 2 W Phyram_PR-102_s0001_152787_381574_+ 1 228788 +
Phyram_PR-102_s0001 381575 422454 3 W Phyram_PR-102_s0001_381575_422454_+ 1 40880 +
Phyram_PR-102_s0001 422455 556621 4 W Phyram_PR-102_s0001_422455_556621_+ 1 134167 +
Phyram_PR-102_s0001 556622 711649 5 W Phyram_PR-102_s0001_556622_711649_+ 1 155028 +
Phyram_PR-102_s0001 711650 727677 6 W Phyram_PR-102_s0001_711650_727677_+ 1 16028 +
Phyram_PR-102_s0001 727678 800993 7 W Phyram_PR-102_s0001_727678_800993_+ 1 73316 +
Phyram_PR-102_s0001 800994 1033585 8 W Phyram_PR-102_s0001_800994_1033585_+ 1 232592 +
Based on the code for updategff, I'm getting this error because the component IDs in column 6 of the AGP don't have a match in the GFF I provided, but I thought I was using this script for exactly its described purpose. Let me know if I'm reading the documentation wrong.
Thanks for any help!
I realized I needed to use the -c flag so the tool looks for the AGP object in the gff instead of looking for the AGP component.
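For anyone hitting the same error, one way to tell which case applies is to compare the sequence IDs in the GFF against column 1 (object) and column 6 (component) of the AGP. This is only a rough sketch, not RagTag's own code: the helper names `agp_ids`, `gff_seqids` and `guess_mode` are made up for illustration, and it assumes whitespace- or tab-delimited lines with no spaces inside the sequence IDs.

```python
def agp_ids(agp_lines):
    """Collect object IDs (column 1) and component IDs (column 6) from AGP lines."""
    objects, components = set(), set()
    for line in agp_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and header comments
        fields = line.split()
        objects.add(fields[0])
        # Column 5 is the component type; N and U rows are gaps and carry no component ID.
        if fields[4] not in ("N", "U"):
            components.add(fields[5])
    return objects, components


def gff_seqids(gff_lines):
    """Collect the sequence IDs (column 1) used by GFF feature lines."""
    seqids = set()
    for line in gff_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines, directives and comments
        seqids.add(line.split(None, 1)[0])
    return seqids


def guess_mode(gff_lines, agp_lines):
    """Report whether the GFF sequence IDs match AGP components, objects, or neither."""
    objects, components = agp_ids(agp_lines)
    seqids = gff_seqids(gff_lines)
    if seqids and seqids <= components:
        return "component"
    if seqids and seqids <= objects:
        return "object"
    return "inconsistent"
```

With the files above, the GFF uses `Phyram_PR-102_s0001`, which appears only in column 1 of the AGP, so this check would report "object", which is the case I needed -c for.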
Then I found that updategff requires that the AGP contain no gaps. I got around that error by adding a new flag, -s, for when splitasm was used; gaps are allowed only if -s is set, so the current behavior is otherwise unchanged.
I still don't have a complete fix, because some of the genes in the GFF overlap stretches of Ns that were removed by splitasm. I could either:
Change the ValueError to a warning and just drop any feature overlapping Ns, or
Add a -gff option to splitasm, as implemented in the correct module
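To make the first option concrete, here is a rough sketch of what "warn and drop" could look like. It is not taken from ragtag_update_gff.py: `overlaps` and `drop_gap_features` are hypothetical helpers, features are assumed to be tuples starting with (seqid, start, end), and gap coordinates are assumed to be 1-based and inclusive, as in GFF and AGP.

```python
import warnings


def overlaps(start, end, gaps):
    """True if the 1-based inclusive interval [start, end] overlaps any gap interval."""
    return any(start <= g_end and g_start <= end for g_start, g_end in gaps)


def drop_gap_features(features, gaps_by_seq):
    """Keep features that do not overlap a removed gap; warn about the rest.

    features: iterable of tuples beginning with (seqid, start, end)
    gaps_by_seq: dict mapping seqid -> list of (gap_start, gap_end)
    """
    kept = []
    for feat in features:
        seqid, start, end = feat[0], feat[1], feat[2]
        if overlaps(start, end, gaps_by_seq.get(seqid, [])):
            # Warn instead of raising, so the remaining features can still be lifted over.
            warnings.warn(f"Dropping feature {seqid}:{start}-{end}: overlaps removed Ns")
            continue
        kept.append(feat)
    return kept
```

One caveat with this approach: dropping a gene but keeping its children (or the reverse) would leave orphaned Parent references, so a real implementation would probably need to drop whole gene models together.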