Converts to embl but with interrogation marks in SQ #35
Your GFF annotation contains features attached to a sequence that is absent from the FASTA file provided. The converter therefore creates an unknown sequence (filled with ??? because it cannot guess the DNA sequence) whose length corresponds to the position of the last feature found on it. Check your FASTA file; it should be the same one that was used for the annotation. Otherwise you can filter out the annotations not attached to sequences present in the FASTA file using the script
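To check for this mismatch before converting, the sequence IDs in the two files can be compared. The following is a minimal sketch, not the script mentioned above: the function names are hypothetical, and it assumes the FASTA ID is the first word after `>` and that the GFF3 seqid is the first tab-separated column.

```python
def fasta_ids(lines):
    """Return the set of sequence IDs (first word after '>') in FASTA lines."""
    ids = set()
    for line in lines:
        if line.startswith(">"):
            fields = line[1:].split()
            if fields:
                ids.add(fields[0])
    return ids


def gff_seqids(lines):
    """Return the set of seqid values (column 1) used by GFF3 feature lines."""
    ids = set()
    for line in lines:
        if line.startswith("##FASTA"):
            break  # embedded sequence section: no feature lines follow
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines, comments and directives
        ids.add(line.split("\t", 1)[0])
    return ids


def missing_from_fasta(gff_lines, fasta_lines):
    """Seqids that carry features in the GFF but have no FASTA record."""
    return sorted(gff_seqids(gff_lines) - fasta_ids(fasta_lines))


def filter_gff(gff_lines, keep_ids):
    """Keep comment/directive lines and features whose seqid is in keep_ids."""
    kept = []
    for line in gff_lines:
        if not line.strip() or line.startswith("#"):
            kept.append(line)
        elif line.split("\t", 1)[0] in keep_ids:
            kept.append(line)
    return kept


# Hypothetical usage (file names are placeholders):
# with open("annotation.gff") as g, open("genome.fa") as f:
#     print(missing_from_fasta(g.readlines(), f.readlines()))
```

Any ID reported by `missing_from_fasta` would be expected to produce a record whose sequence is all `?`. Note that `filter_gff` leaves directives such as `##sequence-region` untouched, so those may still mention the removed sequences.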
Thanks, that worked. Just one more question: when I was trying to obtain the EMBL file for a different species, I encountered the following error:
Is the problem due to the files I'm using as input?
The original question is resolved, so I am closing this issue. I created a new issue reporting the other problem you encountered => see #38
I'm working with yeasts and I really need their EMBL files, but when I run the program (using bash and Python), I run into the following problem.
These messages appear in the terminal (though I don't think they are the cause of the problem):
17:25:17 ERROR feature: >>trna<< is not a valid EMBL feature type. You can ignore this message if you don't need the feature.
Otherwise tell me which EMBL feature it corresponds to by adding the information within the json mapping file.
17:25:17 WARNING feature: Unknown qualifier 'NAME' - skipped
17:25:17 ERROR feature: >>trna_exon<< is not a valid EMBL feature type. You can ignore this message if you don't need the feature.
Otherwise tell me which EMBL feature it corresponds to by adding the information within the json mapping file.
17:25:30 ERROR feature: >>UTR<< is not a valid EMBL feature type. You can ignore this message if you don't need the feature.
Otherwise tell me which EMBL feature it corresponds to by adding the information within the json mapping file.
Conversion done
As the final line says, the conversion completes. When I open the generated EMBL file the features are fine, but the sequence consists entirely of question marks:
FT /transl_table=12
XX
SQ Sequence 2596028 BP; 0 A; 0 C; 0 G; 0 T; 2596028 other;
?????????? ?????????? ?????????? ?????????? ?????????? ?????????? 60
?????????? ?????????? ?????????? ?????????? ?????????? ?????????? 120
And if I keep scrolling, it is as if the conversion had started again:
//
ID XXX; XXX; linear; genomic DNA; XXX; XXX; 2596667 BP.
XX
AC XXX;
XX
AC * SOME_YEAST
XX
PR Project:XXX;
After that, the only feature present is "gap", and the sequence (SQ) now looks the way it is supposed to:
FT gap 2556681..2556981
FT /estimated_length=301
XX
SQ Sequence 2596667 BP; 806943 A; 475017 C; 477105 G; 804281 T; 33321 other;
AATCTGCTCA GTAAGGCCCA TAAATCGGCT CTGCATTTCT TCTGTGGGCA TTTTGCCGTA 60
CTTTTTTAAT TATGTTGCAG ACGAAACTGA ATCAAGCTCG TCGACAGCTT CGTACAGCCT 120
I have no idea why this happens, and I really hope you can help me figure it out.
I really need those EMBL files.
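The two kinds of record described above can be told apart from their SQ header lines alone: the placeholder record reports zero A, C, G and T. Here is a minimal sketch that scans an EMBL file for such records; the function names are hypothetical, and it assumes every record has an SQ line in the form shown in the excerpts above.

```python
import re

# Matches e.g. "SQ   Sequence 2596028 BP; 0 A; 0 C; 0 G; 0 T; 2596028 other;"
SQ_RE = re.compile(
    r"^SQ\s+Sequence\s+(\d+)\s+BP;\s*(\d+)\s+A;\s*(\d+)\s+C;"
    r"\s*(\d+)\s+G;\s*(\d+)\s+T;\s*(\d+)\s+other;"
)


def sq_summaries(lines):
    """Return one dict per SQ header line, in file order."""
    summaries = []
    for line in lines:
        match = SQ_RE.match(line)
        if match:
            length, a, c, g, t, other = map(int, match.groups())
            summaries.append(
                {"length": length, "acgt": a + c + g + t, "other": other}
            )
    return summaries


def placeholder_records(lines):
    """Indices (0-based) of records whose sequence contains no A/C/G/T."""
    return [
        i
        for i, s in enumerate(sq_summaries(lines))
        if s["length"] > 0 and s["acgt"] == 0
    ]
```

For the output quoted in this issue, the first record would be flagged and the second would not. That pattern (one record with features but no sequence, another with sequence but only gap features) is what one would expect if the sequence ID in the GFF does not match the header in the FASTA file.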