ERROR: Fail to convert seq IDs to less than 15 character #14
intirules changed the title from "RROR: Fail to convert seq IDs to less than 15 character" to "ERROR: Fail to convert seq IDs to less than 15 character" on May 4, 2018
Hello,
This message means that the sequence names in the genome are too long for
RepeatMasker, which can only take up to 15 characters. LTR_retriever
tries two approaches to shorten long sequence names to fit this requirement.
They succeed most of the time but can occasionally fail (as with 1 of your
22 genomes). In such cases, you may need to convert the sequence names
manually on the command line, then feed the converted genome to
LTR_retriever. You may not need to rerun LTRharvest if the sequence order
is unchanged.
For sequence name conversion, I have had success with Perl one-liners
such as:
perl -nle 's/PATTERN//g; print $_' genome.fa > genome.fa.modified
Simply replace 'PATTERN' with the shared string among long sequence names.
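For anyone who prefers Python, the same idea can be sketched as below. This is only an illustration, not part of LTR_retriever: the function names `strip_pattern` and `long_ids` and the demo sequence names are made up for this example, and the 15-character limit is simply the figure quoted in this thread. Unlike the one-liner, it edits header lines only and then reports any IDs that are still too long.

```python
import re

MAX_ID_LEN = 15  # limit quoted in this thread; treat it as an assumption


def strip_pattern(lines, pattern):
    """Remove a regex pattern from FASTA header lines only (hypothetical helper)."""
    out = []
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith(">"):
            # Edit the header text, leave sequence lines untouched.
            line = ">" + re.sub(pattern, "", line[1:])
        out.append(line)
    return out


def long_ids(lines, max_len=MAX_ID_LEN):
    """Return sequence IDs (first word of each header) longer than max_len."""
    too_long = []
    for line in lines:
        if line.startswith(">"):
            parts = line[1:].split()
            if parts and len(parts[0]) > max_len:
                too_long.append(parts[0])
    return too_long


if __name__ == "__main__":
    # Made-up records standing in for genome.fa.
    demo = [">scaffold_assembly_v1_chr01 demo", "ACGT",
            ">scaffold_assembly_v1_chr02", "TTGA"]
    fixed = strip_pattern(demo, r"scaffold_assembly_v1_")
    print("\n".join(fixed))
    print("IDs still too long:", long_ids(fixed))
```

To use it on a real file, read the lines with `open("genome.fa")`, pass them through `strip_pattern`, and write the result to a new file, keeping the original genome untouched.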
Let me know if you have further questions.
Shujun
On Fri, May 4, 2018 at 3:58 PM, Agus ***@***.***> wrote:
I want to run my LTRharvest results through LTR_retriever, and 1 of my 22 genomes gives me this problem:
$$$ ERROR: Fail to convert seq IDs to less than 15 characters! Please provide a genome with shorter seq IDs.
For LTRharvest I used:
gt ltrharvest -index 1.fna -seqids Yes tabout no -seed 30 -xdrop 5 -mat 2 -mis -2 -ins -3 -del -3 -minlenltr 100 -maxlenltr 7000 -mindistltr 1000 -maxdistltr 15000 -similar 80.0 -overlaps no -mintsd 4 -maxtsd 20 -motif TGCA -motifmis 1 -vic 60 > 1.harvest.scn
For LTR_retriever:
perl LTR_retriever -genome /medicina/wocana/Tesis/Secuencias/Retriever/1/1.fna -inharvest /medicina/wocana/Tesis/Secuencias/Harvest/1/1.harvest.scn
Any idea how I can fix it?
Thanks. In the end we solved it with a Python script:
$ vi EditFasta
File='1.fna'
f = open(File,'r')
$ python EditFasta
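The script itself is not shown in full here, so the following is not the poster's code. It is a minimal sketch of one way such a renaming script could work, assuming it is acceptable to replace every header with a short sequential ID and keep a lookup table of the original names. The function name `shorten_ids`, the `seq` prefix, and the demo records are all hypothetical.

```python
def shorten_ids(lines, prefix="seq", max_len=15):
    """Replace each FASTA header with a short sequential ID.

    Returns (new_lines, mapping), where mapping maps new ID -> original header.
    Record order is preserved.
    """
    mapping = {}
    out = []
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith(">"):
            new_id = f"{prefix}{len(mapping) + 1}"
            if len(new_id) > max_len:
                raise ValueError(f"generated ID {new_id!r} exceeds {max_len} characters")
            mapping[new_id] = line[1:]  # remember the full original header
            out.append(">" + new_id)
        else:
            out.append(line)
    return out, mapping


if __name__ == "__main__":
    # Made-up records standing in for 1.fna.
    demo = [">a_very_long_sequence_identifier_1 demo", "ACGT",
            ">a_very_long_sequence_identifier_2", "GG"]
    new_lines, mapping = shorten_ids(demo)
    print("\n".join(new_lines))
    for new_id, old_header in mapping.items():
        print(new_id, old_header, sep="\t")
```

Saving the mapping alongside the renamed genome makes it possible to translate results back to the original names later. Since the new IDs no longer match the names LTRharvest saw, check whether the existing .scn file is still usable before relying on it.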