Quantify the expression of transposable elements #380
I think you should use all sequences for the kallisto run, because TEs are
very diverse and many sequences will not have a good alignment if you align
to the consensus (this depends on the type of TE to some extent). Then,
after alignment, you'd map the reads from the TE copy to a consensus, one
per type.
Such a mapping for human is described here; the accompanying website has
the consensus sequences and a tool for the mapping:
https://mobilednajournal.biomedcentral.com/articles/10.1186/s13100-020-00208-w
The methods point to a script that can make them for other organisms. I
am also aware of an R module for this task (and can point you to it).
Let me know if that doesn't work.
best
Max
…On Thu, Mar 23, 2023 at 9:12 AM Tao ***@***.***> wrote:
Hi,
We are interested in the expression of transposable elements (TE).
An intact TE contains LTR regions and CDS regions that encode transposases,
which behave like polycistronic mRNA.
I was wondering if I could use the CDS regions of each TE as the
reference to quantify TE expression using kallisto.
Another question: should I use all TE CDS as the reference, or TE CDS plus
all predicted genes?
The RNA-seq data we used is standard Illumina-based RNA-seq data.
Thanks very much!
Best, Tao
Thank you so much Max for the response.
If by CDS you mean "coding sequence", I don't know why you would restrict
yourself to that. I'd use the entire sequence of all TEs.
And yes, I would first align to the TE genome sequences and then map from
that to the consensus, not align to the consensus with kallisto.
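One simple way to read the "quantify per copy, then roll up per type" idea is to sum kallisto's per-copy estimates within each TE family. The sketch below only does that aggregation; it is not the coordinate-level mapping tool from the linked paper. It relies on the `target_id` and `est_counts` columns of kallisto's `abundance.tsv`; the copy names, family names, and the `copy_to_family` dictionary are hypothetical placeholders for whatever your annotation provides.

```python
import csv
import io
from collections import defaultdict


def sum_counts_by_family(abundance_tsv, copy_to_family):
    """Sum kallisto est_counts over all TE copies belonging to each family.

    abundance_tsv: an open text handle on a kallisto abundance.tsv file.
    copy_to_family: dict mapping target_id (TE copy) -> family name
                    (hypothetical; build it from your own TE annotation).
    """
    totals = defaultdict(float)
    for row in csv.DictReader(abundance_tsv, delimiter="\t"):
        family = copy_to_family.get(row["target_id"])
        if family is None:
            continue  # target is not a known TE copy, e.g. a gene transcript
        totals[family] += float(row["est_counts"])
    return dict(totals)


# Toy input in the abundance.tsv layout (values are made up for illustration).
toy_abundance = io.StringIO(
    "target_id\tlength\teff_length\test_counts\ttpm\n"
    "copia_copy1\t5000\t4800\t10\t1.0\n"
    "copia_copy2\t4900\t4700\t5\t0.5\n"
    "gypsy_copy1\t7000\t6800\t7\t0.4\n"
    "gene_A\t1500\t1300\t100\t30.0\n"
)
toy_mapping = {
    "copia_copy1": "Copia",
    "copia_copy2": "Copia",
    "gypsy_copy1": "Gypsy",
}

family_counts = sum_counts_by_family(toy_abundance, toy_mapping)
print(family_counts)
```

Keeping genes in the index but leaving them out of the mapping means gene-derived reads are not counted toward any TE family.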
…On Tue, Apr 4, 2023 at 10:55 AM Tao ***@***.***> wrote:
Thank you so much Max for the response.
So you suggest we use all predicted TEs plus predicted genes for kallisto
quantification, and for the predicted TEs we use only the coding regions
(excluding the flanking LTR regions).
After this, I am not quite sure about the purpose of the step you proposed:
do we then classify the mapped reads (those mapped onto TE CDS) into
families, using a consensus sequence for each TE family, in order to see,
for example, which clade of TE is highly expressed?
Thanks very much!
Best, Tao
Oh, LTRs probably not, but I imagine many other elements have
sequences that are neither LTR nor coding, and I'd keep them. Yes, probably
a detail, but it is also easier to take the entire sequence from the
RepeatMasker output than to start filtering on CDS annotations in there.
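As a rough illustration of taking whole elements from the RepeatMasker output, the sketch below parses the standard whitespace-separated `.out` table into intervals that could then be used to pull sequences from the genome. It assumes the usual column order (score, divergence, deletions, insertions, query sequence, begin, end, left, strand, repeat name, class/family, ...); the example line and the repeat name `Gypsy-1_XX-I` are made up.

```python
def parse_repeatmasker_out(lines):
    """Parse RepeatMasker .out lines into a list of interval records.

    Header and blank lines are skipped: data lines start with an
    integer Smith-Waterman score.
    """
    records = []
    for line in lines:
        fields = line.split()
        if len(fields) < 11 or not fields[0].isdigit():
            continue
        records.append({
            "chrom": fields[4],
            "start": int(fields[5]) - 1,  # convert 1-based begin to 0-based
            "end": int(fields[6]),
            "strand": "-" if fields[8] == "C" else "+",  # "C" = complement
            "repeat": fields[9],
            "class_family": fields[10],
        })
    return records


# Made-up example in the .out layout: two header lines, a blank, one hit.
toy_out = [
    "   SW   perc perc perc  query  position in query    matching  repeat",
    "score   div. del. ins.  sequence  begin end (left)  repeat  class/family",
    "",
    " 2345 10.2 1.5 0.8 chr1 1000 6000 (50000) C Gypsy-1_XX-I LTR/Gypsy (0) 5001 1 7",
]

te_records = parse_repeatmasker_out(toy_out)
print(te_records)
```

Each record covers the full annotated element, so no filtering on CDS annotations is needed at this stage.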
…On Tue, Apr 4, 2023 at 12:40 PM Tao ***@***.***> wrote:
I see, thanks very much.
I ask because I think the entire TE contains LTR regions and the coding
regions, similar to what is shown in this figure.
In theory, the LTR regions are not transcribed?
[image: image]
<https://user-images.githubusercontent.com/16197676/229766823-d7fa054a-0f63-4f79-940a-790fd9cdf766.png>
I see, thanks!