First of all thank you for your work on publishing the code and pipeline.
I was wondering if you could share more details on how you have processed your data.
I downloaded the data you generated for the COV413A cell line and processed it according to your pipeline. Some additional preprocessing steps were necessary, including splitting the interleaved FASTQ into individual files and running STAR and StringTie.
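For reference, the de-interleaving step mentioned above can be sketched roughly as follows. This is a minimal illustration, not the exact command that was run: it assumes standard four-line FASTQ records with mates strictly alternating (R1, R2, R1, R2, ...), and the function names `open_text` and `deinterleave_fastq` are hypothetical helpers, not part of the pipeline.

```python
import gzip
from itertools import islice


def open_text(path, mode):
    """Open a plain or gzip-compressed text file, chosen by file extension."""
    if str(path).endswith(".gz"):
        return gzip.open(path, mode + "t")
    return open(path, mode)


def deinterleave_fastq(interleaved, out_r1, out_r2):
    """Split an interleaved FASTQ into R1 and R2 files; return the pair count.

    Assumption: 4-line records, mates strictly alternating (R1, R2, ...).
    """
    pairs = 0
    with open_text(interleaved, "r") as src, \
            open_text(out_r1, "w") as r1, \
            open_text(out_r2, "w") as r2:
        while True:
            block = list(islice(src, 8))  # two records of four lines each
            if not block:
                break
            if len(block) != 8:
                raise ValueError("incomplete read pair at end of file")
            if not (block[0].startswith("@") and block[4].startswith("@")):
                raise ValueError("record header does not start with '@'")
            r1.writelines(block[:4])  # first mate
            r2.writelines(block[4:])  # second mate
            pairs += 1
    return pairs
```

A call such as `deinterleave_fastq("sample.fastq.gz", "sample_R1.fastq.gz", "sample_R2.fastq.gz")` (file names are placeholders) would then produce the two per-mate files expected by a paired-end aligner.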
These are the candidate transcripts you recover (Supplementary Table 9 filtered on COV413A):
| Transcript ID | Class | Family | Subfam | Chr TE | Start TE | End TE | Location TE | Gene | Splice Target | Strand | Cell Line | CAGE TPM |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| TCONS_00027238 | DNA | hAT-Charlie | MER1B | chr12 | 130340312 | 130340636 | intron_1 | PIWIL1 | exon_2 | + | COV413A | 0,396505544 |
| TCONS_00034780 | LINE | L1 | L1PA2 | chr14 | 71842964 | 71848996 | Intergenic | RGS6 | exon_2 | + | COV413A | 3,105960093 |
| TCONS_00055478 | LINE | L1 | L1PA2 | chr18 | 34552378 | 34558395 | Intergenic | DTNA | exon_2 | + | COV413A | 0,660842573 |
| TCONS_00086600 | LINE | L1 | L1PA2 | chr3 | 58842154 | 58848179 | Intergenic | FAM3D | exon_2 | - | COV413A | 0,396505544 |
| TCONS_00098838 | LINE | L1 | L1PA2 | chr5 | 102671229 | 102677260 | Intergenic | SLCO6A1 | exon_2 | - | COV413A | 0,72692683 |
| TCONS_00103663 | LINE | L1 | L1PB1 | chr6 | 7347074 | 7349650 | intron_8 | CAGE1 | exon_9 | - | COV413A | 0,396505544 |
| TCONS_00107032 | LINE | L1 | L1HS | chr7 | 12497211 | 12500000 | Intergenic | AC005281.1 | exon_2 | + | COV413A | 0,72692683 |
| TCONS_00107035 | LINE | L1 | L1HS | chr7 | 12497211 | 12500000 | Intergenic | AC005281.1 | exon_5 | + | COV413A | 0,72692683 |
| TCONS_00107037 | LINE | L1 | L1HS | chr7 | 12497211 | 12500000 | Intergenic | SCIN | exon_2 | + | COV413A | 0,72692683 |
| TCONS_00116734 | LINE | L1 | L1PA2 | chr8 | 66949103 | 66955119 | intron_3 | TCF24 | exon_4 | - | COV413A | 0,660842573 |
| TCONS_00119408 | LINE | L1 | L1PA2 | chr9 | 94089082 | 94095103 | intron_4 | PTPDC1 | exon_5 | + | COV413A | 0,330421286 |
| TCONS_00070187 | LTR | ERV1 | LTR7 | chr2 | 38086114 | 38086512 | Intergenic | CYP1B1 | exon_2 | - | COV413A | 15,92630601 |
| TCONS_00074167 | LTR | ERV1 | LTR2B | chr20 | 15985767 | 15986246 | intron_13 | MACROD2 | exon_14 | + | COV413A | 0,396505544 |
| TCONS_00089490 | LTR | ERV1 | LTR2B | chr4 | 37546188 | 37546669 | intron_1 | C4orf19 | exon_2 | + | COV413A | 0,859095345 |
| TCONS_00105271 | LTR | ERVL | LTR18A | chr6 | 79313214 | 79313548 | Intergenic | HMGN3 | exon_1 | - | COV413A | 2,841623064 |
| TCONS_00016149 | SINE | Alu | AluY | chr10 | 101729855 | 101730163 | Intergenic | FBXW4 | exon_1 | - | COV413A | 0,991263859 |
| TCONS_00016150 | SINE | Alu | AluY | chr10 | 101729855 | 101730163 | Intergenic | FBXW4 | exon_2 | - | COV413A | 0,991263859 |
| TCONS_00030551 | SINE | Alu | AluJo | chr12 | 121847358 | 121847535 | intron_9 | HPD | exon_10 | - | COV413A | 0,330421286 |
| TCONS_00041268 | SINE | Alu | AluY | chr15 | 51603584 | 51603891 | intron_1 | DMXL2 | exon_2 | - | COV413A | 1,652106432 |
I recover these - sorry for the truncated output.
I used the hg38 reference genome and GTF, your reference data download, and your pre-defined arguments.txt.
I hope that together we can get to the bottom of why I don't recover any of the same TE chimeras as you.
Best regards
Nanna
This pipeline is intended for short-read (ideally paired-end) RNA sequencing data, where it helps find potential TE promoters. The data you downloaded is nanoCAGE data, which helps validate promoter locations, so the pipeline should not be run on the nanoCAGE data itself. nanoCAGE data defines promoters accurately, but it normally cannot be used to assemble the full-length transcript.
In addition, we used the cell lines to validate the TE-gene chimeras seen in the tumor samples. The cell lines may therefore contain TE-gene chimeras that were not part of our reference and could be new.
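Once candidates have been called from short-read RNA-seq data, they could be compared against the reference table along these lines. This is only a sketch under stated assumptions: matching on (TE subfamily, gene, splice target) is an illustrative choice rather than the pipeline's own criterion, the dictionary layout is not the pipeline's output format, and the second candidate below is a made-up placeholder.

```python
def chimera_key(record):
    """Identify a TE-gene chimera by TE subfamily, gene, and splice target."""
    return (record["Subfam"], record["Gene"], record["Splice Target"])


def compare_to_reference(candidates, reference):
    """Split candidates into those present in the reference and the rest."""
    ref_keys = {chimera_key(r) for r in reference}
    shared = [c for c in candidates if chimera_key(c) in ref_keys]
    novel = [c for c in candidates if chimera_key(c) not in ref_keys]
    return shared, novel


# Two rows taken from the COV413A table in this issue.
reference = [
    {"Subfam": "LTR7", "Gene": "CYP1B1", "Splice Target": "exon_2"},
    {"Subfam": "L1PA2", "Gene": "RGS6", "Splice Target": "exon_2"},
]

# Hypothetical candidates standing in for a user's own pipeline output.
candidates = [
    {"Subfam": "LTR7", "Gene": "CYP1B1", "Splice Target": "exon_2"},
    {"Subfam": "AluSx", "Gene": "GENE_X", "Splice Target": "exon_3"},  # placeholder
]

shared, novel = compare_to_reference(candidates, reference)
```

Candidates that end up in `novel` are not necessarily errors; as noted above, they may be chimeras that were simply absent from the reference.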