Proposed Analysis: quantify telomerase activity across pediatric brain tumors #148

syzheng · 2019-10-04T21:36:24Z

Scientific goals

The goal is to quantify telomerase activity and correlate it with telomere length and molecular alterations (TERTp mutation, ATRX mutation, etc.).

Proposed methods

We will use our newly developed method, EXTEND (EXpression based Telomerase ENzymatic activity Detection).

Required input data

Gene expression from RNAseq (any of TPM, RPKM, or counts)

Proposed timeline

One to two weeks.

Relevant literature

Barthel et al. Nat Genet, 2017; Zheng et al. Cancer Cell, 2016; Ackermann et al. Science, 2016

cgreene · 2019-10-05T08:23:57Z

This sounds exciting! As I think about potential caveats, does it matter if the RNA-seq samples are poly-A selected or rRNA depleted?

syzheng · 2019-10-07T15:05:23Z

That is a great point. We currently use data from regular polyA enriched protocol, mostly because our primary input data is TCGA. This does impact the method, because a key gene in our signature, TERC, is a non-coding RNA that is not properly captured by polyA methods. PCR shows this gene is abundantly expressed across tissues; however, RNAseq data from TCGA and GTEx show only very low expression of this gene. EXTEND demonstrates reasonable performance with TCGA, CCLE, and GTEx, but we have not tested data from total RNAseq or rRNA depletion.

cgreene · 2019-10-07T15:45:23Z

@syzheng : Ok! This dataset contains both poly-A and rRNA depleted samples. #120 and https://github.com/AlexsLemonade/OpenPBTA-analysis/tree/master/analyses/selection-strategy-comparison took a dive into the implications for gene expression analyses based on some earlier work by @cbethell.

I'm a bit confused by "a key gene in our signature, TERC, is a non-coding RNA that is not properly captured by polyA methods" and also "we currently use data from regular polyA enriched protocol, mostly because our primary input data is TCGA". Did you mean that you are better off with poly-A? There are many fewer poly-A samples here than rRNA-depleted.

As something that may be helpful in extending an analysis across both sets: @jharenza is looking to determine whether or not we can generate some that are matched (sequenced with both protocols).

syzheng · 2019-10-07T15:56:36Z

EXTEND was developed using polyA data. We essentially do not know whether it works for rRNA-depleted data, because we did not have this type of data when we benchmarked the method. The key is TERC, a non-coding RNA that is part of both our gene signature and the telomerase complex. It would be great to have a few cases sequenced by both methods. Otherwise, we can examine the distribution of TERC expression in the dataset to see if it behaves similarly to polyA datasets.

cgreene · 2019-10-07T16:17:57Z

Gotcha! You'll find both sets of files in the data download as processed in a few different ways:

pbta-gene-expression-kallisto.polya.rds
pbta-gene-expression-kallisto.stranded.rds
pbta-gene-expression-rsem-fpkm.polya.rds
pbta-gene-expression-rsem-fpkm.stranded.rds
pbta-gene-counts-rsem-expected_count.polya.rds
pbta-gene-counts-rsem-expected_count.stranded.rds

For now, it will be interesting to see if the distribution is different and/or if TERC matches the estimates from the method in the stranded ones. Hopefully we'll have the set with both in the not terribly distant future, but we shouldn't wait for them to get started. Thanks for taking this on!
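The distribution check could be sketched roughly as follows, in Python. This is only an illustration and not part of EXTEND. It assumes the TERC values for each library type have already been extracted from the expression files into plain lists (loading the .rds files is left out). The function names and the log2(x + 1) transform are assumptions made for this sketch. Each set is summarized by its median, and the gap between the two is measured with a two-sample Kolmogorov-Smirnov statistic.

```python
import math
from statistics import median


def log2p1(values):
    """Apply log2(x + 1) to each expression value (assumed transform)."""
    return [math.log2(v + 1.0) for v in values]


def ks_statistic(a, b):
    """Two-sample Kolmogorov-Smirnov statistic.

    Returns the largest absolute gap between the two empirical CDFs,
    from 0.0 (identical distributions) to 1.0 (no overlap).
    """
    a, b = sorted(a), sorted(b)
    i = j = 0
    d = 0.0
    for x in sorted(set(a) | set(b)):
        # Advance each pointer past every value <= x.
        while i < len(a) and a[i] <= x:
            i += 1
        while j < len(b) and b[j] <= x:
            j += 1
        d = max(d, abs(i / len(a) - j / len(b)))
    return d


def compare_gene(polya_values, stranded_values):
    """Summarize one gene (e.g. TERC) across the two library types.

    Both arguments are hypothetical inputs: lists of expression values,
    one per sample, already extracted from the expression matrices.
    """
    polya = log2p1(polya_values)
    stranded = log2p1(stranded_values)
    return {
        "median_polya": median(polya),
        "median_stranded": median(stranded),
        "ks": ks_statistic(polya, stranded),
    }
```

A KS value near 0 would suggest TERC behaves similarly in the two sets; a value near 1 would suggest the selection strategy shifts its measured expression enough to warrant treating the sets separately.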

jharenza · 2019-10-28T12:41:03Z

Hi @syzheng ! Checking in on this analysis - do you have an idea of when you or your team would file a PR for this? Thanks!

syzheng · 2019-10-28T14:20:07Z

Yes, we have finished the score calculation. We will update on GitHub once we have more on integration. Siyuan

jharenza · 2019-12-12T02:57:45Z

Hi @syzheng! Wanted to update you that with V12 (#326) of the data release, we will provide stranded RNA-seq for 45 samples on which we also have polyA RNA-seq, so it would be interesting to determine whether there are telomerase prediction differences in these two sets of data. Stay tuned end of this week/early next week. Also looking forward to your PR!
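For the matched samples, one possible comparison is sketched below in Python. It is a sketch under stated assumptions, not the method used in the eventual PRs. It assumes telomerase scores are available as two dictionaries keyed by a shared sample identifier (the identifiers, function names, and choice of Spearman correlation are all assumptions made here). It reports how many samples overlap, the rank correlation between protocols, and the mean per-sample difference.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda k: values[k])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; returns NaN if either input has zero variance."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return float("nan")
    return cov / math.sqrt(vx * vy)


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def compare_matched(polya_scores, stranded_scores):
    """Compare scores for samples present in both dictionaries.

    Keys are hypothetical shared sample identifiers; values are scores.
    Assumes at least one sample is present in both dictionaries.
    """
    shared = sorted(set(polya_scores) & set(stranded_scores))
    x = [polya_scores[s] for s in shared]
    y = [stranded_scores[s] for s in shared]
    diffs = [b - a for a, b in zip(x, y)]
    return {
        "n": len(shared),
        "spearman": spearman(x, y),
        "mean_diff": sum(diffs) / len(diffs),
    }
```

A high rank correlation with a nonzero mean difference would suggest the protocols agree on sample ordering but differ by an offset, in which case scores might still be comparable within, though not across, library types.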

syzheng · 2019-12-12T15:27:46Z

Thanks for the heads up! I will update the group by next week.

jharenza · 2020-01-04T00:36:16Z

Hi @syzheng ! Happy New Year! Do you think your team will be able to submit a PR on this analysis sometime soon? We are starting to wrap up/finalize analyses and determine manuscript figures. Thanks!

syzheng · 2020-01-04T00:37:49Z

Sorry. Yes, I will make sure to complete it in the next few days.

jharenza · 2020-01-04T00:39:21Z

No worries, glad to hear!

jaclyn-taroni · 2020-03-09T18:49:05Z

Addressed through #494, #506, #511, and #516

syzheng added the "proposed analysis" label Oct 4, 2019

jharenza added the "in progress" label Oct 7, 2019

jaclyn-taroni added the "transcriptomic" label Oct 26, 2019

jharenza mentioned this issue Nov 8, 2019: Planned Analysis: Molecularly subtype all tumors #19 (closed)

jaclyn-taroni closed this as completed Mar 9, 2020

NNoureen mentioned this issue May 11, 2020: HistologicalAnalysis_update #681 (merged)

NNoureen mentioned this issue May 20, 2020: Medulloblastoma_EXTENDscores_Comparison #699 (merged)
