cluster.counts #25
Comments
Hi @amanda-fitz, I am not sure I understand your question completely, but: (1) clonevol doesn't perform clustering. It takes the clustering from pyclone, reconstructs the consensus clonal evolution tree, and estimates the clonal admixture for individual samples. (2) clonevol can use/estimate both median/mean CCF. There is a parameter called cluster.center in the infer.clonal.models function that takes either the string "mean" or "median".
Hi, thanks for your reply and explanations.
My question is actually very simple, but perhaps I didn't explain it well.
I would like a numerical output for the variant cluster plot. So, from the example below:
[X]
I would like the cluster number (i.e. the number assigned by ClonEvol, here 1, 2, 3, etc., which I understand is the PyClone cluster assigned a new ID) and, for each cluster, the median CCF by sample type. My script generates a 'cluster.counts' data file, but it contains only a single median CCF value and the PyClone cluster ID. I imagine it would be straightforward to obtain such a data file, given that this data is used to make the cluster plot?
From: Ha X. Dang <notifications@github.com>
Sent: 31 October 2018 18:49
To: hdng/clonevol
Cc: Amanda Fitzpatrick; Mention
Subject: Re: [hdng/clonevol] cluster.counts (#25)
Hi @amanda-fitz<https://github.com/amanda-fitz>,
I am not sure if I understand your question completely, but:
(1) clonevol doesn't perform clustering. It takes the clustering from pyclone, reconstructs the consensus clonal evolution tree, and estimates the clonal admixture for individual samples.
(2) clonevol can use/estimate both median/mean CCF. There is a parameter called cluster.center in infer.clonal.models function that takes either a string "mean" or "median".
Hi @amanda-fitz @hdng, here are the results of the two samples: (2) the cluster results of sample_two:
Hi there, can anyone help?
I have two things I'd like to add to my cluster.counts data file.
1. The cluster number assigned by ClonEvol (1, 2, 3, 4, etc.), added to the cluster.counts data table. Currently my cluster.counts data table only has the PyClone cluster ID.
2. The median CCF per sample (currently there is just one median.ccf for each cluster).
[LM002_cluster.counts.xlsx]
Attached example of my cluster.counts file
(https://github.com/hdng/clonevol/files/2533239/LM002_cluster.counts.xlsx)
Here is my script:
library(dplyr)  # %>%, mutate, select, group_by, summarize, filter, arrange
library(tidyr)  # spread

# Input/output locations and filtering threshold
pyclone.directory <- '/Users/amandafitzpatrick/Library/Mobile Documents/com~apple~CloudDocs/DOCUMENTS/E57 exome sequencing/2018-08-30_results_ascat_pyclone/pyclone'
output.directory <- '/Users/amandafitzpatrick/Library/Mobile Documents/com~apple~CloudDocs/DOCUMENTS/E57 exome sequencing/2018-08-30_results_ascat_pyclone'
sample.sheet.file <- 'sample_annotation.txt'
min.mutation.count <- 30
cancer.genes <- scan('/Users/amandafitzpatrick/Library/Mobile Documents/com~apple~CloudDocs/DOCUMENTS/E57 exome sequencing/2018-08-30_results/Exome Sequencing/COMBINED list Stratton plus Caldas.txt', what = character())
patient.id <- 'LM002'

# Read the PyClone loci table (one row per mutation per sample) and the sample sheet
loci.file <- file.path(pyclone.directory, patient.id, 'output', 'tables', 'annotated_loci.tsv')
loci <- read.table(loci.file, header = TRUE, sep = '\t', stringsAsFactors = FALSE)
sample.sheet <- read.table(sample.sheet.file, header = TRUE, sep = '\t', stringsAsFactors = FALSE)

# Convert cellular prevalence to a VAF percentage, flag drivers,
# and reshape to one row per mutation with one VAF column per sample
clonevol.data <- loci %>%
  mutate(
    vaf = 100 * cellular_prevalence / 2,
    is.driver = symbol %in% cancer.genes & func == 'exonic' & exonic_func != 'synonymous_SNV'
  ) %>%
  select(mutation_id, cluster_id, sample_id, vaf, symbol, is.driver) %>%
  spread(sample_id, vaf)

n.samples <- length(unique(loci$sample_id))
if (n.samples == 1) stop('Need more than one sample for ClonEvol!')

# Per-cluster summary, pooled across all samples
# (n() counts mutation-sample rows, so divide by n.samples to count mutations)
cluster.counts <- loci %>%
  group_by(cluster_id) %>%
  summarize(
    count = n() / n.samples,
    min.ccf = min(cellular_prevalence),
    median.ccf = median(cellular_prevalence),
    mean.ccf = mean(cellular_prevalence)
  ) %>%
  ungroup() %>%
  filter(count >= min.mutation.count) %>%
  arrange(-median.ccf)

# Renumber the retained PyClone clusters 1..k in order of decreasing median CCF
recode.values <- 1:nrow(cluster.counts)
names(recode.values) <- as.character(cluster.counts$cluster_id)

clonevol.data <- clonevol.data %>%
  select(-mutation_id) %>%
  filter(cluster_id %in% cluster.counts$cluster_id) %>%
  mutate(cluster = recode.values[as.character(cluster_id)])
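One possible way to get both requested columns, sketched on made-up data rather than the real LM002 table. It assumes the numbers shown in the ClonEvol plots are the `cluster` values that the script above creates with `recode.values` (rank by decreasing pooled median CCF), and that "median CCF per sample" means the median of `cellular_prevalence` within each cluster and sample. The toy `loci` values, the sample names `primary` and `met`, and the `median.ccf.` column prefix are illustrative assumptions. It needs dplyr >= 1.0 and tidyr >= 1.0.

```r
library(dplyr)
library(tidyr)

# Toy stand-in for PyClone's annotated_loci.tsv (values are invented)
loci <- data.frame(
  mutation_id = rep(paste0("m", 1:6), each = 2),
  sample_id = rep(c("primary", "met"), times = 6),
  cluster_id = rep(c(0, 3), each = 6),
  cellular_prevalence = c(0.98, 0.95, 1.00, 0.97, 0.96, 0.99,
                          0.40, 0.10, 0.44, 0.12, 0.42, 0.08),
  stringsAsFactors = FALSE
)
n.samples <- n_distinct(loci$sample_id)

# Pooled summary as in the script above, plus the 1..k cluster number
# (same ordering that recode.values uses)
cluster.counts <- loci %>%
  group_by(cluster_id) %>%
  summarize(
    count = n() / n.samples,
    median.ccf = median(cellular_prevalence),
    .groups = "drop"
  ) %>%
  arrange(-median.ccf) %>%
  mutate(cluster = row_number())

# Median CCF per cluster and sample, one column per sample
per.sample <- loci %>%
  group_by(cluster_id, sample_id) %>%
  summarize(median.ccf = median(cellular_prevalence), .groups = "drop") %>%
  pivot_wider(
    names_from = sample_id,
    values_from = median.ccf,
    names_prefix = "median.ccf."
  )

# Attach the per-sample medians to the summary table
cluster.counts <- left_join(cluster.counts, per.sample, by = "cluster_id")
print(cluster.counts)
```

In the real script, the `filter(count >= min.mutation.count)` step would go before `arrange()`, so that the numbering matches `recode.values`. The resulting table could then be exported with `write.table(cluster.counts, file = ..., sep = "\t", quote = FALSE, row.names = FALSE)`.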