How to set founding.cluster? #4
Assuming this is monoclonal cancer initiation, the founding.cluster is often the cluster with the highest CCF (or VAF) in most (or, ideally, all) samples. 1000 bootstraps is reasonable for a first run. The number of samples and the number of clusters affect the run time more. How many clusters and samples do you have in your case? You may need some clean-up (e.g. removing small or noisy clusters). Also, are there many clusters with very small CCF? That may increase the number of models to consider. |
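The "highest VAF in most samples" heuristic above can be sketched in base R. This is only an illustration under assumptions: the data frame, the column names `S1.vaf` and `S2.vaf`, and the majority-vote rule are made up for the example and are not part of ClonEvol.

```r
# Toy input (hypothetical): one row per variant, VAF in percent (0-100).
variants <- data.frame(
  cluster = c(1, 1, 1, 2, 2, 2, 3),
  S1.vaf  = c(48, 50, 46, 20, 22, 18, 3),
  S2.vaf  = c(45, 47, 49, 30, 28, 33, 1)
)
vaf.col.names <- c("S1.vaf", "S2.vaf")

# Mean VAF of each cluster in each sample (one row per cluster).
cluster.means <- aggregate(variants[, vaf.col.names],
                           by = list(cluster = variants$cluster),
                           FUN = mean)

# For each sample, find the cluster with the highest mean VAF.
top.per.sample <- sapply(vaf.col.names, function(s) {
  cluster.means$cluster[which.max(cluster.means[[s]])]
})

# Candidate founding cluster: the cluster that ranks first in the most samples.
votes <- table(top.per.sample)
founding.candidate <- as.numeric(names(votes)[which.max(votes)])
print(founding.candidate)
```

The resulting value is only a candidate to pass as founding.cluster; it still needs to be checked against a plot of the clusters.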
Thanks! I only have two samples, but I do have smaller clusters. I removed the clusters with very few mutations (what's the cut-off? I removed clusters with only 1 mutation). It finished much sooner, but no models were found, even at p < 0.1. What is alpha for?
x <- infer.clonal.models(variants=clonevol_input,
    cluster.col.name="cluster",
    model="polyclonal",
    vaf.col.names=vaf.col.names,
    subclonal.test="bootstrap",
    subclonal.test.model="non-parametric",
    cluster.center="mean",
    num.boots=1000,
    founding.cluster=NULL,
    min.cluster.vaf=0.01,
    p.value.cutoff=0.1,
    alpha=0.5,
    random.seed=63108)
Sample 1: Pa26T1.vaf <-- Pa26T1.vaf
Sample 2: Pa26T2.vaf <-- Pa26T2.vaf
Using polyclonal model
Note: all VAFs were divided by 100 to convert from percentage to proportion.
Generating non-parametric boostrap samples...
Pa26T1.vaf : Enumerating clonal architectures...
Determining if cluster VAF is significantly positive...
Exluding clusters whose VAF < min.cluster.vaf=0.001
Non-positive VAF clusters:
Pa26T1.vaf : 910 clonal architecture model(s) found
Pa26T2.vaf : Enumerating clonal architectures...
Determining if cluster VAF is significantly positive...
Exluding clusters whose VAF < min.cluster.vaf=0.001
Non-positive VAF clusters:
Pa26T2.vaf : 376 clonal architecture model(s) found
Finding matched clonal architecture models across samples...
Found 0 compatible model(s)
Found 0 compatible evolution models
Pruning merged clonal evolution trees....
Number of unique pruned trees: 0
Scoring models...
0 model(s) with p-value <= 0.1
***WARN: Inra-tumor heterogeneity could result in a clone (eg. founding)
that is not present (no cells) in any samples, although detectable via
clonal marker variants due to that its subclones are distinct across
samples. Therefore, a model with a higher p-value for the CCF of such
a clone can still be biologically consistent, interpretable, and
interesting! Manual investigation of those higher p-value models
is recommended
|
The cutoff depends on how confident you are in the clusters, e.g. if your data are deep enough that you can trust a cluster of >=5 variants, then use 5. This is somewhat arbitrary, though, and there is no formula for it. I would visualize the clusters first (e.g. a pairwise plot of the CCF of the variants, grouped by cluster, for the two samples) and check whether the clusters look reasonable. Also, it seems that your scaled VAF ranges from 0-1. The default for clonevol is 0-100 (percentage). This probably does not affect the underlying inference, but it is good to use 0-100 or set vaf.in.percent=FALSE when calling infer.clonal.models. You should also set alpha and p.value.cutoff to the same value, and try a lower value, e.g. 0.01. |
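A minimal sketch of the clean-up step described above, in base R. The threshold of 5, the toy data, and the column names are assumptions for illustration. Renumbering the surviving clusters to 1..N is included on the assumption that the tool expects contiguous cluster ids, which is worth checking against the documentation.

```r
# Toy input (hypothetical): cluster 2 contains a single variant.
variants <- data.frame(
  cluster = c(rep(1, 6), 2, rep(3, 5)),
  S1.vaf  = c(48, 50, 46, 47, 49, 51, 12, 20, 22, 18, 21, 19),
  S2.vaf  = c(45, 47, 49, 46, 48, 44, 60, 30, 28, 33, 31, 29)
)

# Arbitrary cutoff: keep clusters with at least this many variants.
min.variants <- 5

# Count variants per cluster and keep the clusters that pass the cutoff.
sizes <- table(variants$cluster)
keep <- names(sizes)[sizes >= min.variants]
filtered <- variants[as.character(variants$cluster) %in% keep, ]

# Renumber the remaining clusters as 1..N, preserving their original order.
filtered$cluster <- as.integer(factor(filtered$cluster))

print(table(filtered$cluster))
```

In this toy case the single-variant cluster is dropped and the former cluster 3 becomes cluster 2.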
Thanks! I did scale variant_prevalence by *100/2 to get VAF ranging from 0-50. Even after trying different arguments, I still do not get any matched clonal architecture models across samples. What should I do about that? I know the paper is under review, and I would like to read it to understand more about the tool. Thanks, |
Visualizing the clusters would help identify problems with the clustering, e.g. is there an outlier or abnormal cluster that conflicts between samples or shows signs of polyclonal cancer initiation? Clonevol does require a reasonably good clustering to start with. Alpha is the significance threshold used to decide whether a clone exists in a sample. I am working on the manual and technical documentation (it has taken too long already, sorry), and the manuscript should be posted on bioRxiv soon too. Stay tuned. Thanks. |
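One way to draw the pairwise cluster plot mentioned above is with base R graphics. This is a rough sketch on made-up data, with hypothetical column names and output file. It does not use any ClonEvol plotting function.

```r
# Toy input (hypothetical): VAF in percent for two samples.
variants <- data.frame(
  cluster = c(1, 1, 1, 2, 2, 2, 3, 3),
  S1.vaf  = c(48, 50, 46, 20, 22, 18, 3, 5),
  S2.vaf  = c(45, 47, 49, 30, 28, 33, 1, 2)
)

# Write the scatter plot to a PDF in the session's temporary directory.
plot.file <- file.path(tempdir(), "cluster_pairs.pdf")
pdf(plot.file, width = 5, height = 5)

# One point per variant, colored by cluster id.
plot(variants$S1.vaf, variants$S2.vaf,
     col = variants$cluster, pch = 19,
     xlim = c(0, 100), ylim = c(0, 100),
     xlab = "Sample 1 VAF (%)", ylab = "Sample 2 VAF (%)")

# Legend entries use the same color index as the points.
ids <- sort(unique(variants$cluster))
legend("topright", legend = ids, col = ids, pch = 19, title = "cluster")

invisible(dev.off())
```

Tight, well-separated groups of points suggest a usable clustering. A cluster that sits far above the candidate founding cluster in one sample is the kind of conflict worth investigating.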
Looking forward to reading it. Tommy |
Looks a little odd. Is this the scaled VAF based on CCF estimated via Pyclone? Should the highest VAF be near 50%? |
Sorry, when I made the VAF scatter plot, I used the raw output from pyclone. However, when I put the data into clonevol, I did scale it to 0-50. Thanks very much for your help! |
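The scaling described in this thread (CCF * 100 / 2) can be written as a small helper. The function name is hypothetical. The factor of 2 assumes heterozygous variants in diploid regions, so that a variant present in all tumor cells (CCF = 1) maps to a VAF of 50%.

```r
# Convert CCF on a 0-1 scale to an approximate VAF in percent (0-50).
# Assumes heterozygous, copy-neutral variants; hypothetical helper.
ccf.to.vaf.percent <- function(ccf) {
  stopifnot(is.numeric(ccf), all(ccf >= 0 & ccf <= 1, na.rm = TRUE))
  ccf * 100 / 2
}

scaled <- ccf.to.vaf.percent(c(0, 0.5, 1))
print(scaled)
```

With VAF on this percentage scale, the founding cluster would be expected near 50, which matches the question above about the highest VAF.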
Sorry for slow response. I think some models can be inferred, but you need a tweak.
Are there many copy-number-altered variants? If not, you can try clustering with SciClone as well. |
Thank you for your answer.
I also attached the scaled VAF (pyclone CCF *100/2) for you to test if you can. Thanks again! model-Pa30T2.vaf.pdf |
My data have a lot of copy number changes, so I did not use sciclone. For pyclone, one has to provide both the tumor purity (inferred from other tools such as sequenza) and the VAF to the tool, so the resulting CCF always ranges from 0 to 1. |
What CCF range do you get if you specify a tumor purity of 100% and also divide the VAF by 2? |
CCF will always be from 0 to 1. pyclone uses the tumor purity and copy number information in addition to VAF to identify clusters. |
Yes it will be from 0-1, but do you see a smooth range when plotting or a hard cut at 1? If it is smooth, then you can use clonevol. |
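A quick numeric companion to looking at the plot is to measure how many variants sit exactly at the upper bound. This is a sketch with a hypothetical helper and toy vectors. What counts as "too many" at the cap is a judgment call, not something defined by the tool.

```r
# Fraction of CCF values at (or within tol of) the upper bound of 1.
# Hypothetical helper; a large value hints at a hard cut rather than a smooth tail.
frac.at.cap <- function(ccf, tol = 1e-6) {
  mean(ccf >= 1 - tol)
}

smooth.ccf <- c(0.20, 0.50, 0.80, 0.95, 0.99)  # tapers off below 1
capped.ccf <- c(0.20, 1, 1, 1, 1)              # piles up at 1

print(frac.at.cap(smooth.ccf))
print(frac.at.cap(capped.ccf))
```

A histogram, e.g. `hist(ccf, breaks = 50)`, shows the same thing visually.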
Hi,
I have several questions on how to set the parameters for the infer.clonal.models function.
First, the help page does not explain all the arguments, e.g. how do I set the founding.cluster? In the GitHub page example, there is a cluster.center argument, but I do not see it in the help.
How many bootstraps are usually enough? The default is 1000, and for around 300 mutations it has been running for several hours and has not finished yet.
Thanks for giving more information.
Best,
Tommy