update readme and vignette

jrs95 · Apr 8, 2024 · 0348bbd
1 parent 7c821b5
commit 0348bbd
Showing 3 changed files with 4 additions and 4 deletions.
diff --git a/DESCRIPTION b/DESCRIPTION
@@ -1,7 +1,7 @@
 Package: hyprcoloc
 Title: Hypothesis Prioritisation in multi-trait Colocalization (HyPrColoc) 
 Version: 0.0.2
-Date: 05-04-2024
+Date: 08-04-2024
 Authors@R: c(
     person(
         given = "Christopher",

diff --git a/README.md b/README.md
@@ -7,7 +7,7 @@ HyPrColoc is an efficient deterministic Bayesian divisive clustering algorithm u
 
 ## Functions
 * `hyprcoloc`: identifies clusters of colocalized traits and candidate causal SNPs using the HyPrColoc Bayesian divisive clustering algorithm.  
-* `sensitivity_plot`: plots a heatmap to visuale how stable the clusters of colocalized traits are to variations in the algorithms input parameters.  
+* `sensitivity_plot`: plots a heatmap to visualise how stable the clusters of colocalized traits are to variations in the algorithms input parameters.  
 * `cred.sets`: computes credible sets of SNPs for each cluster of colocalized traits.  
 
 ## Installation

diff --git a/vignettes/hyprcoloc.Rmd b/vignettes/hyprcoloc.Rmd
@@ -135,7 +135,7 @@ Together these two variants explain nearly 90% of the posterior probability of c
 
 It is important that users of the software ackowledge that the performance of HyPrColoc is particularly dependendant on the choice of prior configuration probabilities used (and any associated hyper-parameters) as well as the choice of regional and alignment thresholds, as these combine to quantify a lower bound with which we accept that a cluster of traits colocalize (i.e. clusters are identified when $P_RP_{A}\geq P^{\ast}_{R}P^{\ast}_{A}$). In some situations this senstivity might be modest, whilst in others it might be large. For example, through extensive simulations, we note that in the analysis of large numbers of traits, using only the default algorithm settings, can regularly result in the trait clusters containing (typically only a single) false positive. Avoiding this issue is complex as it is unlikely that there exists a one-size-fits-all approach to setting the prior configuration probabilities and likewise the regional and alignment threshold parameters. Hence, to go someway to addressing these issues we provide an analysis protocol template, to assess the strength of any conclusions. 
 
-At the end of this section we introduce a function "sensitivity.plot" which, on varying the input values for the regional and alignment thresholds as well as the prior probability of colocalization, returns a heat-map that helps us to visuale how stable the clusters of colocalized traits are to variations in the algorithms input parameters. We expect this function to be a part of standard analyses, helping users to pinpoint the best candidate clusters of colocalized traits for follow-up analyses.    
+At the end of this section we introduce a function "sensitivity.plot" which, on varying the input values for the regional and alignment thresholds as well as the prior probability of colocalization, returns a heatmap that helps us to visualise how stable the clusters of colocalized traits are to variations in the algorithms input parameters. We expect this function to be a part of standard analyses, helping users to pinpoint the best candidate clusters of colocalized traits for follow-up analyses.    
 
 ## Assessing sensitivity to changes in the prior configuration parameters 
 
@@ -239,7 +239,7 @@ sensitivity.plot(betas, ses,
 #                  reg.thresh = c(0.6,0.7,0.8,0.9), prior.c = c(0.02, 0.01, 0.005), equal.thresholds = TRUE);
 ```
 
-The output is a heat-map which helps us to visualise the clusters of colocalised traits. Traits which cluster across all values considered are given a score of 1 and traits which never cluster are given a score of 0; traits which ocassionally cluster have a score between 0 and 1. Hence, our confidence in a cluster increases with increasing score - appearing as a darker block of clustered traits in the heat-map. The heat-map reflects what we have already seen from our individual analyses: traits 1-5 form a very strong cluster of colocalized traits; traits 6-8 form a second cluster, however for certain values of the input parameters trait 6 is dropped from this cluster and; traits 9-10 form a thrid cluster, however identification of this cluster is difficult for more stringent values of the thresholds and the prior probability of colocalization. It is clear from our figure that there are three distinct clusters of colocalized traits in our sample, matching the true data generating mechanism. However, our most confident clusters of colocalised traits are traits 1-5 and separately traits 7-8. Although we have set the benchmark quite high, the two clusters 6-8 and 9-10 are present in around 70\% of the (sensitivity) scenarios considered making them reasonable candidates also. To see this we can require that "sensitivity.plot" returns the similarity matrix used to plot the heat-map:
+The output is a heatmap which helps us to visualise the clusters of colocalised traits. Traits which cluster across all values considered are given a score of 1 and traits which never cluster are given a score of 0; traits which ocassionally cluster have a score between 0 and 1. Hence, our confidence in a cluster increases with increasing score - appearing as a darker block of clustered traits in the heatmap. The heatmap reflects what we have already seen from our individual analyses: traits 1-5 form a very strong cluster of colocalized traits; traits 6-8 form a second cluster, however for certain values of the input parameters trait 6 is dropped from this cluster and; traits 9-10 form a thrid cluster, however identification of this cluster is difficult for more stringent values of the thresholds and the prior probability of colocalization. It is clear from our figure that there are three distinct clusters of colocalized traits in our sample, matching the true data generating mechanism. However, our most confident clusters of colocalised traits are traits 1-5 and separately traits 7-8. Although we have set the benchmark quite high, the two clusters 6-8 and 9-10 are present in around 70\% of the (sensitivity) scenarios considered making them reasonable candidates also. To see this we can require that "sensitivity.plot" returns the similarity matrix used to plot the heatmap:
 
 ```{r}
 res <- sensitivity.plot(betas, ses,