Commit

tweak README
martinjankowiak committed Nov 10, 2021
1 parent cd8ac3f commit 8cc9f49
Showing 2 changed files with 19 additions and 17 deletions.
11 changes: 6 additions & 5 deletions README.md
@@ -18,20 +18,21 @@ In the context of generalized linear models with `P` covariates `{X_1, ..., X_P}
Bayesian variable selection can be used to identify *sparse* subsets of covariates (i.e. far fewer than `P`)
that are sufficient for explaining the observed responses.

-In more detail, Bayesian variable selection can be understood as a model selection problem in which we consider
+In more detail, Bayesian variable selection is formulated as a model selection problem in which we consider
the space of `2^P` models in which some covariates are included and the rest are excluded.
For example, one particular model might be `Y = b_3 X_3 + b_9 X_9`.
A priori we assume that models with fewer included covariates are more likely than those with more included covariates.
-The models best supported by the data are encoded as a posterior distribution over the space of models.
+The set of parsimonious models best supported by the data then emerges from the posterior distribution over the space of models.

-What's especially appealing about Bayesian variable selection is that it provides us with an interpretable score
+What's especially appealing about Bayesian variable selection is that it provides an interpretable score
called the PIP (posterior inclusion probability) for each covariate `X_p`.
The PIP is a true probability and so it satisfies `0 <= PIP <= 1` by definition.
Covariates with large PIPs are good candidates for being explanatory of the response `Y`.

Being able to compute PIPs is particularly useful for high-dimensional datasets with large `P`.
For example, we might want to select a small number of covariates to include in a predictive model (i.e. feature selection).
Alternatively, in settings where it is implausible to subject all `P` covariates to
-some expensive downstream analysis (e.g. a lab experiment),
+some expensive downstream analysis (e.g. a laboratory experiment),
Bayesian variable selection can be used to select a small number of covariates for further analysis.


@@ -69,7 +70,7 @@ selector = NormalLikelihoodVariableSelector(dataframe, # pass in the data
S=1, # specify the expected number of covariates to include a priori
)

-# run the MCMC algorithm to compute posterior compusion probabilities and other posterior quantities of interest
+# run the MCMC algorithm to compute posterior inclusion probabilities and other posterior quantities of interest
selector.run(T=1000, T_burnin=500)

# inspect the results
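The README hunk above shows only a fragment of the usage example (the data setup and the start of the constructor call are collapsed in this diff). For orientation, here is a minimal self-contained sketch of how the pieces might fit together; the synthetic dataframe, the import path, the `'response'` column argument, and the `selector.summary` attribute are illustrative assumptions and are not shown in this diff.

```python
import numpy as np
import pandas as pd

from millipede import NormalLikelihoodVariableSelector  # assumed import path

# create a toy dataset in which only the first of three covariates explains the response
rng = np.random.RandomState(0)
X = rng.randn(250, 3)
Y = X[:, 0] + 0.1 * rng.randn(250)
dataframe = pd.DataFrame(np.column_stack([X, Y]),
                         columns=['causal', 'spurious1', 'spurious2', 'response'])

# create a VariableSelector object appropriate for continuous-valued responses
selector = NormalLikelihoodVariableSelector(dataframe,   # pass in the data
                                            'response',  # response column name (assumed argument)
                                            S=1,         # expected number of covariates to include a priori
                                            )

# run the MCMC algorithm to compute posterior inclusion probabilities
# and other posterior quantities of interest
selector.run(T=1000, T_burnin=500)

# inspect the results; if the API matches the README, the PIP column should be
# near 1 for 'causal' and near 0 for the spurious covariates
print(selector.summary)
```

If the actual constructor signature differs (for example, if the response is passed by keyword), only the selector construction line should need adjusting; the run and inspection steps mirror the snippet in the diff.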
25 changes: 13 additions & 12 deletions docs/source/getting_started.rst
@@ -5,26 +5,27 @@ What is Bayesian variable selection?
------------------------------------

Bayesian variable selection is a model-based approach for identifying parsimonious explanations of observed data.
In the context of generalized linear models with `P` covariates `{X_1, ..., X_P}` and responses `Y`,
Bayesian variable selection can be used to identify *sparse* subsets of covariates (i.e. far fewer than `P`)
that are sufficient for explaining the observed responses.

-In more detail, Bayesian variable selection can be understood as a model selection problem in which we consider
+In more detail, Bayesian variable selection is formulated as a model selection problem in which we consider
the space of `2^P` models in which some covariates are included and the rest are excluded.
For example, one particular model might be `Y = b_3 X_3 + b_9 X_9`.
A priori we assume that models with fewer included covariates are more likely than those with more included covariates.
-The models best supported by the data are encoded as a posterior distribution over the space of models.
+The set of parsimonious models best supported by the data then emerges from the posterior distribution over the space of models.

-What's especially appealing about Bayesian variable selection is that it provides us with an interpretable score
+What's especially appealing about Bayesian variable selection is that it provides an interpretable score
called the PIP (posterior inclusion probability) for each covariate `X_p`.
The PIP is a true probability and so it satisfies `0 <= PIP <= 1` by definition.
Covariates with large PIPs are good candidates for being explanatory of the response `Y`.

Being able to compute PIPs is particularly useful for high-dimensional datasets with large `P`.
For example, we might want to select a small number of covariates to include in a predictive model (i.e. feature selection).
Alternatively, in settings where it is implausible to subject all `P` covariates to
-some expensive downstream analysis (e.g. a lab experiment),
+some expensive downstream analysis (e.g. a laboratory experiment),
Bayesian variable selection can be used to select a small number of covariates for further analysis.


Requirements
-------------
@@ -73,7 +74,7 @@ Using millipede is easy:
S=1, # specify the expected number of covariates to include a priori
)

-# run the MCMC algorithm to compute posterior compusion probabilities and other posterior quantities of interest
+# run the MCMC algorithm to compute posterior inclusion probabilities and other posterior quantities of interest
selector.run(T=1000, T_burnin=500)

# inspect the results
