Skip to content

Commit

Permalink
🔀 Merge pull request #2 from lenarddome/feature-init-matrix
Browse files Browse the repository at this point in the history
introduce new and API-breaking features, see CHANGELOG
  • Loading branch information
lenarddome committed Aug 16, 2023
2 parents c4bc53a + 0095df8 commit c892989
Show file tree
Hide file tree
Showing 11 changed files with 149 additions and 69 deletions.
2 changes: 1 addition & 1 deletion DESCRIPTION
@@ -1,6 +1,6 @@
Package: psp
Title: Parameter Space Partitioning MCMC for Global Model Evaluation
Version: 0.5.8
Version: 1.0.0
Date: 2022-08-12
Authors@R: c(person("Lenard", "Dome",
email = "lenarddome@gmail.com",
Expand Down
3 changes: 2 additions & 1 deletion NAMESPACE 100644 → 100755
Expand Up @@ -5,4 +5,5 @@ importFrom("parallel", "clusterExport", "parApply", "makeCluster",
"clusterCall", "detectCores", "stopCluster")
importFrom("stats", "rnorm", "runif")
importFrom("utils", "tail", "type.convert")
importFrom("data.table", "data.table")
importFrom("data.table", "data.table")
importFrom("methods", "is")
4 changes: 2 additions & 2 deletions R/RcppExports.R
@@ -1,7 +1,7 @@
# Generated by using Rcpp::compileAttributes() -> do not edit by hand
# Generator token: 10BE3573-1514-4C36-9D1C-5A225CD40393

pspGlobal <- function(model, control, save = FALSE, path = ".", quiet = FALSE) {
.Call(`_psp_pspGlobal`, model, control, save, path, quiet)
pspGlobal <- function(model, discretize, control, save = FALSE, path = ".", extension = ".csv", quiet = FALSE) {
.Call(`_psp_pspGlobal`, model, discretize, control, save, path, extension, quiet)
}

2 changes: 1 addition & 1 deletion R/psp_global.R
Expand Up @@ -109,7 +109,7 @@ psp_global <- function(fn, control = psp_control(), ..., quiet = FALSE) {

.Deprecated(
new = "pspGlobal", package = "psp",
msg = paste("This function is no longer maintained and will be pspGlobal in the future.")
msg = paste("This function is no longer maintained and is scheduled for removal.\nPlease use pspGlobal instead.")
)

## declare all variables
Expand Down
9 changes: 9 additions & 0 deletions docs/CHANGELOG.md
Expand Up @@ -3,9 +3,18 @@
All notable changes to this project will be documented in this file.

The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/)
Our versioning convenvtions are based on [Semantic Versioning](https://semver.org/).

## [Unreleased]

## v1.0.0

This release introduces breaking changes to the API.

- 6d51dcef6af95699fd0064585dfd6b6745b90807 :boom: separated model evaluation functions from discretization functions
- 6d51dcef6af95699fd0064585dfd6b6745b90807 :sparkles: allowed to save model outputs (continuous variables) to disk
- d407392b9bc2f909e5f086790235ba50645537a6 :sparkles: allowed to define multiple starting points for the sampling

## v0.5.8

### Added
Expand Down
64 changes: 51 additions & 13 deletions man/pspGlobal.Rd
Expand Up @@ -8,19 +8,36 @@
}

\usage{
pspGlobal(model, control, save = FALSE, path = ".", quiet = FALSE)
pspGlobal(model, discretize, control, save = FALSE, path = ".",
extension = ".csv", quiet = FALSE)
}

\arguments{
\item{model}{The ordinal function. It should take a numeric vector (parameter set)
as its argument, and return an inequality matrix (Dome & Wills, n.d.) in a \code{matrix} format with `type=double`. NA values are note allowed, see Note 2.}
\item{control}{A \code{list()} of control arguments that tunes the behaviour of the parameter space partitioning routine. See Details for more information on what to include.}
\item{save}{if \code{save = TRUE}, all evaluated parameters will be saved to disk. The deafult is \code{FALSE}.}
\item{path}{If `save = TRUE`, the path to the file that will store all evaluated parameters. The default path is the current working directory.}
\item{quiet}{If \code{FALSE} (default), print the number of the current iteration. If \code{TRUE}, do not print anything.}
\item{model}{ It should take a numeric vector (parameter set)
as its argument, and return a numeric vector of continuous variables.}
\item{discretize}{The inequality matrix constructor. It should take a numberic
vector of probabilities. It must return a matrix in a \code{matrix} format
with `type=double`. NA values are note allowed, see Note 1.}
\item{control}{A \code{list()} of control arguments that tunes the behaviour
of the parameter space partitioning routine. See Details for more information
on what to include.}
\item{save}{if \code{save = TRUE}, all evaluated parameters will be saved to
disk. The deafult is \code{FALSE}.}
\item{path}{If `save = TRUE`, the path to the file that will store all
evaluated parameters and continuous model outputs. The default path is the
current working directory. Evaluated parameters and continuous model outputs
are save separately, see Details.}
\item{extension}{If `save = TRUE`, the extension of the file that will store
all evaluated parameters and continuous model outputs. The default extension
is \code{.csv}.}
\item{quiet}{If \code{FALSE} (default), print the number of the current
iteration. If \code{TRUE}, do not print anything.}
}

\details{

\bold{Overview:}

This function implements the Parameter Space Partitioning algorithm
desribed by Pitt et al. (2006). The brief overview of the algorithm is as follows:

Expand All @@ -32,14 +49,17 @@ pspGlobal(model, control, save = FALSE, path = ".", quiet = FALSE)

1. Pick a random jumping distribution from for each ordinal pattern from the
sampling region defined by a hypershere with a center of the last recorded
parameter set for a given pattern. Clamp parameter values with their respective lower and upper bounds.
parameter set for a given pattern. Clamp parameter values with their
respective lower and upper bounds.

2. Evaluate model on all new parameter sets.

3. Record new patterns and their corresponding parameter sets. If the
parameter sets returns an already discovered pattern, add parameter set
to their records. Return to Step 1.

\bold{Tuning the behaviour of the algorithm via \code{control}:}

This behaviour is further tuned by `control`, which needs to contain a list of the following values:

\code{population}{The number of parameter sets in each ordinal region,
Expand All @@ -49,13 +69,30 @@ pspGlobal(model, control, save = FALSE, path = ".", quiet = FALSE)
\code{lower, upper}{Vectors specifiying the lower and upper boundaries of
the parameter space for each parameter. The i-th element of lower and
upper bounds applies to the i-th parameter.}
\code{init}{A vector of parameters to use as the first jumping
distribution.}
\code{init}{A marix of parameters to use as the first jumping distribution.
Each row contains the parameter set, whereas columns correspond to
freely varying paarameters of the model.}
\code{radius}{The radius of the hypershere with n-dimensions to sample from.
Must be of type double. If you are unsure what to set here, set it to 1.}
\code{param_names}{A character vector that includes the names of each parameter.}

}
\code{parameter_names}{A character vector that includes the names of each parameter.
The order of elements should correspond to the order of parameter columns in
init.}
\code{dimensionality}{A single integer that specifies the number of
dimensions for the inequality matrix. The inequality matrix is a strict
upper triangular matrix. The number of rows and columns is equal to each
other.}
\code{responses}{It is an integer that specifies the number of continuous
variables the model output before the ordinal function is applied. See Note 2.}

\strong{Saving files to disk:}

The evaluated parameter sets and their corresponding continuous model outputs
are saved to disk if \code{save = TRUE}. The evaluated parameter sets are saved in
a file with the name \code{path_parameters} and the extension specified,
whereas continuous model outputs are saved in a file with the name \code{path_continuous}
and the extension specified.

}

\value{

Expand Down Expand Up @@ -90,5 +127,6 @@ The output is a list with the following items:
\note{

1. NA values are usually a result of some parameter combination falling outside of what the model implementation can handle. It is best handled outside of the PSP routine, e.g. during the inequality matrix construction. For example, if NA is detected in the matrix, change all values to 99 before returning the output.
2. Ideally, responses and dimensionality should be the same, but we can imagine a scenario where the dimensionality of the inequality matrix will be smaller than the number of responses. For example, when continuous variables compressed into a more compact format via clustering.

}
10 changes: 3 additions & 7 deletions man/psp_control.Rd
Expand Up @@ -45,25 +45,21 @@ psp_control(radius = 0.1, init, lower, upper,
\code{export_objects} below.}
\item{export_objects}{A character vector that includes all of the objects
to be loaded into each cluster. It is handled by
\code{\link[pkg:parallel]{clusterExports}}. Default is \code{NULL}.}
\code{parallel::clusterExports}. Default is \code{NULL}.}
\item{export_libs}{A character vector that includes all the packages to
be loaded into each cluster. It is handled by
\code{\link[pkg:parallel]{clusterExports}}. Default is \code{NULL}.}
\code{parallel::clusterExports}. Default is \code{NULL}.}
\item{iterations}{The number of global iterations for psp_global. Default is
1000.}
}

\value{

Returns a control list suitable for \code{\link{psp_global}} with the above
Returns a control list suitable for \code{psp_global} with the above
elements.

}

\seealso{
\code{\link{psp_global}}.
}

\examples{
# two parameter model
psp_control(lower = rep(0, 2), upper = rep(1, 2), init = rep(0.5, 2),
Expand Down
4 changes: 0 additions & 4 deletions man/psp_global.Rd
Expand Up @@ -68,10 +68,6 @@ of \code{PSP}. A \code{PSP} object is a list with the following items:

}

\seealso{
\code{\link{psp_control}}.
}

\keyword{computational modelling; parameter space; model evaluation}

\references{
Expand Down
10 changes: 6 additions & 4 deletions src/RcppExports.cpp
Expand Up @@ -12,23 +12,25 @@ Rcpp::Rostream<false>& Rcpp::Rcerr = Rcpp::Rcpp_cerr_get();
#endif

// pspGlobal
List pspGlobal(Function model, List control, bool save, std::string path, bool quiet);
RcppExport SEXP _psp_pspGlobal(SEXP modelSEXP, SEXP controlSEXP, SEXP saveSEXP, SEXP pathSEXP, SEXP quietSEXP) {
List pspGlobal(Function model, Function discretize, List control, bool save, std::string path, std::string extension, bool quiet);
RcppExport SEXP _psp_pspGlobal(SEXP modelSEXP, SEXP discretizeSEXP, SEXP controlSEXP, SEXP saveSEXP, SEXP pathSEXP, SEXP extensionSEXP, SEXP quietSEXP) {
BEGIN_RCPP
Rcpp::RObject rcpp_result_gen;
Rcpp::RNGScope rcpp_rngScope_gen;
Rcpp::traits::input_parameter< Function >::type model(modelSEXP);
Rcpp::traits::input_parameter< Function >::type discretize(discretizeSEXP);
Rcpp::traits::input_parameter< List >::type control(controlSEXP);
Rcpp::traits::input_parameter< bool >::type save(saveSEXP);
Rcpp::traits::input_parameter< std::string >::type path(pathSEXP);
Rcpp::traits::input_parameter< std::string >::type extension(extensionSEXP);
Rcpp::traits::input_parameter< bool >::type quiet(quietSEXP);
rcpp_result_gen = Rcpp::wrap(pspGlobal(model, control, save, path, quiet));
rcpp_result_gen = Rcpp::wrap(pspGlobal(model, discretize, control, save, path, extension, quiet));
return rcpp_result_gen;
END_RCPP
}

static const R_CallMethodDef CallEntries[] = {
{"_psp_pspGlobal", (DL_FUNC) &_psp_pspGlobal, 5},
{"_psp_pspGlobal", (DL_FUNC) &_psp_pspGlobal, 7},
{NULL, NULL, 0}
};

Expand Down
75 changes: 52 additions & 23 deletions src/pspGlobal.cpp
Expand Up @@ -129,7 +129,7 @@ vec MatchJumpDists(cube updated_ordinal, cube predicted) {
}
}
}
return(matches);
return(matches + 1); // add one as c++ starts from 0
}

// create local csv file for storing coordinates
Expand Down Expand Up @@ -163,8 +163,8 @@ void WriteFile(int iteration, mat evaluation, vec matches,
}

// [[Rcpp::export]]
List pspGlobal(Function model, List control, bool save = false,
std::string path = ".", bool quiet = false) {
List pspGlobal(Function model, Function discretize, List control, bool save = false,
std::string path = ".", std::string extension = ".csv", bool quiet = false) {
// setup environment
bool parameter_filled = false;
int iteration = 0;
Expand All @@ -187,45 +187,71 @@ List pspGlobal(Function model, List control, bool save = false,
}

double radius = as<double>(control["radius"]);
rowvec init = as<rowvec>(control["init"]);
mat init = as<mat>(control["init"]);

colvec lower = as<colvec>(control["lower"]);
colvec upper = as<colvec>(control["upper"]);
int dimensions = init.n_elem;
int dimensions = init.n_cols;
// do some basic error checks
if (dimensions != lower.n_elem || dimensions != upper.n_elem) {
stop("init, lower and upper must have the same length.");
}
int dimensionality = as<int>(control["dimensionality"]);
int response_length = as<int>(control["responses"]);
rowvec counts(1, fill::ones); // keeps track of the population of ordinal regions
cube ordinal; // stores all evaluations of fn on jumping_distribution
cube filtered; // stores all unique predictions
cube storage; // stores all unique ordinal patterns
CharacterVector names = as<CharacterVector>(control["param_names"]);
if (names.size() != dimensions) {
CharacterVector parameter_names = as<CharacterVector>(control["parameter_names"]);
if (parameter_names.size() != dimensions) {
stop("Length of param_names must equal to the number of dimensions");
}
CharacterVector stimuli_names = as<CharacterVector>(control["stimuli_names"]);
List out;

// seed has to be set at the global R level
// see Documentation about the sampling
Rcpp::Environment base_env("package:base");
Rcpp::Function set_seed_r = base_env["set.seed"];

// evaluate first parameter set
NumericMatrix teatime = model(init);
const mat& evaluate = as<mat>(teatime);
int stimuli = evaluate.n_rows;
// last evaluated parameters
// evaluate first parameter sets
mat last_eval = init;
// add output to storage
storage = join_slices(storage, evaluate);
mat jumping_distribution = init;
mat continuous(jumping_distribution.n_rows, response_length);


// create first ordinal storage
cube ordinal(dimensionality, dimensionality, jumping_distribution.n_rows);


// evaluate jumping distributions
for (uword i = 0; i < jumping_distribution.n_rows; i++) {
NumericVector probabilities = model(jumping_distribution.row(i));
NumericMatrix teatime = discretize(probabilities);
const rowvec& responses = as<rowvec>(probabilities);
continuous.row(i) = responses;
const mat& evaluate = as<mat>(teatime);
ordinal.slice(i) = evaluate;
}

// compare ordinal patterns to stored ones and update list
uvec include = FindUniqueSlices(ordinal);

// update last evaluated parameters
last_eval = LastEvaluatedParameters(storage, ordinal.slices(include),
jumping_distribution.rows(include),
last_eval);

storage = OrdinalCompare(storage, ordinal.slices(include));
counts = CountOrdinal(storage, ordinal, counts);

////////////////////////////////////////////////////////////////

if (save) {
CreateFile(names, path);
WriteFile(0, conv_to<mat>::from(init),
vec(1, fill::value(0)),
path);
vec match = MatchJumpDists(storage, ordinal);
CreateFile(parameter_names, path + "_parameters" + extension);
CreateFile(stimuli_names, path + "_continuous" + extension);
WriteFile(0, jumping_distribution, match, path + "_parameters" + extension);
WriteFile(0, continuous, match, path + "_continuous" + extension);
}

// run parameter space partitioning until parameter is filled
Expand All @@ -247,11 +273,13 @@ List pspGlobal(Function model, List control, bool save = false,
jumping_distribution = jumping_distribution + last_eval.rows(underpopulated);
jumping_distribution = ClampParameters(jumping_distribution, lower, upper);
// allocate cube for ordinal predictions
cube ordinal(stimuli, stimuli, jumping_distribution.n_rows);
ordinal.resize(dimensionality, dimensionality, jumping_distribution.n_rows);
continuous.resize(jumping_distribution.n_rows, response_length);

// evaluate jumping distributions
for (uword i = 0; i < jumping_distribution.n_rows; i++) {
NumericMatrix teatime = model(jumping_distribution.row(i));
NumericVector probabilities = model(jumping_distribution.row(i));
NumericMatrix teatime = discretize(probabilities);
const mat& evaluate = as<mat>(teatime);
ordinal.slice(i) = evaluate;
}
Expand All @@ -274,7 +302,8 @@ List pspGlobal(Function model, List control, bool save = false,
if (save) {
// index locations of currently found patterns in storage
vec match = MatchJumpDists(storage, ordinal);
WriteFile(iteration, jumping_distribution, match, path);
WriteFile(iteration, jumping_distribution, match, path + "_parameters" + extension);
WriteFile(0, continuous, match, path + "_continuous" + extension);
}

// check if either of the parameter_filled thresholds is reached
Expand All @@ -290,4 +319,4 @@ List pspGlobal(Function model, List control, bool save = false,
Rcpp::Named("iterations") = iteration);

return(out);
}
}

0 comments on commit c892989

Please sign in to comment.