Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

createPrior - bug in the help file and confusing behavior of the truncation option #98

Closed
florianhartig opened this issue Feb 6, 2018 · 22 comments

Comments

@florianhartig
Copy link
Owner

Arising from github.com//issues/72

If we try to parallelize the help of createPrior(), we got an error.

The reason is that there is an error in the prior specification in the help, and/or, depending on how you see it, in the logic of the createPrior.

In the example, we specify a density and a sampler, both identical. So far so good.

density = function(par){
  d1 = dunif(par[1], -2,6, log =TRUE)
  d2 = dnorm(par[2], mean= 2, sd = 3, log =TRUE)
  return(d1 + d2)
}

sampler = function(n=1){
  d1 = runif(n, -2,6)
  d2 = rnorm(n, mean= 2, sd = 3)
  return(cbind(d1,d2))
}

but then, when we create the prior, we use lower and upper. The idea of this option was to be able to add truncation to a arbitrary density, which is nice.

prior <- createPrior(density = density, sampler = sampler, 
                     lower = c(-3,-3), upper = c(3,3), best = NULL)

However, I just realized that the truncation it is not applied to the sampling function, which causes the sampler function to sample initial proposal in the MCMC that are outside the prior range.

Consequences from this

  • Clearly, the help needs to be changed - remove the truncation!

  • Secondly, we need to decide how we handle the truncation option in the future, should this be removed?

  • Thirdly, I don't understand why the MCMC is throwing an error only when run in parallel. Moreover, should we add additional checks in the MCMC to make sure that the initial values are sensible? For this, it would be nice to understand what's actually going on internally and why the error is only in the parallel mode.

@florianhartig
Copy link
Owner Author

@MaximilianPi can you do point 1,3

@MaximilianPi
Copy link
Collaborator

added missing checks in DEzs and DREAMzs samplers (they were only missing in the parallel part)
ffeb3c2

@MaximilianPi
Copy link
Collaborator

MaximilianPi commented Mar 2, 2018

I checked it again and the fix should solve at least the problem with parallel error.

if(!is.na(logfitness_x_prop[i,1] - logfitness_X[i,1]))

This statement was only missing in the parallel part of the DEzs and DREAMzs sampler (It's done this way in the DE, DEzs, DREAM and DREAMzs samplers). If both, the new proposed priors' and the start priors' likelihoods are -Inf this check is required otherwise there occurs an error in the ll comparison(only if both lls are -Inf):

if ((logfitness_x_prop[i,1] - logfitness_X[i,1] + r_extra[i]) > log(runif(1))){
           # accept <- accept + 1 
            X[i,] <- x_prop[i,]
            logfitness_X[i,] <- logfitness_x_prop[i,]

This occurs only if the start values are bad and outside of the density. Maybe we should sample the priors at the start until they are inside?

Another small issue/question. Why is the ll set to NA (if the prior is outside) in case of is.matrix(x):
(It's the setup$posterior$density function, code )

posterior <- function(x, returnAll = F){
    
    if (is.vector(x)){
      priorResult = prior$density(x) # Checking if outside prior to save calculation time
      if (! (priorResult == -Inf)) ll = likelihood$density(x)
      else ll = -Inf
      if (returnAll == F) return(ll + priorResult)    
      else return(c(ll + priorResult, ll, priorResult)) 
    
    } else if(is.matrix(x)){
      
      priorResult = prior$density(x) # Checking first if outside the prior to save calculation time
      feasible <- (! priorResult == -Inf)
      if (dim(x)[2] == 1) llResult <- likelihood$density(matrix(x[feasible, ], ncol = 1))
      else{
        if(TRUE %in% feasible) llResult <- likelihood$density(x[feasible, ])
        else llResult <- -Inf 
      }
      post = priorResult
      ll = priorResult
      ll[!feasible] = NA
      ll[feasible] = llResult
      post[feasible] = post[feasible] + llResult
      post[!feasible] = -Inf
      if (returnAll == F) return(post)    
      else{
        out <- cbind(post, ll, priorResult)
        colnames(out) = c("posterior", "likelihood", "prior")
        return(out)    
      } 
    }
    else stop("parameter must be vector or matrix")
  }

@florianhartig
Copy link
Owner Author

suggested solution - remove upper / lower from create prior, and accept upper / lower in setup only if there is no prior provided

@TankredO
Copy link
Contributor

Is there any reason to keep the "best" parameter when reworking createPrior?

@florianhartig
Copy link
Owner Author

the motivation for this was to be able to provide a reference parameter set, even for flat priors ... yes, I would keep it, for various functions (e.g. sensitivity analysis), it's useful to have a "center point"

is there a technical problem with that parameter?

@TankredO
Copy link
Contributor

TankredO commented Mar 19, 2018

The problem is that without lower and upper we can't calculate "best" the way we are doing now ((upper + lower) / 2). But this should not be too bad if we just keep it optional.

@TankredO
Copy link
Contributor

9b3fa68 b5ce8be
I renamed "createPrior" to "createPriorGeneric" and added a new function "createPrior" taking the inputs density, sampler and best that wraps it.
The idea is that the user calls this new createPrior function with his custom density and sampler and passes the prior to createBayesianSetup. If he wants more control he can still call createPriorGeneric. Further we don't have to rework all createPrior functions (which would be the case if we only changed the old createPrior).
If no prior is passed to createBayesianSetup, the user must pass lower and upper to get a uniform density truncated by lower and upper.
If prior and lower or upper are passed to createBayesianSetup, lower and upper wil be ignored and the prior object will be used.

Please let me know if this is the desired behaviour.

@TankredO
Copy link
Contributor

TankredO commented Apr 4, 2018

As discussed with @florianhartig the createPrior function will be changed:

  • lower/upper/best parameters are removed
  • info arguments is added: info is a list of lower, upper, best, name
  • lower/upper/best do not influence prior density or sampling, they are rather used for plotting/sensitivity analyis

This leads to changes in createBayesianSetup:

  • lower/upper/best will be deprecated
  • if lower/upper/best are passed together with a prior, they will be ignored
  • if lower/upper/best are passed without a prior, a uniform prior will be created

And global changes in the package:

  • occurences of prior$lower/prior$upper/prior$best will have to replaced with a getter "method" for classPrior. E.g. getLower/getUpper/getBest(prior)

Another issue we should address while working on createBayesianSetup:
By making those changes we separate createBayesianSetup from the priorCreation. However, right now the user can pass functions for prior and priorSampler in createBayesianSetup. Maybe we should also deprecate those parameters and tell the user to generate "real" priors with createPrior? @florianhartig

@florianhartig
Copy link
Owner Author

Hi Tankred,

I would deprecate the sampler option in the createBayesianSetup, so that the createBayesianSetup has only 3 options to create a prior

  • no prior Argument, but lower / upper / best / name (which will then call createUniformPrior with the arguments to info)
  • a density function (which then calls the createPrior), with the additional option to provide lower / upper / best / name
  • a real prior. If lower / upper / best / name are provided additionally in the setup, check if the info exists in the prior, and if so, don't overwrite and throw a warning.

About the deprecation of lower / upper / best / name in the setup - I remember we discussed this and decided for this in the end. I'm not 100% sure if that's the best option, but let's go for it and see how this works.

@TankredO
Copy link
Contributor

TankredO commented Apr 4, 2018

So latest state is:

  • deprecate upper/lower/best/priorSampler in createBayesianSetup
  • add info field/property to likelihood, prior, and bayesianSetup
  • the user can pass info containing upper/lower/best/name/parameterNames to prior, likelihood, and bayesianSetup
  • the info field is evaluated hierarchically and stored in bayesianSetup.
    • if there are inconsistencies, the hierarchically higher object wins: bayesianSetup > prior/likelihood
    • if bayesianSetup is not passed info and prior's and likelihood's info are inconsistent throw an error/warning
    • What happens if no info is specified?
    • What happens if at createBayesianSetup info and lower/upper/best are specified?"
  • all function that were using something like bayesianSetup$prior$lower/upper/best should now use bayesianSetup$info$lower/upper/best

@TankredO
Copy link
Contributor

TankredO commented Apr 4, 2018

There is a bug in createBayesianSetup when a function is passed for prior:

if(inherits(prior,"bayesianOutput")){
    # priorClass = createPriorDensity(prior, lower = lower, upper = upper, best = best)
    priorClass = createPriorDensity(prior)
  } else if(! "prior" %in% class(prior)){
    if(is.null(prior) & !is.null(lower) & !is.null(upper)){
      priorClass = createUniformPrior(lower = lower,upper = upper, best = best)
    }else if(is.null(prior) || class(prior) == "function"){
      # priorClass = createPriorDensity(prior, lower = lower, upper = upper, best = best)

      priorClass = createPriorDensity(prior) # <===== HERE IS THE BUG

    }else{
      stop("wrong input in prior")
    }
  } else{
    priorClass = prior 
    if("function" != class(prior)){
      if( !is.null(lower)) warning("Prior and lower values provided to createBayesiansetup, the latter will be ignored")
      if( !is.null(upper)) warning("Prior and upper values provided to createBayesiansetup, the latter will be ignored")
    }
  }

We can not create a prior object from a density alone. This probably never worked, since createPriorDensity expects a bayesianOutput or a matrix (of sampled parameters).
Calling createBayesianSetup with functions for prior and priorSampler also throws an error ("no applicable method for 'getSample' applied to an object of class "function"")

@MaximilianPi
Copy link
Collaborator

Functions for prior density and sampler in bayesian setups seem to work:

createBayesianSetup(likelihood = function(x) sum(dnorm(1:6,mean = x, sd = 1, log = T)) , prior = function(x) sum(dnorm(1:6, mean = x, log = T)), priorSampler = function(x=1) rnorm(x, mean = 3))

But yes if we give a function for density and no sampler he can not create a sampler..

We could do inverse transform sampling?

@TankredO
Copy link
Contributor

TankredO commented Apr 4, 2018

The question is more if we should do that

@florianhartig
Copy link
Owner Author

  1. Can you guys clarify - is there a bug or no?

  2. I don't see what's the problem with providing a density function to the setup. A density function is also what you can provide to the prior, right? It's just passing it through

@MaximilianPi
Copy link
Collaborator

MaximilianPi commented Apr 4, 2018

  1. createBayesianSetup with functions for sampler and prior works (without boundaries)

  2. createBayesianSetup with prior = createPrior(density = function) does not work without boundaries, error in the createBayesianSetup function

  3. createBayesianSetup with prior = densityfunction works without boundaries (without sampler func)

  4. createBayesianSetup with only sampler function and without density function in createPrior or as argument for createBayesianSetup does not work (ofc not but we could change this)

2 and 3 should be discussed regarding the boundaries and we discussed about 4, if we provide only a density but no sampler what should happen?

@florianhartig
Copy link
Owner Author

I think we should keep the option to provide a density without a sampler. It's perfectly reasonable for the user not to provide a sampler, and just choose starting values by hand.

What one could discuss is if the prior optimization that Tankred is implementing should access the info parameters, as they might provide some initial guesses of the user about boundaries, best values etc., even if no sampler is provided.

@florianhartig
Copy link
Owner Author

The alternative would be that we force certain info parameters, e.g. user has to provide an initial best guess / variance if no sampler is provided.

@TankredO
Copy link
Contributor

TankredO commented Apr 5, 2018

Since we want to keep createBayesianSetup the way that it is, we could do something like this:

If prior is object of class prior
  use it
else if prior is function
  create prior object from function and use it (not possible right now, we need a sampler)
else if prior is NULL
  use lower/upper/best to create uniform prior object

Regarding creating a prior from a density alone: I know the sampler is "optional" in createPrior. However, if it is not provided we will get errors in createBayesianSetup and multiple samplers (e.g. DE). So it is not really optional.

@TankredO
Copy link
Contributor

TankredO commented Apr 5, 2018

Another issue is how to get the number of parameters if only a density is passed.

@florianhartig florianhartig moved this from ToDo to In Progress in setup / prior / likelihood interface Apr 6, 2018
@TankredO
Copy link
Contributor

solved with #124

@TankredO TankredO moved this from In Progress to Done in setup / prior / likelihood interface Apr 10, 2018
@florianhartig
Copy link
Owner Author

So, there is still this stale branch https://github.com/florianhartig/BayesianTools/tree/issue98 which seems to have code that was not merged into the master

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Development

No branches or pull requests

3 participants