Calculating the Bayesian Information Criterion. #298

Open
elkeschaper opened this Issue Feb 4, 2016 · 6 comments

elkeschaper commented Feb 4, 2016

I'm interested in calculating the Bayesian Information Criterion for model selection, i.e. to assess the fit for different kernels.

For a model of type GPy.models.gp_regression.GPRegression, my current approach is as follows:

import numpy as np

def calculate_BIC(model):
    """
    Calculate the Bayesian Information Criterion (BIC) for a GPy `model` with maximum likelihood hyperparameters on a given dataset.
    https://en.wikipedia.org/wiki/Bayesian_information_criterion
    """
    # model.log_likelihood() is the natural logarithm of the marginal likelihood of the Gaussian process.
    # len(model.X) is the number of data points.
    # model._size_transformed() is the number of optimisation parameters.
    return - 2 * model.log_likelihood() + len(model.X) * np.log(model._size_transformed())

My questions now are:

  • Is this already implemented and I just missed it?
  • Would you agree with this approach, or are there better ways to generalize it?

Cheers!

Elke

lawrennd commented Feb 4, 2016

Hi Elke,

Just from a theoretical perspective: although people do sometimes compute BIC for these models (often under pressure from reviewers), I don't think it makes sense to. I believe implicit in BIC (and AIC) is the idea that the parameters you count should render the data conditionally independent. In other words, your model should make an i.i.d. assumption about the stochasticity given the parameters. GPs don't do this; they assume correlation.

Feel free to include it (thanks for contributing!), but perhaps put a note to that effect on it.

Neil

elkeschaper commented Feb 4, 2016

Hi Neil,

You're making a very good point. Unfortunately, I don't know the best alternative for selecting among models with different numbers of hyperparameters (that is, when a full evaluation of the models' posterior probabilities is not feasible).

I'd be very happy to add a BIC code snippet to the library.

Elke

lawrennd commented Feb 4, 2016

@alansaul again, wouldn't Aki's work help here?

mzwiessele commented Mar 10, 2016

Any news? Can we close this?

majidaldo commented Jan 6, 2017

There's a mistake: in the positive term, the log should be taken of the number of data points, not of the number of parameters.
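
For reference, a minimal corrected sketch, reusing the same GPy model and attributes as in the original post (only the argument of the log and the multiplier are swapped):

import numpy as np

def calculate_BIC(model):
    """
    BIC = -2 * log marginal likelihood + (number of parameters) * log(number of data points).
    """
    n = len(model.X)               # number of data points
    k = model._size_transformed()  # number of optimised parameters
    return -2 * model.log_likelihood() + k * np.log(n)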

rohanchauhan commented Dec 29, 2018

Hi, shouldn't the log-likelihood you are calculating be the log-likelihood at the MLE, i.e. evaluated after the hyperparameters have been optimised?
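
For what it's worth, a short usage sketch (with made-up synthetic data and an RBF kernel, purely for illustration): optimising the model first means `log_likelihood()` is evaluated at a (local) maximum likelihood estimate of the hyperparameters, which is what the BIC formula assumes.

import numpy as np
import GPy

# Hypothetical toy data, only for illustration.
X = np.random.uniform(-3.0, 3.0, (50, 1))
Y = np.sin(X) + 0.1 * np.random.randn(50, 1)

model = GPy.models.GPRegression(X, Y, GPy.kern.RBF(input_dim=1))
model.optimize()  # maximise the log marginal likelihood w.r.t. the hyperparameters

# BIC at the (local) ML estimate of the hyperparameters.
bic = -2 * model.log_likelihood() + model._size_transformed() * np.log(len(model.X))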
