Handle random_seed setting at higher level in package #512


Draft: edwardchalstrey1 wants to merge 4 commits into main
Conversation

@edwardchalstrey1 (Collaborator) commented Jun 6, 2025

Should close #465

Changes

  • Make the set_random_seed function available to both the AutoEmulate class and the Emulator class (and everything that inherits from it) via ConversionMixin; see the sketch after this list
  • I removed setting random_seed in the models directly, but then noticed it was still needed for RF, so I restored it there: random_state is a required parameter in the underlying sklearn estimator to make it behave deterministically
  • I also kept the random_seed parameter in GP (and we may want to keep it for all models); see my questions below
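
For context, a set_random_seed helper of this kind is usually a thin wrapper over the standard seeding calls. A minimal sketch (the deterministic flag and the exact signature are assumptions, not necessarily what ConversionMixin exposes):

import random

import numpy as np
import torch

def set_random_seed(seed: int, deterministic: bool = False):
    """Seed all random number generators the package relies on."""
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    if deterministic:
        # Trade performance for reproducible CUDA kernels
        torch.backends.cudnn.benchmark = False
        torch.backends.cudnn.deterministic = True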

Reviewers

A few decisions need to be made here that I'd like input on:

  1. I added a test for the GP model which, if we want to keep it, I will set up for the other models too. The test changes the random_seed and then creates a new model, which should produce different predictions. At the moment this fails (see the sketch after this list and the quoted assertions below it). My question is: do we care? And assuming we do, can you spot where I might be going wrong?
  2. Might we want to refactor this random_seed handling into a new mixin, or rename ConversionMixin? This assumes we want it available to both AutoEmulate and Emulator, which we might not.
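
For reference, the failing test has roughly the following shape (a sketch only: the GaussianProcess constructor and the fit/predict calls are assumptions standing in for the actual emulator API):

import torch

x = torch.rand(32, 2)
y = torch.sin(x.sum(dim=1, keepdim=True))

# Two emulators that differ only in their random seed
gp1 = GaussianProcess(random_seed=13)  # constructor signature assumed
gp1.fit(x, y)
pred1 = gp1.predict(x)

gp2 = GaussianProcess(random_seed=14)
gp2.fit(x, y)
pred2 = gp2.predict(x)

# Expected: different seeds should give different predictions.
# Observed: the assertions below pass, i.e. the outputs are identical.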

assert torch.allclose(pred1.mean, pred2.mean)
assert torch.allclose(pred1.variance, pred2.variance)

# TODO: These should be different, but do we care that they aren't?
Collaborator commented:

It might be the case that:

  • we expect these to be close for different fits of the GP (perhaps it would also be worth testing the hyperparameters of the fitted model for similarity; see the sketch at the end of this comment)
  • the fitting is not stochastic for this model (and for others in this PR) and has no dependence on the seed

I think it would be worth documenting, in each model's init docstring, the expected effect of providing a random seed.
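
On the first point, a hyperparameter comparison could be as simple as diffing the fitted parameter tensors. A minimal sketch, assuming the fitted emulators expose a torch state_dict (params_allclose is a hypothetical helper):

import torch

def params_allclose(model_a, model_b, rtol=1e-4, atol=1e-6):
    """Return True if every fitted parameter tensor in the two models matches."""
    sd_a, sd_b = model_a.state_dict(), model_b.state_dict()
    if sd_a.keys() != sd_b.keys():
        return False
    return all(torch.allclose(sd_a[k], sd_b[k], rtol=rtol, atol=atol) for k in sd_a)

# Same seed: expect params_allclose(gp1, gp2) to be True.
# Different seeds: if fitting is stochastic, expect False;
# if it stays True, the fit has no dependence on the seed.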

Development

Successfully merging this pull request may close: Refactor: set random seeds (#465)
2 participants