SAASBO Tutorial not working on GPU / Pyro error? #1108

@winf-hsos

When I download the code from https://ax.dev/tutorials/saasbo.html and run it on Google Colab with a GPU, I get the following error message:

RuntimeError                              Traceback (most recent call last)
<ipython-input-21-fb6ba87452b0> in <module>
     24         torch_dtype=tkwargs["dtype"],
     25         verbose=True,  # Set to True to print stats from MCMC
---> 26         disable_progbar=True,  # Set to False to print a progress bar from MCMC
     27     )
     28     generator_run = model.gen(BATCH_SIZE)

20 frames
/usr/local/lib/python3.7/dist-packages/pyro/infer/mcmc/util.py in _potential_fn_jit(self, skip_jit_warnings, jit_options, params)
    292 
    293         if self._compiled_fn:
--> 294             return self._compiled_fn(*vals)
    295 
    296         with pyro.validation_enabled(False):

RuntimeError: The following operation failed in the TorchScript interpreter.
Traceback of TorchScript (most recent call last):
RuntimeError: Graph::copy() encountered a use of a value 133 not in scope. Run lint!

The error is raised when running this code cell:

# Experiment
experiment = Experiment(
    name="saasbo_experiment",
    search_space=search_space,
    optimization_config=optimization_config,
    runner=SyntheticRunner(),
)

# Initial Sobol points
sobol = Models.SOBOL(search_space=experiment.search_space)
for _ in range(N_INIT):
    experiment.new_trial(sobol.gen(1)).run()

# Run SAASBO
data = experiment.fetch_data()
for i in range(N_BATCHES):
    model = Models.FULLYBAYESIAN(
        experiment=experiment, 
        data=data,
        num_samples=256,  # Increasing this may result in better model fits
        warmup_steps=512,  # Increasing this may result in better model fits
        gp_kernel="rbf",  # "rbf" is the default in the paper, but we also support "matern"
        torch_device=tkwargs["device"],
        torch_dtype=tkwargs["dtype"],
        verbose=True,  # Set to True to print stats from MCMC
        disable_progbar=True,  # Set to False to print a progress bar from MCMC
    )
    generator_run = model.gen(BATCH_SIZE)
    trial = experiment.new_batch_trial(generator_run=generator_run)
    trial.run()
    data = Data.from_multiple_data([data, trial.fetch_data()])
    
    new_value = trial.fetch_data().df["mean"].min()
    print(f"Iteration: {i}, Best in iteration {new_value:.3f}, Best so far: {data.df['mean'].min():.3f}")

This only happens when I use CUDA. When I change the device to CPU, it works fine. The same error occurs on our internal cluster with an NVIDIA A40 GPU.
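For anyone who wants to reproduce the CPU run, here is a minimal sketch of how the device could be switched. It assumes `tkwargs` is a plain dict with `dtype` and `device` keys, as used in the cell above. `pick_device_name` and `FORCE_CPU` are hypothetical names introduced only for this illustration and are not part of the tutorial.

```python
def pick_device_name(cuda_available: bool, force_cpu: bool = False) -> str:
    """Return "cuda" only when CUDA is available and CPU is not forced."""
    return "cuda" if cuda_available and not force_cpu else "cpu"


try:
    import torch
except ImportError:
    torch = None  # the helper above still works without PyTorch installed

FORCE_CPU = True  # hypothetical switch: set to False to use the GPU again

if torch is not None:
    device_name = pick_device_name(torch.cuda.is_available(), FORCE_CPU)
    # Assumed shape of tkwargs: a dict with "dtype" and "device" keys.
    tkwargs = {"dtype": torch.double, "device": torch.device(device_name)}
    print(tkwargs)
```

This only sidesteps the error by keeping everything on the CPU; it does not address the TorchScript failure itself.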

BTW: The same error occurs when I use the BoTorch example here: https://botorch.org/tutorials/saasbo. Given that both tutorials use the same libraries, that makes perfect sense.
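Since both tutorials share the same stack, the installed versions are probably relevant. Below is a minimal sketch for collecting them. The distribution names in `PACKAGES` are my assumption about which packages matter here, and `importlib.metadata` requires Python 3.8 or newer (on Python 3.7 the `importlib_metadata` backport would be needed instead).

```python
from importlib.metadata import PackageNotFoundError, version

# Assumed list of relevant distributions; adjust as needed.
PACKAGES = ["ax-platform", "botorch", "gpytorch", "pyro-ppl", "torch"]


def installed_versions(names):
    """Map each distribution name to its installed version, or None if absent."""
    found = {}
    for name in names:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


for name, ver in installed_versions(PACKAGES).items():
    print(f"{name}: {ver or 'not installed'}")
```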

Any help is greatly appreciated! Thanks!
Nicolas

Labels: upstream issue (Resolution depends on upstream fixes)
