## Export Data for Optimization Help Issues

This notebook helps you export the information needed to file an [Optimization Help](https://github.com/meta-pytorch/botorch/issues/new?template=optimization_help.yml) issue on BoTorch.

You'll use the `get_data_for_optimization_help` helper function to export `optimization_help_data.json`, which contains the training data and the model state dict. Ideally, call it as close as possible to the code that raises the error or, if the problem is poor optimization performance, at the end of the optimization loop.
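To make the contents of such an export concrete, here is a minimal sketch of how training data and a state dict could be written to JSON by hand. It is an illustration under assumptions, not BoTorch's implementation of `get_data_for_optimization_help`: the function name `export_debug_data`, the default file name, and the JSON layout are all hypothetical. The sketch only assumes that each value exposes `.tolist()`, as `torch.Tensor` does.

```python
import json


def export_debug_data(train_X, train_Y, state_dict, path="debug_data.json"):
    """Write training data and model parameters to a JSON file.

    Hypothetical helper for illustration only. Every value passed in must
    expose `.tolist()` (for example, a `torch.Tensor`).
    """
    payload = {
        "train_X": train_X.tolist(),
        "train_Y": train_Y.tolist(),
        # Convert each parameter or buffer to nested Python lists.
        "state_dict": {name: value.tolist() for name, value in state_dict.items()},
    }
    with open(path, "w") as f:
        # By default, json.dump writes non-finite floats as NaN / Infinity,
        # which Python's json module can read back.
        json.dump(payload, f)
    return path
```

With real tensors, a call might look like `export_debug_data(train_X, train_Y, model.state_dict())`. Prefer the official helper when it is available in your BoTorch version, since the issue template expects its output.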

In [1]:
import torch

In [2]:
# Replace these with your actual data and model
# train_X: (n, d) tensor of inputs
# train_Y: (n, m) tensor of outputs
# model: your fitted BoTorch model

# Example (delete this and use your own):
from botorch.models import SingleTaskGP
train_X = torch.rand(10, 2)
train_Y = torch.sin(train_X.sum(dim=-1, keepdim=True))
model = SingleTaskGP(train_X, train_Y)

[W 260120 20:28:20 2584013286:10] The model inputs are of type torch.float32. It is strongly recommended to use double precision in BoTorch, as this improves both precision and stability and can help avoid numerical errors. See https://github.com/meta-pytorch/botorch/discussions/1444


In [3]:
# Export training data and model state dict to JSON
from botorch.models.utils import get_data_for_optimization_help

get_data_for_optimization_help(model)
print("Saved optimization_help_data.json")

ImportError: cannot import name 'get_data_for_optimization_help' from 'botorch.models.utils' (/var/svcscm/.bento/kernels/bento_kernel_ae/5847/bento_kernel_ae_binary-inplace#link-tree/botorch/models/utils/__init__.py)

**Done!** Attach `optimization_help_data.json` to your [Optimization Help issue](https://github.com/meta-pytorch/botorch/issues/new?template=optimization_help.yml).

# Appendix: Example Issue

Below is an example of a common issue: the user forgot to apply an input transform that scales the inputs to the unit cube, so the GP is poorly conditioned. The comment marked `>>> EXTRACT DATA FOR OPTIMIZATION HELP HERE <<<` shows where you should (ideally) call `get_data_for_optimization_help(model)` when filing an issue.


The cell below intentionally fails with a numerical optimization error. To fix it, uncomment the `input_transform` argument so that the model is re-initialized with the input transform.

In [None]:
from botorch.models.transforms.input import Normalize
import torch
from botorch.fit import fit_gpytorch_mll
from botorch.utils.sampling import draw_sobol_samples
from botorch.models import SingleTaskGP
from botorch.acquisition import qLogExpectedImprovement
from botorch.optim import optimize_acqf
from gpytorch.mlls import ExactMarginalLogLikelihood
from botorch.models.utils import get_data_for_optimization_help

# --- User-defined parameters and objective ---

NUM_INIT = 11
NUM_ITERATIONS = 50
batch_size = 1
num_restarts = 4
raw_samples = 1024

# Define bounds here. These bounds are intentionally extreme: one dimension is
# tiny and the other is huge. If inputs are not scaled to [0, 1], this will
# cause numerical issues downstream.
bounds = torch.tensor([[-0.0001, -10000], [0.0001, 10000]])

# We also need to use float64 instead of float32. Setting it on the bounds is
# enough; the change propagates to the X's, the Y's, and the model.
# bounds = bounds.to(torch.float64)
# Take the input dimension from bounds, since train_X is not defined yet here.
input_transform = Normalize(d=bounds.shape[-1], bounds=bounds)

# Dummy objective function for illustration
def your_objective(X):
    # Example: sum of squares (replace with your actual objective)
    return -(X ** 2).sum(dim=-1, keepdim=True)

# Generate initial data - TODO make sure your initial data is within these bounds
train_X = draw_sobol_samples(n=NUM_INIT, bounds=bounds, q=1).squeeze(-2)
train_Y = your_objective(train_X)

# Optimization loop
for iteration in range(NUM_ITERATIONS):
    # ---------------------------------------------------------------------------
    # Fit GP model
    # NOTE: missing input_transform here causes issues, since the bounds are too
    # extreme for the optimization to work.
    model = SingleTaskGP(
        train_X=train_X,
        train_Y=train_Y,
        # input_transform=input_transform
    )
    # ---------------------------------------------------------------------------

    # ==========================================================================
    # >>> EXTRACT DATA FOR OPTIMIZATION HELP HERE <<<
    # Since the error will occur in one of the lines below,
    # this is where you should save your data for the Optimization Help issue.
    # This saves train_X, train_Y, and model.state_dict() to a single JSON file.
    get_data_for_optimization_help(model)
    # ==========================================================================


    mll = ExactMarginalLogLikelihood(model.likelihood, model)
    fit_gpytorch_mll(mll)

    # Build acquisition function
    acq_func = qLogExpectedImprovement(model=model, best_f=torch.max(train_Y))

    # Optimize acquisition function to get new candidates
    candidates, _ = optimize_acqf(
        acq_function=acq_func,
        bounds=bounds,
        q=batch_size,
        num_restarts=num_restarts,
        raw_samples=raw_samples,
    )

    # Evaluate objective at new candidates
    new_Y = your_objective(candidates)

    # Update training data
    train_X = torch.cat([train_X, candidates], dim=0)
    train_Y = torch.cat([train_Y, new_Y], dim=0)

    print(f"Iteration {iteration + 1}/{NUM_ITERATIONS}, best_f: {train_Y.max().item():.4f}")

[W 260120 14:05:37 1214446538:43] The model inputs are of type torch.float32. It is strongly recommended to use double precision in BoTorch, as this improves both precision and stability and can help avoid numerical errors. See https://github.com/meta-pytorch/botorch/discussions/1444
[W 260120 14:05:37 assorted:271] Data (input features) is not contained to the unit cube. Please consider min-max scaling the input data.
[W 260120 14:05:37 1214446538:43] The model inputs are of type torch.float32. It is strongly recommended to use double precision in BoTorch, as this improves both precision and stability and can help avoid numerical errors. See https://github.com/meta-pytorch/botorch/discussions/1444
[W 260120 14:05:37 assorted:271] Data (input features) is not contained to the unit cube. Please consider min-max scaling the input data.
[W 260120 14:05:37 fit:212] `scipy_minimize` terminated with status OptimizationStatus.FAILURE, displaying original message from `scipy.optimize.minimize`

Iteration 1/50, best_f: -948942.3750
Iteration 2/50, best_f: -933277.6875


[W 260120 14:05:37 1214446538:43] The model inputs are of type torch.float32. It is strongly recommended to use double precision in BoTorch, as this improves both precision and stability and can help avoid numerical errors. See https://github.com/meta-pytorch/botorch/discussions/1444
[W 260120 14:05:37 assorted:271] Data (input features) is not contained to the unit cube. Please consider min-max scaling the input data.
[W 260120 14:05:37 fit:212] `scipy_minimize` terminated with status OptimizationStatus.FAILURE, displaying original message from `scipy.optimize.minimize`: ABNORMAL: 
[W 260120 14:05:37 1214446538:43] The model inputs are of type torch.float32. It is strongly recommended to use double precision in BoTorch, as this improves both precision and stability and can help avoid numerical errors. See https://github.com/meta-pytorch/botorch/discussions/1444
[W 260120 14:05:37 assorted:271] Data (input features) is not contained to the unit cube. Please consider min-max scaling the

Iteration 3/50, best_f: -933277.6875
Iteration 4/50, best_f: -933277.6875


[W 260120 14:05:37 1214446538:43] The model inputs are of type torch.float32. It is strongly recommended to use double precision in BoTorch, as this improves both precision and stability and can help avoid numerical errors. See https://github.com/meta-pytorch/botorch/discussions/1444
[W 260120 14:05:37 assorted:271] Data (input features) is not contained to the unit cube. Please consider min-max scaling the input data.
[W 260120 14:05:37 fit:212] `scipy_minimize` terminated with status OptimizationStatus.FAILURE, displaying original message from `scipy.optimize.minimize`: ABNORMAL: 
[W 260120 14:05:38 cholesky:40] A not p.d., added jitter of 1.0e-06 to the diagonal
[W 260120 14:05:38 cholesky:40] A not p.d., added jitter of 1.0e-05 to the diagonal
[W 260120 14:05:38 cholesky:40] A not p.d., added jitter of 1.0e-04 to the diagonal
[W 260120 14:05:38 cholesky:40] A not p.d., added jitter of 1.0e-03 to the diagonal
[W 260120 14:05:38 cholesky:40] A not p.d., added jitter of 1.0e-02 to the

RuntimeError: probability tensor contains either `inf`, `nan` or element < 0
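
The warnings above recommend min-max scaling the inputs to the unit cube, which is what passing the input transform to the model is meant to achieve. As a rough illustration of that arithmetic, and not BoTorch's `Normalize` implementation, the following dependency-free sketch maps each coordinate from its bounds to [0, 1]. The function name `normalize_to_unit_cube` and the sample points are made up for this example; the bounds are the ones used in the cell above.

```python
def normalize_to_unit_cube(points, lower, upper):
    """Min-max scale each coordinate of each point from [lower, upper] to [0, 1]."""
    return [
        [(x - lo) / (hi - lo) for x, lo, hi in zip(point, lower, upper)]
        for point in points
    ]


# Bounds from the example: one tiny dimension and one huge dimension.
lower = [-0.0001, -10000.0]
upper = [0.0001, 10000.0]

# Illustrative points: the lower corner, the center, and the upper corner.
raw_points = [[-0.0001, -10000.0], [0.0, 0.0], [0.0001, 10000.0]]
scaled_points = normalize_to_unit_cube(raw_points, lower, upper)
print(scaled_points)
```

After scaling, both dimensions span the same range even though the raw ranges differ by eight orders of magnitude, which is the property the GP fit benefits from.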