# Techniques for sampling parameter space - Adaptive sampling

In [None]:
# For adaptive sampling, the notebook needs some additional dependencies.

import adaptive
import holoviews
import ipywidgets
import nest_asyncio

adaptive.notebook_extension()
nest_asyncio.apply()

Adaptive sampling is a technique that uses data fitting to decide where in the parameter space to sample next. Fitting the samples that have already been taken gives a rough prediction of the data trend across the parameter space, so an informed choice can be made about where to sample next. Regions where the trend is relatively flat do not need to be sampled as densely as regions where it changes rapidly. Choosing sample points based on the trend focuses computational time on the most important parts of the parameter space.
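
To make the idea concrete, here is a minimal one-dimensional sketch in plain Python. It is a toy and not how the `adaptive` package is implemented: `adaptive_sample_1d` and `toy_response` are hypothetical names, and the scoring rule (the length of the straight line joining the two ends of an interval) is just one simple choice. The sampler repeatedly bisects the highest-scoring interval, so points should cluster where the function is steep.

```python
import math


def adaptive_sample_1d(func, lower, upper, n_points):
    """Sample func on [lower, upper], always bisecting the highest-scoring interval."""
    xs = [lower, upper]
    ys = [func(lower), func(upper)]
    while len(xs) < n_points:
        # Score each interval by the length of the straight line joining its
        # end points: a steep interval scores higher than a flat one of equal width.
        losses = [
            math.hypot(xs[i + 1] - xs[i], ys[i + 1] - ys[i])
            for i in range(len(xs) - 1)
        ]
        worst = losses.index(max(losses))
        new_x = 0.5 * (xs[worst] + xs[worst + 1])
        # Insert the new sample so that xs stays sorted.
        xs.insert(worst + 1, new_x)
        ys.insert(worst + 1, func(new_x))
    return xs, ys


def toy_response(x):
    """Hypothetical stand-in for a simulation: flat at both ends, steep near x = 0.5."""
    return math.tanh(20 * (x - 0.5))


xs, ys = adaptive_sample_1d(toy_response, 0.0, 1.0, n_points=21)
n_steep = sum(1 for x in xs if 0.4 <= x <= 0.6)
print(f"{n_steep} of {len(xs)} points fall in the steep region [0.4, 0.6]")
```

A uniform grid of 21 points would place only about a fifth of its samples in that region, so the printed count gives a quick feel for how strongly the sampler concentrates on the steep part.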

The following code runs a neutronics simulation using a simple pre-defined model. Sampling begins at the corners of the parameter space, i.e. (enrichment, breeder percentage) = (0, 0), (0, 100), (100, 0), (100, 100). These points are then fitted to predict where TBR is varying most rapidly across the parameter space. A sample is taken at that point and the process is repeated. There are many ways to fit existing data points during adaptive sampling; this particular example uses `adaptive.Learner2D`, which interpolates linearly over a triangulation of the points sampled so far rather than fitting a global model such as Gaussian process regression.
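
As a small illustration, the corner points of a rectangular parameter space can be listed directly from its bounds. The `bounds` value mirrors the one passed to the learner in this notebook. `corners` is only an illustrative name, and this sketch says nothing about how `adaptive` stores its points internally.

```python
from itertools import product

# Bounds as passed to the learner: one (min, max) pair per parameter.
bounds = [(0, 100), (0, 100)]

# Every combination of a lower or upper limit gives one corner.
corners = list(product(*bounds))
print(corners)  # [(0, 0), (0, 100), (100, 0), (100, 100)]
```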

In [None]:
from pathlib import Path
import json
import uuid

from openmc_model import find_tbr_hcpb
from plot_sampling_coordinates import plot_simulation_results, read_in_data
from plot_interpolated_results import plot_interpolated_results


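def find_tbr(x):
    """Run one simulation at the point x and return its TBR.

    x is the (breeder percentage, Li6 enrichment) pair chosen by the learner.
    """
    breeder_percent_in_breeder_plus_multiplier_ratio, blanket_breeder_li6_enrichment = x

    result = find_tbr_hcpb(breeder_percent_in_breeder_plus_multiplier_ratio,
                           blanket_breeder_li6_enrichment)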

    result["sample"] = "adaptive"

    filename = "outputs/" + str(uuid.uuid4()) + ".json"
    Path(filename).parent.mkdir(parents=True, exist_ok=True)
    with open(filename, mode="w", encoding="utf-8") as f:
        json.dump(result, f, indent=4)

    return result["tbr"]


number_of_simulations = 16


print("running simulations with adaptive sampling")

learner = adaptive.Learner2D(find_tbr, bounds=[(0, 100), (0, 100)])

# stop once the requested number of simulations has been reached
runner = adaptive.Runner(learner, ntasks=1, goal=lambda learner: learner.npoints >= number_of_simulations)

# alternatively, the goal can be an acceptable loss instead of a point count
# runner = adaptive.Runner(learner, ntasks=1, goal=lambda learner: learner.loss() < 0.01)

runner.live_info()
# runner.live_plot()

runner.ioloop.run_until_complete(runner.task)

print("results saved in outputs folder")

results_df = read_in_data()
filtered_results_df = results_df[results_df["sample"] == "adaptive"]

In [None]:
# plot results

plot_simulation_results(filtered_results_df)

In [None]:
# plot interpolated results

plot_interpolated_results(filtered_results_df)

As mentioned, the most important parts of a data trend are (usually) the regions where the data changes rapidly as a function of the parameter values. In our example, these are the regions where TBR changes rapidly with enrichment and breeder percentage; we do not want to sample excessively in regions where TBR barely changes. As the plots show, the parameter space is densely sampled where TBR is changing most rapidly and sparsely sampled where it changes negligibly.

The main advantage of adaptive sampling is that it is an efficient technique for sampling a parameter space with an unknown distribution. By iteratively fitting the data and performing additional simulations, we can usually determine an accurate distribution across the parameter space with fewer simulations than non-adaptive sampling techniques need. It is not a perfect solution, however. Over-sampling can still take place if we do not specify when to stop, so in practice we would evaluate the data fit as samples are added and stop once it reaches an acceptable uncertainty. Adaptive sampling can also miss regions with less prominent trends, in which case we do not get the whole picture across the parameter space.
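
One way to guard against over-sampling is to stop on whichever comes first: an acceptable loss or a fixed simulation budget. The sketch below assumes only what the goal lambdas earlier in this notebook already rely on, namely a learner with an `npoints` attribute and a `loss()` method. `make_goal` and `StandInLearner` are hypothetical names introduced for illustration, and the threshold values are arbitrary.

```python
def make_goal(max_points, loss_tolerance):
    """Return a goal callable that stops on a point budget or a loss tolerance."""

    def goal(learner):
        budget_spent = learner.npoints >= max_points
        accurate_enough = learner.loss() < loss_tolerance
        return budget_spent or accurate_enough

    return goal


class StandInLearner:
    """Hypothetical stand-in exposing only npoints and loss(), for demonstration."""

    def __init__(self, npoints, loss_value):
        self.npoints = npoints
        self._loss_value = loss_value

    def loss(self):
        return self._loss_value


goal = make_goal(max_points=40, loss_tolerance=0.01)
print(goal(StandInLearner(npoints=10, loss_value=0.5)))    # False: keep sampling
print(goal(StandInLearner(npoints=10, loss_value=0.005)))  # True: loss tolerance reached
print(goal(StandInLearner(npoints=40, loss_value=0.5)))    # True: budget spent
```

With a real learner, the returned callable could be passed as the `goal` argument of `adaptive.Runner` in place of the lambda used above.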

Overall, adaptive sampling allows computational time to be focused on the most important parts of a distribution and is a highly efficient way of sampling a parameter space and, therefore, performing simulations.

To cover this parameter space more accurately, more samples than the default `number_of_simulations = 16` would be required.