## Multi-fidelity optimization

Some experiments are very expensive. They can be supplemented by simpler, cheaper alternatives, such as approximate experiments or high-throughput calculations. These alternatives give measurements of lower *fidelity*, and the planner can use them to guide the optimization at high fidelity.

The same idea applies to virtual screening. Expensive quantum chemistry calculations can be supplemented by faster semi-empirical methods. Another example is the virtual screening of compounds for drug activity, where high-fidelity free-energy perturbation calculations are approximated by faster, lower-fidelity docking calculations.

In [None]:
import json
import pickle
import numpy as np
import pandas as pd
from copy import deepcopy

from olympus.datasets import Dataset
from olympus.objects import (
    ParameterContinuous,
    ParameterDiscrete,
    ParameterCategorical,
    ParameterVector,
)
from olympus.campaigns import ParameterSpace, Campaign

from atlas.planners.multi_fidelity.planner import MultiFidelityPlanner      # specially designed planner for multi-fidelity optimization



For this example, we will screen perovskites for their bandgap. There are two fidelities of measurement: one using GGA (*low*) and one using HSE06 (*high*). You can set the cost associated with each fidelity; here we treat a GGA query as 10 times cheaper than an HSE06 query.

In [None]:
COST_BUDGET = 50            # this time the budget is a cost
NUM_INIT_DESIGN = 10
NUM_CHEAP = 8               # one high-fidelity query every NUM_CHEAP iterations (ie. 7:1 low/high fidelity)
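
To get a feel for what these settings imply, the following sketch counts how many queries of each fidelity fit in the budget. It is not part of atlas: the `COST_PER_FIDELITY` mapping and the `fidelity_schedule` helper are hypothetical names, and the cost model (a query costs its fidelity value) is an assumption based on GGA being 10 times cheaper than HSE06.

```python
# Assumed cost model: a query costs its fidelity value (GGA = 0.1, HSE06 = 1.0).
COST_BUDGET = 50
NUM_CHEAP = 8
COST_PER_FIDELITY = {0.1: 0.1, 1.0: 1.0}  # hypothetical mapping, not part of atlas


def fidelity_schedule(cost_budget, num_cheap):
    """Return the list of fidelities queried before the budget is exhausted."""
    schedule = []
    cost = 0.0
    iter_ = 0
    while cost < cost_budget:
        # one high-fidelity query every `num_cheap` iterations, low-fidelity otherwise
        s = 1.0 if iter_ % num_cheap == 0 else 0.1
        schedule.append(s)
        cost += COST_PER_FIDELITY[s]
        iter_ += 1
    return schedule


schedule = fidelity_schedule(COST_BUDGET, NUM_CHEAP)
num_high = schedule.count(1.0)
num_low = schedule.count(0.1)
print(f'{len(schedule)} queries: {num_high} high-fidelity, {num_low} low-fidelity')
```

Under this assumed cost model, most of the queries are cheap GGA calculations, while most of the budget is still spent on the HSE06 ones.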

Here we create an additional fidelity parameter `s`, which can only take the permitted fidelity values. The `MultiFidelityPlanner` is allowed to vary this parameter, so the optimization runs over the original parameters plus this constrained *fidelity* parameter.

In [25]:
dataset = Dataset(kind='perovskites')

# build parameter space
param_space = ParameterSpace()

# fidelity param
param_space.add(ParameterDiscrete(name='s', options=[0.1, 1.0], low=0.1, high=1.0))
for param in dataset.param_space:  # add perovskite component parameters ('organic', 'cation', and 'anion')
    param_space.add(param)


In [None]:
# lower-fidelity data calculated using GGA is available in the examples folder,
# so we load it here to create a new function for measurements
# fill in the ATLAS_PATH
ATLAS_PATH = '.'
with open(f'{ATLAS_PATH}/examples/multi_fidelity/perovskites/lookup/lookup_table.pkl', 'rb') as f:
    LOOKUP = pickle.load(f)

def measure(params, s):
    # look up the bandgap of a perovskite at fidelity s
    # high-fidelity (s = 1.0) is hse06, low-fidelity (s = 0.1) is gga
    entry = LOOKUP[params.organic.capitalize()][params.cation][params.anion]
    if s == 1.0:
        measurement = np.amin(entry['bandgap_hse06'])
    elif s == 0.1:
        measurement = np.amin(entry['bandgap_gga'])
    else:
        # fail loudly instead of returning an undefined measurement
        raise ValueError(f'unknown fidelity: {s}')
    return measurement
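
The lookup logic can be tried without the real table. The sketch below uses a hypothetical one-entry table with the same nesting (`organic -> cation -> anion`); the names `TOY_LOOKUP` and `toy_measure` and all bandgap values are placeholders for illustration, not real GGA or HSE06 results. It assumes, as the use of `np.amin` suggests, that each entry may hold several values per fidelity, of which the lowest is reported.

```python
from types import SimpleNamespace

import numpy as np

# Hypothetical one-entry table that mimics the nesting used by `measure`;
# the bandgap values are placeholders, not real GGA/HSE06 results.
TOY_LOOKUP = {
    'Organic_a': {'cation_a': {'anion_a': {
        'bandgap_hse06': [2.0, 1.5],
        'bandgap_gga': [1.2, 1.0],
    }}}
}


def toy_measure(params, s):
    """Same lookup logic as `measure`, but on the toy table."""
    entry = TOY_LOOKUP[params.organic.capitalize()][params.cation][params.anion]
    key = {1.0: 'bandgap_hse06', 0.1: 'bandgap_gga'}[s]
    return np.amin(entry[key])


# stand-in for the sample object returned by the planner
params = SimpleNamespace(organic='organic_a', cation='cation_a', anion='anion_a')
low = toy_measure(params, 0.1)    # minimum of the GGA values
high = toy_measure(params, 1.0)   # minimum of the HSE06 values
print(low, high)
```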


In [None]:
campaign = Campaign()
campaign.set_param_space(param_space)

planner = MultiFidelityPlanner(
    goal='minimize',
    init_design_strategy='random',
    num_init_design=NUM_INIT_DESIGN,
    use_descriptors=True,
    batch_size=1,
    acquisition_optimizer_kind='pymoo',     # this is required
    fidelity_params=0,                      # this dimension is the fidelity parameter (we use the first one)
    fidelities=[0.1, 1.],                   # these are the possible fidelities (GGA = 0.1, and HSE = 1.0)
)

planner.set_param_space(param_space)

In [None]:
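# accumulated cost, the budget is also cost
COST = 0.

# lowest hse06 bandgap in the lookup table, used below as the stopping criterion
# (assumes every LOOKUP entry is nested organic -> cation -> anion, as in measure())
min_hse06_bandgap = min(
    np.amin(entry['bandgap_hse06'])
    for cations in LOOKUP.values()
    for anions in cations.values()
    for entry in anions.values()
)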

target_rec_measurements = []
iter_ = 0
while COST < COST_BUDGET:
    print(f'\nITER : {iter_+1}\tCOST : {COST}\n')

    # choose the fidelity of the next query: high fidelity once every
    # NUM_CHEAP iterations, low fidelity otherwise
    if iter_ % NUM_CHEAP == 0:
        planner.set_ask_fidelity(1.0)
    else:
        planner.set_ask_fidelity(0.1)

    samples = planner.recommend(campaign.observations)
    for sample in samples:
        measurement = measure(sample, sample.s)
        campaign.add_observation(sample, measurement)

        print('SAMPLE : ', sample)
        print('MEASUREMENT : ', measurement)

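        # charge the query to the budget; we assume the cost equals the
        # fidelity value (GGA = 0.1, HSE06 = 1.0), ie. GGA is 10x cheaper
        COST += float(sample.s)
        iter_ += 1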

    # do a check to see if model will find the optimal
    if campaign.num_obs > NUM_INIT_DESIGN:
        # make greedy recommendation on the target fidelity
        # use this to make a high-fidelity measurement
        rec_sample = planner.recommend_target_fidelity(batch_size=1)[0]
        rec_measurement = measure(rec_sample, rec_sample.s)
        print('REC SAMPLE : ', rec_sample)
        print('REC MEASUREMENT : ', rec_measurement)

        target_rec_measurements.append(rec_measurement)
        # kill the run if we have found the lowest hse06 bandgap
        # on the most recent high-fidelity measurement
        if rec_measurement == min_hse06_bandgap:
            print('found the min hse06 bandgap!')
            break
    else:
        target_rec_measurements.append(measurement)
        if measurement == min_hse06_bandgap and samples[0].s == 1.:
            print('found the min hse06 bandgap!')
            break
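
Once the loop has finished, `target_rec_measurements` holds one recommended measurement per iteration. A common way to summarize such a run is a best-so-far trace. The minimal sketch below shows this with made-up numbers standing in for the real list; the values are placeholders, not results of this example.

```python
import numpy as np

# Hypothetical values standing in for `target_rec_measurements`
target_rec_measurements = [2.4, 2.9, 1.8, 2.1, 1.6]

# running minimum: lowest recommended bandgap seen up to each iteration
best_so_far = np.minimum.accumulate(target_rec_measurements)
print(best_so_far)
```

Since the goal is `'minimize'`, the running minimum is the relevant trace; for a maximization goal, `np.maximum.accumulate` would be used instead.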