## System Description
1. We have a set of covalent organic frameworks (COFs) from a database. Each COF is characterized by a feature vector $$x_{COF} \in X \subset \mathbb{R}^d$$ where $d=14$.


2. We have **two different types** of simulations that calculate **the same material property, $S_{Xe/Kr}$** (the Xe/Kr selectivity). Therefore, we have a single-task (single-objective), multi-fidelity problem: find the material with the optimal selectivity.
    1. low-fidelity  = Henry Coefficient calculation - MC integration - cost=1
    2. high-fidelity = GCMC mixture simulation - 80:20 (Kr:Xe) at 298 K and 1.0 bar - cost=30


3. We will initialize the system with *two* COFs at both fidelities in order to initialize the Covariance Matrix.
    - The first COF will be the one closest to the center of the normalized feature space
    - The second COF will be chosen at random


4. Each surrogate model will **only train on data acquired at its level of fidelity** (heterotopic data), so in general $$X_{lf} \neq X_{hf}, \quad X_{lf}, X_{hf} \subset X$$
    1. We are using the augmented expected improvement (EI) acquisition function described [here](https://link.springer.com/content/pdf/10.1007/s00158-005-0587-0.pdf)


5. **Kernel model**:
    1. We need a Gaussian Process (GP) that will give a *correlated output for each fidelity*, i.e., we need a vector-valued kernel
    2. Given the *cost aware* acquisition function, which imposes a fidelity hierarchy, we anticipate the number of training points at each fidelity *will not* be equal (asymmetric scenario) $$n_{lf} > n_{hf}$$
        - perhaps we can force the symmetric case, $n_{lf} = n_{hf} = n$, if we can include `missing` or `empty` entries in the training sets.


Note: even though the data are heterotopic and the scenario is asymmetric (both consequences of the hierarchical, multi-fidelity setup), we can still use a symmetric multi-output GP.
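
One common way to build a vector-valued kernel is the intrinsic coregionalization model (ICM), where the covariance between fidelity $i$ at $x$ and fidelity $j$ at $x'$ is $B_{ij}\,k(x, x')$. The sketch below is only an illustration of that idea in plain Python: the matrix `B`, the RBF base kernel, and the function names are assumptions, and this is not the kernel that `SingleTaskMultiFidelityGP` uses (that model treats the fidelity as an extra input column instead).

```python
import math

def rbf(x, y, lengthscale=1.0):
    """Squared-exponential kernel between two feature vectors."""
    return math.exp(-0.5 * math.dist(x, y) ** 2 / lengthscale ** 2)

# Hypothetical 2x2 coregionalization matrix (index 0 = lf, 1 = hf).
# It must be symmetric positive semi-definite; the off-diagonal entry
# sets how strongly the two fidelities are correlated.
B = [[1.0, 0.8],
     [0.8, 1.0]]

def icm_kernel(x, i, y, j):
    """Covariance between fidelity i at x and fidelity j at y."""
    return B[i][j] * rbf(x, y)

x1, x2 = [0.2, 0.4], [0.3, 0.1]
print(icm_kernel(x1, 0, x1, 1))  # 0.8: same COF, different fidelities
```

Because the kernel is defined for any pair $(x, i)$, $(x', j)$, the covariance matrix can be assembled even when the low- and high-fidelity training sets contain different COFs, which is the heterotopic case described above.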

### Strategy
1. Implement `SingleTaskMultiFidelityGP`
2. Get augmented EI working
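
As a starting point for step 2, the augmented EI in the linked paper is, roughly, the expected improvement at the highest fidelity scaled by factors for the correlation between fidelities, the evaluation noise, and the cost ratio. The sketch below keeps only the correlation and cost-ratio factors and omits the noise factor, so treat it as a simplification rather than the paper's exact definition. The function names, the `corr_with_hf` argument, and the example numbers are assumptions; the costs (1 and 30) come from the system description. It uses `math.erf` instead of `scipy.stats.norm` only to stay dependency-free.

```python
import math

def expected_improvement(mu, sigma, best):
    """Closed-form EI for maximization under a Gaussian posterior N(mu, sigma^2)."""
    if sigma <= 0.0:
        return max(mu - best, 0.0)
    z = (mu - best) / sigma
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return (mu - best) * cdf + sigma * pdf

# Simulation costs from the system description (arbitrary units).
COST = {"lf": 1.0, "hf": 30.0}

def cost_scaled_ei(mu_hf, sigma_hf, best, corr_with_hf, fidelity):
    """High-fidelity EI scaled by fidelity correlation and by cost_hf / cost."""
    ei = expected_improvement(mu_hf, sigma_hf, best)
    return ei * corr_with_hf * (COST["hf"] / COST[fidelity])

# Hypothetical posterior at one candidate COF.
score_lf = cost_scaled_ei(1.2, 0.5, 1.0, corr_with_hf=0.8, fidelity="lf")
score_hf = cost_scaled_ei(1.2, 0.5, 1.0, corr_with_hf=1.0, fidelity="hf")
print(score_lf > score_hf)  # True: 0.8 * 30 > 1.0 * 1 for the same EI
```

With these numbers the cheap simulation wins because its cost advantage outweighs its weaker correlation with the high-fidelity output; as the assumed correlation drops toward zero, the preference flips.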


In [1]:
import torch
import gpytorch
from botorch.models import SingleTaskMultiFidelityGP
from gpytorch.mlls import ExactMarginalLogLikelihood
from botorch import fit_gpytorch_model
from scipy.stats import norm
import math 
import h5py
import matplotlib.pyplot as plt
import numpy as np
import os

In [2]:
from botorch.models.transforms.outcome import Standardize
from botorch.utils.transforms import unnormalize, standardize
from botorch.utils.sampling import draw_sobol_samples

In [3]:
###
#  import data
###
f = h5py.File("targets_and_normalized_features.jld2", "r")

X = torch.from_numpy(np.transpose(f["X"][:]))
henry_y = torch.from_numpy(np.transpose(f["henry_y"][:]))
gcmc_y  = torch.from_numpy(np.transpose(f["gcmc_y"][:]))
print("data - \nX:", X.shape)
print("henry_y:", henry_y.shape)
print("gcmc_y: ", gcmc_y.shape)

###
#  construct initial inputs
#  1. get initial points
#  2. standardize outputs
#  3. stack into tensor
###
nb_COFs = henry_y.shape[0] # total number of COFs data points 
nb_COFs_initialization = 7 # number of COFs to initialize with
ids_acquired = np.random.choice(np.arange((nb_COFs)), size=nb_COFs_initialization, replace=False)

fidelities = torch.tensor([0.1, 1.0]) # assign fidelities (arbitrary?)
fid_acquired = torch.randint(2, (nb_COFs_initialization, 1))
train_f = fidelities[fid_acquired] # selected fidelity of training points

train_x = X[ids_acquired, :] # torch.Size([nb_COFs_initialization, 14])
train_x_full = torch.cat((train_x, train_f), dim=1) # last col is associated fidelity


y = torch.zeros((ids_acquired.shape[0], 1), dtype=torch.float64) # target at the acquired fidelity of each COF
for i, fid in enumerate(fid_acquired):
    if fid == 0:
        y[i][0] = henry_y[ids_acquired[i]]
    else:
        y[i][0] = gcmc_y[ids_acquired[i]]

data - 
X: torch.Size([608, 14])
henry_y: torch.Size([608])
gcmc_y:  torch.Size([608])
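
Step 3 of the system description calls for seeding with the COF closest to the center of the normalized feature space plus one COF chosen at random, whereas the cell above draws all initial COFs at random. A minimal sketch of the described selection follows, in plain Python on a toy feature matrix. It assumes the features are min-max scaled to $[0, 1]$, so the center is 0.5 in every dimension (for z-scored features the center would be the origin instead); `pick_initial_ids` and `X_toy` are hypothetical names.

```python
import math
import random

def pick_initial_ids(X, center=None, seed=None):
    """Return [id closest to the center, a different id chosen at random]."""
    d = len(X[0])
    if center is None:
        center = [0.5] * d  # assumption: features are min-max scaled to [0, 1]
    dists = [math.dist(x, center) for x in X]
    id_center = min(range(len(X)), key=dists.__getitem__)
    rng = random.Random(seed)
    id_random = rng.choice([i for i in range(len(X)) if i != id_center])
    return [id_center, id_random]

# Toy feature matrix with 4 "COFs" and d = 2 features.
X_toy = [[0.1, 0.9], [0.45, 0.55], [0.9, 0.2], [0.0, 0.0]]
ids = pick_initial_ids(X_toy, seed=0)
print(ids[0])  # 1, since row 1 is nearest to (0.5, 0.5)
```

In the notebook itself, the same idea could be applied to `X` to build `ids_acquired`, with both fidelities evaluated at the two selected COFs.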


**Construct Model**

In [55]:
from botorch.test_functions.multi_fidelity import AugmentedHartmann


problem = AugmentedHartmann(negate=True).to()
fidelities = torch.tensor([0.5, 0.75, 1.0])

In [6]:
def generate_initial_data(n=16):
    # generate training data
    train_x = torch.rand(n, 6) # torch.Size([n, 6])
    # sample an index over all available fidelities (a hard-coded 2 would never pick the last of three)
    train_f = fidelities[torch.randint(len(fidelities), (n, 1))] # torch.Size([n, 1]), sampled fidelities of training data
    train_x_full = torch.cat((train_x, train_f), dim=1) # torch.Size([n, 7]), last col is associated fidelity
    train_obj = problem(train_x_full).unsqueeze(-1) # torch.Size([n, 1]), add output dimension
    return train_x_full, train_obj
    
def initialize_model(train_x, train_obj):
    # define a surrogate model suited for a "training data"-like fidelity parameter
    # in dimension 6, as in [2]
    model = SingleTaskMultiFidelityGP(
        train_x, 
        train_obj, 
        outcome_transform=Standardize(m=1),
        data_fidelity=6
    )   
    mll = ExactMarginalLogLikelihood(model.likelihood, model)
    return mll, model

In [7]:
train_x, train_obj = generate_initial_data(n=16)
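
The model above is built with `outcome_transform=Standardize(m=1)`, which rescales the training targets to zero mean and unit standard deviation before fitting. The sketch below shows that computation on a toy list in plain Python. The use of the sample (n-1) standard deviation is an assumption, and `standardize_values` and `y_toy` are hypothetical names.

```python
from statistics import mean, stdev

def standardize_values(values):
    """Shift to zero mean and scale to unit sample standard deviation."""
    mu, sd = mean(values), stdev(values)
    return [(v - mu) / sd for v in values]

y_toy = [2.0, 4.0, 6.0]  # hypothetical target values
print(standardize_values(y_toy))  # [-1.0, 0.0, 1.0]
```

The same rescaling is what the "standardize outputs" step listed in the data-import cell refers to, where the targets are selectivities rather than test-function values.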