# Gaussian Process Setup
We now construct an environment for the GP-UCB algorithm used in the following notebook. Specifically, we define a function $f$ on the interval $[x_{\text{min}}, x_{\text{max}}]$, which also serves as the action space, together with the variance $\sigma^2$ of the noise on observations of $f$. When playing action $x\in [x_{\text{min}},x_{\text{max}}]$, we observe $f(x)+\epsilon$, where $\epsilon \sim N(0,\sigma^2)$. For `dim` greater than 1, the action space is the box $[x_{\text{min}},x_{\text{max}}]^{\text{dim}}$. The following class implements this environment and stores the history of actions $x$ and observations $y$.
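
As a quick illustration of the observation model above, here is a minimal sketch that draws repeated noisy observations at a single action and checks that their empirical mean and variance are close to $f(x)$ and $\sigma^2$. The helper `observe`, the sine objective, and the chosen values of `xmin`, `xmax`, and `noise_var` are assumptions made for illustration only. The sketch uses NumPy's random generator rather than `scipy.stats.norm`, which the class below uses.

```python
import numpy as np

def observe(f, x, noise_var, rng):
    """Return one noisy observation f(x) + eps with eps ~ N(0, noise_var)."""
    # rng.normal takes the standard deviation, hence the square root
    return f(x) + rng.normal(0.0, np.sqrt(noise_var))

rng = np.random.default_rng(0)        # seeded for reproducibility
f = lambda x: np.sin(3.0 * x)         # hypothetical objective function
xmin, xmax = 0.0, 2.0                 # hypothetical action space
noise_var = 0.25                      # sigma^2
x = 0.5                               # an action inside [xmin, xmax]

# repeated plays of the same action
samples = np.array([observe(f, x, noise_var, rng) for _ in range(20000)])
sample_mean = samples.mean()          # should be close to f(x)
sample_var = samples.var()            # should be close to noise_var
print(sample_mean, f(x), sample_var)
```

Averaging many observations of the same action recovers $f(x)$, which is why the expected reward of an action is simply $f(x)$.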

In [1]:
# import the required modules
import numpy as np
from scipy.stats import norm as normal
from scipy.optimize import fminbound, minimize

In [2]:
class GPEnv:
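    def __init__(self, xmin, xmax, f, noisevar, dim=1):
        self.var = noisevar  # variance of the observation noise
        self.xmin = xmin
        self.xmax = xmax
        self.f = f
        self.dim = dim  # always stored, including when dim == 1
        # history of played actions and observed rewards
        self.xhist = np.empty(0)
        self.yhist = np.empty(0)
        # locate the maximizer of f numerically by minimizing -f
        # (both routines are local optimizers)
        if dim == 1:
            self.argmax = fminbound(lambda x: -self.f(x), xmin, xmax)
        else:
            # start from the centre of the box [xmin, xmax]^dim
            x0 = np.repeat((xmin + xmax) / 2, dim)
            self.argmax = minimize(lambda x: -self.f(x), x0, bounds=((xmin, xmax),) * dim).x
        self.maxrew = self.f(self.argmax)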
        
    # expected (noise-free) reward for action x
    def meanrew(self, x):
        return self.f(x)
    
    # sample a noisy observation y = f(x) + eps for action x
    def sample(self, x):
        y = self.meanrew(x) + normal(0.0, np.sqrt(self.var)).rvs()
        return x, y
    
    # append the action and observation to the history
    # (np.append flattens its inputs, so for dim > 1 xhist stores coordinates consecutively)
    def update(self,x,y):
        self.xhist = np.append(self.xhist,x)
        self.yhist = np.append(self.yhist,y)
    
    # play action x: sample an observation, record it, and return it
    def play(self,x):
        x,y = self.sample(x)
        self.update(x,y)
        return y

    # clear the history at the start of each replication
    def reset(self):
        self.xhist = np.empty(0)
        self.yhist = np.empty(0)
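
The constructor above finds the maximizer of $f$ by minimizing $-f$, using `fminbound` in one dimension and `minimize` with box bounds otherwise. The following standalone sketch shows both calls on simple concave test functions whose maximizers are known. The functions `f1` and `f2` and the bounds are hypothetical choices for illustration. Both routines are local optimizers, so for a multimodal $f$ the value stored in `argmax` may be a local rather than a global maximizer.

```python
import numpy as np
from scipy.optimize import fminbound, minimize

xmin, xmax = 0.0, 2.0                       # hypothetical bounds

# one-dimensional case: maximize f1 by minimizing -f1 on [xmin, xmax]
f1 = lambda x: -(x - 1.2) ** 2 + 3.0        # hypothetical objective, maximum at x = 1.2
argmax_1d = fminbound(lambda x: -f1(x), xmin, xmax)
max_1d = f1(argmax_1d)

# multi-dimensional case: bounded minimization started at the centre of the box
dim = 2
f2 = lambda x: -np.sum((np.asarray(x) - 0.7) ** 2)   # maximum at (0.7, 0.7)
x0 = np.repeat((xmin + xmax) / 2, dim)
result = minimize(lambda x: -f2(x), x0, bounds=((xmin, xmax),) * dim)
argmax_nd = result.x

print(argmax_1d, max_1d, argmax_nd)
```

When $f$ may have several local maxima, one option is to evaluate $f$ on a coarse grid first and start the optimizer from the best grid point.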