We have seen two closely related methods: the Probability of Improvement and the Expected Improvement.
-
- In the figure below, we make a scatter plot showing the policies' acquisition functions evaluated on different points in the domain (each dot is a point in the domain). In this plot, our train set consists of a single point (0.5, f(0.5)).
-
-
- We can see that the \alpha_{EI} and \alpha_{PI} reach a maximum of 0.3 and around 0.44, respectively. Choosing a point with low \alpha_{PI} and high \alpha_{EI} is high risk (probability of improvement is low) and high reward (expected improvement is high).
- If multiple points have the same (\alpha_{EI}), we should prioritize the point with lesser risk (higher \alpha_{PI}). Similarly, when the risks are the same (same \alpha_{PI}) for any two points, we should choose the point with greater reward (higher \alpha_{EI}).
+ The scatter plot above shows the policies' acquisition functions evaluated on different points; each dot corresponds to a point in the search space. The training set used while making the plot consists of only a single observation, (0.5, f(0.5)).
+ We see that \alpha_{EI} and \alpha_{PI} reach maxima of 0.3 and around 0.44, respectively. Choosing a point with low \alpha_{PI} and high \alpha_{EI} translates to high risk (since the Probability of Improvement is low) and high reward (since the Expected Improvement is high).
+ If multiple points have the same \alpha_{EI}, we should prioritize the point with lower risk (higher \alpha_{PI}). Similarly, when the risks are the same (same \alpha_{PI}) for any two points, we should choose the point with greater reward (higher \alpha_{EI}).
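For concreteness, both acquisition values and the tie-breaking rule above can be sketched using the standard closed forms for a Gaussian posterior (mean \mu and standard deviation \sigma at each candidate point). This is an illustrative sketch, not the code behind the figure; the function names and the `eps` improvement margin are assumptions:

```python
import numpy as np
from scipy.stats import norm

def acquisition_values(mu, sigma, f_best, eps=0.0):
    """alpha_PI and alpha_EI at candidate points, given a Gaussian
    posterior with mean `mu` and std `sigma` at each point."""
    mu = np.asarray(mu, dtype=float)
    sigma = np.asarray(sigma, dtype=float)
    z = (mu - f_best - eps) / np.maximum(sigma, 1e-12)  # guard sigma == 0
    alpha_pi = norm.cdf(z)  # risk measure: P(f(x) > f_best + eps)
    alpha_ei = (mu - f_best - eps) * norm.cdf(z) + sigma * norm.pdf(z)
    return alpha_pi, alpha_ei

def pick_next(alpha_pi, alpha_ei, tol=1e-9):
    """Maximize alpha_EI; among (near-)ties, prefer higher alpha_PI."""
    tied = np.flatnonzero(alpha_ei >= alpha_ei.max() - tol)
    return tied[np.argmax(alpha_pi[tied])]
```

With a single observation, f_best is simply f(0.5); `pick_next` then encodes the rule above: reward (\alpha_{EI}) first, risk (\alpha_{PI}) as the tie-breaker.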