allow users to specify schedule for re-classification of all points #175
Labels: documentation, enhancement, feature_request, priority
Feature description
If the GPR predictions are poor in the first iterations, classification errors can occur. We try to warn users via cross-validation, but this is not practical in all cases. One issue I have now seen a few times is that the code discards a point (correctly, given the model predictions at that point) that then ends up dominating some points that have been classified as Pareto optimal.
This can happen because discarded points are never considered again. That makes a lot of sense for efficiency, but it can lead to such errors.
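To make the failure mode concrete, here is a minimal sketch of a check that flags discarded points dominating points labelled Pareto optimal. It assumes maximization of all objectives and that the true objective values are available. The helper names `dominates` and `find_violations` are hypothetical and not part of the code base, and the numbers are made up for illustration.

```python
from typing import List, Sequence, Tuple


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if `a` is at least as good as `b` in every objective and
    strictly better in at least one (assumes maximization)."""
    pairs = list(zip(a, b))
    return all(x >= y for x, y in pairs) and any(x > y for x, y in pairs)


def find_violations(
    pareto: Sequence[Sequence[float]],
    discarded: Sequence[Sequence[float]],
) -> List[Tuple[int, int]]:
    """Return (discarded_index, pareto_index) pairs where a discarded
    point dominates a point that was labelled Pareto optimal."""
    return [
        (i, j)
        for i, d in enumerate(discarded)
        for j, p in enumerate(pareto)
        if dominates(d, p)
    ]


# Made-up objective values, for illustration only.
pareto = [(1.0, 3.0), (2.0, 2.0)]
discarded = [(2.5, 2.1), (0.5, 0.5)]

violations = find_violations(pareto, discarded)
print(violations)
```

Any pair reported here means a point was discarded even though it is better than a "Pareto optimal" point, which is exactly the error described above.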
To fix this, we could allow users to specify a schedule for reclassifying the full space. By default, I'd run this at the end.
Implementation idea
I would use the "scheduling function" mechanism we also use for hyperparameter optimization. I think the default scheduling function should be exponentially spaced and should also run the reclassification at the end (when no unclassified points are left).
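A rough sketch of what such a default could look like, assuming a scheduling function receives the current iteration number and returns a boolean. The names `exponential_schedule` and `should_reclassify`, the `base` parameter, and the `num_unclassified` argument are hypothetical and only meant to illustrate the idea.

```python
def exponential_schedule(iteration: int, base: int = 2) -> bool:
    """Return True at iterations 1, base, base**2, ... (hypothetical helper)."""
    if iteration < 1 or base < 2:
        return False
    # Divide out factors of `base`; only exact powers of `base` reduce to 1.
    while iteration % base == 0:
        iteration //= base
    return iteration == 1


def should_reclassify(iteration: int, num_unclassified: int, base: int = 2) -> bool:
    """Reclassify on the exponential schedule, and always at the end,
    i.e. when no unclassified points are left."""
    return num_unclassified == 0 or exponential_schedule(iteration, base)


# Iterations in the first 20 at which the schedule alone would fire.
scheduled = [i for i in range(1, 21) if exponential_schedule(i)]
print(scheduled)
```

Exponential spacing keeps the number of full reclassifications small in long runs, and the `num_unclassified == 0` condition covers the "run it at the end" default.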
For the "reclassification", I'd simply delete all existing classification markers (except the ones for "sampled") and then run the `_classify` function on it.
We certainly need to explain this "issue" in the docs.
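The reset step of the reclassification could be sketched as follows. This assumes the classification state is stored as boolean masks, one entry per design point. The mask names and the `reset_classification` helper are hypothetical, and the subsequent call to `_classify` is omitted because its signature is not given here.

```python
from typing import List


def reset_classification(
    pareto_optimal: List[bool],
    discarded: List[bool],
    unclassified: List[bool],
) -> None:
    """Clear all classification markers in place so that every point is a
    candidate again. The `sampled` mask is deliberately not touched."""
    n = len(unclassified)
    pareto_optimal[:] = [False] * n
    discarded[:] = [False] * n
    unclassified[:] = [True] * n


# Made-up state for four design points.
pareto_optimal = [True, False, False, False]
discarded = [False, True, True, False]
unclassified = [False, False, False, True]
sampled = [True, False, False, False]

reset_classification(pareto_optimal, discarded, unclassified)
# At this point the classification routine would be rerun on the full space.
```

Keeping the "sampled" markers means measurements that were already made are retained, so only the classification is recomputed.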
I think this deserves thorough benchmarking. @byooooo, do you have any good datasets in mind that we could use for testing?