Clearly define meaning of the various iteration types #103
Continuing here the discussion started in the closely related #93. To sum up the current situation: the core of the API is the reverse-communication interface implemented via the Method interface. The problem is that the iteration types have a broad and not yet clearly defined meaning. Also, since Iterate() returns both IterationType and EvaluationType, there are many combinations whose meaning we should define, document and deal with. We should also define and document the order in which things happen, e.g., that evaluations (should) take place before anything else.

One possibility to avoid this and make things clearer is for Init() and Iterate() to return only the IterationType. Each IterationType would have a much narrower scope; in other words, one iteration type would mean one action for us to do. So instead of defining what MinorIteration and SubIteration mean, I would get rid of them. Moreover, IterationType and EvaluationType always appear together (except for Linesearcher), so it makes sense to merge them into one type and thus avoid their possible combinations.

It seems that there do not need to be many iteration types: MajorIteration (check convergence and increase the iteration count), three evaluation types and (possibly) an iteration type that would instruct us to store the location as a candidate for the next convergence check (especially for Nelder-Mead). ORing of iteration types would not be allowed, except perhaps for the three evaluations. This would keep things simple and clear.

I think that the above would require us to loosen/drop the requirement that Methods cannot modify the location, but maybe that is not such a big deal. Location would simply become the one communication point between the client and Method. The main optimization loop would then become something like (just a sketch):

```go
iterType, err := method.Init(loc, xNext)
for {
	switch iterType {
	case FuncEvaluation, GradEvaluation, HessEvaluation:
		// Perform the iterType evaluation of the function at xNext and
		// store the result in loc.
		evaluate(p, iterType, xNext, loc, stats) // We could even remove evaluate().
	case MajorIteration:
		stats.MajorIterations++
		status = checkConvergence(optLoc, iterType, stats, settings)
		if status != NotTerminated {
			return
		}
	case StoreCandidate:
		// Update optLoc?
	default:
		// Should not happen.
	}
	// Check runtime and evaluation limits, record, check status, ...
	// Find the next location (stored in-place into xNext).
	iterType, err = method.Iterate(loc, xNext)
}
```

Comments, @btracey ?
Thanks for the recap. I'd first like to add some things to help the discussion.

One of the goals of the package is to present a consistent interface to optimization routines. If one looks at the set of routines available in most optimization packages (say, numpy), they all accept slightly different arguments, have different meanings for some of the variables, and have different diagnostic capabilities. As a specific example, in optimize we allow the user to specify a maximum number of gradient evaluations. We can either hope that all optimization methods count this value properly, or we can make it impossible not to count it properly. We choose the latter by not allowing the optimization method to call Gradient directly. As an added bonus, this makes coding an optimization method somewhat easier, as the implementer just has to focus on the specific needs of the method and does not have to deal with all of the overhead code (everything in local.go).

For this reverse-communication scheme to work, it needs to be flexible despite the diversity in optimization routines. #93 highlights some of the difficulties, and there are more basic ones about how an "Iteration" is defined.

I think the proposal is a really good idea. We can make IterationType an int64, which gives us 64 bits to play with. There are a few additional bits we may need, such as ConstraintEvaluation for Func, Grad, and Hess, but we are still at way fewer than 64. I would suggest that MajorIteration mean "update OptLoc and check for convergence". Having Local keep track of OptLoc has a number of difficulties, #93 being one example, and removing the function noise check from StrongWolfe (etc.) is another.

I still think that we need to count and check the number of evaluations at every step. I wonder if we should provide some cleanup mechanism, where if the FunctionEvaluation limit has been reached, the optimizer can try to fill in the rest if it wants (gradient, etc.). This would allow it to update OptLoc a final time.

Another huge upside of this proposal is the elimination of Minor and Sub iterations. What would you say the behavior should be regarding Recorder and problem.Status? Only at major iterations?
Do I interpret this correctly as your agreement with using Location for two-way communication? For evaluations we get the location in loc.X, evaluate, and fill in the requested field, and at MajorIteration we get a complete location to store? In other words, do you agree with changing Method to:

```go
type Method interface {
	Init(loc *Location) (IterationType, error)
	Iterate(loc *Location) (IterationType, error)
	Needs() // ...
}
```

?
Agreed. I did not intend to remove them.
The mechanism could be something like
I think Recorder should be called at every iteration; the records are easy to filter anyway. Do you agree with allowing the evaluating IterationTypes to be combined with the binary |?
Yes, I think that's fine.
I like this idea. I would say that if StrictLimits == true, then we stop immediately, and whatever the last major iteration was is returned. If == false, we quit at the next MajorIteration. I don't know that it fixes the other problem, though (if the optimizer has found a better location but hasn't communicated it).
The location could give a nil x, and the optimization routine could choose to not do anything. Maybe this worry is not worth the added complexity. Most line searches should take only a few function evaluations anyway, so we shouldn't be missing much.
I agree with overloading them into one integer type. I don't think they should be combined, though. We have two modes: "Evaluate", where we do function/gradient/Hessian/whatever evaluations (and those may be combined with |), and "Update", where we read in from the location and do the convergence checks. Update and Evaluate are distinct steps. I think we only need one Update step now, correct? (The MajorIteration, MinorIteration, and SubIteration distinctions are meaningless?)
Good idea to continue to the next MajorIteration. Not sure, but I feel like Final() should not be necessary.
Yes, I was referring only to those IterationTypes that evaluate something as candidates for combining with |.
So at the moment it could look like:

```go
type IterationType uint64 // Is the name still ok?

const (
	NoIteration IterationType = 0
	// InitialIteration is maybe unnecessary. It is only to be passed
	// to Recorders from Local(); Init() and Iterate() must not return it.
	InitialIteration = 1 << (iota - 1)
	MajorIteration
	FuncEvaluation
	GradEvaluation
	HessEvaluation
)
```
... and how that affects us maintaining OptLoc and testing for convergence.