{{ message }}

# RFC: Redo statuses #164

Closed
opened this issue May 30, 2017 · 74 comments
Closed

# RFC: Redo statuses#164

opened this issue May 30, 2017 · 74 comments
Labels
Milestone

### mlubin commented May 30, 2017 • edited

Here is my attempt at a revision of the status system in MathProgBase (#156). Right now I'm just trying to get the concepts right and then we can worry about the implementation.

## Termination status: why did it stop?

t = is_termination_status(m, XXX) # bool

where XXX is one of:

### OK

• Success: The algorithm ran successfully and has a result. This includes cases where the algorithm converges to an infeasible point (NLP) or converges to a solution of a homogeneous self-dual problem and has a certificate of primal/dual infeasibility.
• AlmostSuccess: The algorithm almost ran successfully (e.g., to relaxed convergence tolerances) and has a result.
• InfeasibleNoResult: The algorithm stopped because it decided the problem is infeasible but does not have a result to return.
• UnboundedNoResult: The algorithm stopped because it decided the problem is unbounded but does not have a result to return. This does not include Gurobi's UNBOUNDED because that really means that it found a ray of the primal and doesn't imply feasibility of the primal.
• InfeasibleOrUnbounded: The algorithm stopped because it decided that the problem is infeasible or unbounded; no result is available. This happens during MIP presolve.

### Limits

• IterationLimit
• TimeLimit
• NodeLimit
• SolutionLimit
• MemoryLimit
• ObjectiveLimit: Gurobi USER_OBJ_LIMIT and CUTOFF, Knitro KTR_RC_UNBOUNDED
• NormLimit: includes diverging iterates Ipopt
• OtherLimit

### Problematic

• SlowProgress: The algorithm stopped because it was unable to continue making progress towards the solution. AlmostSuccess should be used if there is additional information that relaxed convergence tolerances are satisfied.
• NumericalError: Unexpected numerical error, like division by zero
• InvalidModel: The solver has determined that it cannot solve this model because it violates some assumption, e.g., PSD hessian.
• InvalidOption
• Interrupted
• OtherError

These will probably be defined as an enum or a type. The reason to not return a single value and let users check equality is that it's much easier to rename statuses or deprecate methods this way.

## What is a result tuple?

We say a result tuple is a tuple (x,y,α,β) where

• x is the variable primal vector
• y is the constraint primal vector
• α is the variable dual vector (e.g., reduced costs in LPQP or vardual in conic)
• β is the constraint dual vector

Each vector contains one component per variable or constraint. For conic, for this discussion we say each row of the A matrix is a constraint, regardless of the dimensions of the cones.

The constraint primal vector is defined as:

lb <= a'x <= ub -> primal con value is defined as a'x
lb <= a'x + x'Qx <= ub -> primal con value is defined as a'x + x'Qx
lb <= f(x) <= ub -> primal con value is defined f(x)
b-Ax \in Cone -> primal con value is defined as b - Ax (row-wise)

## How do we retrieve a result tuple?

The following methods will be used by JuMP (for example) to decide whether to return a result to the user.

Since a number of solvers support returning multiple results, we'll define the API to enable querying multiple results. If available, result number 1 should be the most important result. Arguments specifying the result number are always last, and have a default value of 1.

result_count(m) returns the number of results available to inspect

primal_available(m, ::Int) and dual_available(m, ::Int) returns true or false

It may be the case that result_count(m) returns 1 and dual_available(m) returns false because the solver might decide not to return duals.

## How do we interpret a result tuple?

The "result status" tells us how to interpret a result returned by the solver.

is_primal_status(m, XXX, result_index::Int = 1)
is_dual_status(m, XXX, result_index::Int = 1)

where XXX is one of:

• FeasiblePoint
• NearlyFeasiblePoint
• InfeasiblePoint
• FeasibleDirection
• NearlyFeasibleDirection
• ReductionCertificate
• NearlyReductionCertificate
• Unknown
• Other

These will probably be defined as an enum or a type. The reason to not return a single value and let users check equality is that it's much easier to rename statuses or deprecate methods this way.

getobjval(m, :Int) can be used to query the objective value of a result by index

## Discussion

This resolves:

because we will be able to return a result to the user if one is available regardless of the reason why the solver stopped. It will also greatly simplify the implementation of solve() in JuMP (jump-dev/JuMP.jl#1033).

It resolves the undocumented :InfeasibleOrUnbounded status by making it official.
:Suboptimal and :Stall now correspond to is_termination_status(m, AlmostSuccess).

This new framework replaces getsolution, getreducedcosts, getconstrsolution, getconstrduals, getinfeasibilityray, getunboundedray, getdual, getvardual, and status.

## RFC procedure

This is a high priority for me, so I am planning on closing the RFC June 12 (first day of the meetup) unless there's a good reason to delay. Please consider and comment on this proposal before then if it interests you.

### odow commented May 30, 2017 • edited

 This is very well thought out! Potentially naive question regarding the constraint primal solution: what is the slack in a ranged constraint?

### mlubin commented May 30, 2017

 Potentially naive question regarding the constraint primal solution: what is the slack in a ranged constraint? I would call it the minimum distance to infeasibility, i.e., for 0 <= 2x <= 10 with x = 1 it would be 2.

### blegat commented May 30, 2017

 I like the proposal ! I would be in favor of defining the constraint primal to be the slack. That way we have ⟨x, α⟩ = 0 and ⟨y, β⟩ = 0 when strong duality holds.

### odow commented May 30, 2017

 I like the minimum distance to infeasibilty since it lines up with infinite bounds. Can we just make it signed such that lb <= terms + slack <= ub. In this case i.e., for 0 <= 2x <= 10 with x = 1 it would be 2. the constraint primal should be -2

### joehuchette commented May 30, 2017

 I also like this a lot. One question: what is the distinction between Other and Unknown?

### mlubin commented May 30, 2017

 lb <= terms - slack <= ub seems a bit more natural than lb <= terms + slack <= ub. IIRC the former corresponds to what many LP solvers (e.g., Clp) do internally to transform two-sided constraints into bounded variables. For conic it's also nice to define the slack as b - Ax so that the slack belongs to the cone. There's a problem here at the JuMP level that this concept doesn't align with how we deal with one-sided constraints, since we always normalize them by subtracting the rhs. But that's probably something we'll have to change in JuMP.

### mlubin commented May 30, 2017

 One question: what is the distinction between Other and Unknown? Unknown means that the solver does not know the status, e.g., MSK_SOL_STA_UNKNOWN and MSK_PRO_STA_UNKNOWN, maybe because it hasn't tried to solve the problem yet. Other means that the status corresponds to something that's not in the list of possible statuses, and the user should probably query a solver-specific routine to see what happened. There's some room to tweak the definitions here.

### mlubin commented May 30, 2017 • edited

 Here's a proposal for the primal constraint solution: lb <= a'x <= ub -> primal con solution is defined as a'x lb <= a'x + x'Qx <= ub -> primal con solution is defined as a'x + x'Qx lb <= f(x) <= ub -> primal con solution is defined f(x) b-Ax \in Cone -> primal con solution is defined as b - Ax (row-wise) I was hesitant to go this route because it conflicts with what JuMP does, but that's probably not a good reason.

### odow commented May 30, 2017

 I think this is the least confusing definition.
mentioned this issue May 30, 2017

### joaquimg commented May 31, 2017 • edited

 About solver status: Xpress might return Nonconvex, which might be interesting if other solvers do the same. Initial could also be Unstarted. I think it is a bit more clear, but maybe I am just used to that...

### joaquimg commented May 31, 2017

 I prefer the constraint primal subsolution as slacks, because, as @blegat said, That way we have ⟨x, α⟩ = 0 and ⟨y, β⟩ = 0 when strong duality holds. and for ranged constraints, I am with @odow's first reaction: I like the minimum distance to infeasibility since it lines up with infinite bounds But I am not a fan of ranged constraint, so I never think too much about them... I forgot to mention, very good proposal!

### blegat commented May 31, 2017

 There is a inconsistency between two-sided and one-sided constraints: if one does ⟨a, x⟩ <= 1, the constraint primal will return ⟨a, x⟩ - 1 as the cone is the nonpositive orthant and the corresponding dual variable will be nonpositive so that their inner product is always positive. if one does -1 <= ⟨a, x⟩ <= 1, the constraint primal will return ⟨a, x⟩ and since the two bounds cannot be tight at the same time, by complementary slackness only one of the two dual variables can be nonzero so this one is returned. Here is a suggestion: If ones calls with ConPrim on a model containing ranged constraint we throw an warning or an error. If one calls with ConVal (new), we return the value of A * x. If one calls with RedCost (new), we return A' * β. About the replacement of symbols by enums, I would like to suggest using instead struct/immutable. The advantage of using types over enum is dispatch. It will make the code for solvers and users both more convenient, more readable, and slighly more efficient as suggested by the following two examples: In the solver, one could do getsolution(m, id::Int, ::Type{ConPrim)) = ... instead of doing function getsolution(m, id::Int, sc::SolutionComponent) if sc == ConPrim ... elseif ... end In the user code: reacttonosol(::Type{PrimDualInfeas}) = ... function someotherfunction(...) ... reacttonosol(whynosolution(m)) ... end instead of function someotherfunction(...) ... reason = whynosolution(m) if reason == PrimDualInfeas ... elseif ... end ... end

### mlubin commented May 31, 2017 • edited

 @blegat, I'm proposing for the constraint primal to return ⟨a, x⟩ in the case of ⟨a, x⟩ <= 1 as an LP constraint. For conic-form problems I'm proposing for it to return 1 - ⟨a, x⟩ if the constraint was specified in terms of the nonnegative cone. I think this makes sense because solvers like SCS maintain explicit vectors for constriant primals which are only approximately equal to b - A*x but properly belong to the cone, and it may be useful access these vectors. There are no range constraints in conic form, so there's no need for the con primal solution to have the same definition between LP and conic. I'd prefer not to introduce additional (redundant) solution subvectors like ConVal and RedCost, it's just more work for the solver interfaces. The solution subvectors are supposed to correspond to entities that the solvers would typically compute and store on their own, not things that can easily be computed from the problem data and other solutions. I'm open to considering using types instead of enums or symbols.

### mlubin commented May 31, 2017 • edited

 @joaquimg, I changed Initial to NotStarted and introduced an InvalidModel status which would include the case when the solver stops because it determined the problem was not convex.

### mlubin commented May 31, 2017 • edited

 I like InvalidModel because it also covers cases where GLPK decides to stop because it doesn't like the bounds on the variables. We can now represent this case instead of covering it up in the interface.

### ccoffrin commented Jun 2, 2017

 Overall I think this is fantastic. My only question is what is the precise definition of the solution status Optimal? I am wondering if the Optimal status can be removed. For example, if I see that the solution status is PrimFeas and the solver status is Success, then I can deduce that my solution is globally optimal, if the underlying algorithm provides such a guarantee. If the solution status is PrimFeas and the solver status is *Limit, then I will deduce suboptimal. In any case, this would ensure that users think carefully about what solution properties hold for the algorithm they are using. One case where I can imagine the Optimal status being useful would be cases where the solver provides an explicit proof of optimality (e.g. primal-dual solvers). But the information might still be redundant; if the solution object includes both the primal and dual solutions. There could be a function isoptimal(s) that checks if the most relevant solution includes the proof of optimality.

### mlubin commented Jun 2, 2017

 @ccoffrin, good question. I would define Optimal analogously to PrimFeas, which is not at all. PrimFeas is decided by the solver according to its own tolerances, some of which may be set by the user. I would leave it up to the solver to precisely define Optimal. One thing I was trying to achieve with this restructuring is decoupling the reason why the solver stopped from the interpretation of the solution(s) returned. I agree that a PrimFeas solution that is returned after solver Success could be interpreted as optimal, but there are some cases where this is less clear: Mosek can terminate because of a "stall" and still have a primal-dual solution that satisfies the optimality conditions. A MIP solver could return multiple solutions, some of which have the optimal objective value and some are just feasible.

### mlubin commented Jun 3, 2017

 It's clear that we should extend getobjval to query the objective value per solution number.

### ccoffrin commented Jun 3, 2017

 @mlubin ok, then I am wondering if Optimal is the best name for some solver-specific nomination of solution(s); because Optimal has a connotation for the mathematical model. Maybe Final or Best would be a less loaded term? Pulling that thread, one nice feature of the other solution status values is that they are solver independent. They are intrinsic mathematical properties of the model, which one can easily verify given the model's equations (up to some numerical tolerance). In the case of Optimal, if the semantics can be defined by the solver (and not the model), then it may not be an intrinsic property. For example, you could imagine sending the same model to two solvers, both return an Optimal solution, but the solutions and objective values are entirely different; they simply have a different semantics of what Optimal is. That said, if possible, I think it would be great if feasible solutions could be annotated with the mathematical properties that the each solver guarantees that they satisfy (e.g. globally optimal, locally optimal). The reason being that an average user of Math Prog solvers does not have a deep understanding of the underlying algorithms and what the guarantee. Having a domain expert encode this information is preferable. Thinking of some examples, re the Mosek case, maybe I am missing something; If you give a any primal+dual solution, as long as the optimality conditions are satisfied, it still a proof of optimality, right? In that case, the solver status may be of lesser concern. Even if it is Error the solution is still fine for me. re the MIP case, from my perspective handling multiple solutions is orthogonal to the semantics of Optimal, discussed above. If you give me a pool of feasible solutions and a solver status Success; and I know that the solver ensures optimality; then I can deduce that all of the lowest cost solutions in the pool are globally optimal. In the case of a solver like Ipopt; would you imagine it only using the PrimFeas status and never using Optimal status, because it only guarantees local optimality? When it converges to a infeasible point; what should the whynosolution(m) value be? Do these properties imply a proof of infeasibility?

### mlubin commented Jun 3, 2017

 I would be open to splitting Optimal into GlobalOptimal and LocalOptimal. GlobalOptimal means that a global lower bound is available and the solution has objective value sufficiently close to the global lower bound according to the solver's tolerances. If a MIP solver terminates with a 1% gap (as requested by the user), I'd still call this GlobalOptimal. LocalOptimal means that the solver's convergence criteria were met but no global lower bound is available. For the Mosek case, as the user I don't want to have the obligation to manually check the duality gap on all solutions returned. If the solver has the ability to label a solution as optimal, which Mosek does, then we need to have a way to make this easily accessible. Abstracting this away seems frustrating for anyone who just wants to solve a problem and check if the solver decided that the solution was optimal. The Ipopt case is interesting because in the case of infeasibility there might still be value in seeing the solution (jump-dev/JuMP.jl#938). We might need to add a LocalInfeas solution status.

### chriscoey commented Jun 3, 2017

 I agree with Carleton - I think any statuses should make sense in the context of the mathematical model (of course, taking into account numerical tolerances). I think Miles' suggestion for GlobalOptimal and LocalOptimal and LocalInfeas helps fix the most egregious issue with the status quo (e.g. Ipopt on nonconvex), so that would be a step in a right direction.

### mlubin commented Jun 3, 2017

 Pulling that thread, one nice feature of the other solution status values is that they are solver independent. They are intrinsic mathematical properties of the model, which one can easily verify given the model's equations (up to some numerical tolerance). Another comment on this. As soon as we generalize conic form to allow affine (or quadratic) expressions belonging to arbitrary sets, then feasibility isn't necessarily any easier to verify than optimality. If you write down a copositive constraint then you may need a fancy oracle (a.k.a. a solver) to verify feasibility. It's not clear what this means for how we should design statuses, but I wouldn't assume that feasibility is as easy to check as asking the solver for a solution.

### dpo commented Jun 3, 2017

 I think Miles' suggestion for GlobalOptimal and LocalOptimal and LocalInfeas helps fix the most egregious issue with the status quo (e.g. Ipopt on nonconvex), so that would be a step in a right direction. On nonconvex problems, solvers such as Ipopt aren't even guaranteed to find a local minimizer. All one can say is that it found a stationary point. In my view, that would be a more meaningful status and could also resolve some misunderstandings.

### mlubin commented Jun 3, 2017

 Good point, @dpo. If we can find a name for "a feasible point which the solver converged to" that's not specific to NLP then I'd go with that.

### mlubin commented Jun 3, 2017

 Ok, maybe the solver status should be used to deduce if the solver converged (or almost converged).

### ccoffrin commented Jun 3, 2017

 I think Miles' suggestion for GlobalOptimal and LocalOptimal and LocalInfeas helps fix the most egregious issue with the status quo (e.g. Ipopt on nonconvex), so that would be a step in a right direction. Off the top of my head I can't think of any cases that would be outside the scope of these status. So I think this is a good solution. Because each of these will be conditioned on some tolerance values, should the solution vector also include a standardized set of tolerance values? Or maybe the solvers should have standardized, get constraint tol and get opt tol functions? On nonconvex problems, solvers such as Ipopt aren't even guaranteed to find a local minimizer. All one can say is that it found a stationary point. Good point, what about addressing this in the current scheme by setting the solver status to Success and the solution status to PrimalFeas or LocalInfeas and avoiding the Optimal status entirely? For the Mosek case, as the user I don't want to have the obligation to manually check the duality gap on all solutions returned. Agreed, if we have a clear semantics of what the optimal status means, then the check can and should be performed for the users. Another comment on this. As soon as we generalize conic form to allow affine (or quadratic) expressions belonging to arbitrary sets, then feasibility isn't necessarily any easier to verify than optimality. That's an interesting point. When I said "easy to verify" I was thinking there is a known polynomial time check (not necessarily linear time). This is in contrast to discrete or non-convex problems where it is not obvious that such a check exists (i.e. P vs NP question).

### mlubin commented Jun 3, 2017

 I removed Optimal status and added PrimInfeas. Because each of these will be conditioned on some tolerance values, should the solution vector also include a standardized set of tolerance values? Or maybe the solvers should have standardized, get constraint tol and get opt tol functions? If someone volunteers to define standard tolerances across solvers, I'll consider it, but that won't be me.

### mlubin commented Jun 3, 2017

 I renamed Success and AlmostSuccess to Converged and AlmostConverged and introduced a Diverged status to account for Ipopt's "diverging iterates" status which usually means the problem is unbounded.

### joehuchette commented Jun 3, 2017

 What's the difference between Converged and AlmostConverged? Would the second always correspond to some type of time/iteration/etc. limit being reached?

### ccoffrin commented Jun 6, 2017

 Very interesting, I wish more solvers had this! Agreed that it make sense to put this in Other for the near term.

### blegat commented Jun 6, 2017

 Why not keeping it ? It is well defined, solver independent and adding it might motivate other solvers to implement it.

### chriscoey commented Jun 6, 2017 • edited

 It is specific to conic, right? Is there a non-conic generalization? I imagine there is a reasonable notion for non-conic. Like being on hyperplane that intersects the boundary of a convex set.

### HFriberg commented Jun 7, 2017

 @chriscoey, it may be specific to convex optimization, but not to conic. The first papers dealing with these issues treated the general convex case: "Projections of Convex Programs with Unattained Infima" by Robert Abrams (1975) and "Regularizing the abstract convex program" by Jon Borwein and Henry Wolkowicz (1981).

### mlubin commented Jun 7, 2017

 I've just done a major update to the proposal following brainstorming with @chriscoey and @joehuchette. The main points are: For clarity, we use "result" instead of "solution" New names for result statuses and termination statuses whynosolution() is removed. Instead, termination statuses will explain the specific cases where the solver terminated and no result is available, e.g., InfeasibleNoResult. We have ObjectiveLimit to include both USER_OBJ_LIMIT in Gurobi and Knitro's objective limit which it uses to detect unbounded problems Ipopt stops when the solution has a norm that exceeds some parameter, so we call this NormLimit I'd like to conclude this discussion on Monday at the meetup so that we can proceed with implementation during the rest of the week.

### blegat commented Jun 8, 2017

 I like the fact that it is now clear whether what is returned is a feasible point or feasible ray (I would prefer Ray than Direction). However it seems to me that the solver might also want to tell us some objective-related information about the result. Namely: If this is a point, does the solver know that it is globally optimal or nearly globally optimal (same for locally)? If this is a ray, does the solver know that the objective value of this ray is nonzero and in the desired direction (i.e. negative if this is a minimization)?

### mlubin commented Jun 8, 2017

 @blegat, the current thinking is that it's up to the solver to make a contract with the user about what properties its results will have under the Success and AlmostSuccess case. If Mosek reports success on a conic problem and returns a point, then you can infer global optimality of the point, and if it returns a ray you can infer that the ray is an infeasibility certificate. If Ipopt reports success on a nonconvex problem and returns a point, all that you can infer is that it is a stationary point, not even a locally optimal solution as @dpo reminded us. If we can come up with a simple way of encoding optimality information without misleading users about global optimality as @ccoffrin has pointed out, then I'd be open to considering it.

### blegat commented Jun 8, 2017

 I would understand if this kind of thing is model-dependent but solver dependent is weird. The package built in top of MPB should be able to interpret the solution without knowing what the solver is.

### ccoffrin commented Jun 8, 2017

 Like the revision! Great idea to use a boolean function for future maintenance. In terms of solution statuses, what is the semantics of Nearly*? Should there be a NearlyInfeasiblePoint, for consistency? With the addition of ObjectiveLimit I was wondering should there also be something like ObjectiveGapLimit? In my mind the notion of an objective gap could be interpreted as both a numerical tolerance and also as a termination criterial (i.e. a limit). In the current semantics I presume Success includes the notion that the desired objective gap has been reached, so Success and ObjectiveGapLimit are one in the same.

### blegat commented Jun 8, 2017

 To interpret the solution, we probably already have everything we need with getobjval but shouldn't it be split between primal and dual objective value then ? Also, what is the getter for the result ? getresult(m, count::Int) ? Does it return the tuple or can the results be taken separately ?

### mlubin commented Jun 8, 2017 • edited

 I would understand if this kind of thing is model-dependent but solver dependent is weird. The package built in top of MPB should be able to interpret the solution without knowing what the solver is. That's essentially hypothetical though, because all existing conic and MIP solvers with MPB interfaces guarantee global optimality. It doesn't seem critical that we address this issue, but if there's a reasonable proposal I'll consider it. In terms of solution statuses, what is the semantics of Nearly*? Should there be a NearlyInfeasiblePoint, for consistency? These correspond to notions that are directly reported by the solver. Mosek, for example, distinguishes between feasible and nearly feasible results. I don't know of any solver that would call a result nearly infeasible, that's just feasible. InfeasiblePoint has two different uses: if Ipopt reports that it converges to an infeasible point, or if CPLEX reports OPTIMAL_INFEAS which means that it finished branch-and-bound and ended up with an infeasible solution after reversing presolve transformations. In the current semantics I presume Success includes the notion that the desired objective gap has been reached, so Success and ObjectiveGapLimit are one in the same. Yes, I don't see a need to distinguish these termination statuses. You can query the gap afterwards. Solvers have a number of different termination criteria, so I don't see why ObjectiveGap should be a special one that deserves a status distinct from Success. To interpret the solution, we probably already have everything we need with getobjval but shouldn't it be split between primal and dual objective value then ? getobjval corresponds to the primal, getobjbound could correspond to the dual or the otherwise best known bound. Also, what is the getter for the result ? Haven't written this down yet, but I was thinking one named getter per component of the result, e.g., getvarprim or getvariableprimal.

### mlubin commented Jun 8, 2017

 The constraint concept needs a bit more thought since LPQP solvers have multiple types of constraints (so this would affect getters), but let's leave that separate from the status discussion.
mentioned this issue Jun 9, 2017
mentioned this issue Jun 9, 2017
This was referenced Jun 9, 2017
This was referenced Jun 9, 2017

### mlubin commented Jun 10, 2017

 Worth thinking about: we could have solvers return "attributes" per result, where the primal and duals could be treated identically at the API level as constraint-wise and variable-wise attributes. This allows trivial extensions for solvers to return other pieces of information like membership in an IIS (jump-dev/JuMP.jl#1035). @dourouc05 This is more of an API question that's orthogonal to how we categorize statuses.
This was referenced Jun 11, 2017
added this to the 1.0 milestone Jun 11, 2017

### dourouc05 commented Jun 11, 2017

 For IISes, results that make sense are an infeasible set that is irreducible (best case), an infeasible set without any irreducibility property (e.g., the user stopped the computation, put a flag), or no infeasible set at all (probably not yet computed). I can think of nothing else that would be interesting here.
mentioned this issue Jun 14, 2017
mentioned this issue Jul 7, 2017
mentioned this issue Jul 14, 2017

### mlubin commented Jul 15, 2017

 This proposal has been incorporated into MathOptInterface
closed this Jul 15, 2017