findmin implementation #455

jverzani · 2023-01-28T19:50:01Z

This is suggested in #454

It basically implements the findmin(f, domain) generic. (Well, not quite. With a function and iterable, it returns an index not a value. This returns the value)

codecov · 2023-01-28T19:57:48Z

Codecov Report

Base: 83.23% // Head: 83.35% // Increases project coverage by +0.12% 🎉

Coverage data is based on head (d7b8633) compared to base (efeb466).
Patch coverage: 91.30% of modified lines in pull request are covered.

Additional details and impacted files

@@            Coverage Diff             @@
##           master     #455      +/-   ##
==========================================
+ Coverage   83.23%   83.35%   +0.12%     
==========================================
  Files          22       22              
  Lines        2994     3040      +46     
==========================================
+ Hits         2492     2534      +42     
- Misses        502      506       +4

Impacted Files	Coverage Δ
src/common.jl	`90.02% <91.30%> (+0.15%)`	⬆️

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

☔ View full report at Codecov.
📢 Do you have feedback about the report comment? Let us know in this issue.

singularitti · 2023-01-28T23:13:58Z

May I propose a more unified API? We could also extend findmax. And what is more important, we could define a pair of functions islocalmin and islocalmax. And with the help of that, we could define

findall(islocalmin, poly)
findall(islocalmax, poly)
findfirst(islocalmin, poly)
findfirst(islocalmax, poly)
findlast(islocalmin, poly)
findlast(islocalmax, poly)
findnext(islocalmin, poly, i)
findnext(islocalmax, poly, i)
findprev(islocalmin, poly, i)
findprev(islocalmax, poly, i)

I am no expert in polynomials, but I could help to make a PR for such API extension.

jverzani · 2023-01-28T23:40:49Z

Hmm, I'll have to mull this over, Certainly findmax should be added if findmin is. But putting the polynomial in the iterator position is not quite in line with the generic, as iteration over a polynomial in this package means iterate over the coefficients. Something like findall is like findall(predicate, iterator) so it would be better to have an iterator over local min, maybe? That would be easy enough, as that is just re-wrapping the first-derivative test code I started with..

singularitti · 2023-01-28T23:43:45Z

Please give me a few minutes to try to propose a new API.

jverzani · 2023-01-29T00:05:45Z

Sure

singularitti · 2023-01-29T00:40:06Z

Ok, I found that findall, findnext, etc., are usually for arrays, and based on the reason you said above, we cannot use them. There is another way to do it: create a new iterable type (e.g., EachPoint) and iterate over it.

export iscriticalpoint, islocalmin, islocalmax, eachcriticalpoint, eachlocalmin, eachlocalmax

function iscriticalpoint end

function islocalmin end

function islocalmax end

struct EachPoint{F,T<:Polynomial}
    f::F
    poly::T
end

eachcriticalpoint(poly::Polynomial) = EachPoint(iscriticalpoint, poly)

eachlocalmin(poly::Polynomial) = EachPoint(islocalmin, poly)

eachlocalmax(poly::Polynomial) = EachPoint(islocalmax, poly)

function Base.iterate(iter::EachPoint{typeof(iscriticalpoint)})
    # If there are no critical points, return nothing
end
function Base.iterate(iter::EachPoint{typeof(iscriticalpoint)}, state)
    # First determine how many critical points there are
    if state > length(iter)
        return nothing
    else
        return x, state + 1  # `x` is the `state`th critical point
    end
end
function Base.iterate(iter::EachPoint{typeof(islocalmin)})
    # If there are no local minima, return nothing
end
function Base.iterate(iter::EachPoint{typeof(islocalmin)}, state)
    if state > length(iter)
        return nothing
    else
        return x, state + 1  # `x` is the `state`th local minimum
    end
end
function Base.iterate(iter::EachPoint{typeof(islocalmax)})
    # If there are no local maxima, return nothing
end
function Base.iterate(iter::EachPoint{typeof(islocalmax)}, state)
    if state > length(iter)
        return nothing
    else
        return x, state + 1  # `x` is the `state`th local maximum
    end
end

Base.length(iter::EachPoint) = 100  # Calculate how many critical points there are

I have not thought about it thoroughly, since I do not know how you calculate the critical points, local minima, and local maxima. I think you calculate all the critical points at once, then check if each one of them is a local minimum or maximum, is it? The Base.length for EachPoint{typeof(iscriticalpoint)} can be defined. But the lengths of EachPoint{typeof(islocalmin)} and EachPoint{typeof(islocalmax)} are not. They will be suitable if we calculate local minima/maxima one by one.

This idea is from IterTools.fieldvalues, and I have been using such API for a while.

jverzani · 2023-01-29T00:53:21Z

An iterator over local minima might look like this:


# Iterator over local minima
struct LocalMinPoints{T}
    pts::Vector{T}
end

function LocalMinPoints(p::AbstractPolynomial{T}) where {T <: Real}

    p′ = derivative(p)

    
    critical_pts = sort(real.(filter(isreal, roots(p′))))
    n = length(critical_pts)
    n <= 1 && return LocalMinPoints(critical_pts)

    S = eltype(critical_pts)
    δ, ϵ = eps(S), Base.rtoldefault(S)

    r, rst... = critical_pts
    pts = [r]
    for i ∈ 1:(n-1)
        r,s = critical_pts[i], critical_pts[i+1]
        !isapprox(r, s; atol=δ, rtol=ϵ) && push!(pts, s)
        r = s
    end

    n, aₙ = degree(p′), p′[end]
    lsgn = ifelse((iseven(n) && aₙ < 0) || (isodd(n) && aₙ > 0), -one(S), one(S))

    lpts = similar(pts, 0)
    for i ∈ 1:(length(pts)-1)
        r,s = pts[i], pts[i+1]
        Δ = s/2 - r/2
        y = p′(r + Δ)
        rsgn = abs(y) <= δ + (r + Δ) * ϵ ? zero(S) : y < 0 ? -one(S) : one(S)
        lsgn * rsgn < 0 && push!(lpts, r)
        lsgn = rsgn
    end
    rsgn = aₙ > 0 ? one(S) : -one(S)
    lsgn * rsgn < 0 && push!(lpts, pts[end])

    LocalMinPoints(pts)
end

Base.iterate(l::LocalMinPoints) = iterate(l.pts)
Base.iterate(l::LocalMinPoints, state) = iterate(l.pts, state)
Base.length(l::LocalMinPoints) = length(l.pts)

The ugly part is the first derivative test along with worrying a bit about floating point issues. A LocalMax iterator would be easily defined using -p. (Maybe it would make sense to use the same type and different names for the constructor). But with this, you could implement some of what you want. e.g, findnext, etc.

I don't mind implementing these, but I'm a bit wary about exporting the constructor, as it isn't really a specialization of some generic function I know of.

singularitti · 2023-01-29T01:05:27Z

Forgive me if I do not understand your code very well.
When you say "floating point issues", do you mean the inaccuracy of function roots?
And this part:

    r, rst... = critical_pts
    pts = [r]
    for i ∈ 1:(n-1)
        r,s = critical_pts[i], critical_pts[i+1]
        !isapprox(r, s; atol=δ, rtol=ϵ) && push!(pts, s)
        r = s
    end

Are you comparing two roots? And if they are too close to distinguish, just discard one of them?

And at this part, I am completely lost:

    lpts = similar(pts, 0)
    for i ∈ 1:(length(pts)-1)
        r,s = pts[i], pts[i+1]
        Δ = s/2 - r/2
        y = p′(r + Δ)
        rsgn = abs(y) <= δ + (r + Δ) * ϵ ? zero(S) : y < 0 ? -one(S) : one(S)
        lsgn * rsgn < 0 && push!(lpts, r)
        lsgn = rsgn
    end
    rsgn = aₙ > 0 ? one(S) : -one(S)
    lsgn * rsgn < 0 && push!(lpts, pts[end])

Are you testing each critical point in a small neighborhood with radius Δ? Can we just compute the second derivative of that polynomial at that critical point (as I did in my code) since all polynomials are continuous and smooth everywhere?

I do not think export the constructor is an issue here, as I wrote above, we could write a fake function eachcriticalpoint for EachPoint to imitate eachcol functions defined in Base.

jverzani · 2023-01-29T02:11:12Z

Yeah, the issues with roots are many:

first, the second derivative is only determinative if f''(r) is not 0. But what does that mean? f''(r) is subject to calculation errors like f'''(r) * r * eps(), so testing for 0 needs a tolerance. Then if f''(r) is identified as 0, you can't remove it from consideration. This is why the first derivative test is better used. For that, you still need something to consider if f' is positive, negative, or zero. Again even if mathematically 0, the computed value can be as big as f''(r)*abs(r) * eps(), so there is a tolerance check.
But the bigger issue is likely multiplicities. Consider this simple case:

p = integrate(fromroots([pi, ℯ, pi]))

Great, critical points are pi and e, and since pi is double, it is not a local max. But taking roots gives:

julia> roots(derivative(p))
3-element Vector{ComplexF64}:
  2.718281828458972 + 0.0im
 3.1415926535898264 - 2.2207909901031392e-7im
 3.1415926535898264 + 2.2207909901031392e-7im

Oops, it won't be identified. We have the method multroot, which does get this right:

julia> Polynomials.Multroot.multroot(derivative(p))
(values = [2.7182818284590433, 3.141592653589794], multiplicities = [1, 2], κ = 56.26785197710239, ϵ = 1.3242378009143006e-16)

But that misses some other roots it should get when there aren't multiplicities, though perhaps that is best to identify the critical points (instead of checking nearby roots).

This is part of the reluctance to add this, as it may not yield correct answers in fairly simple cases.

singularitti · 2023-01-29T23:10:14Z

Oh, I was thinking we may not want to get all the critical points in one go (because it takes time to do so), so we needed some kind of iterative steps to find them one by one (lazily). But is it true that we find them all at once? If so, there is no need for findall, findnext, findprev interface.

I like the current implementation very much. If I understand correctly, you use min_max to find all critical points in one go, then use their values to implement findmax, etc., right?

singularitti · 2023-01-29T23:25:41Z

However, there is one potential performance issue: if we call findmin, findmax, etc., multiple times on the same polynomial, min_max will be called multiple times but only one of its results will be used. It could be better if we have a cache mechanism.

jverzani · 2023-01-29T23:46:48Z

True, extrema returns both so one wouldn't necessarily call them multiple times, though no exported method returns both the values and the arguments where they happen for both (min_max's job). But is there a use case where one wouldn't just save the computed value?

singularitti · 2023-01-29T23:55:10Z

But is there a use case where one wouldn't just save the computed value?

True. Maybe we could write something in the docs like: If you only want to calculate x, use argmin/argmax; to calculate minima and maxima, use extrema; to calculate both, use findmin/findmax, avoid using them multiple times.

singularitti · 2023-01-29T23:59:22Z

Also, could we add two more methods like

minimum(poly, xs)
maximum(poly, xs)

which returns the smallest/largest result of calling the polynomial poly on each element of xs? Because extrema is just (minimum(poly, xs), maximum(poly, xs)). It seems wierd to have extrema but no minimum and maximum.

findmin(p::AbstractPolynomial{<:Real}, I=domain(p)) -> (f(x), x)
findmax(p::AbstractPolynomial{<:Real}, I=domain(p)) -> (f(x), x)
minimum(p::AbstractPolynomial{<:Real}, I=domain(p)) -> m
maximum(p::AbstractPolynomial{<:Real}, I=domain(p)) -> M
extrema(p::AbstractPolynomial{<:Real}, I=domain(p)) -> (m, M)
argmin(p::AbstractPolynomial{<:Real}, I=domain(p)) -> x
argmax(p::AbstractPolynomial{<:Real}, I=domain(p)) -> x

Then the API seems to be complete.

singularitti · 2023-01-30T00:02:54Z

Also, a small suggestion to the naming of variables: In Julia's official doc, they use mn and mx as the minimum and maximum values:

extrema(f, itr; [init]) -> (mn, mx)

Could we use them also?

singularitti · 2023-01-30T00:13:23Z

I have re-read the doc of extrema, minimum, and maximum, it seems that their API

extrema(f, itr) -> (mn, mx)
maximum(f, itr)
minimum(f, itr)

means calling f on an iterable itr, and comparing these results f.(itr), then find the maximum/minimum values. It is a little bit different from what findmax/argmax, etc., intend to do: they are evaluated on a domain.

Should we respect the intention of the API? I mean, should we define

function Base.maximum(poly::Polynomial, itr)
    ys = poly.(iter)
    return maximum(poly)
end

or keep the current implementation?

Sorry if I am being too picky.

singularitti · 2023-01-30T00:16:14Z

OK, it seems that if I call maximum(f, itr), it will just fall back to Julia's default definition of maximum.

julia> maximum(p, 1:10)
4321

julia> @which maximum(p, 1:10)
maximum(f, a::AbstractArray; dims, kw...) in Base at reducedim.jl:995

julia> maximum(p, Tuple(1:10))
4321

julia> @which maximum(p, Tuple(1:10))
maximum(f, a; kw...) in Base at reduce.jl:698

jverzani · 2023-01-30T00:17:41Z

Maybe if someone is going to redo these things, a critical_points function might be useful. I just broke that out, but below it can be generalized. With just that function, a user could get most of this functionality of this PR with just this:

using Polynomials

function critical_points(p, l=-Inf, r=Inf)
    q = Polynomials.ngcd(derivative(p), derivative(p,2)).v
    rs = real.(filter(isreal, roots(q)))
    pts = Iterators.flatten((l, r, rs))
    sort(collect(Iterators.filter(x -> l ≤ x ≤ r, pts)))
end

pts = critical_points(p)
findmin(p, pts) # (-13.016666666666666, 3)
argmin(p, pts)  # -0.9999999999999993

jverzani · 2023-01-30T00:19:06Z

Thanks for all this. Still things to think about :)

jverzani · 2023-01-31T22:34:25Z

Using a different approach with just a critical_points function

jishnub · 2023-02-01T04:48:13Z

Just in case this is reconsidered, perhaps argmin might be more suitable than findmin, as it returns the value in the domain and not the index.

jverzani · 2023-02-01T15:57:29Z

Yeah, this PR would have done argmin as well, but adding a feature to find the critical points seems like a better solution, then using the XXXmin/max(f, itr) interfaces just work for these tasks

While I have your attention, the code to find the critical points first reduces the polynomial with the numeric gcd to hopefully reduce the issue with multiple roots being identified as complex values. However, that code takes some real time to compile so the first use is really sluggish. I also added a small amount of pre-compile code to reduce that, but that in turn increases the loading time. It was around 300ms, it might now be about 500ms. I wonder if that is a big price to pay for something that isn't likely to be used by most. Any thoughts on that? I personally find it un-noticable, but it is a bit of a trade off.

jishnub · 2023-02-05T14:12:35Z

Sorry I had missed this. I think adding precompile statements is a good idea, as load times should be resolved by that, now that invalidations have been dealt with.

findmin implementation

e331df3

redo

e0ec761

take 3

f1e12b3

jverzani added 2 commits January 29, 2023 15:54

rework

531f36e

rework2

bb0c5b6

separate out critical points

d7b8633

singularitti mentioned this pull request Jan 30, 2023

minimum(p) returns the minimal coefficient of p but minimum(p, itr) returns when p is applied onto itr #459

Closed

jverzani closed this Jan 31, 2023

jverzani deleted the issue_454 branch January 31, 2023 22:34

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

findmin implementation #455

findmin implementation #455

jverzani commented Jan 28, 2023 •

edited

Loading

codecov bot commented Jan 28, 2023 •

edited

Loading

singularitti commented Jan 28, 2023 •

edited

Loading

jverzani commented Jan 28, 2023

singularitti commented Jan 28, 2023

jverzani commented Jan 29, 2023

singularitti commented Jan 29, 2023 •

edited

Loading

jverzani commented Jan 29, 2023

singularitti commented Jan 29, 2023 •

edited

Loading

jverzani commented Jan 29, 2023

singularitti commented Jan 29, 2023 •

edited

Loading

singularitti commented Jan 29, 2023

jverzani commented Jan 29, 2023

singularitti commented Jan 29, 2023

singularitti commented Jan 29, 2023 •

edited

Loading

singularitti commented Jan 30, 2023

singularitti commented Jan 30, 2023

singularitti commented Jan 30, 2023

jverzani commented Jan 30, 2023

jverzani commented Jan 30, 2023

jverzani commented Jan 31, 2023

jishnub commented Feb 1, 2023

jverzani commented Feb 1, 2023

jishnub commented Feb 5, 2023

findmin implementation #455

findmin implementation #455

Conversation

jverzani commented Jan 28, 2023 • edited Loading

codecov bot commented Jan 28, 2023 • edited Loading

Codecov Report

singularitti commented Jan 28, 2023 • edited Loading

jverzani commented Jan 28, 2023

singularitti commented Jan 28, 2023

jverzani commented Jan 29, 2023

singularitti commented Jan 29, 2023 • edited Loading

jverzani commented Jan 29, 2023

singularitti commented Jan 29, 2023 • edited Loading

jverzani commented Jan 29, 2023

singularitti commented Jan 29, 2023 • edited Loading

singularitti commented Jan 29, 2023

jverzani commented Jan 29, 2023

singularitti commented Jan 29, 2023

singularitti commented Jan 29, 2023 • edited Loading

singularitti commented Jan 30, 2023

singularitti commented Jan 30, 2023

singularitti commented Jan 30, 2023

jverzani commented Jan 30, 2023

jverzani commented Jan 30, 2023

jverzani commented Jan 31, 2023

jishnub commented Feb 1, 2023

jverzani commented Feb 1, 2023

jishnub commented Feb 5, 2023

jverzani commented Jan 28, 2023 •

edited

Loading

codecov bot commented Jan 28, 2023 •

edited

Loading

singularitti commented Jan 28, 2023 •

edited

Loading

singularitti commented Jan 29, 2023 •

edited

Loading

singularitti commented Jan 29, 2023 •

edited

Loading

singularitti commented Jan 29, 2023 •

edited

Loading

singularitti commented Jan 29, 2023 •

edited

Loading