
Add Adam and AdaMax #1069

Merged
merged 3 commits into from Jan 29, 2024

Conversation


@pkofod pkofod commented Jan 28, 2024

Fixes #1012

I don't use Adam or AdaMax myself, but I suppose the slow convergence of Adam from zeros(2) is to be expected in some cases? Otherwise, it might be worth comparing against another implementation.

julia> rosenbrock(x) =  (1.0 - x[1])^2 + 100.0 * (x[2] - x[1]^2)^2
rosenbrock (generic function with 1 method)

julia> result = optimize(rosenbrock, ones(2), AdaMax(), Optim.Options(iterations=5000))
 * Status: success

 * Candidate solution
    Final objective value:     5.647600e-17

 * Found with
    Algorithm:     AdaMax

 * Convergence measures
    |x - x'|               = 1.50e-08 ≰ 0.0e+00
    |x - x'|/|x'|          = 1.50e-08 ≰ 0.0e+00
    |f(x) - f(x')|         = NaN ≰ 0.0e+00
    |f(x) - f(x')|/|f(x')| = NaN ≰ 0.0e+00
    |g(x)|                 = 9.95e-09 ≤ 1.0e-08

 * Work counters
    Seconds run:   0  (vs limit Inf)
    Iterations:    4637
    f(x) calls:    4638
    ∇f(x) calls:   4638


julia> result = optimize(rosenbrock, ones(2), Adam(), Optim.Options(iterations=5000))
 * Status: success

 * Candidate solution
    Final objective value:     4.950178e-16

 * Found with
    Algorithm:     Adam

 * Convergence measures
    |x - x'|               = 4.45e-08 ≰ 0.0e+00
    |x - x'|/|x'|          = 4.45e-08 ≰ 0.0e+00
    |f(x) - f(x')|         = NaN ≰ 0.0e+00
    |f(x) - f(x')|/|f(x')| = NaN ≰ 0.0e+00
    |g(x)|                 = 9.94e-09 ≤ 1.0e-08

 * Work counters
    Seconds run:   0  (vs limit Inf)
    Iterations:    590
    f(x) calls:    591
    ∇f(x) calls:   591


julia> result = optimize(rosenbrock, zeros(2), Adam(), Optim.Options(iterations=5000))
 * Status: failure (reached maximum number of iterations)

 * Candidate solution
    Final objective value:     2.899319e-01

 * Found with
    Algorithm:     Adam

 * Convergence measures
    |x - x'|               = 4.62e-01 ≰ 0.0e+00
    |x - x'|/|x'|          = 1.00e+00 ≰ 0.0e+00
    |f(x) - f(x')|         = NaN ≰ 0.0e+00
    |f(x) - f(x')|/|f(x')| = NaN ≰ 0.0e+00
    |g(x)|                 = 1.08e+00 ≰ 1.0e-08

 * Work counters
    Seconds run:   0  (vs limit Inf)
    Iterations:    5000
    f(x) calls:    5001
    ∇f(x) calls:   5001


julia> result = optimize(rosenbrock, zeros(2), AdaMax(), Optim.Options(iterations=5000))
 * Status: success

 * Candidate solution
    Final objective value:     9.309545e-17

 * Found with
    Algorithm:     AdaMax

 * Convergence measures
    |x - x'|               = 1.00e+00 ≰ 0.0e+00
    |x - x'|/|x'|          = 1.00e+00 ≰ 0.0e+00
    |f(x) - f(x')|         = NaN ≰ 0.0e+00
    |f(x) - f(x')|/|f(x')| = NaN ≰ 0.0e+00
    |g(x)|                 = 9.40e-09 ≤ 1.0e-08

 * Work counters
    Seconds run:   0  (vs limit Inf)
    Iterations:    4580
    f(x) calls:    4581
    ∇f(x) calls:   4581
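For such a cross-check, a minimal textbook Adam loop (Kingma & Ba, 2015) on the same Rosenbrock problem could look like the sketch below. This is a hedged reference implementation, not the Optim.jl code; the function name `adam_minimize` and the defaults (taken from the paper) are assumptions for illustration.

```julia
# Textbook Adam update loop (Kingma & Ba, 2015). Not the Optim.jl code;
# defaults follow the paper: alpha = 0.001, beta1 = 0.9, beta2 = 0.999.
function adam_minimize(grad, x0; alpha = 0.001, beta1 = 0.9, beta2 = 0.999,
                       eps = 1e-8, iterations = 5000)
    x = copy(x0)
    m = zero(x)   # first-moment (mean) estimate
    v = zero(x)   # second-moment (uncentered variance) estimate
    for t in 1:iterations
        g = grad(x)
        m = beta1 .* m .+ (1 - beta1) .* g
        v = beta2 .* v .+ (1 - beta2) .* g .^ 2
        mhat = m ./ (1 - beta1^t)   # bias-corrected first moment
        vhat = v ./ (1 - beta2^t)   # bias-corrected second moment
        x = x .- alpha .* mhat ./ (sqrt.(vhat) .+ eps)
    end
    return x
end

# Analytic gradient of the Rosenbrock function used in the runs above.
rosen_grad(x) = [-2.0 * (1.0 - x[1]) - 400.0 * x[1] * (x[2] - x[1]^2),
                 200.0 * (x[2] - x[1]^2)]

xmin = adam_minimize(rosen_grad, zeros(2))
```

Comparing `xmin` after a fixed iteration budget against the solver's trace would show whether the slow progress from zeros(2) is intrinsic to Adam's update or specific to this implementation.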


codecov bot commented Jan 28, 2024

Codecov Report

Attention: 5 lines in your changes are missing coverage. Please review.

Comparison is base (1a649e8) 84.73% compared to head (ac1a8a2) 84.90%.

Files                                             Patch %    Lines missing
src/multivariate/solvers/first_order/adam.jl      91.89%     3 ⚠️
src/multivariate/solvers/first_order/adamax.jl    94.28%     2 ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #1069      +/-   ##
==========================================
+ Coverage   84.73%   84.90%   +0.17%     
==========================================
  Files          44       46       +2     
  Lines        3419     3491      +72     
==========================================
+ Hits         2897     2964      +67     
- Misses        522      527       +5     


@pkofod pkofod commented Jan 28, 2024

Of course, the results of Adam and AdaMax also depend on the learning rate and other hyperparameters, so:

julia> result = optimize(rosenbrock, zeros(2), Optim.Adam(alpha=0.1), Optim.Options(iterations=100000))
 * Status: success

 * Candidate solution
    Final objective value:     4.408001e-16

 * Found with
    Algorithm:     Adam

 * Convergence measures
    |x - x'|               = 1.00e+00 ≰ 0.0e+00
    |x - x'|/|x'|          = 1.00e+00 ≰ 0.0e+00
    |f(x) - f(x')|         = NaN ≰ 0.0e+00
    |f(x) - f(x')|/|f(x')| = NaN ≰ 0.0e+00
    |g(x)|                 = 9.91e-09 ≤ 1.0e-08

 * Work counters
    Seconds run:   0  (vs limit Inf)
    Iterations:    2818
    f(x) calls:    2819
    ∇f(x) calls:   2819
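The different behaviour of AdaMax from zeros(2) in the runs above is consistent with its update rule: instead of Adam's running mean of squared gradients, AdaMax tracks an exponentially weighted infinity norm, so each coordinate's step is the bias-corrected mean gradient divided by the largest recent gradient magnitude. A hedged single-step sketch (not the Optim.jl code; the function name and the tiny guard `eps` in the denominator, which the paper omits, are assumptions):

```julia
# One AdaMax update (Kingma & Ba, 2015, §7.1). m is the first-moment
# estimate, u the infinity-norm second moment; u needs no bias correction.
function adamax_step!(x, m, u, g, t; alpha = 0.002, beta1 = 0.9,
                      beta2 = 0.999, eps = 1e-8)
    m .= beta1 .* m .+ (1 - beta1) .* g
    u .= max.(beta2 .* u, abs.(g))   # weighted infinity norm of past gradients
    x .-= (alpha / (1 - beta1^t)) .* m ./ (u .+ eps)   # eps guards u == 0
    return x
end
```

Because `u` is a max rather than a mean, a single large gradient caps the effective step for a while, which can make AdaMax more tolerant of the steep walls of the Rosenbrock valley.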

@pkofod pkofod merged commit d7324eb into master Jan 29, 2024
13 of 15 checks passed
@pkofod pkofod deleted the pkm/adam(ax) branch January 29, 2024 08:15

Successfully merging this pull request may close these issues.

Feature request: Include the Adam algorithm