Trouble reproducing Stochastic FW benchmarks #100

busycalibrating · 2023-01-20T21:21:36Z

Hello!

I'm trying play around with the stochastic Frank-Wolfe benchmarks you have in this example and I'm having trouble reproducing the figure by running the provided example. The expected results are as follows:

However when I run the script out of the box, the first thing to note is that it takes a very long time; for 10 iters, these are the timing results:

CPU times: 
     user 16min 12s
     sys: 7min 29s
     total: 23min 42s
Wall time: 12min 25s

Perhaps this is fine seeing as ~10 steps should be sufficient (this would correspond to ~10^8 gradient evaluations which would match the figure in the example), but then the problem comes with the actual convergence results; if I set max_iter=500 (vs 1e4) and let it run for the full 2h x 3 runs, I get worse results:

In particular, after 10^8 gradient evaluations, the relative suboptimality

Some additional info that may be useful:

Python 3.7.6
SciPy 1.7.3
NumPy 1.21.6
Latest (development) version of COPT.

If useful, we can play around with the example in a colab notebook.

Thanks for any help!

The text was updated successfully, but these errors were encountered:

fabianp · 2023-01-23T09:23:17Z

Thanks for the report. For the timing, having numba installed can help. In your notebook you can install it with "!pip install numpy scipy numba" . However, I can see from the example site (https://openopt.github.io/copt/auto_examples/frank_wolfe/plot_sfw_real_data.html#sphx-glr-download-auto-examples-frank-wolfe-plot-sfw-real-data-py) that it anyway took 47 min, so I don't expect numba to change the runtime too much.

For the different suboptimality I don't have an answer. If I understand correctly, the issue is that for more iterations one gets worse FW gap and suboptimality? It could also just be that the example in our website is out of date. I'll try to re-generate it just in case

Any ideas @GeoffNN ?

busycalibrating · 2023-01-23T14:52:40Z

Thanks for the response! I'll note that I did try using numba on my local device and it was taking even longer, even after a full run through one epoch. I haven't looked into this much so this may be a versioning issue though.

fabianp · 2023-01-24T11:27:28Z

@GeoffNN : any idea why it's so slow? Could it be related to the use of sparse matrices?

busycalibrating · 2023-01-24T16:47:52Z

Were you able to reproduce the figures in the doc/paper by the way? Or were you getting similar results to mine?

fabianp · 2023-01-24T17:21:15Z

I'm building the doc locally to see. I'm traveling at the moment so it might take a while. Hopefully by tomorrow we'll know

…

On Tue, Jan 24, 2023, 17:48 busycalibrating ***@***.***> wrote: Were you able to reproduce the figures in the doc/paper by the way? Or were you getting similar results to mine? — Reply to this email directly, view it on GitHub <#100 (comment)>, or unsubscribe <https://github.com/notifications/unsubscribe-auth/AACDZB6XSP2HKIW2CMK6Z3TWUABUFANCNFSM6AAAAAAUB6T4ZA> . You are receiving this because you commented.Message ID: ***@***.***>

busycalibrating · 2023-01-24T17:25:57Z

Sounds great, thank you for your help!

GeoffNN · 2023-01-24T18:41:09Z

Hi!

I'm not sure what's going on...
Can you please try the version on the branch geoff/plotsSFW? , and let me know if it changes anything?

fabianp · 2023-01-24T18:41:42Z

out of curiosity @busycalibrating , have you tried running the example locally? It could also be that colab's free tier has caps CPU speed or other resources

busycalibrating · 2023-01-24T18:44:32Z

Yes, I tried it both locally (both linux and an m1 mac) and on a compute cluster. I think the actual execution time seems to be okay; it looks like the script should take ~10 iters (epochs?) which translates to ~10^8 gradient evaluations, and roughly ~45m for the script which is in line with what the doc says. It's the convergence rate that's confusing me more.

@GeoffNN I'll give it a try!

GeoffNN · 2023-01-24T18:44:34Z

Someone had reached out to me on a different channel, and had a similar issue. I couldn't identify exactly what was going on, but it seems like the code on main runs less iterations than the code on geoff/plotsSFW. I think it may be a x-axis issue with the plotting, rather than a problem with the algorithm itself.
They managed to reproduce with the branch code.
Sorry, I never got around to identifying exactly what was going on, since diffing the files didn't expose the problem.

The speed part is more surprising, and may be a Colab issue as @fabianp is mentioning.

busycalibrating · 2023-01-24T23:27:49Z

Update: it seems like the branch you pointed me to is working; speed wise it's much faster, and with numba it's hitting ~2 iter/s so I think the older version of the package works as intended.

I also ran the code up to 500 epochs (~5m with numba), and it now translates to about 10^7 steps. Furthermore, the convergence rate looks consistent with the old figures in the documentation and paper, so I think there must be a bug in the latest version somewhere.

GeoffNN · 2023-01-24T23:34:32Z

Awesome, I hope this fixes your issue for now. Leaving the issue open to remind myself to double check and merge again.

On Tue 24 Jan 2023 at 15:28 busycalibrating ***@***.***> wrote: Update: it seems like the branch you pointed me to is working better; speed wise it's much faster, and with numba it's hitting ~2 iter/s so I think the older version of the package works as intended. I also ran the code up to 500 epochs (~5m with numba), and it now translates to about 10^7 steps. Furthermore, the convergence rate looks consistent with the old figures in the documentation and paper, so I think there must be a bug in the latest version. [image: image] <https://user-images.githubusercontent.com/19653980/214443610-76c82057-61e3-4dad-a1ae-a40c21a8111a.png> — Reply to this email directly, view it on GitHub <#100 (comment)>, or unsubscribe <https://github.com/notifications/unsubscribe-auth/AC2EQTMHFKWHMCPRC5XH2E3WUBQQBANCNFSM6AAAAAAUB6T4ZA> . You are receiving this because you were mentioned.Message ID: ***@***.***>

-- *Geoffrey Négiar* *PhD Candidate - UC Berkeley, EECS departmentX2013*

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Trouble reproducing Stochastic FW benchmarks #100

Trouble reproducing Stochastic FW benchmarks #100

busycalibrating commented Jan 20, 2023

fabianp commented Jan 23, 2023

busycalibrating commented Jan 23, 2023

fabianp commented Jan 24, 2023

busycalibrating commented Jan 24, 2023

fabianp commented Jan 24, 2023 via email

busycalibrating commented Jan 24, 2023

GeoffNN commented Jan 24, 2023 •

edited

Loading

fabianp commented Jan 24, 2023

busycalibrating commented Jan 24, 2023

GeoffNN commented Jan 24, 2023 •

edited

Loading

busycalibrating commented Jan 24, 2023 •

edited

Loading

GeoffNN commented Jan 24, 2023 via email

Trouble reproducing Stochastic FW benchmarks #100

Trouble reproducing Stochastic FW benchmarks #100

Comments

busycalibrating commented Jan 20, 2023

fabianp commented Jan 23, 2023

busycalibrating commented Jan 23, 2023

fabianp commented Jan 24, 2023

busycalibrating commented Jan 24, 2023

fabianp commented Jan 24, 2023 via email

busycalibrating commented Jan 24, 2023

GeoffNN commented Jan 24, 2023 • edited Loading

fabianp commented Jan 24, 2023

busycalibrating commented Jan 24, 2023

GeoffNN commented Jan 24, 2023 • edited Loading

busycalibrating commented Jan 24, 2023 • edited Loading

GeoffNN commented Jan 24, 2023 via email

GeoffNN commented Jan 24, 2023 •

edited

Loading

GeoffNN commented Jan 24, 2023 •

edited

Loading

busycalibrating commented Jan 24, 2023 •

edited

Loading