
Need better heuristic for default value of LAZYFLOW_THREADS #1458

Open

stuarteberg opened this issue Apr 5, 2017 · 7 comments

Comments
@stuarteberg (Member) commented Apr 5, 2017

Lou Scheffer was experiencing poor interactive performance during pixel classification on a large 2D image. But on my machine, I get good performance, even using the same project file. The most notable difference between our two systems is the number of CPUs: Lou has 48 (presumably, half are hyper-threads).

We timed (with a stopwatch) how long it took to completely predict all tiles of a 12577x750 image using various settings of LAZYFLOW_THREADS. We found that the benefit of using more CPUs disappears rather quickly. Maximum performance is achieved at around 4-8 threads. Using all 48 CPUs is nearly as bad as using just a single thread.

I believe that our upcoming switch to 512px tiles will improve the situation, but we should still come up with a better heuristic for how to set LAZYFLOW_THREADS if the user hasn't set it themselves. Otherwise, users with beefy workstations are likely to experience worse performance than users with modern laptops.

| LAZYFLOW_THREADS | Time (seconds) |
|------------------|----------------|
| 1                | 95             |
| 2                | 63             |
| 4                | 37             |
| 8                | 35             |
| 16               | 41             |
| 32               | 60             |
| 48 (max)         | 72             |

Image Size: (12577, 750)
Features: All
Prediction classes: 2
Viewer tile size: 256x256
OS: Fedora 20
ilastik version: 1.2.0
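
One possible direction (just a sketch, not anything that exists in lazyflow today) would be to cap the default thread count instead of defaulting to the full CPU count. The function name `default_lazyflow_threads` and the cap of 8 are assumptions based on the timings above:

```python
import os

def default_lazyflow_threads(max_useful=8):
    """Hypothetical default: cap the thread count instead of using all CPUs.

    The timings above saturate around 4-8 threads, so a capped default
    avoids the slowdown seen on many-core machines.  The cap of 8 is an
    assumption, not a tuned constant.
    """
    env = os.environ.get("LAZYFLOW_THREADS")
    if env is not None:
        return int(env)                      # always respect an explicit user setting
    return min(os.cpu_count() or 1, max_useful)
```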


@stuarteberg (Member, Author)

There is some hope that the situation will magically improve when we upgrade to Python 3, since it uses the "New GIL" implementation:

http://www.dabeaz.com/python/NewGIL.pdf

But it's difficult to say for sure.

@akreshuk (Member)

I'm writing it here since it might be related and the old issue is closed now. Our switch to 512x512 tiles for 2D gave a very noticeable speed-up there but, as I just found, also a very noticeable slow-down for 3D data. As a first step, we should make the tile size conditional on the dataset dimensions, but it would be good to understand this behaviour in principle.
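
A minimal sketch of what "tile size conditional on the dataset dimensions" could look like; the function name and the concrete shapes are made up for illustration, not values used by ilastik:

```python
def default_tile_shape(spatial_shape):
    """Hypothetical rule: pick the tile/block shape from the data dimensionality.

    512x512 tiles helped 2D prediction but hurt 3D, so the block shape
    could depend on whether the spatial data is 2D or 3D.  The concrete
    numbers below are assumptions, not tuned values.
    """
    nonsingleton = [s for s in spatial_shape if s > 1]
    if len(nonsingleton) <= 2:
        return (512, 512)        # 2D: large tiles amortize per-tile overhead
    return (64, 64, 64)          # 3D: smaller blocks keep per-block memory in check

# e.g. default_tile_shape((12577, 750, 1)) -> (512, 512)
#      default_tile_shape((250, 250, 250)) -> (64, 64, 64)
```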

stuarteberg added a commit to ilastik/lazyflow that referenced this issue Apr 27, 2017
@akreshuk (Member) commented Jan 12, 2018

Looks like the default behavior only got worse in the new version. Here is a screenshot of my machine doing autocontext. The whole thing feels slow, and most of the time it's running on fewer than 8 cores, each at less than 100%.
[screenshot: CPU usage while running autocontext]

@k-dominik (Contributor) commented Sep 5, 2018

So I did some benchmarking on a machine with 20 cores (40 threads).
The task was pixel classification on a CREMI sample dataset. In addition to varying the number of threads, I also varied the amount of RAM available to lazyflow, with some interesting results:

In short, I could reproduce the behavior that @stuarteberg showed in the first post. However, with more RAM, ilastik could use more and more threads without slowing down.

I guess we see three effects:

  1. With small amounts of RAM and many threads, we get very small block sizes -> the halo overhead slows us down.
  2. With a small number of threads and loads of RAM, all threads start concurrently and finish at roughly the same time. When one thread starts writing its result, the other threads are also more or less finished and have to wait for the writing to complete.
  3. At some point (all threads more or less running all the time, one thread writing all the time) we run into saturation; this can only be overcome by parallel writing.

So in the end, we should find a better heuristic that sets both LAZYFLOW_THREADS and LAZYFLOW_TOTAL_RAM_MB jointly.
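
A minimal sketch of such a joint rule, assuming a per-thread RAM floor below which adding threads stops paying off; the function name, the 75% budget, and the 2 GB floor are all assumptions, not measured constants:

```python
def pick_threads_and_ram(total_ram_mb, n_cpus, min_mb_per_thread=2048):
    """Hypothetical joint heuristic for LAZYFLOW_THREADS and LAZYFLOW_TOTAL_RAM_MB.

    Idea from the benchmark: extra threads only help while each thread
    still gets a reasonably large share of the RAM budget (so block
    sizes, and hence the relative halo overhead, stay sane).
    """
    ram_budget_mb = int(total_ram_mb * 0.75)   # leave headroom for the OS and GUI
    threads = max(1, min(n_cpus, ram_budget_mb // min_mb_per_thread))
    return threads, ram_budget_mb

# e.g. a 40-thread machine with 64 GiB of RAM:
#      pick_threads_and_ram(65536, 40) -> (24, 49152)
```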

[benchmark plots for RAM limits of 16384, 65536, 131072, and 240000 MB]

@imagesc-bot

This issue has been mentioned on Image.sc Forum. There might be relevant details there:

https://forum.image.sc/t/notable-memory-usage-difference-when-running-ilastik-in-headless-mode-on-different-machines/41144/2

@imagesc-bot

This issue has been mentioned on Image.sc Forum. There might be relevant details there:

https://forum.image.sc/t/cpu-and-ram-core-limit-for-ilastik/52428/2

@imagesc-bot

This issue has been mentioned on Image.sc Forum. There might be relevant details there:

https://forum.image.sc/t/multiphase-segmentation-and-other-questions/78696/4
