
[Peer Review] Add progress monitoring #5

Closed
cthoyt opened this issue Oct 2, 2020 · 10 comments
Labels
help wanted Extra attention is needed

Comments


cthoyt commented Oct 2, 2020

For small graphs like BioGRID, it might be reasonable to wait for 30 seconds or so in the different steps, but as the graphs start getting large and training time gets into the minutes/hours, it would be nice to communicate progress to the user.

I would normally suggest using tqdm, but after modifying the code myself, it doesn't look like it plays nicely with numba's parallelization. I found this thread, numba/numba#4267, which describes the issue in more detail, and the outlook is a bit grim.

It's not a dealbreaker, but it would make your package much more user friendly.


RemyLau commented Mar 8, 2021

Many attempts have been made to exchange progress information among threads during numba JIT parallelism, which is necessary for printing the overall progress, in particular for the simulate_walk function. One idea was to use atomic operations to sync progress among threads, but CPU atomics are not available yet: numba/numba#2988. Another idea was to use a reduction on the iteration count, but the reduction only completes after all threads are done.


RemyLau commented Mar 8, 2021

I also attempted to have just one of the threads print its own progress as an estimate of the overall progress, using numba.np.ufunc.parallel._get_thread_id(). However, to print a progress bar in a nice format, it is necessary to flush the previous stdout line before the current print, which Python supports via either print(flush=True) or sys.stdout.flush(), yet neither is supported in nopython mode: numba/numba#3475
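The estimation idea can be sketched in plain Python (a hypothetical helper, without the numba machinery — in the real code the counter would live inside the jitted loop and `_get_thread_id()` would select the printing thread): if static scheduling splits iterations roughly evenly, one thread's local fraction approximates the overall fraction.

```python
def estimate_overall_progress(local_done: int, local_total: int) -> float:
    """Estimate overall progress (in percent) from a single thread's local
    counter, assuming every thread got roughly the same number of
    iterations (static scheduling)."""
    if local_total == 0:
        return 100.0
    return 100.0 * local_done / local_total

# If the designated printing thread has finished 47 of its 100 iterations,
# we estimate the run as a whole to be ~47 % complete.
print(estimate_overall_progress(47, 100))
```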


songololo commented May 19, 2021

Interesting discussion, one I've been following for some time (across other threads).

In my case I've been using a manual progress bar (just hash symbols printed as strings from nopython mode), and in parallelised mode it scrambles the order, e.g.:

|                                 | 0.0 %
|############################     | 87.87 %
|########################         | 75.75 %
|################                 | 51.51 %
|####################             | 63.63 %

Though I suppose this still gives some indication that something is happening. One way to give some sense of progress might be to change the progress bar to read something like "x of N batches have been completed..." instead of an explicit percentage and progress bar.

Edit: Batches probably wouldn't work, since the iterations triggering the update are probably fairly arbitrary...?
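For what it's worth, hash-bar lines like the ones above can be produced by a small pure-Python formatter (a sketch with made-up names — the real prints happen from nopython mode, where string formatting is limited to basic print arguments):

```python
def render_bar(done: int, total: int, width: int = 33) -> str:
    """Format a manual hash-symbol progress bar, similar to the
    (scrambled) lines shown above."""
    filled = int(width * done / total) if total else width
    pct = round(100.0 * done / total, 2) if total else 100.0
    return "|" + "#" * filled + " " * (width - filled) + "| " + str(pct) + " %"

print(render_bar(29, 33))  # 29 of 33 batches done -> ~87.88 %
```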


RemyLau commented May 19, 2021

Hi @songololo, thanks for your comment! That's a good idea, and I've tried something similar along these lines. But as you've pointed out at the end, there's an issue with the scheduling being random. The main thing holding this kind of counting-based approach back is the inability to communicate progress information across threads during parallel execution in Numba, as briefly discussed above. So at the moment it is impossible to get an overall progress figure by counting the number of iterations done across all threads.

That being said, we could instead monitor the progress of each thread separately and output something like "x% of iterations completed in thread i / N", where i is the thread id and N is the total number of threads. However, this makes the monitoring very messy, so it might not be a good idea. One workaround is to flush out previous prints whenever the next monitoring message is printed, but flushing is not currently supported in Numba nopython mode (see above).

@songololo

@RemyLau I don't fully understand the numba.np.ufunc.parallel._get_thread_id() approach, but if it works in principle, then would this be worth pursuing?

i.e. just making peace with a not-so-nice progress bar that doesn't flush between prints?


RemyLau commented May 19, 2021

@songololo so the main idea of the _get_thread_id() approach is to keep track of the progress of each individual thread instead of the overall progress, since numba threads are not capable of communicating with each other at present. I tried to make it print the progress just now, and I realized that even if we only track progress within a single thread, there's a complication: the total number of jobs (or iterations) assigned to a thread is not known a priori, so we can't compute the percentage of iterations done by an individual thread either.

Update: after some more searching, it seems that prange performs static scheduling by default and assigns a (roughly) equal number of jobs to each thread; see this reply in an issue about numba scheduling. So perhaps I can still output the progress for each thread. Another complication I've encountered is that nopython mode does not support string formatting (see numba/numba#3250). In another thread, numba/numba#3475, this is linked to a StackOverflow answer that uses the objmode() context manager to resolve the issue.
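Under static scheduling, each of N threads should get a contiguous chunk of roughly n/N iterations, which makes the per-thread total knowable up front. Here's a pure-Python model of such a split (an assumption for illustration only — numba's actual partitioning may differ in detail):

```python
def static_chunks(n: int, n_threads: int):
    """Split n iterations into n_threads contiguous chunks whose sizes
    differ by at most one -- a rough model of prange's static scheduling.
    Returns a list of (start, stop) half-open ranges, one per thread."""
    base, extra = divmod(n, n_threads)
    chunks, start = [], 0
    for t in range(n_threads):
        size = base + (1 if t < extra else 0)
        chunks.append((start, start + size))
        start += size
    return chunks

# e.g. 10 iterations over 4 threads:
print(static_chunks(10, 4))  # -> [(0, 3), (3, 6), (6, 8), (8, 10)]
```

With the chunk bounds known, each thread can compute its own completion percentage from its local loop index alone.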

Update on objmode: my attempt at using objmode() failed; I'm currently on numba 0.52.0. I'll check whether this works on the latest released version of numba:

numba.core.errors.LoweringError: Failed in nopython mode pipeline (step: Handle with contexts)
Failed in nopython mode pipeline (step: nopython mode backend)
Failed in nopython mode pipeline (step: nopython mode backend)
ctypes objects containing pointers cannot be pickled

File "../src/pecanpy/node2vec.py", line 113:
        def node2vec_walks():
            <source elided>
                    print(np.float32(private_count / n * 100), "% walks generated by thread #", _get_thread_id())
                    with objmode():
                    ^

During: lowering "$148 = call $147(func=$147, args=[], kws=(), vararg=None)" at /mnt/ufs18/home-026/liurenmi/repo/PecanPy/src/pecanpy/node2vec.py (113)
During: lowering "id=2[LoopNest(index_variable = parfor_index.98, range = (0, $n.103, 1))]{290: <ir.Block at /mnt/ufs18/home-026/liurenmi/repo/PecanPy/src/pecanpy/node2vec.py (96)>, 130: <ir.Block at /mnt/ufs18/home-026/liurenmi/repo/PecanPy/src/pecanpy/node2vec.py (101)>, 228: <ir.Block at /mnt/ufs18/home-026/liurenmi/repo/PecanPy/src/pecanpy/node2vec.py (112)>, 132: <ir.Block at /mnt/ufs18/home-026/liurenmi/repo/PecanPy/src/pecanpy/node2vec.py (101)>, 194: <ir.Block at /mnt/ufs18/home-026/liurenmi/repo/PecanPy/src/pecanpy/node2vec.py (107)>, 208: <ir.Block at /mnt/ufs18/home-026/liurenmi/repo/PecanPy/src/pecanpy/node2vec.py (110)>, 86: <ir.Block at /mnt/ufs18/home-026/liurenmi/repo/PecanPy/src/pecanpy/node2vec.py (96)>, 158: <ir.Block at /mnt/ufs18/home-026/liurenmi/repo/PecanPy/src/pecanpy/node2vec.py (104)>}Var(parfor_index.98, node2vec.py:96)" at /mnt/ufs18/home-026/liurenmi/repo/PecanPy/src/pecanpy/node2vec.py (96)


songololo commented May 19, 2021

@RemyLau thanks, this makes sense compared to what I've experienced with plotting progress from prange.

For example, I've now switched to really simple updates which report a non-sequential bundle number:

Completed non-sequential bundle 1 of 33
Completed non-sequential bundle 30 of 33
Completed non-sequential bundle 26 of 33
etc.

From a function which looks like this:

@njit(cache=False)
def progress_bar(current: int, total: int, steps: int = 20):
    '''
    Printing carries a performance penalty
    Cache has to be set to false per Numba issue:
    https://github.com/numba/numba/issues/3555
    TODO: set cache to True once resolved
    '''
    if steps == 0:
        return
    step_size = int(total / steps)
    if step_size == 0:
        return
    if current % step_size == 0:
        chunk = round(current / step_size) + 1
        print('Completed non-sequential bundle', chunk, 'of', steps)

and called like this:

  for idx in prange(n):
      # numba no object mode can only handle basic printing
      # note that progress bar adds a performance penalty
      if not suppress_progress:
          progress_bar(idx, n, steps)

Not very pretty but at least it gives a sense of progress! 🤣
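The trigger logic above can be replayed in plain Python (dropping the @njit decorator) to see which iterations actually print and which bundle number they report — e.g. with 100 iterations and 20 steps, iterations 0, 5, 10, … fire, reporting bundles 1 through 20:

```python
def bundles_printed(total: int, steps: int = 20):
    """Replay the trigger logic of progress_bar above (without @njit) and
    return the (iteration, bundle_number) pairs that would print."""
    out = []
    if steps == 0:
        return out
    step_size = int(total / steps)
    if step_size == 0:
        return out
    for current in range(total):
        if current % step_size == 0:
            out.append((current, round(current / step_size) + 1))
    return out

print(bundles_printed(100, 20)[:3])  # -> [(0, 1), (5, 2), (10, 3)]
```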


RemyLau commented May 19, 2021

Thanks for the suggestion @songololo! I've just pushed some changes (see commit 9c6ca6f) to implement the suggested progress monitoring. Here's an example of the output with the --verbose option set to enable progress monitoring:

$ pecanpy --input karate.edg --output karate.emd --dimensions 8 --workers 4 --verbose

Took 00:00:00.00 to load graph
Took 00:00:00.00 to pre-compute transition probabilities
Thread #  3 progress: |###                      | 9.41 %
Thread #  0 progress: |###                      | 9.41 %
Thread #  1 progress: |###                      | 9.41 %
Thread #  2 progress: |###                      | 9.41 %
Thread #  3 progress: |#####                    | 18.82 %
Thread #  1 progress: |#####                    | 18.82 %
Thread #  2 progress: |#####                    | 18.82 %
Thread #  0 progress: |#####                    | 18.82 %
Thread #  3 progress: |########                 | 28.23 %
Thread #  1 progress: |########                 | 28.23 %
Thread #  2 progress: |########                 | 28.23 %
Thread #  0 progress: |########                 | 28.23 %
Thread #  3 progress: |##########               | 37.64 %
Thread #  1 progress: |##########               | 37.64 %
Thread #  2 progress: |##########               | 37.64 %
Thread #  3 progress: |############             | 47.05 %
Thread #  1 progress: |############             | 47.05 %
Thread #  0 progress: |##########               | 37.64 %
Thread #  2 progress: |############             | 47.05 %
Thread #  3 progress: |###############          | 56.47 %
Thread #  1 progress: |###############          | 56.47 %
Thread #  2 progress: |###############          | 56.47 %
Thread #  0 progress: |############             | 47.05 %
Thread #  3 progress: |#################        | 65.88 %
Thread #  1 progress: |#################        | 65.88 %
Thread #  2 progress: |#################        | 65.88 %
Thread #  3 progress: |###################      | 75.29 %
Thread #  1 progress: |###################      | 75.29 %
Thread #  0 progress: |###############          | 56.47 %
Thread #  2 progress: |###################      | 75.29 %
Thread #  1 progress: |######################   | 84.7 %
Thread #  3 progress: |######################   | 84.7 %
Thread #  0 progress: |#################        | 65.88 %
Thread #  2 progress: |######################   | 84.7 %
Thread #  1 progress: |######################## | 94.11 %
Thread #  3 progress: |######################## | 94.11 %
Thread #  2 progress: |######################## | 94.11 %
Thread #  0 progress: |###################      | 75.29 %
Thread #  0 progress: |######################   | 84.7 %
Thread #  0 progress: |######################## | 94.11 %
Took 00:00:05.70 to generate walks
Took 00:00:00.03 to train embeddings

I guess the couple of other things I'll do right after are to add a progress bar to the preprocessing phase for PreComp mode, and to check whether the latest Numba release lets me use objmode to enable flushing and make the printing prettier.

@songololo

Interesting approach! Thanks for sharing.


RemyLau commented Nov 25, 2021

@cthoyt cthoyt changed the title Add progress monitoring [Peer Review] Add progress monitoring Aug 22, 2022