fix: ensure executor doesn't deadlock when closure errors #152

Open
wants to merge 1 commit into master
Conversation

ifross89

When running 'gomod2nix' in my project, the 'gomod2nix import' step was failing for every import. I have more imports than the default maxJobs.

This caused a deadlock in the ParallelExecutor, and the program never finished.

This is because, in the error case, we send to errChan, which is a blocking channel. If that send blocks, the deferred calls never run, most importantly the defer that releases an entry from the semaphore (e.guard).

This means that once the number of erroring work functions exceeds numWorkers, we block trying to acquire the semaphore when .Add is called with more work.

We never reach the call to .Wait(), which would drain errChan, because we are blocked on the semaphore while still generating work.
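
For illustration, here is a minimal sketch of the deadlock-prone shape described above. The type and field names (ParallelExecutor, Add, errChan, guard) follow the description; the actual code in the repository may differ in detail.

```go
package executor

import "sync"

// Sketch only: approximates the pattern described above, not the exact source.
type ParallelExecutor struct {
	errChan chan error    // a send blocks until Wait() receives from it
	guard   chan struct{} // semaphore with numWorkers slots
	wg      sync.WaitGroup
}

func (e *ParallelExecutor) Add(fn func() error) {
	e.guard <- struct{}{} // acquired by the caller: blocks once all slots are taken
	e.wg.Add(1)
	go func() {
		defer e.wg.Done()
		defer func() { <-e.guard }() // never runs while the send below is blocked
		if err := fn(); err != nil {
			e.errChan <- err // blocks: nothing reads errChan until Wait(), which is never reached
		}
	}()
}
```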

This change moves the semaphore acquire into the goroutines themselves. This alters the behaviour in that we now start as many goroutines as there are work items, but the work they do is still gated by the semaphore; a sketch of the new shape follows below.

This is reasonable behaviour: goroutines are cheap, and in general this package is useful when the work the functions do is expensive, not when goroutine creation itself is the cost. The work is still guarded by the semaphore.
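
In sketch form, the fixed Add looks roughly like this (same assumed type as the sketch above). Add itself no longer blocks, so the caller can reach .Wait(), which drains errChan:

```go
package executor

import "sync"

// Same sketch type as above; only Add changes.
type ParallelExecutor struct {
	errChan chan error
	guard   chan struct{}
	wg      sync.WaitGroup
}

func (e *ParallelExecutor) Add(fn func() error) {
	e.wg.Add(1)
	go func() { // one goroutine per work item; cheap to create
		defer e.wg.Done()
		e.guard <- struct{}{} // acquire inside the goroutine, so Add never blocks the caller
		defer func() { <-e.guard }()
		if err := fn(); err != nil {
			// The send may still wait, but the caller is free to reach Wait(),
			// which drains errChan and unblocks it.
			e.errChan <- err
		}
	}()
}
```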

A regression test is also added, and in passing the spelling of Parallel is corrected.
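
The regression test could look roughly like the sketch below: submit more failing closures than the worker limit and assert that .Wait() returns instead of hanging. The constructor name NewParallelExecutor and its maxJobs argument are assumptions for illustration.

```go
package executor

import (
	"errors"
	"testing"
	"time"
)

func TestAddErroringClosuresDoesNotDeadlock(t *testing.T) {
	const maxJobs = 2
	e := NewParallelExecutor(maxJobs) // hypothetical constructor name

	done := make(chan struct{})
	go func() {
		// Submit far more erroring closures than there are worker slots.
		for i := 0; i < maxJobs*10; i++ {
			e.Add(func() error { return errors.New("boom") })
		}
		_ = e.Wait() // should return an error, not hang
		close(done)
	}()

	select {
	case <-done:
		// Executor finished: no deadlock.
	case <-time.After(5 * time.Second):
		t.Fatal("executor deadlocked when closures returned errors")
	}
}
```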
