Description
Clearinghouse for "how to execute N instances of a given task, parameterized in some fashion, in parallel?"
Fabric 1.x has a naive parallelization which assumes a single parameterization vector (host lists); we need to support this more generally while still honoring that sort of use case.
Fab 1.x's implementation has to leverage multiprocessing because it is not thread-safe; Invoke should be thread-safe, so threads are an option, as are multiprocessing and (if we decide it's worthwhile to add a dependency on them) coroutines/greenlets. EDIT: see also Tulip/asyncio in Python 3.x and its Python 2.x port, Trollius.
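
As a rough sketch of the thread-based option (not a proposal for Invoke's actual API — the task body and parameter list below are made up), running N instances of a task over a parameter vector could be as simple as:

```python
# Hypothetical illustration only: execute one task instance per
# parameter using a standard-library thread pool.
from concurrent.futures import ThreadPoolExecutor

def mytask(host):
    # Stand-in for a real task body, parameterized by host.
    return "ran on %s" % host

hosts = ["web1", "web2", "db1"]

# One worker per parameter; results come back in input order.
with ThreadPoolExecutor(max_workers=len(hosts)) as pool:
    results = list(pool.map(mytask, hosts))

print(results)
```

The multiprocessing and greenlet variants look structurally similar; the open questions below apply to all of them.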
Whichever approach we pick, we have to solve the following problems:
- Display of live output - how to tell different "channels" apart?
- Fabric handles this by forcing 'linewise' output in parallel mode and using a line prefix string. This is functional but not very readable (though troubleshooting-oriented "reading" can be helped by logging to per-channel files or data structures); see the prefixing sketch after this list.
- Communicating data between channels
- Generally, not doable or desirable since you have no idea which permutations of the task will run before/after others.
- Though if threading is used, users can simply opt to use locks/semaphores/queues/etc.; a small queue-based sketch follows this list.
- Communication between channels and the "master" process, e.g. return values (the results-gathering sketch below shows one possibility)
- Error handling - should errors partway through the run cause a halt, or just get tallied and displayed at the end?
- Probably more.
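
For the live-output question, a minimal sketch of the line-prefix approach (channel names and messages here are made up; this is not Invoke code) might look like:

```python
# Illustrative only: linewise output with a per-channel prefix.
import sys
import threading

write_lock = threading.Lock()

def emit(channel, text):
    # Serialize writes so lines from different channels don't interleave.
    with write_lock:
        for line in text.splitlines():
            sys.stdout.write("[%s] %s\n" % (channel, line))

emit("web1", "restarting service\nservice restarted")
emit("db1", "running migrations")
```

Logging to per-channel files or in-memory structures is the same idea, just with a different sink per channel.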
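
For the locks/semaphores/queues point, here is one way threads could hand data to each other or back to the master using only the standard library (again a sketch, not a proposed interface):

```python
# Illustrative only: channels push messages onto a shared queue,
# which the master drains once all threads have joined.
import queue
import threading

messages = queue.Queue()

def channel(name):
    # A real task body would do work here before reporting back.
    messages.put((name, "finished"))

threads = [threading.Thread(target=channel, args=(name,)) for name in ("web1", "web2")]
for t in threads:
    t.start()
for t in threads:
    t.join()

while not messages.empty():
    print(messages.get())
```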
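
For return values and the halt-vs-tally question, one sketch (standard-library futures, hypothetical task and hosts) collects per-channel results and exceptions and reports them at the end instead of aborting mid-run:

```python
# Illustrative only: gather results and tally errors per parameter
# rather than halting on the first failure.
from concurrent.futures import ThreadPoolExecutor

def deploy(host):
    # Hypothetical task body; fails for one host to show error tallying.
    if host == "db1":
        raise RuntimeError("migration failed on %s" % host)
    return "ok: %s" % host

hosts = ["web1", "web2", "db1"]
results, errors = {}, {}

with ThreadPoolExecutor(max_workers=len(hosts)) as pool:
    futures = {pool.submit(deploy, host): host for host in hosts}
    for future, host in futures.items():
        try:
            results[host] = future.result()
        except Exception as exc:
            errors[host] = exc

print("Succeeded:", results)
print("Failed:", errors)
```

A halt-on-first-error policy would instead re-raise inside the loop; either behavior could be a config option.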