I think this might be much better now than it was. There have been a large number of improvements since this bug was filed that should mean it's heaps faster.
There is one tiny thing that could be done to make this even faster: use a "slack delta" if there are no blocks. This would bypass the rollsum calculation entirely and just emit a literal command for each input buffer.
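Roughly what I have in mind, as a minimal Python sketch rather than the real code. All the names here (`slack_delta`, `make_delta`, `matching_delta`, the `("LITERAL", data)` tuples) are made up for illustration and are not the project's actual API; I'm also assuming empty buffers can simply be skipped.

```python
from typing import Callable, Collection, Iterable, Iterator, Tuple

# Hypothetical command representation: ("LITERAL", raw_bytes).
Command = Tuple[str, bytes]


def slack_delta(buffers: Iterable[bytes]) -> Iterator[Command]:
    """Emit one literal command per non-empty input buffer.

    No rolling checksum is computed, because with no blocks to match
    against there is nothing a checksum lookup could ever find.
    """
    for buf in buffers:
        if buf:  # assumption: empty buffers produce no command
            yield ("LITERAL", bytes(buf))


def make_delta(
    blocks: Collection,
    buffers: Iterable[bytes],
    matching_delta: Callable[[Collection, Iterable[bytes]], Iterator[Command]],
) -> Iterator[Command]:
    """Pick the cheap path when there are no blocks at all."""
    if not blocks:
        return slack_delta(buffers)
    # Otherwise fall back to the normal rollsum-based matcher
    # (passed in here as a hypothetical callable).
    return matching_delta(blocks, buffers)
```

With an empty block list, `make_delta([], [b"abc", b"de"], matcher)` never calls `matcher` and just yields two literal commands, one per buffer.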
I can do this pretty easily... I'll put together a merge request.