Currently, these two functions split the list (almost a million entries) into chunks and then put the chunks in a queue. Unfortunately, the chunking step alone takes tens of seconds. My suspicion is that chunking the list causes the list to be fully evaluated.
Instead of chunking the list, we should just put the whole list into the queue. The consumer should then pop the whole list off the queue, take the number of elements it intends to process, and put the rest of the list back at the head of the queue. See unGetTBQueue.
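A minimal sketch of what the consumer side could look like, assuming the queue holds the list itself (a `TBQueue [a]`) and using `unGetTBQueue` from the `stm` package. The name `takeBatch` and the batch sizes are hypothetical and only for illustration; they are not taken from the existing code.

```haskell
import Control.Concurrent.STM
  ( STM, TBQueue, atomically, newTBQueueIO
  , readTBQueue, unGetTBQueue, writeTBQueue )
import Control.Monad (unless)

-- | Hypothetical helper: pop the list at the head of the queue, keep up to
-- @n@ elements, and push the remainder back onto the head of the queue.
-- Blocks (retries) while the queue is empty.
takeBatch :: Int -> TBQueue [a] -> STM [a]
takeBatch n queue = do
  items <- readTBQueue queue
  -- splitAt is lazy: only the first n cells of the list are forced here.
  let (batch, rest) = splitAt n items
  -- Put the unprocessed tail back so the next read sees it first.
  unless (null rest) $ unGetTBQueue queue rest
  pure batch

main :: IO ()
main = do
  queue <- newTBQueueIO 1                       -- one slot: the whole list
  atomically $ writeTBQueue queue [1 .. 1000000 :: Int]
  first  <- atomically $ takeBatch 3 queue
  second <- atomically $ takeBatch 2 queue
  print first
  print second
```

Because the read and the `unGetTBQueue` happen in the same STM transaction, other consumers never observe the queue without the remainder. Whether this actually avoids the full evaluation depends on how the list is produced, so it would need to be measured.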