docs: using Iterators instead of Generators
william-silversmith committed Feb 18, 2019
1 parent c29f2c5 commit df58d58
Showing 1 changed file with 29 additions and 0 deletions.
29 changes: 29 additions & 0 deletions README.md
@@ -179,6 +179,35 @@ Second, we pass parameters for task generation to the child processes, not tasks.

Third, as described in the narrative for Listing 5, the GreenTaskQueue has less context-switching overhead than the ordinary multithreaded TaskQueue. Using GreenTaskQueue lets each core run efficiently and independently of the others. At this point, your main bottlenecks will probably be OS or network card related (let us know if they aren't!). Multiprocessing does scale task production, but not 1:1 with the number of processes. The number of tasks per process will fall with each additional core, but each core still adds throughput up to about 16 cores.

```python
# Listing 7: Exchanging Generators for Iterators
import gevent.monkey
gevent.monkey.patch_all()
from taskqueue import GreenTaskQueue
from concurrent.futures import ProcessPoolExecutor

# PrintTask is assumed to be defined or imported as in the earlier listings.
class PrintTaskIterator(object):
  def __init__(self, start, end):
    self.start = start
    self.end = end
  def __len__(self):
    return self.end - self.start
  def __iter__(self):
    # Tasks are created lazily in whichever process iterates this object.
    for i in range(self.start, self.end):
      yield PrintTask(i)

def upload(tsks):
  tq = GreenTaskQueue('wms-test-pull-queue-2')
  tq.insert_all(tsks)

tasks = [ PrintTaskIterator(0, 100), PrintTaskIterator(100, 200) ]
with ProcessPoolExecutor(max_workers=2) as execute:
  execute.map(upload, tasks)
```

If you insist on passing generators to your subprocesses, you can use iterators instead. The construction above lets us write the generator call up front, pass only a few primitives through the pickling process, and transparently call the generator on the other side. We can even support the `len()` function, which is not available for generators.
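
To see why this works, here is a minimal sketch that uses only the standard library `pickle` module, with no queue involved. `RangeIterator`, `count`, and `can_pickle` are hypothetical names made up for this illustration; `RangeIterator` mirrors `PrintTaskIterator` but yields plain integers. It assumes CPython, where generator objects refuse to pickle.

```python
import pickle

class RangeIterator(object):
  """Hypothetical stand-in for PrintTaskIterator that yields plain ints."""
  def __init__(self, start, end):
    self.start = start
    self.end = end
  def __len__(self):
    return self.end - self.start
  def __iter__(self):
    for i in range(self.start, self.end):
      yield i

def count(start, end):
  # An ordinary generator function covering the same range.
  for i in range(start, end):
    yield i

def can_pickle(obj):
  # CPython raises TypeError when asked to pickle a generator object.
  try:
    pickle.dumps(obj)
    return True
  except (TypeError, pickle.PicklingError):
    return False

generator_ok = can_pickle(count(0, 3))

# Only the two integers (start, end) travel through pickle;
# the generator itself is created after the round trip.
restored = pickle.loads(pickle.dumps(RangeIterator(0, 3)))
restored_items = list(restored)
```

The round trip here is the same thing that happens when `ProcessPoolExecutor` ships an argument to a child process, so an object that survives it can be handed to `execute.map`.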


[1] You can't pickle generators in CPython, so they can't be passed to subprocesses, but [you can pass iterators](https://stackoverflow.com/questions/1939015/singleton-python-generator-or-pickle-a-python-generator/1939493#1939493). You can pass generators if you use PyPy or Stackless Python.

**--**