parallel docs
zh217 committed Aug 23, 2018
1 parent bb753b9 commit c796ef7
102 changes: 51 additions & 51 deletions doc/parallel.ipynb
@@ -26,7 +26,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"Remember `async_pipe` and `async_pipe_unordered`? We discussed them in the context of trying to put more \"concurrency\" into our program by taking advantage of parallelism. But what does that mean here?"
"We discussed `async_pipe` and `async_pipe_unordered` in the context of trying to put more \"concurrency\" into our program by taking advantage of parallelism. What does \"parallelism\" mean here?"
]
},
{
@@ -40,21 +40,14 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"In our examples with `async_pipe` and `async_pipe_unordered`, we see that by giving them more coroutine instances to work with, we indeed achieved more throughput."
"With `async_pipe` and `async_pipe_unordered`, by giving them more coroutine instances to work with, we achieved higher throughput. But that is only because our coroutines are, in a quite literal sense, sleeping on the job: to simulate real jobs, we called `await` on `asyncio.sleep`. The event loop, faced with this await, just puts the coroutine on hold until it is ready to act again."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"But that is only because our coroutines are, in a quite literal sense, sleeping on the job! Remember that to simulate real jobs, we called `await` on `asyncio.sleep`. And the event loop, faced with this await, just puts the coroutine on hold until it is ready to act again."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Now it is entirely possible that this behaviour --- of not letting sleeping coroutines block the whole program --- is all you need. In particular, if you are dealing with network connections or sockets *and* you are using a proper asyncio-based library, then \"doing network work\" isn't too much from sleeping on the loop, and you *will* see performance gains."
"Now it is entirely possible that this behaviour --- of not letting sleeping coroutines block the whole program --- is all you need. In particular, if you are dealing with network connections or sockets *and* you are using a proper asyncio-based library, then \"doing network work\" isn't too much from sleeping on the loop."
]
},
{
@@ -81,7 +74,7 @@
"output_type": "stream",
"text": [
"[0, 2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30, 32, 34, 36, 38]\n",
"2.009559524987708\n"
"2.009612141060643\n"
]
}
],
@@ -106,7 +99,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"The only different than before (when we first introduced `async_pipe`) is that we replaced `asyncio.sleep` with `time.sleep`. With this change, we did not get *any* speed up with our so-called parallelism."
"The only different than before (when we first introduced `async_pipe`) is that we replaced `asyncio.sleep` with `time.sleep`. With this change, we did not get *any* speed up."
]
},
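{
"cell_type": "markdown",
"metadata": {},
"source": [
"To see concretely why `time.sleep` destroys the concurrency, here is a minimal plain-`asyncio` sketch (no aiochan involved): a worker that blocks in `time.sleep` holds up the entire loop, whereas one that does `await asyncio.sleep` lets the sleeps overlap."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import asyncio\n",
"import time\n",
"\n",
"async def blocking_worker(n):\n",
"    time.sleep(0.1)  # blocks the whole event loop: nothing else can run\n",
"    return n * 2\n",
"\n",
"async def yielding_worker(n):\n",
"    await asyncio.sleep(0.1)  # suspends only this coroutine\n",
"    return n * 2\n",
"\n",
"async def main():\n",
"    start = time.monotonic()\n",
"    await asyncio.gather(*[blocking_worker(i) for i in range(20)])\n",
"    print('blocking:', time.monotonic() - start)  # ~2s: the sleeps serialize\n",
"    start = time.monotonic()\n",
"    await asyncio.gather(*[yielding_worker(i) for i in range(20)])\n",
"    print('yielding:', time.monotonic() - start)  # ~0.1s: the sleeps overlap\n",
"\n",
"asyncio.get_event_loop().run_until_complete(main())"
]
},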
{
@@ -118,15 +111,15 @@
},
{
"cell_type": "code",
"execution_count": 4,
"execution_count": 3,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"[0, 2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30, 32, 34, 36, 38]\n",
"0.2073974380036816\n"
"0.20713990507647395\n"
]
}
],
@@ -151,22 +144,22 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"Note that when using `parallel_pipe`, our `worker` has to be a normal function instead of an async function. As before, if order is not important, `parallel_pipe_unordered` can give you even more throughput:"
"When using `parallel_pipe`, our `worker` has to be a normal function instead of an async function. As before, if order is not important, `parallel_pipe_unordered` can give you even more throughput:"
]
},
{
"cell_type": "code",
"execution_count": 13,
"execution_count": 5,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"[0, 2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30, 32, 34, 36, 38]\n",
"ordered time: 0.3653000319900457\n",
"[4, 12, 22, 6, 0, 16, 2, 14, 10, 18, 36, 26, 30, 8, 32, 24, 20, 34, 38, 28]\n",
"unordered time: 0.27417757001239806\n"
"ordered time: 0.35387236496899277\n",
"[16, 2, 8, 24, 6, 10, 0, 32, 22, 34, 12, 36, 4, 38, 28, 18, 30, 20, 14, 26]\n",
"unordered time: 0.19887939398176968\n"
]
}
],
@@ -196,34 +189,34 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"In fact, `parallel_pipe` works by starting a thread-pool in the back and execute the workers on the thread-pool. Multiple threads can solve the problem of workers sleeping on the thread, as in our example. But what about the GIL? Remember that the default implementation of python, the CPython, has a global interpreter lock (GIL) which prevents more than one python statement executing at the same time. Will `parallel_pipe` help in the presence of GIL, besides the case of workers just sleeping?"
"In fact, `parallel_pipe` works by starting a thread-pool and execute the workers in the thread-pool. Multiple threads can solve the problem of workers sleeping on the thread, as in our example. But remember that the default implementation of python, the CPython, has a global interpreter lock (GIL) which prevents more than one python statement executing at the same time. Will `parallel_pipe` help in the presence of GIL, besides the case of workers just sleeping?"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"It turns out that for a great number of cases, multiple threads help greatly even in the presence of GIL. This is because most of the heavy-lifting operations, for example file accesses, are implemented in C instead of in python, and in C it is possible to release the GIL when not interacting with the python runtime. Hence, for example, for file operations `parallel_pipe` will suffice. If you are doing number-crunching, then hopefully you are not doing it in pure python but instead relies on dedicated libraries like numpy, scipy, etc. You will be glad to know that all of these libraries do release the GIL when it makes sense to do so. So again, using `parallel_pipe` is usually enough."
"It turns out that for the majority of serious cases, multiple threads help even in the presence of the GIL, because most of the heavy-lifting operations, for example file accesses, are implemented in C instead of in pure python, and in C it is possible to release the GIL when not interacting with the python runtime. In addition to file accesses, if you are doing number-crunching, then hopefully you are not doing it in pure python but instead relies on dedicated libraries like numpy, scipy, etc. All of these libraries release the GIL when it makes sense to do so. So using `parallel_pipe` is usually enough."
]
},
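{
"cell_type": "markdown",
"metadata": {},
"source": [
"As a stdlib-only illustration of C code releasing the GIL (the worker names below are ours, not from aiochan): CPython's `hashlib` releases the GIL while hashing data larger than 2047 bytes, so the C-backed worker speeds up across threads while the pure-python worker does not."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import hashlib\n",
"import os\n",
"import time\n",
"from concurrent.futures import ThreadPoolExecutor\n",
"\n",
"data = os.urandom(1 << 24)  # 16 MiB: hashing this releases the GIL\n",
"\n",
"def c_worker(_):\n",
"    return hashlib.sha256(data).hexdigest()  # C code, GIL released\n",
"\n",
"def py_worker(_):\n",
"    return sum(i * i for i in range(10 ** 6))  # pure python: holds the GIL\n",
"\n",
"for worker in (c_worker, py_worker):\n",
"    start = time.perf_counter()\n",
"    with ThreadPoolExecutor(max_workers=4) as executor:\n",
"        list(executor.map(worker, range(8)))\n",
"    print(worker.__name__, time.perf_counter() - start)"
]
},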
{
"cell_type": "markdown",
"metadata": {},
"source": [
"What if you just have to do your CPU-intensive tasks in python? Well, `parallel_pipe` and `parallel_pipe_unordered` takes an argument called `mode`, which by default takes the value `thread`. If you change it to `process`, then a process-pool instead of a thread-pool will be used. Using process pools, you finally can have multiple python statements executing at the same time. Let's see a comparison:"
"What if you just have to do your CPU-intensive tasks in python? Well, `parallel_pipe` and `parallel_pipe_unordered` takes an argument called `mode`, which by default takes the value `'thread'`. If you change it to `'process'`, then a process-pool instead of a thread-pool will be used. Let's see a comparison:"
]
},
{
"cell_type": "code",
"execution_count": 21,
"execution_count": 6,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"using threads 1.89601040299749\n",
"using threads 0.20880025799851865\n"
"using threads 1.7299788249656558\n",
"using threads 0.20847543003037572\n"
]
}
],
@@ -254,19 +247,19 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"Why not always use a process pool? Processes have much greater overhead than threads, and there are far more restrictions. Crucially, you cannot share any object, and anything you pass to your worker, or return from your worker, must be picklable."
"Why not use a process pool in all cases? Processes have much greater overhead than threads, and also far more restrictions on their use. Crucially, you cannot share any object unless you do some dirty work yourself, and anything you pass to your worker, or return from your worker, must be picklable."
]
},
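{
"cell_type": "markdown",
"metadata": {},
"source": [
"To make the pickling restriction concrete, here is a small stdlib sketch (the helper `double` is ours): a module-level function works fine with a process pool, while a lambda cannot be pickled and the failure surfaces on the returned future."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from concurrent.futures import ProcessPoolExecutor\n",
"\n",
"def double(x):  # module-level function: picklable, fine for a process pool\n",
"    return x * 2\n",
"\n",
"if __name__ == '__main__':\n",
"    with ProcessPoolExecutor(max_workers=2) as executor:\n",
"        print(list(executor.map(double, range(5))))  # [0, 2, 4, 6, 8]\n",
"        try:\n",
"            executor.submit(lambda: 1).result()  # lambdas cannot be pickled\n",
"        except Exception as e:\n",
"            print(type(e).__name__)"
]
},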
{
"cell_type": "markdown",
"metadata": {},
"source": [
"In our example, our worker is a pure function. It is also possible to prepare some structures in each worker before-hand. In python 3.7 or above, there are the `initializer` and `init_args` arguments accepted by `parallel_pipe` and `parallel_pipe_unordered`, which will be passed to the construction to the pool executors to do the setup. Prior to python 3.7, such a setup is still possible with some hack: you can put the object to be set up in a `threading.local` object, and for *every* worker execution, check if the object exists, if not, do the initialization:"
"In our example, our worker is a pure function. It is also possible to prepare some structures in each worker before-hand. In python 3.7 or above, there are the `initializer` and `init_args` arguments accepted by `parallel_pipe` and `parallel_pipe_unordered`, which will be passed to the construction to the pool executors to do the setup. Prior to python 3.7, such a setup is still possible with some hack: you can put the object to be set up in a `threading.local` object, and for *every* worker execution, check if the object exists, and if not, do the initialization:"
]
},
{
"cell_type": "code",
"execution_count": 22,
"execution_count": 7,
"metadata": {},
"outputs": [
{
@@ -308,14 +301,14 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"And this also works for `mode='process'`."
"Since we used two thread workers, the setup is done twice. This also works for `mode='process'`."
]
},
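{
"cell_type": "markdown",
"metadata": {},
"source": [
"The `initializer`/`init_args` pair is handed to the underlying pool executor, so the mechanism is the one from `concurrent.futures`; here is a minimal sketch of that mechanism used directly (python 3.7+; the names `init` and `work` are ours):"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import threading\n",
"from concurrent.futures import ThreadPoolExecutor\n",
"\n",
"state = threading.local()\n",
"\n",
"def init(tag):\n",
"    # runs once in each worker thread, before it processes any job\n",
"    state.tag = tag\n",
"    print('setting up', threading.current_thread().name)\n",
"\n",
"def work(x):\n",
"    return state.tag, x * 2\n",
"\n",
"with ThreadPoolExecutor(max_workers=2, initializer=init,\n",
"                        initargs=('worker',)) as executor:\n",
"    print(list(executor.map(work, range(4))))"
]
},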
{
"cell_type": "markdown",
"metadata": {},
"source": [
"What about parallelising work across the network? Or more exotic workflows? At its core, *aiochan* is a library that facilitates you in the job of moving data around these workflows: there is nothing preventing you to use channels at the end-points of a network-based parallelism framework, for example, message queues or frameworks like *dart*. Use the appropriate tool for the approriate job. *Aiochan* aims to give you maximum flexibility in developing concurrent workflows within the boundary of a single machine and a single event loop, and you should use *aiochan* it in tandem with some other suitable memory when you want to step out of this boundary."
"What about parallelising work across the network? Or more exotic workflows? At its core, *aiochan* is a library that facilitates moving data around within the boundary of a single process on a single machine, but there is nothing preventing you using channels at the end-points of a network-based parallelism framework such as message queues or a framework like *dart*. Within its bounday, *aiochan* aims to give you maximum flexibility in developing concurrent workflows, and you should use *aiochan* it in tandem with some other suitable libraries or frameworks when you want to step out of its boundary."
]
},
{
@@ -329,12 +322,12 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"Speaking of stepping out of boundaries, one of such cases is exceedingly common: you use an aiochan-based workflow to prepare a stream of values, but you want to consume these values outside of the asyncio event loop. Well, in this case, there are convenience methods that have you covered:"
"Speaking of stepping out of boundaries, one case is exceedingly common: you use an aiochan-based workflow to prepare a stream of values, but you want to consume these values outside of the asyncio event loop. In this case, there are convenience methods for you:"
]
},
{
"cell_type": "code",
"execution_count": 25,
"execution_count": 8,
"metadata": {},
"outputs": [
{
@@ -383,7 +376,7 @@
},
{
"cell_type": "code",
"execution_count": 26,
"execution_count": 9,
"metadata": {},
"outputs": [
{
@@ -420,6 +413,13 @@
"loop.call_soon_threadsafe(out.close);"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Other queues can be used as long as they follow the public API of `queue.Queue` and are thread-safe."
]
},
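{
"cell_type": "markdown",
"metadata": {},
"source": [
"The underlying pattern is simple enough to sketch with the stdlib alone (our own names throughout): a coroutine running on a background event loop feeds a plain `queue.Queue`, which is consumed synchronously on the main thread."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import asyncio\n",
"import queue\n",
"import threading\n",
"\n",
"q = queue.Queue()\n",
"DONE = object()  # sentinel marking the end of the stream\n",
"\n",
"async def produce():\n",
"    for i in range(5):\n",
"        await asyncio.sleep(0.01)\n",
"        q.put(i * 2)  # queue.Queue.put is thread-safe\n",
"    q.put(DONE)\n",
"\n",
"def run():\n",
"    asyncio.new_event_loop().run_until_complete(produce())\n",
"\n",
"threading.Thread(target=run, daemon=True).start()\n",
"\n",
"while True:  # consume outside any event loop\n",
"    v = q.get()\n",
"    if v is DONE:\n",
"        break\n",
"    print(v)"
]
},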
{
"cell_type": "markdown",
"metadata": {},
@@ -431,7 +431,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"Finally, before ending this tutorial, let's tell you a secret: you don't need asyncio to use aiochan! \"What\", you say, \"isn't aiochan based on asyncio?\" Well, not really, the core algorithms of aiochan (which is based on those from Clojure's core.async) does not use any asyncio constructs: they run entirely synchronously. It is only when you use the use-facing methods such as `get`, `put` and `select` that an asyncio-facade was made to cover the internals."
"Finally, before ending this tutorial, let's reveal a secret: you don't need asyncio to use aiochan! \"Isn't aiochan based on asyncio?\" Well, not really, the core algorithms of aiochan (which is based on those from Clojure's core.async) does not use any asyncio constructs: they run entirely synchronously. It is only when you use the use-facing methods such as `get`, `put` and `select` that an asyncio-facade was made to cover the internals."
]
},
{
@@ -449,7 +449,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"Normally, when you call `ch.put_nowait(v)`, the put will succeed if it is possible to do so immediately (for example, if there is a pending get or buffer can be used), otherwise it will give up. Note that you never `await` on `put_nowait`. However, if you give the argument `immediate_only=True`, then if the operation cannot be completed immediately, it will be queued (but again, the pending queue can overflow). In addition, you can give a callback to the `cb` argument, which will be called when the put finally succeeds, with the same argument as the return value of `await put(v)`. The same is true with `get_nowait(immediate_only=True, cb=cb)`. For `select`, if you give a callback to the `cb` argument, then you should not call `await` on it but instead the callback will be called as `cb(return_value, which_channel)`. Note if you don't expect to use any event loops, when constructing the channel, you should explicitly pass in `loop='no_loop'`."
"Normally, when you call `ch.put_nowait(v)`, the put will succeed if it is possible to do so immediately (for example, if there is a pending get or buffer can be used), otherwise it will give up. Note that you never `await` on `put_nowait`. However, if you give the argument `immediate_only=True`, then if the operation cannot be completed immediately, it will be queued (but again, the pending queue can overflow). In addition, you can give a callback to the `cb` argument, which will be called when the put finally succeeds, with the same argument as the return value of `await put(v)`. The same is true with `get_nowait(immediate_only=True, cb=cb)`. For `select`, if you give a callback to the `cb` argument, then you should not call `await` on it, but instead rely on the callback being called eventually as `cb(return_value, which_channel)`. Note if you don't expect to use any event loops, when constructing channels, you should explicitly pass in `loop='no_loop'`."
]
},
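{
"cell_type": "markdown",
"metadata": {},
"source": [
"A minimal sketch of this callback style (assuming the `immediate_only` and `cb` keywords just described; the exact signatures may differ slightly): queue a get, then complete it with a put, with no event loop anywhere."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import aiochan as ac\n",
"\n",
"c = ac.Chan(loop='no_loop')  # no event loop involved at all\n",
"\n",
"# the get is queued; its callback fires when the put below completes it\n",
"c.get_nowait(immediate_only=False, cb=lambda v: print('got', v))\n",
"c.put_nowait(42, immediate_only=False, cb=lambda ok: print('put ok:', ok))"
]
},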
{
@@ -461,7 +461,7 @@
},
{
"cell_type": "code",
"execution_count": 2,
"execution_count": 10,
"metadata": {},
"outputs": [
{
@@ -530,12 +530,12 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"By the appropriate use of callbacks, we can write:"
"By the appropriate use of callbacks, we can write avoid using `asyncio` completely:"
]
},
{
"cell_type": "code",
"execution_count": 3,
"execution_count": 12,
"metadata": {},
"outputs": [
{
@@ -603,28 +603,28 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"As you can see, the end result is (almost) the same. An example with `select`:"
"The end result is (almost) the same. An example with `select`:"
]
},
{
"cell_type": "code",
"execution_count": 4,
"execution_count": 13,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"select put into Chan<c 140329680924288>, get value 1\n",
"select put into Chan<c 140329680924288>, get value 1\n",
"select put into Chan<c 140329680924288>, get value 1\n",
"select put into Chan<d 140329680923768>, get value 2\n",
"select put into Chan<d 140329680924288>, get value 2\n",
"select put into Chan<c 140329680924288>, get value 1\n",
"select put into Chan<d 140329672134936>, get value 2\n",
"select put into Chan<c 140329680924288>, get value 1\n",
"select put into Chan<c 140329680924288>, get value 1\n",
"select put into Chan<d 140329672135664>, get value 2\n"
"select put into Chan<c 140356982933192>, get value 1\n",
"select put into Chan<c 140356982933192>, get value 1\n",
"select put into Chan<d 140356982931944>, get value 2\n",
"select put into Chan<c 140356982931944>, get value 1\n",
"select put into Chan<c 140356982931944>, get value 1\n",
"select put into Chan<c 140356982931944>, get value 1\n",
"select put into Chan<d 140356982932672>, get value 2\n",
"select put into Chan<c 140356982932672>, get value 1\n",
"select put into Chan<d 140356982931944>, get value 2\n",
"select put into Chan<c 140356982931944>, get value 1\n"
]
}
],
@@ -661,7 +661,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"\"But why\", you ask. Well, obviously writing callbacks is much harder than using asyncio. But who knows? Maybe you are writing some other, higher-level framework that can make use of the semantics of aiochan. The possibilities are endless!"
"\"But why?\" Well, obviously writing callbacks is much harder than using asyncio. But who knows? Maybe you are writing some other, higher-level framework that can make use of the semantics of aiochan. The possibilities are endless! In particular, there are non-asyncio concurrency frameworks in python itself that utilizes the same coroutines, an example being `python-trio`. Since the core of aiochan does not rely on asyncio, porting it to `trio` is trivial."
]
}
],