Added some FAQs

django · Sep 9, 2015 · 041ea3f
1 parent 39bead9
commit 041ea3f

Showing 2 changed files with 128 additions and 1 deletion.
diff --git a/docs/faqs.rst b/docs/faqs.rst
@@ -0,0 +1,126 @@
+Frequently Asked Questions
+==========================
+
+Why are you doing this rather than just using Tornado/gevent/asyncio/etc.?
+--------------------------------------------------------------------------
+
+They're kind of solving different problems. Tornado, gevent and other
+in-process async solutions are a way of making a single Python process act
+asynchronously - doing other things while an HTTP request is going on, or
+juggling hundreds of incoming connections without blocking on a single one.
+
+Channels is different - all the code you write for consumers runs synchronously.
+You can do all the blocking filesystem calls and CPU-bound tasks you like
+and all you'll do is block the one worker you're running on; the other
+worker processes will just keep on going and handling other messages.
+
+This is partially because Django is all written in a synchronous manner, and
+rewriting it to all be asynchronous would be a near-impossible task, but also
+because we believe that normal developers should not have to write
+asynchronous-friendly code. It's really easy to shoot yourself in the foot:
+run a tight loop without yielding in the middle, or access a file that happens
+to be on a slow NFS share, and you've just blocked the entire process.
+
+Channels still uses asynchronous code, but it confines it to the interface
+layer - the processes that serve HTTP, WebSocket and other requests. These do
+indeed use asynchronous frameworks (currently, asyncio and Twisted) to manage
+all the concurrent connections, but they're also fixed pieces of code;
+as an end developer, you'll likely never have to touch them.
+
+All of your work can be done with standard Python libraries and patterns, and
+the only thing you need to look out for is worker contention - if you flood your
+workers with infinite loops, of course they'll all stop working, but that's
+better than a single thread of execution stopping the entire site.
+
+
+Why aren't you using node/go/etc. to proxy to Django?
+-----------------------------------------------------
+
+There are a couple of solutions where you can use a more "async-friendly"
+language (or Python framework) to bridge things like WebSockets to Django -
+terminate them in (say) a Node process, and then bridge them to Django using
+either a reverse proxy model, or Redis signalling, or some other mechanism.
+
+The thing is, Channels actually makes it easier to do this if you wish. The
+key part of Channels is introducing a standardised way to run event-triggered
+pieces of code, and a standardised way to route messages via named channels
+that hits the right balance between flexibility and simplicity.
+
+While our interface servers are written in Python, there's nothing stopping
+you from writing an interface server in another language, providing it follows
+the same serialisation standards for HTTP/WebSocket/etc. messages. In fact,
+we may ship an alternative server implementation ourselves at some point.
+
+
+Why isn't there guaranteed delivery/a retry mechanism?
+------------------------------------------------------
+
+Channels' design is such that anything is allowed to fail - a consumer can
+error and not send replies, the channel layer can restart and drop a few messages,
+a dogpile can happen and a few incoming clients get rejected.
+
+This is because designing a system that was fully guaranteed, end-to-end, would
+result in something with incredibly low throughput, and almost no problem needs
+that level of guarantee. If you want some level of guarantee, you can build on
+top of what Channels provides and add it in (for example, use a database to
+mark things that need to be cleaned up, and resend messages if they haven't
+been cleaned up after a while, or make idempotent consumers and over-send
+messages rather than under-send).
+
+That said, it's good practice to design a system on the presumption that any
+part of it can fail, and to design for detection and recovery of that state,
+rather than hanging your entire livelihood on a system working perfectly as
+designed. Channels
+takes this idea and uses it to provide a high-throughput solution that is
+mostly reliable, rather than a low-throughput one that is *nearly* completely
+reliable.
+
+
+Can I run HTTP requests/service calls/etc. in parallel from Django without blocking?
+------------------------------------------------------------------------------------
+
+Not directly - Channels only allows a consumer function to listen to channels
+at the start, which is what kicks it off; you can't send tasks off on channels
+to other consumers and then *wait on the result*. You can send them off and keep
+going, but you cannot ever block waiting on a channel in a consumer, as otherwise
+you'd hit deadlocks, livelocks, and similar issues.
+
+This is partially a design feature - this falls into the class of "difficult
+async concepts that it's easy to shoot yourself in the foot with" - but also
+to keep the underlying channels implementation simple. By not allowing this sort
+of blocking, we can have specifications for channel layers that allow horizontal
+scaling and sharding.
+
+What you can do is:
+
+* Dispatch a whole load of tasks to run later in the background and then finish
+  your current task - for example, dispatching an avatar thumbnailing task in
+  the avatar upload view, then returning a "we got it!" HTTP response.
+
+* Pass details along to the other task about how to continue, in particular
+  a channel name linked to another consumer that will finish the job, or
+  IDs or other details of the data (remember, message contents are just a dict
+  you can put stuff into). For example, you might have a generic image fetching
+  task for a variety of models that should fetch an image, store it, and pass
+  the resultant ID and the ID of the object you're attaching it to onto a different
+  channel depending on the model - you'd pass the next channel name and the
+  ID of the target object in the message, and then the consumer could send
+  a new message onto that channel name when it's done.
+
+* Have interface servers that perform requests or slow tasks (remember, interface
+  servers are the specialist code which *is* written to be highly asynchronous)
+  and then send their results onto a channel when finished. Again, you can't wait
+  around inside a consumer and block on the results, but you can provide another
+  consumer on a new channel that will do the second half.
+
+
+How do I associate data with incoming connections?
+--------------------------------------------------
+
+Channels provides full integration with Django's session and auth system for its
+WebSockets support, as well as per-websocket sessions for persisting data, so
+you can easily persist data on a per-connection or per-user basis.
+
+You can also provide your own solution if you wish, keyed off of ``message.reply_channel``,
+which is the unique channel representing the connection, but remember that
+whatever you store it in must be **network-transparent** - storing things in a
+global variable won't work outside of development.
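
The "provide your own solution" option in the last FAQ above could look roughly like the sketch below. It is not part of Channels: `ConnectionStore` and `InMemoryBackend` are hypothetical names, and the only thing assumed about Channels is that each connection has a unique reply channel whose name can be used as a key. The in-memory backend is only there so the sketch runs on its own; it has exactly the single-process limitation the FAQ warns about, and a deployment would need a shared backend (a cache server or database, for example) exposing the same three methods.

```python
import json


class InMemoryBackend:
    """Stand-in for a shared store. Only valid inside a single process."""

    def __init__(self):
        self._data = {}

    def get(self, key):
        return self._data.get(key)

    def set(self, key, value):
        self._data[key] = value

    def delete(self, key):
        self._data.pop(key, None)


class ConnectionStore:
    """Hypothetical per-connection storage keyed by reply channel name."""

    def __init__(self, backend, prefix="conn:"):
        self.backend = backend
        self.prefix = prefix

    def _key(self, reply_channel_name):
        return self.prefix + reply_channel_name

    def save(self, reply_channel_name, data):
        # Serialising to JSON rejects values (open sockets, arbitrary
        # objects) that could not be handed to another worker process.
        self.backend.set(self._key(reply_channel_name), json.dumps(data))

    def load(self, reply_channel_name):
        raw = self.backend.get(self._key(reply_channel_name))
        return json.loads(raw) if raw is not None else {}

    def forget(self, reply_channel_name):
        # Call this when the connection closes so stale entries don't pile up.
        self.backend.delete(self._key(reply_channel_name))
```

Forcing everything through `json.dumps` is a deliberate choice here: anything that fails to serialise would also fail to reach a worker on another machine, so the error surfaces in development rather than in production.
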
diff --git a/docs/index.rst b/docs/index.rst
@@ -21,7 +21,7 @@ Contents:
 
 .. toctree::
    :maxdepth: 2
-   
+
    concepts
    installation
    getting-started
@@ -30,3 +30,4 @@ Contents:
    message-standards
    scaling
    backends
+   faqs
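
The second option in the new "run in parallel" FAQ (pass along the name of the channel that should finish the job) can be sketched as follows. This is a self-contained simulation, not Channels code: the `queue`, `send`, `routing` and `run_worker` names, the channel names, and the message keys are all made up for illustration, and a plain in-process deque stands in for a real channel layer.

```python
from collections import deque

queue = deque()   # stand-in for a channel layer: (channel_name, message) pairs
routing = {}      # channel name -> consumer function
avatars = {}      # stand-in for a database table: target ID -> image ID


def send(channel_name, message):
    """Queue a message and return immediately; nothing waits for a reply."""
    queue.append((channel_name, message))


def fetch_image(message):
    # Generic first half: pretend to fetch and store the image.
    image_id = "img-for-" + message["url"]
    # Hand the result to whichever channel the sender named, instead of
    # returning it to a caller that is blocked waiting.
    send(message["next_channel"], {
        "image_id": image_id,
        "target_id": message["target_id"],
    })


def attach_avatar(message):
    # Model-specific second half: record the stored image on the target.
    avatars[message["target_id"]] = message["image_id"]


routing["image.fetch"] = fetch_image
routing["user.avatar_fetched"] = attach_avatar


def run_worker():
    """Process queued messages one at a time, as a single worker would."""
    while queue:
        channel_name, message = queue.popleft()
        routing[channel_name](message)
```

A caller would do something like `send("image.fetch", {"url": ..., "target_id": 7, "next_channel": "user.avatar_fetched"})` and then carry on. No consumer ever blocks on another, which is the property the FAQ says keeps channel layers simple to scale and shard.
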