Uniform type of data for cpg.response.body #59
Originally reported by: Anonymous
Today, CP2 apps usually return strings. There is the option to return a generator, but only if the generator filter is enabled. However, there is a problem because filters can't know beforehand whether the body contains an iterable (a generator, for example) or a single string. If they try to iterate over the string (which is possible), they will end up iterating over the characters, which is unacceptably slow.
The proposed solution is to make cpg.response.body always contain a uniform type of data. The obvious choice is to wrap any kind of returned data inside an iterable. Single strings will be wrapped inside a list with only one element - ["..."]. This will allow filters, and also the CP2 core functions, to always treat cpg.response.body as an iterable, without the need to test for its type (as in: isinstance(cpg.response.body, GeneratorType)... else: ...). The check will be contained at a single point, when the called object returns the response body.
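A minimal sketch of that single check point, written in present-day Python rather than against the actual CP2 core. The function name `as_body_iterable` is hypothetical and only illustrates the wrapping rule described above:

```python
def as_body_iterable(body):
    """Return `body` in the uniform form: an iterable of string chunks.

    Hypothetical helper, not part of CherryPy. A single string is wrapped
    in a one-element list; anything else (list, generator, ...) is assumed
    to already be an iterable of chunks and is passed through untouched.
    """
    if isinstance(body, (str, bytes)):
        return [body]
    return body


# A plain string becomes a one-element list.
wrapped = as_body_iterable("<html>...</html>")

# A generator is passed through without being consumed.
gen = (chunk for chunk in ["<html>", "...", "</html>"])
same_gen = as_body_iterable(gen)
```

With this in place, a filter can loop over the body without any type test of its own.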
This patch also needs to check the standard filters, to make sure that they always use the iterable. It's not clear whether these changes will have some other side effect on generator handling; it's possible that this implementation will make the generator filter unnecessary, but that's left for the implementation to address and test.
Reported by cribeiro
Original comment by Anonymous:
''[fumanchu: Moved this here from the Wiki]''
= Notes on Ticket #59 =
Ticket #59 was proposed as a way to better integrate generators into the core CP implementation. Before it, the core simply used a string to contain the body of the response. Filters also relied on
Ticket #59 was filed as a proposal to solve these issues, by using a uniform representation for
== Design issues ==
==== Content-length ====
The main advantage of generators for long output strings is negated by the need to calculate the content-length before sending data. The content-length is required by the HTTP/1.0 spec. HTTP/1.1 makes it optional, but recommended. The reasoning is that it allows the client application to distinguish EOFs sent as part of the content itself from the 'real' EOF character. For non-binary (read: text, including HTML) output, its use is not really necessary. And even for binary streams, I'm pretty sure that any modern HTTP client can handle EOFs in the stream ('''but I haven't tested this assertion''').
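To illustrate why the content-length works against generators, here is a small sketch (the helper name `with_content_length` is made up for this note, and it assumes byte-string chunks). The length can only be known after every chunk has been produced, so the whole body ends up buffered in memory:

```python
def with_content_length(body):
    """Return (length, chunks) for an iterable of byte chunks.

    Hypothetical helper: computing the length forces the generator to be
    fully consumed before the first byte can be sent to the client.
    """
    chunks = list(body)                      # the generator is exhausted here
    length = sum(len(chunk) for chunk in chunks)
    return length, chunks


body = (chunk for chunk in [b"<html>", b"hello", b"</html>"])
length, chunks = with_content_length(body)
```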
==== Removed hooks ====
During the coding, it turned out that some hooks were never used by any existing filter; what is worse, they posed a problem because of their position inside the code. The span of code between the
As a result, a decision was made to remove these hooks. Instead, a new hook called
==== Filters returning generators ====
One side effect of the internal use of generators is that filters that process the body should also return a generator themselves. The modifications are trivial in most cases:
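As an illustration only (the filter below is hypothetical and does not use the real CP2 filter interface), a body-processing step that used to transform one string can be rewritten as a generator that transforms each chunk as it passes through:

```python
def upper_filter(body):
    """Hypothetical body filter: upper-case every chunk lazily.

    `body` is assumed to be an iterable of string chunks; the result is
    itself a generator, so no chunk is processed until it is requested.
    """
    for chunk in body:
        yield chunk.upper()


filtered = upper_filter(["<p>", "hello", "</p>"])
result = list(filtered)
```

Because the filter yields chunk by chunk, several such filters can be chained without any of them holding the full body.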
==== Caveat: a string is an iterable ====
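There's a potential for hidden bugs in the fact that a string is also an iterable: a body that is accidentally left as a bare string will still be iterated without error, only one character at a time, which is slow. Since nothing fails outright, this bug may be difficult to catch. One possibility is to add a warning if this case ever happens.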
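One way such a warning could look, as a sketch under the assumption that the check runs wherever the body is handed to the filters. The name `check_body` and the choice of `RuntimeWarning` are illustrative, not taken from the CherryPy code:

```python
import warnings


def check_body(body):
    """Warn if `body` is a bare string, then wrap it in a list.

    Hypothetical helper: a bare string would otherwise be iterated
    character by character without raising any error.
    """
    if isinstance(body, (str, bytes)):
        warnings.warn(
            "response body is a bare string; wrapping it in a list",
            RuntimeWarning,
            stacklevel=2,
        )
        return [body]
    return body
```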
==== Gzip ====
The original GzipFilter used the
As none of the flags are supposed to be set, some optional members that could potentially follow the header are omitted from this presentation. The necessary fields are:
In the trailer part of the stream, the following data should be provided:
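The following sketch shows how a gzip stream with a hand-written header and trailer can be produced chunk by chunk. It is not the GzipFilter code. The field layout follows the gzip file format specification (RFC 1952): a 10-byte header (ID1, ID2, CM, FLG, MTIME, XFL, OS) with no flags set, then the raw deflate data, then a trailer holding the CRC-32 and the size of the uncompressed data. The function name `gzip_stream` and the default compression level are assumptions made for the example:

```python
import struct
import time
import zlib


def gzip_stream(body, level=9):
    """Yield a gzip-compressed stream for an iterable of byte chunks.

    Hypothetical sketch, assuming `body` yields bytes objects.
    """
    # Header: ID1=0x1f, ID2=0x8b, CM=8 (deflate), FLG=0 (no optional
    # members), MTIME (4 bytes), XFL=0, OS=255 (unknown).
    mtime = int(time.time()) & 0xFFFFFFFF
    yield struct.pack("<BBBBLBB", 0x1F, 0x8B, 8, 0, mtime, 0, 255)

    crc = zlib.crc32(b"")
    size = 0
    # A negative wbits value asks zlib for raw deflate data, without the
    # zlib header and checksum, since the gzip framing is written by hand.
    compressor = zlib.compressobj(level, zlib.DEFLATED, -zlib.MAX_WBITS)
    for chunk in body:
        crc = zlib.crc32(chunk, crc)
        size += len(chunk)
        data = compressor.compress(chunk)
        if data:
            yield data
    yield compressor.flush()

    # Trailer: CRC-32 of the uncompressed data, then its size modulo 2**32,
    # both little-endian.
    yield struct.pack("<LL", crc & 0xFFFFFFFF, size & 0xFFFFFFFF)


compressed = b"".join(gzip_stream([b"hello ", b"world"]))
```

The CRC and size are accumulated while the chunks flow through, which is why they go in the trailer rather than the header and why the body never needs to be held in memory as a whole.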
===== Assorted remarks =====
The changes in the gzip filter required some research. While reading, some notes were collected that may prove of interest for CherryPy development:
"Both Internet Explorer 5.5 and Internet Explorer 6.0 have a bug with decompression that affects some users. This bug is documented in: the Microsoft knowledge Base articles, Q312496 is for IE 6.0 … , the Q313712 is for IE 5.5. Basically Internet Explorer doesn't decompress the response before it sends it to plug-ins like Adobe Photoshop."