Commit
Remove serialized HTTP headers from storeClientCopy() (#1335)
Do not send serialized HTTP response header bytes in storeClientCopy() answers. Ignore serialized header size when calling storeClientCopy().

This complex change adjusts the storeClientCopy() API to address several related problems with storeClientCopy() and its callers. The sections below summarize storeClientCopy() changes and then move on to callers.

Squid incorrectly assumed that serialized HTTP response headers are read from disk in a single storeRead() request. In reality, many situations lead to store_client::readBody() receiving partial HTTP headers, resulting in parseCharBuf() failure and a level-0 cache.log message:

    Could not parse headers from on disk object

Inadequate handling of this failure resulted in a variety of problems. Squid now accumulates storeRead() results to parse larger headers and also handles parsing failures better, but we could not just stop there.

With the storeRead() accumulation in place, it is no longer possible to send parsed serialized HTTP headers to storeClientCopy() callers because those callers do not provide enough buffer space to fit larger headers. Increasing caller buffer capacity does not work well because the actual size of the serialized header is unknown in advance and may be quite large. Always allocating large buffers "just in case" is bad for performance. Finally, larger buffers may jeopardize hard-to-find code that uses hard-coded 4KB buffers without using the HTTP_REQBUF_SZ macro.

Fortunately, storeClientCopy() callers either do not care about serialized HTTP response headers or should not care about them! The API forced callers to deal with serialized headers, but callers could (and some did) just use the parsed headers available in the corresponding MemObject.

With this API change, storeClientCopy() callers no longer receive serialized headers and do not need to parse or skip them.
Consequently, callers also do not need to account for response header size when computing offsets for subsequent storeClientCopy() requests. Restricting the storeClientCopy() API to HTTP _body_ bytes removed a lot of problematic caller code. Caller changes are summarized further below.

A similar HTTP response header parsing problem existed in shared memory cache code. That code was actually aware that headers may span multiple cache slices but incorrectly assumed that httpMsgParseStep() accumulates input as needed (to make another parsing "step"). It does not. Large response headers cached in shared memory triggered a level-1 message:

    Corrupted mem-cached headers: e:...

Fixed MemStore code now accumulates serialized HTTP response headers as needed to parse them, sharing high-level parsing code with store_client.

Old clientReplyContext methods worked hard to skip received serialized HTTP headers. The code contained dangerous and often complex/unreadable manipulation of various raw offsets and buffer pointers, aggravated by the perceived need to save/restore those offsets across asynchronous checks (see below). That header skipping code is gone now. Several stale and misleading comments related to Store buffer management were also removed or updated.

We replaced reqofs/reqsize with simpler/safer lastStreamBufferedBytes, while becoming more consistent with that "cached" info invalidation. We still need this info to resume HTTP body processing after asynchronous http_reply_access checks and cache hit validations, but we no longer save/restore this info for hit validation: No need to save/restore information about the buffer that hit validation does not use and must never touch!

The API change also moved from-Store StoreIOBuffer usage closer to StoreIOBuffers manipulated by Client Streams code. Buffers in both categories now contain just the body bytes, and both now treat zero length as EOF only _after_ processing the response headers.
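
The header accumulation described above (for storeRead() results and for shared memory cache slices) can be illustrated with a minimal C++17 sketch. This is not Squid code: HeaderAccumulator and its methods are hypothetical names invented for illustration, and the sketch only looks for the CRLF CRLF terminator instead of fully parsing HTTP headers. It shows the idea that partial headers mean "wait for more input" rather than a parsing failure, and that consumers see only body bytes.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>

// Hypothetical helper (not part of Squid): accumulates input chunks until
// the serialized header block is complete, then exposes body bytes only.
class HeaderAccumulator {
public:
    // Appends one chunk (e.g., one read result or one cache slice).
    // Returns true once the header terminator has been seen.
    bool feed(std::string_view chunk) {
        if (headersDone_) {
            body_.append(chunk); // everything after the headers is body
            return true;
        }
        // Resume the search up to 3 bytes before the old end so that a
        // terminator split across two chunks is still found.
        const std::size_t resumeAt =
            headers_.size() < 3 ? 0 : headers_.size() - 3;
        headers_.append(chunk);
        const auto end = headers_.find("\r\n\r\n", resumeAt);
        if (end == std::string::npos)
            return false; // partial headers: keep accumulating, do not fail
        body_ = headers_.substr(end + 4); // bytes past the terminator
        headers_.resize(end + 4);         // keep only the header block
        headersDone_ = true;
        return true;
    }

    const std::string &headers() const { return headers_; }
    const std::string &body() const { return body_; }

private:
    std::string headers_;     // accumulated serialized headers
    std::string body_;        // body bytes; offsets here start at zero
    bool headersDone_ = false;
};
```

In this sketch, a caller feeds chunks until feed() returns true and then works with body() alone, so body offsets never need to include the serialized header size.
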
These changes improve overall code quality, but this code path and these changes still suffer from utterly unsafe legacy interfaces like StoreIOBuffer and clientStreamNode. We cannot rely on the compiler to check our work. The risk of these changes exposing/causing bugs is high.

asHandleReply() expected WHOIS response body bytes where serialized HTTP headers were! The code also had multiple problems typical for manually written C parsers dealing with raw input buffers. That code is now replaced with Tokenizer-based code.

To skip received HTTP response headers, peerDigestHandleReply() helper functions called headersEnd() on the received buffer. Twice. We have now merged those two parsing helper functions into one (that just checks the already parsed headers). This merger preserved the "304s must come with fetch->pd->cd" logic that was hidden/spread across those two functions.

urnHandleReply() re-parsed received HTTP response headers. We left its HTTP body parsing code unchanged except for polishing NUL-termination.

netdbExchangeHandleReply() re-parsed received HTTP response headers to find where they end (via headersEnd()). We improved handling of corner cases and replaced some "tricky bits" code, reusing the new Store::ParsingBuffer class. The net_db record parsing code is unchanged.

Mgr::StoreToCommWriter::noteStoreCopied() is a very special case. It actually worked OK because, unlike all other storeClientCopy() callers, this code does not get serialized HTTP headers from Store: The code adding bytes to the corresponding StoreEntry does not write serialized HTTP headers at all. StoreToCommWriter is used to deliver kid-specific pieces of an HTTP body of an SMP cache manager response. The HTTP headers of that response are handled elsewhere. We left this code unchanged, but the existence of the special no-headers case does complicate storeClientCopy() API documentation, implementation, and understanding.

Co-authored-by: Eduard Bagdasaryan <eduard.bagdasaryan@measurement-factory.com>
Showing 24 changed files with 1,090 additions and 727 deletions.