forked from gojek/wrest
Merge branch 'caching' of github.com:kaiwren/wrest
Showing 24 changed files with 1,494 additions and 108 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -21,3 +21,5 @@ TAGS | |
.bundle | ||
.redcar | ||
.rvmrc | ||
.idea | ||
spec_all_rubies.sh |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,154 @@ | ||
# Caching in Wrest # | ||
|
||
[RFC 2616's Caching section](http://www.w3.org/Protocols/rfc2616/rfc2616-sec13.html) describes in detail how caching is to be implemented by clients. | ||
|
||
A response should obey the following conditions to be considered cacheable by Wrest: | ||
|
||
* Only responses to GET requests are cached. | ||
* The response code must be 200, 203, 300, 301, 302, 304 or 307. | ||
* The Cache-Control header must contain neither the no-cache nor the no-store directive. | ||
* There must be no Pragma: no-cache header. (This header is only used by HTTP 1.0 servers.) | ||
* Either Cache-Control: max-age or the Expires header (or both) must be set. (Cache-Control: max-age always takes priority over the Expires header.) | ||
* If only the Expires header is set, it must not be earlier than the response's Date header. It must also be later than the time at which the client received the response. | ||
* The date headers (Date, Expires) should be in [RFC 1123 format](http://www.ietf.org/rfc/rfc1123.txt). | ||
* The Vary header must not be present at all. (The Vary mechanism is used to conditionally control caching, which Wrest does not currently implement. Section 14.44 of RFC 2616 describes the Vary header in detail.) | ||
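The conditions above can be sketched as a single predicate. This is a minimal illustration, not Wrest's actual implementation: `cacheable?`, `parse_http_date` and `CACHEABLE_CODES` are hypothetical names, `headers` is assumed to be a Hash with lower-case header names, and a missing Date header is assumed to make an Expires-only response non-cacheable.

```ruby
require 'time'

CACHEABLE_CODES = %w[200 203 300 301 302 304 307].freeze

# Parses an RFC 1123 date; returns nil when the value is absent or malformed.
def parse_http_date(value)
  value && Time.httpdate(value)
rescue ArgumentError
  nil
end

# Hypothetical predicate mirroring the list of conditions above.
def cacheable?(http_method, code, headers, received_at = Time.now)
  return false unless http_method == :get
  return false unless CACHEABLE_CODES.include?(code.to_s)
  return false if headers.key?('vary')
  return false if headers['pragma'].to_s.downcase.include?('no-cache')

  cache_control = headers['cache-control'].to_s.downcase
  return false if cache_control =~ /no-cache|no-store/
  # max-age takes priority over Expires, so its presence is sufficient.
  return true if cache_control =~ /max-age=\d+/

  expires = parse_http_date(headers['expires'])
  date = parse_http_date(headers['date'])
  # Assumption: without a parseable Date we cannot check Expires against it.
  return false if expires.nil? || date.nil?

  expires >= date && expires > received_at
end
```

Passing `received_at` explicitly keeps the check deterministic, which is convenient in specs.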
|
||
Whenever a GET request is made through Wrest, it consults the Cache Store for a matching entry. If an entry is found and has not expired, it is returned as the response without making a request to the server. | ||
|
||
A cache entry is considered to be fresh (not expired) if: | ||
|
||
* Its freshness lifetime is greater than zero. | ||
* The freshness lifetime of a cache entry is its Cache-Control: max-age, if max-age is defined. If max-age is not defined, it is the cache entry's Expires header minus the current time. | ||
(Note: either max-age or the Expires header is guaranteed to be present for a cache entry, since only such responses are cached at all.) | ||
|
||
**AND** | ||
|
||
* Its freshness lifetime is greater than the cache entry's age. | ||
* Age of a cache entry is: Current Date & Time - the cached response's Date header, or the value of the Age header in the cached response, whichever is greater. | ||
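The two freshness conditions can be sketched as follows. This is an illustration under assumptions, not Wrest's code: `CacheEntry` and its fields are hypothetical, `max_age` and `age_header` are in seconds, and `expires` and `date` are `Time` objects.

```ruby
# Hypothetical value object; the field names are illustrative only.
CacheEntry = Struct.new(:max_age, :expires, :date, :age_header)

# max-age wins when present; otherwise Expires minus the current time.
def freshness_lifetime(entry, now)
  return entry.max_age if entry.max_age

  (entry.expires - now).to_i
end

# The greater of (current time - Date header) and the Age header.
def age(entry, now)
  [(now - entry.date).to_i, entry.age_header.to_i].max
end

# Fresh only if the lifetime is positive AND exceeds the entry's age.
def fresh?(entry, now = Time.now)
  lifetime = freshness_lifetime(entry, now)
  lifetime > 0 && lifetime > age(entry, now)
end
```

For example, an entry with `max_age` of 300 whose Date header is 100 seconds old is fresh, while the same entry carrying an Age header of 400 is not.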
|
||
If a cache entry is available, but expired, Wrest sees if the entry can be validated. A cache entry can be validated if: | ||
|
||
* It has a Last-Modified header, or an ETag header, or both. | ||
|
||
If a cache entry can be validated, Wrest sends the actual GET request to the server, along with: | ||
|
||
* If-Modified-Since : <Last-Modified value of the cache entry> (if the header Last-Modified was present in the cache entry), and/or | ||
* If-None-Match: <ETag of the cache entry> (if ETag was present in the cache entry) | ||
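Building the conditional-request headers listed above might look like the sketch below. The helper names are hypothetical, and `cached_headers` is assumed to be a Hash with lower-case header names.

```ruby
# Hypothetical helper: derives validation headers from a cached response's headers.
def validation_headers(cached_headers)
  headers = {}
  last_modified = cached_headers['last-modified']
  etag = cached_headers['etag']

  headers['if-modified-since'] = last_modified if last_modified
  headers['if-none-match'] = etag if etag
  headers
end

# An entry can be validated when at least one validator is available.
def can_be_validated?(cached_headers)
  !validation_headers(cached_headers).empty?
end
```

The resulting Hash would be merged into the headers of the original GET request before it is re-sent.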
|
||
The server determines whether the response cached at the client is still valid by looking at the values of the If-Modified-Since/If-None-Match headers. If the response held by the client is still valid, the server sends a 304 (Not Modified) response without a body. | ||
|
||
Upon receiving the 304, Wrest updates the existing cache entry with the headers provided in the 304 (RFC 2616 13.5.3 Combining Headers) and returns the cached response to the client. | ||
|
||
If the server determines that the cached entry at the client side is invalid, it sends a full response (usually 200 OK), which Wrest passes to the client after updating the existing cache entry with the new response. | ||
|
||
If the cache entry has expired but cannot be validated, Wrest sends a full GET request to the server. The response is passed to the client after updating the existing cache entry with the new response. | ||
|
||
#### Edge Case for HTML documents #### | ||
|
||
<META HTTP-EQUIV="Pragma" CONTENT="no-cache"> | ||
|
||
Firefox respects the Pragma header in the HTML document (nsHttpResponseHead.h:NoCache). Wrest cannot since it does not parse the response body. | ||
|
||
|
||
## A Rough note on how the browsers (Firefox and Chrome) implement caching ## | ||
|
||
Browsers usually cache all responses including non-cacheable ones. These are for use in the browser History (Forward, Back buttons). [ [RFC 2616](http://www.ietf.org/rfc/rfc2616.txt) 13.13 History Lists] | ||
The non-cacheability restriction is usually observed after fetching a cache entry: if the stored response was not cacheable, it is not used. | ||
|
||
A large chunk of the caching logic for Firefox 3 is in the file netwerk/protocol/http/nsHttpChannel.cpp inside its source tree. | ||
|
||
Browsers are optimistic with respect to caching: if a response does not explicitly specify an expiration mechanism, they use their own heuristics to calculate an expiry time. Wrest, however, is pessimistic: if a response does not specify an explicit cache expiration mechanism, it is not cached at all. | ||
|
||
The following is a rough outline that I'd written to understand how the browsers implement caching. However, it does not necessarily reflect the browsers' behaviour accurately and has been heavily adapted to suit Wrest. | ||
|
||
## Firefox: nsHttpChannel::CheckCache() ## | ||
|
||
do_fetch if method.head != cache.head | ||
do_fetch if not (method.head == 'GET' || method.head == 'HEAD') | ||
|
||
use_cache if Cache-Control: max-age validates. See cache_expired? | ||
|
||
re_validate if: | ||
|
||
* Expires: header is a past date OR cache_expired? | ||
* the cache entry has 'must-revalidate' header. [RFC 2616 14.9.4](http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.9.4) | ||
|
||
## doValidation ## | ||
|
||
Add an If-Modified-Since to the request if the cache has a Last-Modified value. | ||
Add an If-None-Match to the request if the cache had an ETag | ||
|
||
Send Request. | ||
|
||
If a full response is received, update cache and return the result. | ||
If a 304 (Not Modified) is received, return the cached response itself. | ||
|
||
## Do Not Store in Cache If ## | ||
|
||
* Original request was not (GET or HEAD) | ||
|
||
* Any response with a code other than those listed here MUST NOT be cached. | ||
(success codes) 200, 203 (cacheable redirects) 300, 301, 302, 304, 307. | ||
[from Mozilla: nsHttpResponseHead.cpp::MustValidate(); also, we cannot support 206 (Partial Content)] | ||
|
||
* this is a response to a cache validation request: ie: the original request contained | ||
an 'if-modified-since' or 'if-match' (http://codesearch.google.com/codesearch/p#OAMlx_jo-ck/src/net/http/http_cache_transaction.cc&l=45) | ||
|
||
* has the directives 'cache-control: no-cache or no-store', or 'pragma: no-cache' [HTTP 1.0] | ||
|
||
* does not provide any explicit expiration time. To maintain maximum semantic transparency, we only cache those responses that explicitly permit caching. [RFC 2616 13.2.2](http://www.w3.org/Protocols/rfc2616/rfc2616-sec13.html#sec13.2.2) | ||
|
||
* if no max-age is defined AND the response had already expired when it was generated: cache.expires < cache.date | ||
|
||
* the response has a Vary header at all | ||
[TODO: implement fully. | ||
(http://www.subbu.org/blog/2007/12/vary-header-for-restful-applications) | ||
(http://devel.squid-cache.org/vary/vary-header.html) ] | ||
|
||
|
||
## cache_expired? ## | ||
|
||
Firefox: nsHttpResponseHead.cpp: ComputeCurrentAge | ||
[Chrome: RequiresValidation in http_response_headers.cc](http://codesearch.google.com/codesearch/p?hl=en#OAMlx_jo-ck/src/net/http/http_response_headers.cc&q=RequiresValidation&exact_package=chromium&sa=N&cd=2&ct=rc) | ||
|
||
freshness_time = freshness_lifetime | ||
if freshness_time <= 0 | ||
return true | ||
end | ||
|
||
return freshness_time <= current_age | ||
|
||
|
||
## current_age ## | ||
|
||
Verbatim from [Chrome's http_response_headers.cc](http://codesearch.google.com/codesearch/p?hl=en#OAMlx_jo-ck/src/net/http/http_response_headers.cc&q=RequiresValidation&exact_package=chromium&l=817) | ||
|
||
date_value = headers['Date'] || response_time; | ||
age_value=headers['Age'] || 0 | ||
|
||
apparent_age = response_time - date_value | ||
corrected_received_age = max(apparent_age, age_value); | ||
response_delay = response_time - request_time; | ||
corrected_initial_age = corrected_received_age + response_delay; | ||
resident_time = Time.now - response_time; | ||
|
||
corrected_initial_age + resident_time; | ||
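The outline above translates to Ruby roughly as follows. This is a sketch rather than Wrest's implementation: the method signature is hypothetical, the time arguments are assumed to be `Time` objects and `age_header` a number of seconds (or nil). The clamp of `apparent_age` at zero follows RFC 2616 13.2.3 and is not shown in the outline.

```ruby
# Computes the current age of a cached response, in seconds.
def current_age(date_header, age_header, request_time, response_time, now = Time.now)
  date_value = date_header || response_time
  age_value = age_header.to_i # nil becomes 0

  # RFC 2616 13.2.3 clamps the apparent age at zero to tolerate clock skew.
  apparent_age = [response_time - date_value, 0].max
  corrected_received_age = [apparent_age, age_value].max
  response_delay = response_time - request_time
  corrected_initial_age = corrected_received_age + response_delay
  resident_time = now - response_time

  corrected_initial_age + resident_time
end
```

As a worked example: with a Date header at time t, the request sent at t+1, the response received at t+3 and the check made at t+13, the apparent age is 3, the response delay is 2 and the resident time is 10, giving a current age of 15 seconds.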
|
||
|
||
## freshness_lifetime ## | ||
|
||
This is a [link to Chrome source code](http://codesearch.google.com/codesearch/p?hl=en#OAMlx_jo-ck/src/net/http/http_response_headers.cc&q=GetFreshnessLifetime&exact_package=chromium&l=848) where freshness_lifetime is defined. | ||
|
||
# References # | ||
|
||
* [RFC 2616 Section 13 : HTTP Caching protocol](http://www.w3.org/Protocols/rfc2616/rfc2616-sec13.html) | ||
* [Mozilla HTTP Caching FAQ](http://www.mozilla.org/projects/netlib/http/http-caching-faq.html) | ||
* [Mark Nottingham's Caching Tutorial](http://www.mnot.net/cache_docs/) | ||
* [Redbot for analyzing HTTP headers](http://redbot.org) | ||
|
||
|
||
### Alternate Cache Implementations ### | ||
|
||
[Resourceful - Ruby HTTP client that does caching](https://github.com/pezra/resourceful/blob/master/lib/resourceful/response.rb#L25) | ||
|
||
[Python Httplib2 library](http://code.google.com/p/httplib2/source/browse/python3/httplib2/__init__.py?r=c86239ee0b6271309be2374f0ebfffd4455b7fb7#237) |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,105 @@ | ||
module Wrest | ||
|
||
class CacheProxy | ||
class << self | ||
def new(get, cache_store) | ||
if cache_store | ||
DefaultCacheProxy.new(get, cache_store) | ||
else | ||
NullCacheProxy.new(get) | ||
end | ||
end | ||
end | ||
|
||
class NullCacheProxy | ||
def initialize(get) | ||
@get = get | ||
end | ||
def get | ||
@get.invoke_without_cache_check | ||
end | ||
end | ||
|
||
class DefaultCacheProxy | ||
HOP_BY_HOP_HEADERS = ["connection", | ||
"keep-alive", | ||
"proxy-authenticate", | ||
"proxy-authorization", | ||
"te", | ||
"trailers", | ||
"transfer-encoding", | ||
"upgrade"] | ||
|
||
def initialize(get, cache_store) | ||
@get = get | ||
@cache_store = cache_store | ||
end | ||
|
||
def get | ||
cached_response = @cache_store[@get.hash] | ||
return get_fresh_response if cached_response.nil? | ||
|
||
if cached_response.expired? | ||
if cached_response.can_be_validated? | ||
get_validated_response_for(cached_response) | ||
else | ||
get_fresh_response | ||
end | ||
else | ||
cached_response | ||
end | ||
end | ||
|
||
def update_cache_headers_for(cached_response, new_response) | ||
# RFC 2616 13.5.3 (Combining Headers) | ||
# Hash#reject returns a Hash on both Ruby 1.8 and 1.9, unlike Hash#select. | ||
cached_response.headers.merge!(new_response.headers.reject { |key, _value| HOP_BY_HOP_HEADERS.include?(key.downcase) }) | ||
end | ||
|
||
def cache(response) | ||
@cache_store[@get.hash] = response.clone if response && response.cacheable? | ||
end | ||
|
||
#:nodoc: | ||
def get_fresh_response | ||
@cache_store.delete @get.hash | ||
|
||
response = @get.invoke_without_cache_check | ||
|
||
cache(response) | ||
|
||
response | ||
end | ||
|
||
#:nodoc: | ||
def get_validated_response_for(cached_response) | ||
new_response = send_validation_request_for(cached_response) | ||
if new_response.code == "304" | ||
update_cache_headers_for(cached_response, new_response) | ||
cached_response | ||
else | ||
cache(new_response) | ||
new_response | ||
end | ||
end | ||
|
||
#:nodoc: | ||
# Send a cache-validation request to the server. This would be the actual Get request with extra cache-validation headers. | ||
# If a 304 (Not Modified) is received, Wrest would use the cached_response itself. Otherwise the new response is cached and used. | ||
def send_validation_request_for(cached_response) | ||
last_modified = cached_response.last_modified | ||
etag = cached_response.headers["etag"] | ||
|
||
cache_validation_headers = {} | ||
cache_validation_headers["if-modified-since"] = last_modified unless last_modified.nil? | ||
cache_validation_headers["if-none-match"] = etag unless etag.nil? | ||
|
||
new_headers = @get.headers.clone.merge cache_validation_headers | ||
new_options = @get.options.clone.tap { |opts| opts.delete :cache_store } # do not run this through the caching mechanism. | ||
|
||
new_request = Wrest::Native::Get.new(@get.uri, @get.parameters, new_headers, new_options) | ||
|
||
new_request.invoke | ||
end | ||
end | ||
end | ||
end |
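The header-combining step in `update_cache_headers_for` can be exercised in isolation with plain hashes. The sketch below uses hypothetical names (`HOP_BY_HOP`, `combine_headers`) and returns a new Hash instead of mutating the cached one; it only illustrates the filtering rule, not Wrest's response objects.

```ruby
HOP_BY_HOP = %w[connection keep-alive proxy-authenticate proxy-authorization
                te trailers transfer-encoding upgrade].freeze

# Merges the end-to-end headers of a 304 response over the cached headers.
# Hop-by-hop headers describe a single connection and are never stored.
def combine_headers(cached, fresh)
  end_to_end = fresh.reject { |name, _value| HOP_BY_HOP.include?(name.downcase) }
  cached.merge(end_to_end)
end
```

A 304 carrying a new `date` and a `Connection: close` header would therefore refresh `date` in the cache entry and leave no `connection` key behind.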
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,27 @@ | ||
require 'dalli' | ||
|
||
module Wrest::Components::CacheStore | ||
class Memcached | ||
|
||
def initialize(server_urls=nil, options={}) | ||
@memcached = Dalli::Client.new(server_urls, options) | ||
end | ||
|
||
def [](key) | ||
@memcached.get(key) | ||
end | ||
|
||
def []=(key, value) | ||
@memcached.set(key, value) | ||
end | ||
|
||
# should be compatible with Hash - return value of the deleted element. | ||
def delete(key) | ||
value = self[key] | ||
|
||
@memcached.delete key | ||
|
||
return value | ||
end | ||
end | ||
end |
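The store above exposes three Hash-like methods: `[]`, `[]=` and `delete`. Assuming those are all the caching code needs, an in-memory stand-in with the same interface could look like the sketch below, which may be handy in specs that should not depend on a running memcached server. `InMemoryStore` is a hypothetical name and is not part of this commit.

```ruby
# Hypothetical in-memory cache store with the same three-method interface.
class InMemoryStore
  def initialize
    @entries = {}
  end

  def [](key)
    @entries[key]
  end

  def []=(key, value)
    @entries[key] = value
  end

  # Like Hash#delete, returns the value of the deleted element (or nil).
  def delete(key)
    @entries.delete(key)
  end
end
```

Since a plain Hash already answers these three methods, a class like this mainly documents the expected contract in one place.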
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
require "#{Wrest::Root}/wrest/components/cache_store/memcached" |