[close #59] Use relative path for cache keys
This fix is targeted at 3.x. I haven't tested 4.x for the presence of this problem, but I would be surprised if it didn't exist there as well.

Right now the cache uses absolute paths to generate cache keys, which doesn't work if you end up running `rake assets:precompile` in different directories: say you've moved your project, or you're running on a different server with a different absolute path. Or, say, on every build on Heroku ever. This was reported in #59.

This change introduces an UnloadedAsset class that is responsible for, among other things, generating consistent cache keys.

Previously, when running https://gist.github.com/schneems/85f592ba2773761dfcf3, the output showed the second build taking over 38 seconds. With this patch applied it takes around 3.4 seconds.

DON'T MERGE THIS IN YET: it needs a cleanup and some better docs, because geez, 90% of the implicit, existing, undocumented behavior here is kinda crazy. Not all tests are passing yet either.
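
For illustration, here's a rough sketch of why absolute paths break cache reuse. The key format mirrors the `file_digest` key in `base.rb`, but the paths and mtime below are made up:

```ruby
# Illustrative only: same logical file, two different checkouts of the app.
mtime = 1439078400

# Before: the absolute path is baked into the key, so a laptop checkout and a
# Heroku build directory never agree on a key and the cache is never reused.
old_key_laptop = "file_digest:/Users/dev/myapp/app/assets/javascripts/application.js:#{mtime}"
old_key_heroku = "file_digest:/app/app/assets/javascripts/application.js:#{mtime}"
old_key_laptop == old_key_heroku # => false

# After: the key is built from the path relative to the app root, so both
# environments produce the same key and `rake assets:precompile` can reuse it.
new_key = "file_digest:app/assets/javascripts/application.js:#{mtime}"
```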
schneems committed Aug 9, 2015
1 parent 543a5a2 commit fb2824f
Showing 3 changed files with 139 additions and 41 deletions.
10 changes: 9 additions & 1 deletion lib/sprockets/base.rb
@@ -38,6 +38,14 @@ def cached
end
alias_method :index, :cached

def clean_path_for_cache(path)
if relative_path = PathUtils.split_subpath(root, path)
relative_path
else
path
end
end

# Internal: Compute digest for path.
#
# path - String filename or directory path.
@@ -50,7 +58,7 @@ def file_digest(path)
# negligently reset thus appearing as if the file hasn't changed on
# disk. Also, the mtime is only read to the nearest second. It's
# also possible the file was updated more than once in a given second.
cache.fetch("file_digest:#{path}:#{stat.mtime.to_i}") do
cache.fetch("file_digest:#{clean_path_for_cache(path)}:#{stat.mtime.to_i}") do
self.stat_digest(path, stat)
end
end
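
A quick sketch of what the new `clean_path_for_cache` helper relies on, assuming `PathUtils.split_subpath` behaves as it does elsewhere in Sprockets (it returns the portion of the path below the root, or nil when the path is outside it). The root and file paths here are made up:

```ruby
require 'sprockets/path_utils'

root = "/full/path"

# A file under the application root becomes a root-relative key component:
Sprockets::PathUtils.split_subpath(root, "/full/path/app/assets/javascripts/application.js")
# => "app/assets/javascripts/application.js"

# A file outside the root yields nil, so clean_path_for_cache falls back to
# the absolute path unchanged:
Sprockets::PathUtils.split_subpath(root, "/elsewhere/application.js")
# => nil
```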
7 changes: 0 additions & 7 deletions lib/sprockets/dependencies.rb
@@ -51,13 +51,6 @@ def add_dependency(uri)
end
alias_method :depend_on, :add_dependency

# Internal: Resolve set of dependency URIs.
#
# Returns Array of resolved Objects.
def resolve_dependencies(uris)
uris.map { |uri| resolve_dependency(uri) }
end

# Internal: Resolve dependency URIs.
#
# Returns resolved Object.
163 changes: 130 additions & 33 deletions lib/sprockets/loader.rb
@@ -5,70 +5,138 @@
require 'sprockets/file_reader'
require 'sprockets/mime'
require 'sprockets/path_utils'
require 'sprockets/path_digest_utils'
require 'sprockets/processing'
require 'sprockets/processor_utils'
require 'sprockets/resolve'
require 'sprockets/transformers'
require 'sprockets/uri_utils'

module Sprockets
# Used to parse and store the URI to an unloaded asset
# Generates keys used to store and retrieve items from cache
class UnloadedAsset

# The uri is a complete URI to a file, including the scheme,
# such as: "file:///full/path/app/assets/javascripts/application.js?type=application/javascript"
#
# The root is the directory of the application, without the scheme,
# such as: /full/path
def initialize(uri, env)
@uri = uri
@env = env # for Pete's sake, needed because apparently overriding methods is a valid way of providing memoization
@root = env.root
@path = relative_path(uri)
@params = @filename = nil
end
attr_reader :path, :root, :uri

def filename
unless @filename
load_file_params
end
@filename
end

def params
unless @params
load_file_params
end
@params
end

# Used to retrieve an asset from the cache
# based on its relative path
def asset_key
"asset-uri:#{path}"
end

# Used to retrieve the array of previously recorded dependency sets (the "history") for the current uri.
# The file digest is part of the key, so the history is invalidated when the file itself changes.
def dependency_key
@dependency_digest ||= @env.file_digest(filename)
"asset-uri-cache-dependencies:#{path}:#{@dependency_digest}"
end

# Used to retrieve an asset object directly from the cache based on digest
# The object at the given path may have been modified, so we use
# the digest (passed in) to determine the correct asset.
def digest_key(digest)
"asset-uri-digest:#{path}:#{digest}"
end

private
def load_file_params
@filename, @params = URIUtils.parse_asset_uri(uri)
end

def relative_path(uri)
if relative_path = PathUtils.split_subpath(root, uri.sub(/\Afile:\/\//, "".freeze))
relative_path
else
uri
end
end
end
# The loader phase takes an asset URI location and returns a constructed Asset
# object.
module Loader
include DigestUtils, PathUtils, ProcessorUtils, URIUtils
include Engines, Mime, Processing, Resolve, Transformers


# Public: Load Asset by AssetURI.
#
# uri - AssetURI
#
# Returns Asset.
def load(uri)
filename, params = parse_asset_uri(uri)
if params.key?(:id)
unless asset = cache.get("asset-uri:#{VERSION}:#{uri}", true)
id = params.delete(:id)
uri_without_id = build_asset_uri(filename, params)
asset = load_asset_by_uri(uri_without_id, filename, params)
unloaded = UnloadedAsset.new(uri, self)
if unloaded.params.key?(:id)
unless asset = cache.get(unloaded.asset_key, true)
id = unloaded.params.delete(:id)
uri_without_id = build_asset_uri(unloaded.filename, unloaded.params)
asset = load_from_unloaded(UnloadedAsset.new(uri_without_id, self))
if asset[:id] != id
@logger.warn "Sprockets load error: Tried to find #{uri}, but latest was id #{asset[:id]}"
end
end
else
asset = fetch_asset_from_dependency_cache(uri, filename) do |paths|
asset = fetch_asset_from_dependency_cache(unloaded) do |paths|
if paths
digest = digest(resolve_dependencies(paths))
if id_uri = cache.get("asset-uri-digest:#{VERSION}:#{uri}:#{digest}", true)
cache.get("asset-uri:#{VERSION}:#{id_uri}", true)
digest = DigestUtils.digest(resolve_dependencies(paths))
if uri_from_cache = cache.get(unloaded.digest_key(digest), true)
cache.get(UnloadedAsset.new(uri_from_cache, self).asset_key, true)
end
else
load_asset_by_uri(uri, filename, params)
load_from_unloaded(unloaded)
end
end
end
Asset.new(self, asset)
end

private
def load_asset_by_uri(uri, filename, params)
unless file?(filename)
raise FileNotFound, "could not find file: #{filename}"

def load_from_unloaded(unloaded)
unless file?(unloaded.filename)
raise FileNotFound, "could not find file: #{unloaded.filename}"
end

load_path, logical_path = paths_split(config[:paths], filename)
load_path, logical_path = paths_split(config[:paths], unloaded.filename)

unless load_path
raise FileOutsidePaths, "#{filename} is no longer under a load path: #{self.paths.join(', ')}"
raise FileOutsidePaths, "#{unloaded.filename} is no longer under a load path: #{self.paths.join(', ')}"
end

logical_path, file_type, engine_extnames, _ = parse_path_extnames(logical_path)
name = logical_path

if pipeline = params[:pipeline]
if pipeline = unloaded.params[:pipeline]
logical_path += ".#{pipeline}"
end

if type = params[:type]
if type = unloaded.params[:type]
logical_path += config[:mime_types][type][:extensions].first
end

@@ -86,8 +154,8 @@ def load_asset_by_uri(uri, filename, params)
result = call_processors(processors, {
environment: self,
cache: self.cache,
uri: uri,
filename: filename,
uri: unloaded.uri,
filename: unloaded.filename,
load_path: load_path,
name: name,
content_type: type,
@@ -101,28 +169,28 @@ def load_asset_by_uri(uri, filename, params)
length: source.bytesize
)
else
dependencies << build_file_digest_uri(filename)
dependencies << build_file_digest_uri(unloaded.filename)
metadata = {
digest: file_digest(filename),
length: self.stat(filename).size,
digest: file_digest(unloaded.filename),
length: self.stat(unloaded.filename).size,
dependencies: dependencies
}
end

asset = {
uri: uri,
uri: unloaded.uri,
load_path: load_path,
filename: filename,
filename: unloaded.filename,
name: name,
logical_path: logical_path,
content_type: type,
source: source,
metadata: metadata,
dependencies_digest: digest(resolve_dependencies(metadata[:dependencies]))
dependencies_digest: DigestUtils.digest(resolve_dependencies(metadata[:dependencies]))
}

asset[:id] = pack_hexdigest(digest(asset))
asset[:uri] = build_asset_uri(filename, params.merge(id: asset[:id]))
asset[:uri] = build_asset_uri(unloaded.filename, unloaded.params.merge(id: asset[:id]))

# Deprecated: Avoid tracking Asset mtime
asset[:mtime] = metadata[:dependencies].map { |u|
@@ -133,18 +201,47 @@ def load_asset_by_uri(uri, filename, params)
nil
end
}.compact.max
asset[:mtime] ||= self.stat(filename).mtime.to_i
asset[:mtime] ||= self.stat(unloaded.filename).mtime.to_i



cache.set("asset-uri:#{VERSION}:#{asset[:uri]}", asset, true)
cache.set("asset-uri-digest:#{VERSION}:#{uri}:#{asset[:dependencies_digest]}", asset[:uri], true)
store = UnloadedAsset.new(asset[:uri], self)
cache.set(store.asset_key, asset, true)

cache.set(unloaded.digest_key(asset[:dependencies_digest]), store.path, true) # stores the relative path; UnloadedAsset#asset_key resolves it back to the same key on lookup in `load`

asset
end

def fetch_asset_from_dependency_cache(uri, filename, limit = 3)
key = "asset-uri-cache-dependencies:#{VERSION}:#{uri}:#{file_digest(filename)}"
history = cache.get(key) || []

# Internal: Resolve set of dependency URIs.
#
# Returns an array of things that the given uri depends on.
# For example the environment version: if you're using a different version of sprockets,
# then the dependencies should be different. This is used only for generating cache keys.
def resolve_dependencies(uris)
uris.map do |uri|
dependency = resolve_dependency(uri)
case dependency
when Array
dependency.map do |dep|
if dep && dep.is_a?(String) && relative_path = PathUtils.split_subpath(root, dep)
relative_path
else
dep
end
end
else
dependency
end
end
end


def fetch_asset_from_dependency_cache(unloaded, limit = 3)
key = unloaded.dependency_key

history = cache.get(key) || []
history.each_with_index do |deps, index|
if asset = yield(deps)
cache.set(key, history.rotate!(index)) if index > 0
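
Putting it together, a rough sketch of the cache keys `UnloadedAsset` (defined above in loader.rb) produces for one asset URI. This assumes the file exists under the app root; the digest and return values shown are illustrative:

```ruby
require 'sprockets'

env = Sprockets::Environment.new("/full/path")

uri = "file:///full/path/app/assets/javascripts/application.js?type=application/javascript"
unloaded = Sprockets::UnloadedAsset.new(uri, env)

unloaded.path
# => "app/assets/javascripts/application.js?type=application/javascript"

unloaded.asset_key
# => "asset-uri:app/assets/javascripts/application.js?type=application/javascript"

unloaded.digest_key("deadbeef")
# => "asset-uri-digest:app/assets/javascripts/application.js?type=application/javascript:deadbeef"

# dependency_key mixes in the file's digest, so the recorded dependency
# history is dropped whenever the file itself changes on disk:
unloaded.dependency_key
# => "asset-uri-cache-dependencies:app/assets/javascripts/application.js?type=application/javascript:<file digest>"
```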
