Detect asset changes by saving source_digests in manifest.yml, and speed up non-digest assets #7866

Closed
wants to merge 1 commit into
from

Projects

None yet

6 participants

@ndbroadbent
Contributor

Please see the Rails 4 pull request on sprockets-rails for more information: rails/sprockets-rails#21

This is a backport of the work for Rails 3.2. Please note that a new version of sprockets will need to be released with this commit: ndbroadbent/sprockets@82cc1de (from my rails3_assets_speedup branch)

These changes have already been released as a gem called turbo-sprockets-rails3, and it's working well for many applications.

@jeremy jeremy and 1 other commented on an outdated diff Oct 7, 2012
actionpack/lib/sprockets/static_compiler.rb
end
def compile
- manifest = {}
- env.each_logical_path(paths) do |logical_path|
- if asset = env.find_asset(logical_path)
- digest_path = write_asset(asset)
- manifest[asset.logical_path] = digest_path
- manifest[aliased_path_for(asset.logical_path)] = digest_path
+ start_time = Time.now.to_f
+
+ env.each_logical_path do |logical_path|
+ next unless compile_path?(logical_path)
@jeremy
jeremy Oct 7, 2012 Member

3-2-stable upgraded to Sprockets 2.2+ and switched to env.each_logical_path(paths) instead of handling compile_path? itself.

@ndbroadbent
ndbroadbent Oct 7, 2012 Contributor

Oh, right! I've rebased my sprockets patch onto Sprockets 2.2.1 as well as updating this pull request

@jeremy jeremy and 1 other commented on an outdated diff Oct 7, 2012
actionpack/lib/sprockets/static_compiler.rb
+ if RUBY_VERSION.to_f >= 1.9
+ @source_digests = encode_hash_as_utf8 @source_digests
+ @digest_files = encode_hash_as_utf8 @digest_files
+ end
+
+ if @manifest
+ write_manifest(source_digests: @source_digests, digest_files: @digest_files)
+ end
+
+ # Store digests in Rails config. (Important if non-digest is run after primary)
+ config = ::Rails.application.config
+ config.assets.digest_files = @digest_files
+ config.assets.source_digests = @source_digests
+
+ elapsed_time = ((Time.now.to_f - start_time) * 1000).to_i
+ env.logger.debug "Processed #{'non-' unless @digest}digest assets in #{elapsed_time}ms"
@jeremy
jeremy Oct 7, 2012 Member

This is now a very large, complex method. Possible to break it up into smaller, more digestible pieces?

@ndbroadbent
ndbroadbent Oct 7, 2012 Contributor

Ok, have split the asset processing part out into a separate #process_assets method that is called from #compile

@jeremy jeremy and 1 other commented on an outdated diff Oct 7, 2012
actionpack/lib/sprockets/static_compiler.rb
@@ -41,10 +90,17 @@ def write_asset(asset)
end
end
+
+ private
+
@jeremy
jeremy Oct 7, 2012 Member

Breaking change?

@ndbroadbent
ndbroadbent Oct 7, 2012 Contributor

Sorry about that, it just seemed like a good thing to do, since those methods are only used internally. I've removed the private

@jeremy jeremy and 1 other commented on an outdated diff Oct 7, 2012
actionpack/lib/sprockets/static_compiler.rb
@manifest_path = options.delete(:manifest_path) || target
+ @zip_files = options.delete(:zip_files) || /\.(?:css|html|js|svg|txt|xml)$/
@jeremy
jeremy Oct 7, 2012 Member

Is this new?

@ndbroadbent
ndbroadbent Oct 7, 2012 Contributor

Sorry about that, unneeded artifact from sprockets 2.6

@jeremy jeremy and 1 other commented on an outdated diff Oct 7, 2012
actionpack/lib/sprockets/assets.rake
@@ -43,34 +43,55 @@ namespace :assets do
config = Rails.application.config
config.assets.compile = true
config.assets.digest = digest unless digest.nil?
- config.assets.digests = {}
-
- env = Rails.application.assets
- target = File.join(Rails.public_path, config.assets.prefix)
- compiler = Sprockets::StaticCompiler.new(env,
- target,
- config.assets.precompile,
- :manifest_path => config.assets.manifest,
- :digest => config.assets.digest,
- :manifest => digest.nil?)
- compiler.compile
+ config.assets.digest_files ||= {}
+ config.assets.source_digests ||= {}
@jeremy
jeremy Oct 7, 2012 Member

Apps rely on config.assets.digests so changing it in our stable branch won't fly. How about keeping the existing manifest and introducing a new one alongside it?

@ndbroadbent
ndbroadbent Oct 7, 2012 Contributor

Ok, that sounds good, will create a separate manifest for source digests.

Will keep the change in the Rails 4 PR though.
I've updated the Rails helpers to use the config.assets.digest_files key, and this change hasn't caused any problems so far. I've also submitted a pull request to the asset_sync gem to support the new manifest format, as well as the old one.

@jeremy jeremy and 1 other commented on an outdated diff Oct 7, 2012
actionpack/lib/sprockets/assets.rake
+ generator.generate
+ else
+ compiler = Sprockets::StaticCompiler.new(env, target, config.assets.precompile,
+ :digest => config.assets.digest,
+ :manifest => digest.nil?,
+ :manifest_path => config.assets.manifest,
+ :digest_files => config.assets.digest_files,
+ :source_digests => config.assets.source_digests
+ )
+ compiler.compile
+ end
+
+ unless config.assets.clean_after_precompile == false
+ cleaner = Sprockets::StaticCleaner.new(env, target, config.assets.digest_files)
+ cleaner.remove_old_assets!
+ end
@jeremy
jeremy Oct 7, 2012 Member

I don't see a default value for config.assets.clean_after_precompile set, so cleaning will happen by default. We keep old asset digests around so stale references to them still work (such as assets referenced from old emails)

@ndbroadbent
ndbroadbent Oct 7, 2012 Contributor

Sorry about that, wasn't meant to be in here. I thought that was a useful feature at first, but didn't realize that old versions of assets might still be requested by cached pages, etc. I've since solved this problem within Capistrano, since invalidation depends on knowledge of previous releases: capistrano/capistrano#281

So I've removed this 'cleaning' feature from the Rails 4 PR, as well as my gem, but forgot to remove it from this Rails 3 branch.

@jeremy jeremy commented on the diff Oct 7, 2012
actionpack/lib/sprockets/assets.rake
end
- task :all do
- Rake::Task["assets:precompile:primary"].invoke
- # We need to reinvoke in order to run the secondary digestless
- # asset compilation run - a fresh Sprockets environment is
- # required in order to compile digestless assets as the
- # environment has already cached the assets on the primary
- # run.
- ruby_rake_task("assets:precompile:nondigest", false) if Rails.application.config.assets.digest
+ task :all => ["assets:cache:clean"] do
@jeremy
jeremy Oct 7, 2012 Member

See comment above -- always cleaning before generating assets means we we're orphaning references to old assets.

@ndbroadbent
ndbroadbent Oct 7, 2012 Contributor

Asset cleaning has been removed from the pull request, please see above

@jeremy jeremy and 1 other commented on an outdated diff Oct 7, 2012
actionpack/lib/sprockets/static_compiler.rb
end
end
- write_manifest(manifest) if @manifest
+
+ # Encode all filenames & digests as UTF-8 for Ruby 1.9,
+ # otherwise YAML dumps other string encodings as !binary
+ if RUBY_VERSION.to_f >= 1.9
+ @source_digests = encode_hash_as_utf8 @source_digests
+ @digest_files = encode_hash_as_utf8 @digest_files
+ end
@jeremy
jeremy Oct 7, 2012 Member

Why is this an issue?

@ndbroadbent
ndbroadbent Oct 7, 2012 Contributor

It works fine without it, but it just makes the manifest.yml file look really ugly, and bug reports become harder to debug.

@jeremy jeremy and 1 other commented on an outdated diff Oct 7, 2012
actionpack/lib/sprockets/static_non_digest_generator.rb
+ [file, digest_file[DIGEST_REGEX, 1]]
+ }.flatten]
+ end
+
+
+ # Generate non-digest assets by making a copy of the digest asset,
+ # with digests stripped from js and css. The new files are also gzipped.
+ # Other assets are copied verbatim.
+ def generate
+ start_time = Time.now.to_f
+
+ env.each_logical_path do |logical_path|
+ if File.basename(logical_path)[/[^\.]+/, 0] == 'index'
+ logical_path.sub!(/\/index\./, '.')
+ end
+ next unless compile_path?(logical_path)
@jeremy
jeremy Oct 7, 2012 Member

Ditto for comment above about env.each_logical_path(paths), and the index handling has changed to use aliases.

@ndbroadbent
ndbroadbent Oct 7, 2012 Contributor

Updated

@jeremy jeremy and 1 other commented on an outdated diff Oct 7, 2012
actionpack/lib/sprockets/static_non_digest_generator.rb
+
+ digest_path = @digest_files[logical_path]
+ abs_digest_path = "#{@target}/#{digest_path}"
+ abs_logical_path = "#{@target}/#{logical_path}"
+
+ mtime = File.mtime(abs_digest_path)
+
+ # Remove known digests from css & js
+ if abs_digest_path.match(/\.(?:js|css)$/)
+ asset_body = File.read(abs_digest_path)
+
+ # Find all hashes in the asset body with a leading '-'
+ asset_body.gsub!(DIGEST_REGEX) do |match|
+ # Only remove if known digest
+ $1.in?(@asset_digests.values) ? '' : match
+ end
@jeremy
jeremy Oct 7, 2012 Member

This feels.. dangerous :)

@ndbroadbent
ndbroadbent Oct 7, 2012 Contributor

IMO, it's not as dangerous as it might look. What it does is scan through the file looking for any strings that match /-([0-9a-f]{32})\./, for example: application**-15e3a26e343f4cee7500049e0de72b0a.**css (including leading - and trailing .)

It then does a second check to make sure that digest is contained in the manifest, which means you can safely add references to a compiled file from a different application, and it won't be stripped.

@jeremy jeremy commented on the diff Oct 7, 2012
actionpack/lib/sprockets/static_non_digest_generator.rb
+ mtime = File.mtime(abs_digest_path)
+
+ # Remove known digests from css & js
+ if abs_digest_path.match(/\.(?:js|css)$/)
+ asset_body = File.read(abs_digest_path)
+
+ # Find all hashes in the asset body with a leading '-'
+ asset_body.gsub!(DIGEST_REGEX) do |match|
+ # Only remove if known digest
+ $1.in?(@asset_digests.values) ? '' : match
+ end
+
+ # Write non-digest file
+ File.open abs_logical_path, 'w' do |f|
+ f.write asset_body
+ end
@jeremy
jeremy Oct 7, 2012 Member

This may change the file's encoding

@ndbroadbent
ndbroadbent Oct 7, 2012 Contributor

Have just tested with some Japanese characters, and didn't see any problems with encoding.

@jeremy
Member
jeremy commented Oct 7, 2012

Looks promising! Storing digests and checking on subsequent compile is great.

Generating non-digest assets from possibly-existing digests assets feels like it may be a step too far though. It's bolted on and needs some serious digging to strip digest URLs. I think it'd be better to omit.

I'd really rather see non-digest assets go away entirely -- much simpler :)

Also, this makes a handful of breaking changes: renaming config.assets.digests, renaming related methods, making some methods private, etc. I think the change can be made less intrusively, without changing methods or breaking any current apps, but it'd mean introducing a second manifest of source digests rather than changing the existing manifest at all.

@neersighted

👍

@ndbroadbent
Contributor

Hi there, thanks so much for your review! I've updated the pull request with all the changes you mentioned:

  • Source digests are now saved to a separate file: sources_manifest.yml
  • Renamed config.assets.digest_files back to config.assets.digests (P.S. This wouldn't break any plain Rails 3.2 apps, since all the helper methods were also updated internally. But other gems like asset_sync might have had some issues)
  • private method back to public

RE: non-digest assets - I think there's definitely a case to keep them around if they can be compiled in 60ms. The only reason for #5379 is that they make the compile take twice as long, so I don't think anyone will care about making them optional if they don't add any extra overhead.

However, the non-digest speedup is really only a secondary benefit. The main issue is that we wouldn't have any way to know if the non-digest asset was out of date, so we would need to recompile it every time. Unless we also had a nondigest_manifest.yml file to keep track of the non-digest digests... Otherwise, this would really limit the benefit of this pull request.

I do agree that it would be much simpler if non-digest assets could go away entirely, but I think they still have some purpose (images in emails, etc.)

@ndbroadbent
Contributor

Also please note that I've rebased my sprockets patch onto 2.2.1, and have opened an issue requesting the merge and release of sprockets 2.2.2.

@davidjrice

@ndbroadbent I agree with @jeremy I would like to see non-digested files go away entirely.

With the beginnings of work I have started in #5379 removing everything else in the public directory and generating them at compile time is much preferred in my eyes.

A nondigest manifest would be a good addition to cut down the generation time even further. However, adding a manifest removes one of the biggest benefits of the asset pipeline. That repository histories are not bloated by loads of "bundling, jammit whatever." commits. So it should definitely be optional.

Not having to generate non-digest variants as a starter, would cut the processing time down by 50% and also remove what IMHO is a pretty complicated (and hacky) approach to reset the rake env to do the nondigest compile.

The final issue I'm wrestling with in #5379 is making sprockets understand the files in /public. The problem here is that sprockets is mounted to /assets and the non-digest files we care about are public/404.html etc.

@guilleiguaran
Member

@davidjrice @ndbroadbent @jeremy non-digested assets won't be precompiled in the next version of sprockets-rails :)

@davidjrice

@guilleiguaran does that mean 404.html / 500.html could be pre-generated? Therefore removing one of the major use cases for non-digested assets.

@ndbroadbent
Contributor

@davidjrice - If you are checking in your assets to source control, you're going to have a lot of bloated commits anyway. I don't think an extra manifest will make much difference, compared to the rest of your changed assets.

Not having to generate non-digest assets would definitely cut down the processing time by 50%, but this pull request also cuts down the processing time by 50%. It only takes about 64 milliseconds to strip digests and copy the files.
Also, the hacky rake env reset has been removed from recent versions of the asset pipeline, as well as this pull request.

I do agree that it would great to generate all the static pages in public/. You might be interested in the Rake task I've written to precompile all my static pages, including error pages: https://gist.github.com/3868731

I'm not sure what you mean about sprockets needing to understand files in public/. If a template is compiled to public/404.html, it will just be served by nginx or Rack::Static.

@guilleiguaran - Sounds good! Very happy to remove that code if it's not relevant anymore.

@guilleiguaran
Member

@ndbroadbent can you check the current sprockets-rails and tell us what of your changes still being relevant with the new version of sprockets-rails? 😃

@guilleiguaran
Member

@davidjrice I think I will add a rake assets:precompile:all to copy all digested assets to non-digested versions and call it a day!

@josh
Member
josh commented Oct 22, 2012

All this should be irrelevant now.

@ndbroadbent
Contributor

OK, cool! Hooray for faster compiles, either way 😄

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment