Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Absolute resolved file path slows precompilation between deploys #59

Closed
mferrier opened this issue Jun 4, 2015 · 22 comments
Closed

Absolute resolved file path slows precompilation between deploys #59

mferrier opened this issue Jun 4, 2015 · 22 comments

Comments

@mferrier
Copy link

mferrier commented Jun 4, 2015

Hi, I'm running into an issue with the way Sprockets resolves asset paths and wanted to see if there was a way to work around it, and if not, what part of the code would be best to modify for a PR.

Basically, the fact that Sprockets resolves asset URIs to a full file:///full/path/to/a/file means that if a project directory which has done a precompile gets moved to a new directory, the precompile loses the advantage of the FileCache and manifest, because Sprockets thinks the files aren't the same because their full path doesn't match, so it doesn't use the files in the cache.

Our project happens to have a lot of assets, which means even a precompile with a manifest and a populated cache causes Sprockets to recalculate the hashes for each asset, and it ends up taking 15-20 minutes whereas in Rails 3 with turbo-sprockets-rails3, the compile took around 5 minutes.

If we run the precompile again without changing the parent directory, it finishes in 10 seconds.

Our Rails 4 app deployment process looks like:

  1. Each release pushed to a timestamped directory like: /data/app/releases/1, /data/app/releases/2, /data/app/releases/3, etc
  2. There's a "current" symlink that points to the most recent release: /data/app/current -> /data/app/releases/3
  3. The Sprockets precompile target directory is a symlink and is shared between all releases, e.g. /data/app/releases/3/public/assets -> /data/app/shared/assets, /data/app/releases/2/public/assets -> /data/app/shared/assets, etc.
  4. The Sprockets cache FileStore directory is a symlink and is shared between all releases, e.g. /data/app/releases/3/tmp/cache -> /data/app/shared/cache, /data/app/releases/2/tmp/cache-> /data/app/shared/cache, etc.
  5. When we deploy, we first push to a new release directory, e.g. /data/app/releases/4
  6. Then, we run the rake assets:precompile task in the new release directory
  7. Since the Sprockets cache filenames are based on the absolute path of the assets, Sprockets doesn't reuse the cached files and the precompile is slower than it has to be.

So I guess my questions are:

  1. Is there any way to have Sprockets use the relative pathname when naming .cache files instead of the full URI?
  2. If not, is there any other way around this?
  3. If not, would you guys be up for a PR that adds a configuration directive or some other way to add this behaviour?

I thought perhaps this wasn't a possibility because of the fact that multiple asset paths could share the same relative paths, but the fact that this was achieved in Rails 3 with turbo-sprockets-rails3 suggests it's possible.

Thanks!

@rafaelfranca
Copy link
Member

Thank you for the issue. Could you tell which version of sprockets are you using?

@mferrier
Copy link
Author

mferrier commented Jun 5, 2015

We're using sprockets-3.1.0.

I should also mention that the manifest file is shared between releases and is in the compilation target directory.

@mferrier
Copy link
Author

mferrier commented Jun 5, 2015

FYI I just checked and the situation is the same in 3.2.0.

@mferrier
Copy link
Author

mferrier commented Jun 8, 2015

@rafaelfranca any feedback? or a pointer towards where in the code I should look into creating a PR? thank you!

@rafaelfranca
Copy link
Member

I didn't have time to look on it yet, but you must look at Sprockets::Environment and CachedEnvironment about this issue.

@cfassel
Copy link

cfassel commented Jun 24, 2015

same problem with 3.2.0 and Rails 4.2.2

@aaron
Copy link

aaron commented Jun 26, 2015

I've had the same problem with long asset precompile times on deploys. Downgrading sprockets to 2.12.4 fixed it for me.

@edslocomb
Copy link

Sprockets 3.0.3 does not have this problem, but the issue is severe in sprockets >= 3.1.0

Every capistrano deployment for one of our apps is takes 40 minutes to complete the assets:precompile step when no assets have changed. Running rake assets:precompile in the most recent (successful) release directory takes about 20 seconds.

There were a ton of cache keys changed between 3.0.3 and 3.1.0; I haven't yet found a single obvious change that introduced this awful behavior.

And I see 4.x has now been merged into master, so I'm not even sure which branch I should be trying to fix?

@coorasse
Copy link

I noticed the same problem with sprockets 3.2.0 and ckeditor 4.1.2 precompilation.
There is an issue open there: galetahub/ckeditor#573.
Rolling back to sprockets 2.12.4 solved the issue for me. I'll try now with 3.0.3 as suggested by @edslocomb

@zachallaun
Copy link

@davidalbert and I ran into this yesterday while investigating a substantial increase in our Heroku deploy times. We did not initially realize that this issue was the cause, but it is: every push to Heroku builds your application in a random directory (e.g. /tmp/build_98aaaa61490faef1dc1eb3ba2d69fbd5) that changes between each deploy. This means that sprockets 3 asset caching is completely broken for any application deployed on Heroku. In fact, caching degrades build performance due to the overhead of repeated writes and the cost of copying the (unusable) cache directory during the build step.

Dave and I created a small example Rails application to reproduce: https://github.com/davidbalbert/test-3.2. If anyone is curious, clone, bundle install, and follow these steps:

~/test-3.2 $ time RAILS_ENV=production bin/rake assets:precompile
# Observe long compile time

~/test-3.2 $ time RAILS_ENV=production bin/rake assets:precompile
# Observe short compile time due to cache present in tmp/cache

~/test-3.2 $ cd .. && mv test-3.2 no-cache && cd no-cache

~/no-cache $ time RAILS_ENV=production bin/rake assets:precompile
# Observe long compile time, even though tmp/cache is in place
# and no assets have changed

We began looking into potential fixes, but unfortunately weren't familiar enough with sprockets to make meaningful progress. Please let us know if we can be helpful in contributing to a fix.

@vincentwoo
Copy link

@zachallaun great work sleuthing. perhaps we should involve @hone?

@schneems
Copy link
Member

schneems commented Aug 4, 2015

Thanks for the example app. I have a few things to say A) Sprockets 3 is really complicated. B) I think this is where the problem is coming from

cache.fetch("file_digest:#{path}:#{stat.mtime.to_i}") do
self.stat_digest(path, stat)
end

cache.fetch("file_digest:#{path}:#{stat.mtime.to_i}") do
  self.stat_digest(path, stat)
end

The path variable here is the absolute path, for example:

"/Users/richardschneeman/Documents/projects/tmp/test-move-3.2-6/app/assets/javascripts/application.js"

This still doesn't explain why it used to work, and what exactly changed. My current guess is that a cache key is generated with the absolute filename, and stored in ./tmp, then when the directory is moved, the new absolute path generates a new cache key so it thinks that it isn't on disk and doesn't realize it until the last minute:

if File.exist?(target)
logger.debug "Skipping #{target}, already exists"
else
logger.info "Writing #{target}"
asset.write_to target
end

if File.exist?(target)
  logger.debug "Skipping #{target}, already exists"
else
  logger.info "Writing #{target}"
  asset.write_to target
end

When you deploy this example app the first time you see the assets written:

-----> Preparing app for Rails asset pipeline
       Running: rake assets:precompile
       I, [2015-08-04T23:39:36.734489 #1087]  INFO -- : Writing /tmp/build_a3e3555bd3593e048943fcbcfdd67e97/public/assets/application-47e3846a90680e1d494256b1e67f4c6c7d988a75c1bc9b56aa4942c148d25d23.js
       I, [2015-08-04T23:39:36.754071 #1087]  INFO -- : Writing /tmp/build_a3e3555bd3593e048943fcbcfdd67e97/public/assets/application-e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855.css
       Asset precompilation completed (33.69s)

On repeated deploys:

remote: -----> Preparing app for Rails asset pipeline
remote:        Running: rake assets:precompile
remote:        Asset precompilation completed (36.94s)

It still takes a long time but doesn't output Writing.

@zachallaun
Copy link

@schneems Thanks for looking further into this! I certainly agree with you on point A. There seem to be many layers of caching going on, and I believe that all of them that rely on a path or a URI are broken.

Here are another couple of lines that are likely impacting performance heavily:

I think the cache misses indicated above might be the largest performance issue, as they seem to be caching the result of load_asset_by_uri, which is the method actually running all processors on a particular asset. (See here.)

@zachallaun
Copy link

@josh, It looks like you wrote much of the code around the areas we've been discussion. If you have time and are willing, it would be great to hear your opinion on this.

@schneems
Copy link
Member

schneems commented Aug 5, 2015

That first line is never executed in the example you gave me

puts [filename, params].inspect
["/Users/richardschneeman/Documents/projects/tmp/test-move-3.2-6/app/assets/javascripts/application.js", {:type=>"application/javascript"}]

params.key?(:id) will always be false.

If someone could set up a script that exits with a 1 when this takes longer than 10 seconds then maybe we could use git bisect to figure out where the problem was introduced.

@schneems
Copy link
Member

schneems commented Aug 5, 2015

Here's some more info for you. I used git bisect to find the regression with this script: https://gist.github.com/schneems/85f592ba2773761dfcf3

After putting that in a local clone of the sprockets repo and pointing the example app to use a path to the local repo I ran

git bisect start
git bisect bad 543a5a27190c26de8f3a1b03e18aed8da0367c63
git bisect good 9937e2d3798350154d6d34aa62fb7a51d5d7d5a0
git bisect run ./move-project-test.rb

Which shows that the last "good" commit is: 4a16c90

The commit that introduced the regression is:
d25057e

This change was introduced by @josh in sstephenson#555. Maybe he has some ideas as to the failure mode at play here, it's not obvious to me.

@schneems
Copy link
Member

schneems commented Aug 6, 2015

That script isn't perfect and that bisect was a red herring.

schneems added a commit that referenced this issue Aug 8, 2015
This fix is targeted at 3.x. I haven't tested 4.x for presence of this problem, but I would be surprised if it didn't exist there as well.

Right now the cache is using absolute paths to generate cache keys, this doesn't work if you end up running `rake assets:precompile` in different directories. Say you've moved your project or are running on a different server with a different absolute path. Or say, on every build on Heroku ever. This was reported in #59.

This change introduces UnloadedAsset class that is responsible for among other things generating consistent cache keys.

Previously when running https://gist.github.com/schneems/85f592ba2773761dfcf3 the output would show that the second build took over 38 seconds to build. With this patch applied it takes around 3.385542936011916 seconds.

DON'T MERGE THIS IN YET it needs a cleanup, some better docs, cause geez the 90% of the implicit existing undocumented behavior here is kinda crazy. Not all tests are passing yet either.
@schneems
Copy link
Member

schneems commented Aug 8, 2015

Pretty sure it was 100% the cache keys. I made a WIP pull request here if you want to take a look #89. Still needs some love, but you'll get the gist.

schneems added a commit that referenced this issue Aug 9, 2015
This fix is targeted at 3.x. I haven't tested 4.x for presence of this problem, but I would be surprised if it didn't exist there as well.

Right now the cache is using absolute paths to generate cache keys, this doesn't work if you end up running `rake assets:precompile` in different directories. Say you've moved your project or are running on a different server with a different absolute path. Or say, on every build on Heroku ever. This was reported in #59.

This change introduces UnloadedAsset class that is responsible for among other things generating consistent cache keys.

Previously when running https://gist.github.com/schneems/85f592ba2773761dfcf3 the output would show that the second build took over 38 seconds to build. With this patch applied it takes around 3.385542936011916 seconds.

DON'T MERGE THIS IN YET it needs a cleanup, some better docs, cause geez the 90% of the implicit existing undocumented behavior here is kinda crazy. Not all tests are passing yet either.
schneems added a commit that referenced this issue Aug 9, 2015
This fix is targeted at 3.x. I haven't tested 4.x for presence of this problem, but I would be surprised if it didn't exist there as well.

Right now the cache is using absolute paths to generate cache keys, this doesn't work if you end up running `rake assets:precompile` in different directories. Say you've moved your project or are running on a different server with a different absolute path. Or say, on every build on Heroku ever. This was reported in #59.

This change introduces UnloadedAsset class that is responsible for among other things generating consistent cache keys.

Previously when running https://gist.github.com/schneems/85f592ba2773761dfcf3 the output would show that the second build took over 38 seconds to build. With this patch applied it takes around 3.385542936011916 seconds.

DON'T MERGE THIS IN YET it needs a cleanup, some better docs, cause geez the 90% of the implicit existing undocumented behavior here is kinda crazy. Not all tests are passing yet either.
schneems added a commit that referenced this issue Aug 10, 2015
This fix is targeted at 3.x. It looks like this problem doesn't exist on sprockets 4.x. Why exactly? I have no clue, I would appreciate independent verification of this. Please open up a new issue and @-mention me if you can reproduce this problem with Sprockets 4.x.

Right now the cache is using absolute paths to generate cache keys, this doesn't work if you end up running `rake assets:precompile` in different directories. Say you've moved your project or are running on a different server with a different absolute path. Or say, on every build on Heroku ever. This was reported in #59.

This change introduces UnloadedAsset class that is responsible for among other things generating consistent cache keys.

Previously when running https://gist.github.com/schneems/85f592ba2773761dfcf3 the output would show that the second build took over 38 seconds to build. With this patch applied it takes around 3.38 seconds.

Most of this commit is docs, as the current behavior and use of the cache is not really documented, and extremely difficult to follow in the code. All existing behavior is preserved, all tests pass, an extra test was added to ensure that a project built using a previously built directory will be fast.
@schneems
Copy link
Member

I've got a working solution at #89. I'm done with cleanup, added docs and a test for current behavior. I also updated the PR description. I tried to reproduce the problem with Sprockets 4.x and couldn't so it looks like this issue is isolated to only 3.x though I have no idea why and would like others to double check me to see if they can reproduce the original issue with sprockets 4.x.

schneems added a commit that referenced this issue Aug 10, 2015
This fix is targeted at 3.x. This problem may or may not exist on 4.x I've been unable to get 4.x working with a Rails app.


Right now the cache is using absolute paths to generate cache keys, this doesn't work if you end up running `rake assets:precompile` in different directories. Say you've moved your project or are running on a different server with a different absolute path. Or say, on every build on Heroku ever. This was reported in #59.

This change introduces UnloadedAsset class that is responsible for among other things generating consistent cache keys.

Previously when running https://gist.github.com/schneems/85f592ba2773761dfcf3 the output would show that the second build took over 38 seconds to build. With this patch applied it takes around 3.38 seconds.

Most of this commit is docs, as the current behavior and use of the cache is not really documented, and extremely difficult to follow in the code. All existing behavior is preserved, all tests pass, an extra test was added to ensure that a project built using a previously built directory will be fast.
schneems added a commit to schneems/sprockets that referenced this issue Aug 12, 2015
This fix is targeted at 3.x. This problem may or may not exist on 4.x I've been unable to get 4.x working with a Rails app.


Right now the cache is using absolute paths to generate cache keys, this doesn't work if you end up running `rake assets:precompile` in different directories. Say you've moved your project or are running on a different server with a different absolute path. Or say, on every build on Heroku ever. This was reported in rails#59.

This change introduces UnloadedAsset class that is responsible for among other things generating consistent cache keys.

Previously when running https://gist.github.com/schneems/85f592ba2773761dfcf3 the output would show that the second build took over 38 seconds to build. With this patch applied it takes around 3.38 seconds.

Most of this commit is docs, as the current behavior and use of the cache is not really documented, and extremely difficult to follow in the code. All existing behavior is preserved, all tests pass, an extra test was added to ensure that a project built using a previously built directory will be fast.
schneems added a commit that referenced this issue Aug 12, 2015
This fix is targeted at 3.x. This problem may or may not exist on 4.x I've been unable to get 4.x working with a Rails app.


Right now the cache is using absolute paths to generate cache keys, this doesn't work if you end up running `rake assets:precompile` in different directories. Say you've moved your project or are running on a different server with a different absolute path. Or say, on every build on Heroku ever. This was reported in #59.

This change introduces UnloadedAsset class that is responsible for among other things generating consistent cache keys.

Previously when running https://gist.github.com/schneems/85f592ba2773761dfcf3 the output would show that the second build took over 38 seconds to build. With this patch applied it takes around 3.38 seconds.

Most of this commit is docs, as the current behavior and use of the cache is not really documented, and extremely difficult to follow in the code. All existing behavior is preserved, all tests pass, an extra test was added to ensure that a project built using a previously built directory will be fast.
schneems added a commit that referenced this issue Aug 12, 2015
This fix is targeted at 3.x. This problem may or may not exist on 4.x I've been unable to get 4.x working with a Rails app.


Right now the cache is using absolute paths to generate cache keys, this doesn't work if you end up running `rake assets:precompile` in different directories. Say you've moved your project or are running on a different server with a different absolute path. Or say, on every build on Heroku ever. This was reported in #59.

This change introduces UnloadedAsset class that is responsible for among other things generating consistent cache keys.

Previously when running https://gist.github.com/schneems/85f592ba2773761dfcf3 the output would show that the second build took over 38 seconds to build. With this patch applied it takes around 3.38 seconds.

Most of this commit is docs, as the current behavior and use of the cache is not really documented, and extremely difficult to follow in the code. All existing behavior is preserved, all tests pass, an extra test was added to ensure that a project built using a previously built directory will be fast.
schneems added a commit that referenced this issue Aug 12, 2015
schneems added a commit to schneems/sprockets that referenced this issue Aug 12, 2015
This fix is targeted at 4.x.

Right now the cache is using absolute paths to generate cache keys, this doesn't work if you end up running `rake assets:precompile` in different directories. Say you've moved your project or are running on a different server with a different absolute path. Or say, on every build on Heroku ever. This was reported in rails#59 and rails#90

This commit is a introduces UnloadedAsset class that is responsible for among other things generating consistent cache keys.

Previously when running https://gist.github.com/schneems/85f592ba2773761dfcf3 the output would show that the second build took over 38 seconds to build. With this patch applied it takes around 3.38 seconds.

Most of this commit is docs, as the current behavior and use of the cache is not really documented, and extremely difficult to follow in the code. All existing behavior is preserved, all tests pass, an extra test was added to ensure that a project built using a previously built directory will be fast.
@mferrier
Copy link
Author

Confirmed this fixed our problem. Thanks to everyone for the fix!


EDIT: I spoke too soon. Our team had to revert the gem version bump because it appears now that dependent files aren't being recompiled when the dependency changes. We'll post more details (or open a new issue) when we have more information.

@rafaelfranca
Copy link
Member

Please open a new issue when you find the information.

@bughit
Copy link

bughit commented Aug 13, 2015

please take a look at #96

malida pushed a commit to malida/sprockets that referenced this issue Aug 14, 2015
This fix is targeted at 4.x.

Right now the cache is using absolute paths to generate cache keys, this doesn't work if you end up running `rake assets:precompile` in different directories. Say you've moved your project or are running on a different server with a different absolute path. Or say, on every build on Heroku ever. This was reported in rails#59 and rails#90

This commit is a introduces UnloadedAsset class that is responsible for among other things generating consistent cache keys.

Previously when running https://gist.github.com/schneems/85f592ba2773761dfcf3 the output would show that the second build took over 38 seconds to build. With this patch applied it takes around 3.38 seconds.

Most of this commit is docs, as the current behavior and use of the cache is not really documented, and extremely difficult to follow in the code. All existing behavior is preserved, all tests pass, an extra test was added to ensure that a project built using a previously built directory will be fast.
malida pushed a commit to malida/sprockets that referenced this issue Aug 16, 2015
This fix is targeted at 4.x.

Right now the cache is using absolute paths to generate cache keys, this doesn't work if you end up running `rake assets:precompile` in different directories. Say you've moved your project or are running on a different server with a different absolute path. Or say, on every build on Heroku ever. This was reported in rails#59 and rails#90

This commit is a introduces UnloadedAsset class that is responsible for among other things generating consistent cache keys.

Previously when running https://gist.github.com/schneems/85f592ba2773761dfcf3 the output would show that the second build took over 38 seconds to build. With this patch applied it takes around 3.38 seconds.

Most of this commit is docs, as the current behavior and use of the cache is not really documented, and extremely difficult to follow in the code. All existing behavior is preserved, all tests pass, an extra test was added to ensure that a project built using a previously built directory will be fast.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

10 participants