Sort static files just once. #3204

Merged
merged 3 commits into from Dec 27, 2014

Conversation

Projects
None yet
5 participants
@parkr
Member

parkr commented Dec 11, 2014

Ref: #2075.

@parkr

This comment has been minimized.

Show comment
Hide comment
@parkr

parkr Dec 11, 2014

Member

@gjtorikian can you try this?

Member

parkr commented Dec 11, 2014

@gjtorikian can you try this?

@parkr parkr added the fix label Dec 11, 2014

@gjtorikian

This comment has been minimized.

Show comment
Hide comment
@gjtorikian

gjtorikian Dec 11, 2014

Member

Shaved about forty seconds! I'll take it!

Member

gjtorikian commented Dec 11, 2014

Shaved about forty seconds! I'll take it!

@parkr

This comment has been minimized.

Show comment
Hide comment
@parkr

parkr Dec 11, 2014

Member

😮 Wooooow. Can you run stackprof again with this code and see what else is causing the bottleneck? 40 seconds of a 5-min build is a big win, but it shouldn't take more than 1 min to compile the entire site.

The sorting should probably also take place after generate... hm.

Member

parkr commented Dec 11, 2014

😮 Wooooow. Can you run stackprof again with this code and see what else is causing the bottleneck? 40 seconds of a 5-min build is a big win, but it shouldn't take more than 1 min to compile the entire site.

The sorting should probably also take place after generate... hm.

@@ -15,6 +17,7 @@ def initialize(site, base, dir, name, collection = nil)
@dir = dir
@name = name
@collection = collection
+ @relative_path = File.join(*[@dir, @name].compact)

This comment has been minimized.

@gjtorikian

gjtorikian Dec 11, 2014

Member

Would this ever have a nil value? Just curious about the compact.

@gjtorikian

gjtorikian Dec 11, 2014

Member

Would this ever have a nil value? Just curious about the compact.

This comment has been minimized.

@parkr

parkr Dec 11, 2014

Member

if both @dir and @name are nil, then it'd be File.join() with nothing. It should be theoretically impossible for @name to be empty, but @dir can be.

@parkr

parkr Dec 11, 2014

Member

if both @dir and @name are nil, then it'd be File.join() with nothing. It should be theoretically impossible for @name to be empty, but @dir can be.

@gjtorikian

This comment has been minimized.

Show comment
Hide comment
@gjtorikian

gjtorikian Dec 11, 2014

Member

To be clear, here's what's going on in my build:

  • It's five minutes total, but we have several collections and several static files, not to mention minifying JS
  • One collection is a clear bottleneck, taking two and a half minutes to complete
  • Zoning in on just this one collection, it takes a minute-and-a-half to build (the other minute is coming from somewhere else, not dealing with it now)

That's where I discovered this sort issue.

Here are the top three rows from stackprof:

      3761  (16.0%)        3761  (16.0%)     Jekyll::StaticFile#relative_path
      2461  (10.5%)        2453  (10.4%)     Set#initialize_copy
      2306   (9.8%)        1075   (4.6%)     block in Jekyll::URL#generate_url

Clear drop-off after Set#initialize_copy.

Interestingly, that relative_path now is coming from the sitemap plugin:

def html_files
  @site.static_files.select { |file| File.extname(file.relative_path) == ".html" }
end

If I remove this plugin, I gain just a few more seconds.

The other one seems to come from site.rb:

def documents
  collections.reduce(Set.new) do |docs, (_, collection)|
    docs + collection.docs + collection.files
end.to_a
Member

gjtorikian commented Dec 11, 2014

To be clear, here's what's going on in my build:

  • It's five minutes total, but we have several collections and several static files, not to mention minifying JS
  • One collection is a clear bottleneck, taking two and a half minutes to complete
  • Zoning in on just this one collection, it takes a minute-and-a-half to build (the other minute is coming from somewhere else, not dealing with it now)

That's where I discovered this sort issue.

Here are the top three rows from stackprof:

      3761  (16.0%)        3761  (16.0%)     Jekyll::StaticFile#relative_path
      2461  (10.5%)        2453  (10.4%)     Set#initialize_copy
      2306   (9.8%)        1075   (4.6%)     block in Jekyll::URL#generate_url

Clear drop-off after Set#initialize_copy.

Interestingly, that relative_path now is coming from the sitemap plugin:

def html_files
  @site.static_files.select { |file| File.extname(file.relative_path) == ".html" }
end

If I remove this plugin, I gain just a few more seconds.

The other one seems to come from site.rb:

def documents
  collections.reduce(Set.new) do |docs, (_, collection)|
    docs + collection.docs + collection.files
end.to_a
@parkr

This comment has been minimized.

Show comment
Hide comment
@parkr

parkr Dec 11, 2014

Member

We could cache the extname call for sure... As for the #documents call, how often is it being called itself?

Member

parkr commented Dec 11, 2014

We could cache the extname call for sure... As for the #documents call, how often is it being called itself?

@gjtorikian gjtorikian referenced this pull request in jekyll/jekyll-sitemap Dec 11, 2014

Closed

Option to not build locally? #60

@gjtorikian

This comment has been minimized.

Show comment
Hide comment
@gjtorikian

gjtorikian Dec 11, 2014

Member
2501    98.85%  block in Jekyll::Site#documents
16  0.63%   block in Sass::Selector::SimpleSequence#do_extend
13  0.51%   Sass::Selector::SimpleSequence#with_more_sources

I think that is what you wanted? 😟 I'm not sure! We can also put this PR on pause and talk IRL.

Member

gjtorikian commented Dec 11, 2014

2501    98.85%  block in Jekyll::Site#documents
16  0.63%   block in Sass::Selector::SimpleSequence#do_extend
13  0.51%   Sass::Selector::SimpleSequence#with_more_sources

I think that is what you wanted? 😟 I'm not sure! We can also put this PR on pause and talk IRL.

@parkr

This comment has been minimized.

Show comment
Hide comment
@parkr

parkr Dec 11, 2014

Member

2501 98.85% block in Jekyll::Site#documents

WHOA. Site#documents is called 2500 times?! geepers creepers. Let's talk IRL when you have time.

Member

parkr commented Dec 11, 2014

2501 98.85% block in Jekyll::Site#documents

WHOA. Site#documents is called 2500 times?! geepers creepers. Let's talk IRL when you have time.

@gjtorikian

This comment has been minimized.

Show comment
Hide comment
@gjtorikian

gjtorikian Dec 12, 2014

Member

If you saw the site you'd probably weep. That's not such a crazy figure: https://help.github.com/sitemap.xml

Member

gjtorikian commented Dec 12, 2014

If you saw the site you'd probably weep. That's not such a crazy figure: https://help.github.com/sitemap.xml

@parkr

This comment has been minimized.

Show comment
Hide comment
@parkr

parkr Dec 18, 2014

Member

@gjtorikian Thanks for taking the time to check these. Can you try out d10a941? :)

Member

parkr commented Dec 18, 2014

@gjtorikian Thanks for taking the time to check these. Can you try out d10a941? :)

@gjtorikian

This comment has been minimized.

Show comment
Hide comment
@gjtorikian

gjtorikian Dec 18, 2014

Member

Wow:

$ script/benchmark 
Building took 81.17 seconds (1.35 minutes)

For reference, here's the exact same site--all of it!--using the copy of Jekyll provided by github-pages (2.4.0):

$ script/benchmark 
Building took 232.16 seconds (3.87 minutes)

This is better than fantastic. I can run another stackprof if you like, if you think that'll suss out more perf gains. 😄

Member

gjtorikian commented Dec 18, 2014

Wow:

$ script/benchmark 
Building took 81.17 seconds (1.35 minutes)

For reference, here's the exact same site--all of it!--using the copy of Jekyll provided by github-pages (2.4.0):

$ script/benchmark 
Building took 232.16 seconds (3.87 minutes)

This is better than fantastic. I can run another stackprof if you like, if you think that'll suss out more perf gains. 😄

@parkr

This comment has been minimized.

Show comment
Hide comment
@parkr

parkr Dec 18, 2014

Member

HOLY MOLY! Ok cool. Yeah another stackprof would be splendid.

Member

parkr commented Dec 18, 2014

HOLY MOLY! Ok cool. Yeah another stackprof would be splendid.

@gjtorikian

This comment has been minimized.

Show comment
Hide comment
@gjtorikian

gjtorikian Dec 18, 2014

Member

For full disclosure, the benchmark script is simply:

time = Benchmark.realtime do
  `jekyll build --config=_config.yml,_config.dev.yml`
end

Here's the stackprof:

     TOTAL    (pct)     SAMPLES    (pct)     FRAME
     21072  (39.1%)       21072  (39.1%)     Jekyll::StaticFile#relative_path
     27623  (51.3%)        6551  (12.2%)     block in Jekyll::Site#site_payload
      4594   (8.5%)        4580   (8.5%)     Set#initialize_copy
      2331   (4.3%)        1080   (2.0%)     block in Jekyll::URL#generate_url
      1124   (2.1%)         813   (1.5%)     URI::Escape#escape
      5159   (9.6%)         750   (1.4%)     Liquid::Context#variable
       680   (1.3%)         680   (1.3%)     block (2 levels) in Gem::BasicSpecification#contains_requirable_file?
      5783  (10.7%)         624   (1.2%)     Liquid::Context#resolve
       613   (1.1%)         613   (1.1%)     Jekyll::Utils#slugify
       533   (1.0%)         533   (1.0%)     Liquid::Context#has_interrupt?
       509   (0.9%)         508   (0.9%)     Jekyll::Convertible#converters
       507   (0.9%)         507   (0.9%)     Jekyll::EntryFilter#symlink?
       498   (0.9%)         482   (0.9%)     Set#add
       442   (0.8%)         442   (0.8%)     Jekyll::Tags::IncludeTag#resolved_includes_dir
       550   (1.0%)         346   (0.6%)     block in FileUtils::Entry_#copy_file
      7301  (13.6%)         331   (0.6%)     block in Jekyll::Site#read_directories
       311   (0.6%)         311   (0.6%)     URI::Parser#escape
       297   (0.6%)         297   (0.6%)     Nokogiri::XML::Document#decorate
      1131   (2.1%)         283   (0.5%)     FileUtils#fu_each_src_dest0
       267   (0.5%)         267   (0.5%)     Nokogiri::XML::XPathContext#register_namespaces
       253   (0.5%)         253   (0.5%)     Liquid::Context#[]=
       251   (0.5%)         251   (0.5%)     FileUtils#fu_mkdir
       794   (1.5%)         244   (0.5%)     FileUtils::Entry_#copy_file
     27392  (50.8%)         238   (0.4%)     block in Liquid::Block#render_all
       232   (0.4%)         232   (0.4%)     block in Jekyll::Convertible#output_ext
       220   (0.4%)         220   (0.4%)     block in Jekyll::URL#sanitize_url
       204   (0.4%)         204   (0.4%)     block (2 levels) in FileUtils::Entry_#copy_file
       178   (0.3%)         178   (0.3%)     Jekyll::StaticFile#path
      1201   (2.2%)         177   (0.3%)     block in Jekyll::Site#site_payload
       323   (0.6%)         176   (0.3%)     Jekyll::Document#read

I can pass the file along to you too if it helps.

Member

gjtorikian commented Dec 18, 2014

For full disclosure, the benchmark script is simply:

time = Benchmark.realtime do
  `jekyll build --config=_config.yml,_config.dev.yml`
end

Here's the stackprof:

     TOTAL    (pct)     SAMPLES    (pct)     FRAME
     21072  (39.1%)       21072  (39.1%)     Jekyll::StaticFile#relative_path
     27623  (51.3%)        6551  (12.2%)     block in Jekyll::Site#site_payload
      4594   (8.5%)        4580   (8.5%)     Set#initialize_copy
      2331   (4.3%)        1080   (2.0%)     block in Jekyll::URL#generate_url
      1124   (2.1%)         813   (1.5%)     URI::Escape#escape
      5159   (9.6%)         750   (1.4%)     Liquid::Context#variable
       680   (1.3%)         680   (1.3%)     block (2 levels) in Gem::BasicSpecification#contains_requirable_file?
      5783  (10.7%)         624   (1.2%)     Liquid::Context#resolve
       613   (1.1%)         613   (1.1%)     Jekyll::Utils#slugify
       533   (1.0%)         533   (1.0%)     Liquid::Context#has_interrupt?
       509   (0.9%)         508   (0.9%)     Jekyll::Convertible#converters
       507   (0.9%)         507   (0.9%)     Jekyll::EntryFilter#symlink?
       498   (0.9%)         482   (0.9%)     Set#add
       442   (0.8%)         442   (0.8%)     Jekyll::Tags::IncludeTag#resolved_includes_dir
       550   (1.0%)         346   (0.6%)     block in FileUtils::Entry_#copy_file
      7301  (13.6%)         331   (0.6%)     block in Jekyll::Site#read_directories
       311   (0.6%)         311   (0.6%)     URI::Parser#escape
       297   (0.6%)         297   (0.6%)     Nokogiri::XML::Document#decorate
      1131   (2.1%)         283   (0.5%)     FileUtils#fu_each_src_dest0
       267   (0.5%)         267   (0.5%)     Nokogiri::XML::XPathContext#register_namespaces
       253   (0.5%)         253   (0.5%)     Liquid::Context#[]=
       251   (0.5%)         251   (0.5%)     FileUtils#fu_mkdir
       794   (1.5%)         244   (0.5%)     FileUtils::Entry_#copy_file
     27392  (50.8%)         238   (0.4%)     block in Liquid::Block#render_all
       232   (0.4%)         232   (0.4%)     block in Jekyll::Convertible#output_ext
       220   (0.4%)         220   (0.4%)     block in Jekyll::URL#sanitize_url
       204   (0.4%)         204   (0.4%)     block (2 levels) in FileUtils::Entry_#copy_file
       178   (0.3%)         178   (0.3%)     Jekyll::StaticFile#path
      1201   (2.2%)         177   (0.3%)     block in Jekyll::Site#site_payload
       323   (0.6%)         176   (0.3%)     Jekyll::Document#read

I can pass the file along to you too if it helps.

@parkr

This comment has been minimized.

Show comment
Hide comment
@parkr

parkr Dec 18, 2014

Member

That stackprof looks old – can you re-run with the latest code here please? :)

Also your benchmark could be full-Ruby:

require 'jekyll'
time = Benchmark.realtime do
  Jekyll::Commands::Build.process({ "config" => %w[_config.yml _config.dev.yml] })
end
Member

parkr commented Dec 18, 2014

That stackprof looks old – can you re-run with the latest code here please? :)

Also your benchmark could be full-Ruby:

require 'jekyll'
time = Benchmark.realtime do
  Jekyll::Commands::Build.process({ "config" => %w[_config.yml _config.dev.yml] })
end
@parkr

This comment has been minimized.

Show comment
Hide comment
@parkr

parkr Dec 19, 2014

Member

I LOVE THE PARALLEL GEM IT IS AMAZING

~/jekyll/jekyll#sort-static-files-once$ time be jekyll build -s site
Configuration file: site/_config.yml
            Source: site
       Destination: /Users/parker/jekyll/jekyll/_site
      Generating...
                    done.
 Auto-regeneration: disabled. Use --watch to enable.

real    0m1.414s
user    0m1.986s
sys 0m1.220s
~/jekyll/jekyll#sort-static-files-once$ git co master
Switched to branch 'master'
Your branch is up-to-date with 'origin/master'.
~/jekyll/jekyll#master$ time be jekyll build -s site
Configuration file: site/_config.yml
            Source: site
       Destination: /Users/parker/jekyll/jekyll/_site
      Generating...
                    done.
 Auto-regeneration: disabled. Use --watch to enable.

real    0m6.303s
user    0m6.014s
sys 0m0.285s
~/jekyll/jekyll#master$
Member

parkr commented Dec 19, 2014

I LOVE THE PARALLEL GEM IT IS AMAZING

~/jekyll/jekyll#sort-static-files-once$ time be jekyll build -s site
Configuration file: site/_config.yml
            Source: site
       Destination: /Users/parker/jekyll/jekyll/_site
      Generating...
                    done.
 Auto-regeneration: disabled. Use --watch to enable.

real    0m1.414s
user    0m1.986s
sys 0m1.220s
~/jekyll/jekyll#sort-static-files-once$ git co master
Switched to branch 'master'
Your branch is up-to-date with 'origin/master'.
~/jekyll/jekyll#master$ time be jekyll build -s site
Configuration file: site/_config.yml
            Source: site
       Destination: /Users/parker/jekyll/jekyll/_site
      Generating...
                    done.
 Auto-regeneration: disabled. Use --watch to enable.

real    0m6.303s
user    0m6.014s
sys 0m0.285s
~/jekyll/jekyll#master$
@gjtorikian

This comment has been minimized.

Show comment
Hide comment
@gjtorikian

gjtorikian Dec 19, 2014

Member

Hah. Let me know when you need a rerun. I'll hold off for now.

Member

gjtorikian commented Dec 19, 2014

Hah. Let me know when you need a rerun. I'll hold off for now.

@alfredxing

This comment has been minimized.

Show comment
Hide comment
@alfredxing

alfredxing Dec 19, 2014

Member

😮

Member

alfredxing commented Dec 19, 2014

😮

@parkr

This comment has been minimized.

Show comment
Hide comment
@parkr

parkr Dec 19, 2014

Member

Turns out we have to restructure some stuff in order to use parallel.

Member

parkr commented Dec 19, 2014

Turns out we have to restructure some stuff in order to use parallel.

@parkr parkr merged commit f81601e into master Dec 27, 2014

1 check passed

continuous-integration/travis-ci The Travis CI build passed
Details

@parkr parkr deleted the sort-static-files-once branch Dec 27, 2014

parkr added a commit that referenced this pull request Dec 27, 2014

@gjtorikian

This comment has been minimized.

Show comment
Hide comment
@gjtorikian

gjtorikian Dec 28, 2014

Member

Definitely a gigantic improvement here. Thanks a ton! Running off of master:

The build took 72.82 seconds (1.21 minutes)

Unfortunately it looks like Liquid-3 is handling tags a bit differently. I know this isn't the right issue for that, but I'm not sure where to post this:

Member

gjtorikian commented Dec 28, 2014

Definitely a gigantic improvement here. Thanks a ton! Running off of master:

The build took 72.82 seconds (1.21 minutes)

Unfortunately it looks like Liquid-3 is handling tags a bit differently. I know this isn't the right issue for that, but I'm not sure where to post this:

@parkr

This comment has been minimized.

Show comment
Hide comment
@parkr

parkr Jan 2, 2015

Member

It looks like YAML keys with digits in them are no longer allowed.

This could have to do with the strict parser. If you search for strict in lib/jekyll.rb and remove that line, does it work?

Liquid seems to be parsing vendor/JS files, too

If these files are include_relative'd anywhere, then this'll happen as a result of the rendering process. :/

Member

parkr commented Jan 2, 2015

It looks like YAML keys with digits in them are no longer allowed.

This could have to do with the strict parser. If you search for strict in lib/jekyll.rb and remove that line, does it work?

Liquid seems to be parsing vendor/JS files, too

If these files are include_relative'd anywhere, then this'll happen as a result of the rendering process. :/

@mattr-

This comment has been minimized.

Show comment
Hide comment
@mattr-

mattr- Jan 2, 2015

Member

Liquid seems to be parsing vendor/JS files, too

If these files are include_relative'd anywhere, then this'll happen as a result of the rendering process. :/

That would mean that we'd need to work around this change in behavior before releasing Liquid 3 in a Jekyll 2.x version, correct?

Member

mattr- commented Jan 2, 2015

Liquid seems to be parsing vendor/JS files, too

If these files are include_relative'd anywhere, then this'll happen as a result of the rendering process. :/

That would mean that we'd need to work around this change in behavior before releasing Liquid 3 in a Jekyll 2.x version, correct?

@parkr

This comment has been minimized.

Show comment
Hide comment
@parkr

parkr Jan 2, 2015

Member

That would mean that we'd need to work around this change in behavior before releasing Liquid 3 in a Jekyll 2.x version, correct?

We'd have to have like {% include_relative blah.js norender %} – I think we manually render in the IncludeTag but not sure off the top of my head. I don't know if it affects Jekyll 2 – this is a Jekyll 3 PR.

Member

parkr commented Jan 2, 2015

That would mean that we'd need to work around this change in behavior before releasing Liquid 3 in a Jekyll 2.x version, correct?

We'd have to have like {% include_relative blah.js norender %} – I think we manually render in the IncludeTag but not sure off the top of my head. I don't know if it affects Jekyll 2 – this is a Jekyll 3 PR.

@gjtorikian

This comment has been minimized.

Show comment
Hide comment
@gjtorikian

gjtorikian Jan 8, 2015

Member

This could have to do with the strict parser. If you search for strict in lib/jekyll.rb and remove that line, does it work?

Yeeeeeep, that was it; turning strict off made it happy. Whatever, though, I'm fine with it not accepting numbers.

We'd have to have like {% include_relative blah.js norender %}

shrug janl/mustache.js#418

In the meantime, it's just one character.

Member

gjtorikian commented Jan 8, 2015

This could have to do with the strict parser. If you search for strict in lib/jekyll.rb and remove that line, does it work?

Yeeeeeep, that was it; turning strict off made it happy. Whatever, though, I'm fine with it not accepting numbers.

We'd have to have like {% include_relative blah.js norender %}

shrug janl/mustache.js#418

In the meantime, it's just one character.

@wjdp wjdp referenced this pull request in newtheatre/history-project Oct 27, 2015

Merged

Upgrade to Jekyll3 #118

@jekyll jekyll locked and limited conversation to collaborators Feb 27, 2017

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.