Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Sort static files just once. #3204

Merged
merged 3 commits into from Dec 27, 2014
Merged

Sort static files just once. #3204

merged 3 commits into from Dec 27, 2014

Conversation

parkr
Copy link
Member

@parkr parkr commented Dec 11, 2014

Ref: #2075.

@parkr
Copy link
Member Author

parkr commented Dec 11, 2014

@gjtorikian can you try this?

@parkr parkr added the fix label Dec 11, 2014
@gjtorikian
Copy link
Member

Shaved about forty seconds! I'll take it!

@parkr
Copy link
Member Author

parkr commented Dec 11, 2014

😮 Wooooow. Can you run stackprof again with this code and see what else is causing the bottleneck? 40 seconds of a 5-min build is a big win, but it shouldn't take more than 1 min to compile the entire site.

The sorting should probably also take place after generate... hm.

@@ -15,6 +17,7 @@ def initialize(site, base, dir, name, collection = nil)
@dir = dir
@name = name
@collection = collection
@relative_path = File.join(*[@dir, @name].compact)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would this ever have a nil value? Just curious about the compact.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if both @dir and @name are nil, then it'd be File.join() with nothing. It should be theoretically impossible for @name to be empty, but @dir can be.

@gjtorikian
Copy link
Member

To be clear, here's what's going on in my build:

  • It's five minutes total, but we have several collections and several static files, not to mention minifying JS
  • One collection is a clear bottleneck, taking two and a half minutes to complete
  • Zoning in on just this one collection, it takes a minute-and-a-half to build (the other minute is coming from somewhere else, not dealing with it now)

That's where I discovered this sort issue.

Here are the top three rows from stackprof:

      3761  (16.0%)        3761  (16.0%)     Jekyll::StaticFile#relative_path
      2461  (10.5%)        2453  (10.4%)     Set#initialize_copy
      2306   (9.8%)        1075   (4.6%)     block in Jekyll::URL#generate_url

Clear drop-off after Set#initialize_copy.

Interestingly, that relative_path now is coming from the sitemap plugin:

def html_files
  @site.static_files.select { |file| File.extname(file.relative_path) == ".html" }
end

If I remove this plugin, I gain just a few more seconds.

The other one seems to come from site.rb:

def documents
  collections.reduce(Set.new) do |docs, (_, collection)|
    docs + collection.docs + collection.files
end.to_a

@parkr
Copy link
Member Author

parkr commented Dec 11, 2014

We could cache the extname call for sure... As for the #documents call, how often is it being called itself?

@gjtorikian
Copy link
Member

2501    98.85%  block in Jekyll::Site#documents
16  0.63%   block in Sass::Selector::SimpleSequence#do_extend
13  0.51%   Sass::Selector::SimpleSequence#with_more_sources

I think that is what you wanted? 😟 I'm not sure! We can also put this PR on pause and talk IRL.

@parkr
Copy link
Member Author

parkr commented Dec 11, 2014

2501 98.85% block in Jekyll::Site#documents

WHOA. Site#documents is called 2500 times?! geepers creepers. Let's talk IRL when you have time.

@gjtorikian
Copy link
Member

If you saw the site you'd probably weep. That's not such a crazy figure: https://help.github.com/sitemap.xml

@parkr
Copy link
Member Author

parkr commented Dec 18, 2014

@gjtorikian Thanks for taking the time to check these. Can you try out d10a941? :)

@gjtorikian
Copy link
Member

Wow:

$ script/benchmark 
Building took 81.17 seconds (1.35 minutes)

For reference, here's the exact same site--all of it!--using the copy of Jekyll provided by github-pages (2.4.0):

$ script/benchmark 
Building took 232.16 seconds (3.87 minutes)

This is better than fantastic. I can run another stackprof if you like, if you think that'll suss out more perf gains. 😄

@parkr
Copy link
Member Author

parkr commented Dec 18, 2014

HOLY MOLY! Ok cool. Yeah another stackprof would be splendid.

@gjtorikian
Copy link
Member

For full disclosure, the benchmark script is simply:

time = Benchmark.realtime do
  `jekyll build --config=_config.yml,_config.dev.yml`
end

Here's the stackprof:

     TOTAL    (pct)     SAMPLES    (pct)     FRAME
     21072  (39.1%)       21072  (39.1%)     Jekyll::StaticFile#relative_path
     27623  (51.3%)        6551  (12.2%)     block in Jekyll::Site#site_payload
      4594   (8.5%)        4580   (8.5%)     Set#initialize_copy
      2331   (4.3%)        1080   (2.0%)     block in Jekyll::URL#generate_url
      1124   (2.1%)         813   (1.5%)     URI::Escape#escape
      5159   (9.6%)         750   (1.4%)     Liquid::Context#variable
       680   (1.3%)         680   (1.3%)     block (2 levels) in Gem::BasicSpecification#contains_requirable_file?
      5783  (10.7%)         624   (1.2%)     Liquid::Context#resolve
       613   (1.1%)         613   (1.1%)     Jekyll::Utils#slugify
       533   (1.0%)         533   (1.0%)     Liquid::Context#has_interrupt?
       509   (0.9%)         508   (0.9%)     Jekyll::Convertible#converters
       507   (0.9%)         507   (0.9%)     Jekyll::EntryFilter#symlink?
       498   (0.9%)         482   (0.9%)     Set#add
       442   (0.8%)         442   (0.8%)     Jekyll::Tags::IncludeTag#resolved_includes_dir
       550   (1.0%)         346   (0.6%)     block in FileUtils::Entry_#copy_file
      7301  (13.6%)         331   (0.6%)     block in Jekyll::Site#read_directories
       311   (0.6%)         311   (0.6%)     URI::Parser#escape
       297   (0.6%)         297   (0.6%)     Nokogiri::XML::Document#decorate
      1131   (2.1%)         283   (0.5%)     FileUtils#fu_each_src_dest0
       267   (0.5%)         267   (0.5%)     Nokogiri::XML::XPathContext#register_namespaces
       253   (0.5%)         253   (0.5%)     Liquid::Context#[]=
       251   (0.5%)         251   (0.5%)     FileUtils#fu_mkdir
       794   (1.5%)         244   (0.5%)     FileUtils::Entry_#copy_file
     27392  (50.8%)         238   (0.4%)     block in Liquid::Block#render_all
       232   (0.4%)         232   (0.4%)     block in Jekyll::Convertible#output_ext
       220   (0.4%)         220   (0.4%)     block in Jekyll::URL#sanitize_url
       204   (0.4%)         204   (0.4%)     block (2 levels) in FileUtils::Entry_#copy_file
       178   (0.3%)         178   (0.3%)     Jekyll::StaticFile#path
      1201   (2.2%)         177   (0.3%)     block in Jekyll::Site#site_payload
       323   (0.6%)         176   (0.3%)     Jekyll::Document#read

I can pass the file along to you too if it helps.

@parkr
Copy link
Member Author

parkr commented Dec 18, 2014

That stackprof looks old – can you re-run with the latest code here please? :)

Also your benchmark could be full-Ruby:

require 'jekyll'
time = Benchmark.realtime do
  Jekyll::Commands::Build.process({ "config" => %w[_config.yml _config.dev.yml] })
end

@parkr
Copy link
Member Author

parkr commented Dec 19, 2014

I LOVE THE PARALLEL GEM IT IS AMAZING

~/jekyll/jekyll#sort-static-files-once$ time be jekyll build -s site
Configuration file: site/_config.yml
            Source: site
       Destination: /Users/parker/jekyll/jekyll/_site
      Generating...
                    done.
 Auto-regeneration: disabled. Use --watch to enable.

real    0m1.414s
user    0m1.986s
sys 0m1.220s
~/jekyll/jekyll#sort-static-files-once$ git co master
Switched to branch 'master'
Your branch is up-to-date with 'origin/master'.
~/jekyll/jekyll#master$ time be jekyll build -s site
Configuration file: site/_config.yml
            Source: site
       Destination: /Users/parker/jekyll/jekyll/_site
      Generating...
                    done.
 Auto-regeneration: disabled. Use --watch to enable.

real    0m6.303s
user    0m6.014s
sys 0m0.285s
~/jekyll/jekyll#master$

@gjtorikian
Copy link
Member

Hah. Let me know when you need a rerun. I'll hold off for now.

@alfredxing
Copy link
Member

✨😮✨

@parkr
Copy link
Member Author

parkr commented Dec 19, 2014

Turns out we have to restructure some stuff in order to use parallel.

@parkr parkr merged commit f81601e into master Dec 27, 2014
@parkr parkr deleted the sort-static-files-once branch December 27, 2014 04:05
parkr added a commit that referenced this pull request Dec 27, 2014
@gjtorikian
Copy link
Member

Definitely a gigantic improvement here. Thanks a ton! Running off of master:

The build took 72.82 seconds (1.21 minutes)

Unfortunately it looks like Liquid-3 is handling tags a bit differently. I know this isn't the right issue for that, but I'm not sure where to post this:

@parkr
Copy link
Member Author

parkr commented Jan 2, 2015

It looks like YAML keys with digits in them are no longer allowed.

This could have to do with the strict parser. If you search for strict in lib/jekyll.rb and remove that line, does it work?

Liquid seems to be parsing vendor/JS files, too

If these files are include_relative'd anywhere, then this'll happen as a result of the rendering process. :/

@mattr-
Copy link
Member

mattr- commented Jan 2, 2015

Liquid seems to be parsing vendor/JS files, too

If these files are include_relative'd anywhere, then this'll happen as a result of the rendering process. :/

That would mean that we'd need to work around this change in behavior before releasing Liquid 3 in a Jekyll 2.x version, correct?

@parkr
Copy link
Member Author

parkr commented Jan 2, 2015

That would mean that we'd need to work around this change in behavior before releasing Liquid 3 in a Jekyll 2.x version, correct?

We'd have to have like {% include_relative blah.js norender %} – I think we manually render in the IncludeTag but not sure off the top of my head. I don't know if it affects Jekyll 2 – this is a Jekyll 3 PR.

@gjtorikian
Copy link
Member

This could have to do with the strict parser. If you search for strict in lib/jekyll.rb and remove that line, does it work?

Yeeeeeep, that was it; turning strict off made it happy. Whatever, though, I'm fine with it not accepting numbers.

We'd have to have like {% include_relative blah.js norender %}

shrug janl/mustache.js#418

In the meantime, it's just one character.

@jekyll jekyll locked and limited conversation to collaborators Feb 27, 2017
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants