Two massive performance improvements for large sites #6730
…d of all the documents When you have 10,000 posts, this saves a _ton_ of time.
Instead, assign the current document in the SiteDrop and generate the related_posts if requested by Liquid.
# recent posts.
# We should remove this in 4.0 and switch to `{{ post.related_posts }}`.
def related_posts
  return nil unless @current_document.is_a?(Jekyll::Document)
Can we also return early if the current document is not a Post?
Like for a site with two posts and another collection with 1000 documents?
> can we also return early if the current document is not a Post..?

I don't think so. A site could be using site.related_posts on a Document to display the most recent blog posts.
good point.. 👍
FYI: no perceptible difference in build time on real websites. Tested with Jekyll's own website (200+ documents) and https://github.com/mmistakes/front-matter-defaults (800+ documents).
I second Frank's opinion. I too couldn't see any huge difference in local tests.
Not too much difference in build time on a site with a couple hundred posts, but I can see the value here.
Even without a measurable performance improvement, most of these changes make sense to me.
As long as we are confident that this won't have unintended side effects, I am 👍
While the changes look good to me.., something doesn't feel right.. (irrespective of discernible perf gains)
Definitely requires some more testing.. or new tests to our suite..
Blocking this for the time-being..
@ashmaroli Generate a site with 10,000 posts and you'll notice the difference here. Why are you interested in blocking this PR? What doesn't feel right? I used this script to generate 10,000 posts:
I called Before: https://rbspy.parkermakes.tools/rbspy-2018-01-30-o4qFcKDwSo.svg
Tested locally on Windows.. no issues..
@parkr Since https://github.com/rbspy/rbspy doesn't run on Windows yet, I needed to test things out manually.. and that is a huge time-consumer because Anyways, finally walked through each change here and grokked the implications..
def most_recent_posts
- @most_recent_posts ||= (site.posts.docs.reverse - [post]).first(10)
+ @most_recent_posts ||= (site.posts.docs.last(11).reverse - [post]).first(10)
end
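To see why the diff above is behavior-preserving, here is a minimal sketch that compares the two expressions on plain arrays. The method names `most_recent_slow` and `most_recent_fast` are hypothetical, and integers stand in for posts; it assumes the list is sorted oldest to newest and contains no duplicates.

```ruby
# Hypothetical stand-ins for the two versions of most_recent_posts.
# `docs` plays the role of site.posts.docs (oldest first, no duplicates),
# `current` plays the role of the post being rendered.

# Old approach: reverse the whole list, then drop the current post.
def most_recent_slow(docs, current)
  (docs.reverse - [current]).first(10)
end

# New approach: only the last 11 entries can ever matter, because at most
# one of them (the current post) is removed before taking the first 10.
def most_recent_fast(docs, current)
  (docs.last(11).reverse - [current]).first(10)
end

docs = (1..10_000).to_a
sample = [docs.first, docs[5_000], docs.last]
same = sample.all? { |c| most_recent_slow(docs, c) == most_recent_fast(docs, c) }
puts "results match: #{same}"
```

The saving comes from the fact that the old version reverses and filters the full array once per post, so the total work grows quadratically with the number of posts, while the new version touches at most 11 elements per post.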
I set up a fresh new test workspace and generated 5500 posts using Parker's script above..
🔥 🚒 @jekyllbot: merge +minor
This is a HUGE performance improvement for a large site! We should release this soon.
If we can, we should push this down to a stable 3.7 release or just release 3.8 now with the massive performance gains.
1. Don't generate site.related_posts unless we request it.
2. Generate site.related_posts without LSI much more efficiently.

I've been using @jvns's new rbspy program. I generated a random site of 10,000 posts (no layouts) to test this. When I got back the flamegraph, it was obvious that most_recent_posts was the culprit for a significant amount of time. After I applied this patch, regeneration went from 150s to 50s. We saved about 100s (1m40s) of time just by not generating related posts unless we needed it!
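
The idea of computing related posts only on request can be sketched as follows. This is an illustration under assumptions, not Jekyll's actual SiteDrop: the class `LazySiteSketch`, its `computations` counter, and the integer "posts" are all hypothetical.

```ruby
# Hypothetical drop-like object, NOT Jekyll's real SiteDrop.
# Rendering a document only records which document is current;
# the related-posts list is built when something asks for it.
class LazySiteSketch
  attr_accessor :current_document
  attr_reader :computations

  def initialize(posts)
    @posts = posts        # assumed sorted oldest to newest, no duplicates
    @computations = 0     # counts how often the list was actually built
  end

  # Called only if a template references related posts.
  def related_posts
    return nil if current_document.nil?

    @computations += 1
    (@posts.last(11).reverse - [current_document]).first(10)
  end
end

site = LazySiteSketch.new((1..10_000).to_a)

# "Render" every post without a template asking for related posts:
# assigning the current document is cheap and builds nothing.
(1..10_000).each { |post| site.current_document = post }
puts "lists built during render: #{site.computations}"
```

With eager generation, every one of the 10,000 renders would pay for a list that the template never uses; here the cost is paid only by pages that call `related_posts`.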