Incremental regeneration #380

Closed
zanshin opened this Issue Aug 7, 2011 · 51 comments

@zanshin

zanshin commented Aug 7, 2011

Can some form of incremental regeneration (à la graysky@39ae8c7) be added to the core of Jekyll? This would greatly enhance the tool for those sites (like mine) that have nearly 1800 entries and counting, and which take upwards of 5 minutes to generate.

@belkadan

belkadan commented Aug 7, 2011

I have a different implementation at https://github.com/belkadan/jekyll, too. It's helped plenty.

@sindresorhus

sindresorhus commented Jan 28, 2012

👍 This would be immensely useful.

@stereobooster

Contributor

stereobooster commented Feb 6, 2012

connected #118

@igrigorik

igrigorik commented Jun 3, 2012

Any reason why this can't be merged? Would be rather useful.. stuck with the same wait loop here.

@matthiasbeyer

matthiasbeyer commented Dec 19, 2012

I would love it to be merged, too! My machine is not as high-end as yours, and with 350+ posts it takes up to 4 minutes right now.

Don't know if it is a plugin or jekyll itself, but every user would benefit if there is a performance-improvement!

@parkr

Member

parkr commented Dec 19, 2012

We're working on a fix for this. Soon :)

@parkr

Member

parkr commented Mar 17, 2013

As @tombell would point out, we need to regenerate the entire site every time anything changes – we just can't know how pages depend upon each other.

I'll wait to close this until I hear from @mojombo, though.

@parkr

Member

parkr commented Mar 19, 2013

Talked to @mojombo – we're going to work on a solution for this that regenerates changed files and all the files that "depend" on that file. Not until after 1.0 though!

@jwebcat

jwebcat commented Mar 26, 2013

@parkr any news on this? I would love this feature as well.

@mattr-

Member

mattr- commented Mar 26, 2013

This feature won't land until after 1.0.0

@jwebcat

jwebcat commented Mar 26, 2013

@mattr- I am new to ruby. If I patch jekyll.rb with the changes from a diff of either repo above, will it break my install of Jekyll 1.0.0.beta2? I apologize if this is a dumb question. Thank you for your response 👍

@mattr-

Member

mattr- commented Mar 26, 2013

There's really no way to know ahead of time. The best way to know if it
will break is to apply the patch and run the tests.

@caiogondim

caiogondim commented Apr 8, 2013

That would really help a lot =)

@AlexanderEkdahl

Contributor

AlexanderEkdahl commented Apr 8, 2013

Could you create a new topic branch for this feature? That would open it up for discussion so that others might contribute/help. It would indeed be an awesome feature to have!

@parkr

Member

parkr commented Apr 8, 2013

We're focused on shipping 1.0 at this point so not quite yet. Soon!

@parkr

Member

parkr commented Apr 8, 2013

We don't even know if this is possible in practical terms. We have a plan, but we need to see how that plan maps to reality before we can really think about doing this.

@Jack000

Jack000 commented Apr 24, 2013

+1. I wouldn't mind managing dependency tree myself, just need a way to generate a single post.

@maul-esel

Contributor

maul-esel commented Apr 25, 2013

@Jack000: 👍 that's what I thought - there's so many possible dependencies, they can't be all managed by jekyll. Instead, the user should take care of them.

@csaunders

csaunders commented Oct 18, 2013

So this looks like a good place to bring up the conversation I was having with @mattr- about caching/performance improvements.

I'm not sure if liquid provides this functionality yet, but would it be useful to have something in liquid that informs you of its dependencies? We could build up a tree of dependencies as well as a lookup table for each of those files, along with some kind of file signature that would allow us to quickly check if a file needs to be regenerated or not.

@Jack000

Jack000 commented Oct 18, 2013

I'm not sure if liquid is the right place to put the dependency tree. This seems like something that should go in a plugin.

As I understand it the speed problem is caused by writing rather than reading. I can't speak to all use cases but in my case what I need is something in Jekyll that allows writing of a single post (like with the limit_posts flag), but reading of all posts in the plugin system. This way I'd be able to add a single new post, while being able to populate related posts etc in the sidebar using a plugin.

for me, progressive generation is the biggest issue with jekyll. I operate a medium-sized site and it's taking 15 minutes to generate, which is becoming rather unsustainable.

@mattr-

Member

mattr- commented Oct 18, 2013

Do you have numbers? # of posts, pages, etc.

@Jack000

Jack000 commented Oct 18, 2013

our blog has 487 posts, and takes 6:00 to generate

I abused jekyll a bit for the main site:
923 posts and maybe 100 or so static pages. It takes 18:04 to generate, but after testing it seems jekyll is only responsible for 7:00 of that. I'm running version v0.11.2

There's just a bit of culture shock coming from the wordpress/drupal world where you expect things to be instant. Since most of the time we're just adding one post it seems a bit wasteful to generate the whole site every time.

@csaunders

csaunders commented Oct 18, 2013

The reason for the need for a dependency tree is because jekyll actually doesn't know anything about what files are required to generate a page, at least when it comes to using liquid tags such as {% include %}

Adding that feature, so that you can query a liquid template for its dependencies and check for changes at the Jekyll level, is what I'm looking into.

There may be some aspects of jekyll that won't require changes though; such as the headers, which are all within the jekyll domain.
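
To make the include problem concrete, here is a minimal sketch of listing the files a template pulls in. It does not use Liquid's internals (the patch discussed in this thread is not shown here); it only scans the template source with a regular expression. `included_files` and `INCLUDE_TAG` are hypothetical names, and the sketch assumes include names are written literally rather than passed as variables.

```ruby
# Hypothetical helper: list the files a template pulls in via {% include %}.
# Assumes literal file names; an include whose name comes from a Liquid
# variable cannot be resolved by scanning the source like this.
INCLUDE_TAG = /\{%-?\s*include\s+([^\s%]+)/

def included_files(source)
  # scan returns one single-element array per match; flatten and de-duplicate.
  source.scan(INCLUDE_TAG).flatten.uniq
end

template = <<~LIQUID
  {% include header.html %}
  <main>{{ content }}</main>
  {% include footer.html title="x" %}
  {% include header.html %}
LIQUID

p included_files(template)
```

A source scan like this misses dependencies that come from plugins or from data such as `site.posts`, which is part of why the thread treats the dependency tree as hard.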

@Jack000

Jack000 commented Oct 18, 2013

I was under the impression that by the time you get to the {% include %} part, that the file is already being written. If there's a way in liquid to tell jekyll "nothing's changed, no need to write this file", that'll work too.

I have no idea where the actual bottleneck is, so someone more familiar with the code might have better feedback.

I run jekyll as a post-receive hook, and I get people making 20 little commits and wondering why their post hasn't shown up an hour later..

@mattr-

Member

mattr- commented Oct 19, 2013

I've thought about the dependency tree, but I'm hoping to avoid it because getting to it is a bit nasty.

@csaunders

csaunders commented Oct 21, 2013

I'm working on a patch to liquid which would allow you to query the loaded
liquid template about its dependencies.

I was looking around and was thinking that would help.

@csaunders

csaunders commented Oct 24, 2013

If you ever want to chat about it, I'm hanging out in the #jekyll channel on freenode.

@csaunders

csaunders commented Oct 24, 2013

Internally, liquid templates know about all their child tags/nodes. When I have spare time I've been working on a patch so that liquid templates can tell you which other templates they include.

With that information we can probably build a global lookup table that maps template.liquid => [dependencies.liquid] and build the dependency tree from there.

This is what I have in mind in terms of that dependency tree:

lookup = {
  # This should be a reverse index?
  'template.liquid' => ['header.liquid', 'body.liquid', 'footer.liquid']
}
hashes = {
  'template.liquid' => 'ababababba',
  'header.liquid' => 'dcdcdcdcdcdc',
  'body.liquid' => 'efefefefef',
  'footer.liquid' => '0010010'
}

We'd keep a manifest of the last successful compilation containing hashes of all the files; once we find a change, we just need to find the files that depend on the changed file.
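
A rough sketch of how the manifest and lookup table could fit together, assuming SHA-256 content digests as the file signature. The method names (`reverse_index`, `changed_files`, `files_to_regenerate`) and the extra `about.liquid` entry are made up for illustration, not anything Jekyll or Liquid provides.

```ruby
require 'digest'
require 'set'

# Forward table as in the comment above: template => files it includes.
lookup = {
  'template.liquid' => ['header.liquid', 'body.liquid', 'footer.liquid'],
  'about.liquid'    => ['header.liquid']
}

# Reverse index: included file => templates that depend on it.
def reverse_index(lookup)
  index = Hash.new { |hash, key| hash[key] = [] }
  lookup.each do |template, deps|
    deps.each { |dep| index[dep] << template }
  end
  index
end

# Files whose digest differs from the last build's manifest (or that are new).
def changed_files(manifest, current)
  current.select { |file, digest| manifest[file] != digest }.keys
end

# Changed files plus everything that depends on them, directly or transitively.
def files_to_regenerate(changed, index)
  dirty = Set.new
  queue = changed.dup
  until queue.empty?
    file = queue.shift
    next unless dirty.add?(file) # add? returns nil when already visited
    queue.concat(index.fetch(file, []))
  end
  dirty.to_a.sort
end

sha = ->(text) { Digest::SHA256.hexdigest(text) }
manifest = { 'template.liquid' => sha.call('t'), 'about.liquid' => sha.call('a'),
             'header.liquid' => sha.call('h1'), 'body.liquid' => sha.call('b'),
             'footer.liquid' => sha.call('f') }
current = manifest.merge('header.liquid' => sha.call('h2')) # header was edited

changed = changed_files(manifest, current)
p files_to_regenerate(changed, reverse_index(lookup))
```

Storing the reverse index answers the "should this be a reverse index?" question in the snippet above: the lookup you need at rebuild time goes from a changed file to its dependents, not the other way around.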

@parkr

Member

parkr commented Oct 24, 2013

@mattr- Can you push up your branch? This will help our conversation here.

@ghost ghost assigned mattr- Nov 11, 2013

@mattr- mattr- referenced this issue Nov 27, 2013

Closed

Incremental regeneration #1761

1 of 8 tasks complete
@vjeux

vjeux commented Dec 9, 2013

Ping.

I'm blogging for the React website (http://facebook.github.io/react/), which is written with Jekyll, and it's getting more and more of a pain. When we first launched the website it took a few seconds to update after a change, but now that we have much more content it takes a minute to update every time I change a single line in a blog post.

I really don't care about regenerating the entire freaking website, I just want to regenerate the page I'm looking at. Would it be possible to mark the whole website as dirty and then, whenever I open a page on the website, just rebuild that file?

Essentially, instead of phrasing the problem as "this file changed, what other files did it impact", phrase it as "I want to see this page, do the entire build process but skip everything that's not this file". I feel like this would be a lot more manageable for the use case of changing one file iteratively.

I don't really know jekyll enough to see if it's possible or not but I figured I would say it out loud and see if that's not too crazy
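
Something like the following is what I have in mind, as a toy sketch only. `LazySite` is a made-up class, the block passed to it stands in for the real build step, and it glosses over the fact that a real build may need to read the whole site before it can render any single page.

```ruby
# Hypothetical lazy site: nothing is rendered until a page is requested.
class LazySite
  attr_reader :render_count

  def initialize(sources, &renderer)
    @sources = sources      # path => source text
    @renderer = renderer    # stands in for the real build step
    @cache = {}             # path => rendered output
    @render_count = 0
  end

  # Any change marks the whole site dirty, so no dependency tracking is needed.
  def touch(path, new_source)
    @sources[path] = new_source
    @cache.clear
  end

  # Render only the requested page, reusing cached output while it is clean.
  def fetch(path)
    @cache.fetch(path) do
      @render_count += 1
      @cache[path] = @renderer.call(@sources.fetch(path))
    end
  end
end

site = LazySite.new({ 'a.md' => 'hello', 'b.md' => 'world' }) { |src| src.upcase }
site.fetch('a.md')           # rendered on first request
site.fetch('a.md')           # served from cache
site.touch('b.md', 'again')  # whole site marked dirty
site.fetch('a.md')           # rendered again, but only this page
puts site.render_count
```

The trade-off is that every edit throws away all cached output, but only the pages actually opened pay the cost of rendering again.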

@mattr-

Member

mattr- commented Dec 9, 2013

I think if you look at the cucumber features in the incremental-regeneration branch, we're covering your use case. If not, please let us know.

@migurski

migurski commented Dec 10, 2013

+1 to all this; incremental regeneration would be immensely helpful for a ~1000 page site I’m considering Jekyll for.

@tuananh

tuananh commented Dec 31, 2013

+1 for this. It's the only thing that holds me back from using jekyll. personal blog/website is fine but 1000+ posts sites are not.

@phoet

phoet commented Jan 15, 2014

👍 for a jekyll serve --watch --changed-only

that would come in really handy when testing out small changes. i suppose that is exactly what most people do on large sites.

it could also be nice to trigger a full regeneration through a signal, like it is done in watchr rspec

@skadavan

skadavan commented Feb 24, 2014

+1 for this.

@zanshin

zanshin commented Feb 24, 2014

On a late 2009 MacBook Pro with 8 GB of RAM and a 2.66 GHz Core 2 Duo processor, here are the stats for my Jekyll blog:

mark at blackperl in ~/Projects/websites/zanshin on master
± l _posts | wc -l
2141

mark at blackperl in ~/Projects/websites/zanshin on master
± time rake build
WARNING: Nokogiri was built against LibXML version 2.9.0, but has dynamically loaded 2.8.0
Configuration file: /Users/mark/Projects/websites/zanshin/_config.yml
Source: /Users/mark/Projects/websites/zanshin
Destination: /Users/mark/Projects/websites/zanshin/_site
Generating... done.
noglob rake build 70.71s user 3.71s system 69% cpu 1:47.72 total

2,141 posts generated in 1 minute 47 seconds.

I like Octopress’ isolate feature, which allows you to work on and generate a single posting. Beyond that I don’t see how incremental generation would work. Each new posting changes the composition of each page of my site: the newest posting pushes the previous last posting on the first page to the top of page 2, and so on.

— Mark
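
The pagination point can be illustrated with a small sketch. It assumes a newest-first index with a fixed number of posts per page; `paginate` and `changed_pages` are illustrative names, not Jekyll's paginator.

```ruby
# Paginate newest-first, as a blog index typically does.
def paginate(posts, per_page)
  posts.reverse.each_slice(per_page).to_a
end

# Count page positions whose contents differ between two paginations.
def changed_pages(before, after)
  (0...[before.size, after.size].max).count { |i| before[i] != after[i] }
end

posts  = (1..10).map { |n| "post-#{n}" }
before = paginate(posts, 3)
after  = paginate(posts + ['post-11'], 3)

puts "#{changed_pages(before, after)} of #{after.size} pages changed"
```

With these numbers every index page changes when a single post is added, so any dependency tracking would have to treat paginated pages as depending on the whole post list.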

@jafskot

jafskot commented Feb 24, 2014

Relevant/related:

#2087

User-space Functional Logic Renderings with Globalization Capabilities

Overview of Idea, Concept

Provide core Jekyll functionality allowing the end-user designer to create, have access to, and take advantage of initialization-style logic (Liquid, Variables, Functions) whose outcome would then be globally available throughout the build and render phases without redundant reiteration.

@parkr parkr modified the milestones: 3.0, 2.0 Mar 16, 2014

@AvverbioPronome

AvverbioPronome commented May 18, 2014

as the uploading time is a bigger (to me) problem than the generating time, there should be a way to avoid syncing the entire site on trivial changes, I think. jekyll should, anytime it builds a page, make a diff between the generated one and the already-in-_site one, and only replace the _site version if there are differences. (this solves a different problem. I know. :( )
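
A minimal sketch of that compare-before-write idea, assuming a byte-for-byte comparison is good enough. `write_if_changed` is a hypothetical helper, not something jekyll provides.

```ruby
require 'fileutils'
require 'tmpdir'

# Hypothetical helper: write `content` to `path` only when it differs from
# what is already there, so the file's mtime survives unchanged rebuilds.
# Returns true when the file was (re)written.
def write_if_changed(path, content)
  # Compare as raw bytes so string encodings do not affect the result.
  return false if File.exist?(path) && File.binread(path) == content.b

  FileUtils.mkdir_p(File.dirname(path))
  File.binwrite(path, content)
  true
end

Dir.mktmpdir do |dir|
  page = File.join(dir, '_site', 'index.html')
  p write_if_changed(page, '<h1>Hello</h1>')   # first build
  p write_if_changed(page, '<h1>Hello</h1>')   # identical rebuild
  p write_if_changed(page, '<h1>Hello!</h1>')  # content edited
end
```

Leaving unchanged files untouched keeps their modification times stable, which is what sync tools that compare timestamps rely on.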

@jaybe-jekyll

Member

jaybe-jekyll commented May 18, 2014

@9peppe if you are referring to the synchronization/upload of generated site contents to a remote location such as a web host, the rsync command along with its switches/options such as --checksum will assist with incremental-type synchronization, regardless of whether you are using jekyll or merely synchronizing arbitrary data to/from machines.

Example

$ rsync --rsh='ssh -p 22222' -vv --checksum --recursive --update --keep-dirlinks --stats --human-readable --progress --itemize-changes _site/  user1@host.example.com:www/example.com/www/
@AvverbioPronome

AvverbioPronome commented May 18, 2014

@jaybe-jekyll yes, that's what I'm referring to. I'd like to use rsync, but I don't have sftp access, only ftp[s], so I am constrained to use stuff like ftpsync, which only uses file creation/modification times.

@parkr

Member

parkr commented Nov 5, 2014

Tracking this in #3060, will try to get some version of this into 3.0.

@parkr parkr closed this Nov 5, 2014

@parkr

Member

parkr commented Nov 5, 2014

Ehhh, just kidding. Sorry guys. This tracks incremental regeneration between Jekyll processes, #3060 tracks incremental regeneration within one jekyll (build|serve) --watch command.

@parkr parkr reopened this Nov 5, 2014

@Naatan

Naatan commented Nov 5, 2014

That's a weird joke.

@alfredxing alfredxing referenced this issue Nov 17, 2014

Merged

Incremental regeneration #3116

4 of 4 tasks complete
@parkr

Member

parkr commented Jan 10, 2015

#3116 solves this, at least as a first iteration. If you have time, I'd appreciate it if y'all could try building your sites with the current master and see how it works out for you.

@parkr

Member

parkr commented Apr 13, 2015

Going to close this.

@parkr parkr closed this Apr 13, 2015

@jekyll jekyll locked and limited conversation to collaborators Feb 27, 2017
