Skip to content

HTTPS clone URL

Subversion checkout URL

You can clone with HTTPS or Subversion.

Download ZIP

Loading…

Self-updating build queue #623

Closed
wants to merge 2 commits into from

5 participants

@t6d
t6d commented

I changed the way Middleman keeps track of the resources it has yet to built. Instead of initially creating the list of files to built from the sitemap, I implemented a queue that updates itself during the build process. This allows resources to inject other resources into the sitemap dynamically.

The idea originated from a helper I wrote that dynamically generates thumbnail images using RMagick and adds these images to the sitemap. This works fine in development. The build process, however, ignores the generated thumbnails since they have been added to the sitemap later on when the list of resources to render has already been determined.

This pull request does not change any of the existing build behavior. It merely adds the possibility of having resources that generate other resources on-demand.

t6d added some commits
@t6d t6d Adds a build queue that keeps track of the resources that have yet to…
… be built including those resources that have been added dynamically during runtime
b2d75a5
@t6d t6d Adds a test to ensure that a page that was added to the sitemap after…
… the build process had already started gets rendered eventually
308c00a
@tdreyno
Owner

Good idea, this may help with Compass' sprites too.

Won't be able to add such a change to the stable release (3.0.x), but we'll pull this in for 3.1.x

@bhollis
Owner

I'm a little confused - could you share your plugin? If your plugin adds resources to the sitemap via a manupulate_resource_list plugin, you shouldn't need this.

@bhollis
Owner

@t6d do you have the source of your plugin up anywhere?

@t6d

@bhollis Sorry, the source is not available anywhere. I doubt that manipulate_resource_list would make a difference. The sitemap is not the problem here. Updating the sitemap works just fine. This even works during compilation. The problem is that even though the sitemap gets updated it doesn't make a difference. This is due to the fact that the list of resources to build is generated in advance and not updated during runtime.

I'll again try to explain what my code does: My code injects thumbnails to the sitemap on demand. Meaning whenever there is a call to thumbnail_image('/path/to/image') in one of my pages, a new resource is created with the appropriate image data and injected into the sitemap. This works just fine and as expected in development. The problem is the following code snippet:

# taken from middleman-core/lib/middleman-core/cli/build.rb
resources = @app.sitemap.resources.sort_by do |r|
  sort_order.index(r.ext) || 100
end

resources.each do |resource|
  # building the resource (code omitted)
end

Here, you generate the list of resources and then build each of these resources. If one of these resources now generates another resource, e.g. a thumbnail image, then the sitemap gets updated, but the list of resources does obviously not get updated because it is only generated once in advance.

@bhollis
Owner

Ah, now I understand. That makes perfect sense. I'd support merging this, then.

@tdreyno
Owner

So if the sitemap doesn't rebuild as these new items are added, that means they can't be effected by other plugins/code, which seems weird. Adding an html page wouldn't run directory_indexes on it.

Seems like we actually need some kind of "preflight" mode. We already try to do this for Compass, which generates sprite assets during build. This gets nasty and recursive fast.

@t6d

@tdreyno Again, the sitemap itself is not the problem here. The problem is that the resource I'm creating is added to the sitemap when evaluating the template. This apparently happens after the list of sites to build has already been determined. As I understand it, the process works the following way:

(1) Generate sitemap --> (2) Generate list of sites to build --> (3) Evaluate templates and write results to disk

Now, as long as we are in Step 1 modifications to the sitemap made by plugins will have effect. If content gets added to the sitemap in Step 3, however, this will have no effect as the list of resources to build has already been determined. All my implementation does is checking for changes made to the sitemap in Step 3. If such a change has occurred the list of resources to build is updated. The BuildQueue simply keeps track of resources that have been built and those that have yet to be processed by their URL – which should be suffices as unique identifier.

The problem as described here, never occurs in development. Only in production. I would assume that this is due to the fact that in development the content is directly served from the sitemap. Meaning the moment I update the sitemap the content is available to the client. Considering how a web pages is served this makes perfectly sense. For instance, if I have a page that contains a thumbnail which is dynamically generated, the resulting HTML sent back to the client will contain a reference to this thumbnail. Before sending back any HTML though the sitemap will get updated. Meaning, all subsequent requests will have access to this updated sitemap. Now, when the client evaluates the HTML, it will stumble across the reference to the thumbnail and retrieve the data from the server which of course will work since the data is actually contained in the sitemap. Meaning, all is good. This however does not work if you generate the list of resources to build in advance.

I don't think you need a "preflight" mode. All you need is to keep track of what resources need to be built and what resources already have been built. This is what my implementation actually does.

To quickly sum it up: The sitemap is not the problem. The way you keep track of sites yet to build is.

@tdreyno
Owner

I really like the build queue idea, it'll give us a nice place to optimize copying binary files and to control the build process outside of the CLI. I'm just concerned about the API exposed and the expectations that it implies.

I'd like to merge this as a nice improvement to internal structure without deciding on whether templates should be dynamically generating assets to build yet.

@btucker

I was recently stung by the same issue this PR attempts to solve. While this does solve the problem, it comes at a pretty heavy-- O(n^3) I think -- performance hit since it has to iterate over multiple arrays for each iteration of the resource list to discover additions.

One simple alternative would be to delay this resource discovery until the end of each sweep through the resources array. Ideally, it seems like the better solution would be some refinement to the resource list manipulators infrastructure to specifically handle the case of adding resources (whether at the beginning of the build process, or at render time). Something like a Middleman::Sitemap::Store#add which could both add the resource to the sitemap & also push it into the build queue if one exists.

@btucker

I'd like to propose this simple solution to this problem instead of moving to the full-on build queue (at least for the time being): btucker@5e9ff87

Would this be accepted into 3.0-stable?

@bhollis
Owner

@btucker, I like your patch, though I'd switch to a Set instead of just an Array. Could you set this up as a new PR with tests?

@bhollis
Owner

I'd still like to pull @btucker's code in with some slight improvements (using Set).

@karlfreeman
Owner

@bhollis I'd like to see that happen too! :+1:

@tdreyno
Owner

Closing this, hopefully @btucker will have some time to submit the PR. We'll get it in first thing to 4.0

@tdreyno tdreyno closed this
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Commits on Oct 1, 2012
  1. @t6d

    Adds a build queue that keeps track of the resources that have yet to…

    t6d authored
    … be built including those resources that have been added dynamically during runtime
  2. @t6d

    Adds a test to ensure that a page that was added to the sitemap after…

    t6d authored
    … the build process had already started gets rendered eventually
This page is out of date. Refresh to see the latest.
View
9 middleman-core/features/builder.feature
@@ -47,4 +47,11 @@ Feature: Builder
Scenario: Build alias (b)
Given a fixture app "large-build-app"
When I run `middleman b`
- Then was successfully built
+ Then was successfully built
+
+ Scenario: Resources on demand
+ Given a successfully built app at "resources-on-demand"
+ When I cd to "build"
+ Then the following files should exist:
+ | index.html |
+ | generated-on-demand.html |
View
1  middleman-core/fixtures/resources-on-demand/config.rb
@@ -0,0 +1 @@
+ignore '/template.html'
View
1  middleman-core/fixtures/resources-on-demand/source/index.html.erb
@@ -0,0 +1 @@
+<% page '/generated-on-demand.html', :proxy => '/template.html' %>
View
0  middleman-core/fixtures/resources-on-demand/source/template.html.erb
No changes.
View
68 middleman-core/lib/middleman-core/cli/build.rb
@@ -2,7 +2,59 @@
# CLI Module
module Middleman::Cli
-
+
+ # Queue to manage the build process
+ class BuildQueue
+
+ ##
+ # Creates a new build queue that keeps track of pages dynamically added to the
+ # sitemap during runtime.
+ #
+ # Sorts paths intially by the specified order. This is primarily so Compass can
+ # find files in the build folder when it needs to generate sprites for the
+ # css files.
+ def initialize(sitemap, initial_sort_order = %w(.png .jpeg .jpg .gif .bmp .svg .svgz .ico .woff .otf .ttf .eot .js .css))
+ sitemap.ensure_resource_list_updated!
+
+ @scheduled = sitemap.resources.sort_by do |r|
+ initial_sort_order.index(r.ext) || 100
+ end
+ @finished = []
+ @sitemap = sitemap
+ end
+
+ def each
+ while not empty? do
+ yield next_page
+ update_schedule
+ end
+ end
+
+ def empty?
+ @scheduled.empty?
+ end
+
+ def scheduled?(resource)
+ @scheduled.any? { |scheduled_resource| scheduled_resource.url == resource.url }
+ end
+
+ def finished?(resource)
+ @finished.any? { |finished_resource| finished_resource.url == resource.url }
+ end
+
+ private
+
+ def next_page
+ @finished.push(@scheduled.shift)
+ @finished.last
+ end
+
+ def update_schedule
+ @scheduled.push(*@sitemap.resources.reject { |r| scheduled?(r) || finished?(r) })
+ end
+
+ end
+
# The CLI Build class
class Build < Thor
include Thor::Actions
@@ -223,9 +275,6 @@ def queue_current_paths
# Actually build the app
# @return [void]
def execute!
- # Sort order, images, fonts, js/css and finally everything else.
- sort_order = %w(.png .jpeg .jpg .gif .bmp .svg .svgz .ico .woff .otf .ttf .eot .js .css)
-
# Pre-request CSS to give Compass a chance to build sprites
logger.debug "== Prerendering CSS"
@@ -239,20 +288,11 @@ def execute!
# Double-check for compass sprites
@app.files.find_new_files((Pathname(@app.source_dir) + @app.images_dir).relative_path_from(@app.root_path))
- @app.sitemap.ensure_resource_list_updated!
-
- # Sort paths to be built by the above order. This is primarily so Compass can
- # find files in the build folder when it needs to generate sprites for the
- # css files
logger.debug "== Building files"
- resources = @app.sitemap.resources.sort_by do |r|
- sort_order.index(r.ext) || 100
- end
-
# Loop over all the paths and build them.
- resources.each do |resource|
+ BuildQueue.new(@app.sitemap).each do |resource|
next if @config[:glob] && !File.fnmatch(@config[:glob], resource.destination_path)
output_path = base.render_to_file(resource)
Something went wrong with that request. Please try again.