Assetgraph (grunt-reduce) for production minification #1234

Open
Munter opened this Issue · 4 comments

2 participants

@Munter

I was asked to give a brief overview of what Assetgraph is, and how it can fit into the Yeoman project and the whole Grunt setup, at the irc meeting yesterday. So here goes.

Assetgraph is a project that aims to model web assets and their relations to each other in a graph.

The original intention behind this was to create a web asset build system that could automate web performance optimization. This project is Assetgraph-builder and has a grunt wrapper called grunt-reduce.

How it works

You seed the graph with an initial asset, typically your index.html or a collection of several HTML files. Then you run the populate transform on assetgraph, which starts an automated population of the graph by parsing each initial asset, finding all outgoing relations and populating these recursively. We aim to implement any type of relation that is well specified and supported in any way. The list is quite extensive by now.

So now you have a populated graph that includes all the assets that are depended on by your initial seed asset (index.html). All of this with a minimal configuration of just that one root file reference.
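To make the seed-and-populate idea concrete, here is a minimal sketch of recursive population. It is a toy model and not the assetgraph API: the in-memory `ToyFiles` map, the `findOutgoing` regular expression and the `populate` function are all assumptions made up for illustration, and real relation detection parses each asset type properly.

```typescript
// Toy model only: NOT the assetgraph API. File contents live in memory.
type ToyFiles = { [path: string]: string };
type ToyRelations = { [path: string]: string[] };

// Hypothetical relation finder: picks up src/href attributes and CSS url(...).
function findOutgoing(text: string): string[] {
  const pattern = /(?:src|href)="([^"]+)"|url\(([^)]+)\)/g;
  const targets: string[] = [];
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(text)) !== null) {
    targets.push(match[1] || match[2]);
  }
  return targets;
}

// Seed with one asset, then follow relations until no new assets appear.
function populate(files: ToyFiles, seed: string): ToyRelations {
  const relations: ToyRelations = {};
  const queue: string[] = [seed];
  while (queue.length > 0) {
    const current = queue.shift() as string;
    if (current in relations || !(current in files)) continue;
    const outgoing = findOutgoing(files[current]);
    relations[current] = outgoing;
    outgoing.forEach((target) => queue.push(target));
  }
  return relations;
}

const toyFiles: ToyFiles = {
  "index.html": '<link href="main.css"><script src="app.js"></script>',
  "main.css": "body { background: url(bg.png) }",
  "app.js": "var started = true;",
  "bg.png": "",
  "unused.js": "// never referenced by anything",
};

// Only index.html is configured; everything else is discovered.
const populatedGraph = populate(toyFiles, "index.html");
```

In this sketch `populatedGraph` ends up with index.html, main.css, app.js and bg.png, while unused.js never enters the graph because nothing refers to it.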

To work with this graph we've implemented a query model that has a lot of similarities with the MongoDB query model. The essence of it is that you can find assets or relations based on a simple query object where properties can be matched recursively with arrays, strings, booleans, regular expressions or functions, with some boolean and/or/not wrappers. This gives you fine-grained control over which assets or relations you want to work on and lets you iterate over them.
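As a rough illustration of that query style, the sketch below matches plain objects against a query object. It is a simplified stand-in and not assetgraph's query engine: `findAssets`, `matchesValue` and the property names `type`, `url` and `isInline` are assumptions for the example, and the and/or/not wrappers are left out.

```typescript
// Toy matcher only: illustrates the idea, NOT assetgraph's real query engine.
type ToyPrimitive = string | boolean;
type ToyMatcher = ToyPrimitive | RegExp | ((value: ToyPrimitive) => boolean) | ToyMatcher[];
interface ToyAsset { [key: string]: ToyPrimitive }
type ToyQuery = { [key: string]: ToyMatcher };

function matchesValue(value: ToyPrimitive, matcher: ToyMatcher): boolean {
  if (matcher instanceof RegExp) return typeof value === "string" && matcher.test(value);
  if (typeof matcher === "function") return matcher(value);
  // An array means "any of these matchers".
  if (Array.isArray(matcher)) return matcher.some((m) => matchesValue(value, m));
  return value === matcher;
}

// An asset matches when every property in the query matches.
function findAssets(assets: ToyAsset[], query: ToyQuery): ToyAsset[] {
  return assets.filter((asset) =>
    Object.keys(query).every((key) => key in asset && matchesValue(asset[key], query[key]))
  );
}

const queryAssets: ToyAsset[] = [
  { type: "Html", url: "index.html", isInline: false },
  { type: "Css", url: "css/main.css", isInline: false },
  { type: "JavaScript", url: "js/app.js", isInline: false },
  { type: "JavaScript", url: "inline-1.js", isInline: true },
];

// Exact values: only js/app.js matches.
const externalScripts = findAssets(queryAssets, { type: "JavaScript", isInline: false });

// Array plus regular expression: css/main.css and js/app.js match.
const stylesOrScripts = findAssets(queryAssets, { type: ["Css", "JavaScript"], url: /^(css|js)\// });
```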

The third leg of assetgraph is the transforms, which are high-level convenience functions that can rework the graph for you in predefined ways. Each of these takes an assetgraph as its first argument and a continuation callback as its second argument, thus allowing for extensibility and chaining of multiple transforms on the same graph instance.
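The (graph, callback) shape described above is what makes chaining possible. A minimal sketch of such a chain follows; `runTransforms`, `collapseWhitespace` and `markAsMinified` are hypothetical names invented here, and the graph is reduced to a bare list of assets.

```typescript
// Toy transform chain: names and graph shape are illustrative assumptions.
interface ChainAsset { url: string; text: string }
interface ChainGraph { assets: ChainAsset[] }
type ChainDone = (err?: Error) => void;
type ToyTransform = (graph: ChainGraph, done: ChainDone) => void;

// Runs each transform in order on the same graph instance.
function runTransforms(graph: ChainGraph, transforms: ToyTransform[], done: ChainDone): void {
  const step = (index: number): void => {
    if (index === transforms.length) {
      done();
      return;
    }
    transforms[index](graph, (err?: Error) => {
      if (err) {
        done(err);
      } else {
        step(index + 1);
      }
    });
  };
  step(0);
}

// Stand-in for a minifier: collapses runs of whitespace.
const collapseWhitespace: ToyTransform = (graph, done) => {
  graph.assets.forEach((asset) => {
    asset.text = asset.text.replace(/\s+/g, " ").trim();
  });
  done();
};

// Renames js/app.js to js/app.min.js and so on.
const markAsMinified: ToyTransform = (graph, done) => {
  graph.assets.forEach((asset) => {
    asset.url = asset.url.replace(/\.(\w+)$/, ".min.$1");
  });
  done();
};

const chainGraph: ChainGraph = { assets: [{ url: "js/app.js", text: "var a   =  1;\n" }] };
const chainLog: string[] = [];

runTransforms(chainGraph, [collapseWhitespace, markAsMinified], (err) => {
  if (err) throw err;
  chainLog.push("done");
});
```

Because every transform works on the same in-memory graph, nothing is read from or written to disk between the steps.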

What does it mean?

There are some major differences between the way file-based systems like grunt work and the way assetgraph works.

First of all every input file is only read once and the graph is kept in memory through all the transforms, then written to disk on demand. So file IO is drastically reduced compared to a long chain of grunt minification tasks. This has some potential speed implications, hopefully improvements :)

Configuration is drastically reduced. Since the individual graph transformations are chained operations on the already instantiated and populated graph, you don't need all the repetition that a grunt minification task pipeline has. You only configure where your assets are once, and then you still only need to configure your initial assets, which can be quite simple.

Only referenced files are included. This means that you won't be including unused libraries just because they happen to be located in a directory you globbed for bundling or similar. Combining this with a dependency-based approach, for example with RequireJS, you can completely skip any need for manifest files and trust that the libs you need will be there by automation and that no libs you don't need will be included. The caveat here is that if a certain asset relation syntax isn't supported by assetgraph, the asset will not be in the graph and consequently will not be written to the desired output directory.

References are automatically updated. If you rename a file in the graph, replace a file reference with a bundle reference or similar, any reference will be automatically updated. This means that previously hard problems like revving files become somewhat simpler than when trying to solve the problem without implicit knowledge of the whole dependency graph. (This should address issues like #1228.)
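A small sketch of why revving gets easier once all references are known. Everything here is an assumption for illustration: `toyChecksum` is a simple non-cryptographic hash, and `renameAsset` uses plain string substitution where a real graph would update relation objects.

```typescript
// Toy revving sketch: NOT how assetgraph implements it.
interface RevAsset { url: string; text: string }

// Simple non-cryptographic checksum, good enough for a demonstration only.
function toyChecksum(text: string): string {
  let hash = 0;
  for (let i = 0; i < text.length; i++) {
    hash = (hash * 31 + text.charCodeAt(i)) >>> 0;
  }
  return hash.toString(16);
}

// Renames one asset and rewrites every literal reference to it.
function renameAsset(assets: RevAsset[], oldUrl: string, newUrl: string): void {
  assets.forEach((asset) => {
    // split/join replaces all occurrences without any regex escaping.
    asset.text = asset.text.split(oldUrl).join(newUrl);
    if (asset.url === oldUrl) asset.url = newUrl;
  });
}

// Puts a content-derived token before the file extension.
function revAsset(assets: RevAsset[], url: string): string {
  const target = assets.filter((a) => a.url === url)[0];
  if (!target) throw new Error("No such asset: " + url);
  const newUrl = url.replace(/(\.\w+)$/, "." + toyChecksum(target.text) + "$1");
  renameAsset(assets, url, newUrl);
  return newUrl;
}

const revAssets: RevAsset[] = [
  { url: "index.html", text: '<script src="js/app.js"></script>' },
  { url: "js/app.js", text: "var a = 1;" },
];

// index.html now points at the revved file name.
const revvedUrl = revAsset(revAssets, "js/app.js");
```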

How can this be implemented?

Luckily I already put in some time to create a grunt wrapper for assetgraph-builder, called grunt-reduce. This in itself is pretty much a plug in replacement for the whole grunt minification chain that most generators currently set up.

One implication of switching to grunt-reduce is that the workflow might be a bit different from what some people are doing with grunt at the moment. Most importantly assetgraph is supposed to understand anything a browser understands. So it doesn't care for most preprocessing like sass, typescript, coffeescript etc. These preprocessing steps should still be run in grunt, probably with a watcher task to get updates to the browser quickly.

Running grunt-reduce should be manually triggered, not a part of some watcher chain. Production building is a slow process, no matter how much we optimize it. Also I would consider it absolutely best practice to have a working development version running in your app-folder with the least amount of abstractions. This should improve the debugging workflow.

Caveats

Sadly this is not all as easy as waving a magic wand. While assetgraph does away with most of the ugliness of the corresponding grunt minification chain, there are also some catches that you need to know about.

One particular problem is that developers sometimes use JavaScript strings as file references. Let's take the example of an app that loads a static JSON file at run time to read some configuration. The developer does this with a manually run XHR call. The file reference would be something like path/to/file.json. There is no way assetgraph can interpret any and all strings as file references, so to properly refer to this file the developer has to write GETSTATICURL('path/to/file.json'). GETSTATICURL is a function that simply returns its first argument at run time, but at build time it gets picked up as a file relation and is replaced by a raw string containing the updated file path.
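The two halves of that behaviour could look roughly like this. The run-time function follows the description above; the build step `inlineStaticUrls`, its regular expression and the revved file name are assumptions for illustration and do not reflect how assetgraph actually parses JavaScript.

```typescript
// Run-time definition as described above: return the first argument unchanged.
function GETSTATICURL(url: string): string {
  return url;
}

// Hypothetical build step: swap each call for a plain string holding the new path.
function inlineStaticUrls(source: string, renamed: { [oldPath: string]: string }): string {
  return source.replace(
    /GETSTATICURL\((['"])([^'"]+)\1\)/g,
    (_call: string, quote: string, oldPath: string) => {
      const target = oldPath in renamed ? renamed[oldPath] : oldPath;
      return quote + target + quote;
    }
  );
}

const devSource = "xhr.open('GET', GETSTATICURL('path/to/file.json'));";

// The target name is a made-up example of a path updated during the build.
const builtSource = inlineStaticUrls(devSource, {
  "path/to/file.json": "static/file.a1b2c3.json",
});
// builtSource is: xhr.open('GET', 'static/file.a1b2c3.json');
```

In development the app keeps working untouched, since the call just hands back the original path.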

RequireJS plugins. Currently assetgraph understands only some of them: text, json, css, less. It also picks up files ending with .ko as knockout template references. These are mostly hacks. Assetgraph doesn't evaluate the plugin code itself, and the arbitrary naming conventions that developers can use for plugins provide insufficient semantics to handle every type of plugin the way r.js would. We're planning to mitigate this by actually evaluating the build-time part of the plugin code in the same way r.js would, or by simply running r.js on the limited subset of plugins to enable support for any plugin. This is a work in progress.

Next step

My current plan is to make an example implementation of grunt-reduce on a fork of generator-webapp so you can all try this out in action.

I haven't been telling you half of the cool stuff that we can do with this, as I am hoping you will check the readmes of the referenced projects if you are interested. Safe to say I have high hopes for the dependency based approach to build systems.

Please ask any and all questions. I will be very happy to answer to the best of my knowledge.

Ping @papandreou (the brain behind assetgraph)

@Munter

I have a Yeoman generator up and running that implements grunt-reduce as a replacement for the current web optimization build step. It's very bare-bones and doesn't include preprocessing, watching etc. yet. But it should be ready to begin running some test builds to compare performance and compatibility with the current build chain.

https://github.com/Munter/generator-webapp-assetgraph

My proposed workflow is a bit different from the one currently in generator-webapp. I propose developing in a way where every asset in the web application is actually placed in app, so that any asset reference is a valid file reference on disk. This has the benefit of being immediately accessible from any static file server, instead of having to set up the current arrangement where the server first looks in .tmp for generated sources and falls back to app. Furthermore it reduces complexity and makes it more immediately understandable to the developer what is going on and where the files come from. I have an ulterior motive here of course, as a working web app with valid file references is also a prerequisite for getting assetgraph to pick up asset relations at all.

From here the entire build step is handled by grunt-reduce.

Feedback appreciated.

@addyosmani
Owner

@Munter the team have had plenty of discussions around assetgraph and we want to thank you for your on-going work in this area. It's really important but also quite exciting to see evolve :)

We'll probably continue to give you feedback in the team meetings on your progress but you might also like to link folks on this thread up to the work you're doing on TodoMVC too that is somewhat related.

@Munter

Right. I've started a small project which I have dubbed the TodoMVC challenge. The point of this challenge is to see how Assetgraph fares when it moves out of the lab (only building the code that its creators write) and gets exposed to code in the wild.

Since the basic premise of Assetgraph is that it should understand the web, it should also be able to build a very wide variety of code bases without introducing errors. So this is the challenge. Throw every TodoMVC app at assetgraph-builder and see what happens.

This has clearly exposed some bugs. Of the ones I've found so far, 2 or 3 are the key to getting most of the apps to build correctly.

The goal is of course to get every app built correctly without having to alter the original source code. Having gained some experience looking into the actual code bases of the apps, and especially the libraries they use, this seems unlikely even with bug fixes in assetgraph. There are simply too many custom module loaders in the world to cover the semantics of all of them, unless of course those projects get involved in keeping assetgraph up to date.

I am quite certain that when we have done our bug fixes, I will be able to define a best practice that is guaranteed to work with assetgraph and that this best practice coincides with what is best practice for an efficient developer workflow in general.

Right now it's all about putting in the time to test, analyze errors and fix bugs though :)

@Munter

Some feedback after my month of TodoMVC challenge and assetgraph improvements.

I am now through my first iteration, meaning I have built every single TodoMVC app and documented success, failure and bugs related to failures. I have even fixed some of the bugs along the way and made assetgraph a whole lot more stable because of that.

If I look beyond the obvious bugs that I still need to fix in Assetgraph there is a pretty clear picture of the main problems with the remaining apps. These are some of them:

  • The MVC library itself or the app implementation is minifier unsafe. This is not something we are able to detect by automation. I don't see a way assetgraph will ever support this, but then again I don't see any other tool being able to either. This can only be resolved by the developer of that specific implementation annotating the build to not mangle variables, or by some other workaround. In TodoMVC challenge terms this accounts for at least 2 apps if I don't count AngularJS implementations that can be fixed with ngmin (assetgraph/assetgraph-builder#112).
  • MVC libraries or apps call RequireJS's define function with three arguments, a syntax reserved for build output, which assetgraph doesn't pick up. This is a tough one seen from both sides. From a developer's perspective this should just work out of the box, because it does in the pre-build use case. I am open to taking another look at supporting this at some point.
  • By far the most problems come from custom module loader implementations. Since assetgraph works on the basis of extrapolating dependencies from syntax, implementing each custom module loader requires a significant investment of time and adds a lot of extra complexity to the code base. We are not willing to invest in that unless we see some real usage of the specific libraries we should implement. Ideally these projects should each contribute their own part to assetgraph if we are in any way to scale and maintain this.
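To show what "minifier unsafe" in the first bullet means in practice, here is a toy dependency injector that resolves arguments from parameter names, similar in spirit to name-based injection. The `paramNames` and `inject` helpers and the service names are invented for this sketch and are not taken from any of the libraries mentioned.

```typescript
// Toy injector: resolves dependencies by reading parameter names from source text.
function paramNames(fn: Function): string[] {
  const match = /\(([^)]*)\)/.exec(fn.toString());
  if (match === null || match[1].trim() === "") return [];
  return match[1].split(",").map((p) => p.trim());
}

function inject(fn: (...args: string[]) => string, services: { [key: string]: string }): string {
  const args = paramNames(fn).map((p) => {
    if (!(p in services)) throw new Error("Unknown dependency: " + p);
    return services[p];
  });
  return fn.apply(null, args);
}

const diServices: { [key: string]: string } = { greeting: "Hello", user: "Munter" };

// Works: the parameter names match the service names.
const originalFn = function (greeting: string, user: string): string {
  return greeting + " " + user;
};

// What a variable-mangling minifier would emit: same body, renamed parameters.
const mangledFn = function (a: string, b: string): string {
  return a + " " + b;
};

const injectedResult = inject(originalFn, diServices); // "Hello Munter"

let mangledError = "";
try {
  inject(mangledFn, diServices);
} catch (e) {
  mangledError = (e as Error).message; // "Unknown dependency: a"
}
```

Both functions are equivalent JavaScript, which is why a build tool cannot tell from syntax alone that renaming the parameters changes behaviour.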

From these indicators I can extrapolate some workflow practices that would give a bigger success rate when using assetgraph-builder / grunt-reduce:

  • Don't depend on minifier unsafe code
  • Use the unminified source in development
  • Don't use MVC libraries with custom module loaders.

My further recommendations for setting up development workflows with assetgraph-builder / grunt-reduce are these:

  • app should always contain a working web application. If you depend on preprocessors to generate the files needed to get to a working app state, build them to the correct path in app.
  • Only use watch tasks on source files for preprocessing and for livereload
  • Grunt serve from app and iterate on the non-minified code
  • Build to dist only on demand

If these recommendations are followed you will see a drastic decrease in work done in the development iteration part of the workflow. You will be working on and serving the raw files, reducing the levels of abstraction between the browser and editor, thereby also reducing configuration requirements and all the potential errors that come with complex configuration.

Gulp would be an obvious choice for all the preprocessing tasks, making them able to run in streams and in parallel. From there livereload should pick up changes and update the browser session showing assets served directly from app.

When ready to deploy, run grunt reduce and maybe do another test iteration with a static file server serving from dist. This is of course only needed if you suspect the tools are misbehaving.

I hope I can make it to the meeting today.
