Fingerprinting: MD5 digest vs modified time #16

Closed
malandrew opened this Issue · 8 comments

3 participants

@malandrew

Currently, express-cdn uses modified time as its cache invalidation strategy. Rails 2.x used to use this approach and switched to an MD5 digest approach for reasons outlined here:
http://edgeguides.rubyonrails.org/asset_pipeline.html#what-is-fingerprinting-and-why-should-i-care

Since I am using LESS stylesheets, I ran into problems with the mtime approach: every time the CSS file is recompiled it gets a new mtime, even though its contents are identical to the previous compile.
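To make the difference concrete, here is a minimal sketch (not code from express-cdn) that writes the same CSS twice, as a recompile would, and compares the two fingerprinting strategies. The helper name `md5OfFile`, the temp directory, and the explicit dates are assumptions made for illustration only.

```javascript
const crypto = require("crypto");
const fs = require("fs");
const os = require("os");
const path = require("path");

// Hypothetical helper: fingerprint a file by its contents rather than its mtime.
function md5OfFile(file) {
  return crypto.createHash("md5").update(fs.readFileSync(file)).digest("hex");
}

// Simulate two LESS compiles that emit byte-identical CSS.
const dir = fs.mkdtempSync(path.join(os.tmpdir(), "fingerprint-"));
const cssFile = path.join(dir, "style.css");
const css = "body { color: red; }\n";

// First "compile". The mtime is set explicitly so the example does not
// depend on the filesystem's timestamp resolution.
fs.writeFileSync(cssFile, css);
fs.utimesSync(cssFile, new Date("2013-01-01"), new Date("2013-01-01"));
const firstMtime = fs.statSync(cssFile).mtime.getTime();
const firstDigest = md5OfFile(cssFile);

// Second "compile": same bytes, later mtime.
fs.writeFileSync(cssFile, css);
fs.utimesSync(cssFile, new Date("2013-01-02"), new Date("2013-01-02"));
const secondMtime = fs.statSync(cssFile).mtime.getTime();
const secondDigest = md5OfFile(cssFile);

console.log("mtime changed: ", firstMtime !== secondMtime);
console.log("digest changed:", firstDigest !== secondDigest);
```

An mtime-based URL would change here and force clients to re-download an unchanged file, while a digest-based URL would stay the same.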

PS Google has a good document describing caching strategies here:
https://developers.google.com/speed/docs/best-practices/caching

@niftylettuce

Do you suggest using Git's integrated hash?

@malandrew

The approach I used on my fork is simply to perform something like this:

var hash = crypto.createHash("md5");

// ...
for (var b=0; b<assets.length; b+=1) {
  // ...
  hash = hash.update(fs.readFileSync(path.join(options.publicDir, assets[b])));
  // ...
}
// ...
return createTag(src, "/" + name, attributes) + "\n";

The problem with this approach at the moment is that it is expensive: it reads every file from disk and computes an MD5 digest each time an HTML tag is generated. I'm guessing that the best approach is to generate the MD5 digest only once and cache it in memory.
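One way the "generate once, cache in memory" idea might look is sketched below. This is not the code from the fork; `digestCache` and `digestFor` are hypothetical names, and the cache key (the public directory plus the asset list) is an assumption.

```javascript
const crypto = require("crypto");
const fs = require("fs");
const path = require("path");

// Hypothetical in-memory cache: one digest per (publicDir, asset list) pair.
const digestCache = new Map();

// Returns the hex MD5 of the concatenated asset contents, reading the files
// from disk only the first time a given asset list is seen.
function digestFor(publicDir, assets) {
  const key = publicDir + "\0" + assets.join("\0");
  if (digestCache.has(key)) {
    return digestCache.get(key);
  }
  const hash = crypto.createHash("md5");
  for (const asset of assets) {
    hash.update(fs.readFileSync(path.join(publicDir, asset)));
  }
  const digest = hash.digest("hex");
  digestCache.set(key, digest);
  return digest;
}
```

The trade-off is that the cache never notices a file changing on disk, so a sketch like this only suits production, where assets are fixed for the life of the process. In development the cache would have to be skipped or cleared.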

Here is my commit with this naive implementation:
malandrew@3e2744a

I'm going to contact bminer and check if he'd be interested in modularizing his node-static-assets module to separate the express middleware from the cache strategies.

Tell me more about what you had in mind regarding the git integrated hash approach. I'm not sure I understand. Do you have a link to an implementation?

@malandrew

Nick, here's a naïve implementation.

https://github.com/malandrew/express-cdn

It's not production ready as I need to go ahead and modify the middleware part so it caches values and doesn't recalculate on every server request. Any ideas on how you'd go about implementing that caching?

Once I figure out how best to implement this, I want to go ahead and add it to the less-middleware module, since it has the same mtime Achilles' heel, which is a source of edge-case bugs.

@niftylettuce
@grydstedt

Andrew, couldn't you set once:true on the middleware so that it generates the compiled CSS only once? I believe the problem is that express-cdn goes straight to disk to find the CSS files without requesting them through the middleware, no?

@niftylettuce

@grydstedt for CSS we do a GET request https://github.com/niftylettuce/express-cdn/blob/master/lib/main.js#L279 -- if that answers your question? /cc @malandrew

@niftylettuce

we should take care of this issue, timestamps aren't too cool right now. @grydstedt did you have any ?'s about how CSS gets requested or anything?

@niftylettuce

adding to lazyweb requests in the readme, I think git sha might be better/easier
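For reference, a rough sketch of what a git-SHA-based version could look like, assuming the app runs from a git checkout with the `git` binary on the PATH. Nothing here is express-cdn API: `readGitSha` and `versionedPath` are hypothetical helpers, and placing the SHA as a path prefix is just one possible layout.

```javascript
const { execSync } = require("child_process");

// Hypothetical helper: read the current commit SHA, intended to be called
// once at startup. `run` is injectable so the function can be exercised
// without a real repository.
function readGitSha(run = execSync) {
  return String(run("git rev-parse --short HEAD")).trim();
}

// Hypothetical helper: put the version in front of the asset path,
// e.g. versionedPath("/css/style.css", sha) gives "/<sha>/css/style.css".
function versionedPath(assetPath, version) {
  return "/" + version + "/" + assetPath.replace(/^\/+/, "");
}
```

This is cheap because nothing is hashed at request time, but the SHA changes on every commit, so every asset URL is invalidated on each deploy even when the file itself did not change. A per-file content digest invalidates only the files that actually changed.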
