This repository has been archived by the owner on Mar 18, 2022. It is now read-only.

Fingerprinting: MD5 digest vs modified time #16

Closed
andrewdeandrade opened this issue Aug 23, 2012 · 8 comments

Comments

@andrewdeandrade
Contributor

Currently, express-cdn uses modified time as its cache invalidation strategy. Rails 2.x used to use this approach and switched to an MD5 digest approach for reasons outlined here:
http://edgeguides.rubyonrails.org/asset_pipeline.html#what-is-fingerprinting-and-why-should-i-care

Since I am using LESS stylesheets, I ran into problems with the mtime approach: on every compile the CSS file is rewritten with a new mtime, even though its contents are identical to the previous compile.
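To make the problem concrete, here is a minimal sketch that simulates two compiles writing identical CSS to the same path. The helper `md5OfFile`, the temp directory, and the file name are made up for illustration and are not part of express-cdn; the first write is backdated by a minute so the mtime difference is unambiguous.

```javascript
var crypto = require("crypto");
var fs = require("fs");
var os = require("os");
var path = require("path");

// Hypothetical helper: hex MD5 digest of a file's bytes.
function md5OfFile(file) {
  return crypto.createHash("md5").update(fs.readFileSync(file)).digest("hex");
}

// Scratch directory standing in for the public dir (assumption for the demo).
var dir = fs.mkdtempSync(path.join(os.tmpdir(), "fingerprint-"));
var cssFile = path.join(dir, "style.css");
var css = "body { color: #333; }\n";

// First "compile": write the file, then backdate it by one minute.
fs.writeFileSync(cssFile, css);
var oneMinuteAgo = new Date(Date.now() - 60000);
fs.utimesSync(cssFile, oneMinuteAgo, oneMinuteAgo);
var firstMtime = fs.statSync(cssFile).mtime.getTime();
var firstDigest = md5OfFile(cssFile);

// Second "compile": same bytes, but the write bumps the mtime.
fs.writeFileSync(cssFile, css);
var secondMtime = fs.statSync(cssFile).mtime.getTime();
var secondDigest = md5OfFile(cssFile);

console.log("mtime changed: ", firstMtime !== secondMtime);
console.log("digest changed:", firstDigest !== secondDigest);
```

An mtime-based fingerprint would invalidate the cached asset after the second write, while a content digest would leave it alone.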

PS Google has a good document describing caching strategies here:
https://developers.google.com/speed/docs/best-practices/caching

@niftylettuce
Collaborator

Do you suggest using Git's integrated hash?

@andrewdeandrade
Contributor Author

The approach I used on my fork is simply to perform something like this:

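var crypto = require("crypto");
var fs = require("fs");
var path = require("path");

// One running digest covers every asset that goes into the tag.
var hash = crypto.createHash("md5");

// ...
for (var b = 0; b < assets.length; b += 1) {
  // ...
  // Feed the asset's bytes into the digest (update() returns the same hash object).
  hash.update(fs.readFileSync(path.join(options.publicDir, assets[b])));
  // ...
}
// ...
return createTag(src, "/" + name, attributes) + "\n";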

The problem with this approach at the moment is that it is computationally intensive and time consuming to generate an MD5 digest of a bunch of files every time you want to generate an HTML tag. I'm guessing that the best approach is to generate the MD5 digest only once and cache it in memory.
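A rough sketch of the "compute once, cache in memory" idea might look like the following. `cachedDigest` and `digestCache` are hypothetical names, not anything in express-cdn or the fork, and the cache key (public dir plus asset list) is an assumption about what identifies a tag.

```javascript
var crypto = require("crypto");
var fs = require("fs");
var path = require("path");

// In-memory cache: "publicDir + asset list" -> hex digest.
// It lives for the lifetime of the process and is never invalidated.
var digestCache = Object.create(null);

// Hypothetical helper: MD5 over the concatenated bytes of all assets,
// read from disk only on the first call for a given asset list.
function cachedDigest(publicDir, assets) {
  var key = publicDir + "\0" + assets.join("\0");
  if (key in digestCache) {
    return digestCache[key];
  }
  var hash = crypto.createHash("md5");
  for (var b = 0; b < assets.length; b += 1) {
    hash.update(fs.readFileSync(path.join(publicDir, assets[b])));
  }
  digestCache[key] = hash.digest("hex");
  return digestCache[key];
}
```

The trade-off is that edits to the files after the first call go unnoticed until the process restarts. That is probably acceptable when assets only change on deploy, but a development setup would need some way to clear or bypass the cache.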

Here is my commit with this naive implementation:
andrewdeandrade@3e2744a

I'm going to contact bminer and check if he'd be interested in modularizing his node-static-assets module to separate the express middleware from the cache strategies.

Tell me more about what you had in mind regarding the git integrated hash approach. I'm not sure I understand. Do you have a link to an implementation?

@andrewdeandrade
Contributor Author

Nick, here's a naïve implementation.

https://github.com/malandrew/express-cdn

It's not production ready as I need to go ahead and modify the middleware part so it caches values and doesn't recalculate on every server request. Any ideas on how you'd go about implementing that caching?

Once I figure out how best to implement this, I want to go ahead and add this to the less-middleware module, since it also has the mtime Achilles' heel that is the source of edge-case bugs.

@niftylettuce
Collaborator

Andrew,

Sorry for delay -- been working hard on my startup -- if you do a pull
request I have no problem looking it over!

Someone integrated CSS for url(...) attribute in the newest version of
express-cdn v0.0.6 -- check it out.

It still needs the mtime appended to the image URLs, though. Might you look into that?
That was one of the lazyweb requests.

@grydstedt

Andrew, couldn't you set once:true on the middleware to only have it generate the compiled css once? I believe the problem is that express-cdn goes straight to disk to find the css files without requesting it through the middleware, no?

@niftylettuce
Collaborator

@grydstedt for CSS we do a GET request https://github.com/niftylettuce/express-cdn/blob/master/lib/main.js#L279 -- if that answers your question? /cc @malandrew

@niftylettuce
Collaborator

We should take care of this issue; timestamps aren't too cool right now. @grydstedt, did you have any questions about how CSS gets requested, or anything else?

@niftylettuce
Collaborator

adding to lazyweb requests in the readme, I think git sha might be better/easier
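
The thread never spells out what "git sha" would mean here. One possible reading is the per-file blob hash that Git already computes, which is the value `git hash-object <file>` prints. As a sketch under that assumption, it can be reproduced without shelling out to Git; `gitBlobSha` is a hypothetical helper name.

```javascript
var crypto = require("crypto");

// Hypothetical helper: the SHA-1 Git assigns to a blob with these contents.
// Git hashes the header "blob <byte length>\0" followed by the raw bytes.
function gitBlobSha(contents) {
  var body = Buffer.isBuffer(contents) ? contents : Buffer.from(contents);
  var header = Buffer.from("blob " + body.length + "\0");
  return crypto.createHash("sha1").update(header).update(body).digest("hex");
}
```

Like the MD5 digest, this depends only on file contents, so an identical recompile yields the same fingerprint. The other reading, using the commit hash for all assets, would be simpler but would invalidate every asset on every commit.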
