Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

Already on GitHub? Sign in to your account

0-length cache file; writing files to cache; and Last-Modified header. #3

joeytwiddle opened this Issue Sep 4, 2013 · 2 comments


None yet
2 participants

This seems like a pretty rare bug, but when I was using version 0.0.4, somehow I got a 0 length file in my cache. I'm afraid I couldn't reproduce it, but I did wonder if fs.writeFile might be to blame...

Unfortunately in many apps, writing directly to files is often not the correct thing to do. Because there is a small chance that during that moment the process will be killed, or there will be a power cut, and the file will be left only partially written to the disk. When the app is started again, it has no way of knowing that the file is invalid, and just assumes it is valid.

Instead of using lockfiles, a common solution to this is to write to a temporary file (e.g. filename.working) and then after the data is saved, move the working file into its final location. This is good practice, because on most filesystems, a file move/overwrite is atomic: it either works or it doesn't. (You can see apps doing this sort of thing when saving their config files, to avoid the risk of losing the user's config.)

So ... you could consider that approach when writing your cachefiles.

The second related issue was that I had some difficulty getting this 0 length file out of my browser's cache! If I changed the original file a little, the browser would get a new freshly-minified version, but then when I changed it back, it hit the 0-length copy from its cache again!

If I recall correctly, the browser was using Last-Modified, perhaps based on the minfied file's timestamp, rather than the original? If that is the case, you might reconsider that policy. Although I guess this is a non-problem, assuming we never see invalid cachefiles again.

Anyway, thanks for this nice little package. It's great to be able to just drop it in. I hope my bug reports don't drive you too crazy! :)

Some more suggestions for the future: how about minification for HTML/SVG/XML/XSLT/SGML files, and cleanup of old cache files?


breeswish commented Sep 4, 2013

Hello. Thank you for your feedback and solutions!

Now it uses .tmp file when writing cache and uses JSON-format to storage cache. If the cache file is broken (for example, invalid JSON), it will use the original output instead of the cache (further improvements is needed here). I think these changes could solve most of the problems :)

About Last-Modified, I think it should not be taken into consideration although it can indicate file changes. SHA1 checksum has already ensured this, so what to do if last-modified is changed but SHA1 is not changed? :) And since the bug of incomplete or invalid cache has been fixed, maybe there is no need to check last-modified any more. How do you think about this? I'm glad to have a further discussion about this topic with you :)

About HTML/SVG/XML/XSLT/SGML, I will implement it later for currently I haven't found a small and fast enough module to do the minification. I want to keep express-minify small and simple (and stupid XD). I will do further tests of existing XML/SVG minification modules when I have more spare time (I'm preparing TOEFL these monthes >_<).

Anyway, thank you for your so many reports and issues!

Great work on .tmp files!

About Last-Modified, I agree probably nothing needs to be done. I could of course solve my problem by deleting the broken file from the cache folder. Hopefully that's the last we will see of it. :)

About minifying HTML, there is one issue from our recent project that I can pre-warn you about. That is: whitespace between elements sometimes affects the output! For example, sensible minifiers will not compress whitespace between <pre> and <code> tags. (Not entirely unrelated, Jade uses this list when pretty-printing, to avoid adding newlines: https://github.com/visionmedia/jade/blob/master/lib/inline-tags.js )

Unfortunately there are cases when the CSS decides which elements for which whitespace is relevant. For example, removing whitespace between <li> tags is not a problem until someone puts li { display: inline } in the CSS file.

For this reason, there is a danger that removing all whitespace between two tags could potentially cause the minified HTML to display differently from the original. A fix for this might be to squeeze whitespace down to one space character, rather than removing it completely, although it might look a bit weird to anyone unaware of this fairly rare issue. Providing an option to act this way might be preferable to having it by default.

Anyway, that is just something to consider when assessing *ML minifiers. In the case of XML, it really depends how the data is being consumed. Only the developer can decide whether it is safe to minify XML.

Perhaps for maximum flexibility, there could be a plug-in mechanism for express-minifier, so that the developer can choose which minifier to use in which situation. If he wants SVG data minified differently from HTML, he could provide two different functions to express-minifier, which would be called when needed to perform minification.

I do take your point about express-minifier being small and simple, and it is very nice that way. Perhaps the added complexity of *ML minifiers might be best served by a fork/addon project.

Thanks for all your recent developments, making this project really useful! Good luck with the course!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment