Add ETag to static files #667

Merged
bhauman merged 2 commits into bhauman:master from danielcompton:etag on Mar 20, 2018

Contributor

danielcompton commented Mar 12, 2018

  • Calculate checksum for each file when serving it
  • If the file system supports it, store checksum as extended attributes
    on the file to reuse it on the next request

This is still a WIP; there are a few things to work out before merging. However, I'm interested in feedback on the overall approach, namely:

  • Calculating a checksum for files to use as an ETag
  • Storing the checksum in extended file attributes if the OS supports it (Windows and Linux do, macOS doesn't currently)

Fixes #664
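
To make the two bullets above concrete, here is a minimal sketch of the idea, not the code from this PR. It assumes CRC32 as the checksum, uses a hypothetical attribute name (`etag-sketch.crc32`), and stores the file's last-modified time next to the checksum as a coarse guard against stale values. All function names are illustrative. If the file store rejects user-defined attributes, it silently falls back to recalculating.

```clojure
(ns etag-xattr-sketch
  (:require [clojure.string :as str])
  (:import (java.io File FileInputStream)
           (java.nio ByteBuffer)
           (java.nio.charset StandardCharsets)
           (java.nio.file Files LinkOption)
           (java.nio.file.attribute UserDefinedFileAttributeView)
           (java.util.zip CRC32)))

;; Hypothetical attribute name, chosen only for this sketch.
(def attr-name "etag-sketch.crc32")

(defn crc32-hex
  "Reads the whole file once and returns its CRC32 as a hex string."
  [^File file]
  (let [crc (CRC32.)
        buf (byte-array 8192)]
    (with-open [in (FileInputStream. file)]
      (loop []
        (let [n (.read in buf)]
          (when (pos? n)
            (.update crc buf 0 n)
            (recur)))))
    (Long/toHexString (.getValue crc))))

(defn- attr-view
  "Returns the user-defined attribute view, or nil if it is unavailable."
  ^UserDefinedFileAttributeView [^File file]
  (Files/getFileAttributeView (.toPath file)
                              UserDefinedFileAttributeView
                              (make-array LinkOption 0)))

(defn- read-cached
  "Returns the stored checksum if it was written for the current mtime, else nil."
  [^UserDefinedFileAttributeView view ^File file]
  (try
    (let [buf (ByteBuffer/allocate (.size view attr-name))]
      (.read view attr-name buf)
      (.flip buf)
      (let [[mtime sum] (str/split (str (.decode StandardCharsets/UTF_8 buf)) #":" 2)]
        (when (= mtime (str (.lastModified file)))
          sum)))
    ;; Missing attribute or unsupported file store: treat as a cache miss.
    (catch Exception _ nil)))

(defn- write-cached!
  "Best-effort write of \"<mtime>:<checksum>\"; failures are ignored."
  [^UserDefinedFileAttributeView view ^File file ^String sum]
  (try
    (.write view attr-name
            (.encode StandardCharsets/UTF_8 (str (.lastModified file) ":" sum)))
    (catch Exception _ nil)))

(defn file-checksum
  "Returns the cached checksum when available, otherwise computes and stores it."
  [^File file]
  (let [view (attr-view file)]
    (or (when view (read-cached view file))
        (let [sum (crc32-hex file)]
          (when view (write-cached! view file sum))
          sum))))
```

Calling `(file-checksum (java.io.File. "resources/public/js/app.js"))` (a hypothetical path) would return the hex checksum, which a handler could wrap in double quotes to form the ETag header value.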

(fn [request]
(if (= [:get "/"] ((juxt :request-method :uri) request))
- (if-let [resp (some-> (resource-response "index.html" {:root (or root "public")
+ (if-let [resp (some-> (resource-response "index.html" {:root root

danielcompton Mar 12, 2018

Contributor

I used root here because I think the web server root is already calculated if missing at L362?

bhauman Mar 12, 2018

Owner

Let's leave it, as it isn't part of this commit.

@@ -232,8 +309,8 @@
;; users handler goes last
(possible-endpoint resolved-ring-handler)
(handle-static-resources http-server-root)
(handle-index http-server-root)
(handle-static-resources http-server-root supports-extended-attributes?)

danielcompton Mar 12, 2018

Contributor

Do you prefer to calculate this once and pass it through to both handlers, or calculate it inside each handler?

bhauman Mar 12, 2018

Owner

I'm thinking we skip the extended attributes and try to keep this brutally simple.

bhauman Mar 12, 2018

Owner

Less code, less maintenance.

(defn checksum-file
"Generated from Pandect"
;; TODO: full attribution^^
[^File file]

danielcompton Mar 12, 2018

Contributor

We could look at modifying this so that the file is only read once while serving it, rather than once here, and once in the response.

Owner

bhauman commented Mar 12, 2018

This is good stuff.

I think the extended attributes are probably not worth the complexity.

Also, I think that every CLJS developer that has a webserver is going to want to use this code. Is there middleware that already does this somewhere? @weavejester

Developers will miss this when they switch to their own server. That said, we should always keep in mind that we don't want to be in the server business, as there is no end to the number of modifications that folks will need/want.

Also, are your perf numbers based on a file system that doesn't support attributes?

All that aside, I'm ready to accept this commit when you are satisfied that it is working well.

Contributor

danielcompton commented Mar 12, 2018

Yep, my perf numbers were for a system that didn't support attributes. I also added a simple bench test to calculate the ETag for a 45KB JS file 1000 times in each test. These were the numbers I got on CircleCI:

lein test figwheel-sidecar.components.figwheel-server-test
Calculate checksum every time
"Elapsed time: 113.684545 msecs"
"Elapsed time: 97.967025 msecs"
"Elapsed time: 97.737383 msecs"
"Elapsed time: 97.546271 msecs"
"Elapsed time: 97.633175 msecs"
"Elapsed time: 96.780006 msecs"
"Elapsed time: 98.222397 msecs"
"Elapsed time: 236.206855 msecs"
"Elapsed time: 98.405262 msecs"
"Elapsed time: 98.978996 msecs"
Calculate checksum once and store it as an extended attribute
"Elapsed time: 66.486779 msecs"
"Elapsed time: 32.679466 msecs"
"Elapsed time: 32.602913 msecs"
"Elapsed time: 32.750124 msecs"
"Elapsed time: 32.491853 msecs"
"Elapsed time: 32.686958 msecs"
"Elapsed time: 32.666625 msecs"
"Elapsed time: 32.465312 msecs"
"Elapsed time: 32.587221 msecs"
"Elapsed time: 32.56624 msecs"

I just updated the test to run against a 500KB cljs.core source map file and got these results:

lein test figwheel-sidecar.components.figwheel-server-test
Calculate checksum every time
"Elapsed time: 953.490261 msecs"
"Elapsed time: 931.623432 msecs"
"Elapsed time: 930.07882 msecs"
"Elapsed time: 936.836928 msecs"
"Elapsed time: 934.465546 msecs"
"Elapsed time: 925.765848 msecs"
"Elapsed time: 926.811449 msecs"
"Elapsed time: 936.191236 msecs"
"Elapsed time: 923.588641 msecs"
"Elapsed time: 930.018802 msecs"
Calculate checksum once and store it as an extended attribute
"Elapsed time: 69.923771 msecs"
"Elapsed time: 33.854051 msecs"
"Elapsed time: 33.474538 msecs"
"Elapsed time: 34.039951 msecs"
"Elapsed time: 34.421048 msecs"
"Elapsed time: 33.744608 msecs"
"Elapsed time: 34.120313 msecs"
"Elapsed time: 34.206049 msecs"
"Elapsed time: 34.204573 msecs"
"Elapsed time: 34.575327 msecs"

So on larger files (which are not all that common), calculating the checksum can take up to about 1ms per file (roughly 930ms across 1000 iterations), whereas reading the checksum back from a file attribute is a roughly constant cost of about 0.03ms for all file sizes. These are still pretty small numbers either way, but I thought I'd give you that information in case it sways you one way or the other. I'm happy to pull all of the extended attribute stuff out too.
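
For readers who want to reproduce this kind of measurement, here is a rough sketch of a comparable timing loop. It is not the bench test from this PR: the function names are made up, the sample file is filler text rather than real compiled JavaScript, and CRC32 is assumed as the checksum. Timings will differ by machine, so no expected numbers are given.

```clojure
(ns etag-bench-sketch
  (:import (java.io File)
           (java.nio.file Files)
           (java.util.zip CRC32)))

(defn checksum
  "Reads the file fully and returns its CRC32 value as a long."
  [^File file]
  (let [crc (CRC32.)]
    (.update crc (Files/readAllBytes (.toPath file)))
    (.getValue crc)))

(defn make-sample-file
  "Writes `size` ASCII characters to a temp file, standing in for a JS file."
  ^File [size]
  (let [f (doto (File/createTempFile "bench" ".js") (.deleteOnExit))]
    (spit f (apply str (repeat size \a)))
    f))

(defn bench
  "Prints one \"Elapsed time\" line per run, each covering `iterations` checksums."
  [^File file runs iterations]
  (dotimes [_ runs]
    (time (dotimes [_ iterations]
            (checksum file)))))
```

Running `(bench (make-sample-file 45000) 10 1000)` prints ten timing lines in the same shape as the output above. Dividing each figure by the iteration count gives the per-file cost.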

> Also, I think that every CLJS developer that has a webserver is going to want to use this code. Is there middleware that already does this somewhere?

There is https://github.com/yetanalytics/ring-etag-middleware, which calculates the ETag based on a SHA1 hash. I'm pretty sure a checksum would be faster, and we don't need cryptographic hashing for the ETag.
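
As a rough illustration of what checksum-based ETag middleware could look like, here is a sketch written against plain Ring-style request and response maps. It is not the API of the library linked above or of this PR. `wrap-file-etag` and `file-etag` are hypothetical names, and the `If-None-Match` handling is simplified to a single exact match (no weak validators or comma-separated lists).

```clojure
(ns etag-middleware-sketch
  (:import (java.io File)
           (java.nio.file Files)
           (java.util.zip CRC32)))

(defn file-etag
  "Returns a quoted ETag value derived from the file's CRC32 checksum."
  [^File file]
  (let [crc (CRC32.)]
    (.update crc (Files/readAllBytes (.toPath file)))
    (str "\"" (Long/toHexString (.getValue crc)) "\"")))

(defn wrap-file-etag
  "Adds an ETag header to responses whose :body is a java.io.File.
  Replies 304 with no body when the request's If-None-Match equals the ETag."
  [handler]
  (fn [request]
    (let [response (handler request)
          body     (:body response)]
      (if (instance? File body)
        (let [etag (file-etag body)]
          (if (= etag (get-in request [:headers "if-none-match"]))
            {:status 304 :headers {"ETag" etag} :body nil}
            (assoc-in response [:headers "ETag"] etag)))
        ;; Non-file bodies (strings, streams, nil) pass through untouched.
        response))))
```

The header lookup uses the lowercase key because Ring lowercases request header names. A non-cryptographic checksum is enough here because the ETag only needs to change when the file content changes.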

I'm planning on writing a blog post about how to properly serve ClojureScript files in development, which will cover the principles here, so people know how to adapt it to their individual systems. I could turn this PR into my own Ring middleware dependency which you add into Figwheel if you'd prefer? That would help others consume it too.

UPDATE: I think it would probably be good for others too if this were packaged up as a library, and then you wouldn't have to maintain the code either.

Owner

bhauman commented Mar 13, 2018

Yeah if you want to make this a simple middleware that figwheel consumes that would be great for everyone.

Owner

bhauman commented Mar 19, 2018

Are you still thinking about making this separate middleware?

Contributor

danielcompton commented Mar 19, 2018

Yep, just need to separate it out and publish it, will get there soon :)

Contributor

danielcompton commented Mar 20, 2018

All ready for review now!

@bhauman bhauman merged commit b3d1d16 into bhauman:master Mar 20, 2018

1 check passed

ci/circleci Your tests passed on CircleCI!
Owner

bhauman commented Mar 20, 2018

looks great!

@danielcompton danielcompton deleted the danielcompton:etag branch Mar 21, 2018

Contributor

danielcompton commented Mar 21, 2018

For anyone interested in the future, I wrote a blog post explaining this further here: https://danielcompton.net/2018/03/21/how-to-serve-clojurescript.
