proposal: io/fs, net/http: define interface for automatic ETag serving #43223

Open · rsc opened this issue Dec 16, 2020 · 38 comments

@rsc (Contributor) commented Dec 16, 2020

In the discussion of io/fs and embed, a few people asked for automatic serving of ETag headers for static content, using content hashes. We ran out of time for that in Go 1.16, in part because it's not entirely clear what we should do.

Here is a proposal that may be full of holes.

First, in io/fs, define

package fs

// A ContentHashFile is a file that can return its own content hash efficiently.
type ContentHashFile interface {
	File
	
	// ContentHash returns a content hash of the file that uniquely
	// identifies the file contents, suitable for use as a cache key.
	// The returned hash should be text, such as hexadecimal or base64.
	//
	// ContentHash must NOT compute the hash of the file during the call.
	// That is, it must run in time O(1) not O(length of file).
	// If a content hash is not already available, ContentHash should
	// return an error rather than take the time to compute one.
	ContentHash() (string, error)
}

Second, in net/http, when serving a File, if it implements ContentHashFile and the ContentHash method succeeds and is alphanumeric (no spaces, no Unicode, no symbols, to avoid any kind of header problems), use that result as the default ETag.
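To make the intended net/http behavior concrete, here is a minimal sketch (not the actual net/http implementation; `etagFor` and `fakeFile` are invented names) of how a server could consult the interface, enforce the alphanumeric restriction, and fall back to serving without an ETag:

```go
package main

import (
	"fmt"
	"io"
	"io/fs"
)

// ContentHashFile mirrors the proposed io/fs interface.
type ContentHashFile interface {
	fs.File
	ContentHash() (string, error)
}

// fakeFile is a stub fs.File carrying a precomputed hash, for illustration only.
type fakeFile struct{ hash string }

func (f fakeFile) Stat() (fs.FileInfo, error)   { return nil, fs.ErrInvalid }
func (f fakeFile) Read(p []byte) (int, error)   { return 0, io.EOF }
func (f fakeFile) Close() error                 { return nil }
func (f fakeFile) ContentHash() (string, error) { return f.hash, nil }

// etagFor sketches the net/http side: use the precomputed hash as the
// default ETag only if it is available and purely alphanumeric.
func etagFor(f fs.File) (string, bool) {
	chf, ok := f.(ContentHashFile)
	if !ok {
		return "", false
	}
	h, err := chf.ContentHash()
	if err != nil || h == "" {
		return "", false // serve the response without an ETag
	}
	for _, r := range h {
		switch {
		case '0' <= r && r <= '9', 'a' <= r && r <= 'z', 'A' <= r && r <= 'Z':
		default:
			return "", false // avoid any kind of header problems
		}
	}
	return `"` + h + `"`, true // ETag values are quoted on the wire
}

func main() {
	fmt.Println(etagFor(fakeFile{hash: "9ae0daaf1918"}))
}
```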


@Merovius commented Dec 16, 2020

I feel "the returned hash should be text, such as hexadecimal or base64" is either not restrictive enough, or too restrictive. I think it would be reasonable to use the identifier in a URL, but not all base64 encodings are usable as such. So, IMO, we should either

  1. Be more restrictive and specify what usages need to be fulfilled: "usable as an ETag header", "usable as a URL fragment" (I don't know if one of these implies the other), …
  2. Be more restrictive and specify what encoding should be used: "hex", "base64.RawURLEncoding"…
  3. Be less restrictive and specify that no encoding is used and the caller needs to re-encode based on intended usage: "the returned string may contain arbitrary bytes and must be encoded by the caller based on usage" (we might also use []byte as a signal, though string allows the interface to be allocation-free)

I think the last would be the least convenient in actual usage, but the most flexible.

#43076 is similar to this issue in the proposed solution and the addressed problem (so one or the other should be closed as duplicate, I think). It contains some discussion of other use-cases and how the API would need to be more complicated to serve them - notably, it was suggested that the hash should be able to be of arbitrary length and that it should be possible to specify the required hash-algorithm (useful if used e.g. in subresource integrity). I'm not fully on board with that being needed, but I wanted to bring it up.

@johnjaylward I think O(length of file) is clearer. Note, that if you want to be nitpicky, it is oxymoronic to "run in time O(1) not O(n)", because O(1)⊆O(n). So it'd need to be "run in time O(1) not ω(1)" (or something). Sometimes, clarity takes precedence over technical correctness.

@johnjaylward commented Dec 16, 2020

@johnjaylward I think O(length of file) is clearer. Note, that if you want to be nitpicky, it is oxymoronic to "run in time O(1) not O(n)", because O(1)⊆O(n). So it'd need to be "run in time O(1) not ω(1)" (or something). Sometimes, clarity takes precedence over technical correctness.

  • O(1) is constant time regardless of input size.
  • O(n) is linear time dependent on input size of n. In this case n would be the size of the file.

A function whose execution time depends on the size of the file is not O(1); it is some function of n. If that dependence is non-linear, you'd need to name the class of function being compared (n log n, for instance).

Big O notation should be used correctly, or it should be avoided in favor of plain English. Using a mix of Big O and something else just feels wrong.

Since what I believe @rsc is trying to convey is that constant time should be guaranteed, I think plain English should be preferred.

Something like:

That is, it must run in constant time.

Optionally maybe specifying what isn't expected:

Execution time should not be relative to file size.

I'm also concerned about what is expected to happen if ContentHash returns an error before transport. Does the transport fail, or continue with no ETag?


@rsc (Contributor, Author) commented Dec 16, 2020

I'm also concerned about what is expected to happen if ContentHash returns an error before transport. Does the transport fail, or continue with no ETag?

It continues with no ETag.

@johnjaylward commented Dec 16, 2020

I am explicitly avoiding saying "constant time" because the cryptographers have redefined that term to mean something much more subtle that I do not intend here. If I had written "constant time" in a paragraph that already contains "content hash", we'd be dying on a different pointless hill.

Understood, it is more of a nitpick than anything worth dying over. I'd prefer to drop improper use of O notation from official language docs. I've not read all of Go's docs by any means, but I've not seen this use anywhere in my reading. If it is used elsewhere, then this is just one more.

@ianlancetaylor (Contributor) commented Dec 16, 2020

Will it ever be necessary to know what kind of content hash function is being used by the underlying file system? For example, perhaps a security sensitive application would be willing to accept a SHA-256 hash but not an MD5 hash.

@rsc (Contributor, Author) commented Dec 16, 2020

I don't know, and that's part of why we didn't propose anything for Go 1.16. ContentHash seems too vague at some level, but calling it ETag is too specific. We could call it something else - ContentID? - but it doesn't change the fact that the FS is in charge of which hash gets used, and the caller is expected not to be picky.

@tv42 commented Dec 17, 2020

Either the acceptable character set of string has to be defined very clearly, resulting in either a one-size-fits-nobody solution or double encoding for most uses, or encoding should be left to the party that knows what encoding is needed (my preference).

ETag content (inside the quotes) character set is actually pretty different from most other uses:

etagc = "!" / %x23-7E / obs-text
obs-text = %x80-FF

@dfinkel (Contributor) commented Dec 30, 2020

What if ContentHash() takes an argument indicating which digest/hash-algorithms it accepts?

Obviously a strawman, but how about making the interface definition something like (clearly modifying the example in the OP):

package fs

import "hash"

// A HashAlgorithm identifies a hash algorithm and constructs new instances of it.
type HashAlgorithm interface {
	NewHash() hash.Hash
	String() string
}

// A ContentHashFile is a file that can return its own content hash efficiently.
type ContentHashFile interface {
	File

	// ContentHash returns a content hash of the file that uniquely
	// identifies the file contents, suitable for use as a cache key.
	// The returned hash should be text, such as hexadecimal or base64.
	//
	// The HashAlgorithm-typed argument is a prioritized list of
	// hash algorithms, most-preferred first. (With a nil argument, the
	// implementation may return an arbitrary, but consistent,
	// implementation-defined content hash.)
	//
	// ContentHash must NOT compute the hash of the file during the call.
	// That is, it must run in time O(1) not O(length of file).
	// If a matching content hash is not already available, ContentHash
	// should return an error rather than take the time to compute one.
	ContentHash([]HashAlgorithm) (string, error)
}

Then, individual packages (crypto/md5, hash/crc32, golang.org/x/crypto/sha3) can have specific package-level instances of this interface, and implementations can do a quick equality check against the package-level variables.

I figure making the HashAlgorithm interface have a method that returns a hash.Hash makes it actually useful, and could make it easier to implement a more flexible interface that allows for calculating the hash of a file's contents at runtime, but uses a pre-computed value if it's available. (perhaps ContentHash should be called PrecomputedContentHash?)

It might make sense to define the HashAlgorithm interface in the hash package.

Note: an enum-ish type would be another option, but then individual implementations would be limited to just the types defined here.
I almost started writing an enum plus the interface to provide flexibility, but then the fs package would have method implementations pulling in every type that implements hash.Hash. It seems better to define an interface here and implement it in the individual hash packages.

@Merovius commented Dec 30, 2020

@dfinkel I think as #43076 proves, designing an API to do anything we could reasonably want from it is a very deep rabbit hole. I feel like we should at least first agree if this is a rabbit hole we want to go down, before we get ourselves stuck in it. That is, what are the use-cases we would consider hard requirements?

So far, this issue is titled "define interface for automatic ETag serving", so a priori that should be considered the only requirement. And for that, the interface you are suggesting is not just unnecessarily complex, but I would argue even counterproductive - it suggests that a ContentHashFile should use the passed in HashAlgorithms to compute the hash, when that's intentionally documented as forbidden in the original suggestion.

Of course ETag serving doesn't have to be the only requirement. But importantly, we should make a case for other use-cases to be required first, before trying to design an API for them.

@tv42 commented Dec 30, 2020

@Merovius If this use case truly were ETags, then the function discussed here should do all the proper ETag formatting, including the double quotes.

Personally, I think that's a silly use case, by itself.

@sylr commented Dec 31, 2020

HTTP client side caching is not a silly use case ...

@Merovius commented Dec 31, 2020

@tv42 I tend to agree. And, to repeat, I'm not even saying we should limit ourselves to ETag serving. I don't know where, or how to draw the line. But I think a line needs to be drawn, because most of the APIs mentioned in #43076 seem unacceptably complex for the stdlib to me (though that might just be a personal taste).

FWIW, you mentioned Subresource-integrity as another use-case in that issue and I think that would be an interesting case to at least think about.

@dfinkel (Contributor) commented Dec 31, 2020

@Merovius thanks for the pointer to #43076, I missed it before.

I'll be honest, I don't actually care that much about ETags, but I do see end-to-end checksum protection as critical for storage systems at scale. The cryptographic hashes are great for this goal, but only part of the story. I have a strong preference for crc32c in many places because it's cheap to compute the CRC of the concatenation of two blobs that you already have the CRC for (and most x86 and ARM chips have instructions to accelerate CRC32c computations explicitly).

IMO, creating an interface within io/fs that only serves ETags is a wasted opportunity (and likely confusing going forward). It's a trivial use-case for much more powerful primitives: ETag only cares about "a content identifier", while most other interesting use-cases (e.g. end-to-end checksum protection) require knowing which one, and sometimes requesting a specific one. In many cases (e.g. GCS) there are multiple digests/hashes computed, and it's useful to have access to all of them (without having to construct a new fs.FS handle).

The reason I included the NewHash() hash.Hash method in my interface proposal was to make it a more useful interface that isn't just a named interface{} or yet another name for fmt.Stringer. Given that we'd want many packages to support it outside the stdlib, and all of them have implementations of hash.Hash, it seemed appropriate to make this interface a "factory". (I'm definitely open to discussing alternatives.)

One thing to keep in mind is that availability of specific hashes for some filesystems is history-dependent. e.g. GCS (and S3) don't store an MD5 if you've used the "Compose" API to concatenate several objects. (although GCS still has a CRC, and S3 has some other value in its ETag)

@tv42 commented Dec 31, 2020

@sylr Locking into doing nothing else but ETags would, in my opinion, be silly.

Embracing Subresource Integrity (https://w3c.github.io/webappsec-subresource-integrity/) would mean the return value should be (a set of) 1) hash algo 2) digest 3) optional options. If those are kept separate, then the ETag use case can just use the truncated digest as the verifier.

package fs

type Integrity struct {
	// Algorithm as defined by Content Security Policy Level 2, section 4.2.
	Algorithm string
	Digest    []byte
	// SRI options are not supported at this time, but may be added later?
}

// A ContentHashFile is a file that can return its own content hash efficiently.
type ContentHashFile interface {
	File

	// ContentHash returns a content hash of the file that uniquely
	// identifies the file contents, suitable for use as a cache key.
	//
	// ContentHash must NOT compute the hash of the file during the call.
	// That is, it must run in time O(1) not O(length of file).
	// If a content hash is not already available, ContentHash should
	// return an error rather than take the time to compute one.
	//
	// Callers must not mutate the returned values.
	ContentHash() ([]Integrity, error)
}

@ianlancetaylor (Contributor) commented Dec 31, 2020

As far as I know, the file systems that make a hash code available only make a single hash code available. If that is true, then I don't see a good argument for requesting a specific hash algorithm, or a set of acceptable hash algorithms. It would suffice to have some way for the file system to report which hash algorithm it is using.

I don't know of any current listing of hash algorithms. So I would tentatively recommend using package paths.

// A ContentHashFile is a file that can return its own content hash efficiently.
type ContentHashFile interface {
	File
	
	// ContentHash returns a content hash of the file that uniquely
	// identifies the file contents, suitable for use as a cache key.
	// The returned hash should be text, such as hexadecimal or base64.
	// The returned algorithm should be a package path, preferably in the
	// standard library, that identifies the hash algorithm being used; for example:
	// "hash/crc32" or "crypto/sha256".
	//
	// ContentHash must NOT compute the hash of the file during the call.
	// That is, it must run in time O(1) not O(length of file).
	// If a content hash is not already available, ContentHash should
	// return an error rather than take the time to compute one.
	ContentHash() (hash, algorithm string, err error)
}

@tv42 commented Dec 31, 2020

@ianlancetaylor If you force everything to only provide just one hash digest, it's impossible to gradually migrate to another hash algorithm. You can't switch everything over to the new one when the clients don't have the support, but you can't provide the new hash along with the old hash if the API is forcing just one answer.

GCS seems to have CRC32C and MD5 at this time, and has a mechanism for adding more if they wanted to: https://cloud.google.com/storage/docs/hashes-etags https://cloud.google.com/storage/docs/xml-api/reference-headers#xgooghash

Here's AWS S3 talking about MD5 and a tree hash based on SHA256: https://docs.aws.amazon.com/amazonglacier/latest/dev/checksum-calculations.html

Min.IO is S3-compatible so they store MD5s, but it also uses the faster HighwayHash for integrity protection: https://min.io/product/overview

The world will need to move beyond MD5 at some point.

@ianlancetaylor (Contributor) commented Jan 1, 2021

Thanks. If file systems are already storing and providing multiple different kinds of hashes, then I was mistaken.

I guess one question would be whether it is ever important for users of this interface to ask for multiple hashes. When would you not want the most secure version?

And another question would be whether package paths are a reasonable way to describe the hash.

@Merovius commented Jan 1, 2021

@ianlancetaylor Personally, I don't like the idea of using package paths to describe the hash. They are easy to typo and can't be compiler-verified. I have suggested to use package-scoped values instead. That is, you'd have e.g. crypto/sha256.Handle (no idea what a good name would be) be a singleton value that can be passed to uniquely identify sha256. Third party packages can easily define their own handle that can be passed around.

Of course, the downside is that this requires an import to actually use the handle. I'm not sure how big of a problem that is, though - IMO an extra import only has a cost if it links an otherwise unused package into the binary, and I would argue that if you want to use sha256, or are able to handle/compute sha256, you already use the package. But I might be overlooking something.

@thomasf commented Jan 1, 2021

In the case of serving content from AWS S3, you probably just want to use their premade ETag outright, which differs depending on how the files are stored:

AWS DOCS:

The entity tag represents a specific version of the object. The ETag reflects changes only to the contents of an object, not its metadata. The ETag may or may not be an MD5 digest of the object data. Whether or not it is depends on how the object was created and how it is encrypted as described below:

  • Objects created through the AWS Management Console or by the PUT Object, POST Object, or Copy operation:
    • Objects encrypted by SSE-S3 or plaintext have ETags that are an MD5 digest of their data.
    • Objects encrypted by SSE-C or SSE-KMS have ETags that are not an MD5 digest of their object data.
  • Objects created by either the Multipart Upload or Part Copy operation have ETags that are not MD5 digests, regardless of the method of encryption.

@thomasf commented Jan 1, 2021

I don't know, but maybe an ETag-specific interface, specifically intended for use by http.FS in the standard library, wouldn't be such a bad idea.

@dfinkel (Contributor) commented Jan 1, 2021

@thomasf I think an ETag-specific interface within io/fs would be confusingly (and frustratingly) narrow.

As some here have pointed out, ETag is a relatively trivial use-case, since it only needs a hash, but doesn't care which one, as long as it's consistent. (strictly speaking it doesn't even need to be a content-hash, so long as it changes with the content -- a version number or commit-hash might suffice in some cases, but that's beside the point because there's already mtime handling in net/http)

There are important use-cases involving specific content-hashes, and it seems much better to have a function in net/http to create an ETag from any fs.File implementation that implements fs.ContentHashFile.

If a fs.ContentHashFile implementation wants to return a specific value for ETag, we can have a net/http.EtagHandle for the hash-type (if we're following a variant of what @Merovius and I have proposed). Then our net/http.FileETag() could pick something appropriate and do correct encoding, but all other use-cases that care which digest/hash is used (or need a specific one) can use the fs.ContentHashFile interface, and don't need to think about ETags at all. (I imagine that in the absence of an ETag-specific hash, FileETag() would pick the strongest digest available and skip over known-collision-prone hashes like CRCs, FNVs, etc.)

Note: I don't have a use-case for exporting net/http.FileETag() so it might just be part of one of the methods of net/http.FileServer.

@thomasf commented Jan 1, 2021

@dfinkel the interface only needs to be defined by http.FS and be an optional extension for those who want to serve HTTP ETags. If there is also a content hash interface (which is a separate concern), http.FS could possibly use that if the more specific ETag interface was missing. I don't think it's a good idea to stretch it so far that an interface called ContentHash starts returning ETags based on document revisions.

I guess that you could use your own http middleware to achieve things like proper forwarding of ETags as well. I haven't really thought about all the implications of the different solutions.

@Merovius commented Jan 2, 2021

@thomasf I thought about that as well. But we advocate using wrappers to massage the FS into a form we want served (e.g. using fs.Sub to strip prefixes or a 3rd party wrapper that can be dynamically configured to either serve embedded or on-disk content). So I think the interface actually needs to live in io/fs and be a well-known optional interface.

@Merovius commented Jan 2, 2021

@ianlancetaylor AIUI, subresource integrity requires the hash to be from a specific set - currently sha256, sha384, and sha512. So, even if an FS author wants to use sha3 (or another algorithm that can be reasonably claimed to be more secure than any of those), for that usecase it really needs to be one of those three and you need to be aware which algorithm is used (as the name of the algorithm needs to be part of the tag).

@tv42 commented Jan 3, 2021

@ianlancetaylor

I guess one question would be whether it is ever important for users of this interface to ask for multiple hashes. When would you not want the most secure version?

When clients verify the hash themselves, i.e. when it's not used as just an opaque identifier. Subresource Integrity is an example.

Consider: Most of current clients implement hash algo A, which is starting to show weaknesses. The crypto community is starting to feel good about hash algo B. You cannot just start serving only B digests, as most current clients wouldn't know what to do with them. Hence you provide A and B, and wait for clients to implement B before dropping A.

@rsc (Contributor, Author) commented Jan 6, 2021

It sounds like there are two concerns.

  1. The API as proposed does not let the caller request a particular implementation.
  2. The API as proposed does not let the implementation say which hash it used. Callers may care so they can double-check the result or confirm that it's one that they will accept (although that latter case bleeds into 1 a little).

For (1), I think that's a different proposal. There will be callers who don't care about the implementation. They just want any hash the file system has available. Like HTTP does for ETags. That's the focus of this proposal. I don't believe we have to engineer it more than that.

For (2), it seems like maybe at least parts of the world are converging on the syntax used by Subresource Integrity, so maybe we should strongly encourage implementations to return a string in the form algo-base64, where algo is encouraged to be "sha256", "sha384", or "sha512". (I don't really want to link to the W3C spec from the fs package but the format is simple enough.)

That makes the result self-identifying and it's still useful as an ETag (arguably more useful). So the interface would be:

// A ContentHashFile is a file that can return its own content hash efficiently.
type ContentHashFile interface {
	File
	
	// ContentHash returns a content hash of the file that uniquely
	// identifies the file contents, suitable for use as a cache key.
	// The returned hash should be of the form algorithm-base64,
	// and implementations are encouraged to use sha256, sha384, or sha512
	// as the algorithm, for interoperability with other systems.
	//
	// ContentHash must NOT compute the hash of the file during the call.
	// That is, it must run in time O(1) not O(length of file).
	// If a content hash is not already available, ContentHash should
	// return an error rather than take the time to compute one.
	ContentHash() (string, error)
}

Thoughts?
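Under that format, a consumer can cheaply recover which algorithm was used. A minimal sketch (`splitContentHash` is an invented helper, not part of the proposal):

```go
package main

import (
	"fmt"
	"strings"
)

// splitContentHash separates the proposed "algorithm-base64" form into
// its two parts. It does not validate the algorithm name or the base64;
// note that standard base64 never contains '-', so the first hyphen is
// an unambiguous separator.
func splitContentHash(s string) (algo, digest string, ok bool) {
	i := strings.IndexByte(s, '-')
	if i <= 0 || i == len(s)-1 {
		return "", "", false
	}
	return s[:i], s[i+1:], true
}

func main() {
	algo, digest, ok := splitContentHash("sha256-AbCdEf+g==")
	fmt.Println(algo, digest, ok)
}
```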

@tv42 commented Jan 7, 2021

  3. The API as proposed does not let the implementation return multiple hashes. That's needed e.g. for SRI transitions to new hash algorithms, or to let clients prefer hashes that are cheaper to verify (say, blake3).

Also, just "looks too similar to the SRI format" doesn't allow one to pass the returned value through to SRI. If the algo strings aren't what the SRI registry wants, there's going to be a remapping/standard-enforcement layer anyway, for anyone who's a careful programmer.

Combined with the "one winner only" concept, SRI's limited hash selection means you might not end up with a SRI-compatible hash digest at all.

@patrickmcnamara commented Jan 10, 2021

Is there a reason the proposed interface is very focused on hashes of file content?

ETags can be generated by any means as long as they uniquely identify a specific version of a file, as per RFC7232. The RFC also specifically states that there is no need for the client to know how the ETag is constructed.

I don’t think the interface should be limited to content hashes as some other types of ETag are common, simple and useful, such as timestamps or even just revision numbers. ETags need not be unique across different files. To cater for ETags specifically, I think the interface should allow more than just content hashes. For example, VersionedFile with a Version() (string, error) method (these are bad names but get the idea across).

If this interface is going to be used more generally and for purposes other than ETags, I don’t think that should necessarily limit its originally proposed purpose for ETags.

As well as that, this proposal only allows for ETags when serving Files but having something that works for io.Reader could be useful too although that might be outside of the scope of this proposal.

@rsc (Contributor, Author) commented Jan 13, 2021

@patrickmcnamara It's hard to produce unique identifiers across multiple servers and multiple builds of a program. The one way we know that works very well is content hashes. Also we are trying to design something that programs might want for reasons other than ETag generation.

@rsc (Contributor, Author) commented Jan 13, 2021

@tv42 Your (3) I was including as part of (1), at least as far as the answer: that's a more complex, different interface. This is the simple interface meant to be used for detecting file changes and uniquely identifying content with some reasonable hash. The format change gives a way for implementations and consumers to understand which one is being used.

So as far as "it's impossible to gradually migrate to another hash algorithm", that's out of scope here. And for the record, CRC32C should not be a valid implementation of ContentHash.

@rsc (Contributor, Author) commented Jan 13, 2021

Are there any other objections to #43223 (comment)?
Is this important enough that we should do something at all?

Would people object to dropping this entirely and not setting ETag automatically for embedded content served over net/http?

@rsc (Contributor, Author) commented Jan 13, 2021

A third-party library can provide a wrapper around http.FS that adds ETags by walking the fs.FS and pre-computing all the hashes and then using them to set ETags. The amount of embedded data is unlikely to be large enough that the pre-computation will take much time at all. (SHA256 runs at 350 MB/s on my laptop.) That would provide content-hash ETags still without any explicit "go generate" step.

Given that, here is a counter-proposal for thumbs up/thumbs down reactions:

Drop this proposal and let third-party code fill this need.

Thumbs up this comment if you think we should do nothing and abandon this issue. Thumbs down if you think it's important to do something here in the standard library. Thanks!

@Merovius commented Jan 13, 2021

ETags were brought up before the embed proposal was accepted, and the response was "we will make sure that http caching will work". It seems unfortunate to walk that back now, after the proposal has been accepted.

It also feels inelegant to me, to simply discard the hash information the compiler already adds (though TBF, I don't know why it adds it, it doesn't seem to be used, so maybe that is just an artifact of the assumption that we'd want to add ETags).

@mkilpatrick commented Jan 13, 2021

Not directly related to Go, but is tangential -

http.FileServer will not honor an ETag set via a wrapping http.Handler if the FileInfo's ModTime is not the 0 time. If you compile your Go code with Bazel the ModTime will get set to 1/1/2000 due to how Bazel deals with reproducible builds. bazelbuild/bazel#1299

@hherman1 commented Jan 13, 2021

I don't think it's super important to do this in the standard library, but it would be a nice-to-have. If the price of this feature is a new public interface in the standard library ("ContentHashFile"), then I think it is not worth it. If it can be added without any new API, then I think it's almost a pure win.
