Shorter fingerprinting #6241

Open
XhmikosR opened this issue Aug 17, 2019 · 30 comments

Comments

@XhmikosR
Contributor

XhmikosR commented Aug 17, 2019

This isn't super important, but it feels like having a 96 (sha384) or 64 (sha256) character hash is too much.

Would it be possible to offer an option to limit this, say to 7 characters? Assuming the algorithm is the same, this shouldn't result in any collisions I think?

I could try doing this myself, but I thought I'd ask.

Thanks!

@bep
Member

bep commented Aug 17, 2019

this shouldn't result in any collisions I think?

If that was true, why are they so long in the first place?

So, I agree that for certain uses of this (cache-busting) it would be fine, so we should add a length option, or something similar, to truncate it.
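As a rough illustration of what truncating the hash would mean, here is a minimal Go sketch. It assumes the fingerprint is the hex-encoded SHA-256 digest of the file content; `shortFingerprint` is a hypothetical helper for this comment, not an existing Hugo function or option.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// shortFingerprint returns the first n hex characters of the SHA-256
// digest of data. Hypothetical helper, not part of Hugo.
func shortFingerprint(data []byte, n int) string {
	sum := sha256.Sum256(data)
	full := hex.EncodeToString(sum[:]) // always 64 hex characters
	if n < 0 {
		n = 0
	}
	if n > len(full) {
		n = len(full)
	}
	return full[:n]
}

func main() {
	// Print a 7-character cache-busting suffix for some content.
	fmt.Println(shortFingerprint([]byte("hello"), 7))
}
```

The full digest would still be needed for the integrity attribute; only the part used in the filename is shortened.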

@bep bep added this to the v0.58 milestone Aug 17, 2019
@XhmikosR
Contributor Author

I meant for this specific use case :)

Thanks for considering this.
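For the cache-busting use case, the collision risk of a truncated hash can be estimated with the birthday approximation. The sketch below assumes the hash output is uniformly distributed and that 7 hex characters carry 28 bits; `collisionProbability` is a hypothetical name used only for illustration.

```go
package main

import (
	"fmt"
	"math"
)

// collisionProbability approximates the chance that at least two of n
// uniformly random values of the given bit length are equal, using the
// birthday bound p ≈ 1 - exp(-n(n-1) / 2^(bits+1)).
func collisionProbability(n int, bits int) float64 {
	if n < 2 {
		return 0
	}
	x := float64(n) * float64(n-1) / (2 * math.Pow(2, float64(bits)))
	return -math.Expm1(-x) // numerically stable form of 1 - exp(-x)
}

func main() {
	// 7 hex characters = 28 bits.
	for _, n := range []int{10, 100, 1000} {
		fmt.Printf("n=%d p=%.6f\n", n, collisionProbability(n, 28))
	}
}
```

For cache busting, a collision only matters between two versions of the same file name, so the relevant n is usually small.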

@davidsneighbour
Contributor

If this is about Subresource Integrity hashes, please keep in mind that the standard for the JS and CSS links expects SHA384 for the integrity hashes.

See https://www.w3.org/TR/SRI/

"Cache busting" is a new use for me ;) I always used date strings or random strings for that.

@bep
Member

bep commented Aug 18, 2019

Yes, we're not messing with the SRI hashes.
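For context on why the SRI value has to stay at full length: an integrity attribute is the algorithm name, a hyphen, and the base64 encoding of the raw digest. The following is a minimal sketch using SHA-384, as mentioned above; `integrity` is a hypothetical helper, not Hugo's implementation.

```go
package main

import (
	"crypto/sha512"
	"encoding/base64"
	"fmt"
)

// integrity returns an SRI-style value of the form "sha384-<base64 digest>".
// Hypothetical helper for illustration only.
func integrity(data []byte) string {
	sum := sha512.Sum384(data) // 48-byte digest
	return "sha384-" + base64.StdEncoding.EncodeToString(sum[:])
}

func main() {
	fmt.Println(integrity([]byte("console.log('hi');")))
}
```

The browser recomputes the digest and compares the whole value, so truncating it would make the check fail. A shortened hash is only usable in the file name.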

@bep bep modified the milestones: v0.58, v0.59, v0.60 Sep 2, 2019
@bep bep modified the milestones: v0.60, v0.61 Oct 21, 2019
@bep bep modified the milestones: v0.61, v0.62, v0.63 Nov 25, 2019
@bep bep modified the milestones: v0.63, v0.64 Dec 11, 2019
@bep bep modified the milestones: v0.64, v0.65 Jan 22, 2020
@bep bep modified the milestones: v0.65, v0.66 Jan 30, 2020
@bep bep modified the milestones: v0.66, v0.67 Mar 2, 2020
@bep bep modified the milestones: v0.67, v0.68 Mar 9, 2020
@bep bep modified the milestones: v0.68, v0.69 Mar 20, 2020
@bep bep modified the milestones: v0.69, v0.70 Apr 8, 2020
@bep bep added this to the v0.119.0 milestone Sep 15, 2023
@bep bep modified the milestones: v0.119.0, v0.120.0 Oct 4, 2023
@bep bep modified the milestones: v0.120.0, v0.121.0 Oct 31, 2023
@bep bep modified the milestones: v0.121.0, v0.122.0 Dec 6, 2023
@bep bep modified the milestones: v0.122.0, v0.123.0, v0.124.0 Jan 27, 2024
@bep bep modified the milestones: v0.124.0, v0.125.0 Mar 4, 2024
@akbyrd

akbyrd commented Mar 16, 2024

Adding context from my use case and an issue with current workarounds when typescript resources are renamed by js.Build.

I have a typescript file I want to compile to javascript, minify, add an SRI hash, and cache-bust with a reasonably short filename.

Given an input like main.ts, I'm looking to get this output

<script src="/main-G7Cf3v.js" integrity="sha256-..." crossorigin="anonymous"></script>

js.Build will rename the file from main.ts to main.js and will update .RelPermalink and .Permalink, but not .Name. This means I don't have a way to get the final extension without also publishing the resource. The strategy I'm currently using below mostly works, but it will end up with src="/main-G7Cf3v.ts" with .ts instead of .js.

{{ with resources.Get "main.ts" }}
	{{ $resource := . | js.Build (dict "minify" true) | fingerprint }}

	{{ $fingerprint := $resource.Data.Integrity }}
	{{ $hash := index (index (findRESubmatch `sha\d*-(.*)` $fingerprint) 0) 1 }}
	{{ $shortHash := $hash | first 6 }}
	{{ $shortHash = replace $shortHash "+" "p" }}
	{{ $shortHash = replace $shortHash "/" "f" }}
	{{ $shortHash = replace $shortHash "=" "e" }}

	{{/* $resource.Name is still main.ts instead of main.js */}}
	{{/* $resource.RelPermalink will cause a second copy to be published */}}

	{{ $newName := replaceRE `\.([^.]*\z)` (printf "-%s.$1" $shortHash) $resource.Name }}
	{{ $resource = $resource.Content | resources.FromString $newName }}

	<script src="{{ $resource.RelPermalink }}" integrity="{{ $fingerprint }}" crossorigin="anonymous"></script>
{{ end }}
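The hash-shortening steps in the template above (strip the algorithm prefix, keep the first few base64 characters, replace the characters that are awkward in file names) can be sketched in Go as follows. This is an illustration of the same logic, not Hugo code; `shortHash` is a hypothetical name.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// Same pattern as the template: capture everything after "sha<digits>-".
var integrityRE = regexp.MustCompile(`sha\d*-(.*)`)

// Replace base64 characters that are unsafe or awkward in file names,
// using the same substitutions as the template.
var fileSafe = strings.NewReplacer("+", "p", "/", "f", "=", "e")

// shortHash returns the first n characters of the base64 digest inside an
// integrity string, made file-name safe. It returns "" if the input does
// not look like an integrity value.
func shortHash(integrity string, n int) string {
	m := integrityRE.FindStringSubmatch(integrity)
	if m == nil || n <= 0 {
		return ""
	}
	digest := m[1]
	if n > len(digest) {
		n = len(digest)
	}
	return fileSafe.Replace(digest[:n])
}

func main() {
	fmt.Println(shortHash("sha256-G7+f/v=abc", 6))
}
```

Note that the substitutions map two different inputs (for example "+" and "p") to the same output, which slightly reduces the number of distinct short hashes.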

I can think of a few possible solutions:

  1. Add an option to fingerprint to limit the number of hash characters used to rename the file for cache-busting purposes.
  2. Add a method to move/rename resources. This would effectively allow .RelPermalink to be mutated.
  3. Add a way to get a resource's output name without publishing it.

I'm still new to hugo so hopefully I haven't gotten any of the details wrong.

@jmooring
Member

@akbyrd

akbyrd commented Mar 16, 2024

I don't think that provides a solution to the filename issue (though I imagine it's more efficient than using resources.FromString?).

There's no way to get the proper name of the original resource (with .js) without publishing it.

@jmooring
Member

jmooring commented Mar 16, 2024

I just tested resources.Copy with a fingerprinted file, and it retains the hash in the name. That is, at least for me, unexpected. For example:

{{ resources.Copy "js/app.js" . }} → app.f85a8953420a757cf53434fead706fb0d2864bd93c5f7294fef74d84cfbc20ce.js

So, that's not terribly helpful.

Since you know you're building a js file, is there a problem with hardcoding the extension?

Also, if you can live with a 10 character hash, pass $fingerprint through https://gohugo.io/functions/crypto/fnv32a/.

@akbyrd

akbyrd commented Mar 16, 2024

I have a partial template that I use for all my resources. It isn't specific to javascript. It also wouldn't be possible to share with others without hard-coding all possible extension changes, or adding a caveat that it won't work for anything other than ts -> js.

I don't believe the hash persisted through a copy when I tried it, but I could be misremembering.

@akbyrd

akbyrd commented Mar 16, 2024

Also, if you can live with a 10 character hash, pass $fingerprint through https://gohugo.io/functions/crypto/fnv32a/.

I believe this still has the same issue: when I name the file, how do I get the correct extension without publishing the resource with the fingerprint in the name? Or hard-coding the extension so it only works with asset types I explicitly handle?

@jmooring
Member

jmooring commented Mar 16, 2024

My point with crypto.FNV32a was that you can eliminate 5 lines of code. This has nothing to do with the extension.
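To show why this yields at most 10 characters: FNV-1a with a 32-bit output produces an unsigned 32-bit integer, whose decimal form is never longer than 10 digits. The sketch below uses Go's standard hash/fnv package and assumes Hugo's crypto.FNV32a is the standard FNV-1a 32-bit function; the helper names are hypothetical.

```go
package main

import (
	"fmt"
	"hash/fnv"
	"strconv"
)

// fnv32a returns the 32-bit FNV-1a hash of s.
func fnv32a(s string) uint32 {
	h := fnv.New32a()
	h.Write([]byte(s)) // Write on a hash.Hash never returns an error
	return h.Sum32()
}

// decimal renders the hash the way it would appear in a file name.
func decimal(h uint32) string {
	return strconv.FormatUint(uint64(h), 10)
}

func main() {
	// The input would be the integrity string of the fingerprinted resource.
	fmt.Println(decimal(fnv32a("sha256-...")))
}
```

The output needs no character replacement because it consists only of digits.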

@akbyrd

akbyrd commented Mar 16, 2024

Ah gotcha. Thanks for the tip.

@jmooring
Member

Also see #12143 for access to the extension. I'm also going to log an issue against resources.Copy, which may be rejected, but it seems to me if I say "copy to foo" it should copy to "foo".

@jmooring
Member

jmooring commented Mar 16, 2024

Finally, there's an intentionally undocumented method .Key which will give you the extension you want, but there is no API promise on this. This is a cache key that just happens to include the extension, but you might be able to use it until the key format changes.

This seems to work OK with v0.124.0:

{{ $assetPath := "ts/main.ts" }}
{{ with resources.Get $assetPath }}
  {{ $basename := path.BaseName .Name }}
  {{ with . | js.Build | fingerprint }}
    {{ $ext := path.Ext .Key }}
    {{ $hash := crypto.FNV32a .Data.Integrity }}
    {{ with resources.Copy (printf "%s-%d%s" $basename $hash $ext) . }}
      <script src="{{ .RelPermalink }}" integrity="{{ .Data.Integrity }}" crossorigin="anonymous"></script>
    {{ end }}
  {{ end }}
{{ else }}
  {{ errorf "Unable to get global asset %q" $assetPath }}
{{ end }}

This produces something like main-4229906572.js in the root of the public directory.

Use with caution. We may change the value returned by .Key at any time.
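The naming step in the template above (base name of the source, a hyphen, the numeric hash, then the output extension) can be sketched in Go like this. `targetName` is a hypothetical helper, and the extension is passed in directly, since how it is obtained depends on the unstable .Key value.

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

// targetName builds "<basename>-<hash><ext>" from a source path such as
// "ts/main.ts", a numeric hash, and the output extension (including the dot).
func targetName(src string, hash uint32, ext string) string {
	base := path.Base(src)                         // "main.ts"
	base = strings.TrimSuffix(base, path.Ext(src)) // "main"
	return fmt.Sprintf("%s-%d%s", base, hash, ext)
}

func main() {
	fmt.Println(targetName("ts/main.ts", 4229906572, ".js"))
}
```

Because only the base name is kept, the result lands in the root of the output directory, matching the behavior described above.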

@Seirdy
Contributor

Seirdy commented Mar 28, 2024

I use a quick crypto.FNV32a-based fix for short cache-busting fingerprints that doesn’t directly rely on the unstable .Key method.

I use Hugo’s crypto.FNV32a to generate a short hash, then copy the resource to a new path with that fingerprint.

{{ $resource := resources.Get . -}}
{{- $target_path_formatStr := (replaceRE `(\.[^\.]*)$` ".%d$1" .) -}}
{{- $cacheBuster := $resource.Content | crypto.FNV32a -}}
{{- $target_path := printf $target_path_formatStr $cacheBuster -}}
{{- return resources.Copy $target_path $resource -}}

You can see it used in my site’s head element. I invoke it using partialCached so the fingerprinting only happens once per resource:

{{ $icon_svg := partialCached "cache-bust.html" "/favicon.svg" "/favicon.svg" }}
{{- printf `<link rel="icon" sizes="any" href="%s" type="image/svg+xml" />` $icon_svg.RelPermalink | safeHTML }}

Here’s a snippet of the final rendered result:

<link rel="icon" sizes="any" href="/favicon.2229316949.svg" type="image/svg+xml"/>
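The path rewrite in the partial (insert the numeric hash just before the final extension) can be sketched in Go as below. It uses the same regular expression idea as the replaceRE call; `cacheBustPath` is a hypothetical name, and this is an illustration rather than what Hugo runs internally.

```go
package main

import (
	"fmt"
	"regexp"
)

// Matches the final extension, including its leading dot.
var extRE = regexp.MustCompile(`(\.[^.]*)$`)

// cacheBustPath inserts ".<hash>" before the final extension of p.
// A path without any dot is returned unchanged.
func cacheBustPath(p string, hash uint32) string {
	// ${1} re-inserts the captured extension after the hash.
	return extRE.ReplaceAllString(p, fmt.Sprintf(".%d${1}", hash))
}

func main() {
	fmt.Println(cacheBustPath("/favicon.svg", 2229316949))
}
```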

Encoding it to a higher base and using alphanumerics could shave off 1-2 characters.
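A minimal sketch of that idea in Go, using base 36 (digits plus lowercase letters), which is the largest base the standard strconv package supports. `encode` is a hypothetical helper; Hugo does not necessarily expose an equivalent template function.

```go
package main

import (
	"fmt"
	"strconv"
)

// encode renders a 32-bit hash in the given base (2 to 36).
func encode(h uint32, base int) string {
	return strconv.FormatUint(uint64(h), base)
}

func main() {
	h := uint32(2229316949) // the hash from the rendered example above
	fmt.Println(encode(h, 10), len(encode(h, 10)))
	fmt.Println(encode(h, 36), len(encode(h, 36)))
}
```

In base 36 a 32-bit value never needs more than 7 characters, compared with up to 10 in decimal.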


Originally posted on seirdy.one: See Original.

@jmooring
Member

@Seirdy With this approach you don't have an SRI hash. If you don't fingerprint, the resource's .Data.Integrity is nil. This issue is about short fingerprints in conjunction with fingerprinting.

@bep bep modified the milestones: v0.125.0, v0.126.0 Apr 23, 2024