Chunk hashes should be based on the output #2839
I am not sure if I followed your point correctly here, as imports ARE included in the final hash, just not in the initial content hash. This is why hashing works the way it does and will not take minification into account; otherwise, we would need to be able to change imported file names in minified files, which is really problematic.

Import maps will indeed allow people to solve this issue because then you do not need Rollup's algorithm at all. Instead, you can just take REAL content hashes of files without their dependencies and add the hashes as query parameters, which would be the perfect solution and enable partial redeploys of individual files.

As for your proposal, that would be a larger refactoring, but possible. To handle cycles, one could use the old algorithm for these cases and display a warning that hashes may not reflect changes in the …
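For illustration, the query-parameter idea described above could look like the following import map. This is a hypothetical sketch: the file names and hash values are made up, and each hash is the real content hash of that single file, so updating one file only changes its own entry rather than cascading through importers.

```html
<script type="importmap">
{
  "imports": {
    "./app.js": "./app.js?hash=3f2a9c01",
    "./react.js": "./react.js?hash=b81d44e7"
  }
}
</script>
```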
I see. I didn't realize there was an initial hash and then a final hash. I can look into how that can affect things, but I still believe the final hash should include the entire file contents (including import paths) otherwise it can lead to the bugs I described above.
Can you explain why this is problematic? I think it's problematic for hashing not to take minification into account, and I've actually experienced bugs as a result of this (hence opening this issue).
Understandable. I definitely realize these things take time; I just wanted to point out that minification bugs can definitely lead to issues when redeploying if minification itself doesn't affect the chunk hashes.
Basically, minification is just one of many things that can take place in the renderChunk hook, which enables plugins to freely modify a chunk. Plugins can use it to add or remove imports or entirely change its contents, e.g. transform it to a binary format. There is no way of safely changing import file names after renderChunk. As for the current hashing, the generated hashes take as much into account as possible except changes in the renderChunk hook. This includes the hashes of all dependencies, the rendered exports including their names and order, and anything added by intro/outro/banner/footer hooks. And it works well with circular dependencies.
Of course this stresses even more the value of being able to do hashing after renderChunk instead of before. But I believe combining both approaches, where the old approach serves as a fallback to handle cycles and displays a warning, might be a good way forward.
@lukastaegert @isidrok does #2921 address this?
It gives plugin authors the ability to reflect changes during …
maybe related: #3060
A same-same-but-different issue: If source files contain …
Edit: Sorry, should have read everything before. @jakearchibald I think the problem with caching is deeper than that and mostly related to how ES modules work today. Assume the listed names here are using content hashes. The code would look like:

```js
// app.js
import * as React from "./react_1.js";
const CustomComponent = () => <h1>hey</h1>;
// render, etc.
```

Say you now update React, which will change its name and thus bust the cache for `app.js`:

```js
// app.js
import * as React from "./react_2.js";
const CustomComponent = () => <h1>hey</h1>;
// render, etc.
```

Essentially this has a cascading effect when using ES modules, which makes me question whether it's worth using separate modules at all instead of one large one.
@seivan what you are talking about is cascading cache invalidation, which is indeed a pity but will always be better than having a single bundle, since you would have to invalidate that bundle's cache every time there is any change. Here is a really nice article on how to avoid it (I'm personally using import-maps with SystemJS). This issue is about calculating output hashes based on the final output, preferably only taking executable code into account.
i am working on a plugin to solve that: https://github.com/direktspeed/plugins/tree/plugin-content-hash (`@rollup/plugin-content-hash`)

Example `rollup.config.js`:

```js
import { createHash } from 'crypto';

const contentHash = (getHash) => {
  return {
    name: 'content-hash',
    generateBundle(options = {}, bundle = {}, isWrite = false) {
      if (!isWrite) {
        return;
      }
      const updateBundle = (key, value) => {
        if (!value.code) {
          // Maybe add asset support later
          return;
        }
        const currentHash = getHash(key);
        if (currentHash === value.fileName) {
          return;
        }
        const newKey = key.replace(
          currentHash,
          createHash('sha256').update(value.code).digest('hex').substring(0, 10)
        );
        console.log(currentHash, key, newKey);
        value.fileName = newKey;
        // TODO: if the file already exists we should drop it from the bundle; no need to rewrite that file
        bundle[newKey] = value;
        delete bundle[key];
        Object.values(bundle).forEach((currentValue) => {
          if (currentValue.imports.includes(key)) {
            currentValue.imports = [...currentValue.imports.filter((x) => x !== key), newKey];
          }
          if (currentValue.dynamicImports.includes(key)) {
            currentValue.dynamicImports = [...currentValue.dynamicImports.filter((x) => x !== key), newKey];
          }
          if (currentValue.implicitlyLoadedBefore.includes(key)) {
            currentValue.implicitlyLoadedBefore = [...currentValue.implicitlyLoadedBefore.filter((x) => x !== key), newKey];
          }
          if (currentValue.code.indexOf(key) > -1) {
            currentValue.code = currentValue.code.split(key).join(newKey);
            updateBundle(currentValue.fileName, currentValue);
          }
        });
      };
      for (const [key, value] of Object.entries(bundle)) {
        // only works with non-asset chunks; assets have `source` instead of `code`
        updateBundle(key, value);
      }
      console.log(bundle);
    }
  };
};

export default {
  input: 'main.js',
  output: {
    dir: 'dist',
    chunkFileNames: '[name]-[hash].js',
    format: 'systemjs'
  },
  plugins: [
    contentHash((fileName) => {
      const split = fileName.split('-');
      const hasHash = split.length > 1;
      return hasHash ? split.pop().split('.')[0] : fileName;
    })
  ]
};
```

it works at present; i think this will solve it when it is a bit more designed
Unless I missed something on a skim-read, that's still relying on Rollup to do the right thing with regard to hash cascading, right? If Rollup gets it wrong, your result will also be wrong.
@jakearchibald this simply does consistent hashing based on the final file content, after minifying and everything else.
Hm, maybe we need to clarify the test a bit. A solution to this problem shouldn't break hash cascading: https://bundlers.tooling.report/hashing/js-entry-cascade/
@jakearchibald i see; it does not break that: when a dependency changes, all dependent hashes change, so it will work right.
The JavaScript files produced by rollup have SHAs in their names that are not just generated from their final contents*. Because of this, it is tricky to determine their filename. This proposes a different approach to the file_fingerprint filter, changing it so it gets the SHA'ed filename by regexp rather than using the same hashing as the gulp task. *This issue contains some useful information on how rollup generates its hashes: rollup/rollup#2839
wow, I am amazed that this is still an issue. It gave me an hour of headache until I found this issue here. This is a big showstopper, guys; how is this not officially resolved yet?
Ping
@mileslane @BerndWessels it is on the roadmap for rollup v3+
Exactly, I am currently working on this actively on a branch
I know it has been a long time; fix at #4543
This issue has been resolved via #4543 as part of rollup@3.0.0-7. Note that this is a pre-release, so to test it, you need to install Rollup via …
This issue has been resolved via #4543 as part of rollup@3.0.0-8. Note that this is a pre-release, so to test it, you need to install Rollup via …
This issue has been resolved via #4543 as part of rollup@3.0.0. You can test it via …
Expected Behavior / Situation
Adding a minification plugin (which will change a chunk's output contents) should also change the chunk's file hash.
Actual Behavior / Situation
The chunk's hash is unchanged by minification plugins, which can result in lots of subtle bugs when deploying files to production.
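A minifier typically runs in the renderChunk hook. The crude stand-in below (hypothetical, not terser or any real plugin) shows how such a hook changes the final bytes of a chunk; the point of this issue is that the [hash] in the file name was computed before this hook ran, so the name stays the same while the content changes.

```javascript
// Hypothetical minimal "minifier" plugin: renderChunk rewrites the chunk's
// final code after the file-name hash has already been decided.
function crudeMinify() {
  return {
    name: 'crude-minify',
    renderChunk(code) {
      // Strip line comments and collapse whitespace, as a stand-in for a
      // real minifier.
      const minified = code
        .replace(/\/\/[^\n]*/g, '')
        .replace(/\s+/g, ' ')
        .trim();
      return { code: minified, map: null };
    },
  };
}

// Outside of Rollup we can call the hook directly to see the effect:
const plugin = crudeMinify();
const out = plugin.renderChunk('const x = 1; // comment\nconst  y  =  2;');
console.log(out.code); // "const x = 1; const y = 2;"
```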
For example, consider the following scenario:
Modification Proposal
I see hash determination was already discussed when code splitting was added to rollup, but I think there are some problems with that logic; specifically this part:
If the imports are not included in the content hash calculation, then partial redeploys will break.
For example, consider a case where you have chunk `A` which depends on chunk `B`, which depends on chunk `C`. If the contents of `C` change, then the hashes of all three chunks need to change. The reason is that if the contents of `C` change, it needs a new filename in order to cache-bust. But if `C` has a new filename, then the contents of `B` must also change (e.g. the `import` path) in order to load the correct version of `C`. And if the contents of `B` change, then it also needs a new filename, and so on and so forth all the way down to `A`.

As a result, I don't see any feasible way to correctly determine file hashes without considering the entire file, including its imports and their file paths.
My recommendation is that Rollup should handle this itself, after all plugins that can modify source code have run. If it processes all chunks in reverse topologically sorted order, it should be able to correctly calculate the hashes of all chunks based on their full, final contents (including import statements).
The main problem I see with this approach is how to handle circular dependencies. However, Rollup already warns on circular dependencies, so the simplest solution seems to be to continue warning, with some additional text explaining that content hashes are non-deterministic in bundles with circular dependencies.
Aside: hopefully this issue won't be as tricky to solve in the future once all browsers support import maps, because then all file hashes can just be added to the import map declaration rather than having to appear in the files themselves.