Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Ensure file.hash is always computed from sha1(file.data). #10330

Merged
merged 3 commits into from Nov 13, 2018

Conversation

Projects
None yet
1 participant
@benjamn
Copy link
Member

commented Nov 12, 2018

With the introduction of lazy compilation in Meteor 1.8, calling

inputFile.addJavaScript({
  ...
  hash: inputFile.getSourceHash(),
  ...
}, function () {
  return compiler.processFilesForTarget(inputFile);
});

becomes problematic, since inputFile.getSourceHash() is usually different from compiler.processFilesForTarget(inputFile).hash, because the latter is computed from the compiled code, whereas the former is computed from the source code.

For example, when we use file.hash to cache imported module identifiers in ImportScanner#_findImportedModuleIdentifiers, we really need to be using the hash of the compiled code, since a single source module can be compiled in different ways. If we cache based on the source hash, there's a risk of reusing the scanned imports from the web.browser version for the
web.browser.legacy version, which can lead to all sorts of problems that are only apparent in legacy browsers.

The quick fix is easy enough: BabelCompiler can simply stop including a hash in the eager options to inputFile.addJavaScript. This fix can be published as a minor update to the babel-compiler and ecmascript packages.

The remaining changes in this commit add another layer of defense against this problem, by ignoring any hash options provided by compiler plugins, in favor of simply computing the hash from the compiled data buffer. These additional changes will become available in the next release of Meteor (likely 1.8.0.1).

Ensure file.hash is always computed from sha1(file.data).
With the introduction of lazy compilation in Meteor 1.8, calling

  inputFile.addJavaScript({
    ...
    hash: inputFile.getSourceHash(),
    ...
  }, function () {
    return compiler.processFilesForTarget(inputFile);
  });

becomes problematic, since inputFile.getSourceHash() is usually different
from compiler.processFilesForTarget(inputFile).hash, because the latter is
computed from the compiled code, whereas the former is computed from the
source code.

For example, when we use file.hash to cache imported module identifiers in
ImportScanner#_findImportedModuleIdentifiers, we really need to be using
the hash of the compiled code, since a single source module can be
compiled in different ways. If we cache based on the source hash, there's
a risk of reusing the scanned imports from the web.browser version for the
web.browser.legacy version, which can lead to all sorts of problems that
are only apparent in legacy browsers.

The quick fix is easy enough: BabelCompiler can simply stop including a
hash in the eager options to inputFile.addJavaScript. This fix can be
published as a minor update to the babel-compiler and ecmascript packages.

The remaining changes in this commit add another layer of defense against
this problem, by ignoring any hash options provided by compiler plugins,
in favor of simply computing the hash from the compiled data buffer.
These additional changes will become available in the next release of
Meteor (likely 1.8.1).

@benjamn benjamn added this to the Release 1.8.1 milestone Nov 12, 2018

@benjamn benjamn self-assigned this Nov 12, 2018

@benjamn benjamn requested review from hwillson and abernix Nov 12, 2018

benjamn added some commits Nov 13, 2018

@benjamn benjamn merged commit f7bd8e9 into devel Nov 13, 2018

19 checks passed

CLA Author has signed the Meteor CLA.
Details
ci/circleci: Clean Up Your tests passed on CircleCI!
Details
ci/circleci: Docs Your tests passed on CircleCI!
Details
ci/circleci: Get Ready Your tests passed on CircleCI!
Details
ci/circleci: Isolated Tests Your tests passed on CircleCI!
Details
ci/circleci: Test Group 0 Your tests passed on CircleCI!
Details
ci/circleci: Test Group 1 Your tests passed on CircleCI!
Details
ci/circleci: Test Group 10 Your tests passed on CircleCI!
Details
ci/circleci: Test Group 2 Your tests passed on CircleCI!
Details
ci/circleci: Test Group 3 Your tests passed on CircleCI!
Details
ci/circleci: Test Group 4 Your tests passed on CircleCI!
Details
ci/circleci: Test Group 5 Your tests passed on CircleCI!
Details
ci/circleci: Test Group 6 Your tests passed on CircleCI!
Details
ci/circleci: Test Group 7 Your tests passed on CircleCI!
Details
ci/circleci: Test Group 8 Your tests passed on CircleCI!
Details
ci/circleci: Test Group 9 Your tests passed on CircleCI!
Details
continuous-integration/appveyor/pr AppVeyor build succeeded
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details
continuous-integration/travis-ci/push The Travis CI build passed
Details

@benjamn benjamn deleted the fix-findImportedModuleIdentifiers-hash-collision branch Nov 13, 2018

This was referenced Nov 14, 2018

benjamn added a commit that referenced this pull request Nov 15, 2018

benjamn added a commit that referenced this pull request Nov 20, 2018

Propagate input hashes all the way through bundling.
Hashes have a number of overlapping but not entirely redundant or
equivalent purposes within the build system.

Hashes of source code are important because they can be computed before
compilation and processing, and thus are useful as keys for caching that
expensive work. Source hashes remain useful even after compilation, as a
way of reflecting the contributions of source-code-sensitive assets like
source maps.

However, source hashes do not tell the whole story, and using them as
cache keys can be risky if the work that's being cached depends on
generated code rather than source code, as we recently discovered with the
findImportedModuleIdentifiers function. The preliminary fix for that
problem (#10330) was to cache findImportedModuleIdentifiers using a hash
of the generated code rather than the source hash.

PR #10330 swung a bit too far in the direction of ignoring source hashes
and considering only hashes of generated code. For example, the URLs of
source maps share the hash of the corresponding resource, but source maps
can change (because of superficial changes in the source code) without
changing the generated code of the resource. Ignoring the source hash when
computing source map URLs resulted in stale source maps with incorrect
line numbers.

A better solution seems to be to propagate the source hash (along with any
hashes of intermediate generated artifacts) all the way through bundling,
so that the final hash of any static resource reflects all information
that could/should change the behavior of that static resource, including
its source map, which embeds the exact source code of all contributing
files in the sourcesContent property. At every step of the way, we merge
all the input hashes into a single hash, so we don't have to keep juggling
multiple hashes, thankfully.

Sub-Resource Integrity (SRI) hashes still need to be computed from just
the final contents of a given asset, so that the browser can verify those
contents without knowing anything about the Meteor build system, but
that's handled separately.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.