Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use WHATWG's URL to implement all of source-map's URL operations. #367

Closed
wants to merge 2 commits into from

Conversation

loganfsmyth
Copy link
Contributor

This is my proposed approach for #275. While #360 is a step in the right direction, it still leaves all of the custom logic in util.relative and util.join which seems less than ideal.

As mentioned in a comment in the code, this also uses an npm dependency instead of the global URL because I found at least one case where FF's URL seems to not follow the spec at the moment. This way we can be confident for now that things actually work across platforms. The whatwg-url library is MIT. Given that source-map is currently BSD, this might also be another motivation for #362 since I don't know how distributing a bundle that is a mix of BSD and MIT code works?

This PR definitely qualifies as breaking, so a good review would be very appreciated.

@@ -236,7 +193,7 @@ exports["test computeSourceURL"] = function(assert) {
assert.equal(libUtil.computeSourceURL("src/", "test.js", "http://example.com"),
"http://example.com/src/test.js");
assert.equal(libUtil.computeSourceURL("src", "/test.js", "http://example.com"),
"http://example.com/src/test.js");
"http://example.com/test.js");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this change is wrong according to the spec. It says:

Line 4: An optional source root, useful for relocating source files on a server or removing repeated values in the “sources” entry. This value is prepended to the individual entries in the “source” field.

Then later:

If the sources are not absolute URLs after prepending of the “sourceRoot”, the sources are resolved relative to the SourceMap (like resolving script src in a html document).

That is, the source root and the source are joined with string concatenation; then URL resolution is done using the source map's URL. Here, IIUC, the source root is src and the source is /test.js, so by my reading the test was correct before this change.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is where I'm on the fence too. It depends on if we want to interpret prepend. Doing a straight string concatenation seems terrible, but it's hard to see the existing logic of adding a / if there isn't one as anything but super confusing spec-wise too, since the spec also never mentions prepending with a / when needed. So if we accept that it's supposed to be a url-like concatenation, then an absolute path does override the value in the root.

I'm fine reverting this behavior, but it does seem like this would be the ideal behavior if possible.

What I kind of want to do is land this in Nightly and see if we get complaints about broken sourcemaps, but I'm not sure if that's a good approach or what. I have a hard time thinking of a case where someone's going to have a sources entry starting with / that isn't an absolute path, but you're not wrong that it could technically exist.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess my feeling on it is that the spec is extremely under-specified, and as maintainers of this library we should use it to shepard the spec into a clearly-defined state.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The spec leaves a lot to be desired, and this was a questionable choice, but I don't think this one is ambiguous. I think if you dig through bugzilla (look at the closed bugs that block the source-maps bug) you'll find some real-world cases that motivated changes here. I suggest doing this, and also looking through the source map examples page, before making breaking changes. Also cross-testing against Chrome is useful, as Chrome's source map support is generally better than Firefox's.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fair. I see discussions in https://bugzilla.mozilla.org/show_bug.cgi?id=977463 and Lydell pretty much stating exactly what I'd have wanted behavior-wise in #62 (comment). Alas :( At least I know I'm not alone in finding the behavior frustrating.

I'm going to add a change to this PR to see if we can find a good intermediate.

Copy link
Contributor

@tromey tromey left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There were a few instances of that change. I think they all have to be fixed.

@fitzgen fitzgen removed their request for review October 18, 2018 13:55
@fitzgen
Copy link
Contributor

fitzgen commented Oct 18, 2018

Removing myself from review request since tromey hopped on it, and only one of us ever needs to r+ a PR since I trust his judgement :)

@coveralls
Copy link

Pull Request Test Coverage Report for Build 524

  • 87 of 89 (97.75%) changed or added relevant lines in 1 file are covered.
  • No unchanged relevant lines lost coverage.
  • Overall coverage decreased (-0.7%) to 87.457%

Changes Missing Coverage Covered Lines Changed/Added Lines %
lib/util.js 87 89 97.75%
Totals Coverage Status
Change from base Build 518: -0.7%
Covered Lines: 817
Relevant Lines: 913

💛 - Coveralls

@mozilla mozilla deleted a comment from coveralls Oct 18, 2018
@mozilla mozilla deleted a comment from coveralls Oct 18, 2018
@loganfsmyth
Copy link
Contributor Author

@tromey I realized I never left a comment, but this should be ready for another pass. I think the new commit I added should address the core of the joining issue while still allowing broader support for URL-like things. Thoughts?

@tromey
Copy link
Contributor

tromey commented Oct 23, 2018

To my surprise I found that Chrome also has this weird reading of the "append" text, see https://codereview.chromium.org/13898010/diff/5001/Source/devtools/front_end/SourceMap.js. Confirmed by writing a test case.

@@ -1049,26 +1049,6 @@ exports["test sourceRoot + originalPositionFor"] = async function(assert) {
map.destroy();
};

exports["test github issue #56"] = async function(assert) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems to me that this test should still work and should not be removed.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added a new sourceRoot example here that should show this in action. Works in Chrome.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a breaking change, you're not wrong, but it also seems like something that isn't necessary to support. I have a super hard time imagining a case where anyone is actually going to do this. Webpack does

sourceRoot: "webpack:///",
sources: ["/foo/bar.js"],

but that will work fine with the existing code. It seems very unlikely that someone would split up a scheme and a host into two separate pieces in a real map, and if they do that seems like something we should be discouraging.

assert.equal(libUtil.normalize("/a/b//c////d/////"), "/a/b/c/d/");
assert.equal(libUtil.normalize("///a/b//c////d/////"), "///a/b/c/d/");
assert.equal(libUtil.normalize("a/b//c////d"), "a/b/c/d");
assert.equal(libUtil.normalize("/a/b//c////d/////"), "/a/b//c////d/////");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure it matters but this doesn't seem like a "normal" form. What's the deal here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Multiple slashes in URLs have meaning, they are just empty URL pieces. For example, ../../file.js relative to http://example.com/a/b/c/d/// is http://example.com/a/b/c/d/file.js, not http://example.com/a/b/file.js.

new URL("../../file.js", "http://example.com/a/b/c/d///").href;
// "http://example.com/a/b/c/d/file.js"


assert.equal(libUtil.normalize("http://www.example.com"), "http://www.example.com");
assert.equal(libUtil.normalize(".///.././../a/b//./.."), "a/b/");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The original result here seems more correct than this one to me. What's the rationale for this change?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similar to above, the slashes have meaning.

new URL(".///.././../a/b//./..", "http://example.com/1/2/3/4/").href
// "http://example.com/1/2/3/4/a/b/"

so the path is a/b/ relative to the base.

// must behave as "some-dir/some-path.js".
//
// With this library's the transition to a more URL-focused implementation, that behavior is
// preserved here. To acheive that, we trim the "/" from absolute-path when a sourceRoot value
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Typo, "achieve"

@loganfsmyth
Copy link
Contributor Author

@tromey Would you be up for picking a time to talk over this stuff? If my proposed changes are too ambitious, at the end of the day I understand, either way I'd like to settle on something so we can move forward.

I think things are made more difficult because there are kind of two sides to source-maps in general. On one hand, we have Firefox which understandably wants to be permissive and consistent with Chrome. On the other side, we have source-map as a library for consumers to use. I probably came into this focused more on the library side, and thus more willing to drop support for things that seem like underdefined edge cases. I've had some frustrations with source-map at times and was hoping to really make source-map a solid base. As it is, there are several re-implementations of sourcemap parsing, and they all do things slightly differently around root + source handling. It seems like the community as a whole would be well-served by a well-defined, common library for consuming sourcemaps.

My biggest issue really comes down to the fact that treating the "sourceRoot" and "sources" values as opaque strings that only have meaning after concatenation means that there are other pieces of source-map that probably need a big overhaul that complicates things. Primarily, if we're expecting sourceRoot to be opaque, then realistically util.relative, util.normalize, and util.join should be avoided at all costs in the generator and source-node files. For instance, if sourceRoot isn't a URL, then given something like

newMapping.source = util.relative(sourceRoot, newMapping.source);
, how should that that be computed? It technically works right now, but that's implemented as a mess of custom string processing that is implementation-defined custom logic. That code would need to be updates to use the original sources value from the consumer's input, rather than the joined root + source value exposed as .sources on the consumer. Another example is
mapping.source = util.relative(sourceRoot, mapping.source);
There are actually existing bugs here, because there are just some maps that can't be generated safely. For example

const { SourceMapConsumer, SourceMapGenerator } = require(".");

(async function() {
  const module = await new SourceMapConsumer({
    version: 3,
    file: "built-module.js",
    sourceRoot: "webpack:///folder/",
    sources: ["module.js"],
    mappings: "AAAA",
  });

  const gen = new SourceMapGenerator({
    sourceRoot: "other/",
  });
  gen.addMapping({
    source: "built-module.js",
    generated: { line: 1, column: 0 },
    original: { line: 1, column: 0 },
  });
  gen.addMapping({
    source: "index.js",
    generated: { line: 2, column: 0 },
    original: { line: 1, column: 0 },
  });

  gen.applySourceMap(module);

  const result = gen.toJSON();
  console.log(result);
  //  { version: 3,
  //    sources: [ 'webpack:///folder/module.js', 'index.js' ],
  //    names: [],
  //    mappings: 'AAAA;ACAA',
  //    sourceRoot: 'other/' }
})();

produces a map where the joined path is other/webpack:///folder/module.js which is incorrect given that the concatenated root+source was an absolute URL. The call to applySourceMap would need to delete the sourceRoot value and change the sources entry to other/index.js.

Copy link
Contributor

@fitzgen fitzgen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM -- can you run the benchmarks with and without this PR to double check that removing the LRU cache doesn't regress perf? We might want to keep that and wrap it around the URL parsing.

In which case we would have an LRU caching URL parser ;-p

@jasonLaster
Copy link

LRU caching URL parser

anytime you can have a palindrome you should...

@loganfsmyth loganfsmyth changed the base branch from normalize-consumer-urls to master November 13, 2018 23:03
@loganfsmyth
Copy link
Contributor Author

loganfsmyth commented Nov 15, 2018

This got auto-closed because I deleted its base branch. Github won't let me reopen even though I fixed that, so I've opened #371.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants