Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

#265: Replace PhantomJS with jsdom #280

Merged
merged 16 commits into from Jul 7, 2017
Merged

#265: Replace PhantomJS with jsdom #280

merged 16 commits into from Jul 7, 2017

Conversation

vseventer
Copy link
Contributor

@vseventer vseventer commented Dec 31, 2016

See discussion at #265 for background.

@RyanZim
Copy link
Member

RyanZim commented Dec 31, 2016

@vseventer Can you force-push to re-trigger the travis build?

@coveralls
Copy link

coveralls commented Dec 31, 2016

Coverage Status

Coverage increased (+0.2%) to 96.983% when pulling 6258e6c on vseventer:jsdom into 37730d3 on giakki:master.

@RyanZim
Copy link
Member

RyanZim commented Dec 31, 2016

@giakki @XhmikosR @mikelambert Thoughts on this?

@RyanZim
Copy link
Member

RyanZim commented Jan 19, 2017

Current status of this: #265 (comment)

@RyanZim
Copy link
Member

RyanZim commented Jan 19, 2017

You can test this branch with this package: https://www.npmjs.com/package/uncss-jsdom.

@gkatsanos
Copy link

jsdom saved me from 2h of trying to find why phantomjs wasnt doing the job. please merge it :)

@davidtheclark
Copy link
Contributor

@RyanZim I tried out this branch and had a problem. JSDom did not find an asset (a JS file) that PhantomJS was loading without trouble. I wonder if there's something wrong with the asset lookup.

@mikelambert
Copy link
Collaborator

What is an "asset" specifically, and how are you loading it? Is it a basic <script> tag? Anything interesting in how it's specified or used? Any repro case you can link to?

@davidtheclark
Copy link
Contributor

Here is a simple test repo that reproduces the problem I was seeing: https://github.com/davidtheclark/uncss-jsdom-test

With the PhantomJS version, this test works. With uncss-jsdom, I see this error:

Error: Could not load script: "file:///test.js"
    at /Users/davidclark/Downloads/uncss-jsdom-test/node_modules/jsdom/lib/jsdom/browser/resource-loader.js:44:23
    at Object.check (/Users/davidclark/Downloads/uncss-jsdom-test/node_modules/jsdom/lib/jsdom/living/nodes/Document-impl.js:88:11)
    at ReadStream.<anonymous> (/Users/davidclark/Downloads/uncss-jsdom-test/node_modules/jsdom/lib/jsdom/living/nodes/Document-impl.js:107:12)
    at ReadStream.wrappedEnqueued (/Users/davidclark/Downloads/uncss-jsdom-test/node_modules/jsdom/lib/jsdom/browser/resource-loader.js:255:16)
    at emitOne (events.js:77:13)
    at ReadStream.emit (events.js:169:7)
    at fs.js:1678:12
    at FSReqWrap.oncomplete (fs.js:82:15) { [Error: ENOENT: no such file or directory, open '/test.js'] errno: -2, code: 'ENOENT', syscall: 'open', path: '/test.js' }

@mikelambert sorry if the terminology wasn't clear. By "asset" I meant a file that is referenced with a <script> or <link> or something similar. This case is a basic <script> tag.

@RyanZim
Copy link
Member

RyanZim commented Feb 21, 2017

@davidtheclark The issue is the leading / in the asset path. Since assets are loaded via file://, a leading / bases the path off of the OS root directory.

The PhantomJS uncss had couldn't load these assets either, however, it failed silently, jsdom correctly reports the error.

You need to use htmlroot

@davidtheclark
Copy link
Contributor

davidtheclark commented Feb 21, 2017

@RyanZim I tried it with htmlroot and got the same error. Do you not get that error with the test repo?

@vseventer
Copy link
Contributor Author

vseventer commented Feb 21, 2017

uncss allows /css/main.css, so there is no reason why /js/main.js shouldn't be supported also. This could be fixed by having htmlroot (or, more precise, utility.parsePath) apply to stylesheets and scripts (currently, it only applies to stylesheets).

Don't think this is a particular issue with jsdom, especially if PhantomJS ignores the JS file anyway. But something that should be fixed regardless, imo.

@davidtheclark
Copy link
Contributor

@vseventer @RyanZim Thanks for chiming in! I opened up #287 for this issue. I guess it still seems to me that this problem needs to be addressed before the JSDom implementation merges, or else the JSDom implementation should deliver a warning but not throw an error when it encounters root-relative paths. If the behavior is not changed, then a very common and widespread practice will break uncss, right?

@RyanZim
Copy link
Member

RyanZim commented Feb 22, 2017

If the behavior is not changed, then a very common and widespread practice will break uncss, right?

Yes and No. PhantomJS would quietly output without processing the JS; jsdom fails hard (which is better IMO). Either way, this should be addressed (see #287 (comment)).

To be clear, switching to jsdom was always intended to be a breaking change.

@davidtheclark
Copy link
Contributor

davidtheclark commented Feb 22, 2017

@RyanZim: I agree that it's important to bring to the user's attention the fact that their JS is not loading; but a hard failure would mean that we suddenly can't use uncss with our valid HTML that worked before, right? For my case, as one example, I do not care whether the JS loads, would even prefer to be able to tell uncss not to try to load it.

Just wanted to make sure I was being clear about my concern. Thanks for your work on the transition!

@ArmorDarks
Copy link

Nice work!

Any estimations when it will be released?

@ArmorDarks
Copy link

We've tested this branch on our relatively large project, which used PhantomJS-based uncss before.

Everything seems to work perfectly, except few things:

  1. Already mentioned issue with relative paths to js files

    Most annoying that so far it is unclear how to bypass this error.

    Our file:

    <script src='/assets/scripts/main.min.js'></script>

    Results in error

    Error: Could not load script: "file:///J:/assets/scripts/main.min.js"
      at J:\Work\FASenergo\github\node_modules\jsdom\lib\jsdom\browser\resource-loader.js:44:23
      at Object.check (J:\Work\FASenergo\github\node_modules\jsdom\lib\jsdom\living\nodes\Document-impl.js:88:11)
      at Object.check (J:\Work\FASenergo\github\node_modules\jsdom\lib\jsdom\living\nodes\Document-impl.js:91:23)
      at J:\Work\FASenergo\github\node_modules\jsdom\lib\jsdom\living\nodes\Document-impl.js:107:12
      at wrappedEnqueued (J:\Work\FASenergo\github\node_modules\jsdom\lib\jsdom\browser\resource-loader.js:255:16)  at Request.request [as _callback] (J:\Work\FASenergo\github\node_modules\jsdom\lib\jsdom\browser\resource-loader.js:203:9)
      at Request.self.callback (J:\Work\FASenergo\github\node_modules\jsdom\node_modules\request\request.js:186:22)
      at emitTwo (events.js:106:13)
      at Request.emit (events.js:192:7)
      at Request.<anonymous> (J:\Work\FASenergo\github\node_modules\jsdom\node_modules\request\request.js:1081:10)  at emitOne (events.js:96:13)
      at Request.emit (events.js:189:7)
      at Gunzip.<anonymous> (J:\Work\FASenergo\github\node_modules\jsdom\node_modules\request\request.js:1001:12)
      at Object.onceWrapper (events.js:291:19)
      at emitNone (events.js:91:20)
      at Gunzip.emit (events.js:186:7)
      at endReadableNT (_stream_readable.js:974:12)
      at _combinedTickCallback (internal/process/next_tick.js:74:11)
      at process._tickCallback (internal/process/next_tick.js:98:9)
     { Error: ENOENT: no such file or directory, open 'J:\assets\scripts\main.min.js'
    
      errno: -4058,
      code: 'ENOENT',
      syscall: 'open',
      path: 'J:\\assets\\scripts\\main.min.js' }
    
  2. If we delete links to js files, everything works fine. However, for some reason this class gets stripped in jsdom version:

    svg:not(:root) { overflow: hidden; }

    I wonder why? It doesn't render :root properly?

However, there are also few positive things:

  1. It consumes two times less memory

  2. Turned out that PhantomJS version wrongly stripped those classes:

    @media print {
      *,
      *:before,
      *:after,
      *:first-letter, /* this */
      *:first-line /* and this */ { background: transparent !important; ... }

    In jsdom version they preserved, as expected.

@RyanZim
Copy link
Member

RyanZim commented Feb 28, 2017

@ArmorDarks

  1. WIP PR to fix JS issue: https://github.com/vseventer/uncss/pull/2.
  2. Hmm... needs investigation. Looks like a bug in jsdom. Edit: jsdom issue filed: document.querySelector('svg:not(:root)') inconsistency jsdom/jsdom#1750

@ArmorDarks
Copy link

Sorry, my bad; htmlroot can be relative to cwd.

Yeap, tested with full paths — seems to be no difference.

I thought that issue might be related to Grunt task. But I desicted it and simplified to plain example, and it made no difference too.

Here is simplified case:

Grunt task:

const uncss = require('uncss-jsdom')

module.exports = function (grunt) {
  grunt.registerMultiTask('uncss', 'Remove unused CSS', function () {
    const done = this.async()

    uncss('build/index.html', { htmlroot: 'build' }, function () {
      done()
    })
  })
}

/build/index.html:

<!doctype html>
<html>

  <head>
    <link rel='stylesheet' href='/assets/styles/style.css'>
  </head>

  <body>
    <script src='/assets/scripts/main.js'></script>
  </body>

</html>

build/assets/styles/style.css:

.test { border: 0 }

build/assets/scripts/main.js:

console.log('Hi!')

You can see full structure in Kotsu repository (though, you will need to build it to see build directory). In fact, you can install Kotsu, then go into /node_modules/grunt-uncss/tasks and change require inside js file to uncss-jsdom, and test whole thing with Kotsu to see same error.

Hm, I also peeked into error message again and thought — maybe it's related to OS-specific path normalization? Windows handles paths differently from Linux (uses \ for paths instead of /), and I'm exactly on Windows 10...

@vseventer
Copy link
Contributor Author

vseventer commented Apr 5, 2017

@ArmorDarks It might be - I'm using the test files (on Mac) as you described and Hi! gets logged to the console. I will try the same on my Windows machine later today, and see if that makes a difference.

@ArmorDarks
Copy link

So sad without this PR :(

@vseventer Did you have any time to do test on PC?

@vseventer
Copy link
Contributor Author

@ArmorDarks Sorry, this escaped my attention, I'm on Mac so this is not high priority for me. That said, I'll try and get to it in the next couple of days.

@mikelambert
Copy link
Collaborator

So this has been sitting around awhile, while I left it unintegrated hoping for more testing (didn't realize it would take so long and lose momentum, sorry!)

AFAIK, these are the only two known issues?

Is it worth adding a built-in ignore to catch the svg/root issue? I'm not clear on this syntax, to know if a valid uncss ignore regex can be built that catches all these cases? (Then we could ship without breaking clients, and could just remove the built-in ignore when jsdom/jsdom#1750 gets fixed.)

And assuming it is broken on Windows (and not something unique to @ArmorDarks setup), I would consider that blocking bug (even though I also am only on Mac). @ArmorDarks , do you think you'd have any time to dig into the uncss/jsom code yourself and see where things are going wrong and where a patch might be devised? Otherwise we're just hoping @vseventer can find the time to do more testing... :)

@ArmorDarks
Copy link

@mikelambert Yeap, this seems to be right

AFAIK, these are the only two known issues?

So far yes, That's the only ones we were able to find on our codebase. It should cover most part of use cases, but we don't use some part of CSS at all (for instance, deeply nested selectors with high specificity), so I can only hope that it won't break anything on more tricky codebases. But I think we'll never know untill will let it go into the wild, since it seems that not much people finding that PR and even less doing testing.

Is it worth adding a built-in ignore to catch the svg/root issue? I'm not clear on this syntax, to know if a valid uncss ignore regex can be built that catches all these cases? (Then we could ship without breaking clients, and could just remove the built-in ignore when jsdom/jsdom#1750 gets fixed.)

👍

And assuming it is broken on Windows (and not something unique to @ArmorDarks setup), I would consider that blocking bug (even though I also am only on Mac). @ArmorDarks , do you think you'd have any time to dig into the uncss/jsom code yourself and see where things are going wrong and where a patch might be devised? Otherwise we're just hoping @vseventer can find the time to do more testing... :)

Well, I can try to do something, but I'm not very skilled JavaScript developer. So far I'm pretty much sure that it's related to Windows-specific paths formatting (\ in paths instead of /).

It would be great if @vseventer or someone else on PC would finally test it, so we can be sure that it isn't related to my specific configuration, though, I tried to make build with stripped everything else, to minimize the effect of possible configuration flaws.

@RyanZim
Copy link
Member

RyanZim commented Jun 13, 2017

@mikelambert That svg root problem is actually a hack in normalize css for an old version of IE (I think IE9?) You'd never use a selector like that in normal real life. ignore: ['svg:not(:root)'] might do the job.

Windows does need fixed, but I'm stranded on a Linux box, so can't help there.

@vseventer
Copy link
Contributor Author

I'm able to reproduce the issue on Windows - a quick peek shows that the drive letter in Windows is the issue. (Root-)relative paths will always have the drive letter in front, so prepending the htmlroot will cause something like <htmlroot>/C:/<relative path>, while on Mac/Unix, we get the correct <htmlroot>/<relative path>.

Unfortunately, JSDom does some manipulation on the script src before exposing it, preventing me from reading out the actual script src attribute value without drive letter.

I will do some further testing later this week.

@vseventer
Copy link
Contributor Author

vseventer commented Jun 21, 2017

I added a commit which greatly simplifies the resourceLoader. Tests pass in both Mac and Windows.

@ArmorDarks - maybe you can give it a go to see if it works for you?

@ArmorDarks
Copy link

ArmorDarks commented Jun 21, 2017

Hi

Thanks for update!

Issue with local js-files finally fixed, thanks!

But there are a couple of other issues:

  1. jsdom tries to execute everything on page, including JavaScript. If any JS file fails, obviously, jsdom will report about that issue and may fail. From one side, it is cool, since serves as additional testing tool, but the bad side comes exactly from its good side — testing should be done explicitly whenever developers chooses too, but, instead, uncss will enforce testing of all JS.

    Not sure we can do anything about it, though.

    I guess after merge of this PR we will receive a lot of issues in uncss repository like "ahhh, uncss doesn't work, tons of errors!", since it will unexpectedly shed light on all errors of local and 3rd party scripts, which won't be able properly execute in jsdom...

  2. Loading of remote scripts (when you have something like src='https/... somescript.js') isn't super-reliable and rarely, but fails. Well, nothing can be done about it, I guess

  3. There seems to be quite specific issue with scripts, which injects something and relying on window location.

    For instance, here is <script> which serves as loader of 3rd-party script (sorry for minification, it is as provided by service)

    <script>
      ! function(t) {
        function e() {
          i = document.querySelectorAll(".button-widget-open");
          for (var e = 0; e < i.length; e++) "true" != i[e].getAttribute("init") && (options = JSON.parse(i[e].closest('.' + t).getAttribute("data-settings")), i[e].setAttribute("onclick", "alert('" + options.errorMessage + "(0000)'); return false;"))
        }
    
        function o(t, e, o, n, i, r) {
          var s = document.createElement(t);
          for (var a in e) s.setAttribute(a, e[a]);
          s.readyState ? s.onreadystatechange = o : (s.onload = n, s.onerror = i), r(s)
        }
    
        function n() {
          for (var t = 0; t < i.length; t++) {
            var e = i[t];
            if ("true" != e.getAttribute("init")) {
              options = JSON.parse(e.getAttribute("data-settings"));
              var o = new MangoWidget({
                host: window.location.protocol + '//' + options.host,
                id: options.id,
                elem: e,
                message: options.errorMessage
              });
              o.initWidget(), e.setAttribute("init", "true"), i[t].setAttribute("onclick", "")
            }
          }
        }
    
        host = window.location.protocol + "//widgets.mango-office.ru/";
        var i = document.getElementsByClassName(t);
    
        o("link", {
          rel: "stylesheet",
          type: "text/css",
          href: host + "css/widget-button.css"
        }, function() {}, function() {}, e, function(t) {
          document.documentElement.insertBefore(t, document.documentElement.firstChild)
        }), o("script", {
          type: "text/javascript",
          src: host + "widgets/mango-callback.js"
        }, function() {
          ("complete" == this.readyState || "loaded" == this.readyState) && n()
        }, n, e, function(t) {
          document.documentElement.appendChild(t)
        })
      }("mango-callback");
    </script>

    And it results in quite specific situation, when jsdom tries to load that resource as a local one:

    2222

    Obviously, issue comes from window.location.protocol + '//' + options.host concatenation. I guess jsdom places as location file protocol, and it results in error. I wonder, can anything be done about this?

    In this example we able to replace window location with hardcoded protocol, but it isn't always possible when script loaded from 3rd party.

  4. jsdom seems to be not handling relative protocols

    <script src='//widget.mango-office.ru/js/c-t-w.min.js'></script>

    Will result in error:

    333

    I guess because jsdom tries to resolve everything to local filesystem...

    Not big deal for us, I think those days relative protocols shouldn't be used anyway, but on some people it might come unexpectedly.

    Also, in some cases you're unable to change relative protocol. For example, when you're fetching script from 3rd party.

@vseventer
Copy link
Contributor Author

I think we're pushing the limits of jsdom here. Regarding 1 and 2, ideally scripts should be executed within a sandbox, but don't think this is a priority for the jsdom maintainers. Given that jsdom loads the source from disk, yeah, it uses the file:// protocol, causing issues 3 and 4 you mentioned.

Honestly, the issues you mention should be resolved. Not sure how, though.

@mikelambert
Copy link
Collaborator

Are 1 and 2 warnings, or failures? Warning about bad loads and missing resources is fine. Erroring-out and dying completely, is not fine. If the latter, can you give an example stacktrace? I feel like page-resource loads should be executed within a try/catch in jsdom, to allow the page to finish loading as best it can...

3 and 4, I don't believe we can do anything about. And I don't believe this is anything "new" about jsdom. Phantomjs loading a file from local disk, would have the same issue, right? (It possibly was too-silent about errors, and maybe the file just went missing when run from PhantomJS)

@RyanZim
Copy link
Member

RyanZim commented Jun 21, 2017

Phantomjs loading a file from local disk, would have the same issue, right? (It possibly was too-silent about errors, and maybe the file just went missing when run from PhantomJS)

I think that's what was happening; haven't verified though.

@ArmorDarks
Copy link

All errors aren't warning, they are full errors with stack trace, but they do not crash jsdom, so uncss indeed can successfully finish. That's great!

However, that's true only if it is impossible to load or execute properly remote script, but not CSS.

For instance, one of 3rd party scripts containing following line:

window.location.protocol + "//widgets.mango-office.ru/css/widget-button.css"

or

"//widgets.mango-office.ru/css/widget-button.css"

will result in

Fatal error: UnCSS: could not open J:\Work\FASenergo\github\build\widgets.mango-office.ru\css\widget-button.css

Easy fix is to change relative protocol to static https, but. as I said, not all users will have access to 3rd party script content, and this concatenation, unfortunately, is very common (anti) pattern.

Though, I guess, it can be fixed by simply ignoring such specific CSS files with something like /widgets.mango-office.ru/.

Also, in general, situation with window.location.protocol and relative protocols feels a bit regretful and a lot of users will bash into it, but so far I can't see too what we can do about it.

@mikelambert
Copy link
Collaborator

So I wouldn't call it an anti-pattern. It certainly sucks for file:// based pages. But for any website (or script library!) that needs to work on both http and https, scheme-relative paths are actually the correct solution. But yes, it's an unfortunate consequence how they break with file:// pages. :(

As far as CSS error-breakage you list above, this is actually an UnCSS error, not a jsdom error. See https://github.com/giakki/uncss/blob/master/src/utility.js#L123 . So on the bright side, this error is:

  • much easier to fix if we want
  • should not be related to the jsom-based implementation, so not a blocking issue for release

Just want to check though, does your hard-breakage error occur even with the original PhantomJS implementation? (It shouldn't)

Looks like the code currently is designed to raise an error when attempting to load a local (relative) file that does not exist. I can see the pros of making local uncss references failures, but also the cons since scheme-relative URLs become local references in a way that I think is difficult for uncss to detect...

@ArmorDarks
Copy link

ArmorDarks commented Jun 24, 2017

So I wouldn't call it an anti-pattern. It certainly sucks for file:// based pages. But for any website (or script library!) that needs to work on both http and https, scheme-relative paths are actually the correct solution. But yes, it's an unfortunate consequence how they break with file:// pages. :(

I definitely had to avoid mention about "anti" part to avoid this offtopic. Not, it isn't good solution. Whenever you have https resource, use https. There is no harm in serving https resource to http-served page, but there is if you use http one on https-served page. If you don't have https resource, relative protocol won't make situation with http resources served on https page any better. I think conclusions are quite straightforward.

Anyway, it doesn't make any point, since a lot of sites still using it and we have to live with it for now.

Just want to check though, does your hard-breakage error occur even with the original PhantomJS implementation? (It shouldn't)

PhantomJS version raises identical error:

conemu_2017-06-24_14-44-37

So, in this matter jsdom doesn't bring any breaking changes.

@RyanZim
Copy link
Member

RyanZim commented Jun 26, 2017

@mikelambert What else needs done here? (other than fixing merge conflicts)

@mikelambert
Copy link
Collaborator

Ahhh sorry for the delay.

@ArmorDarks Thanks, good points there, I should have known that. :)

@RyanZim Nope, don't think anything else is required. Does someone want to fix up the merge conflicts, and then we can try a major-semver release with a few pending changes?

@RyanZim
Copy link
Member

RyanZim commented Jul 6, 2017

@vseventer Can you fix merge conflicts? (or should someone else do it)

@coveralls
Copy link

Coverage Status

Coverage increased (+0.2%) to 96.983% when pulling 6dffb72 on vseventer:jsdom into 7d31a1f on giakki:master.

2 similar comments
@coveralls
Copy link

Coverage Status

Coverage increased (+0.2%) to 96.983% when pulling 6dffb72 on vseventer:jsdom into 7d31a1f on giakki:master.

@coveralls
Copy link

Coverage Status

Coverage increased (+0.2%) to 96.983% when pulling 6dffb72 on vseventer:jsdom into 7d31a1f on giakki:master.

@mikelambert mikelambert merged commit 5852307 into uncss:master Jul 7, 2017
@ArmorDarks
Copy link

Just for the note, with jsdom uncss for some reason also strips this CSS:

::placeholder {
    color: #5a6066;
    font-family: inherit;
    font-size: 16px;
    font-style: inherit
}

While browsers-specific placeholders will be left intact:

::-webkit-input-placeholder {
    color: #5a6066;
    font-family: inherit;
    font-size: 16px;
    font-style: inherit
}
:-ms-input-placeholder {
    color: #5a6066;
    font-family: inherit;
    font-size: 16px;
    font-style: inherit
}

Though, jsdom seems to work with placeholder pseudo-selector.

As of right now, this is easily workaroundable with ignore rule.

ArmorDarks added a commit to LotusTM/Kotsu that referenced this pull request Aug 6, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

7 participants