Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Ability to ignore js script in html #321

Open
armpogart opened this issue Jul 14, 2017 · 23 comments
Open

Ability to ignore js script in html #321

armpogart opened this issue Jul 14, 2017 · 23 comments

Comments

@armpogart
Copy link

I have following js snippet for Google Analytics at the end of my html file:

<script>
    (function(i,s,o,g,r,a,m){i['GoogleAnalyticsObject']=r;i[r]=i[r]||function(){
    (i[r].q=i[r].q||[]).push(arguments)},i[r].l=1*new Date();a=s.createElement(o),
    m=s.getElementsByTagName(o)[0];a.async=1;a.src=g;m.parentNode.insertBefore(a,m)
    })(window,document,'script','//www.google-analytics.com/analytics.js','ga');
    ga('create', 'UA-XXXXX-X', 'auto');
    ga('send', 'pageview');
</script>

and uncss tries to load that js file, which it couldn't as you can see there is no protocol specified there, and even if there's protocol there, I don't see a reason to parse Google Analytics JS for uncss. So the question: is there a way (some uncss specific comment) to ignore that js script.

The stack trace is:

Error: Could not load script: "file://www.google-analytics.com/analytics.js"
    at *\node_modules\jsdom\lib\jsdom\browser\resource-loader.js:44:23
    at Object.check (*\node_modules\jsdom\lib\jsdom\living\nodes\Document-impl.js:89:11)
    at Object.check (*\node_modules\jsdom\lib\jsdom\living\nodes\Document-impl.js:92:23)
    at Object.check (*\node_modules\jsdom\lib\jsdom\living\nodes\Document-impl.js:92:23)
    at Object.check (*\node_modules\jsdom\lib\jsdom\living\nodes\Document-impl.js:92:23)
    at *\node_modules\jsdom\lib\jsdom\living\nodes\Document-impl.js:108:12
    at wrappedEnqueued (*\node_modules\jsdom\lib\jsdom\browser\resource-loader.js:255:16)
    at ReadStream.readableStream.on (*\node_modules\jsdom\lib\jsdom\browser\resource-loader.js:76:7)
    at emitNone (events.js:110:20)
    at ReadStream.emit (events.js:207:7)
    at endReadableNT (_stream_readable.js:1047:12)
    at _combinedTickCallback (internal/process/next_tick.js:102:11)
    at process._tickCallback (internal/process/next_tick.js:161:9) Error: Invalid protocol: file:
    at Request.init (*\node_modules\jsdom\node_modules\request\request.js:460:31)
    at new Request (*\node_modules\jsdom\node_modules\request\request.js:130:8)
    at request (*\node_modules\jsdom\node_modules\request\index.js:54:10)
    at Object.exports.download (*\node_modules\jsdom\lib\jsdom\browser\resource-loader.js:185:15)
    at fetch (*\node_modules\jsdom\lib\jsdom\browser\resource-loader.js:136:20)
    at Object.exports.load (*\node_modules\jsdom\lib\jsdom\browser\resource-loader.js:270:11)
    at HTMLScriptElementImpl._attach (*\node_modules\jsdom\lib\jsdom\living\nodes\HTMLScriptElement-impl.js:35:22)
    at HTMLBodyElementImpl.insertBefore (*\node_modules\jsdom\lib\jsdom\living\nodes\Node-impl.js:227:22)
    at HTMLBodyElement.insertBefore (*\node_modules\jsdom\lib\jsdom\living\generated\Node.js:118:58)
    at file:///D:/web/html/frontend-boilerplate/src/index.html:5:67
    at file:///D:/web/html/frontend-boilerplate/src/index.html:6:7
    at ContextifyScript.Script.runInContext (vm.js:53:29)
    at Object.runInContext (vm.js:108:6)
    at processJavaScript (*\node_modules\jsdom\lib\jsdom\living\nodes\HTMLScriptElement-impl.js:128:10)
    at HTMLScriptElementImpl._eval (*\node_modules\jsdom\lib\jsdom\living\nodes\HTMLScriptElement-impl.js:65:7)
    at *\node_modules\jsdom\lib\jsdom\browser\resource-loader.js:31:22
@RyanZim
Copy link
Member

RyanZim commented Jul 14, 2017

There's no way to ignore it, other than commenting it out AFAIK. Personally, I'd just switch to https://; // isn't really needed anymore.

@armpogart
Copy link
Author

@RyanZim I agree that nowadays it is safe to switch to https://, but anyways I don't see the need for it to parse Google Analytics. It would be better performance wise if we could just specify (with some begin and end comment) that it is unnecessary to parse it.

@RyanZim
Copy link
Member

RyanZim commented Jul 14, 2017

@armpogart We're using jsdom under the hood, so I'm not sure how this would be possible.

@armpogart
Copy link
Author

@RyanZim Does uncss give a way to configure jsdom as it had skipExternalResources property in old api which is still supported (as stated in their main documentation). If so we could give it regex and document the ability to ignore unneeded resources.

@mikelambert
Copy link
Collaborator

mikelambert commented Jul 14, 2017 via email

@XhmikosR
Copy link
Member

This will break many projects that still use protocol relative URLs. Isn't there any way to ignore specific HTML from being parsed?

@RyanZim
Copy link
Member

RyanZim commented Jul 18, 2017

Isn't there any way to ignore specific HTML from being parsed?

<!-- ignore me -->

IIRC, uncss continues despite this error, the script just isn't loaded. PhantomJS had the same issue, it just was too opaque to report the error. You're going to have the exact same problem if you open a local html file with a normal web browser. One possible workaround is to run a small http server, and run uncss on localhost.

@armpogart
Copy link
Author

I still think the problem is not only on loading relative url. Why to load it in the first place, there can be many scripts that are there for other purposes and we don't need to load them (parse them) during uncss. So I still think it would be valid feature to have a way to ignore it.
Yeah, I agree that that we could simply delete it during uncss, but we don't do html preprocessing usually during css stage, for example when running gulp task and browserify, we may do css stage (which can compile sass, uncss, ...) but we don't preprocess html during serving dev, we usually process html only during build.

@RyanZim
Copy link
Member

RyanZim commented Aug 9, 2017

Thinking a bit more about this, perhaps we could have a script blacklist implemented in the resource loader.

@RyanZim RyanZim reopened this Aug 9, 2017
@ArmorDarks
Copy link

Why not delete it in the source html file, before you pass it to uncss?

The most prominent issue with that — it isn't always possible. A lot of 3rd-party scripts will load their own dependencies and additional files. You might need part of them, but the other half will yield those errors, or load something you didn't want to, at least affecting badly performance.

Besides, with large amount of pages there is an issue with loading remote resources that you don't want — they start to fail (for instance, not all servers like 300 requests in a row), and this might lead to erroring in resources that you really wanted. But because my statement above, simple deletion of remote resources would not always solve the problem.

Here said that skipExternalResources will continue to work, and for now it should be enough to fulfill current issue.

Also, here jsdom/jsdom#1567 said that another way is to use custom resource loader, though, I'm not aware how more difficult it is.

@RyanZim
Copy link
Member

RyanZim commented Aug 16, 2017

We're already using a custom resource loader; we just need to add this feature to our custom loader.

@ArmorDarks
Copy link

That would be really great

@FiniteLooper
Copy link

FiniteLooper commented Aug 30, 2017

I use a templating system and I have this line in my HTML template

<script src="{applicationPath}/assets/scripts.2.5.0.js"></script>

I was using UnCSS 0.14.1 before, and it had no problems. Now on 0.15.0 is produces this error:

Which then produces this error

Error: Could not load script: "file:///C:/my-project/templates/%7BapplicationPath%7D/assets/scripts.2.5.0.js"
    at C:\my-project\node_modules\jsdom\lib\jsdom\browser\resource-loader.js:44:23
    at Object.check (C:\my-project\node_modules\jsdom\lib\jsdom\living\nodes\Document-impl.js:89:11)
    at ReadStream.<anonymous> (C:\my-project\node_modules\jsdom\lib\jsdom\living\nodes\Document-impl.js:108:12)
    at ReadStream.wrappedEnqueued (C:\my-project\node_modules\jsdom\lib\jsdom\browser\resource-loader.js:255:16)
    at emitOne (events.js:115:13)
    at ReadStream.emit (events.js:210:7)
    at fs.js:1940:12
    at FSReqWrap.oncomplete (fs.js:135:15) { Error: ENOENT: no such file or directory, open 'C:\my-project\templates\%7BapplicationPath%7D\assets\scripts.2.5.0.js'
  errno: -4058,
  code: 'ENOENT',
  syscall: 'open',
  path: 'C:\\my-project\\templates\\%7BapplicationPath%7D\\assets\\scripts.2.5.0.js' }

For now my workaround is to remain on the older version so I can have my project continue to work.

@mikelambert
Copy link
Collaborator

Previously, the failure-to-load-js error was hidden. Now the error is shown.

Note that the previous version did not load the js-at-all, and so any CSS classes, or HTML elements, that were constructed by that JS would not have been seen by UnCSS.

My recommendation (what I do in my own project), is to generate fully-formed HTML pages from your server/code, and then run those through UnCSS, where you can verify they are proper HTML and not just "templates".

@FiniteLooper
Copy link

That's not possible with my project, so I guess I'm either stuck waiting on the ability to blacklist certain scripts , disable all scripts, suppress errors, or just stick with the older version.

@DamonHD
Copy link

DamonHD commented Sep 16, 2017

I tried to use the '-t 0' flag to get uncss NOT to try to load any JS. May be worth using that as an official way of disabling load and run of JS.

I don't touch any CSS in my JS, and almost all my JS is AdSense which I'd prefer that uncss didn't even try to run. Indeed I get errors from the protocol-relative code recommended by Google, and which one is not meant to edit according Google's Ts&Cs.

Maybe I'll fall back to purifycss for files with embedded scripts to avoid the errors at the cost of some extra bloat...

Rgds

Damon

@jobe451
Copy link

jobe451 commented Dec 4, 2017

I do have the same problem as chrismbarr. I have a template directory, path to js-scripts look like this (NEOS CMS):

<script src="{f:uri.resource(path: 'js/libs/superfish.js', package: 'jobe.drsch')}"></script>

Alternatively, I tried to provide the test page, which has proper HTML. However, looks like I use some script that is understood by modern browsers but not by jsdom, resulting in this error:

Error: Uncaught [TypeError: window.matchMedia is not a function]

@RyanZim
Copy link
Member

RyanZim commented Dec 4, 2017

jsdom doesn't support window.matchMedia yet, unfortunately.

@ArmorDarks
Copy link

Alternatively, I tried to provide the test page, which has proper HTML. However, looks like I use some script that is understood by modern browsers but not by jsdom, resulting in this error:

That error is easy to bypass by doing something like

if (window.matchMedia) {
   // your code goes here
}

Then it will be simply ignored by jsdom.

@michaelx
Copy link

+1 for a script blacklist option. That would be really useful.

@nhaglind
Copy link

Any updates to this?

@giakki giakki added this to To do in To-do board via automation Feb 13, 2020
@giakki giakki moved this from To do to To plan in To-do board Feb 13, 2020
@tkharuk
Copy link

tkharuk commented Mar 1, 2020

Hey,
Have a similar problem with a third-party widget, it is synchronous and seems like everything below it is removed by uncss.

Wanted to clarify, as of today, is there a way to turn it off with ProcessExternalResources: false ?

@benabraham
Copy link

Confirming that minification still occurs despite the error. But would be nice to somehow suppress it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
To-do board
  
To plan
Development

No branches or pull requests