Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Extract runtime to a separate package and preprocess it with webpack for embedding #42

Open
MOZGIII opened this issue Feb 29, 2020 · 21 comments · Fixed by #61 or #58 · May be fixed by #851
Open

Extract runtime to a separate package and preprocess it with webpack for embedding #42

MOZGIII opened this issue Feb 29, 2020 · 21 comments · Fixed by #61 or #58 · May be fixed by #851

Comments

@MOZGIII
Copy link

MOZGIII commented Feb 29, 2020

Currently the runtime is not pre-processed, and this causes a lot of troubles when trying to use it with electron in a sandboxed environment.

@pmmmwh
Copy link
Owner

pmmmwh commented Mar 1, 2020

Is there any examples you can provide? All code in this repo is written as ES5-compliant as possible (and also in CommonJS), so I don't see it causing trouble in any environments unless you're expecting ES Modules. (I don't know a lot about electron)

@MOZGIII
Copy link
Author

MOZGIII commented Mar 1, 2020

To be more specific, there are numerous issues.

  1. ErrorOverlayEntry.js imports react-dev-utils/formatWebpackMessages, and that, in turn, imports chalks. chalks then tries to access process.env.*, that is not accessible in electron render process unless the process "has permissions". I have no idea why does ErrorOverlayEntry.js ends up being imported in the browser context - might only be required in CLI context, but nonetheless - it causes a runtime error at the browser (render process) end.

  2. Similarly, something requires url in the browser context, which is a node module. This should be required, and I suspect something, that shouldn't be, ends up being included in the webpack bundle, and this is why all this occurs.

I hope this is enough context - if not - I can create a sample repo.

@pmmmwh
Copy link
Owner

pmmmwh commented Mar 2, 2020

I hope this is enough context - if not - I can create a sample repo.

This is great - I will work on fixing the issues this week.

@pmmmwh
Copy link
Owner

pmmmwh commented Apr 1, 2020

Issue 1 fixed by #58

@MOZGIII
Copy link
Author

MOZGIII commented Apr 1, 2020

#58 is more of a workaround, however fundamentally the issue is not solved. This can happen again in the future, and there's nothing that would've prevented this.
While I'm grateful for the fix, I have some concerns about the reliability of the solution.

I can help with implementing the bundle split if needed.

@MOZGIII
Copy link
Author

MOZGIII commented Apr 1, 2020

Same goes for #61.

@pmmmwh while the particular pain points I stumbled upon are fixed (kudos for that btw), I didn't hear your feedback on the subject of this issue - splitting the runtime (browser end) code and webpack (node end) code into separate packages.
I appreciate the time you spent on fixes, but I don't feel like I got a clear response.

Is your point that the code should be portable across the environments, and therefore the split is not required? Well, this might be true, but then I'll end up loading the node-related code in the browser environment and vice versa. This consumes compute cycles unnecessarily, and on its own, it's not a big deal, but coupled with other packages that do the same it becomes a huge problem. Please don't ignore this factor.

Anyhow, please reopen this issue, as it has not been addressed entirely. There are other issues, and I'd like to avoid going over them one-by-one.

@MOZGIII
Copy link
Author

MOZGIII commented Apr 1, 2020

Maybe I'm not clear.

This project contains the code that runs as part of the webpack build process and the code that runs as part of the compiled resulting bundle. The code that runs in the webpack injects the code that runs at "runtime" into the resulting bundle.
My proposal is to process the runtime part that's injected differently than the code that's being run as part of the webpack execution at the package level. The easiest way to do it is to split it into a separate package.

@pmmmwh pmmmwh reopened this Apr 1, 2020
@pmmmwh pmmmwh linked a pull request Apr 2, 2020 that will close this issue
@pmmmwh
Copy link
Owner

pmmmwh commented Apr 2, 2020

I didn't hear your feedback on the subject of this issue - splitting the runtime (browser end) code and webpack (node end) code into separate packages.

The easiest way to do it is to split it into a separate package.

At this stage I am against splitting the repo into separate packages, I don't think it is any better than the current directory split. Differential compilation on directories is also a way to achieve what you want.

Is your point that the code should be portable across the environments, and therefore the split is not required?

This is not true - code is currently already split into directories.

I'll end up loading the node-related code in the browser environment and vice versa.

I don't think this is true either, Node-related files (everything other than runtime) is never required in the browser context, and it is the same for Node's context too.

This consumes compute cycles unnecessarily, and on its own, it's not a big deal, but coupled with other packages that do the same it becomes a huge problem.

Unless we bundle everything into one file (e.g. Rollup/Webpack), require cycles will still happen, and will only happen with files that are specific to that context. However, most of the runtime code cannot be grouped because of their varying nature, and a lot of the code will not work until it is being injected by webpack.ProvidePlugin.

My proposal is to process the runtime part that's injected differently than the code that's being run as part of the webpack execution at the package level.

The point is, I don't see any need to pre-process/transpile any of the runtime code. No code in the runtime folder is non-compatible with browsers/non-native renderers. The nature of the issues you were facing are related to how Webpack 4 aggressively polyfills (all Node APIs if used) patches stuff for the browser (like defaulting process.env) but not for electron. That is not something pre-processing can help - especially when we know that this behaviour is going away in Webpack 5, which means by then the code will be obviously broken. We could do the same aggressive polyfilling, but it brings in bundle penalties for everybody.

Unless we're switching to ES Modules/ES6+ syntax, I don't feel that processing the current plain CommonJS, ES5 compliant code (other than use of the URL API which breaks IE11) is a right move. Writing tests (for various output targets) seems like a much simpler and beneficial approach - which I will need help.

@MOZGIII
Copy link
Author

MOZGIII commented Aug 2, 2020

The point is, I don't see any need to pre-process/transpile any of the runtime code. No code in the runtime folder is non-compatible with browsers/non-native renderers. The nature of the issues you were facing are related to how Webpack 4 aggressively polyfills (all Node APIs if used) patches stuff for the browser (like defaulting process.env) but not for electron. That is not something pre-processing can help - especially when we know that this behaviour is going away in Webpack 5, which means by then the code will be obviously broken. We could do the same aggressive polyfilling, but it brings in bundle penalties for everybody.

I'm not following.

The issue, in particular, is things like this:

const querystring = require('querystring');

This has nothing to do with polyfilling, but about cutting what gets evaluated in the renderer webpack context or now. You are assuming that only the content of the runtime folder gets included into the webpack bundle that's executing at runtime - however, in my case, things from lib ended up there too. This is the root cause of the problem, and what I'm proposing to solve.

I don't know why it only affects electron, possibly because there we have all the various execution contexts, while normally the situation would be way simpler.

We have:

  • node context - for cli tooling, build process, webpack-dev-server and etc; runs at dev time, executed natively without webpack preprocessing.
  • electron main context - for code that runs as part of the app (at runtime) in the main electron process; very similar to node context, webpack produces a bundle with all context, except for a few select require targets, like nodejs APIs or native modules.
  • electron preload context - for code that runs at the render process before the real page is loaded; the code is processed into a bundle, but it still has access to node APIs (like require fn), but also to the browser API; this is where the issue occurs - the lib code is included into a bundle. This, in turn, will unconditionally trigger a non-webpack require at load (see the listing with code from lib above) effectively doing a privileged nodejs call, which triggers a security error and prevents further execution. This lib isn't intended for loading at runtime, but it's what's happening currently.
  • electron render process - webpack processed code for the user-visible UI rendered in the browser; doesn't have access to nodejs require fn, but the lib code is also still loaded there - and we get a "no such fn" error upon loading.

So, again, if everything worked as you assume - with only runtime actually being loaded in runtime - that would be completely fine. Polyfills are different story, and I'm not worried about those at all, since in electron we're in control of the browser-side APIs and know exactly what version we have (URL Web API object is available in our case, hence no polyfill is required). The problem is with rouge files getting into the bundle, and the code intended to even be run at runtime issuing require-s and process.envs.

@MOZGIII
Copy link
Author

MOZGIII commented Aug 2, 2020

Steps to reproduce the core issue:

git clone https://github.com/pmmmwh/react-refresh-webpack-plugin
cd react-refresh-webpack-plugin/examples/typescript-without-babel
yarn
yarn webpack --mode development --hot
grep '__webpack_require__.*querystring' dist/main.js

The last grep shows that querystring is indeed imported, yet it's supposed to not have been if, as you say, only runtime folder was included.

Again the querystring require comes from

const querystring = require('querystring');

@MOZGIII
Copy link
Author

MOZGIII commented Aug 2, 2020

UPD: also it comes from the native-url, which in turn is invoked at

const url = require('native-url');

@MOZGIII
Copy link
Author

MOZGIII commented Aug 2, 2020

Also, to see how it effectively affects electron apps, add target: "electron-preload", to the webpack.config.js at the example from the repro steps above, and re-run yarn webpack --mode development --hot.
You'll then be able to find that actual require call in the bundle: grep '*** external "querystring" ***' dist/main.js -B 3 -A 10.

See also: https://webpack.js.org/configuration/target/

@pmmmwh
Copy link
Owner

pmmmwh commented Aug 2, 2020

I'm getting confused. If you trave the code paths that actually gets into Webpack's entry option, you can clearly see that no code other than the overlay/the socket integrations/the refresh client is being called (injectRefreshEntry injects them, so you can start the trace there). All code outside of them is isolated and never required into them, which means by module resolution they should never appear in anything webpack outputs.

That seems like a bug somewhere outside of our control 🤔 even if we separate the runtime package from the plugin, it will not help cause one is injected by the other.

For process.env.NODE_ENV - it is something that I think we're stuck for the time being, unless some non-process dependent way of statically detecting development mode are invented. You can think of it as a stub that is unfortunately named after the Node.js process.

With native-url that's another story - maybe we should just implement our own url parsing so we can ensure nothing outside of our control will break. It's unfortunate that many browser-facing packages use Node APIs internally because they have been polyfilled.

@pmmmwh
Copy link
Owner

pmmmwh commented Aug 2, 2020

Steps to reproduce the core issue:

git clone https://github.com/pmmmwh/react-refresh-webpack-plugin
cd react-refresh-webpack-plugin/examples/typescript-without-babel
yarn
yarn webpack --mode development --hot
grep '__webpack_require__.*querystring' dist/main.js

The last grep shows that querystring is indeed imported, yet it's supposed to not have been if, as you say, only runtime folder was included.

Again the querystring require comes from

const querystring = require('querystring');

I think you have some wrong assumptions about the problem - I did the exact steps and while querystring is found, injectRefreshEntry is no where to be seen in the bundle. The only import to querystring I can see is from native-url. We do use querystring as a variable name too so there are more matches.

For native-url it is much more interesting because they are indeed using querystring, but that in turn is a dependency for all engines that shares the name with Node.js's internals (ref here). So in theory - with proper module resolution nothing is broken here.

@MOZGIII
Copy link
Author

MOZGIII commented Aug 2, 2020

Weren’t you able to reproduce it with my example? I can create an example project and upload it to github.

UPD: ah, I see, indeed injectRefreshEntry isn’t in the bundle, but the querystring is there regardless. The key is to use one of the electron-specific webpack targets.

@MOZGIII
Copy link
Author

MOZGIII commented Aug 2, 2020

For native-url it is much more interesting because they are indeed using querystring, but that in turn is a dependency for all engines that shares the name with Node.js's internals (ref here). So in theory - with proper module resolution nothing is broken here.

Huh, so it’s simply not supposed to be an external import, but a import from the node_modules? It makes sense then. Very unfortunate that there’s a name collision there.
Well, it’s yet another way how preprocessing and shipping ready-made bundles would make the setup more reliable, just saying.
What are the possible workarounds there? I can probably tweak webpack to use a node_module instead of external require if it comes from native-url.
Just to double-check - is that code supposted to be included in the runtime bundle?

@MOZGIII
Copy link
Author

MOZGIII commented Aug 2, 2020

One possible fix for this, I think, is for the runtime code to assume the URL object is available natively or polyfilled. Would this be a viable solution?
It doesn't solve the root cause - the issue subj is a generic solution, and shifts the actual control over this to your side, rather than leaving it to assumptions about how people's bundlers operate - but at least that way the immediate problem will go away.

@MOZGIII
Copy link
Author

MOZGIII commented Aug 2, 2020

For process.env.NODE_ENV - it is something that I think we're stuck for the time being, unless some non-process dependent way of statically detecting development mode are invented. You can think of it as a stub that is unfortunately named after the Node.js process.

It would be a good idea to allow removing the code that does this check from the bundle. The assumption that the process.env is a webpack stub holds for webapps, but in electron it's normal to have this value be a real Node's process.env, providing an access to user's env vars, not the bundling time env vars. This would be security hazard if this code would rely on user env vars. Good thing is we exclude it from the bundle for production build at the webpack level - so it's effectively not a security hazard, but simply an annoyance.

One thing to note though is we don't see the issues with process.env. anymore at v0.3.3 or v0.4.1. Must've been solved.

@pmmmwh
Copy link
Owner

pmmmwh commented Aug 2, 2020

What are the possible workarounds there? I can probably tweak webpack to use a node_module instead of external require if it comes from native-url.

I think the best bet is to do that. I think for electron the default is to skip polyfilling it (but will keep the import in place) though, not sure why it did not resolve it from the correct place ...

Just to double-check - is that code supposted to be included in the runtime bundle?

If you're using webpack-dev-server - then yes. From the source of electron-forge v6 though, I see that they're using webpack-hot-middleware instead, so if you're using that you should change sockIntegration to whm, which will drop that code (URL parsing is only needed for WDS atm).

One possible fix for this, I think, is for the runtime code to assume the URL object is available natively or polyfilled. Would this be a viable solution?

Short answer: Yes.

I have thought about implementing the URL parsing/formatting inline so we can have more control as I've mentioned above. When I moved from url to native-url I thought it's a bit simpler to depend on something the Chrome team worked on. However recently when shipping out 0.4 betas I've encounter some quirks of that package that don't align with my expectations and that's why I'm thinking to remove the dependency completely (since it is to a certain extent, trivial to implement).

It would be a good idea to allow removing the code that does this check from the bundle.

The thing is, there's so many package that does this and we have no other way round - even react-refresh or react does it (conditional API exposure is one of the main blockers for them to move to ESM). I think Webpack's stub is safe enough (it replaces process.env.NODE_ENV - the whole thing - with text inline if you don't mess with the configurations), and even if it meant the true env vars it also makes sense because we also don't want the code path under the conditional statement when the user is in production.

It doesn't solve the root cause - the issue subj is a generic solution, and shifts the actual control over this to your side, rather than leaving it to assumptions about how people's bundlers operate - but at least that way the immediate problem will go away.

I am 50/50 on this - on one hand I want to ship the least transpiled/minified/transformed code possible, so users downstream can optimise however they wanted to (and even dig into the source code when needed), but on the other hand I do agree that there's a certain level of discrepancy between people's Webpack setups which sometimes make issues very hard to track down (mostly for "non-browser" rendering).

For example, I also had an item on my road map to ship ES7+ bundles for the overlay, which would provide performance gains for Webpack 5/native ESM users. On the other hand, this is kinda against what the industry is doing - people more often than not exclude node_modules from their transpilation pipeline. The same can be said for all these "native" Node.js modules, url, querystring - Webpack 4 have been doing too good of a job for Web developers to tolerate these imports and now for us a platform-agnostic tool it is backfiring - but for most people their pipeline might have these polyfills enabled.

Moving on I'll alter tests so that they run with all these native polyfills disabled (that's also what Webpack 5 will ship by default). Extracting the runtime and preprocessing it can be negative for performance and since there are a lot of pieces (most of them are single files already) that have to be embedded separately I don't see how we can have a lot of gains from that.

@MOZGIII
Copy link
Author

MOZGIII commented Aug 2, 2020

I think the best bet is to do that. I think for electron the default is to skip polyfilling it (but will keep the import in place) though, not sure why it did not resolve it from the correct place ...

I'm trying to do just that right now. It's the first time I dig deep into webpack and stuff really, so my attempts so far are failing 😦. Any hints?

If you're using webpack-dev-server - then yes. From the source of electron-forge v6 though, I see that they're using webpack-hot-middleware instead, so if you're using that you should change sockIntegration to whm, which will drop that code (URL parsing is only needed for WDS atm).

Thx, I didn't realize that hot middleware and dev server would have different integrations in the new version! I'll try that.

I have thought about implementing the URL parsing/formatting inline so we can have more control as I've mentioned above. When I moved from url to native-url I thought it's a bit simpler to depend on something the Chrome team worked on. However recently when shipping out 0.4 betas I've encounter some quirks of that package that don't align with my expectations and that's why I'm thinking to remove the dependency completely (since it is to a certain extent, trivial to implement).

Why not use the Web API URL though? README already mentions user need to have a polyfill for old browsers. It seems like a non-issue to just switch to build-in (or polyfilled) URL instead of using an imported one.

The thing is, there's so many package that does this and we have no other way round - even react-refresh or react does it (conditional API exposure is one of the main blockers for them to move to ESM). I think Webpack's stub is safe enough (it replaces process.env.NODE_ENV - the whole thing - with text inline if you don't mess with the configurations), and even if it meant the true env vars it also makes sense because we also don't want the code path under the conditional statement when the user is in production.

True. The fine for now.

I am 50/50 on this - on one hand I want to ship the least transpiled/minified/transformed code possible, so users downstream can optimise however they wanted to (and even dig into the source code when needed), but on the other hand I do agree that there's a certain level of discrepancy between people's Webpack setups which sometimes make issues very hard to track down (mostly for "non-browser" rendering).

For example, I also had an item on my road map to ship ES7+ bundles for the overlay, which would provide performance gains for Webpack 5/native ESM users. On the other hand, this is kinda against what the industry is doing - people more often than not exclude node_modules from their transpilation pipeline. The same can be said for all these "native" Node.js modules, url, querystring - Webpack 4 have been doing too good of a job for Web developers to tolerate these imports and now for us a platform-agnostic tool it is backfiring - but for most people their pipeline might have these polyfills enabled.

I agree with your points on optimization. It would be much better to have bundlers that can work reliably and give users the control, rather than taking it away. I'd be happy if it just worked really, but this integration had surprisingly many issues so far.

In this sense, splitting into multiple packages could allow users to pick the package to use - pre-bundled or raw.

I was also under the impression that people try to avoid transpiling the node_modules, and rely on them being pre-transpiled. This is especially true for browser-side packages, and not so much for packages that have to run on the node end. This was the basis of my initial analysis of the situation - if we have a projects that has parts that have to run both at node and at browser end - then having a single package is a limitation, preventing further tailoring of the parts for the environment. Thus the proposal to split.

I'm cool if we make it work without the splitting, but the idea of transpiling the node_modules when we have so much environment to look after just sounds way too complicated. Seems like we're advancing in this though 😄

Moving on I'll alter tests so that they run with all these native polyfills disabled (that's also what Webpack 5 will ship by default). Extracting the runtime and preprocessing it can be negative for performance and since there are a lot of pieces (most of them are single files already) that have to be embedded separately I don't see how we can have a lot of gains from that.

I can help with preparing the test cases for electron. Not sure if we have to focus on electron-forge specifically there - it's v6 been in beta for so long, probably doesn't worth it. Technically, just asserting the generated bundle to not have any *** external "..." *** with target set to electron-preload and/or electron-renderer should do the trick.

@pmmmwh
Copy link
Owner

pmmmwh commented Aug 4, 2020

I'm trying to do just that right now. It's the first time I dig deep into webpack and stuff really, so my attempts so far are failing 😦. Any hints?

Erm ... maybe you can use resolve.alias or play with the node option which provides these Node polyfills?

Why not use the Web API URL though? README already mentions user need to have a polyfill for old browsers. It seems like a non-issue to just switch to build-in (or polyfilled) URL instead of using an imported one.

native-url provides a simpler API to interact with (similar to what Node's url module has to offer, url.format). It is not really a polyfill but a wrapper to provide more functionality.

In this sense, splitting into multiple packages could allow users to pick the package to use - pre-bundled or raw.

I think what you're suggesting is that we provide multiple bundles, rather than splitting the packages. The point is, most of these files are being injected by the plugin, so you have little to no choice on which one we're going to use. They are also mostly single files already, apart from the dependencies (well, technically speaking, apart from the overlay, this plugin only have 1 runtime dependency - native-url`). Bundling would help, but for most files it would be unnecessary complexity (unless we start shipping ESM builds - that is another topic).

I'm cool if we make it work without the splitting, but the idea of transpiling the node_modules when we have so much environment to look after just sounds way too complicated.

Again, transpiling node-modules in this case would not help because it is inherently a tooling issue between Webpack/Electron/Node.js. I understand the frustration, but there's little to nothing we can do to optimize this when it is the environment itself that is broken. The best we can do is provide a 0 runtime dependency environment so things should work out.

I can help with preparing the test cases for electron.

It would be greatly appreciated if you can create and PR an example (see other examples in this repo). I have plans to make them function as both examples and regression test cases with other integrations (we have the testing setup ready, just need to wire them up).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
2 participants