Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

v8::Platform / node integration discussion #133

Closed
laverdet opened this issue Oct 28, 2019 · 9 comments
Closed

v8::Platform / node integration discussion #133

laverdet opened this issue Oct 28, 2019 · 9 comments

Comments

@laverdet
Copy link
Owner

Anna [@addaleax],

I have some questions about node_platform and how it relates back to this project. isolated-vm is a module which lets users create v8 isolates from scratch which run in the nodejs process but in separate threads. It's similar to worker_threads which I know you had a significant role in building. Isolated-vm is a bit different in that it specializes in running untrusted or semi-trusted code. To that end it handles some things that worker_threads doesn't care about, like graceful OOM errors and runaway execution prevention.

As you know, v8::Platform is a singleton responsible for scheduling tasks per isolate. This means that at some point some master has to be in charge of all the isolates & tasks created in a process, but today we don't really have a good way to plug in to the global platform instance. MultiIsolatePlatform, IsolateData, and PerIsolatePlatformData do too much and assume too many node-specific things to be useful for isolated-vm.

Right now the way I'm accomplishing this is by calling into v8 internals to swap out the platform with a delegate of my own construction. My platform maintains a pointer to the original platform instance and will forward calls for any isolate it doesn't control. It's a very javascripty monkey-patch solution. This has the disadvantage that the library won't load on builds of node that don't export internal symbols.

I also believe Electron is running into issues with the singleton problem too-- electron/electron#18540

I'd like your thoughts on what the best way forward here is. I'm happy to hack some features into nodejs but I want to make sure I'm on the right track first.

Thanks!

@addaleax
Copy link

To that end it handles some things that worker_threads doesn't care about, like graceful OOM errors and runaway execution prevention.

If it helps, I’ve been meaning to look at how this project handles those things to see whether we could use them in worker_threads as well.

As you know, v8::Platform is a singleton responsible for scheduling tasks per isolate. This means that at some point some master has to be in charge of all the isolates & tasks created in a process, but today we don't really have a good way to plug in to the global platform instance. MultiIsolatePlatform, IsolateData, and PerIsolatePlatformData do too much and assume too many node-specific things to be useful for isolated-vm.

I think it would really help if you could gives some details here – in what way are these thing too Node.js-specific, and how could we make using Node’s MultiIsolatePlatform interface easier for you? Would it help to be able to “override” the platform for specific Isolates in some way?

I also believe Electron is running into issues with the singleton problem too-- electron/electron#18540

That actually turned out to be kind of a dumb bug in worker_threads itself, afaik: nodejs/node@6db45bf

I'd like your thoughts on what the best way forward here is. I'm happy to hack some features into nodejs but I want to make sure I'm on the right track first.

Thank you for reaching out! I’d be happy to see what we can do to make this module easier to maintain.

@laverdet
Copy link
Owner Author

If it helps, I’ve been meaning to look at how this project handles those things to see whether we could use them in worker_threads as well.

I suspect the approach this project takes wouldn't meet the standards of core nodejs. v8 stops short of making any kind of assurance that you can safely enforce memory or execution limits. The safeguards in place in this project are really just heuristics that attempt to shut down any troublesome isolate before v8 crashes the whole process. It works pretty well but it's far from bulletproof.

That actually turned out to be kind of a dumb bug in worker_threads itself, afaik: @nodejs/node6db45bf

I think that's just a bandaid for the issue when using cp.fork(..., {env: {ELECTRON_RUN_AS_NODE:1}}). You still can't create worker_threads in the main Electron process because of a "different issue". Now this is just speculation because I haven't looked into the bug that far but my guess is that Chromium won the fight over whose platform implementation gets to run in Electron. I do know that node::GetMainThreadMultiIsolatePlatform() returns nullptr when running electron 7.0.0 which would indicate that node's platform code never runs. I'd be curious to see the fix here because that has implications for getting isolated-vm running under Electron again.

I think it would really help if you could gives some details here – in what way are these thing too Node.js-specific, and how could we make using Node’s MultiIsolatePlatform interface easier for you? Would it help to be able to “override” the platform for specific Isolates in some way?

RegisterIsolate requires you pass along a uv_loop so that PerIsolatePlatformData can schedule tasks but my isolates have complex scheduling needs and run outside of libuv. For example, my scheduler has "handle tasks" which are posted when a different isolate releases a cross-isolate reference. In this case I don't want that task to wake up the isolate until the user requests more work from it.

I think the crux of the issue is that PerIsolatePlatformData is both a delegate to v8 and node-specific scheduler in one class. My first instinct is that extracting out the v8 delegation interface into its own class would be a good idea. This might be something like node::IsolatePlatformDelegate which implements any function from v8::Platform that accepts an Isolate* argument. So currently that would look like this:

class IsolatePlatformDelegate {
  virtual std::shared_ptr<v8::TaskRunner> GetForegroundTaskRunner() = 0;
  virtual bool IdleTasksEnabled() = 0;
  // also several deprecated methods?
}

Then the second parameter of RegisterIsolate would change from uv_loop_s* to IsolatePlatformDelegate*. nodejs would implement all the node scheduling logic in their own implementation (probably PerIsolatePlatformData would now just implement the new interface) and isolated-vm would be free to do the same. Ideally Electron would find some way to tie into this implementation and we can all get along in one process.

@addaleax
Copy link

If it helps, I’ve been meaning to look at how this project handles those things to see whether we could use them in worker_threads as well.

I suspect the approach this project takes wouldn't meet the standards of core nodejs. v8 stops short of making any kind of assurance that you can safely enforce memory or execution limits. The safeguards in place in this project are really just heuristics that attempt to shut down any troublesome isolate before v8 crashes the whole process. It works pretty well but it's far from bulletproof.

Okay, that was my experience with former attempts as well. Thanks for confirming that :)

That actually turned out to be kind of a dumb bug in worker_threads itself, afaik: @nodejs/node6db45bf

I think that's just a bandaid for the issue when using cp.fork(..., {env: {ELECTRON_RUN_AS_NODE:1}}). You still can't create worker_threads in the main Electron process because of a "different issue".

It’s been a few days since then but I was actually kind of sure that I had figured that out with Shelley too – either way, I can confirm that it’s being worked on.

Now this is just speculation because I haven't looked into the bug that far but my guess is that Chromium won the fight over whose platform implementation gets to run in Electron.

I think that’s mostly true, yes – Electron uses the platform provided by gin, which at least works partially in the same Chromium’s one. It’s not ours, though.

I do know that node::GetMainThreadMultiIsolatePlatform() returns nullptr when running electron 7.0.0 which would indicate that node's platform code never runs.

It’s a bit more complicated than that – I know that Electron does initialize a Node.js platform instance under some circumstances.

node::GetMainThreadMultiIsolatePlatform() returns the “canonical” Node.js platform instance if Node.js has created one through node::InitializeV8Platform(), and it’s true that Electron doesn’t do that. I’d consider these functions mistakes/temporary workarounds that are going to be deprecated, though.

I'd be curious to see the fix here because that has implications for getting isolated-vm running under Electron again.

Right, that probably makes sense. I’ll try to make sure we figure this out.

I think it would really help if you could gives some details here – in what way are these thing too Node.js-specific, and how could we make using Node’s MultiIsolatePlatform interface easier for you? Would it help to be able to “override” the platform for specific Isolates in some way?

RegisterIsolate requires you pass along a uv_loop so that PerIsolatePlatformData can schedule tasks but my isolates have complex scheduling needs and run outside of libuv. For example, my scheduler has "handle tasks" which are posted when a different isolate releases a cross-isolate reference. In this case I don't want that task to wake up the isolate until the user requests more work from it.

Okay, so the tl;dr is, you want to be able to control scheduling for your Isolates on your own, right?

I think the crux of the issue is that PerIsolatePlatformData is both a delegate to v8 and node-specific scheduler in one class.

👍

My first instinct is that extracting out the v8 delegation interface into its own class would be a good idea. This might be something like node::IsolatePlatformDelegate which implements any function from v8::Platform that accepts an Isolate* argument. So currently that would look like this:

class IsolatePlatformDelegate {
  virtual std::shared_ptr<v8::TaskRunner> GetForegroundTaskRunner() = 0;
  virtual bool IdleTasksEnabled() = 0;
  // also several deprecated methods?
}

Then the second parameter of RegisterIsolate would change from uv_loop_s* to IsolatePlatformDelegate*. nodejs would implement all the node scheduling logic in their own implementation (probably PerIsolatePlatformData would now just implement the new interface) and isolated-vm would be free to do the same.

PerIsolatePlatformData implements the TaskRunner interface, so I think this would be fairly straightforward to implement on our side. I’ll try to think about this a bit, but it should be easy enough to put a PR together and see how that would work out for everyone.

One additional things that PerIsolatePlatformData currently implements and which our platform implementation uses is shutdown callbacks, i.e. callbacks that are called when the platform doesn’t use the event loop anymore. I’d have to see how to handle that – they are mostly used in our worker_threads implementation.

Ideally Electron would find some way to tie into this implementation and we can all get along in one process.

Maybe? I’d have to check with them to see what would help them possibly reconcile the two platform implementations…

@laverdet
Copy link
Owner Author

I threw together a straw man proposal for what this might look like: laverdet/node@27d9ef4

  • PerIsolatePlatformData is now responsible for registering and unregistering.
  • ref / unref have been remove and this is instead handled by std::shared_ptr.
  • PerIsolatePlatformData should maybe be renamed to IsolatePlatformScheduler or something to make this obviously separate from IsolateData.

The main problem I hiccup I ran into is that NodePlatform::RegisterIsolate is responsible for creating and managing the lifetime of the PerIsolatePlatformData instances based on a reference count. It seems like this responsibility would be better suited to IsolateData.

@addaleax
Copy link

@laverdet I’ll take a closer look tomorrow, but a few thoughts:

  • You may want to rebase against nodejs/master, there have been some relevant changes.
  • I feel like RegisterIsolate() should also take a std::shared_ptr<IsolatePlatformDelegate> argument for consistency, and we should be using that internally as well?
  • In this form, this is going to be a semver-major change to Node.js … are you okay with that? Do we want to hack around it?

@laverdet
Copy link
Owner Author

laverdet commented Nov 6, 2019

  • You may want to rebase against nodejs/master, there have been some relevant changes.

Oops, I was trying to get a diff out quickly for discussion I didn't notice I was on an old branch. The newer code is a much better starting point because RegisterIsolate is no longer idempotent. I pushed a newer commit here: laverdet/node@8623e9a. Should I open a formal pull request on this?

  • I feel like RegisterIsolate() should also take a std::shared_ptr<IsolatePlatformDelegate> argument for consistency, and we should be using that internally as well?

The problem I have with that is then you're giving the platform implementation control over the lifecycle of your object. RegisterIsolate and UnregisterIsolate must be called at very specific milestones during an isolate's lifetime so it seems senseless to delegate to them control over this object. I don't feel super strongly about it though and I think there's probably advantages to each. I've included a second patch to show what that implementation would look like: laverdet/node@cfc2ca9

  • In this form, this is going to be a semver-major change to Node.js … are you okay with that? Do we want to hack around it?

The new patch is source compatible but not ABI compatible. Adding a new virtual function to a class changes the vtable layout so NODE_MODULE_VERSION would need to be incremented. I think you guys allow that kind of change during odd-numbered releases?

@addaleax
Copy link

addaleax commented Nov 6, 2019

  • You may want to rebase against nodejs/master, there have been some relevant changes.

Oops, I was trying to get a diff out quickly for discussion I didn't notice I was on an old branch. The newer code is a much better starting point because RegisterIsolate is no longer idempotent. I pushed a newer commit here: laverdet/node@8623e9a. Should I open a formal pull request on this?

Yeah, that would seem like a good idea – I’ll leave a comments on the commit the way I would during code review, I hope that’s fine with you. The patch basically looks good to me.

  • I feel like RegisterIsolate() should also take a std::shared_ptr<IsolatePlatformDelegate> argument for consistency, and we should be using that internally as well?

The problem I have with that is then you're giving the platform implementation control over the lifecycle of your object. RegisterIsolate and UnregisterIsolate must be called at very specific milestones during an isolate's lifetime so it seems senseless to delegate to them control over this object. I don't feel super strongly about it though and I think there's probably advantages to each. I've included a second patch to show what that implementation would look like: laverdet/node@cfc2ca9

Yeah, I’m personally okay with the first variant, it’s quite a bit cleaner now. :)

  • In this form, this is going to be a semver-major change to Node.js … are you okay with that? Do we want to hack around it?

The new patch is source compatible but not ABI compatible. Adding a new virtual function to a class changes the vtable layout so NODE_MODULE_VERSION would need to be incremented. I think you guys allow that kind of change during odd-numbered releases?

Quoting our policy file on this:

It is current TSC policy to bump major version when ABI changes.

I guess maybe we could make an exception here because this is exclusively embedder-focused and there are no known users that use Node.js as a shared/static library?

@addaleax
Copy link

@laverdet Are there other things you’d like to see from Node.js to help you here?

laverdet added a commit that referenced this issue Dec 7, 2019
This is the solution for the issue raised in #132 and discussed in #133.
Implements the API which landed in nodejs c712fb7c. We should be able to
take advantage of this starting in nodejs 14.x
@laverdet
Copy link
Owner Author

laverdet commented Dec 9, 2019

@addaleax thanks, everything looks good on my end. I implemented this in 7087645 and tested it against node master, everything worked out great.

I believe that once electron/electron#18540 is taken care of this module will run under Electron once more. And it should run fine using ELECTRON_RUN_AS_NODE once Electron integrates nodejs v14.x.

@laverdet laverdet closed this as completed Dec 9, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants