Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Discussion/Tracking SnapshotCreator support #13877

Closed
14 tasks
bmeck opened this issue Jun 22, 2017 · 51 comments
Closed
14 tasks

Discussion/Tracking SnapshotCreator support #13877

bmeck opened this issue Jun 22, 2017 · 51 comments
Labels
discuss Issues opened for discussions and feedbacks. lib / src Issues and PRs related to general changes in the lib or src directory. performance Issues and PRs related to the performance of Node.js. process Issues and PRs related to the process subsystem.

Comments

@bmeck
Copy link
Member

bmeck commented Jun 22, 2017

Relevant earlier discussion on #9473 .

v8::SnapshotCreator is a means to capture a heap snapshot of JS, CodeGen, and C++ bindings and revive them w/o performing loading/evaluation steps that got to there. This issue is discussing what would be needed for tools like webpack which run many times and have significant startup cost need in order to utilize snapshots.

CC: @hashseed @refack

  • All C++ bindings available as intptr_t
  • C++ addon API/declaration to register external intptr_ts
  • CLI flags for --make-snapshot and --from-snapshot
  • JS API to declare main() functions for snapshots (save to v8::Private::ForApi($main_symbol_name)).
  • JS API to declare vm.Context for snapshot
  • Serializer for
    • ArrayBuffer/TypedArrays
    • HandleWrap
    • ObjectWrap
    • Timer
    • STDIO
    • require.cache paths?
  • File format for snapshots
    • C++ Addon inlining / symlinking

The v8 API might be able to have some changes made (LANDED)

Right now the v8 API would need a --make-snapshot CLI flag since v8::SnapshotCreator controls Isolate creation and node would need to use the created isolate.

Since all JS handles need to be closed when creating the snapshot, a main() function would need to be declared during snapshot creation after all possible preloading has occurred. The snapshot could then be taken when node exits if exiting normally (note, unref'd handles may still exist).

Some utility like WarmUpSnapshotDataBlob from v8 so that the JIT code is warm when loaded off disk also relevant.

@refack refack added lib / src Issues and PRs related to general changes in the lib or src directory. process Issues and PRs related to the process subsystem. labels Jun 22, 2017
@refack
Copy link
Contributor

refack commented Jun 22, 2017

I'm working on first one.

@mscdex
Copy link
Contributor

mscdex commented Jun 22, 2017

Does this still really matter now with Ignition (and TurboFan)?

@bmeck
Copy link
Member Author

bmeck commented Jun 22, 2017

@mscdex less so, but allows for rudimentary equivalence of features of pkg and does have impact since you don't need to run any require on startup.

@mscdex mscdex added performance Issues and PRs related to the performance of Node.js. discuss Issues opened for discussions and feedbacks. labels Jun 22, 2017
@bmeck
Copy link
Member Author

bmeck commented Jun 22, 2017

Some numbers from local investigation: https://twitter.com/bradleymeck/status/875758792805404673

On a Mac 64bit:

  • 2MB~ is minimal snapshot size (also why setting max old space < 3MB doesn't work)
  • 10e6 globals serialized into a snapshot w/ different hidden classes (global[i++]={[i]:i}) is 88MB, no significant change in speed when calling Isolate::New w/ this vs 2MB.
  • Isolate::New time seems to float around 10ms on local machine (consistent w/ v8 builtin default snapshot)

@refack
Copy link
Contributor

refack commented Jun 22, 2017

If the v8::SnapshotCreator was able to use the existing Isolate, we could use this for pure JS post-mortem dumps, right?

@bmeck
Copy link
Member Author

bmeck commented Jun 22, 2017

@refack yes, but the snapshot blob is platform and build specific unlike core dumps

@refack
Copy link
Contributor

refack commented Jun 22, 2017

But we'll end up with a live system that could be dynamically interrogated. Sounds like an amazing feature, and if V8 stabilize the blob format it seems possible they could implement portability.
/cc @nodejs/post-mortem @nodejs/v8

@bmeck
Copy link
Member Author

bmeck commented Jun 22, 2017

Not sure what you mean by "live", all v8::Handles need to be closed in order to serialize a blob, can't have JS on the stack.

@refack
Copy link
Contributor

refack commented Jun 22, 2017

Not sure what you mean by "live", all v8::Handles need to be closed in order to serialize a blob, can't have JS on the stack.

Ok not live, suspended animation, but AFAICT you could reignite the event loop and use the preloaded code for example to rerun a JS function to see what it's doing.
For example if a production server enters an err state, before you restart it you "suspend" it, take the blob to your dev machine and fiddle with the state in JS land. Mind blown!

@sebmarkbage
Copy link

Related: Proposal for snapshots as a Realm API for wider support in the ecosystem and getting libraries to consider the requirements.

I built a mode into Prepack to compile snapshots of basic Node CLIs. One limitation that I hit was that things like the type of a socket becomes hard coded. You might imagine that taking a snapshot of a CLI that prints to the console and then running it with node mysnapshot > file.txt or something would work but now that stdout is a different type and a lot of the JS streams around stdio has be rewired. Now Prepack can automatically encode multiple branches over abstract values but you'd probably want to manually handle such cases.

@bmeck
Copy link
Member Author

bmeck commented Jun 23, 2017

@sebmarkbage I think the custom serializer deserializer functions in v8 could be enough to handle this. Even with that though, libs could check tty info that gets invalidated when main runs after being deserialized, like you said.

@hashseed
Copy link
Member

hashseed commented Jun 23, 2017

Some comments:

  • The isolate provided by the snapshot creator is special in that it prepares for serialization. This basically means that code objects will include meta data to mark external references so that the serializer can find them.
  • Not everything on the heap can be serialized. Most importantly, optimized code and typed arrays.
  • Snapshot for Realm API is different than the start up snapshot in so far that the start up snapshot bundles isolate and context snapshot, while for the Realm API the snapshot will only contain the context, decoupled from the isolate. This could prove to be more complicated since there need to be a way to untangle a context from the isolate it belongs to, into a form that can be transplanted into any arbitrary isolate.
  • Snapshot creator can be used to also include compiled code into the snapshot. For this, check out the implementation of V8::WarmUpSnapshotDataBlob. If it confuses, I can elaborate further.
  • We can't capture a snapshot that includes the stack. This in combination with some of the above points makes snapshot as mechanism for crash dump unsuitable, imo. Also unsuitable for suspending an isolate, as it contains more state than what's on the heap. It really is just to put V8 into a predefined start up state.
  • V8 team is unlikely to stabilize the snapshot format for the same reasons the bytecode format will not be stable.

(edited by @TimothyGu to correct a typo)

@sebmarkbage
Copy link

@bmeck My point was more that the JS libs internal to Node's runtime already does that. Ideally they'd be updated to avoid that problem.

@bnoordhuis
Copy link
Member

Not everything on the heap can be serialized. Most importantly, optimized code and typed arrays.

I wondered about that. Means it's probably not useful for node because typed arrays (Buffers, which are instances of Uint8Array) are used pervasively throughout core.

@hashseed
Copy link
Member

Adding support for serializing typed arrays should be fairly easy, as long as the deserializer can allocate the backing stores outside of V8's heap, I think. I'm not familiar with how Buffer is implemented. Would it work out-of-the-box if Uint8Arrays can be serialized?

@TimothyGu
Copy link
Member

@hashseed Yes it would work, but can be a bit unwieldy if no special case is designated. Node.js Buffers can be pooled, and multiple Buffers can share one ArrayBuffer, but with different offsets and lengths. If Buffer serialization were to be implemented, only the visible view should be stored rather than the entire ArrayBuffer. This is what we are doing with the Serializer/Deserializer APIs (see https://github.com/nodejs/node/blob/master/lib/v8.js#L148-L165). Otherwise everything is the same.

/cc @addaleax who implemented the Serializer/Deserializer binding.

@hashseed
Copy link
Member

@TimothyGu that would require V8's serializer to recognize and introduce special case for Buffers, which doesn't sound like a good idea to me. Why is it necessary to serialize views into separate ArrayBuffers? I would suggest to deserialize Buffers into the exact same state as they've been serialized.

@bmeck
Copy link
Member Author

bmeck commented Jun 23, 2017

I am with @hashseed here, I think the entire pool would want to be serialized since a snapshot shouldn't change the data (which is leaking the pool).

@TimothyGu
Copy link
Member

@bmeck @hashseed The default pool size (Buffer.poolSize) is 8192 bytes, so the underlying ArrayBuffer is 8192 bytes until it is used in full. In case I need to serialize 4096 consecutively allocated Buffers, each with two bytes, they would actually be stored on the same ArrayBuffer. To serialize Buffers into the same state as they've been initialized, I would have to store 4096 * 8192 = 33554432 bytes instead of just 4096 * 2 = 8192 bytes.

I'm not sure how startup snapshots are different from V8's Serializer, and the calculation above only applies to Serializer. If different ArrayBufferViews in a startup snapshot are allowed to share a single ArrayBuffer, then the issue would of course be solved. If not, I do see the argument for keeping the Buffers in the same state (and TBH I'm fine with that), but it just seems to be a bit wasteful.

@bmeck
Copy link
Member Author

bmeck commented Jun 23, 2017

I think there is a disconnect here. I am not sure why you need to change the pool for all the buffers, that seems like a shallow copy which would also be bad.

@hashseed
Copy link
Member

The start up snapshot is supposed to put the context in a predefined state while bypassing any initialization scripts. It should reach the same state as if the initialization script ran.

Assuming that serializing ArrayBuffer is implemented, there is not going to be a duplication you described above. The serializer can recognize objects that it already visited before and deserialize to the same object graph. After deserialization we would still require just 8192 bytes. I think you are confusing the ValueSerializer with the snapshot serializer. The former is fairly new, while the latter has existed for years, and been continuously improved. We are talking about the latter.

@TimothyGu
Copy link
Member

@hashseed Ah that was partially what I was asking – if they operate differently. Seems like they do, so thanks for the explanation :)

@mrkmarron
Copy link

We have been working on snapshot/restore in ChakraCore for a while in the context of time-travel debugging. Our current implementation supports full serialization/deserialization of most ES5/6 constructs to a stable and (mostly) engine agnostic format. Snapshot performance is a big concern for us and the current implementation is able to extract/restore the full JS application state on the order of 10's of milliseconds.

We don't yet support linking this up with the native data in Node as, mentioned above, (1) we need to track down and associate all the native reference locations for snapshot/restore and (2) there are challenges in restoring some native state such as file handles or sockets. The first issue is mainly an engineering challenge, an intern of mine experimented with this for migrating live JavaScript/HTML applications (section 5.2). The perf. numbers are on the slow side since it was a proof-of-concept effort. The second issue requires some thought in the API for inflation -- e.g. symbolic values for startup code, user provided hooks, report error on next access and let recovery logic take over, etc. -- and I think having some specific scenarios here to guide design would be very useful.

Using snapshots for application startup performance and migration are things I have been wanting to do more work on, and we have most of the needed JavaScript side parts already implemented, so I am definitely happy to support work on this.

@bmeck
Copy link
Member Author

bmeck commented Jun 29, 2017

@mrkmarron we can discuss need for ChakraCore here as well. Do you think an intptr_t[] is enough for ChakraCore to setup pointers to when doing serialization? In pseudocode~

intptr_t externals = { READ_FILE }; // provided by node

JSFunction readFileFn = { // made by runtime env / C++
   intptr_t internal_fields = { READ_FILE }
};

snapshot_create(...) {
   // walk ...
       for (field, i of obj.internal_fields) {
           serialized.internal_fields[i] = externals.indexOf(field);
       }
   // ... walk
}
snapshot_load(...) {
   // walk ...
       for (serialized_external_i, i of obj.internal_fields) {
           deserialized.internal_fields[i] = externals[serialized_external_i];
       }
   // ... walk
}

@mrkmarron
Copy link

Hi @bmeck, if I understand your design correctly I think it should work for recording basic JS->Native references in the snapshot. My paraphrasing of your design is that it uses the index in a fixed by the host array of intptr_t as the stable way to give an identity to a native reference in the snapshot phase so it can be looked up in the inflate phase. This should definitely be do-able with the current ChakraCore snapshot/inflate code.

My only thought here is that this works well when the snapshot is assumed to be at a very stable point (e.g., right after Node loads but before event loop starts) but it can be a bit brittle and difficult to debug if we start to want to use this in more complex scenarios. We had a similar issue with primitive objects from the core runtime (like the undefined object or builtin functions) where we cannot create them and need to give them "well known identities" in the snapshot. We ended up using simple string names, which is slightly lower performance but much easier to debug and more flexible, and I have been satisfied with the result. So, I feel like having a typedef-able setup for the identifier tokens, at least to allow for debugging, might be good.

Also, what are you thinking in of the the Native->JS references? We currently just use the pointer values cast to integer values (#defined as TTD_PTR_ID) in the snapshot and then have a map TTD_PTR_ID -> Js::Object when inflating. We could either look to pass that map back explicitly or provide and API the host can use like Js::Object JsRTMapSnapshotObjectId(TTD_PTR_ID).

I'll plan to put some cycles into the JS->Native implementation next week.

@bmeck
Copy link
Member Author

bmeck commented Jun 30, 2017

@mrkmarron I think the map could be fine as long as it doesn't explicitly point to a memory address once serialized.

For v8, the general idea is to us Private symbols that are not visible to JS and attach them to the global scope:

Context global;
Function require = makeJSRequireFunction();
global->Set(Private::ForAPI("require"), require); // ~= Symbol.for but not visible to JS
// ... take snapshot
// ... revive snapshot
Context global = snapshot.GetContext(0);
Function require = global->Get(Private::ForAPI("require"));

This roughly equates to us manually making the map ourselves so that seems fine. Not having to do all the Get/Sets and being able to get an array would be nice.

This was referenced Sep 15, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
discuss Issues opened for discussions and feedbacks. lib / src Issues and PRs related to general changes in the lib or src directory. performance Issues and PRs related to the performance of Node.js. process Issues and PRs related to the process subsystem.
Projects
None yet
Development

No branches or pull requests