Design nucleus. #1

creationix · 2016-06-02T15:39:40Z

The basic goal of this project is to implement a tiny core runtime that contains libuv, a JavaScript engine, and some essential C libraries (like openssl) needed to re-implement node.js in userland as modules.

If possible this will be backend agnostic and allow multiple JS engines.

See also nodejs/node#7098

creationix · 2016-06-02T15:43:38Z

I think a first step would be to define an interface which all backend implementations should adhere to. One of the goals here is to avoid C/C++ addons so we don't need to worry about a public facing C or C++ API for addons initially.

Since it's a royal pain to try and get all JS engines to conform to a least-common-denominator interface, let's instead have independent implementations for each engine that all match some JS interface spec. Then modules written for one runtime will work for all, so long as they don't use language features unique to a particular runtime (V8 and Chakra, for example, have most/all of ES6, while duktape is mostly ES5 but has Lua-style coroutines).

Once we have a common interface, we can start implementing the C parts for the various runtimes. I humbly suggest the I/O parts be directly designed to match libuv.

Fishrock123 · 2016-06-02T15:44:02Z

I think this will also be greatly simplified if we use things such as @mscdex's work to make the DNS resolver pure JS instead of using c-ares (nodejs/node#1843), and the pure-JS http-parser (nodejs/node#1457).

Fishrock123 · 2016-06-02T15:44:58Z

"I think a first step would be to define an interface which all backend implementations should adhere to."

Sounds like @trevnorris's original API WG goals.

trevnorris · 2016-06-02T15:47:34Z

:-) I've had a long-standing goal to create an API for node that serves as a strict entry point into C++, which basically all code in lib/ would use. Alas, time constraints.

mscdex · 2016-06-02T16:01:06Z

As I mentioned in the linked DNS PR, it is difficult to get even close to matching the performance of c-ares/libc, even when using node's C++ UDP bindings directly. That pretty much rules out any performance issues in js land, so the C++ layer would have to be improved (if possible) to be able to compete with c-ares and/or the system resolver.

Regarding http, I haven't compared benchmarks since @indutny incorporated the JS stream stuff that bypasses js land when doing http parsing, so I'm not sure how the pure js http parser fares anymore.

creationix · 2016-06-02T16:05:11Z

As for DNS resolving, in luvit we have two paths. One is pure Lua on top of libuv's UDP primitives and is used for advanced queries. For basic resolution of domain names to IP addresses, we use libuv's getnameinfo and getaddrinfo, which use the system library on a thread pool, I believe. We have had no performance issues with this. Both are pure script on top of what libuv provides natively, which will be provided in the C core.

creationix · 2016-06-02T16:10:05Z

Let's not get too tripped up on edge performance issues. The goal here isn't to win synthetic benchmarks with the vanilla flavor of the minimal core. We will have options where people can build different flavors of the core with various libraries included (like openssl, cares, http_parser, etc). If you're deploying a large enough system where these performance issues are actually a problem, then you don't mind compiling a little C code. But for most projects and development workflows, this is not critical.

In luvi, there are two main flavors known as "tiny" and "regular", with the biggest difference being that regular includes openssl and a couple of lesser-used C addons in its core. In many cases, http servers don't need openssl since they are running behind a reverse proxy that handles the TLS termination anyway. Things like MD5, SHA1, etc. can usually be handled just fine (and sometimes even faster) in pure script.
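As a rough illustration of hashing in pure script, here is a minimal SHA-1 sketch in plain JavaScript with no native dependencies. It is not taken from luvi or luvit; the function names `sha1` and `asciiBytes` are made up for this example, and it makes no claim about speed relative to openssl.

```javascript
// Converts an ASCII string to an array of byte values (helper for the demo).
function asciiBytes(text) {
  return text.split("").map(function (ch) { return ch.charCodeAt(0) & 0xff; });
}

// Pure-JS SHA-1 over an array of byte values (0-255); returns a hex string.
function sha1(bytes) {
  const msg = bytes.slice();
  const bitLength = bytes.length * 8;
  // Padding: one 0x80 byte, zeros up to 56 mod 64, then the 64-bit bit length.
  msg.push(0x80);
  while (msg.length % 64 !== 56) msg.push(0);
  const hi = Math.floor(bitLength / 0x100000000);
  const lo = bitLength >>> 0;
  for (let shift = 24; shift >= 0; shift -= 8) msg.push((hi >>> shift) & 0xff);
  for (let shift = 24; shift >= 0; shift -= 8) msg.push((lo >>> shift) & 0xff);

  let h0 = 0x67452301, h1 = 0xefcdab89, h2 = 0x98badcfe;
  let h3 = 0x10325476, h4 = 0xc3d2e1f0;
  const rotl = function (x, n) { return (x << n) | (x >>> (32 - n)); };
  const w = new Array(80);

  for (let block = 0; block < msg.length; block += 64) {
    // Load sixteen big-endian 32-bit words, then expand them to eighty.
    for (let i = 0; i < 16; i++) {
      const j = block + i * 4;
      w[i] = (msg[j] << 24) | (msg[j + 1] << 16) | (msg[j + 2] << 8) | msg[j + 3];
    }
    for (let i = 16; i < 80; i++) {
      w[i] = rotl(w[i - 3] ^ w[i - 8] ^ w[i - 14] ^ w[i - 16], 1);
    }
    let a = h0, b = h1, c = h2, d = h3, e = h4;
    for (let i = 0; i < 80; i++) {
      let f, k;
      if (i < 20) { f = (b & c) | (~b & d); k = 0x5a827999; }
      else if (i < 40) { f = b ^ c ^ d; k = 0x6ed9eba1; }
      else if (i < 60) { f = (b & c) | (b & d) | (c & d); k = 0x8f1bbcdc; }
      else { f = b ^ c ^ d; k = 0xca62c1d6; }
      const temp = (rotl(a, 5) + f + e + k + w[i]) | 0;
      e = d; d = c; c = rotl(b, 30); b = a; a = temp;
    }
    h0 = (h0 + a) | 0; h1 = (h1 + b) | 0; h2 = (h2 + c) | 0;
    h3 = (h3 + d) | 0; h4 = (h4 + e) | 0;
  }
  // Format each 32-bit word as eight lowercase hex digits.
  return [h0, h1, h2, h3, h4].map(function (h) {
    return ("00000000" + (h >>> 0).toString(16)).slice(-8);
  }).join("");
}
```

Calling `sha1(asciiBytes("abc"))` returns the digest as a 40-character hex string. Everything here is ES5-level arithmetic apart from `let`/`const`, so the same approach should carry over to smaller engines.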

indutny · 2016-06-02T16:19:39Z

Wow, I really like this proposal. I was working on something similar recently:

It is a modular C stream implementation. Not sure how useful it is, but it could be a good enough interface for interactions between C addons.

creationix · 2016-06-02T16:23:50Z

@indutny I saw those. I've always said libuv should have an extension community where things are written in C and can be used by all runtimes that consume libuv. Those could be included as well in the core if they are tiny (which I expect) and in optional addons if not.

If they are only to be consumed by other C code and it doesn't make sense to expose them to JS, that's fine. It will still be useful for addons to core that can use them.

creationix · 2016-06-02T16:25:17Z

I started a new issue for designing the libuv -> js mapping interface that all implementations must adhere to. #2

indutny · 2016-06-02T16:25:22Z

uv_link_t by itself is very small. uv_ssl_t is a bit bigger.

creationix · 2016-06-02T16:35:12Z

So I think we could say a nucleus implementation contains:

1. A JS runtime engine
2. libuv
3. Bindings to libuv for said engine (exposing the standard interface)
4. Glue to make applications.
5. Other optional C modules.

I think for part 4, we should follow the pattern in luvi. This means including minimal code for reading zip files. I have a modified version of miniz, with bug fixes and some missing features added, that works great for this and is super tiny.

This will expose a bundle API that allows scripts to read files in the virtual filesystem, which can be either a zip file (standalone or appended to nucleus) or a folder on disk.

We will also have some minimal hooks that make bootstrapping a require system in userland less painful. For example, it can look for a file bundle:deps/require.js or something and auto-run it if it exists before running bundle:main.js.
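A minimal sketch of that boot order, assuming a bundle object with a single `readfile` method. The names `makeBundle` and `boot`, the shared `env` object, and the in-memory file table are all hypothetical stand-ins; a real bundle would be backed by a zip file or a folder.

```javascript
// Hypothetical in-memory stand-in for the bundle (a zip or a folder in practice).
function makeBundle(files) {
  return {
    // Returns the file contents as a string, or null if the path is absent.
    readfile: function (path) {
      return Object.prototype.hasOwnProperty.call(files, path) ? files[path] : null;
    },
  };
}

// Runs the optional bootstrap script first, then the main script.
// Each script is compiled with `new Function` and receives a shared `env` object.
function boot(bundle, env) {
  const order = ["deps/require.js", "main.js"];
  const ran = [];
  for (const path of order) {
    const source = bundle.readfile(path);
    if (source === null) {
      if (path === "main.js") throw new Error("bundle has no main.js");
      continue; // the bootstrap hook is optional
    }
    new Function("env", source)(env);
    ran.push(path);
  }
  return ran;
}

const env = {};
const ran = boot(
  makeBundle({
    "deps/require.js": "env.require = function (name) { return 'loaded ' + name; };",
    "main.js": "env.result = env.require('app');",
  }),
  env
);
```

Because the hook is just "run this file first if it exists", the core stays free of any opinion about what the bootstrap script registers.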

mmicko · 2016-06-02T17:49:40Z

I don't wish to disappoint you, but there is already something similar in the works: https://github.com/saghul/sjs

I have to say I am interested in this kind of project, since I think it can replace Lua with JavaScript (which more people are familiar with) for scripting their software. Also, there is a large library of nodejs-compatible JavaScript modules; making it possible to use them from the application itself would be a great plus.

My suggestion would be to go for C++11/14 support, and not just plain C. Exposing an API and enabling users to expose their classes to JavaScript is very useful. There is the LuaBridge project for Lua that enables you to expose your classes and objects to a Lua engine. Doing a similar out-of-the-box solution would make integration with user code even easier.

Note that Lua and duktape are quite similar in design so similar patterns can be used.

If you go this or similar way "you have my axe" :)

dlmanning · 2016-06-02T17:56:59Z

Am also super into this.

Plain C makes for easier interop with whatever other language one might be interested in calling from (e.g. Rust).

creationix · 2016-06-02T17:59:58Z

@mmicko I'm not disappointed, I know about sjs and even linked to it in the parent conversation in the nodejs issue. From my initial browsing, however, sjs is much higher-level and more opinionated than what this project is aiming for.

Fishrock123 · 2016-06-02T18:05:57Z

"Glue to make applications."

@creationix We'll probably need a good amount of process, and some sort of module... bootstrapping at least. (Or maybe we just use ES modules?)

Is that what you meant by "glue"?

mmicko · 2016-06-02T18:06:11Z

@creationix good to hear that

@dlmanning I understand that a C API is the easiest to combine with other languages; I'm just pointing out that C++11/14 support would be quite welcome

creationix · 2016-06-02T18:06:13Z

@mmicko Also since we're fixing the interop level at the JS interface exposed by the C/C++ backend we don't need to standardize on a language/version. The duktape backend might be all C89 while the V8 backend will obviously have some C++ involved. The common glue layer can even have multiple implementations if needed as long as the JS interface matches the spec. This is why it's important to define the interface clearly.
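One way to picture "the JS interface matches the spec" is a small conformance check that every backend is run against, whatever language it is written in. This is only a sketch: the `SPEC` table, the `checkBackend` helper and the two fake backends are hypothetical, and the three function names are borrowed from primitives mentioned later in this thread rather than from a finished interface.

```javascript
// Hypothetical spec: name -> expected typeof. Not the real nucleus interface.
const SPEC = {
  readfile: "function",
  scandir: "function",
  pathjoin: "function",
};

// Returns a list of problems; an empty list means the backend conforms.
function checkBackend(backend, spec) {
  const problems = [];
  for (const name of Object.keys(spec)) {
    const actual = typeof backend[name];
    if (actual !== spec[name]) {
      problems.push(name + ": expected " + spec[name] + ", got " + actual);
    }
  }
  return problems;
}

// Two stand-in backends (imagine one written in C89, one in C++).
const completeBackend = {
  readfile: function () {},
  scandir: function () {},
  pathjoin: function () {},
};
const partialBackend = { readfile: function () {} };
```

A real check would also exercise behaviour, not just the presence of each function, but even this shape test catches a backend that forgot part of the interface.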

Fishrock123 · 2016-06-02T18:07:37Z

Note: using just ES modules is quite incompatible with the current node ecosystem, so we'd still have to have some module bootstrapping available for the module module, I think.
(& It would probably still have to be passed to scripts implicitly, like require. ...So it would probably have to be a part of the nucleus, I think.)

creationix · 2016-06-02T18:08:44Z

I don't want the module system to be part of the core glue. All we need are some conventions for bootstrapping a module system of choice. I really don't want things like node's global process in this layer.

For the curious, you can see how luvit accomplished this. Both process and require are implemented in userspace as modules.

creationix · 2016-06-02T18:23:53Z

@Fishrock123 I envision two parts.

The core API will provide things like loading files by path, scanning directories, getting cwd, getting environment variables, and getting the path to the main binary. It would also expose the JS runtime with API functions for compiling strings into code (with filename and ES goal type).

The hook will simply auto-run a file with a certain filename so that it can self-register before the main file is run.

Would this not be enough? What APIs exactly would need to be provided for a module system to be implemented?

For luvit's require, which is modeled after node's, I basically needed:

scandir(path) -> stream/list of filenames with type
readfile(path) -> file contents (in Lua, strings are 8-bit binary safe, no text encoding)
pathjoin(...parts) -> path
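To show how little is needed, here is a sketch of a CommonJS-style loader built only on `readfile` and `pathjoin` plus the engine's ability to compile a string into a function. It is not luvit's loader: the in-memory `FILES` table stands in for a real filesystem, `makeRequire` is a hypothetical name, and this tiny version skips `scandir`, package lookup and file-extension guessing.

```javascript
// In-memory stand-in for the filesystem; a real core would read from disk or a zip.
const FILES = {
  "/app/main.js": "exports.answer = require('./lib/math.js').double(21);",
  "/app/lib/math.js": "exports.double = function (n) { return n * 2; };",
};

// readfile(path) -> file contents, throwing when the file is missing.
function readfile(path) {
  if (!Object.prototype.hasOwnProperty.call(FILES, path)) {
    throw new Error("no such file: " + path);
  }
  return FILES[path];
}

// pathjoin(...parts) -> normalized absolute path; resolves "." and ".." segments.
function pathjoin() {
  const out = [];
  const joined = Array.prototype.join.call(arguments, "/");
  joined.split("/").forEach(function (segment) {
    if (segment === "" || segment === ".") return;
    if (segment === "..") out.pop();
    else out.push(segment);
  });
  return "/" + out.join("/");
}

// Builds a require function whose relative paths resolve against `dir`.
const cache = {};
function makeRequire(dir) {
  return function require(request) {
    const path = pathjoin(dir, request);
    if (cache[path]) return cache[path].exports;
    const module = (cache[path] = { exports: {} });
    const childDir = path.slice(0, path.lastIndexOf("/")) || "/";
    // Compile the source text into a function, node-style.
    const run = new Function("exports", "require", "module", readfile(path));
    run(module.exports, makeRequire(childDir), module);
    return module.exports;
  };
}

const main = makeRequire("/app")("./main.js");
```

Each loaded file gets its own `require` bound to its directory, which is what makes relative requests like `./lib/math.js` work from nested modules.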

creationix · 2016-06-02T19:19:45Z

@Fishrock123 I think the simplest way to expose the builtin C modules without depending on a module system is to have some global object (like global.NUCLEUS) that exposes the various builtin modules. Userspace module systems could then expose a uniform interface where require('uv') simply returns global.NUCLEUS.uv, but require('some-other') is handled by the custom loader.
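A small sketch of that split, assuming the core installs a `NUCLEUS` object on the global. The stub `uv` object, the `makeRequire` wrapper and the demo loader are hypothetical; `globalThis` is used here as the standard spelling of the global object.

```javascript
// Hypothetical stand-in for the object the C core would install.
globalThis.NUCLEUS = { uv: { version: "stub" } };

// Wraps any custom loader so builtin names are answered from NUCLEUS first.
function makeRequire(customLoader) {
  return function require(name) {
    const builtins = globalThis.NUCLEUS;
    if (Object.prototype.hasOwnProperty.call(builtins, name)) {
      return builtins[name];
    }
    return customLoader(name);
  };
}

// A trivial custom loader used only for demonstration.
const userRequire = makeRequire(function (name) {
  return { loadedBy: "custom", name: name };
});
```

With this in place `userRequire('uv')` hands back the builtin object, while `userRequire('some-other')` falls through to whatever loader the userland module system supplies.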

domenic · 2016-06-02T19:31:11Z

You could even call it process.binding

creationix · 2016-06-02T19:38:15Z

@domenic As I told @Fishrock123 in IRC, I'd like to avoid any name clashes with anything existing in node so I don't have to worry about matching semantics. This layer needs to have as little opinion as possible.

creationix · 2016-06-02T19:39:00Z

Also, process.binding will go away if this ever lands in core. And it will assuredly have a different shape.

creationix · 2016-06-02T22:29:47Z

@Fishrock123 I wrote up the beginnings of a README with the parts that are currently designed. This should help solidify the design goals a little.

creationix · 2016-06-02T23:25:58Z

@dlmanning see #3

dlmanning · 2016-06-02T23:27:09Z

@creationix : I am not as funny as I think I am...

drom · 2016-06-03T00:51:16Z

@creationix It would be nice if nucleus were available as a library for C++ embedding. I have used jxcore for this purpose: https://github.com/jxcore/jxcore/blob/master/doc/native/Embedding_Basics.md and quite liked it. But it is not supported anymore ;(

creationix · 2016-06-03T01:11:19Z

@drom I'm not sure there would be much in here apart from what's provided in the JS engines and the bindings. I'll try to make the various bindings independent enough that they could be used embedded in other projects.

chrisdickinson · 2016-06-03T01:18:28Z

Hi! I'm poking at something along the same lines over here. It builds and runs on linux (ubuntu trusty) and OSX thus far, and glues v8 to libuv & uv_link_t using gn.

It currently leans on a hacked-up version of chromium's build/ dir, which I'm tearing apart to get to the salient bits. The idea is to get it running on windows, osx, and linux first, then rewrite the build dir's gn stuff in a cleaner way to get to that end.

The experiment is thus:

- Get a minimal project that includes v8, libuv, and the various uv bits @indutny has been putting together, building everywhere.
- At that point build in & expose fs, tcp, and tls bindings and a module system (via require) to js.
  - I might do this in a separate project using gclient & gn to pull in the minimal binding layer.
- Whenever a node global (process) is accessed, or a node builtin module is required (e.g. require('fs')), short circuit the lookup to require('@nojs/node-<target>').
- Long term goal is to get npm install working and bundle npm with the project.

My (handwave-y) plans are — and you'll each probably find something you like and something you dislike here:

- Steer closer to TC39:
  - The minimal API will use Promise. async is coming.
  - No streams at first. Possibly include streams from WHATWG's ReadableStream spec later.
- Steer closer to (newer) Google tools:
  - Build with gn and gclient, keep deps up to date with gclient sync.
- Focus on FFI. (Insert so much 👋 handwaving 👋 here)
  - With an eye towards @indutny's heap.js & mmap.js, explore exposing mmap in order to create callable executable code from JS (possibly only for core functionality, but maybe not.)
  - Binary compat with Node later.
    - @dominictarr had the excellent idea that the build tools should be dockerized.
- Stick with Node's decision on ES modules. If Node zigs, Nojs zigs. No zagging, never zagging.
  - Interoperability/backcompat is key.

In other words: I think this project and nojs are probably going to be walking along the same path for a bit, though it seems like eventually we'll have different goals. I'm happy to share the build code I've hacked together. Maybe making it easier to grab a compilable, working copy of libuv+v8 & friends will let a thousand Nodes bloom.

indutny · 2016-06-03T01:23:27Z

@chrisdickinson looks very cool! Though, you probably would like to use jit.js instead of heap.js, since the latter one is a JS VM Heap implementation...

creationix · 2016-06-03T01:25:44Z

@chrisdickinson thanks for the feedback. Indeed our goals are slightly different. Also, I'll be starting with duktape and jerryscript as sample implementations of this interface, as I abhor C++ and that steers me away from V8. Once I have things stable it would be awesome to use your code to make a V8 implementation.

Also the scope of this project seems to be a bit slimmer. I won't have any opinions at all regarding streams, promises, etc. I just want to provide a common base for tools to be built.

chrisdickinson · 2016-06-03T01:28:47Z

@indutny Ah indeed! I was thinking about repurposing this code to do the hop from JS to compiled code.

@creationix Cool — I wish you the best of luck! I'd definitely encourage checking out gn as a metabuild tool, it's slightly opaque but is pretty slick after a bit of use. I'm collecting a list of possibly handy links on the process of gluing stuff together.

indutny · 2016-06-03T01:38:52Z

@chrisdickinson https://github.com/js-js/jit.js/blob/master/src/jit.cc#L56-L96 ;)

dominictarr · 2016-06-03T02:46:38Z

I am certainly of the opinion that @creationix's opinion-lite approach is the way to go. Streams should definitely not be in the "core"; there are way too many opinions in streams. We even have @creationix's min-streams and my pull-streams because we couldn't agree on one thing, and they are incredibly simple!

I think a project like this is really a C project. It looks like it's about JavaScript, but it's not. It's about finding a way for C libraries to easily plug into a thing. It seems to involve JavaScript, but would that even be necessary?

There are totally legitimate reasons not to include certain C libraries (personally, I'd like to be able to exclude openssl and build in libsodium instead; this would be ideal for secure decentralization projects). Clearly there are also different JS engines that target different use cases (jerryscript for low resource use vs v8 for performance).

I think that means that the particular C libraries used need to be loosely coupled, so that I can pull them in just by editing a config (or package.json).

@drom's point about embedding as a library would be super valuable too. That would make this easy to deploy as an Android app: just write a Java binding to it and then embed it directly into the same process.

dominictarr · 2016-06-03T02:59:58Z

but @chrisdickinson I think you are right about FFI. It's too hard to write a node binding, if you could just call a C function from "javascript" then we are done. Is that what you are thinking here?

dominictarr · 2016-06-03T03:00:57Z

Even if I have to put the args I am calling with into a buffer, that is still easier than the current way of writing node bindings.
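For a feel of what "args in a buffer" could look like from the JS side, here is a sketch that packs typed arguments into an ArrayBuffer and reads them back. No foreign function is actually called; `packArgs`, `unpackArgs`, the "i32"/"f64" type tags and the little-endian packed layout are all assumptions made for illustration, and `unpackArgs` merely plays the role the C side would have.

```javascript
// Byte sizes for the type tags supported by this sketch.
const SIZES = { i32: 4, f64: 8 };

// Packs `values` into an ArrayBuffer according to the `types` signature.
function packArgs(types, values) {
  let total = 0;
  types.forEach(function (type) {
    if (!Object.prototype.hasOwnProperty.call(SIZES, type)) {
      throw new Error("unsupported type: " + type);
    }
    total += SIZES[type];
  });
  const view = new DataView(new ArrayBuffer(total));
  let offset = 0;
  types.forEach(function (type, i) {
    if (type === "i32") view.setInt32(offset, values[i], true);
    else view.setFloat64(offset, values[i], true);
    offset += SIZES[type];
  });
  return view.buffer;
}

// Reads the values back out of the buffer using the same signature.
function unpackArgs(types, buffer) {
  const view = new DataView(buffer);
  let offset = 0;
  return types.map(function (type) {
    const value = type === "i32"
      ? view.getInt32(offset, true)
      : view.getFloat64(offset, true);
    offset += SIZES[type];
    return value;
  });
}

const packed = packArgs(["i32", "f64"], [-7, 2.5]);
```

A real FFI would also have to agree on alignment and calling convention with the native side; this sketch ignores both and packs fields back to back.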

dominictarr · 2016-06-03T03:48:15Z

I should also point out that you don't actually need a module system. If you can run one javascript file, then you can statically link the javascript. i.e. with browserify, or noderify (which is assembled from browserify parts to make node.js scripts start really fast)
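The idea of statically linking JavaScript can be sketched as a build step that wraps each module's source in a function, puts them in one table, and emits a single script. This is a conceptual illustration only, not browserify's actual output format; the `link` function and the two sample modules are made up for the example.

```javascript
// Build step: turn a table of module sources into one self-contained script.
function link(modules, entry) {
  const table = Object.keys(modules)
    .map(function (name) {
      return JSON.stringify(name) + ": function (require, module, exports) {\n" +
        modules[name] + "\n}";
    })
    .join(",\n");
  return "(function () {\n" +
    "var defs = {" + table + "};\n" +
    "var cache = {};\n" +
    "function require(name) {\n" +
    "  if (!cache[name]) {\n" +
    "    cache[name] = { exports: {} };\n" +
    "    defs[name](require, cache[name], cache[name].exports);\n" +
    "  }\n" +
    "  return cache[name].exports;\n" +
    "}\n" +
    "return require(" + JSON.stringify(entry) + ");\n" +
    "})()";
}

const script = link(
  {
    greet: "module.exports = function (who) { return 'hello ' + who; };",
    main: "exports.message = require('greet')('nucleus');",
  },
  "main"
);

// The runtime only has to evaluate this one string; no loader is needed at run time.
const result = new Function("return " + script + ";")();
```

All module resolution happens at build time, so the runtime needs nothing beyond "run one file".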

creationix · 2016-06-03T05:10:14Z

Initial core API is documented in the README and I just prototyped a duktape version (minus libuv and zip reading) that you can see in action.

main.js: This is the entry point of a sample app. It doesn't provide a require system and instead uses dofile directly to manually load its minimal libraries.

See it in action https://asciinema.org/a/b0yk23l05yhrw9mlp0uqik6pp

creationix · 2016-06-03T05:12:23Z

@dominictarr while it's true you don't need a module system, I do love a workflow that doesn't have build steps. As I demonstrated in the asciicast, you can run apps directly out of the source tree while developing without needing to rebuild the final binary. If the JS needs to go through a build step it breaks this simple workflow.

dlmanning · 2016-06-03T06:41:52Z

Given that JS now has a module system in its specification, it would seem strange to not build it in, no?

dominictarr · 2016-06-03T10:59:02Z

@dlmanning sure, if you are using a JavaScript engine that implements modules, then you could have that. The engines that @creationix is talking about starting with, JerryScript and duktape, both implement ES5.1.

dlmanning · 2016-06-03T13:08:00Z

@dominictarr sorry, I missed the bit about starting with JerryScript

chrisdickinson · 2016-06-03T16:51:53Z

@dominictarr:

"but @chrisdickinson I think you are right about FFI. It's too hard to write a node binding, if you could just call a C function from "javascript" then we are done. Is that what you are thinking here?"

Yep!

@dlmanning: Notably, the module system is only ~sorta implemented in stable V8 as well (flagged and, IIRC, incomplete).

dlmanning · 2016-06-03T17:12:25Z

@chrisdickinson : sure, it's a work in progress, but it's in progress.

(Don't worry, I have no desire to turn this thread into another ES Modules debate)

trevnorris · 2016-06-03T17:26:07Z

On the side, about import: it's not possible to resolve a path at runtime. This makes development of native modules a little more painful when you simply want to run:

$ NODE_DEBUG=1 ./node_g /path/to/my/module

and have it automatically pick up the Debug build of the binary. Setting up the application in this way, I'd assume there would be more than a few native modules written to extend the basic functionality.
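For illustration, the runtime choice described here can be sketched as a path computed from the environment and then handed to a dynamic loader such as require, which a static import declaration with its literal specifier cannot express. The `addonPath` helper is hypothetical, and the build/Debug and build/Release layout is an assumption modelled on node-gyp's default output directories.

```javascript
// Chooses the native addon path at runtime based on the environment.
// A truthy NODE_DEBUG selects the Debug build; otherwise the Release build.
function addonPath(env, name) {
  const flavor = env.NODE_DEBUG ? "Debug" : "Release";
  return "./build/" + flavor + "/" + name + ".node";
}
```

In node this would be used as `require(addonPath(process.env, "addon"))`, so the same script picks up whichever build matches how it was launched.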

dlmanning · 2016-06-03T18:01:01Z

@trevnorris : Seems like it would be good to provide a separate means of deliberately loading dynamically?

matthewp · 2016-06-03T19:02:04Z

Good choice on splitting the module system into user-land. I agree both with @creationix that having one is good for development and with @dominictarr that it isn't needed for production. Is main.js as the entry point going to be configurable? I'd like to have a separate dev.js and prod.js so I can do both.

This is going to be amazing for transpile-to-js languages, you essentially get statically linked small(ish) binaries for free if you just choose JS as your target.

creationix · 2016-06-03T19:21:44Z

@matthewp luvi has an option to override the entry point, but it's tricky designing the CLI without resorting to environment variables that can cause security vulnerabilities.

That said, you can have a main.js that loads a real main of your choice based on some env or argument.
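A sketch of such a dispatching main.js, reduced to the decision itself. The `chooseMain` function, the `--main=` argument and the `NUCLEUS_MAIN` variable are hypothetical names invented for this example; dev.js and prod.js are the file names from the question above.

```javascript
// Picks the script to run: an explicit "--main=<file>" argument wins,
// then a (hypothetical) NUCLEUS_MAIN environment variable, then a default.
function chooseMain(args, env) {
  for (const arg of args) {
    if (arg.indexOf("--main=") === 0) return arg.slice("--main=".length);
  }
  if (env.NUCLEUS_MAIN) return env.NUCLEUS_MAIN;
  return "prod.js";
}
```

The real main.js would then load and run the chosen file with whatever primitive the core offers for that. Defaulting to the production entry point means nothing changes unless the caller asks for it.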

matthewp · 2016-06-03T19:30:28Z

I assume you mean dynamically load the real main? That would defeat the purpose of it being "statically linked". I mean, this is not a real issue, just a nicety. Can always have your build script / makefile do:

mv main.js _main.js
browserify _main.js > main.js
nucleus ...
mv _main.js main.js

creationix · 2016-06-03T19:40:30Z

@matthewp of course. If you want something done at build time, do it with your build tool. If you want something done at runtime, do it with your runtime. :)

creationix · 2016-06-06T19:29:36Z

I think the core design is now stable-ish and mostly documented in the README. I'm going to close this for now. Create new issues as problems come up.

Thanks everyone for the feedback and encouragement. See you at nodeconf if you're going!

creationix closed this as completed Jun 6, 2016
