Skip to content
Find file
Fetching contributors…
Cannot retrieve contributors at this time
379 lines (275 sloc) 13.7 KB
Good morning everyone, and I hope all of you had fun in the first few sessions.
Since I don't have much time, I'll jump right in. This talk is about writing
bindings so that you can use native C/C++ libraries in node.js projects. This
talk is useful if you want to do that, OR you are interested in some node.js/V8
internals for curiosity's sake. I will assume you know C++ fairly well. The talk
is very example heavy. You'll have a lot of C++ thrown onto the screen. That
said I have intentionally not shown any actual binding. This is because
wrapping your head around 2 things in 30min might be too much. Rather I show
the V8/node side of the problem and at the end will point you to tiny bindings
that serve as good examples. I suggest you clone the git repository and follow
along in the code.
We want to
Some examples of requiring bindings are to do image processing using OpenCV, or
OpenSSH for cryptography (in-built) or parsing XML using libxmljs.
The V8 engine is very modular and built to easily allow these bindings. Most of
these concepts also apply to Spidermonkey too.
The biggest reason everyone is using node is that it makes asynchronous I/O
easy. For any bindings with I/O you will want it to be async too, so I'm going
to cover that.
Getting started
So let's jump into the first piece of code. This is a outline of how any module
looks like. You'll always want the v8 and node headers. The NODE_MODULE macro
allows you to define the name of the module and specify the function called to
initialize your module. This function receives a single JS object, the module
scope, in which you can set up your library. Think of this `target` as the
`exports` object used in JS modules.
There should be one and only one initialization function for every node module.
Node uses the WAF build tool to compile addons. When you install node, the
node-waf tool is also installed which is aware of node specific information.
Just like make has a Makefile, waf has a wscript file which is just Python code.
The standard build for a node module is given here. This will create the
dynamic library in build/Release
You specify you want to use C++. The node_addon allows linking with v8 and node
and includes their header files in the build process. obj.source can be
multiple space separated filenames.
To compile use `node-waf configure` the first time, and then `node-waf` by
itself for subsequent recompiles. At the end of the talk I'll show you how to
actually link your external library into the code by modifying the wscript.
If you now require() your module, you should see it is empty but imported
.. code-block:: bash
$ node-waf configure build
'build' finished successfully (0.327s)
$ node
.. code-block:: js
> require('./build/Release/firststep')
Your node module acts as a conduit between V8 and your library. Indeed, node is
only a tiny wrapper over V8 and most of the time you'll be touching only V8.
The V8 API documentation is therefore a good thing to have open. One version
can be found at that link.
.. image:: images/architecture.png
:width: 12cm
.. [#] you **will** need
So before we start interacting with V8, let's discuss Handles, since you will *always* use them and access all V8 related objects through them.
Every V8 Javascript/process runs in a v8::Context, and memory is managed in
that context by the v8 GC. Heap allocated objects are referenced by the Handle
class, which acts as a smart pointer. Handles are stack-based and lightweight
(so you can pass them around by value).
When the GC finds an object with no handles to them in native code, and also no
references to them from JS, it is free to delete the object. It is usually
convenient to load a new HandleScope (something like a stack frame for Handles)
in every function, so that all Handles are safely reclaimed in one-shot.
Handles come in two varieties. A object referenced only by a Local handle is
deleted when the handle goes out of scope. A Persistent handle on the other
hand keep their memory around until manually disposed (of course, destroying
the context in which the Handle was created, will take down all handles). In
bindings, Persistent handles are usually used for classes and function
templates and not much else. Sometimes if you want to pass around function
callbacks between different C++ scopes, then you will use a Persistent
reference to that JS function.
Injecting primitives
With all that behind us, we can finally start writing some real code. The simple thing you can do is introduce variables into the module from C++. This could be used to report the version of your C++ library, expose constants to JS and to populate data. Understand that target is a JS object, so you simply Set integer or string keys into it. Here I have introduced the value of PI into my module.
For constants specified by macros, node provides a handle macro. All the
standard JS types have their V8 analogues. One common pattern I would like to
stress is the use of a static New method on all the V8 classes. This is the
right way to create V8 objects so that the GC can track them. All of these New
methods will always return a Local handle. To make them persistent, an explicit
cast is required.
> Play with the module in a REPL.
Simple functions
Now primitives are useful, but to actually start executing code based on what
happens in the JS environment, we need functions. In V8 every JS function can
map to a native function. I'll show the code first and then we'll understand
the concepts involved. For demonstration, let's convert the simple square
function into C++.
Simple functions
All functions mapped to JS have a signature returning a Value and accepting
a v8::Arguments instance that has information about the function call. This
information includes the arguments themselves, and references to the This
object, the context of the call and something called the Holder.
Once your template is set up, you can get a function object to be injected into
JS via GetFunction. The square function is now available as exports.square.
Simple functions
The actual implementation of functions is pretty easy. You'll extract and
type-check the arguments, do your computations or pass on the arguments to your
library, and then compute a value to give back to the JS environment.
Here is where Scope comes into play. All the arguments extracted become part of
the closest HandleScope. When the C++ function returns, this scope will be
destroyed and all all Locals will be GCed. But you want to preserve the return
value. The Close method will make the return value part of the next closest
scope in the JS environment (usually the scope of the caller).
Function themselves are held as a FunctionTemplate. Since objects are also
created from functions in JavaScript, the FunctionTemplate acts as an
intermediate staging ground for you to set properties on them. It provides
access to the prototype object and the instance object.
The instance object is what becomes 'this' inside the function when the
function is called as a constructor. Prototypal inheritance is also allowed
using FunctionTemplate::Inherits, but I won't be going into that
Simple objects
Since the same FunctionTemplate system is used to create objects, the code is
not much different. To implement this standard OO style in C++, you would
Simple objects
Simply set the items property of the instance to the corresponding value.
The inventory function callback will itself return args.This(). When invoked as
a constructor, this will be an empty object instantiated by V8.
Methods are usually associated with the prototype, so what do you think the
same will be in C++? That's right, use PrototypeTemplate() instead. There are
one or two intricacies which I will explain later, which are abstracted away by
the NODE_SET_PROTOTYPE_METHOD macro. The C++ function code for methods doesn't
change compared to functions, except that you access and use properties from
the This object.
Registering prototype methods
Object properties can be accessed by treating the This object as a simple
dictionary and using Get/Set
Throwing exceptions is very easy in V8, you just return a ThrowException call
which will schedule an exception to be thrown
Now that we've covered functions and methods, we are ready to start dealing
with bindings where you have some internal objects. For example when parsing
XML, every DOM document in JS should have a backing C++ DOM document.
Associating these objects is made easier by a node class called ObjectWrap. It
is a two level process. You create a ObjectWrap subclass, hold references to
your binding related object in it. All the methods that the JS environment
should be able to invoke on the object have corresponding C++ equivalents in
this class. Then whenever a C++ function is called, it invokes the
corresponding native method on the native object.
V8 Objects have special fields call internal fields which are not touched by V8
itself, so you can store arbitrary data in them. ObjectWrap::Wrap allows you to
associate your subclass with one of those objects.
Let's say this is your native library
To tell V8 you are using n internal fields, you've to say so.
In the constructor C++ equivalent function, you create a new Inventory
instance. Here Inventory is the ObjectWrap subclass. You then call Wrap. Notice
I've used args.Holder() rather than args.This(). I'll explain why in just a while.
Every 'method' then gets the reference to the wrapper by using Unwrap. Unwrap
just retrieves the internal field. You could ditch ObjectWrap and just use
internal fields directly, what ObjectWrap does is automates proper deletion of
your object when the V8 object itself gets deleted.
Going Async
Finally, we are at the reason people want to use node, because it moves I/O
transparently to another thread and makes the JS code non-blocking. libuv has
now abstracted away all the threading and libev code behind one function to
make async calls very convenient. That said, coding async calls can still be
Going Async
In the code, there is a folder 'sync' which does not use any async features. It
has a similar script to the one shown here. If you run that you'll see the
"After reshelve" is actually printed after 5 seconds.
Going Async
The native blocking code simply sleeps for 5 seconds, although usually you'll
do I/O in there.
Going Async
The baton is used to exchange all data. It's first member should always be
a uv_work_t structure. This is very essential. Then since we want to invoke the
callback later, we pass it, and since we want a reference to the object since
we want to invoke a method on it in the I/O thread.
Going Async
This is the first function.
Rather than returning or calling the callback, we create the Baton, populate it and call uv_queue_work. Notice how we made the function persistent since we are going to be passing it around.
queue_work() takes the event loop, which is almost always the default loop, the
request object, and the function which is to be run parallelly, and the one to
be called after the blocking function returns.
Going Async
Thread pool function
The thread pool function should not interact with V8 or V8 objects at all. It
is running in a separate thread, and fiddling will cause unpredictable results.
Just extract the baton from the request object's data field and do your
blocking work.
Going Async
Clean up
The After function calls the callback now that the blocking task has finished.
Since we had made the callback persistent, it is also our job to dispose it off
and delete the baton.
Going Async
Running async/test.js should immediately show the difference between the synchronous and asynchronous versions.
.. code-block:: txt
After reshelve in source
Reshelving done
Linking your library
With that we are almost at the end. To actually link your library in the
compile step, you'll want to tell Waf to do so. Waf uses pkg-config to find the
location. Simply using the alias in the build stage should do it. It is possible to directly
link static libraries into the node executable, but I don't know how to do it
with the new build system, yet.
Holder vs This
So the final remaining inconsistency.
args.Holder() refers to the object it should've been called on
so that prototype chains work.
Things I haven't covered
* Accessors
* Per property accessors
* Indexed accessors ( `object[5]` )
* Named property accessors ( `` )
* Function Signatures and HasInstance for type safety
* Emitting events using new JS only EventEmitter
* Details of libuv
* Using V8 on its own
You might want to look at
these bindings are good references when writing your own.
End notes
That's it. Thank you very much for listening to me, I hope it was enriching.
Before you ask me questions, I have one question to ask, which is just a little
survey I'm doing.
What do you think is the one technology/cultural value/concept that you think
*every* software developer, whether he be a assembly programmer or a web
developer, should know?
Please drop me replies at either of these places.
The PDF has some more slides you may find useful.
Questions, thanks etc.
Something went wrong with that request. Please try again.