Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Call a nbind::cbFunction with nbind::Buffer? #91

Open
sambhare opened this issue Sep 18, 2017 · 7 comments
Open

Call a nbind::cbFunction with nbind::Buffer? #91

sambhare opened this issue Sep 18, 2017 · 7 comments

Comments

@sambhare
Copy link

Is it possible to create an nbind::Buffer object inside a wrapped function and send that object to Javascript via a callback? I created the following example but all I see on the Javascript side a null.

#include <iostream>
#include <string>
#include "nbind/api.h"
class BufferCB
{
public:
    static void callbackWithBuffer(nbind::cbFunction& callback)
    {
        unsigned char* data = new unsigned char[_length];
        for (size_t pos = 0; pos < _length; ++pos) {
            data[pos] = pos;
        }
        nbind::Buffer buf(data, _length);
        callback.call<void>(buf);
        // Not sure if memory to be freed in C++ or JS
    }
    const static size_t _length = 50000;
};

// For NBIND_CLASS() and method() macros.
#include "nbind/nbind.h"
#ifdef NBIND_CLASS
NBIND_CLASS(BufferCB)
{
    method(callbackWithBuffer);
}
#endif

Corresponding JavaScript code (buftest.js):

var nbind = require('nbind')
var lib = nbind.init().lib
lib.BufferCB.callbackWithBuffer((buf) => {
  console.log(buf)
})

Outputs:

$ node buftest.js
null

Attempting to console.log(buf.length) results in a TypeError: Cannot read property 'length' of null

@Luthaf
Copy link

Luthaf commented Oct 2, 2017

I am interested in this functionality too, to provide memory view into some std::vector to the javascript side.

From browsing the sources, it looks like we can not use nbind::Buffer as a return type:

return(Nan::Null());
. I'll try to write my own BindingType instantiation for this, I'll let you know if it works =)

@jjrv
Copy link
Member

jjrv commented Oct 2, 2017

@Luthaf, thanks for looking into this! I think the reason it wasn't implemented before is complexities in the Emscripten target. JavaScript can receive a view of the C++ memory, but would have to explicitly release the reference because in browsers, the C++ code cannot know when the JavaScript object gets garbage collected. Then it's easy to leak memory in the C++ heap, which is already restricted to a constant size. Another option is to pass the buffer as a copy, but that negates many of the benefits.

Can you think of a solution portable also into Emscripten? Otherwise, this could still be released as a native-only feature.

@Luthaf
Copy link

Luthaf commented Oct 2, 2017

I am not sure I understand the issue here. Do C++ destructors still run with Emscripten? Then the problem is that we don't know when to run destructors because the JS side could still be holding a reference to the buffer data?


My use case looks like this:

class Container {
private:
    // buffer is something like a std::vector, 
    // except I can't access the actual std::vector directly
    some_type buffer;
public:
    size_t length() {return data.length();}
    double* data() {return &buffer[0];}
}

I would like to provide read and write access to the data to the JS side, but I don't want the JS side to free the memory at any point. The C++ object is the owner of the memory.

This is fundamentally unsafe, as the vector can be resized and then the pointer becomes invalid. But this is OK for my use case. I've got something working with the following code, but I only implemented the node.js part for now.

@jjrv
Copy link
Member

jjrv commented Oct 3, 2017

Exactly, in Emscripten we can't know what references JS is holding. It's like security boxing is reversed compared to Node.js native code: C++ is limited to accessing a single typed array and has no knowledge of JS engine internals. Meanwhile JS has direct access to the entire C++ heap.

The current nbind::Buffer holds a reference to memory owned by JavaScript. If a new buffer is returned, the memory is allocated by C++ which then would end up owning it. This needs to be handled very differently in the nbind::Buffer class. In native Node.js it's important that the memory stays allocated until JS has lost all references to it, or the process will crash or allow security exploits. In Emscripten, either JS has to explicitly release references or the programmer must take care not to access de-allocated memory. However, the worst that can happen is the JS code behaving strangely.

Returning buffers would be more performant in Emscripten, because they can be views into the C++ heap. Meanwhile JS-allocated arrays are currently copied, because the Asm.js engine has no memory mapping support for directly accessing any JS object other than its own initial heap.

@Luthaf
Copy link

Luthaf commented Oct 3, 2017

Ok, so what I need is different from nbind::Buffer and more like an externalized v8 ArrayBuffer. It could be made working with emscriten without too many issues (JS code behaving strangely at worst). For the node.js side, you use std::shared_ptr to hold everything, am I right ? Then you could use the aliasing constructor of std::shared_ptr to build the new shared buffer, making it use the same reference count as the memory owner. But I don't know if it is desirable to have in nbind, as it might be relatively hard to document, and the behaviour would change between node and emscripten.

@jjrv
Copy link
Member

jjrv commented Oct 3, 2017

It seems like a useful addition to nbind. I guess from C++ you'd return a struct by value, containing std::shared_ptr<unsigned char *> and size_t len. Then toWireType would create a new externalized v8::ArrayBuffer and a new heap-allocated C++ object holding an std::shared_ptr<unsigned char *> to the data and a Nan::Persistent<v8::Object> with a weak reference to the ArrayBuffer. Once the weak reference GC callback fires, the object deletes itself which also resets the shared_ptr it was holding.

Such an wrapper object already exists here in External.h. It seems it could be re-used, if the template argument is changed from Data to DataPtr. The addDestructor function in that same file should then create it using new Destructor<Data *>(callback, data, handle); and the new Buffer wrapper could instead use new Destructor<std::shared_ptr<unsigned char *> >(...); passing the ArrayBuffer as the handle.

The ArrayBuffer is then passed to JavaScript, which doesn't need any additional wrappers around the buffer it receives... Except maybe it's possible to add a no-op free method on it, for future Emscripten compatibility.

I guess you could use void * in place of unsigned char *. Or maybe the buffer class should be templated to hold the right type of pointer. So you in C++ you do return ExternalBuffer(data, len); or callback.call<void>(ExternalBuffer(data, len)); where data should be any shared_ptr or unique_ptr. I don't think naked pointers should be supported, it's too dangerous and unclear who will free them.

@ISNIT0
Copy link
Collaborator

ISNIT0 commented Feb 22, 2019

For those looking for a working example of this, I have a Node binding working here passing a buffer via a callback:
https://github.com/openzim/node-libzim/blob/e790bfc55352a027f1e4d7e8acce9bfaf369c461/src/ZimReader.cc#L45

On the JS side I had to use a slice to avoid memory issues: https://github.com/openzim/node-libzim/blob/e790bfc55352a027f1e4d7e8acce9bfaf369c461/src/ZimReader.ts#L39

This is currently a bit of a hack and definitely not efficient, but it's working well enough for demos 😄

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants