reconsider name mangling on api #7

Open
roojs opened this Issue Sep 12, 2011 · 20 comments

Comments

Projects
None yet
6 participants

roojs commented Sep 12, 2011

saw bergies post in node gir, im the maintainer of seed, so if you need help there is a javascript mailing list on gnome.

Please rethink the camels method mangling, it will break all compatibility between gjs and seed.

The connect<>on mentioned should be feasible in the javascript wrapper layer

Contributor

bergie commented Sep 12, 2011

I don't think the GIR API for Node.js can be compatible with Gjs or Seed anyway. Currently it uses the call-return pattern, but Node's asynchronous nature would require it to follow the Continuation-Passing Style instead.

So, this should not be the way:

if (!mgd.open_config(configObject)) { process.exit() }

But instead:

mgd.openConfig(configObject, function(status) {
    if (!status) { process.exit(); }
});

Otherwise the GIR calls can't be handled in their own threads and will freeze up the Node.js server. And the API would feel too unnatural for Node developers.

And BTW, hey Alan! :-)

roojs commented Sep 12, 2011

My first impression is that It should not be that difficult to implement callback style functions in seed, if params+1 is a function, call it... with the return value(s).

Actually, it's quite a nice solution to our current return mangling mess.

I presume that is how node-gir will do it.

I would presume gjs should be trivial to do as well.

function listSelectedFiles(callback) {

      treeview.get_selection(function(s) {
         s.count_selected_rows(function(n) {
        if (n < 1) callback([]);
            var files = [];
            s.selected_foreach(function(model,p,iter) {
             model.get_value(iter, 0, function(value) {
                 value.get_string(function(f) {
                   files.push(f);
            })
        }, function() {
                callback(files);
            }
    });
}

Looks like it add's quite a bit of noise to rather trivial code.... is that the only way node can work?

function getSelectedFiles() {
    if (this.el.get_selection().count_selected_rows() < 1) {
        //nothing? - clea it?
        return [];
    }
    var ret = {};         
    var s = this.el.get_selection();
    var files = [];
    s.selected_foreach(function(model,p,iter) {
       files.push( model.get_value(iter, 0).value.get_string());
    });
    return files;
},

roojs closed this Sep 12, 2011

roojs reopened this Sep 12, 2011

Owner

creationix commented Sep 12, 2011

Sorry, I haven't had time to work on this is a while. I've just been pulling in others work to help promote the efforts to now die out. I'm all for using the same naming conventions of seed where possible. As far as node being async, it only needs to be async when doing I/O. This is probably hard since I doubt the GIR information tells us if a particular function does blocking I/O or not. But for anything that's not I/O bound (like the bulk of gtk widgets) I think the node bindings should stay syncronous as well. FFI does have some overhead, but so does pushing work out to the thread-pool and syncing things. Plus async APIs are pretty nasty when you don't need them.

Contributor

bergie commented Sep 13, 2011

In case GIR doesn't provide us the needed information on blocking I/O, I'd rather go on the safe side, and force all methods to be async.

At least our GIR use case (Midgard content repository) will be rather I/O-intensive, and if that will be blocking then we just can't use it.

roojs commented Sep 13, 2011

From my use of GLib spawning, which might be quite pointless with Node, but it does show how a blocking/async interface tends to look with GObject libraries.

in gio's case, it does
GLib.spawn_async_with_pipes

then you add a watcher to the pid to get completion
GLib.child_watch_add()
and an io watch to monitor the stdin/stdout.
GLib.io_add_watch()

It's probably worth following the same or similar API for Midgard, rather than relying on a feature of Node if you need other bindings to work as well (eg. python/php etc..)

Contributor

bergie commented Sep 13, 2011

@roojs good point, but the problem is that Midgard's API hasn't been designed with asynchronous environments in mind. It might be possible to change, but not very soon (pinging @piotras). And many other libraries are likely to be similar.

I'm looking forward to using node-gir on web servers, and there synchronous calls are very bad thing to have.

Collaborator

piotras commented Sep 16, 2011

No idea how it affects performance, but CamelCase methods feels more natural in every language.

@bergie, wrt async, I think I need more use case examples. If you think about GLib's spawning API, Midgard will work in forked processe quite well. The problem are operations, not commands.

Contributor

swick commented Sep 16, 2011

the problem is that each(!) call of a gobject's method may block (in gst e.g. they lock a mutex to ensure, that it works with multiple threads). So we've got two choices: we make everything synchronous or we make everything asynchronous.

Contributor

bergie commented Sep 19, 2011

@swick then my suggestion stands:

In case GIR doesn't provide us the needed information on blocking I/O, I'd rather go on the safe side, and force all methods to be async.

Contributor

swick commented Sep 19, 2011

+1 on that but I don't think that I will implement anything in the direction of libev in the next few weeks. I would rather wait for libuv to be ready.

Collaborator

piotras commented Sep 19, 2011

I agree with @bergie. What happens under the GLib/GObject hood is not so important. If async is natural for Node, let it be this way.

Owner

creationix commented Sep 19, 2011

The problem that async APIs are much harder to use. Node uses then when necessary to keep the reactor loop from blocking, but this is only done when there is I/O wait involved. Forcing CPU bound functions to be async by using eio_custom is not a free conversion. It's very heavy on CPU, especially when there is high concurrency because of the mutex locking to keep the threads syncronized. All I/O in node, except for filesystem I/O uses a multiplexing select loop, so there are no threads to worry about. That's where node is really fast and shines. The filesystem, on the other hand is a thread-pool because there is no native non-blocking filesystem on most unix systems.

I'm not just saying this out of theory, I once implemented a database in pure node, and the filesystem interaction was terrible for any considerable level of concurrency.

Also from an API point of view, async APIs are just harder to use. Also there is a difference between true async (where the callback happens on a new execution stack) and a function that simply takes a callback, but calls it before returning (like Array.prototype.forEach). To make something that's doesn't have a natural event source in it be truly async, you have to use setTimeout or process.nextTick, and then this introduces an extra level of complexity that's probably not needed.

I'm not saying that we should go for async when it makes sense, but don't say "async is natural for node" as a blanket statement. Node is only as async as needed to keep the event loop running smoothly, but no more. Forcing functions to be async that don't need to be is in many cases worse than blocking on filesystem operations.

We really need a way to know if a function is going to block on I/O, especially internet I/O. Since gir is not manual bindings, this is tricky. Maybe the solution is to allow the developer the option of calling a function async or sync and leave it up to them to chose the right solution. If anyone has a better idea, I'm all ears, but simply making everything async is not a solution.

Contributor

bergie commented Sep 19, 2011

One option might be to generate two methods for each method we get from GIR:

  • methodName(param1, param2, callback) - calling the method asynchronously
  • methodNameSync(param1, param2) - calling the method synchronously

This would feel reasonably similar to Node's filesystem API.

We should probably raise the issue of getting info on whether a method is a blocking one or not via GIR.

Owner

creationix commented Sep 19, 2011

Sounds good. I don't really have a preference if we go for two explicit versions of each function or a single combined one where the presence of a final callback makes it async.

Node actually has both. The higher-level public facing API exposes two functions as noted in the API docs, but the underlying implementation is actually only one function.

I have good and bad feelings about both, so either would be fine. And yes, we should bring this up with the upstream GIR people to see if we can get some annotation to find out if a function makes blocking network or filesystem calls.

elima commented Oct 2, 2011

Asynchronous API calls can be identified by certain patterns. For example, the presence of a function returning void, a 'GAsyncReadyCallback callback' coupled with a 'gpointer user_data' arguments, and function with the same name plus '_finish' suffix indicates an asynchronous operation.

That is the standard async pattern in GIO, and all GLib based libraries should use it. So I guess we can rely on that pattern and "trust" that GIO compliant libraries will implement it correctly.

Another way to detect it is by checking if there is any argument that is a pointer to a function, and check its 'scope' annotation. There are currently three types to consider: (call), (async) and (notified). They are explained at https://live.gnome.org/GObjectIntrospection/Annotations. 'call' is for cases similar to Array.forEach, while 'async' indicates a "one-shot" callback (remove after called) and is probably enough to decide it is an async op. 'notified', in the other hand, is for callbacks that live longer and can be called more than once.

About name mangling, I have mixed feeling but I'm more favor to use the underscore pattern and provide the CamelCase syntax in a JS layer on top of node-gir. That would not be hard to do and would provide consistency across the different engines, yet developers have the two syntax available to decide which to use in their code.

Thanks for bringing node-gir up! cheers

Contributor

bergie commented Oct 2, 2011

About name mangling, I have mixed feeling but I'm more favor to use the underscore pattern and provide the CamelCase syntax in a JS layer on top of node-gir.

I think the question is, are we targeting GNOME developers, or JS developers with this. For GNOME developers, some consistency across different GIR implementations would be valuable (meaning we should use underscores). For JS developers, camelCase would feel much more natural.

My gut feel is that the latter is the target group here, but I'm not entirely sure.

Contributor

swick commented Oct 2, 2011

@elima: I think you missunderstood the problem. gir gives us information if a function is async or not, so no problem there. The problem is that when glib is doing file or network io it is blocking the proccess und then we have a problem. These functions are non-async in glib but must be async in node. gir has no information about file/network io (at least not yet).

elima commented Oct 2, 2011

@swick, yeah I might have misunderstood it :).

First thing GLib developers (and anyone developing to an event-driven platform) learn is that sync ops block the main loop and should be forbidden unless they are moved to a thread.

GIO provides sync and async ops, and it is library user's responsibility to know which of those to use. Then I don't quite grasp the problem you are trying to fix in node-gir (likely because I don't know its internals) but in GJS/Seed we just get blocked too if a user calls a sync API. They don't protect against sync calls because it is user's fault not using the async version.

@bergie: yes, but maybe instead of decide which group to "sacrifice" :), maybe we can provide the underscore mangling version as the base in all engines (GJS/Seed/Node-gir); and have each engine to provide a thin layer on top of it, that does the conversion to CamelCase. That thin layer could also be common to the different engines.

Contributor

bergie commented Oct 3, 2011

GIO provides sync and async ops, and it is library user's responsibility to know which of those to use.

That works quite poorly with Node, where threading isn't really available out-of-the-box, and blocking operations block the whole process (which might be a web server). There are things like Web Workers that you could install and use, but that is an extra step I can't see people taking.

Owner

creationix commented Oct 3, 2011

There is a thread-pool in node that bindings have access to. It's what's used by the FS module in node. (since file-system operations are all blocking on most OSes).

I don't think many people will be making gtk3 apps that also serve 200,000 concurrent websocket clients in the same process, but you never know. It is more likely that a simple app will have some GIO library in use and some local web-server (like a media streamer), or maybe it just wants to use setTimeout to defer some action. If you're blocking the event loop forever, the timers will never fire. Short blocking operations like file system access or network I/O will block for a little while. It will kill the performance of any network server that expects to have high load, but not hurt much something that's less sensitive to it's events getting delayed a little.

I think that since there is no way of knowing if an API call is going to perform blocking I/O, then we should both offer async and sync versions. It's not ideal, but a good compromise.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment