Measure performance with Canvas #77

aruneshchandra · 2017-02-01T18:53:12Z

No description provided.

jasongin · 2017-02-08T23:35:13Z

I ran the Canvas benchmarks on my Windows machine, comparing results from before and after the NAPI port (using the same build of NAPI-enabled node). While some benchmarks show no measurable change, others are up to 5x slower on NAPI:

Note that the benchmarks that show significant slowness from NAPI are the ones that have a high number of operations per second -- that is, they have very frequent calls through the NAPI layer. The first data point there, lineTo(), does very little work on its own, so a majority of the benchmark time is spent calling in and out of the NAPI layer.

With the current APIs, every call from JavaScript to C++ requires 4-6 NAPI calls, not including any additional parameter type validation and retrieval that may be done by the C++ function being called. The sequence is (in pseudocode):

argc = napi_get_cb_args_length();
argv = malloc(argc * sizeof(napi_value)); // One slot per argument
napi_get_cb_args(argv);
callbackData = napi_get_cb_data();
thisWrapper = napi_get_cb_this(); // Even static methods in JS have a 'this' (usually 'global')
thisArg = napi_unwrap(thisWrapper); // Only called for instance methods
returnValue = callbackData->method(thisArg, argv, callbackData->userData); // Call the user function
napi_set_return_value(returnValue); // Only called for methods that have a non-void return type

The lineTo() benchmark scenario makes four additional NAPI calls to validate and retrieve its arguments: 2 calls each to napi_get_type_of_value() and napi_get_value_number().

I did some experiments and some math, and found that on my machine every NAPI call costs approximately 25ns. That's actually not much, but I think there are some things we can do to reduce the number of NAPI calls required.

To be continued...

jasongin · 2017-02-09T00:47:56Z

To reduce the number of NAPI calls required for every call, we could define an ugly API that looks something like this, to retrieve all the callback info at once:

napi_status napi_get_cb_info(
  napi_env e,                // [in] NAPI environment handle
  napi_callback_info cbinfo, // [in] Opaque callback-info handle
  int* argc,                 // [in-out] Specifies the size of the provided argv and argt arrays
                             // and receives the actual count of args.
  napi_value* argv,          // [out] Array of values
  napi_valuetype* argt,      // [out] Optional array of value types, for optimizing arg validation
  napi_value* thisArg,       // [out] Receives the JS 'this' arg for the call
  void** data);              // [out] Receives the data pointer for the callback.

While we could skip the optional argt array there, it would make canvas faster because canvas does frequent type-checking on arguments, for both validation and method overloading.
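To make the calling pattern concrete, here is a rough sketch of how a lineTo()-style callback might use a single combined call. Everything in the `sketch` namespace is a stand-in invented for illustration: `Value`, `ValueType`, `CallbackInfo`, `GetCallbackInfo` and `LineTo` are not real NAPI definitions, and `GetCallbackInfo` only mimics the shape of the signature proposed above.

```cpp
#include <cassert>
#include <vector>

namespace sketch {

// Stand-in types for illustration only; NOT the real NAPI definitions.
enum class ValueType { Undefined, Number, String, Object };
struct Value {
  ValueType type;
  double number;
};

// Stand-in for the opaque callback-info handle.
struct CallbackInfo {
  std::vector<Value> args;
  Value thisArg;
  void* data;
};

// Mimics the shape of the proposed napi_get_cb_info: one call returns the
// arguments, their types, 'this', and the callback data.
// argc is in-out: capacity of argv/argt on entry, actual arg count on exit.
inline bool GetCallbackInfo(const CallbackInfo& info, int* argc, Value* argv,
                            ValueType* argt, Value* thisArg, void** data) {
  assert(argc != nullptr && argv != nullptr);
  const int capacity = *argc;
  const int actual = static_cast<int>(info.args.size());
  for (int i = 0; i < capacity && i < actual; ++i) {
    argv[i] = info.args[i];
    if (argt != nullptr) argt[i] = info.args[i].type;  // argt is optional
  }
  *argc = actual;
  if (thisArg != nullptr) *thisArg = info.thisArg;
  if (data != nullptr) *data = info.data;
  return true;
}

// A lineTo-style callback: validates two numeric arguments from argt
// instead of issuing a separate type query per argument.
inline bool LineTo(const CallbackInfo& info, double* x, double* y) {
  int argc = 2;
  Value argv[2] = {};
  ValueType argt[2] = {};
  Value thisArg = {};
  void* data = nullptr;
  GetCallbackInfo(info, &argc, argv, argt, &thisArg, &data);
  if (argc < 2 || argt[0] != ValueType::Number ||
      argt[1] != ValueType::Number) {
    return false;  // wrong arity or wrong types
  }
  *x = argv[0].number;
  *y = argv[1].number;
  return true;
}

}  // namespace sketch
```

The point of the sketch is only that type validation can read from the argt array filled in by the one call, rather than crossing the NAPI boundary again for each argument.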

In the case of the lineTo() benchmark scenario, this API could reduce the number of NAPI calls per operation from 9 to 4. That would theoretically reduce the per-operation time from 0.38 μs to around 0.25 μs, which puts the NAPI version at 0.25/0.08 = 313% of the baseline time. Still not great, but this is an extreme case.
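The estimate can be reproduced with a little arithmetic. The sketch below assumes the figures quoted in this thread (roughly 25 ns per NAPI call, 0.38 μs per lineTo() operation) and a fixed cost per call; the constant and function names are made up for illustration. One plausible reading of the call counts: 9 is the five per-call NAPI calls for a void instance method plus the four argument validation/retrieval calls, and 4 is the combined info call, napi_unwrap(), and two napi_get_value_number() calls.

```cpp
#include <cassert>

// Approximate per-call cost quoted in this thread; the name is hypothetical.
constexpr double kNapiCallCostUs = 0.025;  // ~25 ns per NAPI call

// Estimated per-operation time (in microseconds) after cutting the number
// of NAPI calls, assuming every call costs the same fixed amount.
inline double EstimateOpTimeUs(double currentUs, int callsBefore,
                               int callsAfter) {
  assert(callsAfter <= callsBefore);
  return currentUs - (callsBefore - callsAfter) * kNapiCallCostUs;
}
```

Calling `EstimateOpTimeUs(0.38, 9, 4)` removes five calls at about 25 ns each, which gives roughly 0.255 μs, in line with the "around 0.25 μs" figure above.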

jasongin · 2017-02-09T21:53:39Z

I still want to test canvas perf on a non-Windows system, since we might find different performance characteristics for calling through the NAPI layer.

mhdawson · 2017-02-09T21:53:50Z

Are you actually doing a malloc as shown in the pseudocode? That's going to be a killer, I think. We should do a stack allocation (even if we have to overestimate the size we need) up to a certain number of parameters, since 99% of the time the argument count will probably be less than ~6.

jasongin · 2017-02-09T21:56:52Z

Currently it's using a std::vector which I believe allocates on the heap. But yes, I had thought about allocating space for some small fixed number of args on the stack, then using the heap only for extreme cases. I'll try that and see if it makes a measurable difference in performance.
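For reference, the stack-first idea could look something like the sketch below. This is only an illustration, not the code in the NAPI wrapper: `SmallArgBuffer` is a hypothetical helper, and the inline capacity of 6 is just the number floated above.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>

// Hypothetical helper: keeps up to InlineCount elements inside the object
// (on the stack when the buffer is a local variable) and falls back to a
// heap allocation only when more elements are requested.
template <typename T, std::size_t InlineCount = 6>
class SmallArgBuffer {
 public:
  explicit SmallArgBuffer(std::size_t count) : count_(count) {
    if (count > InlineCount) {
      heap_ = std::make_unique<T[]>(count);  // rare path: many arguments
    }
  }

  // Pointer to storage with room for size() elements.
  T* data() { return heap_ ? heap_.get() : inline_; }
  std::size_t size() const { return count_; }
  bool on_heap() const { return heap_ != nullptr; }

 private:
  std::size_t count_;
  T inline_[InlineCount] = {};
  std::unique_ptr<T[]> heap_;
};
```

A callback wrapper would declare something like `SmallArgBuffer<napi_value> args(argc);` as a local and pass `args.data()` wherever an argv array is expected, so the common case of a handful of arguments never touches the allocator.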

ianwjhalliday · 2017-02-14T02:05:18Z

I think the argt array would be useful. In the leveldown conversion I found most of the callbacks did overload resolution and/or parameter validation based on the types of the arguments, so most of them requested the types of the arguments.

I also put the idea of one big ugly API as you propose here in the back of my mind to explore later if we ever hit a case where the performance overhead was significant, so +1 to this proposal.

jasongin · 2017-02-24T00:21:45Z

The above benchmark data was collected on a 5-year-old workstation PC running Windows, with a Xeon W3530 @2.8GHz, 20 GB RAM.

I also ran on a 1.5-year-old Mac Mini and the results were very similar percentage-wise.

jasongin · 2017-03-07T23:42:55Z

The performance improvements in my PR reduce the worst-case canvas benchmark from 505% to 277%. Other benchmarks that stress the JS-to-C++ NAPI callback layer show similar improvements.

aruneshchandra added this to the Milestone 5 milestone Feb 1, 2017

aruneshchandra assigned jasongin Feb 1, 2017

jasongin mentioned this issue Feb 8, 2017

Convert Canvas to NAPI #76

Closed

jasongin closed this as completed Mar 16, 2017

jasongin mentioned this issue Jun 26, 2017

napi_get_cb_info has a lot of overhead #261

Closed
