Official benchmark of this repo comparing other PL #301

ppKrauss · 2018-08-13T18:14:05Z

Hi, there are a "reliable source" or sister-project where we can check basic benchmark results comparing plv8 with PL/Python, PL/Perl or PL/pgSQL?

JerrySievert · 2018-08-13T19:58:38Z

we don't provide benchmarks, and performance is going to vary based on what exactly you're trying to accomplish.

if you're trying to choose between languages based on speed alone, you might as well write whatever you need in c as an extension. otherwise, use your strengths and develop against whichever language makes the most sense for you.

any benchmarks that I've been able to find have been years old, and v8 (and thus plv8) have made huge improvements in speed since then.

what are you trying to do? maybe knowing that will help me be able to give you a better answer.

ppKrauss · 2018-08-13T22:26:12Z

Hi @JerrySievert , thanks the explanations (!). Well, I try to check if the dream of the unified Javascript development stack is possible in nowdays, for reliable software, for day-by-day serious production...

Example: PostgreSQL have problems with filesystem, and I suppose that NodeJS functions offer better exprience for programmers (no permission barriers, etc.)... But we need peformance of NodeJS-filesystem to be comparable with foregin table with file_fdw.

JerrySievert · 2018-08-13T22:31:11Z

plv8 is a trusted extension, meaning that it does not have direct access to the network, nor the filesystem. if that's what you're trying to accomplish, it's not going to work.

the good news is that you can have tests, see https://github.com/jerrysievert/equinox for an example.

JerrySievert · 2018-08-13T22:34:29Z

fyi, the only relationship between plv8 and node.js is that they both embed the v8 javascript engine, and both happen to be javascript.

ppKrauss · 2018-08-13T23:04:11Z

Oops, there are no way to use (call lib) with plv8? Is possible to do a fork of this project to use something (partial) of NodeJS, or it is a PostgreSQL constraint?

JerrySievert · 2018-08-14T00:31:01Z

first, you need to understand the differences between trusted and untrusted languages in postgres. trusted languages are sandboxed to not allow any access to the filesystem or network. this is by design, and not an afterthought.

untrusted languages have full access to the machine and network running as the user that postgres is run under. you can see a list of trusted vs untrusted languages on the postgres wiki: https://wiki.postgresql.org/wiki/PL_Matrix - hosted providers, such as amazon's RDS, or Google or Microsoft's shared postgres hosting specifically do not support untrusted languages.

some languages are designed to be able to be both, plv8 is not. there is a fork of plv8 to be built as an untrusted language, but it is not complete, as there has not been much (any?) call for it.

if you just want to use modules from npm, there are a few projects that allow you to do that, but that are not affiliated with plv8 itself. you can find a few by [https://www.npmjs.com/search?q=plv8](searching npm for plv8).

brandonros · 2018-10-16T20:51:54Z

I'd be curious to see rudimentary operations per second on normal benchmarks like, fibonacci?

Obviously nobody implementing a performant fibonacci application would stand up a Postgre instance and add this extension to achieve it. However, comparing this extension's performance to say, a similar node.js app would be pretty telling on overhead.

JerrySievert · 2018-10-16T21:51:57Z

ostensibly it should be pretty much the same speed as node - at least on comparable versions of v8 (I've tried to track v8 versions with node where it made sense).

the only differences that I can think of would be startup time - plv8 doesn't use snapshots, and when crossing the c++/javascript membrane - there's an additional layer of indirection because of type conversions.

in regards to startup time, this will be a hit at the beginning of a session/connection when using your first stored procedure for that session, but will use the same instance of the interpreter for the rest of the connection as long as the role remains the same.

as far as the membrane goes, that occurs right before entering the function and at exit, as well as SPI (queries).

brandonros · 2018-10-16T22:10:16Z

This project is very interesting and it is reassuring to hear your response that the overhead should not be too high.

Out of curiosity, what did you write this project for/what is it used for?

JerrySievert · 2018-10-16T22:19:12Z

I'm not the original author, I'm just the maintainer.

as for what it's used for, it provides javascript as a language for stored procedures and triggers inside of postgres. it's used for quite a lot of things, and is one of the few language plugins available in RDS and Microsoft's hosted postgres.

I've used it extensibly in a lot of projects where it made sense, probably the most interesting was doing spatial conversions between geojson and esri's spatial json format.

brandonros · 2018-10-17T02:57:20Z

docker run -d --name postgres clkao/postgres-plv8:latest
docker exec -it postgres bash -c 'psql -U postgres -c "CREATE EXTENSION plv8;"'
docker run --rm -p 3000:3000 -e PGRST_DB_URI="postgres://postgres@postgres/postgres" -e PGRST_DB_ANON_ROLE="postgres" -e PGRST_DB_SCHEMA="public" --link postgres:postgres postgrest/postgrest

docker exec -it postgres bash -c "psql -U postgres"

CREATE FUNCTION plv8_test(input TEXT) RETURNS JSON AS $$
  return {
    output: input
  };
$$ LANGUAGE plv8 IMMUTABLE STRICT;

npm install -g loadtest
loadtest -c 10 --rps 200 http://localhost:3000/rpc/plv8_test?input=Hello

$ loadtest -c 10 --rps 200 http://localhost:3000/rpc/plv8_test?input=Hello
[Tue Oct 16 2018 22:55:44 GMT-0400 (EDT)] INFO Requests: 0, requests per second: 0, mean latency: 0 ms
[Tue Oct 16 2018 22:55:49 GMT-0400 (EDT)] INFO Requests: 899, requests per second: 180, mean latency: 18.9 ms
[Tue Oct 16 2018 22:55:54 GMT-0400 (EDT)] INFO Requests: 1900, requests per second: 200, mean latency: 486 ms
[Tue Oct 16 2018 22:55:54 GMT-0400 (EDT)] INFO Errors: 263, accumulated errors: 263, 13.8% of total requests
[Tue Oct 16 2018 22:55:59 GMT-0400 (EDT)] INFO Requests: 2760, requests per second: 172, mean latency: 14 ms
[Tue Oct 16 2018 22:55:59 GMT-0400 (EDT)] INFO Errors: 0, accumulated errors: 263, 9.5% of total requests
[Tue Oct 16 2018 22:56:04 GMT-0400 (EDT)] INFO Requests: 3825, requests per second: 213, mean latency: 144.7 ms
[Tue Oct 16 2018 22:56:04 GMT-0400 (EDT)] INFO Errors: 34, accumulated errors: 297, 7.8% of total requests
[Tue Oct 16 2018 22:56:09 GMT-0400 (EDT)] INFO Requests: 4826, requests per second: 200, mean latency: 36.2 ms
[Tue Oct 16 2018 22:56:09 GMT-0400 (EDT)] INFO Errors: 0, accumulated errors: 297, 6.2% of total requests
[Tue Oct 16 2018 22:56:14 GMT-0400 (EDT)] INFO Requests: 5816, requests per second: 198, mean latency: 14.5 ms
[Tue Oct 16 2018 22:56:14 GMT-0400 (EDT)] INFO Errors: 0, accumulated errors: 297, 5.1% of total requests
[Tue Oct 16 2018 22:56:19 GMT-0400 (EDT)] INFO Requests: 6826, requests per second: 202, mean latency: 57.2 ms

versus

var express = require('express')

var app = express()

app.get('/rpc/plv8_test', function(req, res) {
  console.log('Request');

  res.send({
    output: req.query.input
  });
});

app.listen(3000, function() {
  console.log('Listening...');
});

Brandons-iMac:~ brandonros$ loadtest -c 10 --rps 200 http://localhost:3000/rpc/plv8_test?input=Hello
[Tue Oct 16 2018 22:53:22 GMT-0400 (EDT)] INFO Requests: 0, requests per second: 0, mean latency: 0 ms
[Tue Oct 16 2018 22:53:27 GMT-0400 (EDT)] INFO Requests: 924, requests per second: 185, mean latency: 5 ms
[Tue Oct 16 2018 22:53:32 GMT-0400 (EDT)] INFO Requests: 1922, requests per second: 200, mean latency: 4.4 ms
[Tue Oct 16 2018 22:53:37 GMT-0400 (EDT)] INFO Requests: 2925, requests per second: 201, mean latency: 4.2 ms
[Tue Oct 16 2018 22:53:42 GMT-0400 (EDT)] INFO Requests: 3916, requests per second: 198, mean latency: 4.4 ms
[Tue Oct 16 2018 22:53:47 GMT-0400 (EDT)] INFO Requests: 4925, requests per second: 202, mean latency: 3.7 ms
[Tue Oct 16 2018 22:53:52 GMT-0400 (EDT)] INFO Requests: 5925, requests per second: 200, mean latency: 3.3 ms
[Tue Oct 16 2018 22:53:57 GMT-0400 (EDT)] INFO Requests: 6925, requests per second: 200, mean latency: 3.1 ms
[Tue Oct 16 2018 22:54:02 GMT-0400 (EDT)] INFO Requests: 7925, requests per second: 200, mean latency: 3.7 ms

I guess this + Postgrest really isn't viable?

JerrySievert · 2018-10-17T03:37:02Z

there are so many variables in that test that it'd be hard to point a finger at any specific part of it as the limitation.

if you're not pooling a connection, or destroying connections at any point, there could be a huge cost in the connection (there's a reason why connection pooling is pretty much a requirement for any large Postgres implementation), in the instantiation of the v8 engine if a role is set or a new connection is made, etc.

a much better test of plv8 speed specifically would be to isolate each part of the test:

[local] jerry@test=# select plv8_version();
┌──────────────┐
│ plv8_version │
├──────────────┤
│ 3.0alpha     │
└──────────────┘
(1 row)

Time: 6.533 ms

[local] jerry@test=# DO $$
DECLARE
   counter INTEGER := 0 ;
BEGIN

LOOP
 EXIT WHEN counter = 10000 ;
 counter := counter + 1 ;
 PERFORM plv8_test('123456');
END LOOP ;
END ;
$$ LANGUAGE plpgsql;
DO
Time: 20.652 ms

CREATE OR REPLACE FUNCTION fibonacci (n INTEGER) 
 RETURNS INTEGER AS $$ 
DECLARE
   counter INTEGER := 0 ; 
   i INTEGER := 0 ; 
   j INTEGER := 1 ;
BEGIN
 
 IF (n < 1) THEN
 RETURN 0 ;
 END IF; 
 
 LOOP 
 EXIT WHEN counter = n ; 
 counter := counter + 1 ; 
 SELECT j, i + j INTO i, j ;
 END LOOP ; 
 
 RETURN i ;
END ; 
$$ LANGUAGE plpgsql;

[local] jerry@test=# select fibonacci(31);
┌───────────┐
│ fibonacci │
├───────────┤
│   1346269 │
└───────────┘
(1 row)

Time: 0.667 ms

CREATE FUNCTION plv8_fibonacci (n INTEGER) RETURNS INTEGER AS $$
function fibonacci(num, memo) {
  memo = memo || {};

  if (memo[num]) return memo[num];
  if (num <= 1) return 1;

  return memo[num] = fibonacci(num - 1, memo) + fibonacci(num - 2, memo);
}

return fibonacci(30);
$$ LANGUAGE plv8;

[local] jerry@test=# select plv8_fibonacci(30);
┌────────────────┐
│ plv8_fibonacci │
├────────────────┤
│        1346269 │
└────────────────┘
(1 row)

Time: 0.391 ms

function fibonacci(num, memo) {
  memo = memo || {};

  if (memo[num]) return memo[num];
  if (num <= 1) return 1;

  return memo[num] = fibonacci(num - 1, memo) + fibonacci(num - 2, memo);
}

console.log(fibonacci(30));

 ~  node --version
v8.9.4
 ~  time node /tmp/fib.js
1346269
        0.09 real         0.06 user         0.01 sys

JerrySievert · 2018-10-17T03:49:46Z

and just for fun:

CREATE FUNCTION plv8_fibonacci2 (num INTEGER, memo JSON) RETURNS INTEGER AS $$
  memo = memo || {};

  if (memo[num]) return memo[num];
  if (num <= 1) return 1;

  let q = plv8.execute('SELECT plv8_fibonacci2($1 - 1, $2) + plv8_fibonacci2($1 - 2, $2)', [num, memo]);

  return q[0]['?column?'];
$$ LANGUAGE plv8;

which is exceedingly slow:

[local] jerry@test=# select plv8_fibonacci2(30, NULL);
┌─────────────────┐
│ plv8_fibonacci2 │
├─────────────────┤
│         1346269 │
└─────────────────┘
(1 row)

Time: 66144.019 ms (01:06.144)

JerrySievert · 2018-10-17T04:01:50Z

the test was accidentally set to do 30 regardless, but the timing is the same fixed:

CREATE OR REPLACE FUNCTION plv8_fibonacci (n INTEGER) RETURNS INTEGER AS $$
function fibonacci(num, memo) {
  memo = memo || {};

  if (memo[num]) return memo[num];
  if (num <= 1) return 1;

  return memo[num] = fibonacci(num - 1, memo) + fibonacci(num - 2, memo);
}

return fibonacci(n);
$$ LANGUAGE plv8;

[local] jerry@test=# select plv8_fibonacci(30);
┌────────────────┐
│ plv8_fibonacci │
├────────────────┤
│        1346269 │
└────────────────┘
(1 row)

Time: 0.233 ms

brandonros · 2018-10-17T14:26:21Z

0.391 ms for the plv8 fibonnacci implementation versus 0.09ms for the node.js version? What could be the cause for that overhead?

JerrySievert · 2018-10-17T17:46:59Z

these are in seconds, not milliseconds:

 ~  time node /tmp/fib.js
1346269
        0.09 real         0.06 user         0.01 sys

but, that said, there are differences in environment between a command-line program and a database are pretty large, including external startup data, transition between the c++/javascript layers, query parsing for the select statement, all sorts of things.

you're welcome to audit the code, and I'm happy to accept pull requests that will help improve the project, within the limitations of this being a trusted language extension, but the underlying engine here is still v8, and I think that you will find that beyond the specific choice of not using external startup data, that the javascript code executes the same in both environments.

awbacker · 2020-09-15T21:25:03Z

Since this is the first performance hit on google, I thought I'd add my thoughts. I haven't looked in depth, but I was looking for something faster for a task: generate 100K random strings, following some simple rules (with a variable length, but 12 in this test).

Postgres 12, PLv8 2.3.13
pgpl: ~3600+ ms
plv8: ~450 ms (8x)
cleaner code

I re-wrote in plv8 using the first reasonable answer (slightly tweaked) found on copy-paste heaven. No attempt at optimization, not that it seems much could be done. Same for the pg/pl version.

It seems that v8 can make a huge difference, at least depending on what you are doing. If I can get it deployed everywhere then I'll probably switch over to it for at least some things. I can't imagine a simple trigger would be much faster, but maybe if its cleaner reading...

CREATE OR REPLACE FUNCTION gen_rand_str_plv8(length int) returns text
    language plv8 volatile parallel safe
AS $$
    var charset = 'ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789';
    var result  = charset.charAt(Math.floor(Math.random() * 26));
    for (var i = 0; i < length - 1; i++ ) {
      result += charset.charAt(Math.floor(Math.random() * charset.length));
    }
    return result;
$$;

CREATE OR REPLACE FUNCTION gen_rand_str_pgpl(length int) RETURNS text
    language plpgsql volatile parallel safe
AS
$$
    charset constant char[] := '{A,B,C,D,E,F,G,H,I,J,K,L,M,N,O,P,Q,R,S,T,U,V,W,X,Y,Z,0,1,2,3,4,5,6,7,8,9}';
    charlen constant int    := array_length(charset, 1);
    result           text   := charset[floor(random() * 26):: int + 1];
BEGIN
    loop
        exit when length(result) = length;
        -- generate_series(1, length-1) removes loop, but >2x perf hit
        result := result || charset[floor(random() * charlen):: int + 1];
    end loop;
    return result;
END;
$$;

And a simple query to test it, looking for duplicates. Similar to how it would actually be used:

with src as (
    select gen_rand_str_plv8(12) as x from generate_series(1, 100000) s
)
select x, count(*) from src group by x having count(*) > 1;

luss · 2020-09-16T11:21:22Z

I think you will find if your procedure was doing mostly DB access then it would be a little faster in pl/pg. As you have demonstrated, if your code is all procedural logic, it is A LOT faster in pl/v8.

…

--Luss

On Tue, Sep 15, 2020 at 2:25 PM Andrew Backer ***@***.***> wrote: Since this is the first performance hit on google, I thought I'd add my thoughts. I haven't looked in depth, but I *was* looking for something faster for a task: generate 100K random strings, following some simple rules (with a variable length, but 12 in this test). - Postgres 12, PLv8 2.3.13 - pgpl: ~3600+ ms - plv8: ~450 ms (8x) - *cleaner code* I re-wrote in plv8 using the first reasonable answer (slightly tweaked) found on copy-paste heaven. No attempt at optimization, not that it seems much could be done. Same for the pg/pl version. It seems that v8 can make a huge difference, at least depending on what you are doing. If I can get it deployed everywhere then I'll probably switch over to it for at least some things. I can't imagine a simple trigger would be much faster, but maybe if its cleaner reading... CREATE OR REPLACE FUNCTION gen_rand_str_plv8(length int) returns text language plv8 volatile parallel safe AS $$ var charset = 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789'; var result = charset.charAt(Math.floor(Math.random() * 26)); for (var i = 0; i < length - 1; i++ ) { result += charset.charAt(Math.floor(Math.random() * charset.length)); } return result; $$; CREATE OR REPLACE FUNCTION gen_rand_str_pgpl(length int) RETURNS text language plpgsql volatile parallel safe AS $$ charset constant char[] := '{A,B,C,D,E,F,G,H,I,J,K,L,M,N,P,Q,R,S,T,U,V,W,X,Y,Z,a,b,c,d,e,f,g,h,i,j,k,l,m,n,o,p,q,r,s,t,u,v,w,x,y,z,0,1,2,3,4,5,6,7,8,9}'; charlen constant int := array_length(charset, 1); result text := charset[floor(random() * 26):: int + 1]; BEGIN loop exit when length(result) = length; -- generate_series(1, length-1) removes loop, but >2x perf hit result := result || charset[floor(random() * charlen):: int + 1]; end loop; return result; END; $$; And a simple query to test it, looking for duplicates. Similar to how it would actually be used: with src as ( select gen_rand_str_plv8(12) as x from generate_series(1, 100000) s ) select x, count(*) from src group by x having count(*) > 1; — You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub <#301 (comment)>, or unsubscribe <https://github.com/notifications/unsubscribe-auth/AAMWOHSSF2AVRNTSCWMZ2D3SF7LT5ANCNFSM4FPL5LLQ> .

brandonros mentioned this issue Oct 17, 2018

Performance pramsey/pgsql-http#88

Closed

JerrySievert closed this as completed Jun 27, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Official benchmark of this repo comparing other PL #301

Official benchmark of this repo comparing other PL #301

ppKrauss commented Aug 13, 2018

JerrySievert commented Aug 13, 2018

ppKrauss commented Aug 13, 2018

JerrySievert commented Aug 13, 2018

JerrySievert commented Aug 13, 2018

ppKrauss commented Aug 13, 2018

JerrySievert commented Aug 14, 2018

brandonros commented Oct 16, 2018

JerrySievert commented Oct 16, 2018

brandonros commented Oct 16, 2018

JerrySievert commented Oct 16, 2018

brandonros commented Oct 17, 2018

JerrySievert commented Oct 17, 2018

JerrySievert commented Oct 17, 2018

JerrySievert commented Oct 17, 2018

brandonros commented Oct 17, 2018

JerrySievert commented Oct 17, 2018 •

edited

Loading

awbacker commented Sep 15, 2020 •

edited

Loading

luss commented Sep 16, 2020 via email

Official benchmark of this repo comparing other PL #301

Official benchmark of this repo comparing other PL #301

Comments

ppKrauss commented Aug 13, 2018

JerrySievert commented Aug 13, 2018

ppKrauss commented Aug 13, 2018

JerrySievert commented Aug 13, 2018

JerrySievert commented Aug 13, 2018

ppKrauss commented Aug 13, 2018

JerrySievert commented Aug 14, 2018

brandonros commented Oct 16, 2018

JerrySievert commented Oct 16, 2018

brandonros commented Oct 16, 2018

JerrySievert commented Oct 16, 2018

brandonros commented Oct 17, 2018

JerrySievert commented Oct 17, 2018

JerrySievert commented Oct 17, 2018

JerrySievert commented Oct 17, 2018

brandonros commented Oct 17, 2018

JerrySievert commented Oct 17, 2018 • edited Loading

awbacker commented Sep 15, 2020 • edited Loading

luss commented Sep 16, 2020 via email

JerrySievert commented Oct 17, 2018 •

edited

Loading

awbacker commented Sep 15, 2020 •

edited

Loading