Official benchmark of this repo comparing other PL #301
Comments
we don't provide benchmarks, and performance is going to vary based on what exactly you're trying to accomplish. if you're trying to choose between languages based on speed alone, you might as well write whatever you need in c as an extension. otherwise, use your strengths and develop against whichever language makes the most sense for you. any benchmarks that I've been able to find have been years old, and v8 (and thus plv8) have made huge improvements in speed since then. what are you trying to do? maybe knowing that will help me be able to give you a better answer. |
Hi @JerrySievert, thanks for the explanation (!). Well, I'm trying to check whether the dream of a unified JavaScript development stack is possible nowadays, for reliable software and serious day-to-day production use... Example: PostgreSQL has problems with the filesystem, and I suppose that NodeJS functions offer a better experience for programmers (no permission barriers, etc.)... But we need the performance of NodeJS filesystem access to be comparable with a foreign table using file_fdw. |
plv8 is a trusted extension, meaning that it does not have direct access to the network, nor the filesystem. if that's what you're trying to accomplish, it's not going to work. the good news is that you can have tests, see https://github.com/jerrysievert/equinox for an example. |
fyi, the only relationship between plv8 and node.js is that they both embed the v8 javascript engine, and both happen to be javascript. |
Oops, is there no way to use (call lib) with plv8? Is it possible to fork this project to use something (partial) of NodeJS, or is it a PostgreSQL constraint? |
first, you need to understand the differences between trusted and untrusted languages in postgres. trusted languages are sandboxed to not allow any access to the filesystem or network. this is by design, and not an afterthought. untrusted languages have full access to the machine and network running as the user that postgres is run under. you can see a list of trusted vs untrusted languages on the postgres wiki: https://wiki.postgresql.org/wiki/PL_Matrix - hosted providers, such as amazon's RDS, or Google or Microsoft's shared postgres hosting specifically do not support untrusted languages. some languages are designed to be able to be both, plv8 is not. there is a fork of plv8 to be built as an untrusted language, but it is not complete, as there has not been much (any?) call for it. if you just want to use modules from |
I'd be curious to see rudimentary operations per second on normal benchmarks like fibonacci. Obviously nobody implementing a performant fibonacci application would stand up a Postgres instance and add this extension to achieve it. However, comparing this extension's performance to, say, a similar node.js app would be pretty telling on overhead. |
ostensibly it should be pretty much the same speed as node - at least on comparable versions of v8 (I've tried to track v8 versions with node where it made sense). the only differences that I can think of would be startup time - plv8 doesn't use snapshots, and when crossing the c++/javascript membrane - there's an additional layer of indirection because of type conversions. in regards to startup time, this will be a hit at the beginning of a session/connection when using your first stored procedure for that session, but will use the same instance of the interpreter for the rest of the connection as long as the role remains the same. as far as the membrane goes, that occurs right before entering the function and at exit, as well as SPI (queries). |
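To make a node-versus-plv8 comparison concrete, here is a minimal sketch in plain JavaScript. The names `fib` and `timeCalls` are made up for illustration, and the code assumes nothing beyond `Date.now()`, so the same body should run under node or be pasted into a plv8 function. It only sees time spent inside JavaScript: engine startup and the type conversions at the c++/javascript boundary happen outside it and would have to be measured from the client side.

```javascript
// Naive recursive fibonacci: deliberately CPU-bound, no database access.
function fib(n) {
  return n < 2 ? n : fib(n - 1) + fib(n - 2);
}

// Time `runs` calls of fn(arg) and report the first call separately from
// the average of the remaining calls. Date.now() has millisecond
// resolution, so pick an `arg` large enough that one call takes a while.
function timeCalls(fn, arg, runs) {
  var timings = [];
  for (var i = 0; i < runs; i++) {
    var start = Date.now();
    fn(arg);
    timings.push(Date.now() - start);
  }
  var rest = timings.slice(1);
  var restTotal = rest.reduce(function (a, b) { return a + b; }, 0);
  return {
    firstMs: timings[0],
    restAvgMs: rest.length ? restTotal / rest.length : 0
  };
}

var report = timeCalls(fib, 25, 5);
```

Under node, `report` can be printed with `console.log`; inside a plv8 function declared as `RETURNS JSON` it could simply be returned.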
This project is very interesting and it is reassuring to hear your response that the overhead should not be too high. Out of curiosity, what did you write this project for/what is it used for? |
I'm not the original author, I'm just the maintainer. as for what it's used for, it provides javascript as a language for stored procedures and triggers inside of postgres. it's used for quite a lot of things, and is one of the few language plugins available in RDS and Microsoft's hosted postgres. I've used it extensively in a lot of projects where it made sense; probably the most interesting was doing spatial conversions between geojson and esri's spatial json format. |
CREATE FUNCTION plv8_test(input TEXT) RETURNS JSON AS $$
  return {
    output: input
  };
$$ LANGUAGE plv8 IMMUTABLE STRICT;
versus
var express = require('express')
var app = express()
app.get('/rpc/plv8_test', function(req, res) {
  console.log('Request');
  res.send({
    output: req.query.input
  });
});
app.listen(3000, function() {
  console.log('Listening...');
});
I guess this + PostgREST really isn't viable? |
there are so many variables in that test that it'd be hard to point a finger at any specific part of it as the limitation. if you're not pooling a connection, or destroying connections at any point, there could be a huge cost in the connection (there's a reason why connection pooling is pretty much a requirement for any large Postgres implementation), in the instantiation of the v8 engine if a role is set or a new connection is made, etc. a much better test of plv8 speed specifically would be to isolate each part of the test:
|
and just for fun:
which is exceedingly slow:
|
the test was accidentally set to do 30 regardless, but the timing is the same fixed:
|
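The suggestion above to isolate each part of the test can be sketched as a small stage timer. This is only an illustration: `timeStages` and the stage names are hypothetical, and the stub bodies stand in for whatever steps are actually under test. Because calls inside plv8 are synchronous, a synchronous timer is enough there. The same pattern applies on the client side, where the stages would be opening a connection, the first call in a session, and later calls; those need a database driver and are not shown.

```javascript
// Run each named stage once and record how long it took, in milliseconds.
// `stages` maps an arbitrary label to a zero-argument function.
function timeStages(stages) {
  var elapsed = {};
  Object.keys(stages).forEach(function (name) {
    var start = Date.now();
    stages[name]();
    elapsed[name] = Date.now() - start;
  });
  return elapsed;
}

// Hypothetical stand-ins: replace the bodies with the real steps under test
// (for example a query, a JSON conversion, or pure computation).
var elapsed = timeStages({
  compute: function () {
    var sum = 0;
    for (var i = 0; i < 1e6; i++) { sum += i; }
    return sum;
  },
  serialize: function () {
    return JSON.stringify({ output: 'x' });
  }
});
```

Timing the stages separately makes it easier to see whether the cost sits in the JavaScript itself or somewhere around it.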
0.391 ms for the plv8 fibonacci implementation versus 0.09 ms for the node.js version? What could be the cause of that overhead? |
these are in seconds, not milliseconds:
but, that said, the differences in environment between a command-line program and a database are pretty large, including external startup data, transition between the c++/javascript layers, query parsing for the you're welcome to audit the code, and I'm happy to accept pull requests that will help improve the project, within the limitations of this being a trusted language extension, but the underlying engine here is still |
Since this is the first performance hit on google, I thought I'd add my thoughts. I haven't looked in depth, but I was looking for something faster for a task: generate 100K random strings, following some simple rules (with a variable length, but 12 in this test).
- Postgres 12, PLv8 2.3.13
- pgpl: ~3600+ ms
- plv8: ~450 ms (8x)
- cleaner code
I re-wrote in plv8 using the first reasonable answer (slightly tweaked) found on copy-paste heaven. No attempt at optimization, not that it seems much could be done. Same for the pg/pl version.
It seems that v8 can make a huge difference, at least depending on what you are doing. If I can get it deployed everywhere then I'll probably switch over to it for at least some things. I can't imagine a simple trigger would be much faster, but maybe if it's cleaner reading...
And a simple query to test it, looking for duplicates. Similar to how it would actually be used:
|
I think you will find that if your procedure is doing mostly DB access, it would be a little faster in pl/pg. As you have demonstrated, if your code is all procedural logic, it is A LOT faster in pl/v8.
--Luss
On Tue, Sep 15, 2020 at 2:25 PM Andrew Backer ***@***.***> wrote:
Since this is the first performance hit on google, I thought I'd add my
thoughts. I haven't looked in depth, but I *was* looking for something
faster for a task: generate 100K random strings, following some simple
rules (with a variable length, but 12 in this test).
- Postgres 12, PLv8 2.3.13
- pgpl: ~3600+ ms
- plv8: ~450 ms (8x)
- *cleaner code*
I re-wrote in plv8 using the first reasonable answer (slightly tweaked)
found on copy-paste heaven. No attempt at optimization, not that it seems
much could be done. Same for the pg/pl version.
It seems that v8 can make a huge difference, at least depending on what
you are doing. If I can get it deployed everywhere then I'll probably
switch over to it for at least some things. I can't imagine a simple
trigger would be much faster, but maybe if it's cleaner reading...
CREATE OR REPLACE FUNCTION gen_rand_str_plv8(length int) returns text
language plv8 volatile parallel safe
AS $$
  var charset = 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789';
  // first character: one of the 26 uppercase letters
  var result = charset.charAt(Math.floor(Math.random() * 26));
  // remaining characters: anything from the full charset
  for (var i = 0; i < length - 1; i++) {
    result += charset.charAt(Math.floor(Math.random() * charset.length));
  }
  return result;
$$;
CREATE OR REPLACE FUNCTION gen_rand_str_pgpl(length int) RETURNS text
language plpgsql volatile parallel safe
AS
$$
DECLARE
  -- A-Z first, so array indexes 1..26 are the uppercase letters
  charset constant char[] := '{A,B,C,D,E,F,G,H,I,J,K,L,M,N,O,P,Q,R,S,T,U,V,W,X,Y,Z,a,b,c,d,e,f,g,h,i,j,k,l,m,n,o,p,q,r,s,t,u,v,w,x,y,z,0,1,2,3,4,5,6,7,8,9}';
  charlen constant int := array_length(charset, 1);
  -- first character: one of the 26 uppercase letters (arrays are 1-based)
  result text := charset[floor(random() * 26)::int + 1];
BEGIN
  loop
    -- >= so the loop also terminates when length is less than 1
    exit when length(result) >= length;
    -- generate_series(1, length-1) removes loop, but >2x perf hit
    result := result || charset[floor(random() * charlen)::int + 1];
  end loop;
  return result;
END;
$$;
And a simple query to test it, looking for duplicates. Similar to how it
would actually be used:
with src as (
select gen_rand_str_plv8(12) as x from generate_series(1, 100000) s
)
select x, count(*) from src group by x having count(*) > 1;
|
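For a rough like-for-like check against node, the plv8 function body from the comment above can be lifted out more or less unchanged. The sketch below mirrors the generator and the duplicate query in plain JavaScript; `genRandStr` and `findDuplicates` are illustrative names, not part of plv8 or of anything posted in this thread.

```javascript
var CHARSET = 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789';

// Same logic as gen_rand_str_plv8: first character from the 26 uppercase
// letters, the remaining length - 1 characters from the whole charset.
function genRandStr(length) {
  var result = CHARSET.charAt(Math.floor(Math.random() * 26));
  for (var i = 0; i < length - 1; i++) {
    result += CHARSET.charAt(Math.floor(Math.random() * CHARSET.length));
  }
  return result;
}

// Counterpart of the GROUP BY ... HAVING count(*) > 1 query:
// returns a Map of value -> count for values seen more than once.
function findDuplicates(values) {
  var counts = new Map();
  values.forEach(function (v) {
    counts.set(v, (counts.get(v) || 0) + 1);
  });
  var dups = new Map();
  counts.forEach(function (n, v) {
    if (n > 1) { dups.set(v, n); }
  });
  return dups;
}

// Generate 100K strings of length 12, as in the test query.
var sample = [];
for (var k = 0; k < 100000; k++) {
  sample.push(genRandStr(12));
}
var duplicates = findDuplicates(sample);
```

Timing this under node and comparing it with the plv8 figure would give only a rough idea of the overhead, since the numbers depend on hardware and on the v8 version in use.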
Hi, is there a "reliable source" or sister project where we can check basic benchmark results comparing plv8 with PL/Python, PL/Perl or PL/pgSQL?