
cats: postgresql introduce pl/sql lstat_decode() function #1466

Merged
merged 2 commits into bareos:master from dev/bruno/lstat_research
Jul 4, 2023

Conversation


@bruno-at-bareos bruno-at-bareos commented May 11, 2023

Introduce a new PL/pgSQL function that can decode the file.lstat field using PostgreSQL language features and tools.

List of improvements made to the original function:

  • Moving to parallel safe
    • Time: 33375.061 ms (00:33.375) original query
    • Time: 32689.084 ms (00:32.689) after adding parallel safe
  • Replacing regexp_split_to_array with the faster string_to_array (arogge) gained 10 seconds
    - Time: 23095.753 ms (00:23.096)
  • Replacing the length() call in each loop with a single variable gained another second
  • Reimplemented the FromBase64() C function in PL/pgSQL to handle bigint and skip negative fields
  • LinkFI needs a review against the C code (here it is a bigint)
  • Modified lstat_decode() to return a table without the need to create an empty table first, making it even easier to use, if not faster
  • Future investigation: the get_bit/get_byte functions could speed up bareos_frombase64()
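As an illustration of the table-returning usage, a sketch based on the full query shown later in this PR (column names as listed there):

```sql
-- decode_lstat() returns a row type, so it can be used directly in the
-- FROM clause as a (lateral) function call, without first creating an
-- empty result table to hold the decoded fields.
SELECT f.name, d.st_mode, d.st_size, d.st_mtime
FROM file f,
     LATERAL decode_lstat(f.lstat) AS d
WHERE f.jobid = 15422
LIMIT 10;
```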

Note that it is not possible to replace the bareos_frombase64() SQL function with decode() due to the way the algorithm was implemented:

convert_from(decode(fields[1], 'base64'), 'SQL_ASCII');

psql:lstat_decode_test.sql:1: ERROR:  invalid base64 end sequence
HINT:  Input data is missing padding, is truncated, or is otherwise corrupted.
CONTEXT:  PL/pgSQL function decode_lstat(text) line 8 at assignment
Time: 385.523 ms
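The error above is expected: Bareos does not store padded RFC 4648 base64, but encodes each lstat field as an unpadded, variable-length base-64 integer (most significant digit first). A hedged PL/pgSQL sketch of such a decoder, reconstructed from the declaration fragments quoted later in this review (the sign handling shown here is an assumption, not the exact patch code):

```sql
CREATE OR REPLACE FUNCTION bareos_frombase64(field text)
RETURNS bigint
LANGUAGE plpgsql IMMUTABLE PARALLEL SAFE
AS $$
DECLARE
  base64 text := 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/';
  neg boolean := left(field, 1) = '-';
  val text := CASE WHEN neg THEN substr(field, 2) ELSE field END;
  z integer := char_length(val);  -- computed once, not on every loop iteration
  res bigint := 0;
  i integer;
BEGIN
  FOR i IN 1..z LOOP
    -- strpos() is 1-based, so subtract 1 to get the 0..63 digit value
    res := res * 64 + strpos(base64, substr(val, i, 1)) - 1;
  END LOOP;
  RETURN CASE WHEN neg THEN -res ELSE res END;
END;
$$;
```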

Testing on different jobs with a high number of files, decoding all 16 fields in psql.

Top 10 decode_lstat

 nb_files │ approx│ jobid │ Old function               │ New function               │
══════════╪═══════╪═══════╪════════════════════════════╪════════════════════════════│
 62317486 │   62M │ 14873 │  aborted >1h               │ 1434870.685 ms (23:54.871) │
 45876323 │   45M │ 15427 │ 1754329.892 ms (29:14.330) │  999460.792 ms (16:39.461) │
 10086553 │   10M │ 15308 │  857101.268 ms (14:17.101) │  215277.489 ms (03:35.277) │
  1049440 │    1M │ 15422 │   59892.181 ms (00:59.892) │   22682.788 ms (00:22.683) │

The time needed to get an answer is proportional (and almost predictably so) to the number of rows to scan.
Of course, it is wasteful to parse 62M rows only to keep 10.

The actual query plan looks like:

 explain analyze
select
  p.path    ,
  f.name    ,
  st_dev    ,
  st_ino    ,
  st_mode   ,
  st_nlink  ,
  st_uid    ,
  st_gid    ,
  st_rdev   ,
  st_size   ,
  st_blksize,
  st_blocks ,
  st_atime  ,
  st_mtime  ,
  st_ctime  ,
  LinkFI    ,
  st_flags  ,
  data
from file f, path p, decode_lstat(f.lstat)
where p.pathid = f.pathid
and f.jobid = 15422 order by st_size limit 10;
                                                                                   QUERY PLAN
═════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════
 Limit  (cost=20531078.37..20531079.54 rows=10 width=290) (actual time=34138.720..34153.014 rows=10 loops=1)
   ->  Gather Merge  (cost=20531078.37..129223097.51 rows=931580834 width=290) (actual time=34005.535..34019.828 rows=10 loops=1)
         Workers Planned: 2
         Workers Launched: 2
         ->  Sort  (cost=20530078.35..21694554.39 rows=465790417 width=290) (actual time=33986.939..33986.940 rows=10 loops=3)
               Sort Key: decode_lstat.st_size
               Sort Method: top-N heapsort  Memory: 27kB
               Worker 0:  Sort Method: top-N heapsort  Memory: 29kB
               Worker 1:  Sort Method: top-N heapsort  Memory: 29kB
               ->  Nested Loop  (cost=1.52..10464514.90 rows=465790417 width=290) (actual time=129.285..33768.062 rows=349813 loops=3)
                     ->  Nested Loop  (cost=1.27..1148714.65 rows=465790 width=219) (actual time=127.856..866.435 rows=349813 loops=3)
                           ->  Parallel Index Scan using file_jpfid_idx on file f  (cost=0.70..1009023.84 rows=465790 width=85) (actual time=0.050..448.124 rows=349813 loops=3)
                                 Index Cond: (jobid = 15422)
                           ->  Memoize  (cost=0.58..2.22 rows=1 width=142) (actual time=0.001..0.001 rows=1 loops=1049440)
                                 Cache Key: f.pathid
                                 Cache Mode: logical
                                 Hits: 341555  Misses: 3185  Evictions: 0  Overflows: 0  Memory Usage: 502kB
                                 Worker 0:  Hits: 360194  Misses: 3179  Evictions: 0  Overflows: 0  Memory Usage: 513kB
                                 Worker 1:  Hits: 337881  Misses: 3446  Evictions: 0  Overflows: 0  Memory Usage: 555kB
                                 ->  Index Scan using path_pkey on path p  (cost=0.57..2.21 rows=1 width=142) (actual time=0.003..0.003 rows=1 loops=9810)
                                       Index Cond: (pathid = f.pathid)
                     ->  Function Scan on decode_lstat  (cost=0.25..10.25 rows=1000 width=128) (actual time=0.093..0.093 rows=1 loops=1049440)
 Planning Time: 0.295 ms
 JIT:
   Functions: 49
   Options: Inlining true, Optimization true, Expressions true, Deforming true
   Timing: Generation 3.653 ms, Inlining 145.424 ms, Optimization 222.947 ms, Emission 147.994 ms, Total 520.019 ms
 Execution Time: 34154.113 ms
(28 rows)

Time: 34154.905 ms (00:34.155)

We also tested the creation of a jsonb column, but as expected this places too large a burden on the file table size (~2.5x bigger), even though the attributes can then be queried directly.

We are still investigating whether lstat can be stored more efficiently, which would allow good performance when querying the attributes.

      relname       │  total_sz  │   idx_sz   │ reltuples │ bytes_per_row │
════════════════════╪════════════╪════════════╪═══════════╪═══════════════│
 file_jsonb         │ 195 GB     │ 0 bytes    │ 353025984 │           592 │
 file               │ 80 GB      │ 26 GB      │ 353020128 │           243 │

Note: the above comparison will be extended with a file_lstat table containing all additional columns (16 bigint fields).
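A query along these lines can produce such per-table statistics (a sketch using standard catalog functions; the exact query used for the table above is not shown in the PR):

```sql
SELECT c.relname,
       pg_size_pretty(pg_total_relation_size(c.oid)) AS total_sz,
       pg_size_pretty(pg_indexes_size(c.oid))        AS idx_sz,
       c.reltuples::bigint                           AS reltuples,
       (pg_total_relation_size(c.oid)
          / nullif(c.reltuples, 0))::bigint          AS bytes_per_row
FROM pg_class c
WHERE c.relname IN ('file', 'file_jsonb');
```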

Some ideas for storing the lstat field differently:

  • native bytea: store the output of stat directly (in a platform-agnostic format)
  • text: store the decoded array

Please check

  • A short description and the purpose of this PR are present above this paragraph
  • Your name is present in the AUTHORS file (optional)

If you have any questions or problems, please leave a comment in the PR.

Helpful documentation and best practices

Checklist for the reviewer of the PR (will be processed by the Bareos team)

Make sure you check/merge the PR using devtools/pr-tool to have some simple automated checks run and a proper changelog record added.

General
  • Is the PR title usable as CHANGELOG entry?
  • Purpose of the PR is understood
  • Commit descriptions are understandable and well formatted
  • Check backport line
  • Required backport PRs have been created
Source code quality
  • Source code changes are understandable
  • Variable and function names are meaningful
  • Code comments are correct (logically and spelling)
  • Required documentation changes are present and part of the PR

@bruno-at-bareos bruno-at-bareos changed the title cats: postgresql introduce lstat_decode.sql cats: postgresql introduce pl/sql lstat_decode() function Jun 21, 2023
@bruno-at-bareos bruno-at-bareos marked this pull request as ready for review June 21, 2023 14:07

@arogge arogge left a comment


Doesn't look too bad, but what if I just want the size? It is still going to decode all the other fields I don't care about.
How bad is the performance impact for that? Maybe it is worth adding something like size_from_lstat(text)

bareos_frombase64(fields[4]) st_nlink,
bareos_frombase64(fields[5]) st_uid,
bareos_frombase64(fields[6]) st_gid,
bareos_frombase64(fields[3]) st_mode,
Member


this one is out of order

res bigint default 0;
rf text default '';
base64 text default 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/';
z integer := char_length(field); -- minus 1s
Member


what does that comment mean?

Contributor Author


Good catch, I forgot to remove them. Will do.

declare
fields text[];
begin
fields = string_to_array(lst, ' '); -- minus 10s
Member


what does that comment mean?

Contributor Author


Good catch, I forgot to remove them. Will do.

@bruno-at-bareos
Contributor Author

You make a good point. I must think about it and test decoding one of the fields directly, but I'm not sure whether calling bareos_frombase64() once is really quicker than decoding the whole field.

I would also like to create a small test to check that no variation appears with newer code (PostgreSQL's or ours).

@bruno-at-bareos
Contributor Author

bruno-at-bareos commented Jun 22, 2023

@arogge so to allow searching on whatever field you need, I've adapted the function to accept parameters (by default all fields are returned).

Finding the top 10 by st_size only:
Time: 9613.526 ms (00:09.614) over 353'020'126 rows
versus all fields:
Time: 25584.687 ms (00:25.585)
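A call restricted to a single field might then look like this (the text-array parameter shape matches the commit description below, but the exact signature is an assumption):

```sql
-- return only st_size instead of all 16 decoded fields
SELECT d.st_size
FROM file f,
     decode_lstat(f.lstat, ARRAY['st_size']) AS d
WHERE f.jobid = 15422
ORDER BY d.st_size DESC
LIMIT 10;
```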

@bruno-at-bareos bruno-at-bareos force-pushed the dev/bruno/lstat_research branch 3 times, most recently from f4b4b7f to 339f30a Compare June 27, 2023 12:31
@bruno-at-bareos
Contributor Author

Forgot to say that you can check that parallel safe is applied, for example with this kind of query:

regress_upgrade_database=# SELECT proname, proparallel FROM   pg_proc where proname in ('decode_lstat','bareos_frombase64');
      proname      | proparallel
-------------------+-------------
 decode_lstat      | s
 bareos_frombase64 | s

commit;

begin;
create or replace function exec(raw_query text) returns boolean as $$
Member


According to this post you can just name the function pg_temp.exec. Then you don't have to drop it, and you can be pretty sure that it won't clash with anything, because it is session-local.
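For illustration, a session-local helper along the suggested lines (a sketch: pg_temp places the function in the session's temporary schema, so it disappears at disconnect; the ALTER statement in the usage line is an illustrative example):

```sql
CREATE OR REPLACE FUNCTION pg_temp.exec(raw_query text)
RETURNS boolean
LANGUAGE plpgsql
AS $$
BEGIN
  EXECUTE raw_query;  -- run the dynamic statement
  RETURN true;
END;
$$;

-- usable immediately; no DROP needed at the end of the script
SELECT pg_temp.exec('ALTER FUNCTION bareos_frombase64(text) PARALLEL SAFE');
```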

Contributor Author


That's definitely a good idea; will implement that.

@@ -469,7 +469,7 @@ class pathid_cache;
#define QUERY_HTABLE_PAGES 128

// Current database version number for all drivers
#define BDB_VERSION 2210
#define BDB_VERSION 2301
Member


This is probably open to discussion, but previously the schema was 2000 + 10 * Major + Minor, which would result in 2230 for Bareos 23.0.0.
However, I don't really care as long as it is strictly increasing.

Contributor Author


Thanks, I was trying to understand the pattern behind it. I will adapt, but also add the formula as a comment, so the next contributor will have an easier life ;-)

- introduce a new sql function decode_lstat to help querying lstat field
  of file table with PostgreSQL language and tool.
  By default the function will return all available fields in lstat
  you can restrict the desired field(s) by passing those in the param
  text array.
- improvements made to the original function:
  + Moving to parallel safe gains 1 sec
    Time: 33375.061 ms (00:33.375) - Original query
    Time: 32689.084 ms (00:32.689) - Adding parallel safe
  + Replace regexp_split_to_array with the faster string_to_array (arogge)
    Time: 23095.753 ms (00:23.096) - 10 seconds gained
  + Replace length() in each loop with a variable; another 1s win
  + Reimplement FromBase64() function to handle bigint, skip negative
  + LinkFI needs a review compared to the C code (here bigint)
  + Modify lstat_decode() to return a table without the need to create
    an empty table, making it easier to use
  + Future investigation: get_bit/get_byte functions may offer
    further speed gains in bareos_frombase64()
- add a query for bconsole 22 list top XX biggest files for a jobid
- remove sql instruction from update script
- ddl scripts: unset PAGER before psql call
- move basefiles_baseid_seq alter to sql script
- add decode lstat function to update script
- as RHEL 7 uses PostgreSQL < 10, the function attribute parallel safe
  is applied with an alter instruction only if pg > 10
  use pg_temp.exec function so no drop is needed
- cats: update database version to 2230
- add database version pattern to cats.h

Signed-off-by: Bruno Friedmann <bruno.friedmann@bareos.com>
@BareosBot BareosBot merged commit 1ad4771 into bareos:master Jul 4, 2023
@bruno-at-bareos bruno-at-bareos deleted the dev/bruno/lstat_research branch July 4, 2023 11:52