Json v9 #2

l-wang · 2025-02-27T22:02:00Z

No description provided.

Per buildfarm and reports, it seems that 9.X to 18 upgrades were failing after commit 1fd1bd8 due to an incorrect regex. Loosen the regex to accommodate older versions. Reported-by: vignesh C <vignesh21@gmail.com> Reported-by: Andrew Dunstan <andrew@dunslane.net> Discussion: https://postgr.es/m/CALDaNm3GUs+U8Nt4S=V5zmb+K8-RfAc03vRENS0teeoq0Lc6Tw@mail.gmail.com Discussion: https://postgr.es/m/ea4cbbc1-c5a5-43d1-9618-8ff3f2155bfe@dunslane.net

Remove a number of (char *) casts that are unnecessary. Or in some cases, rewrite the code to make the purpose of the cast clearer. Reviewed-by: Dagfinn Ilmari Mannsåker <ilmari@ilmari.org> Discussion: https://www.postgresql.org/message-id/flat/fd1fcedb-3492-4fc8-9e3e-74b97f2db6c7%40eisentraut.org

Commit b3f0be7 accidentally missed adding the oauth client test binary to the relevant .gitignore. Reported-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/2839306.1740082041@sss.pgh.pa.us

A patch touching this area of the code is under review, and this format makes the readability of the code slightly harder to parse. Extracted from a larger patch by the same author. Author: Anthonin Bonnefoy <anthonin.bonnefoy@datadoghq.com> Discussion: https://postgr.es/m/CAO6_XqroE7JuMEm1sWz55rp9fAYX2JwmcP_3m_v51vnOFdsLiQ@mail.gmail.com

With \bind, \parse, \bind_named and \close, it is possible to issue queries from psql using the extended protocol. However, it was not possible to send these queries using libpq's pipeline mode. This feature has two advantages: - Testing. Pipeline tests were only possible with pgbench, using TAP tests. It now becomes possible to have more SQL tests that are able to stress the backend with pipelines and extended queries. More tests will be added in a follow-up commit that were discussed on some other threads. Some external projects in the community had to implement their own facility to work around this limitation. - Emulation of custom workloads, with more control over the actions taken by a client with libpq APIs. It is possible to emulate more workload patterns to bottleneck the backend with the extended query protocol. This patch adds six new meta-commands to be able to control pipelines: * \startpipeline starts a new pipeline. All extended queries are queued until the end of the pipeline are reached or a sync request is sent and processed. * \endpipeline ends an existing pipeline. All queued commands are sent to the server and all responses are processed by psql. * \syncpipeline queues a synchronisation request, without flushing the commands to the server, equivalent of PQsendPipelineSync(). * \flush, equivalent of PQflush(). * \flushrequest, equivalent of PQsendFlushRequest() * \getresults reads the server's results for the queries in a pipeline. Unsent data is automatically pushed when \getresults is called. It is possible to control the number of results read in a single meta-command execution with an optional parameter, 0 means that all the results should be read. Author: Anthonin Bonnefoy <anthonin.bonnefoy@datadoghq.com> Reviewed-by: Jelte Fennema-Nio <postgres@jeltef.nl> Reviewed-by: Kirill Reshke <reshkekirill@gmail.com> Discussion: https://postgr.es/m/CAO6_XqroE7JuMEm1sWz55rp9fAYX2JwmcP_3m_v51vnOFdsLiQ@mail.gmail.com

Use RelationGetIndexExpressions() rather than rd_indexprs directly. Author: Corey Huinker <corey.huinker@gmail.com>

Change backend launcher functions to take void * for binary data instead of char *. This removes the need for numerous casts. Reviewed-by: Dagfinn Ilmari Mannsåker <ilmari@ilmari.org> Discussion: https://www.postgresql.org/message-id/flat/fd1fcedb-3492-4fc8-9e3e-74b97f2db6c7%40eisentraut.org

The type argument wasn't actually really necessary. It was a remnant of converting the API of the gist strategy translation from using opclass to using opfamily+opcintype (commits c09e5a6, 622f678). For looking up the gist translation function, we used the convention "amproclefttype = amprocrighttype = opclass's opcintype" (see pg_amproc.h). But each operator family should only have one translation function, and getting the right type for the lookup is sometimes cumbersome and fragile, so this is all unnecessarily complicated. To simplify this, change the gist stategy support procedure to take "any", "any" as argument. (This is arbitrary but seems intuitive. The alternative of using InvalidOid as argument(s) upsets various DDL commands, so it's not practical.) Then we don't need opcintype for the lookup, and we can remove it from all the API layers introduced by commit c09e5a6. This also adds some more documentation about the correct signature of the gist support function and adds more checks in gistvalidate(). This was previously underspecified. (It relied implicitly on convention mentioned above.) Discussion: https://www.postgresql.org/message-id/flat/E72EAA49-354D-4C2E-8EB9-255197F55330@enterprisedb.com

NO INDENT is the default, and is added if no explicit indentation flag was provided with XMLSERIALIZE(). Oversight in 483bdb2. Author: Jim Jones <jim.jones@uni-muenster.de> Discussion: https://postgr.es/m/bebd457e-5b43-46b3-8fc6-f6a6509483ba@uni-muenster.de Backpatch-through: 16

Previously, a WARNING was issued at the time of defining a subscription with origin=NONE only when the publisher subscribed to the same table from other publishers, indicating potential data origination from different origins. However, the publisher can subscribe to the partition ancestors or partition children of the table from other publishers, which could also result in mixed-origin data inclusion. So, give a WARNING in those cases as well. Reported-by: Sergey Tatarintsev <s.tatarintsev@postgrespro.ru> Author: Hou Zhijie <houzj.fnst@fujitsu.com> Author: Shlok Kyal <shlok.kyal.oss@gmail.com> Reviewed-by: Vignesh C <vignesh21@gmail.com> Reviewed-by: Amit Kapila <amit.kapila16@gmail.com> Backpatch-through: 16, where it was introduced Discussion: https://postgr.es/m/5eda6a9c-63cf-404d-8a49-8dcb116a29f3@postgrespro.ru

The bibliography entries for olsen93 and ong90 lacked links to online copies. While ong90 is available in digital form, the olsen93 thesis is only available as a physical copy in the UCB library. To save people from searching for it, we still link to it via the UCB library page. Reported-by: jian he <jian.universality@gmail.com> Discussion: https://postgr.es/m/CACJufxFcJYdRvzgt59N26XjFp2tFFUXu+VN+x8Uo0NbDUCMCbw@mail.gmail.com

This allows using text position search functions with nondeterministic collations. These functions are - position, strpos - replace - split_part - string_to_array - string_to_table which all use common internal infrastructure. There was previously no internal implementation of this, so it was met with a not-supported error. This adds the internal implementation and removes the error. Unlike with deterministic collations, the search cannot use any byte-by-byte optimized techniques but has to go substring by substring. We also need to consider that the found match could have a different length than the needle and that there could be substrings of different length matching at a position. In most cases, we need to find the longest such substring (greedy semantics), but this can be configured by each caller. Reviewed-by: Euler Taveira <euler@eulerto.com> Discussion: https://www.postgresql.org/message-id/flat/582b2613-0900-48ca-8b0d-340c06f4d400@eisentraut.org

Dumps from versions older than v16 do not know about NO INDENT in a XMLSERIALIZE() clause. This commit adjusts AdjustUpgrade.pm so as NO INDENT is discarded in the contents of the new dump adjusted for comparison when the old version is v15 or older. This should be enough to make the cross-version upgrade tests pass. Per report from buildfarm member crake. Oversight in 984410b. Reviewed-by: Andrew Dunstan <andrew@dunslane.net> Discussion: https://postgr.es/m/88b183f1-ebf9-4f51-9144-3704380ccae7@dunslane.net Backpatch-through: 16

Previously the portlock logic, added in 9b4eafc, didn't actually work properly when the tests were run via meson. 9b4eafc used the MESON_BUILD_ROOT environment variable to determine the directory for the port lock directory, but that's never set for running the tests. That meant that each test used its own portlock dir, unless the PG_TEST_PORT_DIR environment variable was set. Fix the problem by setting top_builddir for the environment. That's also used for the autoconf/make build. Backpatch back to 16, where meson support was added. Reported-by: Zharkov Roman <r.zharkov@postgrespro.ru> Reviewed-by: Andrew Dunstan <andrew@dunslane.net> Backpatch-through: 16

Also simplify and correct data checksum wording in master now that it is the default. PG 13 did not have the awkward wording. Reported-by: Felix <afripowered@gmail.com> Reviewed-by: Laurenz Albe Discussion: https://postgr.es/m/173928241056.707.3989867022954178032@wrigleys.postgresql.org Backpatch-through: 14

The signedness of the 'char' type in C is implementation-dependent. For instance, 'signed char' is used by default on x86 CPUs, while 'unsigned char' is used on aarch CPUs. Previously, we accidentally let C implementation signedness affect persistent data. This led to inconsistent results when comparing char data across different platforms. This commit introduces a new 'default_char_signedness' field in ControlFileData to store the signedness of the 'char' type. While this change does not encourage the use of 'char' without explicitly specifying its signedness, this field can be used as a hint to ensure consistent behavior for pre-v18 data files that store data sorted by the 'char' type on disk (e.g., GIN and GiST indexes), especially in cross-platform replication scenarios. Newly created database clusters unconditionally set the default char signedness to true. pg_upgrade (with an upcoming commit) changes this flag for clusters if the source database cluster has signedness=false. As a result, signedness=false setting will become rare over time. If we had known about the problem during the last development cycle that forced initdb (v8.3), we would have made all clusters signed or all clusters unsigned. Making pg_upgrade the only source of signedness=false will cause the population of database clusters to converge toward that retrospective ideal. Bump catalog version (for the catalog changes) and PG_CONTROL_VERSION (for the additions in ControlFileData). Reviewed-by: Noah Misch <noah@leadboat.com> Discussion: https://postgr.es/m/CB11ADBC-0C3F-4FE0-A678-666EE80CBB07%40amazon.com

…signedness. With the newly added option --char-signedness, pg_resetwal updates the default char signedness flag in the controlfile. This option is primarily intended for an upcoming patch that pg_upgrade supports preserving the default char signedness during upgrades, and is not meant for manual operation. Reviewed-by: Noah Misch <noah@leadboat.com> Discussion: https://postgr.es/m/CB11ADBC-0C3F-4FE0-A678-666EE80CBB07%40amazon.com

Commit 44fe30f introduced the 'default_char_signedness' field in controlfile. Newly created database clusters always set this field to 'signed'. This change ensures that pg_upgrade updates the 'default_char_signedness' to 'unsigned' if the source database cluster has signedness=false. For source clusters from v17 or earlier, which lack the 'default_char_signedness' information, pg_upgrade assumes the source cluster was initialized on the same platform where pg_upgrade is running. It then sets the 'default_char_signedness' value according to the current platform's default character signedness. Reviewed-by: Noah Misch <noah@leadboat.com> Discussion: https://postgr.es/m/CB11ADBC-0C3F-4FE0-A678-666EE80CBB07%40amazon.com

…ess of new cluster. This change adds a new option --set-char-signedness to pg_upgrade. It enables user to set arbitrary signedness during pg_upgrade. This helps cases where user who knew they copied the v17 source cluster from x86 (signedness=true) to ARM (signedness=false) can pg_upgrade properly without the prerequisite of acquiring an x86 VM. Reviewed-by: Noah Misch <noah@leadboat.com> Discussion: https://postgr.es/m/CB11ADBC-0C3F-4FE0-A678-666EE80CBB07%40amazon.com

…different architectures. GIN and GiST indexes utilizing pg_trgm's opclasses store sorted trigrams within index tuples. When comparing and sorting each trigram, pg_trgm treats each character as a 'char[3]' type in C. However, the char type in C can be interpreted as either signed char or unsigned char, depending on the platform, if the signedness is not explicitly specified. Consequently, during replication between different CPU architectures, there was an issue where index scans on standby servers could not locate matching index tuples due to the differing treatment of character signedness. This change introduces comparison functions for trgm that explicitly handle signed char and unsigned char. The appropriate comparison function will be dynamically selected based on the character signedness stored in the control file. Therefore, upgraded clusters can utilize the indexes without rebuilding, provided the cluster upgrade occurs on platforms with the same character signedness as the original cluster initialization. The default char signedness information was introduced in 44fe30f, so no backpatch. Reviewed-by: Noah Misch <noah@leadboat.com> Discussion: https://postgr.es/m/CB11ADBC-0C3F-4FE0-A678-666EE80CBB07%40amazon.com

There is a race condition between "GRANT role" and "DROP ROLE", which allows GRANT to install pg_auth_members entries that refer to dropped roles. (Commit 6566133 prevented that for the grantor field, but not for the granted or grantee roles.) We'll soon fix that, at least in HEAD, but pg_dumpall needs to cope with the situation in case of pre-existing inconsistency. As pg_dumpall stands, it will emit invalid commands like 'GRANT foo TO ""', which causes pg_upgrade to fail. Fix it to emit warnings and skip those GRANTs, instead. There was some discussion of removing the problem by changing dumpRoleMembership's query to use JOIN not LEFT JOIN, but that would result in silently ignoring such entries. It seems better to produce a warning. Pre-v16 branches already coped with dangling grantor OIDs by simply omitting the GRANTED BY clause. I left that behavior as-is, although it's somewhat inconsistent with the behavior of later branches. Reported-by: Virender Singla <virender.cse@gmail.com> Discussion: https://postgr.es/m/CAM6Zo8woa62ZFHtMKox6a4jb8qQ=w87R2L0K8347iE-juQL2EA@mail.gmail.com Backpatch-through: 13

Oversight in a8238f8 where the test has been added. Discussion: https://postgr.es/m/CB11ADBC-0C3F-4FE0-A678-666EE80CBB07%40amazon.com

When nloops > 1, we now display two digits after the decimal point, rather than none. This is important because what we print is actually planstate->instrument->ntuples / nloops, and sometimes what you want to know is planstate->instrument->ntuples. You can estimate that by multiplying the displayed row count by the displayed nloops value, but the fact that the displayed value is rounded makes that inexact. It's still inexact even if we show these two extra decimal places, but less so. Perhaps we will agree on a way to further improve this output later, but for now this seems better than doing nothing. Author: Ibrar Ahmed <ibrar.ahmad@gmail.com> Author: Ilia Evdokimov <ilya.evdokimov@tantorlabs.com> Reviewed-by: David G. Johnston <david.g.johnston@gmail.com> Reviewed-by: Amit Kapila <amit.kapila16@gmail.com> Reviewed-by: Vignesh C <vignesh21@gmail.com> Reviewed-by: Greg Stark <stark@mit.edu> Reviewed-by: Naeem Akhter <akhternaeem@gmail.com> Reviewed-by: Hamid Akhtar <hamid.akhtar@percona.com> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: Andrei Lepikhov <a.lepikhov@postgrespro.ru> Reviewed-by: Guillaume Lelarge <guillaume@lelarge.info> Reviewed-by: Matheus Alcantara <matheusssilv97@gmail.com> Reviewed-by: Alena Rybakina <a.rybakina@postgrespro.ru> Discussion: http://postgr.es/m/603c8f070905281830g2e5419c4xad2946d149e21f9d%40mail.gmail.com

Concurrently dropping either the granted role or the grantee does not stop GRANT from completing, instead resulting in a dangling role reference in pg_auth_members. That's relatively harmless in the short run, but inconsistent catalog entries are not a good thing. This patch solves the problem by adding the granted and grantee roles as explicit shared dependencies of the pg_auth_members entry. That's a bit indirect, but it works because the pg_shdepend code applies the necessary locking and rechecking. Commit 6566133 previously established similar handling for the grantor column of pg_auth_members; it's not clear why it didn't cover the other two role OID columns. A side-effect of this approach is that DROP OWNED BY will now drop pg_auth_members entries that mention the target role as either the granted or grantee role. That's clearly appropriate for the grantee, since we'll drop its other privileges too. It doesn't seem too far out of line for the granted role, since we're presumably about to drop it and besides we're removing all reasons why it'd matter to be a member of it. (One could argue that this makes DropRole's code to auto-drop pg_auth_members entries unnecessary, but I chose to leave it in place since perhaps some people's workflows expect that to work without a DROP OWNED BY.) Note to patch readers: CreateRole's first CommandCounterIncrement call is now unconditional, because this change creates another case in which it's needed, and it seemed to be more trouble than it's worth to preserve that micro-optimization. Arguably this is a bug fix, but the fact that it changes the expected contents of pg_shdepend seems like not a great thing to do in the stable branches, and perhaps we don't want the change in DROP OWNED BY semantics there either. On the other hand, I opted not to force a catversion bump in HEAD, because the presence or absence of these entries doesn't matter for most purposes. Reported-by: Virender Singla <virender.cse@gmail.com> Reviewed-by: Laurenz Albe <laurenz.albe@cybertec.at> Discussion: https://postgr.es/m/CAM6Zo8woa62ZFHtMKox6a4jb8qQ=w87R2L0K8347iE-juQL2EA@mail.gmail.com

Per the buildfarm, these tests appear to be unstable in the wake of commit ddb17e3. I'm not sure that just hiding this output is the right way forward, because I think there may be other test cases that will fail with lower probability even after this fix. However, it's hard to tell right now, because this is failing on a number of buildfarm animals. So let's try this for now to either get a clearer picture of what else is broken, or as a stopgap until we decide what the permanent fix should be, or perhaps this will be the permanent fix after all.

To implement AIO writes, the backend initiating writes needs to transfer the lock ownership to the AIO subsystem, so the lock held during the write can be released in another backend. Other backends need to be able to "complete" an asynchronously started IO to avoid deadlocks (consider e.g. one backend starting IO for a buffer and then waiting for a heavyweight lock held by another relation followed by the current holder of the heavyweight lock waiting for the IO to complete). To that end, this commit adds LWLockDisown() and LWLockReleaseDisowned(). If code uses LWLockDisown() it's the code's responsibility to ensure that the lock is released in case of errors. Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi> Discussion: https://postgr.es/m/1f6b50a7-38ef-4d87-8246-786d39f46ab9@iki.fi

The 'cached-plan-inval' test suite, introduced in 525392d under src/test/modules/delay_execution, aimed to verify that cached plan invalidation triggers replanning after deferred locks are taken. However, its ExecutorStart_hook-based approach relies on lock timing assumptions that, in retrospect, are fragile. This instability was exposed by failures on BF animal trilobite, which builds with CLOBBER_CACHE_ALWAYS. One option was to dynamically disable the cache behavior that causes the test suite to fail by setting "debug_discard_caches = 0", but it seems better to remove the suite. The risk of future failures due to other cache flush hazards outweighs the benefit of catching real breakage in the backend behavior it tests. Reported-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/2990641.1740117879@sss.pgh.pa.us

(Initially the proposal was to keep \conninfo alone and add this feature as \conninfo+, but we decided against keeping the original.) Also display more fields than before, though not as many as were suggested during the discussion. In particular, we don't show 'role' nor 'session authorization', for both which a case can probably be made. These can be added as followup commits, if we agree to it. Some (most?) reviewers actually reviewed rather different versions of the patch and do not necessarily endorse the current one. Co-authored-by: Maiquel Grassi <grassi@hotmail.com.br> Co-authored-by: Hunaid Sohail <hunaidpgml@gmail.com> Reviewed-by: Nathan Bossart <nathandbossart@gmail.com> Reviewed-by: Peter Eisentraut <peter@eisentraut.org> Reviewed-by: Sami Imseih <simseih@amazon.com> Reviewed-by: David G. Johnston <david.g.johnston@gmail.com> Reviewed-by: Jim Jones <jim.jones@uni-muenster.de> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: Pavel Luzanov <p.luzanov@postgrespro.ru> Reviewed-by: Dean Rasheed <dean.a.rasheed@gmail.com> Reviewed-by: Erik Wienhold <ewie@ewie.name> Discussion: https://postgr.es/m/CP8P284MB24965CB63DAC00FC0EA4A475EC462@CP8P284MB2496.BRAP284.PROD.OUTLOOK.COM

Reported-by: Hayato Kuroda (Fujitsu) <kuroda.hayato@fujitsu.com> Reported-by: Andrew Dunstan <andrew@dunslane.net> Discussion: https://postgr.es/m/OSCPR01MB149665630030E7F54FDA8B27BF5C72@OSCPR01MB14966.jpnprd01.prod.outlook.com Discussion: https://postgr.es/m/25d26774-25fa-46f2-9888-c6a707d1fef7@dunslane.net

The original message did not mention where the checkpoint record LSN was found, a control file or a backup_label file. A couple of LOG messages are generated before this FATAL check is reached, providing more details about the way recovery is set up. However, knowing this information in this specific message is useful for debugging. This is also useful for instances where log_min_messages is set to FATAL or more, where LOG messages do not show up. Author: Benoit Lobréau <benoit.lobreau@dalibo.com> Reviewed-by: David Steele <david@pgbackrest.org> Discussion: https://postgr.es/m/4ed10bc8-5513-4d8e-8643-8abcaa08336d@dalibo.com

This patch introduces the '--enable-two-phase' option to the 'pg_createsubscriber' utility, allowing users to enable two-phase commit for all subscriptions during their creation. Note that even without this option users can enable the two_phase option for the subscriptions created by pg_createsubscriber. However, it requires the subscription to be disabled first which could be inconvenient for users. When two-phase commit is enabled, prepared transactions are sent to the subscriber at the time of 'PREPARE TRANSACTION', and they are processed as two-phase transactions on the subscriber as well. If disabled, prepared transactions are sent only when committed and are processed immediately by the subscriber. Author: Shubham Khanna <khannashubham1197@gmail.com> Reviewed-by: vignesh C <vignesh21@gmail.com> Reviewed-by: Hayato Kuroda <kuroda.hayato@fujitsu.com> Reviewed-by: Peter Smith <smithpb2250@gmail.com> Reviewed-by: Ajin Cherian <itsajin@gmail.com> Reviewed-by: Amit Kapila <amit.kapila16@gmail.com> Discussion: https://postgr.es/m/CAHv8RjLPdFP=kA5LNSmWZ=+GMXmO+LczvV6p9HJjsXxZz10KGA@mail.gmail.com

All the processes that generate WAL should call pgstat_report_wal() to report all their statistics related to WAL, and this is already what happens in the tree. Keeping pgstat_report_wal() is confusing while the other routine is encouraged. This routine is not required since fc415ed, where it was lastly used in pgstat_report_stat() before an equivalent callback existed. Author: Bertrand Drouvot <bertranddrouvot.pg@gmail.com> Discussion: https://postgr.es/m/Z71oPkJJICrRB5Ws@paquier.xyz

This new structure contains the counters and the data related to the WAL activity statistics gathered from WalUsage, separated into its own structure so as it can be shared across more than one Stats structure in pg_stat.h. This refactoring will be used by an upcoming patch introducing backend-level WAL statistics. Bump PGSTAT_FILE_FORMAT_ID. Author: Bertrand Drouvot <bertranddrouvot.pg@gmail.com> Discussion: https://postgr.es/m/Z3zqc4o09dM/Ezyz@ip-10-97-1-34.eu-west-3.compute.internal

Using the Python Limited API would allow building PL/Python against any Python 3.x version and using another Python 3.x version at run time. This commit does not activate that, but it prepares the code to only use APIs supported by the Limited API. Implementation details: - Convert static types to heap types (https://docs.python.org/3/howto/isolating-extensions.html#heap-types). - Replace PyRun_String() with component functions. - Replace PyList_SET_ITEM() with PyList_SetItem(). Reviewed-by: Jakob Egger <jakob@eggerapps.at> Discussion: https://www.postgresql.org/message-id/flat/ee410de1-1e0b-4770-b125-eeefd4726a24@eisentraut.org

The test in 005_char_signedness.pl was missing a dash in the --set-char-signedness option. Although the test didn't fail since it doesn't check the error message, it resulted in an unexpected error message instead of the intended one. Oversight in 1aab680. Author: Dagfinn Ilmari Mannsåker <ilmari@ilmari.org> Discussion: https://postgr.es/m/87tt8h1vb7.fsf@wibble.ilmari.org

This reverts commit c47e8df. That commit makes the plpython tests crash with Python 3.6.* and 3.7.*. It will need further investigation and testing, so revert for now.

Previously we used attname for both table and index columns, but that is problematic for indexes because their attnames are assigned by internal rules that don't guarantee to preserve the names across dump and reload. (This is what's causing the remaining buildfarm failures in cross-version-upgrade tests.) Fortunately we can use attnum instead, since there's no such thing as adding or dropping columns in an existing index. We met this same problem previously with ALTER INDEX ... SET STATISTICS, and solved it the same way, cf commit 5b6d13e. In pg_restore_attribute_stats() itself, we accept either attnum or attname, but the policy used by pg_dump is to always use attname for tables and attnum for indexes. Author: Tom Lane <tgl@sss.pgh.pa.us> Author: Corey Huinker <corey.huinker@gmail.com> Discussion: https://postgr.es/m/1457469.1740419458@sss.pgh.pa.us

Reported-by: Michael Paquier <michael@paquier.xyz> Discussion: https://postgr.es/m/Z77IkjmmfbFfNh3f@paquier.xyz

9d9b9d4 has added spinlocks to protect the fields in ProcSignal flags, introducing a code path in ProcSignalInit() where a spinlock could be released twice if the pss_pid field of a ProcSignalSlot is found as already set. Multiple spinlock releases have no effect with most spinlock implementations, but this could cause the code to run into issues when the spinlock is acquired concurrently by a different process. This sanity check on pss_pid generates a LOG that can be delayed until after the spinlock is released as, like older versions up to v17, the code expects the initialization of the ProcSignalSlot to happen even if pss_pid is found incorrect. The code is changed so as the old pss_pid is read while holding the slot's spinlock, with the LOG from the sanity check generated after releasing the spinlock, preventing the double release. Author: Maksim Melnikov <m.melnikov@postgrespro.ru> Co-authored-by: Maxim Orlov <orlovmg@gmail.com> Reviewed-by: Andrey Borodin <x4mmm@yandex-team.ru> Discussion: https://postgr.es/m/dca47527-2d8b-4e3b-b5a0-e2deb73371a4@postgrespro.ru

This commit adds to pgstatfuncs.c a new routine called pg_stat_wal_build_tuple(), helper routine for pg_stat_get_wal(). This is in charge of filling one tuple based on the contents of PgStat_WalStats retrieved from pgstats. This refactoring will be used by an upcoming patch introducing backend-level WAL statistics, simplifying the main patch. Note that it is not possible for stats_reset to be NULL in pg_stat_wal; backend statistics need to be able to handle this case. Author: Bertrand Drouvot <bertranddrouvot.pg@gmail.com> Discussion: https://postgr.es/m/Z3zqc4o09dM/Ezyz@ip-10-97-1-34.eu-west-3.compute.internal

Author: vignesh C <vignesh21@gmail.com> Reviewed-by: Amit Kapila <amit.kapila16@gmail.com> Discussion: https://postgr.es/m/CALDaNm0zsFUYpe-tLha+-sp3K8KmBXu0o=LUN=8FFtxMLYikPA@mail.gmail.com

After commit f41d846, a process could acquire and use a replication slot that had just been invalidated, leading to failures while accessing WAL. To ensure that we don't accidentally start using invalid slots, we must perform the invalidation check after acquiring the slot or under the spinlock where we associate the slot with a particular process. We choose the earlier method to keep the code simple. Reported-by: Hou Zhijie <houzj.fnst@fujitsu.com> Author: Nisha Moond <nisha.moond412@gmail.com> Reviewed-by: Hou Zhijie <houzj.fnst@fujitsu.com> Reviewed-by: Amit Kapila <amit.kapila16@gmail.com> Discussion: https://postgr.es/m/CABdArM7J-LbGoMPGUPiFiLOyB_TZ5+YaZb=HMES0mQqzVTn8Gg@mail.gmail.com

The function in charge of freeing the memory from a result created by PQescapeIdentifier() has to be PQfreemem(), to ensure that both allocation and free come from libpq, but one spot in pg_amcheck was missing that. Oversight in b859d94. Author: Ranier Vilela <ranier.vf@gmail.com> Reviewed-by: vignesh C <vignesh21@gmail.com> Discussion: https://postgr.es/m/CAEudQArD_nKSnYCNUZiPPsJ2tNXgRmLbXGSOrH1vpOF_XtP0Vg@mail.gmail.com Discussion: https://postgr.es/m/CAEudQArbTWVSbxq608GRmXJjnNSQ0B6R7CSffNnj2hPWMUsRNg@mail.gmail.com Backpatch-through: 14

Previously the internal queue of buffers was capped at max_ios * 4, though not less than io_combine_limit, at allocation time. That was done in the first version based on conservative theories about resource usage and heuristics pending later work. The configured I/O depth could not always be reached with dense random streams generated by ANALYZE, VACUUM, the proposed Bitmap Heap Scan patch, and also sequential streams with the proposed AIO subsystem to name some examples. The new formula is (max_ios + 1) * io_combine_limit, enough buffers for the full configured I/O concurrency level using the full configured I/O combine size, plus the buffers from one finished but not yet consumed full-sized I/O. Significantly more memory would be needed for high GUC values if the client code requests a large per-buffer data size, but that is discouraged (existing and proposed stream users try to keep it under a few words, if not zero). With this new formula, an intermediate variable could have overflowed under maximum GUC values, so its data type is adjusted to cope. Discussion: https://postgr.es/m/CA%2BhUKGK_%3D4CVmMHvsHjOVrK6t4F%3DLBpFzsrr3R%2BaJYN8kcTfWg%40mail.gmail.com

As spotted by Coverity, the calculation of ojrelid mixes signed and unsigned types causes possible overflow and undefined behavior. Instead of trying to fix the expression, this commit eliminates the relied local variable. The explicit branching is used to replace the -1 value. That, in turn, requires changing the signature of the remove_rel_from_eclass() function. Reported-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/914330.1740330169%40sss.pgh.pa.us Reviewed-by: Andrei Lepikhov <lepihov@gmail.com>

pgbench wants to record the starting line number of each command in its scripts. It was computing that by scanning from the script start and counting newlines, so that O(N^2) work had to be done for an N-command script. In a script with 50K lines, this adds up to about 10 seconds on my machine. To add insult to injury, the results were subtly wrong, because expr_scanner_offset() scanned to find the NUL that flex inserts at the end of the current token --- and before the first yylex call, no such NUL has been inserted. So we ended by computing the script's last line number not its first one. This was visible only in case of \gset at the start of a script, which perhaps accounts for the lack of complaints. To fix, steal an idea from plpgsql and track the current lexer ending position and line count as we advance through the script. (It's a bit simpler than plpgsql since we can't need to back up.) Also adjust a couple of other places that were invoking scans from script start when they didn't really need to. I made a new psqlscan function psql_scan_get_location() that replaces both expr_scanner_offset() and expr_scanner_get_lineno(), since in practice expr_scanner_get_lineno() was only being invoked to find the line number of the current lexer end position. Reported-by: Daniel Vérité <daniel@manitou-mail.org> Author: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/84a8a89e-adb8-47a9-9d34-c13f7150ee45@manitou-mail.org

ParseScript only needs the lineno for meta-commands, so let's not bother computing it otherwise. While this doesn't save much given the previous patch, there's no point in doing unnecessary work. While we're at it, avoid calling psql_scan_get_location() twice for a meta-command. One reason for making this change is that the line number computed in ParseScript's main loop was actually wrong in most cases: it would point just past the semicolon of the previous SQL command, not at what the user thinks the current command's line number is. We could add some code to skip whitespace before capturing the line number, but it would be pretty pointless at present. Just move the call to avoid the temptation to rely on that value. (Once we've lexed the backslash, the computed line number will be right.) This change also means that pgbench never inquires about the location before it's lexed something, so that the care taken in the previous patch to behave sanely in that case is unnecessary. It seems best to keep that logic, though, as future callers might depend on it. Author: Daniel Vérité <daniel@manitou-mail.org> Discussion: https://postgr.es/m/84a8a89e-adb8-47a9-9d34-c13f7150ee45@manitou-mail.org

Stop comparing access method OID values against HASH_AM_OID and BTREE_AM_OID, and instead check the IndexAmRoutine for an index to see if it advertises its ability to perform the necessary ordering, hashing, or cross-type comparing functionality. A field amcanorder already existed, this uses it more widely. Fields amcanhash and amcancrosscompare are added for the other purposes. Author: Mark Dilger <mark.dilger@enterprisedb.com> Discussion: https://www.postgresql.org/message-id/flat/E72EAA49-354D-4C2E-8EB9-255197F55330@enterprisedb.com

This is a preparation step for allowing subscripting containers to transform only a prefix of an indirection list and modify the list in-place by removing the processed elements. Currently, all elements are consumed, and the list is set to NIL after transformation. In the following commit, subscripting containers will gain the flexibility to stop transformation when encountering an unsupported indirection and return the remaining indirections to the caller.

This change extends generic type subscripting to recognize dot notation (.) in addition to bracket notation ([]). While this does not yet provide full support for dot notation, it enables subscripting containers to process it in the future. For now, container-specific transform functions only handle subscripting indices and stop processing when encountering dot notation. It is up to individual containers to decide how to transform dot notation in subsequent updates.

This is a preparation step for a future commit that will reuse the aforementioned function.

Now that we are allowing container generic subscripting to take dot notation in the list of indirections, and it is transformed as a String node. For jsonb, we want to represent field accessors as String nodes in refupperexprs for distinguishing from ordinary text subscripts which can be needed for correct EXPLAIN. Strings node is no longer a valid expression nodes, so added special handling for them in walkers in nodeFuncs etc.

This patch introduces JSONB member access using dot notation, wildcard access, and array subscripting with slicing, aligning with the JSON simplified accessor specified in SQL:2023. Specifically, the following syntax enhancements are added: 1. Simple dot-notation access to JSONB object fields 2. Wildcard dot-notation access to JSONB object fields 2. Subscripting for index range access to JSONB array elements Examples: -- Setup create table t(x int, y jsonb); insert into t select 1, '{"a": 1, "b": 42}'::jsonb; insert into t select 1, '{"a": 2, "b": {"c": 42}}'::jsonb; insert into t select 1, '{"a": 3, "b": {"c": "42"}, "d":[11, 12]}'::jsonb; -- Existing syntax predates the SQL standard: select (t.y)->'b' from t; select (t.y)->'b'->'c' from t; select (t.y)->'d'->0 from t; -- JSON simplified accessor specified by the SQL standard: select (t.y).b from t; select (t.y).b.c from t; select (t.y).d[0] from t; The SQL standard states that simplified access is equivalent to: JSON_QUERY (VEP, 'lax $.JC' WITH CONDITIONAL ARRAY WRAPPER NULL ON EMPTY NULL ON ERROR) where: VEP = <value expression primary> JC = <JSON simplified accessor op chain> For example, the JSON_QUERY equivalents of the above queries are: select json_query(y, 'lax $.b' WITH CONDITIONAL ARRAY WRAPPER NULL ON EMPTY NULL ON ERROR) from t; select json_query(y, 'lax $.b.c' WITH CONDITIONAL ARRAY WRAPPER NULL ON EMPTY NULL ON ERROR) from t; select json_query(y, 'lax $.d[0]' WITH CONDITIONAL ARRAY WRAPPER NULL ON EMPTY NULL ON ERROR) from t; Implementation details: Extends the existing container subscripting interface to support container-specific information, specifically a JSONPath expression for jsonb. During query transformation, detects dot-notation, wildcard access, and sliced subscripting. If any of these accessors are present, constructs a JSONPath expression representing the access chain. During execution, if a JSONPath expression is present in JsonbSubWorkspace, executes it via JsonPathQuery(). Does not transform accessors directly into JSON_QUERY during transformation to preserve the original query structure for EXPLAIN and CREATE VIEW.

If the number of sync requests is big enough, the palloc() call in AbsorbSyncRequests() will attempt to allocate more than 1 GB of memory, resulting in failure. This can lead to an infinite loop in the checkpointer process, as it repeatedly fails to absorb the pending requests. This commit introduces the following changes to cope with this problem: 1. Turn pending checkpointer requests array in shared memory into a bounded ring buffer. 2. Limit maximum ring buffer size to 10M items. 3. Make AbsorbSyncRequests() process requests incrementally in 10K batches. Even #2 makes the whole queue size fit the maximum palloc() size of 1 GB. of continuous lock holding. This commit is for master only. Simpler fix, which just limits a request queue size to 10M, will be backpatched. Reported-by: Ekaterina Sokolova <e.sokolova@postgrespro.ru> Discussion: https://postgr.es/m/db4534f83a22a29ab5ee2566ad86ca92%40postgrespro.ru Author: Maxim Orlov <orlovmg@gmail.com> Co-authored-by: Xuneng Zhou <xunengzhou@gmail.com> Reviewed-by: Andres Freund <andres@anarazel.de> Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi> Reviewed-by: Alexander Korotkov <aekorotkov@gmail.com>

There've been a few complaints that it can be overly difficult to figure out why the planner picked a Memoize plan. To help address that, here we adjust the EXPLAIN output to display the following additional details: 1) The estimated number of cache entries that can be stored at once 2) The estimated number of unique lookup keys that we expect to see 3) The number of lookups we expect 4) The estimated hit ratio Technically postgres#4 can be calculated using #1, #2 and #3, but it's not a particularly obvious calculation, so we opt to display it explicitly. The original patch by Lukas Fittl only displayed the hit ratio, but there was a fear that might lead to more questions about how that was calculated. The idea with displaying all 4 is to be transparent which may allow queries to be tuned more easily. For example, if #2 isn't correct then maybe extended statistics or a manual n_distinct estimate can be used to help fix poor plan choices. Author: Ilia Evdokimov <ilya.evdokimov@tantorlabs.com> Author: Lukas Fittl <lukas@fittl.com> Reviewed-by: David Rowley <dgrowleyml@gmail.com> Reviewed-by: Andrei Lepikhov <lepihov@gmail.com> Reviewed-by: Robert Haas <robertmhaas@gmail.com> Discussion: https://postgr.es/m/CAP53Pky29GWAVVk3oBgKBDqhND0BRBN6yTPeguV_qSivFL5N_g%40mail.gmail.com

truncate_useless_pathkeys() seems to have neglected to account for PathKeys that might be useful for WindowClause evaluation. Modify it so that it properly accounts for that. Making this work required adjusting two things: 1. Change from checking query_pathkeys to check sort_pathkeys instead. 2. Add explicit check for window_pathkeys For #1, query_pathkeys gets set in standard_qp_callback() according to the sort order requirements for the first operation to be applied after the join planner is finished, so this changes depending on which upper planner operations a particular query needs. If the query has window functions and no GROUP BY, then query_pathkeys gets set to window_pathkeys. Before this change, this meant PathKeys useful for the ORDER BY were not accounted for in queries with window functions. Because of #1, #2 is now required so that we explicitly check to ensure we don't truncate away PathKeys useful for window functions. Author: David Rowley <dgrowleyml@gmail.com> Discussion: https://postgr.es/m/CAApHDvrj3HTKmXoLMbUjTO=_MNMxM=cnuCSyBKidAVibmYPnrg@mail.gmail.com

jeff-davis and others added 30 commits February 20, 2025 10:21

Add missing entry to oauth_validator test .gitignore

2c53dec

Commit b3f0be7 accidentally missed adding the oauth client test binary to the relevant .gitignore. Reported-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/2839306.1740082041@sss.pgh.pa.us

Fix for pg_restore_attribute_stats().

b50a554

Use RelationGetIndexExpressions() rather than rd_indexprs directly. Author: Corey Huinker <corey.huinker@gmail.com>

doc: remove non-breaking space in SGML files, causes make error

6ea0734

Add test 005_char_signedness.pl to meson.build.

78d3f48

Oversight in a8238f8 where the test has been added. Discussion: https://postgr.es/m/CB11ADBC-0C3F-4FE0-A678-666EE80CBB07%40amazon.com

michaelpq and others added 26 commits February 26, 2025 14:26

Revert "Prepare for Python "Limited API" in PL/Python"

f734c9f

This reverts commit c47e8df. That commit makes the plpython tests crash with Python 3.6.* and 3.7.*. It will need further investigation and testing, so revert for now.

Remove stray diff introduced by a5cbdeb.

15df9d7

Reported-by: Michael Paquier <michael@paquier.xyz> Discussion: https://postgr.es/m/Z77IkjmmfbFfNh3f@paquier.xyz

Doc: Additional clarification for -d option of pg_createsubscriber.

845511a

Author: vignesh C <vignesh21@gmail.com> Reviewed-by: Amit Kapila <amit.kapila16@gmail.com> Discussion: https://postgr.es/m/CALDaNm0zsFUYpe-tLha+-sp3K8KmBXu0o=LUN=8FFtxMLYikPA@mail.gmail.com

Export jsonPathFromParseResult()

6029a27

This is a preparation step for a future commit that will reuse the aforementioned function.

Extract coerce_jsonpath_subscript()

f03fa16

This is a preparation step for a future commit that will reuse the aforementioned function.

Allow wild card member access for jsonb

a90375b

l-wang closed this Feb 27, 2025

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Json v9 #2

Json v9 #2

Uh oh!

l-wang commented Feb 27, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

18 participants

Json v9 #2

Json v9 #2

Uh oh!

Conversation

l-wang commented Feb 27, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

18 participants