Conversation
contrib/intarray considers "arraycol <@ constant-array" to be indexable, but its GiST opclass code fails to reliably find index entries for empty array values (which of course should trivially match such queries). This is because the test condition to see whether we should descend through a non-leaf node is wrong. Unfortunately, empty array entries could be anywhere in the index, as these index opclasses are currently designed. So there's no way to fix this except by lobotomizing <@ indexscans to scan the whole index ... which is what this patch does. That's pretty unfortunate: the performance is now actually worse than a seqscan, in most cases. We'd be better off to remove <@ from the GiST opclasses entirely, and perhaps a future non-back-patchable patch will do so. In the meantime, applications whose performance is adversely impacted have a couple of options. They could switch to a GIN index, which doesn't have this bug, or they could replace "arraycol <@ constant-array" with "arraycol <@ constant-array AND arraycol && constant-array". That will provide about the same performance as before, and it will find all non-empty subsets of the given constant-array, which is all that could reliably be expected of the query before. While at it, add some more regression test cases to improve code coverage of contrib/intarray. In passing, adjust resize_intArrayType so that when it's returning an empty array, it uses construct_empty_array for that rather than cowboy hacking on the input array. While the hack produces an array that looks valid for most purposes, it isn't bitwise equal to empty arrays produced by other code paths, which could have subtle odd effects. I don't think this code path is performance-critical enough to justify such shortcuts. (Back-patch this part only as far as v11; before commit 01783ac we were not careful about this in other intarray code paths either.) Back-patch the <@ fixes to all supported versions, since this was broken from day one. Patch by me; thanks to Alexander Korotkov for review. Discussion: https://postgr.es/m/458.1565114141@sss.pgh.pa.us
When parsing a timetz string with a dynamic timezone abbreviation or a timezone not specified, it was possible to generate incorrect timestamps based on a date which uses some non-initialized variables if the input string did not specify fully a date to parse. This is already checked when a full timezone spec is included in the input string, but the two other cases mentioned above missed the same checks. This gets fixed by generating an error as this input is invalid, or in short when a date is not fully specified. Valgrind was complaining about this problem. Bug: #15910 Author: Alexander Lakhin Discussion: https://postgr.es/m/15910-2eba5106b9aa0c61@postgresql.org Backpatch-through: 9.4
In serializable mode, heap_hot_search_buffer() incorrectly acquired a predicate lock on the root tuple, not the returned tuple that satisfied the visibility checks. As explained in README-SSI, the predicate lock does not need to be copied or extended to other tuple versions, but for that to work, the correct, visible, tuple version must be locked in the first place. The original SSI commit had this bug in it, but it was fixed back in 2013, in commit 81fbbfe. But unfortunately, it was reintroduced a few months later in commit b89e151. Wising up from that, add a regression test to cover this, so that it doesn't get reintroduced again. Also, move the code that sets 't_self', so that it happens at the same time that the other HeapTuple fields are set, to make it more clear that all the code in the loop operate on the "current" tuple in the chain, not the root tuple. Bug spotted by Andres Freund, analysis and original fix by Thomas Munro, test case and some additional changes to the fix by Heikki Linnakangas. Backpatch to all supported versions (9.4). Discussion: https://www.postgresql.org/message-id/20190731210630.nqhszuktygwftjty%40alap3.anarazel.de
VACUUM's reference page had this text, but ANALYZE's didn't. That's a clear oversight given that section 5.7 explicitly delegates the responsibility to define permissions requirements to the individual commands' man pages. Per gripe from Isaac Morland. Back-patch to all supported branches. Discussion: https://postgr.es/m/CAMsGm5fp3oBUs-2iRfii0iEO=fZuJALVyM2zJLhNTjG34gpAVQ@mail.gmail.com
With some newer gcc versions (8 and 9) you get a -Wformat-overflow warning here. In PG11 and later this was already fixed. Since it's trivial, backport it to get the older branches building without warnings.
Commit 07b3908 inserted an unconditional reference to pg_opfamily, which of course fails on servers predating that catalog. Fortunately, the case it's trying to solve can't occur on such old servers (AFAIK). Hence, just skip the additional code when the source predates 8.3. Per bug #15955 from sly. Back-patch to all supported branches, like the previous patch. Discussion: https://postgr.es/m/15955-1daa2e676e903d87@postgresql.org
ALTER SYSTEM itself normally won't make duplicate entries (although up till this patch, it was possible to confuse it by writing case variants of a GUC's name). However, if some external tool has appended entries to the file, that could result in duplicate entries for a single GUC name. In such a situation, ALTER SYSTEM did exactly the wrong thing, because it replaced or removed only the first matching entry, leaving the later one(s) still there and hence still determining the active value. This patch fixes that by making ALTER SYSTEM sweep through the file and remove all matching entries, then (if not ALTER SYSTEM RESET) append the new setting to the end. This means entries will be in order of last setting rather than first setting, but that shouldn't hurt anything. Also, make the comparisons case-insensitive so that the right things happen if you do, say, ALTER SYSTEM SET "TimeZone" = 'whatever'. This has been broken since ALTER SYSTEM was invented, so back-patch to all supported branches. Ian Barwick, with minor mods by me Discussion: https://postgr.es/m/aed6cc9f-98f3-2693-ac81-52bb0052307e@2ndquadrant.com
Clarify what external tools can do to this file, and add a bit of detail about what ALTER SYSTEM itself does. Discussion: https://postgr.es/m/aed6cc9f-98f3-2693-ac81-52bb0052307e@2ndquadrant.com
This is a variant of the problem fixed in commit 25b6925, which unfortunately we failed to detect at the time. If an update trigger returns the "old" tuple, as it's entitled to do, then a subsequent iteration of the loop in ExecBRUpdateTriggers would have "oldtuple" equal to "trigtuple" and would fail to notice that it shouldn't free that. In addition to fixing the code, extend the test case added by 25b6925 so that it covers multiple-trigger-iterations cases. This problem does not manifest in v12/HEAD, as a result of the relevant code having been largely rewritten for slotification. However, include the test case into v12/HEAD anyway, since this is clearly an area that someone could break again in future. Per report from Piotr Gabriel Kosinski. Back-patch into all supported branches, since the bug seems quite old. Diagnosis and code fix by Thomas Munro, test case by me. Discussion: https://postgr.es/m/CAFMLSdP0rd7LqC3j-H6Fh51FYSt5A10DDh-3=W4PPc4LLUQ8YQ@mail.gmail.com
All those fixes are already included on HEAD thanks to for example c96581a and 66bde49, and have gone missing on back-branches. Author: Alexander Lakhin, Liudmila Mantrova Discussion: https://postgr.es/m/CAEkD-mDJHV3bhgezu3MUafJLoAKsOOT86+wHukKU8_NeiJYhLQ@mail.gmail.com Backpatch-through: 9.4
Author: Alexander Lakhin Discussion: https://postgr.es/m/20190819072244.GE18166@paquier.xyz
The "Express" flavor of Visual Studio exists up to 2017, and the documentation referred to "Express" for Visual Studio 2019. Author: Takuma Hoshiai Discussion: https://postgr.es/m/20190820120231.f905542e685140258ca73d82@sraoss.co.jp Backpatch-through: 9.4
POSIX permits getopt() to advance optind beyond argc when the last argv entry is an option that requires an argument and hasn't got one. It seems that no major platforms actually do that, but musl does, so that something like "psql -f" would crash with that libc. Add a check that optind is in range before trying to look at the possibly-bogus option. Report and fix by Quentin Rameau. Back-patch to all supported branches. Discussion: https://postgr.es/m/20190825100617.GA6087@fifth.space
On msys2, 'uname -s' reports a string starting MSYS instead on MINGW as happens on msys1. Treat these both the same way. This reverts 608a710 in favor of a more general solution. Backpatch to all live branches.
Section 4.2.7 says that unless otherwise specified, built-in aggregates ignore rows in which any input is null. This is not true of the JSON aggregates, but it wasn't documented. Fix that. Of the other entries in table 9.55, some were explicit about ignoring nulls, and some weren't; for consistency and self-contained-ness, make them all say it explicitly. Per bug #15884 from Tim Möhlmann. Back-patch to all supported branches. Discussion: https://postgr.es/m/15884-c32d848f787fcae3@postgresql.org
The comment did not match what the code actually did for integers with the 43rd bit set. You get an integer like that, if you have a posting list with two adjacent TIDs that are more than 2^31 blocks apart. According to the comment, we would store that in 6 bytes, with no continuation bit on the 6th byte, but in reality, the code encodes it using 7 bytes, with a continuation bit on the 6th byte as normal. The decoding routine also handled these 7-byte integers correctly, except for an overflow check that assumed that one integer needs at most 6 bytes. Fix the overflow check, and fix the comment to match what the code actually does. Also fix the comment that claimed that there are 17 unused bits in the 64-bit representation of an item pointer. In reality, there are 64-32-11=21. Fitting any item pointer into max 6 bytes was an important property when this was written, because in the old pre-9.4 format, item pointers were stored as plain arrays, with 6 bytes for every item pointer. The maximum of 6 bytes per integer in the new format guaranteed that we could convert any page from the old format to the new format after upgrade, so that the new format was never larger than the old format. But we hardly need to worry about that anymore, and running into that problem during upgrade, where an item pointer is expanded from 6 to 7 bytes such that the data doesn't fit on a page anymore, is implausible in practice anyway. Backpatch to all supported versions. This also includes a little test module to test these large distances between item pointers, without requiring a 16 TB table. It is not backpatched, I'm including it more for the benefit of future development of new posting list formats. Discussion: https://www.postgresql.org/message-id/33bfc20a-5c86-f50c-f5a5-58e9925d05ff%40iki.fi Reviewed-by: Masahiko Sawada, Alexander Korotkov
These have been there a long time, but their format was never explained in the docs. Per complaint from Yusuke Egashira. Discussion: https://postgr.es/m/848B1649C8A6274AA527C4472CA11EDD5FC70CBE@G01JPEXMBYT02
After an unexpected connection loss and successful reconnection, psql neglected to resynchronize its internal state about the server, such as server version. Ordinarily we'd be reconnecting to the same server and so this isn't really necessary, but there are scenarios where we do need to update --- one example is where we have a list of possible connection targets and they're not all alike. Define "resynchronize" as including connection_warnings(), so that this case acts the same as \connect. This seems useful; for example, if the server version did change, the user might wish to know that. An attuned user might also notice that the new connection isn't SSL-encrypted, for example, though this approach isn't especially in-your-face about such changes. Although this part is a behavioral change, it only affects interactive sessions, so it should not break any applications. Also, in do_connect, make sure that we desynchronize correctly when abandoning an old connection in non-interactive mode. These problems evidently are the result of people patching only one of the two places where psql deals with connection changes, so insert some cross-referencing comments in hopes of forestalling future bugs of the same ilk. Lastly, in Windows builds, issue codepage mismatch warnings only at startup, not during reconnections. psql's codepage can't change during a reconnect, so complaining about it again seems like useless noise. Peter Billen and Tom Lane. Back-patch to all supported branches. Discussion: https://postgr.es/m/CAMTXbE8e6U=EBQfNSe01Ej17CBStGiudMAGSOPaw-ALxM-5jXg@mail.gmail.com
The old code didn't differentiate between a read error and a concurrent truncation. fread reports both of these by returning 0; you have to use feof() or ferror() to distinguish between them, which this code did not do. It might be a better idea to use read() rather than fread() here, so that we can display a less-generic error message, but I'm not sure that would qualify as a back-patchable bug fix, so just do this much for now. Jeevan Chalke, reviewed by Jeevan Ladhe and by me. Discussion: http://postgr.es/m/CA+TgmobG4ywMzL5oQq2a8YKp8x2p3p1LOMMcGqpS7aekT9+ETA@mail.gmail.com
Some systems don't ship with "python" by default anymore, only "python3" or "python2" or some combination, so include those in the configure search. Back-patch of commit 7291733. At the time that was only pushed back as far as v10, because of concerns about interactions with commit b21c569. Closer analysis shows that if we just s/AC_PATH_PROG/AC_PATH_PROGS/, there is no effect on the older branches' behavior when PYTHON is explicitly specified, so it should be okay to back-patch: this will not break any configuration that worked before. And the need to support platforms with only a "python3" or "python2" executable is getting ever more urgent. Original patch by Peter Eisentraut, back-patch analysis by me Discussion: https://www.postgresql.org/message-id/flat/1457.1543184081%40sss.pgh.pa.us#c9cc1199338fd6a257589c6dcea6cf8d
Previously plain float comparison was used in GiST pairing heap. Such comparison doesn't provide proper ordering for value sets containing Inf and Nan values. This commit fixes that by usage of float8_cmp_internal(). Note, there is remaining problem with NULL distances, which are represented as Inf in pairing heap. It would be fixes in subsequent commit. Backpatch to all supported versions. Reported-by: Andrey Borodin Discussion: https://postgr.es/m/CAPpHfdsNvNdA0DBS%2BwMpFrgwT6C3-q50sFVGLSiuWnV3FqOJuQ%40mail.gmail.com Author: Alexander Korotkov Reviewed-by: Heikki Linnakangas Backpatch-through: 9.4
In order to implement NULL LAST semantic GiST previously assumed distance to the NULL value to be Inf. However, our distance functions can return Inf and NaN for non-null values. In such cases, NULL LAST semantic appears to be broken. This commit fixes that by introducing separate array of null flags for distances. Backpatch to all supported versions. Discussion: https://postgr.es/m/CAPpHfdsNvNdA0DBS%2BwMpFrgwT6C3-q50sFVGLSiuWnV3FqOJuQ%40mail.gmail.com Author: Alexander Korotkov Backpatch-through: 9.4
…cks. Some of these are quite old, but that doesn't make them not bugs. We'd rather report a failure via elog than SIGSEGV. While at it, uniformly spell the error check as !RelationIsValid(rel) rather than a bare rel == NULL test. The machine code is the same but it seems better to be consistent. Coverity complained about this today, not sure why, because the mistake is in fact old.
Modern versions of msys2 have changed the treatment of "cmd /c" so that the runtime will try to convert the switch to a native file path. This patch adds a setting to inhibit that behaviour. Discussion: https://postgr.es/m/3227042f-cfcc-745a-57dd-fb8c471f8ddf@2ndQuadrant.com Backpatch to all live branches.
Depending on the system used, t/*.pl may not be expanded into a list of tests which can be consumed by prove when attempting to run TAP tests on a given path. Fix that by using glob() directly in the script, to make sure that a complete list of tests is provided. This has not proved to be an issue with MSVC as the list was properly expanded, but it is on Linux with perl's system(). This is extracted from a larger patch. Author: Tom Lane Discussion: https://postgr.es/m/6628.1567958876@sss.pgh.pa.us Backpatch-through: 9.4
The example used to explain 'Looping Through Query Results' uses pseudo-materialized views. Replace it with a more up-to-date example which does the same thing with actual materialized views, which have been available since PostgreSQL 9.3. In the passing, change '%' as format specifier instead of '%s' as is used in other examples in plpgsql.sgml. Reported-by: Ian Barwick Author: Ian Barwick Reviewed-by: Amit Kapila Backpatch-through: 9.4 Discussion: https://postgr.es/m/9a70d393-7904-4918-c97c-649f6d114b6a@2ndquadrant.com
Since WITH CHECK OPTION was introduced, ExecInitModifyTable has initialized WCO expressions with the wrong plan node as parent -- that is, it passed its input subplan not the ModifyTable node itself. Up to now we thought this was harmless, but bug #16006 from Vinay Banakar shows it's not: if the input node is a SubqueryScan then ExecInitWholeRowVar can get confused into doing the wrong thing. (The fact that ExecInitWholeRowVar contains such logic is certainly a horrid kluge that doesn't deserve to live, but figuring out another way to do that is a task for some other day.) Andres had already noticed the wrong-parent mistake and fixed it in commit 148e632, but not being aware of any user-visible consequences, he quite reasonably didn't back-patch. This patch is simply a back-patch of 148e632, plus addition of a test case based on bug #16006. I also added the test case to v12/HEAD, even though the bug is already fixed there. Back-patch to all supported branches. 9.4 lacks RLS policies so the new test case doesn't work there, but I'm pretty sure a test could be devised based on using a whole-row Var in a plain WITH CHECK OPTION condition. (I lack the cycles to do so myself, though.) Andres Freund and Tom Lane Discussion: https://postgr.es/m/16006-99290d2e4642cbd5@postgresql.org Discussion: https://postgr.es/m/20181205225213.hiwa3kgoxeybqcqv@alap3.anarazel.de
Most WAL records are ignored in early SnapBuild snapshot build phases. But it's critical to process some of them, so that later messages have the correct transaction state after the snapshot is completely built; in particular, XLOG_XACT_ASSIGNMENT messages are critical in order for sub-transactions to be correctly assigned to their parent transactions, or at least one assert misbehaves, as reported by Ildar Musin. Diagnosed-by: Masahiko Sawada Author: Masahiko Sawada Discussion: https://postgr.es/m/CAONYFtOv+Er1p3WAuwUsy1zsCFrSYvpHLhapC_fMD-zNaRWxYg@mail.gmail.com
It's important users be able to know (without looking at the source code) that running DDL or DDL-like commands can interrupt autovacuum which can lead to a lot of dead tuples and hence slower database operations. Reported-by: James Coleman Author: James Coleman Reviewed-by: Amit Kapila Backpatch-through: 9.4 Discussion:https://postgr.es/m/CAAaqYe-XYyNwML1=f=gnd0qWg46PnvD=BDrCZ5-L94B887XVxQ@mail.gmail.com
ssh and scl commands failing during cli_cross_subnet execution with error. "Unable to negotiate with 10.0.43.106 port 22: no matching host key type found. Their offer: ssh-rsa,ssh-dss lost connection" This could be an issue with OpenSSH (7.0 and greater).OpenSSH implements all of the cryptographic algorithms needed for compatibility with standards-compliant SSH implementations, but since some of the older algorithms have been found to be weak, not all of them are enabled by default. We can avoid this failures by using below flags while running ssh/scp commands. HostKeyAlgorithms=+ssh-dss and PubkeyAcceptedKeyTypes=+ssh-rsa
Commit 71ec37a removed the implicit format in favor of better aligning with Postgres. However, it broke existing customers upgrades from 5X to 6X. This patch brings back the old behavior behind a GUC 'enable_implicit_timeformat_YYYYMMDDHH24MISS' that is disabled by default. Co-authored-by: Ashwin Agrawal <aashwin@vmware.com>
* Docs: added gpmemwatcher and gpmemreport to the docs * minor changes * Some modifications after review * copyedit * lowercasing some more variables * copyedits Co-authored-by: David Yozie <dyozie@pivotal.io>
* Docs: added new gp_dispatch_keepalive gucs * Changes * Changed set classifications
For the following query, Orca generated plan that caused wrong results:
CREATE TABLE dist_tab_a (a varchar(15)) DISTRIBUTED BY(a);
INSERT INTO dist_tab_a VALUES('1 '), ('2 '), ('3 ');
CREATE TABLE dist_tab_b (a char(15), b bigint) DISTRIBUTED BY(a);
INSERT INTO dist_tab_b VALUES('1 ', 1), ('2 ', 2), ('3 ', 3);
CREATE TABLE result_tab_a AS
(SELECT a.a, b.b FROM dist_tab_a a LEFT JOIN dist_tab_b b ON a.a=b.a) DISTRIBUTED BY(a);
EXPLAIN CREATE TABLE result_tab_a AS
(SELECT a.a, b.b FROM dist_tab_a a LEFT JOIN dist_tab_b b ON a.a=b.a) DISTRIBUTED BY(a);
Hash Left Join (cost=0.00..862.00 rows=2 width=16)
Hash Cond: ((dist_tab_a.a)::bpchar = dist_tab_b.a)
-> Redistribute Motion 3:3 (slice1; segments: 3) (cost=0.00..431.00 rows=1 width=8)
Hash Key: dist_tab_a.a
-> Seq Scan on dist_tab_a (cost=0.00..431.00 rows=1 width=8)
-> Hash (cost=431.00..431.00 rows=1 width=16)
-> Seq Scan on dist_tab_b (cost=0.00..431.00 rows=1 width=16)
Optimizer: Pivotal Optimizer (GPORCA)
SELECT gp_table_distribution_check('result_tab_a');
gp_table_distribution_check
-----------------------------
(2,t)
(0,f)
(1,f)
(3 rows)
Correct plan and results should be:
EXPLAIN CREATE TABLE result_tab_a AS
(SELECT a.a, b.b FROM dist_tab_a a LEFT JOIN dist_tab_b b ON a.a=b.a) DISTRIBUTED BY(a);
Result (cost=0.00..862.05 rows=2 width=16)
-> Redistribute Motion 3:3 (slice1; segments: 3) (cost=0.00..862.00 rows=2 width=16)
Hash Key: dist_tab_a.a
-> Hash Left Join (cost=0.00..862.00 rows=1 width=16)
Hash Cond: ((dist_tab_a.a)::bpchar = dist_tab_b.a)
-> Redistribute Motion 3:3 (slice2; segments: 3) (cost=0.00..431.00 rows=1 width=8)
Hash Key: dist_tab_a.a
-> Seq Scan on dist_tab_a (cost=0.00..431.00 rows=1 width=8)
-> Hash (cost=431.00..431.00 rows=1 width=16)
-> Seq Scan on dist_tab_b (cost=0.00..431.00 rows=1 width=16)
Optimizer: Pivotal Optimizer (GPORCA)
(11 rows)
SELECT gp_table_distribution_check('result_tab_a');
gp_table_distribution_check
-----------------------------
(2,t)
(0,t)
(1,t)
(3 rows)
I think the root cause is that when we derive distribution of Hash Left
Outer Join, we create a hash distribution spec with wrong opfamilies.
As a result, there isn't a mismatch between distribution spec
requested/derived, Motion Redistribute node is not added to plantree
Authored-by: Robert Mu <muguoqing@hashdata.cn>
(cherry picked from commit 9ea1445)
…ning-functions (#12897)
…ursor(). This commit also invalidate CatalogSnapshot at the end of the function readSharedLocalSnapshot_forCursor(), so that next it will create a new CatalogSnapshot that contains the correct command id from the shared snapshot loaded.
The buf-file of snapshot passed to reader cursor is not large so we do not need to put it in temp tablespaces even the GUC is set. This can fix a complicated infinite recursive function call problem mentioned in Github Issue 12871. (So long, not suitable to elaborate here). Fix Github Issue: #12871, refer to the issue page for details.
The server collects up to a bufferload of data whenever it reads data from the client socket. When SSL or GSS encryption is requested during startup, any additional data received with the initial request message remained in the buffer, and would be treated as already-decrypted data once the encryption handshake completed. Thus, a man-in-the-middle with the ability to inject data into the TCP connection could stuff some cleartext data into the start of a supposedly encryption-protected database session. This could be abused to send faked SQL commands to the server, although that would only work if the server did not demand any authentication data. (However, a server relying on SSL certificate authentication might well not do so.) To fix, throw a protocol-violation error if the internal buffer is not empty after the encryption handshake. Our thanks to Jacob Champion for reporting this problem. Security: CVE-2021-23214 (cherry picked from commit 046c2c8)
libpq collects up to a bufferload of data whenever it reads data from the socket. When SSL or GSS encryption is requested during startup, any additional data received with the server's yes-or-no reply remained in the buffer, and would be treated as already-decrypted data once the encryption handshake completed. Thus, a man-in-the-middle with the ability to inject data into the TCP connection could stuff some cleartext data into the start of a supposedly encryption-protected database session. This could probably be abused to inject faked responses to the client's first few queries, although other details of libpq's behavior make that harder than it sounds. A different line of attack is to exfiltrate the client's password, or other sensitive data that might be sent early in the session. That has been shown to be possible with a server vulnerable to CVE-2021-23214. To fix, throw a protocol-violation error if the internal buffer is not empty after the encryption handshake. Our thanks to Jacob Champion for reporting this problem. Security: CVE-2021-23222 (cherry picked from commit d83cdfd)
* docs - pxf removed from greenplum server distribution * update redirects to release tag * misc edits to clarify * update a few installing xrefs so redirects will work * update a few xrefs so redirects will work
… (#12888) With GDD enabled, QE might be blocked by other transaction's UPDATE operation, and later wakes up, EvalPlanQual logic is triggered. To re-evaluate the plan, it will try to init all subplans (even if the subplan is not needed in the plan to re-eval), we should also consider nmotions when init such subplans. Throw error to avoid PANIC when such subplans contain motion. Fix Issue: greenplum-db/gpdb#12902. NOTE: this is not a behavior change and this just makes thing correct. See Greenplum's document: https://docs.greenplum.org/6-16/admin_guide/dml.html#topic_gdd for details. Error out here to make the database consistent is the same idea as the Postgres's repeatable read isolation level, see the Postgres doc for details: https://www.postgresql.org/docs/9.4/transaction-iso.html.
Cherry-picked from e4c6c36 and follow-up 780005f with minor conflicts resolved. Original commit message follows along with a manual repro for 6X that involves named portals in extended protocol: ------------------------------------------------------------------------- Since commit 2fa7c06, we have introduced an opportunity where the active statement count is leaked (the active statements is not decremented with ResLockUpdateLimit(.., .., false, ..). This can happen during a deadlock report or a statement cancellation of a statement if the session has at least one other active named portal. During CheckDeadlock()/ResLockWaitCancel(), we would clean up the locallock, which would cause a subsequent call to ResLockRelease(), for the other active portal to early return here: /* * If the lock request did not get very far, cleanup is easy. */ if (!locallock || !locallock->lock || !locallock->proclock) { elog(LOG, "Resource queue %d: no lock to release", locktag->locktag_field1); if (locallock) { RemoveLocalLock(locallock); } return false; } and not call ResLockUpdateLimit(.., .., false, ..) The resultant active statement leak would cause subsequently submitted statements to block forever on the resource queue lock, once the #active statements = active statement limit. Additionally, added test coverage for query cancellation and general sanity checks for leaks in other tests. ------------------------------------------------------------------------- Manual repro applicable to 6X with extended protocol and named portals: Consider a JDBC program using named portals with extended protocol (with prepareThreshold=1, autoCommit=false, fetchSize=1). Let the program run 2 select statements inside the same transaction. Have 2 sessions run each statement in an interleaved fashion similar to the tests added in resource_queue_multi_portal.sql with jdb. Use a similar resource queue with a limit of 2 active statements. This will give rise to a deadlock and active statement leak. Note: using the same program described above, a leak with cancellation can also be achieved. Co-authored-by: Ashwin Agrawal <aashwin@vmware.com> Co-authored-by: Yao Wang <wayao@vmware.com> Co-authored-by: Hongxu Ma <interma@outlook.com> Co-authored-by: Zhenghua Lyu <78182909+kainwen@users.noreply.github.com>
Discussion: https://groups.google.com/a/greenplum.org/g/gpdb-dev/c/A2JUpJ0NrEA/m/3S6rmd2zBAAJ (cherry picked from commit b1dc547)
Collaborator
|
There were conflicts caused by:
|
Collaborator
|
@maksm90, @InnerLife0. There are only known failures of behave tests. So this PR is ready for review. |
InnerLife0
approved these changes
Jan 20, 2022
InnerLife0
left a comment
There was a problem hiding this comment.
Looked for resolved conflicts caused by my PR. Seems OK.
Others looks OK too.
maksm90
approved these changes
Jan 20, 2022
RekGRpth
pushed a commit
that referenced
this pull request
Mar 4, 2026
Tests for gpload were disabled for ubuntu, even though they pass without errors. Remove the ubuntu check to add them to the active test suite. Also move its Python dependencies to be installed during the docker building step, not at runtime. Ticket: ADBDEV-9317
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Here are some reminders before you submit the pull request
make installcheck