Skip to content

[pull] master from postgres:master#805

Merged
pull[bot] merged 17 commits intoMu-L:masterfrom
postgres:master
Mar 3, 2021
Merged

[pull] master from postgres:master#805
pull[bot] merged 17 commits intoMu-L:masterfrom
postgres:master

Conversation

@pull
Copy link
Copy Markdown

@pull pull bot commented Mar 2, 2021

See Commits and Changes for more details.


Created by pull[bot]

Can you help keep this open source service alive? 💖 Please sponsor : )

tglsfdc and others added 5 commits March 2, 2021 11:34
POSIX defines the behavior of back-references thus:

    The back-reference expression '\n' shall match the same (possibly
    empty) string of characters as was matched by a subexpression
    enclosed between "\(" and "\)" preceding the '\n'.

As far as I can see, the back-reference is supposed to consider only
the data characters matched by the referenced subexpression.  However,
because our engine copies the NFA constructed from the referenced
subexpression, it effectively enforces any constraints therein, too.
As an example, '(^.)\1' ought to match 'xx', or any other string
starting with two occurrences of the same character; but in our code
it does not, and indeed can't match anything, because the '^' anchor
constraint is included in the backref's copied NFA.  If POSIX intended
that, you'd think they'd mention it.  Perl for one doesn't act that
way, so it's hard to conclude that this isn't a bug.

Fix by modifying the backref's NFA immediately after it's copied from
the reference, replacing all constraint arcs by EMPTY arcs so that the
constraints are treated as automatically satisfied.  This still allows
us to enforce matching rules that depend only on the data characters;
for example, in '(^\d+).*\1' the NFA matching step will still know
that the backref can only match strings of digits.

Perhaps surprisingly, this change does not affect the results of any
of a rather large corpus of real-world regexes.  Nonetheless, I would
not consider back-patching it, since it's a clear compatibility break.

Patch by me, reviewed by Joel Jacobson

Discussion: https://postgr.es/m/661609.1614560029@sss.pgh.pa.us
In some cases, at the time that we're doing an NFA-based precheck
of whether a backref subexpression can match at a particular place
in the string, we already know which substring the referenced
subexpression matched.  If so, we might as well forget about the NFA
and just compare the substring; this is faster and it gives an exact
rather than approximate answer.

In general, this optimization can help while we are prechecking within
the second child expression of a concat node, while the capture was
within the first child expression; then the substring was saved during
cdissect() of the first child and will be available to NFA checks done
while cdissect() recurses into the second child.  It can help quite a
lot if the tree looks like

              concat
             /      \
      capture        concat
                    /      \
     expensive stuff        backref

as we will be able to avoid recursively dissecting the "expensive
stuff" before discovering that the backref isn't satisfied with a
particular midpoint that the lower concat node is testing.  This
doesn't help if the concat tree is left-deep, as the capture node
won't get set soon enough (and it's hard to fix that without changing
the engine's match behavior).  Fortunately, right-deep concat trees
are the common case.

Patch by me, reviewed by Joel Jacobson

Discussion: https://postgr.es/m/661609.1614560029@sss.pgh.pa.us
This extends the changes made in commit cebc1d3, teaching
parseqatom() to generate fewer or cheaper subre nodes in some edge
cases.  The case of interest here is a quantified atom that is "messy"
only because it has greediness opposite to what preceded it (whereas
captures and backrefs are intrinsically messy).  In this case we don't
need an iteration node, since we don't care where the sub-matches of
the quantifier are; and we might also not need a second concatenation
node.  This seems of only marginal real-world use according to my
testing, but I wanted to get it in before wrapping up this series of
regex performance fixes.

Discussion: https://postgr.es/m/1340281.1613018383@sss.pgh.pa.us
On Windows, CMD.EXE allegedly does not run a command that uses forward slashes,
so let's convert the path to use backslashes instead.

Backpatch to 10.

Author: Nitin Jadhav <nitinjadhavpostgres@gmail.com>
Reviewed-by: Juan José Santamaría Flecha <juanjo.santamaria@gmail.com>
Discussion: https://postgr.es/m/CAMm1aWaNDuaPYFYMAqDeJrZmPtNvLcJRS++CcZWY8LT6KcoBZw@mail.gmail.com
This allows clients to find out the setting at connection time without
having to expend a query round trip to do so; which is helpful when
trying to identify read/write servers.  (One must also look at
in_hot_standby, but that's already GUC_REPORT, cf bf8a662.)
Modifying libpq to make use of this will come soon, but I felt it
cleaner to push the server change separately.

Haribabu Kommi, Greg Nancarrow, Vignesh C; reviewed at various times
by Laurenz Albe, Takayuki Tsunakawa, Peter Smith.

Discussion: https://postgr.es/m/CAF3+xM+8-ztOkaV9gHiJ3wfgENTq97QcjXQt+rbFQ6F7oNzt9A@mail.gmail.com
@pull pull bot added the ⤵️ pull label Mar 2, 2021
petergeoghegan and others added 12 commits March 2, 2021 13:02
Adjust some "can't happen" error messages that assumed that the page
deletion target page must be a half-dead page.  This assumption was
wrong in the case of an internal target page.  Simply refer to these
pages as the target page instead.

Internal pages are never marked half-dead.  There is exactly one
half-dead page for each subtree undergoing deletion.  The half-dead page
is also the target subtree's leaf-level page.  This has been the case
since commit efada2b, which totally overhauled nbtree page deletion.
Add documenting assertion.  This makes it easier to follow how we
maintain the top parent link in target subtree's half-dead/leaf level
page.
This option provides REINDEX (TABLESPACE) for reindexdb, applying the
tablespace value given by the caller to all the REINDEX queries
generated.

While on it, this commit adds some tests for REINDEX TABLESPACE, with
and without CONCURRENTLY, when run on toast indexes and tables.  Such
operations are not allowed, and toast relation names are not stable
enough to be part of the main regression test suite (even if using a PL
function with a TRY/CATCH logic, as CONCURRENTLY could not be tested).

Author: Michael Paquier
Reviewed-by: Mark Dilger, Daniel Gustafsson
Discussion: https://postgr.es/m/YDiaDMnzLICqeukl@paquier.xyz
In addition to the existing options of "any" and "read-write", we
now support "read-only", "primary", "standby", and "prefer-standby".
"read-write" retains its previous meaning of "transactions are
read-write by default", and "read-only" inverts that.  The other
three modes test specifically for hot-standby status, which is not
quite the same thing.  (Setting default_transaction_read_only on
a primary server renders it read-only to this logic, but not a
standby.)

Furthermore, if talking to a v14 or later server, no extra network
round trip is needed to detect the session's status; the GUC_REPORT
variables delivered by the server are enough.  When talking to an
older server, a SHOW or SELECT query is issued to detect session
read-only-ness or server hot-standby state, as needed.

Haribabu Kommi, Greg Nancarrow, Vignesh C, Tom Lane; reviewed at
various times by Laurenz Albe, Takayuki Tsunakawa, Peter Smith.

Discussion: https://postgr.es/m/CAF3+xM+8-ztOkaV9gHiJ3wfgENTq97QcjXQt+rbFQ6F7oNzt9A@mail.gmail.com
…ion_slot.

Commit 0aa8a01 extends the output plugin API to allow decoding of
prepared xacts and allowed the user to enable/disable the two-phase option
via pg_logical_slot_get_changes(). This can lead to a problem such that
the first time when it gets changes via pg_logical_slot_get_changes()
without two_phase option enabled it will not get the prepared even though
prepare is after consistent snapshot. Now next time during getting changes,
if the two_phase option is enabled it can skip prepare because by that
time start decoding point has been moved. So the user will only get commit
prepared.

Allow to enable/disable this option at the create slot time and default
will be false. It will break the existing slots which is fine in a major
release.

Author: Ajin Cherian
Reviewed-by: Amit Kapila and Vignesh C
Discussion: https://postgr.es/m/d0f60d60-133d-bf8d-bd70-47784d8fabf3@enterprisedb.com
Move our qsort implementation into a header that can be used to define
specialized functions for better performance and reduced duplication.

Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Discussion: https://postgr.es/m/CA%2BhUKGJ2-eaDqAum5bxhpMNhvuJmRDZxB_Tow0n-gse%2BHG0Yig%40mail.gmail.com
Reduce duplication by using the new template.

Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Discussion: https://postgr.es/m/CA%2BhUKGJ2-eaDqAum5bxhpMNhvuJmRDZxB_Tow0n-gse%2BHG0Yig%40mail.gmail.com
Replace the Perl code previously used to generate specialized sort
functions with sort_template.h.

Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Discussion: https://postgr.es/m/CA%2BhUKGJ2-eaDqAum5bxhpMNhvuJmRDZxB_Tow0n-gse%2BHG0Yig%40mail.gmail.com
Per buildfarm; this fix is from Michael Paquier (vignesh C
proposed nearly the same).

Discussion: https://postgr.es/m/YD8IZ9OKfUf9X1eF@paquier.xyz
It was not clear in the docs that the max_replication_slots is also used
to track replication origins on the subscriber side.

Author: Paul Martinez
Reviewed-by: Amit Kapila
Backpatch-through: 10 where logical replication was introduced
Discussion: https://postgr.es/m/CACqFVBZgwCN_pHnW6dMNCrOS7tiHCw6Retf_=U2Vvj3aUSeATw@mail.gmail.com
This expands the binary validation in pg_upgrade with a version
check per binary to ensure that the target cluster installation
only contains binaries from the target version.

In order to reduce duplication, validate_exec is exported from
port.h and the local copy in pg_upgrade is removed.

Author: Daniel Gustafsson <daniel@yesql.se>
Discussion: https://www.postgresql.org/message-id/flat/9328.1552952117@sss.pgh.pa.us
@pull pull bot merged commit f06b1c5 into Mu-L:master Mar 3, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants