Skip to content

Conversation

simi
Copy link
Owner

@simi simi commented Jun 14, 2020

COPY (SELECT * FROM test) TO '/tmp/ids';

yr=# SELECT * from pg_stat_progress_copy;
   pid   | datid | datname | relid | direction | file | program | lines_processed | file_bytes_processed 
---------+-------+---------+-------+-----------+------+---------+-----------------+----------------------
 3347126 | 16384 | yr      |     0 | TO        | t    | f       |         3529943 |             24906226
(1 row)

COPY test FROM '/tmp/ids'

yr=# SELECT * from pg_stat_progress_copy;
   pid   | datid | datname | relid | direction | file | program | lines_processed | file_bytes_processed 
---------+-------+---------+-------+-----------+------+---------+-----------------+----------------------
 3347126 | 16384 | yr      | 16385 | FROM      | t    | f       |       121591999 |            957218816
(1 row)

@simi simi force-pushed the copy-progress branch 2 times, most recently from bb528a2 to f9cb4fe Compare June 20, 2020 17:27
@simi simi force-pushed the copy-progress branch 3 times, most recently from 1dd84b5 to 5b1dddd Compare June 21, 2020 11:23
hlinnaka and others added 20 commits September 21, 2020 14:50
Since we're bypassing the buffer manager, we need to call
PageSetChecksumInplace() directly. As reported by Justin Pryzby.

In the passing, add RelationOpenSmgr() calls before all smgrwrite() and
smgrextend() calls. Tom added one before the first smgrextend() call in
commit c2bb287, which seems to be enough, but let's play it safe and
do it before each one. That's how it's done in the similar code in
nbtsort.c, too.

Discussion: https://www.postgresql.org/message-id/20200920224446.GF30557@telsasoft.com
The standard order in PostgreSQL and other code is use strict first,
but some code was uselessly inconsistent about this.
99% of this is docs, but also a couple of comments.  No code changes.

Justin Pryzby

Discussion: https://postgr.es/m/20200919175804.GE30557@telsasoft.com
This completes the project of making all our derived files be
pgindent-clean (or else explicitly excluded from indentation),
so that no surprises result when running pgindent in a built-out
development tree.

Discussion: https://postgr.es/m/79ed5348-be7a-b647-dd40-742207186a22@2ndquadrant.com
The existing message about "a column definition list is only allowed for
functions returning "record"" could be given in some cases where it was
fairly confusing; in particular, a function with multiple OUT parameters
*does* return record according to pg_proc.  Break it down into a couple
more cases to deliver a more on-point complaint.  Per complaint from
Bruce Momjian.

Discussion: https://postgr.es/m/798909.1600562993@sss.pgh.pa.us
pgindent messes up entries in this file if their names match
typedef names.  While there's reason to avoid choosing conflicting
names, we have some historical exceptions, and there's no guarantee
that more duplicates won't appear in future.  Since this is a derived
file anyway, there's little harm in just excluding it.

I said yesterday that all our derived files are pgindent-clean, or else
explicitly excluded from indentation, but I'd forgotten about this one.
Now that project is really done, as confirmed by a test run.

Discussion: https://postgr.es/m/79ed5348-be7a-b647-dd40-742207186a22@2ndquadrant.com
Further experience says that the appending behavior offered by
pg_get_line_append is useful to only a very small minority of callers.
For most, the requirement to reset the buffer after each line is just
an error-prone nuisance.  Hence, invent another alternative call
pg_get_line_buf, which takes care of that detail.

Noted while reviewing a patch from Daniel Gustafsson.

Discussion: https://postgr.es/m/48A4FA71-524E-41B9-953A-FD04EF36E2E7@yesql.se
Use a StringInfo instead of a fixed-size buffer in parseServiceInfo().
While we've not heard complaints about the existing 255-byte limit,
it certainly seems possible that complex cases could run afoul of it.

Daniel Gustafsson

Discussion: https://postgr.es/m/48A4FA71-524E-41B9-953A-FD04EF36E2E7@yesql.se
pg_restore previously coped with overlength TOC-file lines using some
complicated logic to ignore additional bufferloads.  While this isn't
wrong, since we don't expect that the interesting part of a line would
run to more than a dozen or so bytes, it's more complex than it needs
to be.  Use a StringInfo instead of a fixed-size buffer so that we can
process long lines as single entities and thus not need the extra
logic.

Daniel Gustafsson

Discussion: https://postgr.es/m/48A4FA71-524E-41B9-953A-FD04EF36E2E7@yesql.se
Due to flaws in commit 3347c98, using WaitLatch() without
WL_LATCH_SET could cause an assertion failure or crash.  Repair.

While here, also add a check that the latch we're switching to belongs
to this backend, when changing from one latch to another.

Discussion: https://postgr.es/m/CA%2BhUKGK1607VmtrDUHQXrsooU%3Dap4g4R2yaoByWOOA3m8xevUQ%40mail.gmail.com
tsearch_readline() saves the string pointer it returns to the caller
for possible use in the associated error context callback.  However,
the caller will usually pfree that string sometime before it next
calls tsearch_readline(), so that there is a window where an ereport
will try to print an already-freed string.

The built-in users of tsearch_readline() happen to all do that pfree
at the bottoms of their loops, so that the window is effectively
empty for them.  However, this is not documented as a requirement,
and contrib/dict_xsyn doesn't do it like that, so it seems likely
that third-party dictionaries might have live bugs here.

The practical consequences of this seem pretty limited in any case,
since production builds wouldn't clobber the freed string immediately,
besides which you'd not expect syntax errors in dictionary files
being used in production.  Still, it's clearly a bug waiting to bite
somebody.

Fix by pstrdup'ing the string to be saved for the error callback,
and then pfree'ing it next time through.  It's been like this for
a long time, so back-patch to all supported branches.

Discussion: https://postgr.es/m/48A4FA71-524E-41B9-953A-FD04EF36E2E7@yesql.se
We failed to pass down the query string to check_new_partition_bound,
so that its attempts to provide error cursor positions were for naught;
one must have the query string to get parser_errposition to do anything.
Adjust its API to require a ParseState to be passed down.

Also, improve the logic inside check_new_partition_bound so that the
cursor points at the partition bound for the specific column causing
the issue, when one can be identified.

That part is also for naught if we can't determine the query position of
the column with the problem.  Improve transformPartitionBoundValue so
that it makes sure that const-simplified partition expressions will be
properly labeled with positions.  In passing, skip calling evaluate_expr
if the value is already a Const, which is surely the most common case.

Alexandra Wang, Ashwin Agrawal, Amit Langote; reviewed by Ashutosh Bapat

Discussion: https://postgr.es/m/CACiyaSopZoqssfMzgHk6fAkp01cL6vnqBdmTw2C5_KJaFR_aMg@mail.gmail.com
Discussion: https://postgr.es/m/CAJV4CdrZ5mKuaEsRSbLf2URQ3h6iMtKD=hik8MaF5WwdmC9uZw@mail.gmail.com
Harmonize behavior by moving reponsibility for fsyncing directories down
into slru.c.  In 10 and later, only the multixact directories were
missed (see commit 1b02be2), and in older branches all SLRUs were
missed.

Back-patch to all supported releases.

Reviewed-by: Andres Freund <andres@anarazel.de>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Discussion: https://postgr.es/m/CA%2BhUKGLtsTUOScnNoSMZ-2ZLv%2BwGh01J6kAo_DM8mTRq1sKdSQ%40mail.gmail.com
Commit fbeb9da, which added the tsearch_readline APIs, left
t_readline() in place as a compatibility measure.  But that function
has been unused and deprecated for twelve years now, so that seems
like enough time to remove it.  Doing so, and merging t_readline's
code into tsearch_readline, aids in making several useful
improvements:

* The hard-wired 4K limit on line length in tsearch data files is
removed, by using a StringInfo buffer instead of a fixed-size buffer.

* We can buy back the per-line palloc/pfree added by 3ea7e95
in the common case where encoding conversion is not required.

* We no longer need a separate pg_verify_mbstr call, as that
functionality was folded into encoding conversion some time ago.

(We could have done some of this stuff while keeping t_readline as a
separate API, but there seems little point, since there's no reason
for anyone to still be using t_readline directly.)

Discussion: https://postgr.es/m/48A4FA71-524E-41B9-953A-FD04EF36E2E7@yesql.se
This makes it possible for code outside snapmgr.c to examine the
contents of this data structure. This commit does not add any code
which actually does so; a subsequent commit will make that change.

Patch by me, reviewed by Thomas Munro, Dilip Kumar, Hamid Akhtar.

Discussion: http://postgr.es/m/CA+TgmoY=aqf0zjTD+3dUWYkgMiNDegDLFjo+6ze=Wtpik+3XqA@mail.gmail.com
You can use this to view the contents of the time to XID mapping
which the server maintains when old_snapshot_threshold != -1.
Being able to view that information may be interesting for users,
and it's definitely useful for figuring out whether the mapping
is being maintained correctly. It isn't, so that will need to be
fixed in a subsequent commit.

Patch by me, reviewed by Thomas Munro, Dilip Kumar, Hamid Akhtar.

Discussion: http://postgr.es/m/CA+TgmoY=aqf0zjTD+3dUWYkgMiNDegDLFjo+6ze=Wtpik+3XqA@mail.gmail.com
- add pg_stat_progress_copy system view
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants