-
Notifications
You must be signed in to change notification settings - Fork 2
[#26240] YSQL: Import 'Transfer statistics during pg_upgrade' #5
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
timothy-e
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How did you identify these 13 commits? How do we know there's not a 14th that fixes a sneaky bug? That's probably worth specifying in the description.
Otherwise, the commits look good to me. I didn't diff them to the original PG commits though
|
One nit: you could use markdown table syntax to format the |
jasonyb
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For each conflict, please not only reference the filename but also another marker within that file, such as function name and/or code line. Line number is not reliable since there are two sides (which side are you talking about for line number?) and these can change in case someone else does an import before you.
- pg_dump.sgml
- No incoming changes accepted
- This is not true. You take some patches and ignore some patches. If you are meaning that, for the merge conflict you got, you don't take incoming changes, then say that more specifically (and it would be better to take upstream's changes instead in most cases). Or perhaps more easily, resolve the conflict so you don't have to describe this.
- No incoming changes accepted
- pg_backup_archiver.c
- 983: Removed txnCount related code that does not exist in pg15
- Are you talking about the upstream change that adds REQ_STATS? It's not super clear. "txnCount" is on a context line on the upstream side's patch.
- 983: Removed txnCount related code that does not exist in pg15
- pg_dump_sort.c
- (none)
- lack of "+StaticAssertDecl(lengthof(dbObjectTypePriority) == NUM_DUMPABLE_OBJECT_TYPES," change is not explained.
- (none)
- src/bin/pg_dump/t/002_pg_dump.pl
- (none)
- lack of explanation for odd differences in this file
- (none)
- other minor changes that must have been conflicts are not listed. not a blocker
Stopping review at this point.
|
40e27d0 resolution is very unclear.
What does that mean? I see a bunch of extra content brought in that does not exist in 40e27d0. Take for instance Also, the commit author is screwed up. It says hari when it should be |
I looked at the commit history of |
e92714d to
9307f45
Compare
|
please use |
jasonyb
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Again, reviewed only the first commit eb53a12 and found issues:
- pg_dump.sql: there are missing patches that are not explained
- pg_backup_archiver.c: REQ_STATS explanation is now completely gone.
- pg_dump_sort.c: lack of
+StaticAssertDecl(lengthof(dbObjectTypePriority) == NUM_DUMPABLE_OBJECT_TYPES,is still not explained. In fact, it shouldn't be missing. It should be fixed, and YB change should be transfered to pg_dump.h. - 002_pg_dump.pl:
+ statistics_only => {change is uncalled for.- upstream's
+ no_schema => 1,patch on@@ -1598,6 +1631,7 @@ my %tests = (appears missing - binary_upgrade => 1,changes are uncalled for. "Test cases ... were also updated so that they match imcoming changes and for the tests to pass" is vague reasoning.
jasonyb
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looked at the (new) first commit:
Removal of only_dump_measurement => 1, in section Check ordering of a matview that depends on a primary key (aka @@ -2354,7 +2352,6 @@ my %tests = () was never explained. It looks like it was trying to follow the original patch's change on @@ -3210,7 +3207,6 @@ my %tests = (, but that is on completely different 'COPY test_compression_method' => {.
Resolution: Keeping the blob terminology as per pg15. Skipping the runs that are not available in pg15.: you're not "keeping" blob terminology since these are concerning deleted lines. More importantly, section_post_data => 1, lines should not be changed: leave them spaced like section_post_data => 1, to reduce diff. Minor improvement is to reference the PG commit that redid the spacing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 5920cb4
41a2844 Clean up some pg_dump tests
-
For section
COPY test_compression_method, you need an explanation similar toCause: Missing 00d9dcf5bebbb355152a60f0e2120cdf7f9e7ddd: 'LO create (with no data)' test is not present in pg15. -
You should add an explanation for section
ALTER INDEX ... ATTACH PARTITION. -
This was uncalled for:
- }, + unlike => {},
1fd1bd8 Transfer statistics during pg_upgrade.
-
Is it okay to add
case 18aftercase 11? Should it be mapped down tocase 12? -
Section
If it has a statistics component that we want, then process thathas double newline for no reason. Make it single. -
case 5:: is it safe to skip cases 2-4? Or does this need to be remapped to case 2?
40e27d0 Use attnum to identify index columns in pg_restore_attribute_stats().
Previous cherry-pick ae8c8b3080e7cab45a991ea772602ba51c794ec2 imported the latest version of this file already.
- You are referencing a yugabyte/yugabyte-db commit. Please refer to eea554f instead.
Cause: Missing 374c7a2: 'CREATE TABLE regress_pg_dump_table_part' test case was added as part of a new feature.
- That's an invalid excuse. The code change is adjacent to
CREATE TABLE regress_pg_dump_table_part. Do the minor code change.
Previous version of the test file was invalid. The contents of the entire file have been updated to match the sate as of 40e27d0
-
Please elaborate on what was "invalid".
-
diff <(git show cd9ae07210c7b255c9e2e80550e8b5bd3350f6a7:src/test/regress/expected/stats_import.out) <(git show 40e27d04b4f643cfb78af8db42a1f2e700ec9876:src/test/regress/expected/stats_import.out)shows differences. This does not match your claim "the entire file have been updated to match". You show one diff afterwards, but please use English to say that you made further modifications after the entire update. There is also another diff regarding the outputfthat is not explained. -
s/sate/state/
650ab8a Stats: use schemaname/relname instead of regclass.
Differences found in documentation file
- so what action did you take?
catversion is not updated in pg15
- Please do not refer to "pg15". Use "yb-pg15". Ditto elsewhere
9c02e3a pg_dump: Retrieve attribute statistics in batches.
Cause: Missing 4694aed:
relallfrozencolumn was added.
Resolution: Simple merge with skippingrelallfrozencolumn.
- This conflict is concerning relallfrozen code adjacently added. You don't skip relallfrozen code: those were context lines.
5920cb4 Import 'Transfer statistics during pg_upgrade'
- This is a merge commit that should not exist.
* 5920cb4164f Import 'Transfer statistics during pg_upgrade' - Hari Krishna Sunder (BackupStats)
|\
| * ff9073a5202 pg_dump: Fix query for gathering attribute stats on older versions. - Nathan Bossart
| * c1fe52f64bb pg_dump: Retrieve attribute statistics in batches. - Nathan Bossart
| * ae866db06df pg_dump: Reduce memory usage of dumps with statistics. - Nathan Bossart
| * 509c84a6cfa Matview statistics depend on matview data. - Jeff Davis
| * a033888cd0c Add pg_dump --with-{schema|data|statistics} options. - Jeff Davis
| * da5faf1fdf7 Stats: use schemaname/relname instead of regclass. - Jeff Davis
| * 167ddc0bc98 Don't convert to and from floats in pg_dump. - Jeff Davis
| * 8ef62be4cb8 CREATE INDEX: don't update table stats if autovacuum=off. - Jeff Davis
| * 28fc7225dee Organize and deduplicate statistics import tests. - Jeff Davis
| * 2f834403150 Adjust pg_dump tag for relation stats. - Jeff Davis
| * cd9ae07210c Use attnum to identify index columns in pg_restore_attribute_stats(). - Tom Lane
| * d3116245b04 pg_dump: prepare attribute stats query. - Jeff Davis
| * b3aa4014557 Avoid unnecessary relation stats query in pg_dump. - Jeff Davis
| * 47f47fce245 Ignore hash's relallvisible when checking pg_upgrade from pre-v10. - Tom Lane
| * 2c18d3853fa Trial fix for old cross-version upgrades. - Jeff Davis
| * a236d972da3 Transfer statistics during pg_upgrade. - Jeff Davis
| * 676c7b8ebe5 Clean up some pg_dump tests - Peter Eisentraut
|/
* 12398eddbd5 Merge tag 'REL_15_12' into yb-pg15 - Jason Kim (yb-pg15)
Last commit should be ff9073a.
- Note: In your merge resolution notes, all your
<and>are not clear on which side represents the original patch and which side represents your patch.
Not sure which one you mean by case 18. For case 5 yes its safe to come after 1, and is preferred since in future we might selective backport 2-4. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't forget to push directly, not use GitHub interface to merge.
Clean up some pg_dump tests
-
I previously said "You should add an explanation for section ALTER INDEX ... ATTACH PARTITION.", but this still doesn't appear to be done. If you had done it before, then I would have realized my misread of
- }, + unlike => {},Now that I look at it, it makes sense: all the items inside
unlikewere removed. Your new version removes theunlikeentirely. Both versions were merged in a nontrivial way: the most trivial way would be to keepunlike => {and}as two separate lines. You can change the code to that, in which case an explanation is not needed. Or if you keep the current code, I suggest writing that you removed the entireunlikebecause it's completely empty (unlike the original patch which still hasexclude_measurement).
yb conflict resolutions:
- src/bin/pg_dump/t/002_pg_dump.pl
>@@ -1351,7 +1349,6 @@ my %tests = (
>- section_pre_data => 1,
Cause: Missing 00d9dcf: 'LO create (with no data)' test is not present in yb-pg15.
Resolution: Change is not needed.
>@@ -3210,7 +3207,6 @@ my %tests = (
>- only_dump_measurement => 1,
Cause: Missing 1a05c1d: `COPY test_compression_method` test is not present in yb-pg15.
Resolution: Change is not needed.
>@@ -3834,35 +3829,12 @@ my %tests = (
<@@ -3208,33 +3205,12 @@ my %tests = (
>- no_large_objects => 1,
<- no_blobs => 1,
>- test_schema_plus_large_objects => 1,
>- only_dump_measurement => 1,
>- exclude_measurement_data => 1,
<- test_schema_plus_blobs => 1,
Cause: Missing 35ce24c: `no_blobs` renamed to `no_large_objects`, and `test_schema_plus_blobs` renamed to `test_schema_plus_large_objects`.
Missing a563c24: `exclude_measurement_data` and `only_dump_measurement` runs were added.
Resolution: Keeping the blob terminology as per pg15. Skipping the runs that are not available in yb-pg15.
>@@ -3913,7 +3885,6 @@ my %tests = (
>- exclude_measurement_data => 1,
Cause: Missing a563c24: `exclude_measurement_data` run was added.
Resolution: Skipping the runs that are not available in yb-pg15.
>@@ -3927,35 +3898,12 @@ my %tests = (
<@@ -3282,34 +3258,10 @@ my %tests = (
>- no_large_objects => 1,
<- no_blobs => 1,
>- exclude_measurement_data => 1,
>- test_schema_plus_large_objects => 1,
<- test_schema_plus_blobs => 1,
Cause: Missing 35ce24c: `no_blobs` renamed to `no_large_objects`, and `test_schema_plus_blobs` renamed to `test_schema_plus_large_objects`.
Missing a563c24: `exclude_measurement_data` and `only_dump_measurement` runs were added.
Resolution: Keeping the blob terminology as per pg15. Skipping the runs that are not available in yb-pg15.
1) Remove useless entries from "unlike" lists. Runs that are not
listed in "like" don't need to be excluded in "unlike".
2) Ensure there is always a "like" list, even if it is empty. This
makes the test more self-documenting.
3) Use predefined lists such as %full_runs where appropriate, instead
of listing all runs separately.
Also add code that checks 1 and 2 automatically and dies with an error
for violations.
Reviewed-by: Alvaro Herrera <alvherre@alvh.no-ip.org>
Discussion: https://www.postgresql.org/message-id/flat/1f8cb371-e84e-434e-0367-6b716fb16fa1@eisentraut.org
(cherry picked from commit 41a2844)
yb conflict resolutions:
- doc/src/sgml/ref/pg_dump.sgml
Only changes that were not conflicting were picked.
- src/bin/pg_dump/pg_backup_archiver.c
>@@ -1036,15 +1041,21 @@ restore_toc_entry(ArchiveHandle *AH, TocEntry *te, bool is_parallel)
<@@ -966,10 +972,17 @@ restore_toc_entry(ArchiveHandle *AH, TocEntry *te, bool is_parallel)
>- if ((reqs & (REQ_SCHEMA | REQ_DATA)) != 0 && ropt->txn_size > 0)
>+ if ((reqs & (REQ_SCHEMA | REQ_DATA | REQ_STATS)) != 0 && ropt->txn_size > 0)
Cause: Missing 959b38d: if conditions related to `ropt->txn_size` was added.
Resolution: Skip the incoming change since it is not relevant to pg15.
- src/bin/pg_dump/pg_dump.h
>@@ -83,10 +83,13 @@ typedef enum
<@@ -82,9 +82,12 @@ typedef enum
>+#define NUM_DUMPABLE_OBJECT_TYPES (DO_SUBSCRIPTION_REL + 1)
<+#define NUM_DUMPABLE_OBJECT_TYPES (DO_SUBSCRIPTION + 1)
Cause: Missing 9a17be1: DO_SUBSCRIPTION_REL was added as the last element in the DumpableObjectType macro.
Resolution: Set NUM_DUMPABLE_OBJECT_TYPES to DO_SUBSCRIPTION which is the last element in DumpableObjectType.
- src/bin/pg_dump/pg_dump_sort.c
>@@ -148,11 +149,12 @@ static const int dbObjectTypePriority[] =
>+ [DO_REL_STATS] = PRIO_STATISTICS_DATA_DATA,
Cause: Missing afd8ef3: CdbObjectTypePriority array switched to c99-designated initializer syntax.
Resolution: Use pg15 style syntax to assign PRIO_STATISTICS_DATA_DATA. StaticAssertDecl conflict is the same as the one mentioned in pg_dump.h.
<@@ -146,10 +147,11 @@ static const int dbObjectTypePriority[] =
<+ PRIO_STATISTICS_DATA_DATA, /* DO_STATISTICS_DATA_DATA */
>-StaticAssertDecl(lengthof(dbObjectTypePriority) == (DO_SUBSCRIPTION_REL + 1),
<-StaticAssertDecl(lengthof(dbObjectTypePriority) == (DO_SUBSCRIPTION + 1),
Cause: Missing <commit id>: <Explanation of differences>
Resolution: <Explanation of resolution>
- src/bin/pg_dump/t/002_pg_dump.pl
>@@ -1432,6 +1464,7 @@ my %tests = (
>+ no_schema => 1,
Cause: Missing 00d9dcf: 'LO create (with no data)' test was added.
Resolution: Skip test since its not in pg15.
>@@ -3310,6 +3354,7 @@ my %tests = (
>+ no_schema => 1,
Cause: Missing 1a05c1d: 'COPY test_compression_method' test was added.
Resolution: Skip test since its not in pg15.
>@@ -3480,6 +3525,7 @@ my %tests = (
>+ no_schema => 1,
Cause: Missing a563c24: 'COPY measurement' test was added.
Resolution: Skip test since its not in pg15.
- src/test/recovery/t/027_stream_regress.pl
>@@ -107,7 +107,7 @@ command_ok(
<@@ -93,14 +93,14 @@ $node_primary->wait_for_catchup($node_standby_1, 'replay',
>- '--no-sync',
>+ '--no-sync', '--no-statistics',
<- '--no-sync', '-p', $node_primary->port,
<+ '--no-sync', '--no-statistics', '-p', $node_primary->port,
Cause: Missing ce1b0f9: TAP test improvements were made.
Resolution: Combine merge the new --no-statistics param with existing pg15 params.
>@@ -116,7 +116,7 @@ command_ok(
>- '--no-sync',
>+ '--no-sync', '--no-statistics',
<- '--no-sync', '-p', $node_standby_1->port
<+ '--no-sync', '--no-statistics', '-p', $node_standby_1->port
Cause: Missing ce1b0f9: TAP test improvements were made.
Resolution: Combine merge the new --no-statistics param with existing pg15 params.
Add support to pg_dump for dumping stats, and use that during
pg_upgrade so that statistics are transferred during upgrade. In most
cases this removes the need for a costly re-analyze after upgrade.
Some statistics are not transferred, such as extended statistics or
statistics with a custom stakind.
Now pg_dump accepts the options --schema-only, --no-schema,
--data-only, --no-data, --statistics-only, and --no-statistics; which
allow all combinations of schema, data, and/or stats. The options are
named this way to preserve compatibility with the previous
--schema-only and --data-only options.
Statistics are in SECTION_DATA, unless the object itself is in
SECTION_POST_DATA.
The stats are represented as calls to pg_restore_relation_stats() and
pg_restore_attribute_stats().
Author: Corey Huinker, Jeff Davis
Reviewed-by: Jian He
Discussion: https://postgr.es/m/CADkLM=fzX7QX6r78fShWDjNN3Vcr4PVAnvXxQ4DiGy6V=0bCUA@mail.gmail.com
Discussion: https://postgr.es/m/CADkLM%3DcB0rF3p_FuWRTMSV0983ihTRpsH%2BOCpNyiqE7Wk0vUWA%40mail.gmail.com
(cherry picked from commit 1fd1bd8)
Per buildfarm and reports, it seems that 9.X to 18 upgrades were failing after commit 1fd1bd8 due to an incorrect regex. Loosen the regex to accommodate older versions. Reported-by: vignesh C <vignesh21@gmail.com> Reported-by: Andrew Dunstan <andrew@dunslane.net> Discussion: https://postgr.es/m/CALDaNm3GUs+U8Nt4S=V5zmb+K8-RfAc03vRENS0teeoq0Lc6Tw@mail.gmail.com Discussion: https://postgr.es/m/ea4cbbc1-c5a5-43d1-9618-8ff3f2155bfe@dunslane.net (cherry picked from commit ab84d0f)
Our cross-version upgrade tests have been failing for some pre-v10 source versions since commit 1fd1bd8. This turns out to be because relallvisible may change for tables that have hash indexes, because the upgrade process forcibly reindexes such indexes to deal with the changes made in v10. Fortunately, the set of tables that have such indexes is small and won't change anymore in those branches. So just hack up AdjustUpgrade.pm to not compare the relallvisible values of those specific tables. While here, also tighten the regex that suppresses comparison of version fields. Discussion: https://postgr.es/m/812817.1740277228@sss.pgh.pa.us (cherry picked from commit fc0d0ce)
The few fields we need can be easily collected in getTables() and getIndexes() and stored in RelStatsInfo. Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Reported-by: Andres Freund <andres@anarazel.de> Co-authored-by: Corey Huinker <corey.huinker@gmail.com> Co-authored-by: Jeff Davis <pgsql@j-davis.com> Discussion: https://postgr.es/m/CADkLM=f0a43aTd88xW4xCFayEF25g-7hTrHX_WhV40HyocsUGg@mail.gmail.com (cherry picked from commit 8f42718)
Follow precedent in pg_dump for preparing queries to improve performance. Also, simplify the query by removing unnecessary joins. Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Reported-by: Andres Freund <andres@anarazel.de> Co-authored-by: Corey Huinker <corey.huinker@gmail.com> Co-authored-by: Jeff Davis <pgsql@j-davis.com> Discussion: https://postgr.es/m/CADkLM=dRMC6t8gp9GVf6y6E_r5EChQjMAAh_vPyih_zMiq0zvA@mail.gmail.com (cherry picked from commit 6ee3b91)
yb conflict resolutions: - doc/src/sgml/func.sgml Only changes that were not conflicting were picked. - src/backend/statistics/attribute_stats.c Previous cherry-pick eea554f imported the latest version of this file already. - src/test/regress/expected/stats_import.out File was cherry-picked as part of eea554f which is a higher version. The contents of the file have been updated to match the state as of 40e27d0. The following changes had to be made: diff <(git show HEAD:src/test/regress/expected/stats_import.out) <(git show 40e27d0:src/test/regress/expected/stats_import.out) < ERROR: invalid input syntax for type integer: "four" --- > WARNING: invalid input syntax for type integer: "four" > pg_restore_attribute_stats > ---------------------------- > f > (1 row) > 458c463 < stats_import | test | id | f | 0.8 | 9 | -0.9 | | | | | | | | | | --- > stats_import | test | id | f | 0.9 | 10 | -0.4 | | | | | | | | | | Cause: Missing ccff2d2: Some ERRORs were converted to softer WARNINGS. Resolution: yb-pg15 does not contain such soft warning errors. - src/test/regress/sql/stats_import.sql File was cherry-picked as part of eea554f which is a higher version. The contents of the entire file have been updated to match the state as of 40e27d0. Previously we used attname for both table and index columns, but that is problematic for indexes because their attnames are assigned by internal rules that don't guarantee to preserve the names across dump and reload. (This is what's causing the remaining buildfarm failures in cross-version-upgrade tests.) Fortunately we can use attnum instead, since there's no such thing as adding or dropping columns in an existing index. We met this same problem previously with ALTER INDEX ... SET STATISTICS, and solved it the same way, cf commit 5b6d13e. In pg_restore_attribute_stats() itself, we accept either attnum or attname, but the policy used by pg_dump is to always use attname for tables and attnum for indexes. Author: Tom Lane <tgl@sss.pgh.pa.us> Author: Corey Huinker <corey.huinker@gmail.com> Discussion: https://postgr.es/m/1457469.1740419458@sss.pgh.pa.us (cherry picked from commit 40e27d0)
Do not use fmtId(), just use dobj->name directly, like for table data. (cherry picked from commit 424eded)
yb conflict resolutions:
- src/test/regress/expected/stats_import.out
[Unknown context]
>- 0 | -1 | 0 | 0
>+ 1 | 0 | 0 | 0
<- 0 | -1 | 0
<+ 1 | 0 | 0
>-SELECT relpages, reltuples, relallvisible, relallfrozen
<-SELECT relpages, reltuples, relallvisible
>- relpages | reltuples | relallvisible | relallfrozen
>-----------+-----------+---------------+--------------
>- 0 | -1 | 0 | 0
<- relpages | reltuples | relallvisible
<-----------+-----------+---------------
<- 0 | -1 | 0
>@@ -182,13 +153,12 @@ FROM pg_class
<@@ -181,13 +152,12 @@ FROM pg_class
>- 17 | 400 | 4 | 2
>+ -17 | 400 | 4 | 2
<- 17 | 400 | 4
<+ -17 | 400 | 4
>@@ -1006,20 +981,6 @@ SELECT 3, 'tre', (3, 3.3, 'TRE', '2003-03-03', NULL)::stats_import.complex_type,
<@@ -981,19 +956,6 @@ SELECT 3, 'tre', (3, 3.3, 'TRE', '2003-03-03', NULL)::stats_import.complex_type,
>- 'relallvisible', '0'::integer,
>- 'relallfrozen', '0'::integer
<- 'relallvisible', '0'::integer
>@@ -1207,160 +1179,17 @@ SELECT pg_catalog.pg_clear_attribute_stats(
<@@ -1181,154 +1154,17 @@ SELECT pg_catalog.pg_clear_attribute_stats(
>- 'relallvisible', 4::integer,
>- 'relallfrozen', 3::integer);
<- 'relallvisible', 4::integer);
>- 'relallvisible', 4::integer,
>- 'relallfrozen', 3::integer);
<- 'relallvisible', 4::integer);
>- 'relallvisible', 4::integer,
>- 'relallfrozen', 3::integer);
<- 'relallvisible', 4::integer);
>- 'relallvisible', 4::integer,
>- 'relallfrozen', 3::integer);
<- 'relallvisible', 4::integer);
>- 'relallfrozen', 3::integer,
>- 'relallvisible', 4::integer,
>- 'relallfrozen', 3::integer);
<- 'relallvisible', 4::integer);
Cause: Missing 99f8f3f: `relallfrozen` was introduced.
Resolution: Feature does not exist in pg15
>@@ -480,7 +536,7 @@ AND inherited = false
<@@ -455,7 +511,7 @@ AND inherited = false
>- stats_import | test | id | f | 0.9 | 10 | -0.4 | | | | | | | | | |
>+ stats_import | test | id | f | 0.23 | 5 | 0.6 | | | | | | | | | |
<- stats_import | test | id | f | 0.8 | 9 | -0.9 | | | | | | | | | |
<+ stats_import | test | id | f | 0.22 | 5 | 0.6 | | | | | | | | | |
>@@ -508,18 +560,15 @@ AND inherited = false
<@@ -483,18 +535,15 @@ AND inherited = false
>+ stats_import | test | id | f | 0.23 | 5 | 0.6 | {2,1,3} | {0.3,0.25,0.05} | | | | | | | |
<+ stats_import | test | id | f | 0.22 | 5 | 0.6 | {2,1,3} | {0.3,0.25,0.05} | | | | | | | |
Cause: Missing ccff2d2: Some ERRORs were converted to softer WARNINGS.
Resolution: pg15 values are not updated due to the ERROR.
- src/test/regress/sql/stats_import.sql
[Unknown context]
>-SELECT relpages, reltuples, relallvisible, relallfrozen
<-SELECT relpages, reltuples, relallvisible
>@@ -686,16 +675,6 @@ SELECT 4, 'four', NULL, int4range(0,100), NULL;
<@@ -674,15 +663,6 @@ SELECT 4, 'four', NULL, int4range(0,100), NULL;
>- 'relallvisible', '0'::integer,
>- 'relallfrozen', '0'::integer
<- 'relallvisible', '0'::integer
>@@ -848,160 +827,24 @@ FROM pg_statistic s
<@@ -835,154 +815,24 @@ FROM pg_statistic s
>- 'relallvisible', 4::integer,
>- 'relallfrozen', 3::integer);
<- 'relallvisible', 4::integer);
>- 'relallvisible', 4::integer,
>- 'relallfrozen', 3::integer);
<- 'relallvisible', 4::integer);
>- 'relallvisible', 4::integer,
>- 'relallfrozen', 3::integer);
<- 'relallvisible', 4::integer);
>- 'relallvisible', 4::integer,
>- 'relallfrozen', 3::integer);
<- 'relallvisible', 4::integer);
>- 'relallfrozen', 3::integer,
>- 'relallvisible', 4::integer,
>- 'relallfrozen', 3::integer);
<- 'relallvisible', 4::integer);
Cause: Missing 99f8f3f: `relallfrozen` was introduced.
Resolution: Feature does not exist in pg15
Author: Corey Huinker <corey.huinker@gmail.com>
Reported-by: Melanie Plageman <melanieplageman@gmail.com>
Discussion: https://postgr.es/m/CAAKRu_bWEqUfxhODfJ-XbZC75vq=P6DYOKK6biyey=yM1Ah3Hg@mail.gmail.com
Discussion: https://postgr.es/m/CADkLM=f1n2_Vomq0gKab7xdxDHmJGgn=DE48P8fzQOp3Mrs1Qg@mail.gmail.com
(cherry picked from commit 1d33de9)
yb conflict resolutions:
- src/test/regress/expected/stats_import.out
>@@ -12,7 +12,36 @@ CREATE TABLE stats_import.test(
<@@ -12,7 +12,35 @@ CREATE TABLE stats_import.test(
>+ 'relallvisible', 24::integer,
>+ 'relallfrozen', 27::integer);
<+ 'relallvisible', 24::integer);
>+SELECT relname, relpages, reltuples, relallvisible, relallfrozen
<+SELECT relname, relpages, reltuples, relallvisible
>+ relname | relpages | reltuples | relallvisible | relallfrozen
>+---------+----------+-----------+---------------+--------------
>+ test | 18 | 21 | 24 | 27
<+ relname | relpages | reltuples | relallvisible
<+---------+----------+-----------+---------------
<+ test | 18 | 21 | 24]
Cause: Missing 99f8f3f: `relallfrozen` was introduced.
Resolution: Feature does not exist in pg15
- src/test/regress/sql/stats_import.sql
>@@ -15,8 +15,25 @@ CREATE TABLE stats_import.test(
<@@ -15,8 +15,24 @@ CREATE TABLE stats_import.test(
>+ 'relallvisible', 24::integer,
>+ 'relallfrozen', 27::integer);
<+ 'relallvisible', 24::integer);
>+SELECT relname, relpages, reltuples, relallvisible, relallfrozen
<+SELECT relname, relpages, reltuples, relallvisible
Cause: Missing 99f8f3f: `relallfrozen` was introduced.
Resolution: Feature does not exist in pg15
We previously fixed this for binary upgrade in 71b6617, but a
similar problem remained when dumping statistics without data.
Fix by not opportunistically updating table stats during CREATE INDEX
when autovacuum is disabled. For stats to be stable at all, the server
needs to be aware that it should not take every opportunity to update
stats. Per discussion, autovacuum=off is a signal that the user
expects stats to be stable; though if necessary, we could create
a more specific mode in the future.
Reported-by: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com>
Discussion: https://postgr.es/m/CAExHW5vf9D+8-a5_BEX3y=2y_xY9hiCxV1=C+FnxDvfprWvkng@mail.gmail.com
Discussion: https://postgr.es/m/ca81cbf6e6ea2af838df972801ad4da52640a503.camel%40j-davis.com
(cherry picked from commit d611f8b)
Commit 8f42718 improved performance by remembering relation stats as native types rather than issuing a new query for each relation. Using native types is fine for integers like relpages; but reltuples is floating point. The commit controllled for that complexity by using setlocale(LC_NUMERIC, "C"). After that, Alexander Lakhin found a problem in pg_strtof(), fixed in 00d61a0. While we aren't aware of any more problems with that approach, it seems wise to just use a string the whole way for floating point values, as Corey's original patch did, and get rid of the setlocale(). Integers are still converted to native types to avoid wasting memory. Co-authored-by: Corey Huinker <corey.huinker@gmail.com> Discussion: https://postgr.es/m/3049348.1740855411@sss.pgh.pa.us Discussion: https://postgr.es/m/560cca3781740bd69881bb07e26eb8f65b09792c.camel%40j-davis.com (cherry picked from commit 1852aea)
yb conflict resolutions:
- doc/src/sgml/func.sgml
Only changes that were not conflicting were picked.
- src/backend/statistics/relation_stats.c
[Unknown context]
>@@ -187,20 +199,22 @@ relation_statistics_update(FunctionCallInfo fcinfo)
<@@ -170,18 +182,20 @@ relation_statistics_update(FunctionCallInfo fcinfo)
>- LOCAL_FCINFO(newfcinfo, 5);
>+ LOCAL_FCINFO(newfcinfo, 6);
<- LOCAL_FCINFO(newfcinfo, 4);
<+ LOCAL_FCINFO(newfcinfo, 5);
>- InitFunctionCallInfoData(*newfcinfo, NULL, 5, InvalidOid, NULL, NULL);
>+ InitFunctionCallInfoData(*newfcinfo, NULL, 6, InvalidOid, NULL, NULL);
<- InitFunctionCallInfoData(*newfcinfo, NULL, 4, InvalidOid, NULL, NULL);
<+ InitFunctionCallInfoData(*newfcinfo, NULL, 5, InvalidOid, NULL, NULL);
>+ newfcinfo->args[5].value = UInt32GetDatum(0);
>+ newfcinfo->args[5].isnull = false;
<+ newfcinfo->args[4].value = UInt32GetDatum(0);
<+ newfcinfo->args[4].isnull = false;
Cause: Missing 99f8f3f: `relallfrozen` was introduced.
Resolution: New arg was added at position 2. So, instead of updating from 5 to 6, in yb-pg15 we update from 4 to 5.
- src/include/catalog/catversion.h
>@@ -57,6 +57,6 @@
>-#define CATALOG_VERSION_NO 202503241
>+#define CATALOG_VERSION_NO 202503251
catversion is not updated in yb-pg15
- src/test/regress/expected/stats_import.out
[Unknown context]
>@@ -238,7 +261,8 @@ WHERE oid = 'stats_import.test'::regclass;
>- 'relation', 'stats_import.test'::regclass,
>+ 'schemaname', 'stats_import',
>+ 'relname', 'test',
Cause: Missing 99f8f3f: `relallfrozen` was introduced with `ok: just relallfrozen` test.
Resolution: Test case does not exist in yb-pg15.
>@@ -1219,6 +1295,53 @@ AND attname = 'arange';
<@@ -1193,6 +1268,52 @@ AND attname = 'arange';
>+ 'relallvisible', 5::integer,
>+ 'relallfrozen', 3::integer);
<+ 'relallvisible', 5::integer);
>+SELECT relname, relpages, reltuples, relallvisible, relallfrozen
<+SELECT relname, relpages, reltuples, relallvisible
>+ relname | relpages | reltuples | relallvisible | relallfrozen
>+------------+----------+-----------+---------------+--------------
>+ stats_temp | -19 | 401 | 5 | 3
<+ relname | relpages | reltuples | relallvisible
<+------------+----------+-----------+---------------
<+ stats_temp | -19 | 401 | 5
Cause: Missing 99f8f3f: `relallfrozen` was introduced.
Resolution: Feature does not exist in yb-pg15
- src/test/regress/sql/stats_import.sql
[Unknown context]
>@@ -167,7 +189,8 @@ WHERE oid = 'stats_import.test'::regclass;
>- 'relation', 'stats_import.test'::regclass,
>+ 'schemaname', 'stats_import',
>+ 'relname', 'test',
Cause: Missing 99f8f3f: `relallfrozen` was introduced with `ok: just relallfrozen` test.
Resolution: Test case does not exist in yb-pg15.
>@@ -864,4 +940,34 @@ AND tablename = 'test'
<@@ -851,4 +926,33 @@ AND tablename = 'test'
>+ 'relallvisible', 5::integer,
>+ 'relallfrozen', 3::integer);
<+ 'relallvisible', 5::integer);
>+SELECT relname, relpages, reltuples, relallvisible, relallfrozen
<+SELECT relname, relpages, reltuples, relallvisible
Cause: Missing 99f8f3f: `relallfrozen` was introduced.
Resolution: Feature does not exist in yb-pg15
For import and export, use schemaname/relname rather than
regclass.
This is more natural during export, fits with the other arguments
better, and it gives better control over error handling in case we
need to downgrade more errors to warnings.
Also, use text for the argument types for schemaname, relname, and
attname so that casts to "name" are not required.
Author: Corey Huinker <corey.huinker@gmail.com>
Discussion: https://postgr.es/m/CADkLM=ceOSsx_=oe73QQ-BxUFR2Cwqum7-UP_fPe22DBY0NerA@mail.gmail.com
(cherry picked from commit 650ab8a)
yb conflict resolutions:
- src/bin/pg_dump/t/002_pg_dump.pl
>@@ @@ -1483,6 +1494,7 @@ my %tests = (
>+ schema_only_with_statistics => 1,
Cause: Missing 00d9dcf: 'LO create (with no data)' test was added.
Resolution: Skip test since its not in pg15.
>@@ -3380,6 +3400,7 @@ my %tests = (
>+ schema_only_with_statistics => 1,
Cause: Missing 1a05c1d: 'COPY test_compression_method' test was added.
Resolution: Skip test since its not in pg15.
<@@ -3552,6 +3573,7 @@ my %tests = (
>+ schema_only_with_statistics => 1,
Cause: Missing a563c24: 'COPY measurement' test was added.
Resolution: Skip test since its not in pg15.
By adding the positive variants of options, in addition to the
negative variants that already exist, users can be explicit about what
pg_dump should produce.
Discussion: https://postgr.es/m/bd0513e4b1ea2b2f2d06f02720c6579711cb62a6.camel@j-davis.com
Reviewed-by: Corey Huinker <corey.huinker@gmail.com>
Reviewed-by: Andres Freund <andres@anarazel.de>
(cherry picked from commit bde2fb7)
yb conflict resolutions:
- src/bin/pg_dump/pg_backup_archiver.c
>@@ -102,7 +102,8 @@ static void pending_list_append(TocEntry *l, TocEntry *te);
<@@ -124,7 +124,8 @@ static void ready_list_insert(ParallelReadyList *ready_list, TocEntry *te);
>-static void move_to_ready_heap(TocEntry *pending_list,
>+static void move_to_ready_heap(ArchiveHandle *AH,
<-static void move_to_ready_list(TocEntry *pending_list,
<+static void move_to_ready_list(ArchiveHandle *AH,
Cause: Missing 9bfd44b: pg_restore's ready_list was converted to a priority queue
Resolution: Combine incoming addition of ArchiveHandle with pg15's ready_list.
>@@ -4331,7 +4352,7 @@ restore_toc_entries_parallel(ArchiveHandle *AH, ParallelState *pstate,
<@@ -4146,7 +4167,7 @@ restore_toc_entries_parallel(ArchiveHandle *AH, ParallelState *pstate,
>- move_to_ready_heap(pending_list, ready_heap, AH->restorePass);
>+ move_to_ready_heap(AH, pending_list, ready_heap, AH->restorePass);
<- move_to_ready_list(pending_list, &ready_list, AH->restorePass);
<+ move_to_ready_list(AH, pending_list, &ready_list, AH->restorePass);
Cause: Missing 9bfd44b: pg_restore's ready_list was converted to a priority queue
Resolution: Combine incoming addition of ArchiveHandle with pg15's ready_list.
>@@ -4380,7 +4401,7 @@ restore_toc_entries_parallel(ArchiveHandle *AH, ParallelState *pstate,
<@@ -4195,7 +4216,7 @@ restore_toc_entries_parallel(ArchiveHandle *AH, ParallelState *pstate,
>- move_to_ready_heap(pending_list, ready_heap, AH->restorePass);
>+ move_to_ready_heap(AH, pending_list, ready_heap, AH->restorePass);
<- move_to_ready_list(pending_list, &ready_list, AH->restorePass);
<+ move_to_ready_list(AH, pending_list, &ready_list, AH->restorePass);
Cause: Missing 9bfd44b: pg_restore's ready_list was converted to a priority queue
Resolution: Combine incoming addition of ArchiveHandle with pg15's ready_list.
>@@ -4551,7 +4572,8 @@ TocEntrySizeCompareBinaryheap(void *p1, void *p2, void *arg)
<@@ -4429,7 +4450,8 @@ TocEntrySizeCompare(const void *p1, const void *p2)
>-move_to_ready_heap(TocEntry *pending_list,
>+move_to_ready_heap(ArchiveHandle *AH,
<-move_to_ready_list(TocEntry *pending_list,
<+move_to_ready_list(ArchiveHandle *AH,
Cause: Missing 9bfd44b: pg_restore's ready_list was converted to a priority queue
Resolution: Combine incoming addition of ArchiveHandle with pg15's ready_list.
REFRESH MATERIALIZED VIEW replaces the storage, which resets
statistics, so statistics must be restored afterward.
If both statistics and data are being dumped for a materialized view,
add a dependency from the former to the latter. Defer the statistics
to SECTION_POST_DATA, and use RESTORE_PASS_POST_ACL.
Reported-by: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com>
Reviewed-by: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com>
Discussion: https://postgr.es/m/CAExHW5s47kmubpbbRJzSM-Zfe0Tj2O3GBagB7YAyE8rQ-V24Uw@mail.gmail.com
(cherry picked from commit a0a4601)
yb conflict resolutions:
- src/bin/pg_dump/pg_backup_archiver.c
>@@ -3849,7 +3890,7 @@ _printTocEntry(ArchiveHandle *AH, TocEntry *te, const char *pfx)
>- * string if any, but we have three special cases:
>+ * string if any, but we have four special cases:
<@@ -3698,12 +3739,50 @@ _printTocEntry(ArchiveHandle *AH, TocEntry *te, const char *pfx)
>@@ -3862,6 +3903,11 @@ _printTocEntry(ArchiveHandle *AH, TocEntry *te, const char *pfx)
>+ * 4. Entries with a defnDumper need to call it to generate the
<+ * Entries with a defnDumper need to call it to generate the
Cause: Missing a45c78e: pg_dump changes to better handle large objects
Resolution: This improvement does not exist in pg15 so the comment is not relevant.
Right now, pg_dump stores all generated commands for statistics in
memory. These commands can be quite large and therefore can
significantly increase pg_dump's memory footprint. To fix, wait
until we are about to write out the commands before generating
them, and be sure to free the commands after writing. This is
implemented via a new defnDumper callback that works much like the
dataDumper one but is specifically designed for TOC entries.
Custom dumps that include data might write the TOC twice (to update
data offset information), which would ordinarily cause pg_dump to
run the attribute statistics queries twice. However, as a hack, we
save the length of the written-out entry in the first pass and skip
over it in the second. While there is no known technical issue
with executing the queries multiple times and rewriting the
results, it's expensive and feels risky, so let's avoid it.
As an exception, we _do_ execute the queries twice for the tar
format. This format does a second pass through the TOC to generate
the restore.sql file. pg_restore doesn't use this file, so even if
the second round of queries returns different results than the
first, it won't corrupt the output; the archive and restore.sql
file will just have different content. A follow-up commit will
teach pg_dump to gather attribute statistics in batches, which our
testing indicates more than makes up for the added expense of
running the queries twice.
Author: Corey Huinker <corey.huinker@gmail.com>
Co-authored-by: Nathan Bossart <nathandbossart@gmail.com>
Reviewed-by: Jeff Davis <pgsql@j-davis.com>
Discussion: https://postgr.es/m/CADkLM%3Dc%2Br05srPy9w%2B-%2BnbmLEo15dKXYQ03Q_xyK%2BriJerigLQ%40mail.gmail.com
(cherry picked from commit 7d5c83b)
yb conflict resolutions:
- src/bin/pg_dump/pg_dump.c
>@@ -10642,16 +10746,16 @@ dumpRelationStats_dumper(Archive *fout, const void *userArg)
<@@ -10062,15 +10166,16 @@ dumpRelationStats_dumper(Archive *fout, const void *userArg)
>+ /* Fetch the next batch of attribute statistics if needed. */
>+ if (rownum >= PQntuples(res))
>+ {
>+ PQclear(res);
>+ res = fetchAttributeStats(fout);
>+ rownum = 0;
>+ }
<+ /* Fetch the next batch of attribute statistics if needed. */
<+ if (rownum >= PQntuples(res))
<+ {
<+ PQclear(res);
<+ res = fetchAttributeStats(fout);
<+ rownum = 0;
<+ }
Cause: Missing 4694aed: `relallfrozen` column was added to the query.
Resolution: Simple merge of the code. This shows up as a conflict since the previous lines were different.
Currently, pg_dump gathers attribute statistics with a query per
relation, which can cause pg_dump to take significantly longer,
especially when there are many relations. This commit addresses
this by teaching pg_dump to gather attribute statistics for 64
relations at a time. Some simple tests showed this was the optimal
batch size, but performance may vary depending on the workload.
Our lookahead code determines the next batch of relations by
searching the TOC sequentially for relevant entries. This approach
assumes that we will dump all such entries in TOC order, which
unfortunately isn't true for dump formats that use
RestoreArchive(). RestoreArchive() does multiple passes through
the TOC and selectively dumps certain groups of entries each time.
This is particularly problematic for index stats and a subset of
matview stats; both are in SECTION_POST_DATA, but matview stats
that depend on matview data are dumped in RESTORE_PASS_POST_ACL,
while all other stats are dumped in RESTORE_PASS_MAIN. To handle
this, this commit moves all statistics data entries in
SECTION_POST_DATA to RESTORE_PASS_POST_ACL, which ensures that we
always dump them in TOC order. A convenient side effect of this
change is that we can revert a decent chunk of commit a0a4601,
but that is left for a follow-up commit.
Author: Corey Huinker <corey.huinker@gmail.com>
Co-authored-by: Nathan Bossart <nathandbossart@gmail.com>
Reviewed-by: Jeff Davis <pgsql@j-davis.com>
Discussion: https://postgr.es/m/CADkLM%3Dc%2Br05srPy9w%2B-%2BnbmLEo15dKXYQ03Q_xyK%2BriJerigLQ%40mail.gmail.com
(cherry picked from commit 9c02e3a)
Commit 9c02e3a taught pg_dump to retrieve attribute statistics for 64 relations at a time. pg_dump supports dumping from v9.2 and newer versions, but our query for retrieving statistics for multiple relations uses WITH ORDINALITY and multi-argument UNNEST(), both of which were introduced in v9.4. To fix, we resort to gathering statistics for a single relation at a time on versions older than v9.4. Per buildfarm member crake. Author: Corey Huinker <corey.huinker@gmail.com> Discussion: https://postgr.es/m/Z_BcWVMvlUIJ_iuZ%40nathan (cherry picked from commit f0d0083)
Import upstream 17 postgres commit from master related to the feature
Transfer statistics during pg_upgrade.Master commits:
f0d0083 | pg_dump: Fix query for gathering attribute stats on older versions. | 2025-04-04 21:05:30 -0500
9c02e3a | pg_dump: Retrieve attribute statistics in batches. | 2025-04-04 14:51:08 -0500
7d5c83b | pg_dump: Reduce memory usage of dumps with statistics. | 2025-04-04 14:51:08 -0500
a0a4601 | Matview statistics depend on matview data. | 2025-03-28 16:12:55 -0700
bde2fb7 | Add pg_dump --with-{schema|data|statistics} options. | 2025-03-25 17:36:38 -0700
650ab8a | Stats: use schemaname/relname instead of regclass. | 2025-03-25 11:16:06 -0700
1852aea | Don't convert to and from floats in pg_dump. | 2025-03-08 11:25:36 -0800
d611f8b | CREATE INDEX: don't update table stats if autovacuum=off. | 2025-03-06 19:36:34 2025 -0800
1d33de9 | Organize and deduplicate statistics import tests. | 2025-03-06 00:19:22 2025 -0800
424eded | Adjust pg_dump tag for relation stats. | 2025-02-27 20:42:12 -0800
40e27d0 | Use attnum to identify index columns in pg_restore_attribute_stats(). | 2025-02-26 16:36:20 -0500
6ee3b91 | pg_dump: prepare attribute stats query. | 2025-02-25 19:52:11 -0800
8f42718 | Avoid unnecessary relation stats query in pg_dump. | 2025-02-25 19:51:45 -0800
fc0d0ce | Ignore hash's relallvisible when checking pg_upgrade from pre-v10. | 2025-02-23 14:16:26 2025 -0500
ab84d0f | Trial fix for old cross-version upgrades. | Thu 2025-02-20 10:21:24 2025 -0800
1fd1bd8 | Transfer statistics during pg_upgrade. | 2025-02-20 01:29:06 -0800
41a2844 | Clean up some pg_dump tests | 2023-10-18 08:03:39 2023 +0200
Phorge Revision: D43002
Conflict resolution:
Individual commits contain the conflict resolution messages. analyze_cherry_pick.py was used to compute resolution messages.
Testing: