
Implement cluster shrink (2nd phase) #2247

Merged
whitehawk merged 24 commits into feature/ADBDEV-6608 from GG-110
Feb 25, 2026

@whitehawk whitehawk commented Feb 16, 2026

Implement cluster shrink (2nd phase)

List of changes:

  • Add support for redistributing materialized views, writable external
    tables, partitioned tables, and unlogged tables. Skip temp tables.
    This is done to comply with the requirements.
  • Add checks that the database and the table exist before we actually start
    rebalancing the table. This is needed because either could be dropped in
    parallel after we have created the rebalance table list.
  • Add retry logic to the table rebalance worker. It is needed when, for
    example, another session opens a transaction after we have created the
    rebalance table list, drops the table before we start rebalancing it, and
    commits the transaction once we have started rebalancing the table (and are
    waiting on the table's locks).
  • Change the order in which shrunk segment processes are stopped. Mirrors are
    now stopped strictly after primaries to avoid hanging replication
    processes.
  • Do not stop the tool if we could not stop some of the shrunk segments.
    Now we only emit a warning. This is done to comply with the
    requirements.
  • Rework fault injection for stopping a segment because of the item above:
    an exception inside the 'SegmentStopAfterShrink' worker no longer stops
    the tool. So now, when a fault is injected, SIGINT is sent to the
    ggrebalance process to halt its work.
  • Improve logging inside 'SegmentStopAfterShrink'.
  • Remove the unused flag 'needs_repopulate'.
  • Add new behave test cases and update old ones to cover the new functionality.
  • Add new behave step definitions to support the updates in the tests.
  • Fix behave test steps for view/matview creation - they opened a connection,
    but didn't use it. Instead, they tried to use the connection from the context,
    which was not properly configured.
  • Update code in the behave utils to support new test step definitions for
    materialized views and unlogged tables.
  • Add to the fault injector the ability to suspend execution instead of
    crashing.
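
The existence checks and the retry logic described in the list above can be sketched roughly as follows. This is only an illustration, not the code from this PR: `rebalance_with_retry`, `ObjectGone`, `TransientRebalanceError`, `RebalanceResult`, and the `exists`/`rebalance` callables are all hypothetical names, and the attempt limit is an assumed parameter.

```python
import time
from dataclasses import dataclass


class ObjectGone(Exception):
    """Hypothetical: the database or table no longer exists."""


class TransientRebalanceError(Exception):
    """Hypothetical: a failure worth retrying, e.g. a lock wait that ended badly."""


@dataclass
class RebalanceResult:
    table: str
    status: str      # "done", "skipped", or "failed"
    attempts: int


def rebalance_with_retry(table, exists, rebalance, max_attempts=3, delay=0.0):
    """Rebalance one table, re-checking its existence before every attempt.

    exists(table) -> bool and rebalance(table) -> None are caller-supplied
    placeholders for whatever the real worker does.
    """
    for attempt in range(1, max_attempts + 1):
        # The table may have been dropped since the rebalance list was built.
        if not exists(table):
            return RebalanceResult(table, "skipped", attempt)
        try:
            rebalance(table)
            return RebalanceResult(table, "done", attempt)
        except ObjectGone:
            # Dropped while we were waiting on its locks: nothing left to do.
            return RebalanceResult(table, "skipped", attempt)
        except TransientRebalanceError:
            if attempt < max_attempts:
                time.sleep(delay)
    return RebalanceResult(table, "failed", max_attempts)
```

In the scenario from the list, the first attempt fails after waiting on the table's locks, and the re-check on the second attempt finds that the table was dropped, so it is skipped rather than reported as an error.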

@whitehawk whitehawk changed the title from "Gg 110" to "Implement cluster shrink (2nd phase)" Feb 18, 2026
@whitehawk whitehawk marked this pull request as ready for review February 18, 2026 05:48
@KnightMurloc

This comment was marked as resolved.

@whitehawk
Author

2nd to perform 'REFRESH MATERIALIZED VIEW'.

Why is just rebalancing not enough? does gpexpand refresh mat view after expanding?

After an f2f discussion, we need to evaluate the CTAS approach for mat views, as the current approach can run into a race condition if one mat view depends on another mat view. Created GG-225.

@KnightMurloc

Skip processing of temp tables

Why? If the database is still available to users during shrink, what will be the state of their temporary tables after that?

@bimboterminator1
Member

Created GG-225.

Current approach won't be cut off for now?

Problem description:
Before this patch, rebalancing a materialized view required 2 steps: the
actual rebalance, where the distribution policy was updated, and a refresh
step to update the data in the materialized view. This approach had 2
problems with respect to its use in the 'ggrebalance' tool for cluster shrink:
1. If the view was not up to date, the refresh could change its data, so the
content would differ before and after the cluster shrink. We intend to keep
the logical data in the cluster unaltered.
2. If a materialized view depends on another materialized view, there could be
a race condition during the refresh, where we refresh from a view that has
not been refreshed yet.

Fix:
Use the CTAS approach from EXPAND TABLE specifically when rebalancing a
materialized view. It creates a temp table with the correct distribution
policy, copies all data from the materialized view into it, and then swaps the
relfilenode of the materialized view with that of the temp table. This keeps
the data as it was before the rebalance, even if it was not up to date (so we
will not surprise the user with unexpected view content), and it eliminates
dependencies on objects other than the materialized view itself.

(cherry picked from commit 37dc7e7)
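
As a rough mental model of the fix described above, the copy-then-swap idea can be shown with a toy in-memory catalog. This is not the server code: `ToyCatalog`, `Relation`, and `rebalance_matview_ctas` are hypothetical names, and real relfilenode swapping involves catalog updates and locking that are not modelled here.

```python
from dataclasses import dataclass, field


@dataclass
class Relation:
    name: str
    numsegments: int     # segments in the distribution policy
    relfilenode: int     # id of the storage holding the rows


@dataclass
class ToyCatalog:
    """In-memory stand-in for a catalog plus storage (illustration only)."""
    storage: dict = field(default_factory=dict)   # relfilenode -> list of rows
    next_node: int = 1

    def create(self, name, numsegments, rows):
        node = self.next_node
        self.next_node += 1
        self.storage[node] = list(rows)
        return Relation(name, numsegments, node)

    def drop(self, rel):
        del self.storage[rel.relfilenode]


def rebalance_matview_ctas(catalog, matview, target_segments):
    """Copy the stored rows into a temp relation, then swap relfilenodes."""
    # 1. CTAS: temp relation with the target policy, filled from the matview's
    #    stored rows. The view's defining query is never re-evaluated.
    tmp = catalog.create("tmp_" + matview.name, target_segments,
                         catalog.storage[matview.relfilenode])
    # 2. Swap storage so the matview points at the redistributed copy.
    matview.relfilenode, tmp.relfilenode = tmp.relfilenode, matview.relfilenode
    matview.numsegments = target_segments
    # 3. The temp relation now owns the old storage; drop it.
    catalog.drop(tmp)
```

Because only the view's own stored rows are read, a stale view stays stale after the rebalance, and no other materialized view is touched.
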
@whitehawk
Author

2nd to perform 'REFRESH MATERIALIZED VIEW'.

Why is just rebalancing not enough? does gpexpand refresh mat view after expanding?

After an f2f discussion, we need to evaluate the CTAS approach for mat views, as the current approach can run into a race condition if one mat view depends on another mat view. Created GG-225.

I've updated the handling of matviews. They no longer require a REFRESH step.

Please note that there are changes in src/backend/* and src/test/regress/*. They are included here for convenience (to make the shrink changes work in this branch). They will be reviewed and committed in another PR (#2249) before this PR is merged.

@whitehawk
Author

Skip processing of temp tables

Why? If the database is still available to users during shrink, what will be the state of their temporary tables after that?

According to the requirements, we need to rebalance tables with "relpersistence = 'p' | relpersistence = 'u'". gpexpand also skips temp tables.
In the normal workflow, all sessions should be disconnected by the end of the shrink procedure so that the shrunk segments can be stopped; therefore, under normal conditions, no temp tables should survive the shrink procedure anyway.
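
For illustration, the persistence filter mentioned above could look like the sketch below. In `pg_class.relpersistence`, 'p' means permanent, 'u' unlogged, and 't' temporary. The function name and the shape of the input rows are assumptions made for this example, not taken from the tool.

```python
REBALANCED_PERSISTENCE = {"p", "u"}   # permanent and unlogged relations


def select_tables_for_rebalance(rows):
    """Keep names from (name, relpersistence) rows that should be rebalanced.

    Temp tables ('t') are dropped from the list, matching the requirement
    quoted above.
    """
    return [name for name, persistence in rows
            if persistence in REBALANCED_PERSISTENCE]
```

Given rows for a permanent, an unlogged, and a temp table, only the first two names are returned.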

@whitehawk whitehawk merged commit a57039b into feature/ADBDEV-6608 Feb 25, 2026
1 check passed
@whitehawk whitehawk deleted the GG-110 branch February 25, 2026 11:21