New option PGBK_PAUSE_TIMEOUT #33

fljdin · 2020-02-13T16:14:52Z

Add timeout behaviour when dumping standby in case of AccessExclusiveLocks #32

A long-running statement holding an AccessExclusiveLock on the primary is replicated to the standby, and causes pg_back to loop internally when it is invoked on the standby. To prevent pg_back executions from piling up in a daily routine, this introduces a new option, PGBK_PAUSE_TIMEOUT (-T), which sets how many seconds pg_back waits before giving up. The pause/resume replay logic has been simplified.
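As a rough illustration of the timeout behaviour described above, here is a minimal sketch of a wait loop that gives up after a fixed number of seconds. It is not the code from this patch: `has_exclusive_locks`, `wait_for_locks` and `LOCK_CHECKS_LEFT` are hypothetical names, and the lock check is simulated with a counter so the sketch runs without a PostgreSQL instance.

```shell
#!/bin/bash
# Sketch only: has_exclusive_locks is a hypothetical stand-in for the real
# check that pg_back runs against the standby. It is simulated with a counter.
LOCK_CHECKS_LEFT=${LOCK_CHECKS_LEFT:-2}

# Returns 0 while blocking locks are still reported, 1 once they are gone.
has_exclusive_locks() {
    if [ "$LOCK_CHECKS_LEFT" -gt 0 ]; then
        LOCK_CHECKS_LEFT=$((LOCK_CHECKS_LEFT - 1))
        return 0
    fi
    return 1
}

# Wait until the locks clear, giving up once $1 seconds have been spent.
# $2 is the pause between two checks, in seconds.
wait_for_locks() {
    local timeout=$1 pause=$2 waited=0
    while has_exclusive_locks; do
        if [ "$waited" -ge "$timeout" ]; then
            echo "timeout after ${waited}s, giving up" >&2
            return 1
        fi
        sleep "$pause"
        waited=$((waited + pause))
    done
    return 0
}
```

With a bounded wait like this, a run that cannot obtain a consistent state exits with an error instead of looping until the next scheduled run starts.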

Save to output of SHOW ALL to pg_settings_DATETIME.out

orgrim · 2020-02-25T11:16:52Z

Hello, thanks for the patch.

I would prefer to have a name like PGBK_STANDBY_PAUSE_TIMEOUT, which says this timeout relates to standby clusters.

Other than that, it looks good.

Regards

fljdin · 2020-02-26T08:50:40Z

Hi,

Okay for renaming the variable.

The blocking behavior is relevant on a primary cluster too, but the internal loop isn't implemented for that case. For users who don't use replication, a long blocking session could cause multiple executions of pg_back to pile up, day after day.

Regards

orgrim

Please rebase your branch on the tip of orgrim:master; merging right now would revert #31.

Thank you

pg_back

orgrim · 2020-03-18T08:50:59Z

Concerning the fact that a long run could conflict with the backup schedule, with pg_back processes stacking up, I guess a locking mechanism could be implemented using the flock command; see #34.
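A minimal sketch of the flock idea, assuming the `flock` utility from util-linux is available. The lock file path and the `run_backup` and `run_exclusively` functions are hypothetical and only serve the illustration; this is not what #34 implements.

```shell
#!/bin/bash
# Sketch only: the lock file path and run_backup are hypothetical.
LOCK_FILE=${LOCK_FILE:-/tmp/pg_back.lock}

# Placeholder for the actual backup work.
run_backup() {
    echo "backup running"
}

# Run the given command only if no other process holds the lock.
run_exclusively() {
    (
        # -n: fail immediately instead of waiting when the lock is taken.
        if ! flock -n 9; then
            echo "another run holds $LOCK_FILE, exiting" >&2
            exit 1
        fi
        "$@"
    ) 9>"$LOCK_FILE"   # file descriptor 9 stays open for the whole subshell
}
```

Calling `run_exclusively run_backup` from the scheduled job would make a second run exit at once while the first is still working, rather than stack up behind it.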

orgrim · 2020-03-20T08:31:01Z

pg_back

+            fi
+
+            echo "The standby database has exclusive locks (vacuum full, truncate or other locking command) running on primary"
+            echo "Resuming replication for ${PGBK_PAUSE}s"


I guess we could use warn() and info() here to print the messages correctly. Could you please make the change?

Thank you
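For illustration, the two echo lines could look roughly like this once switched to the helpers. The `info()` and `warn()` definitions below are hypothetical stand-ins (timestamped messages on stderr is an assumption); the real helpers in pg_back may format and route messages differently.

```shell
#!/bin/bash
# Hypothetical stand-ins for pg_back's info() and warn() helpers.
info() {
    echo "$(date '+%F %T') INFO: $*" >&2
}

warn() {
    echo "$(date '+%F %T') WARNING: $*" >&2
}

PGBK_PAUSE=10   # example value, in seconds

warn "The standby database has exclusive locks (vacuum full, truncate or other locking command) running on primary"
info "Resuming replication for ${PGBK_PAUSE}s"
```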

orgrim and others added 3 commits January 23, 2020 09:47

Save to output of SHOW ALL to pg_settings_DATETIME.out

f9de306

Fix message

3b7dcf5

fljdin requested a review from orgrim February 13, 2020 16:15

Merge pull request #31 from orgrim/save-settings

126ff14

Save to output of SHOW ALL to pg_settings_DATETIME.out

Renaming PGBK_STANDBY_PAUSE_TIMEOUT

3dd98a6

orgrim requested changes Mar 18, 2020

orgrim mentioned this pull request Mar 18, 2020

Long execution and cron scheduling can lead to processes stacking up #34

Closed

Florent Jardin added 5 commits March 19, 2020 11:13

Fix message

2a06148

Renaming PGBK_STANDBY_PAUSE_TIMEOUT

45b3a40

Merge branch 'master' of github.com:fljdin/pg_back

ee499bb

Using standardized cluster names

e7ba733

orgrim requested changes Mar 20, 2020

Replace echo by info()

8c813af

orgrim merged commit 9637871 into orgrim:master Mar 24, 2020
