barman backups failing due to idle in transaction termination #333

Closed
martinmarques opened this issue Apr 13, 2021 · 0 comments · Fixed by #340

Comments

@martinmarques
Contributor

When idle_in_transaction_session_timeout is set to a low value, backups can fail because barman leaves connections open with a live transaction.

Internally reported in RT70653

mikewallace1979 added a commit that referenced this issue Jun 23, 2021
Uses the psycopg2 connection context manager to commit transactions
when barman uses the connection to interact with postgres for
reasons other than starting or stopping the backup.

This prevents the following situation from arising:

1. Barman starts a new transaction, for example to retrieve the
   server version.
2. The transaction is left open putting the session into an
   idle-in-transaction state.
3. The backup lasts longer than the value of
   idle_in_transaction_session_timeout and fails because the
   session is terminated.

Because this patch ensures such transactions are committed,
via the psycopg2 connection context manager, such transactions
are no longer held open during the backup.

Note that in cases where we do not call `self.connect()` directly,
a new context manager is added around the `_cursor` property in
barman.postgres.PostgreSQL which uses with-statement to enter the
connection context manager and then yields the cursor context
manager. When the cursor context manager exits, the connection
context manager will also exit. It is not sufficient to just use
the cursor context manager as this does not commit transactions.

See https://www.psycopg.org/docs/usage.html#with-statement for
more details.

The context manager is deliberately avoided for the functions
which call either `pg_{start,stop}_backup` or
`pgespresso_{start,stop}_backup` as these manage transactions
themselves by explicitly calling `rollback`.

Closes #333
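
The distinction the commit message draws, between leaving a cursor `with` block (which only closes the cursor) and leaving a connection `with` block (which commits or rolls back), can be illustrated with a minimal sketch. This is not barman's code: `FakeConnection`, `FakeCursor`, `committing_cursor` and `demo` are hypothetical names, and the two fake classes are stand-ins that only imitate the documented psycopg2 `with`-statement behaviour so that the example runs without a database.

```python
from contextlib import contextmanager


class FakeCursor:
    """Hypothetical stand-in for a psycopg2 cursor."""

    def __init__(self, conn):
        self.conn = conn
        self.closed = False

    def execute(self, query):
        # With autocommit off, the first statement implicitly opens a transaction.
        self.conn.in_transaction = True

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        # Leaving the cursor block closes the cursor; nothing is committed.
        self.closed = True
        return False


class FakeConnection:
    """Hypothetical stand-in for a psycopg2 connection."""

    def __init__(self):
        self.in_transaction = False
        self.commits = 0
        self.rollbacks = 0

    def cursor(self):
        return FakeCursor(self)

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        # Leaving the connection block commits on success, rolls back on error.
        if exc_type is None:
            self.commits += 1
        else:
            self.rollbacks += 1
        self.in_transaction = False
        return False


@contextmanager
def committing_cursor(conn):
    """Enter the connection context first, then yield a cursor from inside it."""
    with conn:
        with conn.cursor() as cur:
            yield cur


def demo():
    conn = FakeConnection()

    # Cursor context manager alone: the transaction is left open.
    with conn.cursor() as cur:
        cur.execute("SELECT version()")
    open_after_cursor_only = conn.in_transaction

    # Wrapper: the connection context exits after the cursor context and commits.
    with committing_cursor(conn) as cur:
        cur.execute("SELECT version()")
    open_after_wrapper = conn.in_transaction

    return open_after_cursor_only, open_after_wrapper, conn.commits
```

In this sketch the session would sit idle in transaction after the first block, which is the state that idle_in_transaction_session_timeout terminates, whereas the wrapper leaves no transaction open once the block ends.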
mikewallace1979 added a commit that referenced this issue Jun 23, 2021
Uses the psycopg2 connection context manager so that transactions
are committed when barman uses the connection to interact with
postgres for reasons other than starting or stopping the backup.

This prevents the following situation from arising:

1. Barman starts a new transaction, for example to retrieve the
   server version.
2. The transaction is left open putting the session into an
   idle-in-transaction state.
3. The backup lasts longer than the value of
   idle_in_transaction_session_timeout and fails because the
   session is terminated.

Because this patch ensures such transactions are committed,
via the psycopg2 connection context manager, such transactions
are no longer held open during the backup.

Note that in cases where we do not call `self.connect()` directly,
a new context manager is added around the `_cursor` property in
barman.postgres.PostgreSQL which uses with-statement to enter the
connection context manager and then yields the cursor context
manager. When the cursor context manager exits, the connection
context manager will also exit. It is not sufficient to just use
the cursor context manager as this does not commit transactions.

See https://www.psycopg.org/docs/usage.html#with-statement for
more details.

The context manager is deliberately avoided for the functions
which call either `pg_{start,stop}_backup` or
`pgespresso_{start,stop}_backup` as these manage transactions
themselves by explicitly calling `rollback`.

Closes #333
mikewallace1979 added a commit that referenced this issue Jun 23, 2021
Uses the psycopg2 connection context manager so that transactions
are committed when barman uses the connection to interact with
postgres for reasons other than starting or stopping the backup.

This prevents the following situation from arising:

1. Barman starts a new transaction, for example to retrieve the
   server version.
2. The transaction is left open putting the session into an
   idle-in-transaction state.
3. The backup lasts longer than the value of
   idle_in_transaction_session_timeout and fails because the
   session is terminated.

Because this patch ensures such transactions are committed,
via the psycopg2 connection context manager, such transactions
are no longer held open during the backup.

Note that in cases where we do not call `self.connect()` directly,
a new context manager is added around the `_cursor` property in
barman.postgres.PostgreSQL which uses with-statement to enter the
connection context manager and then yields the cursor context
manager. When the cursor context manager exits, the connection
context manager will also exit. It is not sufficient to just use
the cursor context manager as this does not commit transactions.

See https://www.psycopg.org/docs/usage.html#with-statement for
more details.

The context manager is deliberately avoided for the functions
which call either `pg_{start,stop}_backup` or
`pgespresso_{start,stop}_backup` as these manage transactions
themselves by explicitly calling `rollback`.

Closes #333
mikewallace1979 added a commit that referenced this issue Jul 9, 2021
Uses the psycopg2 connection context manager so that transactions
are committed when barman uses the connection to interact with
postgres for reasons other than starting or stopping the backup.

This prevents the following situation from arising:

1. Barman starts a backup by SELECTing from pg_start_backup and
   immediately rolls back the transaction.
2. Barman starts a new transaction to retrieve the server
   version.
3. The transaction is left open putting the session into an
   idle-in-transaction state.
4. The backup lasts longer than the value of
   idle_in_transaction_session_timeout and fails because the
   session is terminated.

Because this patch ensures such transactions are committed,
via the psycopg2 connection context manager, such transactions
are no longer held open during the backup.

Note that in cases where we do not call `self.connect()` directly,
a new context manager is added around the `_cursor` property in
barman.postgres.PostgreSQL which uses with-statement to enter the
connection context manager and then yields the cursor context
manager. When the cursor context manager exits, the connection
context manager will also exit. It is not sufficient to just use
the cursor context manager as this does not commit transactions.

See https://www.psycopg.org/docs/usage.html#with-statement for
more details.

The context manager is deliberately avoided for the functions
which call either `pg_{start,stop}_backup` or
`pgespresso_{start,stop}_backup` as these manage transactions
themselves by explicitly calling `rollback`.

Closes #333
amenonsen pushed a commit that referenced this issue Jul 9, 2021
Uses the psycopg2 connection context manager so that transactions
are committed when barman uses the connection to interact with
postgres for reasons other than starting or stopping the backup.

This prevents the following situation from arising:

1. Barman starts a backup by SELECTing from pg_start_backup and
   immediately rolls back the transaction.
2. Barman starts a new transaction to retrieve the server
   version.
3. The transaction is left open putting the session into an
   idle-in-transaction state.
4. The backup lasts longer than the value of
   idle_in_transaction_session_timeout and fails because the
   session is terminated.

Because this patch ensures such transactions are committed,
via the psycopg2 connection context manager, such transactions
are no longer held open during the backup.

Note that in cases where we do not call `self.connect()` directly,
a new context manager is added around the `_cursor` property in
barman.postgres.PostgreSQL which uses with-statement to enter the
connection context manager and then yields the cursor context
manager. When the cursor context manager exits, the connection
context manager will also exit. It is not sufficient to just use
the cursor context manager as this does not commit transactions.

See https://www.psycopg.org/docs/usage.html#with-statement for
more details.

The context manager is deliberately avoided for the functions
which call either `pg_{start,stop}_backup` or
`pgespresso_{start,stop}_backup` as these manage transactions
themselves by explicitly calling `rollback`.

Closes #333
mikewallace1979 added a commit that referenced this issue Jul 12, 2021
This reverts commit c9655da.

Reverting because the commit introduced the following behaviour:

1. Recursive entering of context managers.
2. Using autocommit=True inside a connection context manager.

Both resulted in undefined behaviour with psycopg2 < 2.9;
however, since the following commit they now cause exceptions:

   psycopg/psycopg2@e5ad0ab

Our commit is therefore reverted to prevent barman from breaking
with psycopg2 >=2.9 and we will resolve issue #333 another way.
amenonsen pushed a commit that referenced this issue Jul 13, 2021
This reverts commit c9655da.

Reverting because the commit introduced the following behaviour:

1. Recursive entering of context managers (e.g., the call to
   has_backup_privileges in switch_wal)
2. Using autocommit=True (for StreamingConnection) inside a
   connection context manager

Both resulted in undefined behaviour with psycopg2 < 2.9;
however, since the following commit they now cause exceptions:

   psycopg/psycopg2@e5ad0ab

Our commit is therefore reverted to prevent barman from breaking
with psycopg2 >=2.9 and we will resolve issue #333 another way.
netbsd-srcmastr pushed a commit to NetBSD/pkgsrc that referenced this issue May 11, 2022
Version 2.19 - 9 March 2022

- Change `barman diagnose` output date format to ISO8601.

- Add Google Cloud Storage (GCS) support to barman cloud.

- Support `current` and `latest` recovery targets for the `--target-tli`
  option of `barman recover`.

- Add documentation for installation on SLES.

- Bug fixes:

    - `barman-wal-archive --test` now returns a non-zero exit code when
      an error occurs.

    - Fix `barman-cloud-check-wal-archive` behaviour when `-t` option is
      used so that it exits after connectivity test.

    - `barman recover` now continues when `--no-get-wal` is used and
      `"get-wal"` is not set in `recovery_options`.

    - Fix `barman show-servers --format=json ${server}` output for
      inactive server.

    - Check for presence of `barman_home` in configuration file.

    - Passive barman servers will no longer store two copies of the
      tablespace data when syncing backups taken with
      `backup_method = postgres`.

- We thank richyen for his contributions to this release.

Version 2.18 - 21 January 2022

- Add snappy compression algorithm support in barman cloud (requires the
  optional python-snappy dependency).

- Allow Azure client concurrency parameters to be set when uploading
  WALs with barman-cloud-wal-archive.

- Add `--tags` option in barman cloud so that backup files and archived
  WALs can be tagged in cloud storage (aws and azure).

- Update the barman cloud exit status codes so that there is a dedicated
  code (2) for connectivity errors.

- Add the commands `barman verify-backup` and `barman generate-manifest`
  to check if a backup is valid.

- Add support for Azure Managed Identity auth in barman cloud which can
  be enabled with the `--credential` option.

- Bug fixes:

    - Change `barman-cloud-check-wal-archive` behavior when bucket does
      not exist.

    - Ensure `list-files` output is always sorted regardless of the
      underlying filesystem.

    - Man pages for barman-cloud-backup-keep, barman-cloud-backup-delete
      and barman-cloud-check-wal-archive added to Python packaging.

- We thank richyen and stratakis for their contributions to this
  release.

Version 2.17 - 1 December 2021

- Bug fixes:

    - Resolves a performance regression introduced in version 2.14 which
      increased copy times for `barman backup` or `barman recover` commands
      when using the `--jobs` flag.

    - Ignore rsync partial transfer errors for `sender` processes so that
      such errors do not cause the backup to fail (thanks to barthisrael).

Version 2.16 - 17 November 2021

- Add the commands `barman-check-wal-archive` and `barman-cloud-check-wal-archive`
  to validate if a proposed archive location is safe to use for a new PostgreSQL
  server.

- Allow Barman to identify WAL that's already compressed using a custom
  compression scheme to avoid compressing it again.

- Add `last_backup_minimum_size` and `last_wal_maximum_age` options to
  `barman check`.

- Bug fixes:

    - Use argparse for command line parsing instead of the unmaintained
      argh module.

    - Make timezones consistent for `begin_time` and `end_time`.

- We thank chtitux, George Hansper, stratakis, Thoro, and vrms for their
  contributions to this release.

Version 2.15 - 12 October 2021

- Add plural forms for the `list-backup`, `list-server` and
  `show-server` commands which are now `list-backups`, `list-servers`
  and `show-servers`. The singular forms are retained for backward
  compatibility.

- Add the `last-failed` backup shortcut which references the newest
  failed backup in the catalog so that you can do:

    - `barman delete <SERVER> last-failed`

- Bug fixes:

    - Tablespaces will no longer be omitted from backups of EPAS
      versions 9.6 and 10 due to an issue detecting the correct version
      string on older versions of EPAS.

Version 2.14 - 22 September 2021

- Add the `barman-cloud-backup-delete` command which allows backups in
  cloud storage to be deleted by specifying either a backup ID or a
  retention policy.

- Allow backups to be retained beyond any retention policies in force by
  introducing the ability to tag existing backups as archival backups
  using `barman keep` and `barman-cloud-backup-keep`.

- Allow the use of SAS authentication tokens created at the restricted
  blob container level (instead of the wider storage account level) for
  Azure blob storage

- Significantly speed up `barman restore` into an empty directory for
  backups that contain hundreds of thousands of files.

- Bug fixes:

    - The backup privileges check will no longer fail if the user lacks
      "userepl" permissions and will return better error messages if any
      required permissions are missing

Version 2.13 - 26 July 2021

- Add Azure blob storage support to barman-cloud

- Support tablespace remapping in barman-cloud-restore via
  `--tablespace name:location`

- Allow barman-cloud-backup and barman-cloud-wal-archive to run as
  Barman hook scripts, to allow data to be relayed to cloud storage
  from the Barman server

- Bug fixes:

    - Stop backups failing due to idle_in_transaction_session_timeout
      (EnterpriseDB/barman#333)

    - Fix a race condition between backup and archive-wal in updating
      xlog.db entries

    - Handle PGDATA being a symlink in barman-cloud-backup, which led to
      "seeking backwards is not allowed" errors on restore

    - Recreate pg_wal on restore if the original was a symlink

    - Recreate pg_tblspc symlinks for tablespaces on restore

    - Make barman-cloud-backup-list skip backups it cannot read, e.g.,
      because they are in Glacier storage

    - Add `-d database` option to barman-cloud-backup to specify which
      database to connect to initially

    - Fix "Backup failed uploading data" errors from barman-cloud-backup
      on Python 3.8 and above, caused by attempting to pickle the boto3
      client

    - Correctly enable server-side encryption in S3 for buckets that do
      not have encryption enabled by default.

      In Barman 2.12, barman-cloud-backup's `--encryption` option did
      not correctly enable encryption for the contents of the backup if
      the backup was stored in an S3 bucket that did not have encryption
      enabled. If this is the case for you, please consider deleting
      your old backups and taking new backups with Barman 2.13.

      If your S3 buckets already have encryption enabled by default
      (which we recommend), this does not affect you.

Version 2.12.1 - 30 June 2021

-   Bug fixes:

    -   Allow specifying target-tli with other target-* recovery options
    -   Fix incorrect NAME in barman-cloud-backup-list manpage
    -   Don't raise an error if SIGALRM is ignored
    -   Fetch wal_keep_size, not wal_keep_segments, from Postgres 13

Version 2.12 - 5 Nov 2020

-   Introduce a new backup_method option called local-rsync which
    targets those cases where Barman is installed on the same server
    where PostgreSQL is and directly uses rsync to take base backups,
    bypassing the SSH layer.

-   Bug fixes:

    -   Avoid corrupting boto connection in worker processes
    -   Avoid connection attempts to PostgreSQL during tests