Option to backup and restore from gcp storage bucket (GCS) #1791

Closed
devopsevd opened this issue Aug 17, 2020 · 6 comments · Fixed by #2368

Comments

@devopsevd

**What is the motivation or use case for the change?**
I do not see an option in the Crunchy Data operator to back up to and restore from a GCP storage bucket.

**Describe the solution you'd like**
I would like to see an option in the postgres-operator to back up to and restore from a GCP storage bucket.

Please tell us about your environment:

  • Operating System: Linux-based
  • Where is this running (Local, Cloud Provider): GCP GKE cluster
  • Storage being used (NFS, Hostpath, Gluster, etc.): GCP PVC
  • Container Image Tag: centos7-4.3.0 / centos7-12.2-4.3.0
  • PostgreSQL Version: 12.2
  • Platform (Docker, Kubernetes, OpenShift): Kubernetes
  • Platform Version: 1.16.11-gke.5


@jkatz
Contributor

jkatz commented Aug 17, 2020

For direct support of GCS, the PostgreSQL Operator will need for pgBackRest to support GCS.

Until then, I would recommend reviewing this comment about how GCS can be accomplished with MinIO today, substituting "Azure" with GCS for the supported gateway:

https://docs.min.io/docs/minio-gateway-for-gcs.html

@davi5e

davi5e commented Aug 30, 2020

Hello,

I just tried to enable backups to GCS using HMAC keys for service accounts, since GCS should be S3-compatible, but it appears I have hit this issue: the S3 interface is not fully supported.

It looks like some small changes in PGO/pgBackRest would enable this feature seamlessly, and I already use this approach elsewhere.

To test it, I followed PGO's documentation but tweaked pgo.yaml:

Cluster:
  BackrestS3Endpoint: storage.googleapis.com
  # disable TLS because the current cert is for AWS
  BackrestS3VerifyTLS: false

The stanza error logs are shown below. Just to be clear, as far as I know this is an "error" on the side of Google's API.

time="2020-08-30T22:58:15Z" level=info msg="pgo-backrest starts"
time="2020-08-30T22:58:15Z" level=info msg="debug flag set to false"
time="2020-08-30T22:58:15Z" level=info msg="backrest stanza-create command requested"
time="2020-08-30T22:58:15Z" level=info msg="s3 flag enabled for backrest command"
time="2020-08-30T22:58:15Z" level=info msg="command to execute is [pgbackrest stanza-create  --db-host=10.3.4.10 --db-path=/pgdata/website-cluster --repo1-type=s3 --no-repo1-s3-verify-tls]"
time="2020-08-30T22:58:15Z" level=info msg="command is pgbackrest stanza-create  --db-host=10.3.4.10 --db-path=/pgdata/website-cluster --repo1-type=s3 --no-repo1-s3-verify-tls "
time="2020-08-30T22:58:19Z" level=error msg="command terminated with exit code 39"
time="2020-08-30T22:58:19Z" level=info msg="output=[]"
time="2020-08-30T22:58:19Z" level=info msg="stderr=[ERROR: [039]: S3 request failed with 400: Bad Request\n       *** URI/Query ***:\n       /?delimiter=%2F&list-type=2&prefix=backrestrepo%2Fwebsite-cluster-backrest-shared-repo%2Farchive%2Fdb%2F\n       *** Request Headers ***:\n       authorization: <redacted>\n       content-length: 0\n       host: my-bucket-name-here.storage.googleapis.com\n       x-amz-content-sha256: e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855\n       x-amz-date: <redacted>\n       *** Response Headers ***:\n       cache-control: private, max-age=0\n       content-length: 239\n       content-type: application/xml; charset=UTF-8\n       date: Sun, 30 Aug 2020 22:58:19 GMT\n       expires: Sun, 30 Aug 2020 22:58:19 GMT\n       server: UploadServer\n       x-guploader-uploadid: ABg5-UyZgtOk7vQuoGHtbE6G2nxVFheoWpmh8GV1R0_G7KqT8hcJocA1_768nQLPgaXuxxFLxYyvkFkWqNxSHgnN08BInmyXBA\n       *** Response Content ***:\n       <?xml version='1.0' encoding='UTF-8'?><Error><Code>NotImplemented</Code><Message>A header or query you provided requested a function that is not implemented.</Message><Details>GET ?list-type is not implemented for buckets</Details></Error>\n]"
time="2020-08-30T22:58:19Z" level=error msg="command terminated with exit code 39"
stream closed

@strigona-worksight

Just FYI, the stanza logs indicate the issue at hand with GCS:

GET ?list-type is not implemented for buckets

GCS is not fully compatible with the S3 protocol. pgBackRest needs to be modified to support the storage endpoint, and this feature is on pgBackRest's backlog: https://github.com/pgbackrest/pgbackrest/projects/2#card-33945748
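
For anyone who wants to pull the relevant detail out of a response like the one in the log above, here is a minimal Python sketch. It parses only the XML body that GCS returned (copied from the stderr output) and reads the `Code` and `Details` elements. The function name `parse_storage_error` is hypothetical and is not part of PGO or pgBackRest.

```python
import xml.etree.ElementTree as ET

# Response body copied from the pgBackRest stderr output in the log above.
RESPONSE_BODY = (
    "<?xml version='1.0' encoding='UTF-8'?><Error><Code>NotImplemented</Code>"
    "<Message>A header or query you provided requested a function that is "
    "not implemented.</Message>"
    "<Details>GET ?list-type is not implemented for buckets</Details></Error>"
)


def parse_storage_error(body: str) -> dict:
    """Return the child elements of an <Error> document as a tag -> text dict.

    Hypothetical helper for illustration only.
    """
    # Parse from bytes so the parser honours the encoding declaration itself.
    root = ET.fromstring(body.encode("utf-8"))
    return {child.tag: (child.text or "") for child in root}


error = parse_storage_error(RESPONSE_BODY)
print(error["Code"], "-", error["Details"])
```

The `Details` field is the part that identifies the unsupported request, which is the `list-type` query parameter visible in the URI/Query section of the log.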

@leosussan

leosussan commented Dec 17, 2020

Similar problem for me. As a workaround, I've been experimenting with an implementation that uses a GCS CSI driver to mount a Cloud Storage bucket as a persistent volume:
https://github.com/ofek/csi-gcs

Mileage may vary, but it works well for me.

jkatz pushed a commit to jkatz/postgres-operator that referenced this issue Apr 12, 2021
pgBackRest 2.33 introduced support for GCS storage as a means
for taking backups. PGO now adds orchestration for using GCS
with its pgBackRest integration.

This adds a few new attributes to the pgclusters.crunchydata.com
custom resource to enable GCS support with PGO, including:

- BackrestGCSBucket (required)
- BackrestGCSEndpoint
- BackrestGCSKeyType

The pgBackRest repository Secret now supports a key called
"gcs-key", which references the GCS credential.

Similarly, additional flags are now available in the
`pgo create cluster` command to enable GCS support, including:

- `--pgbackrest-gcs-bucket`
- `--pgbackrest-gcs-endpoint`
- `--pgbackrest-gcs-key`
- `--pgbackrest-gcs-key-type`

Note that `--pgbackrest-gcs-key` references a file path in your
local environment. The GCS credential is a JSON file; for
convenience, the PGO client will accept the file and handle the
upload.

There are also several installation configuration parameters
available if, for example, you are using the same bucket.

The two parameters that are required are the GCS bucket name and
the GCS key; pgBackRest can figure out the rest.

This supports all of the same things that PGO supports around S3;
you can even create standbys using GCS.

However, note that in a "hybrid" setup, you can only use "posix,gcs";
"s3,gcs" is not supported at this time. In other words, the following
storage types are supported:

- posix
- s3
- gcs
- posix,s3
- posix,gcs

Issue: [ch11177]
Issue: CrunchyData#1791
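
As a rough illustration of the rules stated in the commit message above, the following Python sketch checks a storage-type string against the five supported combinations and assembles the GCS-related flags for `pgo create cluster`. Only the flag names and storage types come from the commit message; the helper names `validate_storage_type` and `gcs_flags`, the bucket name and the key path are hypothetical, and the sketch does not call the PGO client.

```python
# Storage types listed as supported in the commit message.
SUPPORTED_STORAGE_TYPES = {"posix", "s3", "gcs", "posix,s3", "posix,gcs"}


def validate_storage_type(storage_type: str) -> str:
    """Return storage_type unchanged if it is supported, else raise ValueError.

    Hypothetical helper; "s3,gcs" is rejected because the commit message
    says that combination is not supported at this time.
    """
    if storage_type not in SUPPORTED_STORAGE_TYPES:
        raise ValueError(f"unsupported pgBackRest storage type: {storage_type!r}")
    return storage_type


def gcs_flags(bucket, key_path, endpoint=None, key_type=None):
    """Build the GCS flags for `pgo create cluster` (hypothetical helper).

    The bucket and the key are the two required values; the endpoint and
    key type are appended only when given.
    """
    flags = [
        f"--pgbackrest-gcs-bucket={bucket}",
        f"--pgbackrest-gcs-key={key_path}",  # path to a local JSON credential
    ]
    if endpoint is not None:
        flags.append(f"--pgbackrest-gcs-endpoint={endpoint}")
    if key_type is not None:
        flags.append(f"--pgbackrest-gcs-key-type={key_type}")
    return flags


validate_storage_type("posix,gcs")
print(" ".join(gcs_flags("my-bucket", "/tmp/gcs-key.json")))
```

Passing `"s3,gcs"` to `validate_storage_type` raises `ValueError`, mirroring the hybrid-setup restriction described above.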
jkatz added a commit that referenced this issue Apr 15, 2021
@jkatz
Contributor

jkatz commented Apr 15, 2021

Direct support for GCS will be available with the 4.7.0 release.

@kubaracek
Contributor

kubaracek commented Apr 20, 2021

@jkatz Is there any ETA on 4.7.0? I would like to play around with GCS, so I am wondering whether I should wait for the official release or work from master for now. Thank you.
