Support for running the proxy as a sidecar in Kubernetes with Workload Identity #701

Closed
magiccrafter opened this issue Mar 2, 2023 · 3 comments · Fixed by #718 or #723

@magiccrafter

It would be awesome to bring Workload Identity (WI) support to PGAdapter.
That would make any transition from Cloud SQL for PostgreSQL to Spanner a breeze.

Example:
https://github.com/GoogleCloudPlatform/cloud-sql-proxy/tree/main/examples/k8s-sidecar#run-the-cloud-sql-proxy-as-a-sidecar

@olavloite
Collaborator

That feature already works with PGAdapter. I've been able to run a simple test application using that setup locally. I'll try to write a step-by-step tutorial for setting this up with PGAdapter, but it will largely be the same as for the cloud-sql-proxy. If you want to try it out in the meantime, you should be able to get it up and running by following that tutorial.

My deployment file looks like this:

# Copyright 2021 Google LLC
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#      http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

# [START gke_manifests_helloweb_deployment_deployment_helloweb]
# [START container_helloapp_deployment]
apiVersion: apps/v1
kind: Deployment
metadata:
  name: helloweb
  labels:
    app: hello
spec:
  selector:
    matchLabels:
      app: hello
      tier: web
  template:
    metadata:
      labels:
        app: hello
        tier: web
    spec:
      serviceAccountName: hello-app-sa
      containers:
      - name: hello-app
        image: europe-west1-docker.pkg.dev/my-project/hello-repo/hello-app:v1
        ports:
        - containerPort: 8080
        resources:
          requests:
            cpu: 200m
      - name: pgadapter
        image: gcr.io/cloud-spanner-pg-adapter/pgadapter
        ports:
          - containerPort: 5432
        args:
          - "-p my-project"
          - "-i my-instance"
          - "-d my-database"
          - "-x"
        resources:
          requests:
            memory: "1Gi"
            cpu: 200m
# [END container_helloapp_deployment]
# [END gke_manifests_helloweb_deployment_deployment_helloweb]
---

My Kubernetes Service Account (KSA) is named hello-app-sa and is linked to a Google IAM Service Account (GSA) that has permission to access my Cloud Spanner database. The hello-app that I'm deploying is a slightly modified hello-app from the standard quickstart tutorial for GKE. The code looks like this:

/**
 * Copyright 2021 Google Inc.
 *
 * Licensed under the Apache License, Version 2.0 (the "License");
 * you may not use this file except in compliance with the License.
 * You may obtain a copy of the License at
 *
 *   http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 */

// [START gke_hello_app]
// [START container_hello_app]
package main

import (
	"context"
	"fmt"
	"log"
	"net/http"
	"os"

	"github.com/jackc/pgx/v4"
)

func main() {
	// register hello function to handle all requests
	mux := http.NewServeMux()
	mux.HandleFunc("/", hello)

	// use PORT environment variable, or default to 8080
	port := os.Getenv("PORT")
	if port == "" {
		port = "8080"
	}

	// start the web server on port and accept requests
	log.Printf("Server listening on port %s", port)
	log.Fatal(http.ListenAndServe(":"+port, mux))
}

// hello responds to the request with a plain-text "Hello, world" message.
func hello(w http.ResponseWriter, r *http.Request) {
	log.Printf("Serving request: %s", r.URL.Path)
	host, _ := os.Hostname()
	fmt.Fprintf(w, "Hello, world!\n")
	fmt.Fprintf(w, "Version: 3.0.0\n")
	fmt.Fprintf(w, "Hostname: %s\n", host)

	conn, err := pgx.Connect(context.Background(), "postgres://uid:pwd@127.0.0.1:5432/?sslmode=disable")
	if err != nil {
		fmt.Fprintf(w, "Unable to connect to database: %v\n", err)
		return
	}
	defer conn.Close(context.Background())

	var greeting string
	err = conn.QueryRow(context.Background(), "select 'Greeting from PGAdapter!'").Scan(&greeting)
	if err != nil {
		fmt.Fprintf(w, "QueryRow failed: %v\n", err)
		return
	}
	fmt.Fprintln(w, greeting)
}

// [END container_hello_app]
// [END gke_hello_app]

The output from the app is:

Hello, world!
Version: 3.0.0
Hostname: helloweb-d7f8b5679-mqmfm
Greeting from PGAdapter!

@magiccrafter
Author

That's awesome news, thanks @olavloite.
I couldn't find any mention suggesting that this was supported.

    spec:
      serviceAccountName: hello-app-sa
      containers:
      - name: pgadapter
        image: gcr.io/cloud-spanner-pg-adapter/pgadapter
        ports:
          - containerPort: 5432
        args:
          - "-p my-project"
          - "-i my-instance"
          - "-d my-database"
          - "-x"

I hope you don't mind helping with the following questions:

  • What is your advice for the minimum CPU and memory container specs when using the image in a container as a sidecar?
  • What is the behavior when a SIGTERM signal is received? Will PGAdapter wait for all connections to close before shutting down the process?
  • Does PGAdapter have diagnostic REST endpoints for readiness and liveness?
  • Do I have to pass a dummy password in the container-as-a-sidecar setup?
    i.e. postgres://user@some.iam:dummy_password@127.0.0.1:5432/?sslmode=disable

@olavloite
Collaborator

CPU and memory

It obviously depends a lot on your workload, but in general I would recommend 384 MB plus 2 MB of memory per concurrent connection that you make to PGAdapter. So if you expect your application to make, for example, 200 concurrent connections, then reserve 384 MB + 200 × 2 MB = 784 MB of memory.

PGAdapter does not cache very much data, but memory can be required when converting query results from the Cloud Spanner gRPC format to the PostgreSQL wire format. Converting large binary columns in particular can require more memory, as Cloud Spanner and PostgreSQL use completely different formats for those.
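The sizing rule above can be written down as a small helper. This is only a sketch of the arithmetic: the constants come from the recommendation in this comment, while the function name `estimateMemoryMB` and the connection counts are made up for illustration, and real requirements depend on the workload.

```go
package main

import "fmt"

const (
	// Baseline reservation from the recommendation above, in MB.
	baseMemoryMB = 384
	// Additional memory per concurrent connection, in MB.
	perConnectionMemoryMB = 2
)

// estimateMemoryMB returns a rough memory reservation (in MB) for a
// PGAdapter sidecar that serves the given number of concurrent
// connections. Negative inputs are treated as zero connections.
func estimateMemoryMB(concurrentConnections int) int {
	if concurrentConnections < 0 {
		concurrentConnections = 0
	}
	return baseMemoryMB + perConnectionMemoryMB*concurrentConnections
}

func main() {
	// Hypothetical connection counts, chosen only to show the arithmetic.
	for _, n := range []int{10, 50} {
		fmt.Printf("%d connections -> %d MB\n", n, estimateMemoryMB(n))
	}
}
```

The result would typically be rounded up and used as the `resources.requests.memory` value of the sidecar container.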

The proxy is generally not very CPU hungry, as most of what it does is just pass through data. The CPU usage will of course increase with the amount of data that you are sending/receiving. I find it very difficult to give any generic numbers on this, as it will depend on the type of workload. E.g. an application that executes queries that return a large amount of data will require more CPU power per connection than one that only executes queries that return a small number of rows/columns.

One concrete example: On my 8-core MacBook it is enough to assign 2.0 CPUs to the Docker container to get the maximum TPS out of a pgbench test. That is, assigning more CPU to the container does not increase TPS.

One other thing to consider when determining the amount of CPU to assign to the sidecar is the access pattern of your application. If your application mostly accesses the database synchronously, then it will not require much CPU while the proxy does, as it would be waiting for data from the proxy anyway.

SIGTERM

SIGTERM shuts the proxy down without waiting for connections to close. PGAdapter currently does not support any flag like -term_timeout. Please feel free to file a feature request if this is important for your application setup.

Diagnostic endpoints

PGAdapter does not have any diagnostic HTTP endpoints. Currently, the only way to check that it is up would be to use, for example, a psql client to execute a simple SELECT 1 query. Please feel free to file a feature request if this is important for your application setup.

Dummy password

PGAdapter does not require you to supply a dummy username and/or password, but some drivers require it. I believe that pgx is one of them. If the driver that you are using accepts connection strings without a username and password, then PGAdapter will also accept those connections.
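To make the two variants concrete, here is a small sketch that builds the connection URL with or without placeholder credentials. The helper name `connString` is hypothetical, and the values `uid` and `pwd` are the same placeholders used in the hello-app above. Whether the credential-free form is accepted depends on the driver, as described above.

```go
package main

import (
	"fmt"
	"net"
	"net/url"
)

// connString builds a PostgreSQL connection URL for a PGAdapter sidecar
// listening on host:port. If user is empty, no credentials are included.
func connString(host, port, user, password string) string {
	u := url.URL{
		Scheme:   "postgres",
		Host:     net.JoinHostPort(host, port),
		Path:     "/",
		RawQuery: url.Values{"sslmode": {"disable"}}.Encode(),
	}
	switch {
	case user != "" && password != "":
		u.User = url.UserPassword(user, password)
	case user != "":
		u.User = url.User(user)
	}
	return u.String()
}

func main() {
	// Without credentials, for drivers that accept this form.
	fmt.Println(connString("127.0.0.1", "5432", "", ""))
	// With placeholder credentials, for drivers that insist on them.
	fmt.Println(connString("127.0.0.1", "5432", "uid", "pwd"))
}
```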

olavloite added a commit that referenced this issue Mar 12, 2023
Adds documentation and sample for running PGAdapter as a sidecar proxy on GKE.

Fixes #701
olavloite added a commit that referenced this issue Mar 15, 2023
Adds documentation and sample for running PGAdapter as a sidecar proxy on GKE.

Fixes #701