Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: add support for Auto IAM AuthN #358

Merged
merged 1 commit into from
Nov 15, 2023
Merged

feat: add support for Auto IAM AuthN #358

merged 1 commit into from
Nov 15, 2023

Conversation

enocom
Copy link
Member

@enocom enocom commented Aug 16, 2023

AlloyDB uses a metadata exchange prior to handing over to the database driver to begin the Postgres protocol. As part of this metadata exchange, the dialer sends the IAM principal's OAuth2 token which the proxy server uses to verify connection permissions and, if IAM AuthN is requested, to authenticate to the database.

dialer.go Outdated Show resolved Hide resolved
dialer.go Outdated Show resolved Hide resolved
dialer.go Show resolved Hide resolved
dialer.go Outdated
"fmt"
"net"
"strings"
"sync"
"sync/atomic"
"time"

alloydbadmin "cloud.google.com/go/alloydb/apiv1beta"
alloydbadmin "cloud.google.com/go/alloydb/apiv1alpha"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

apiv1 should support this feature right?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unless v1 generated clients are not ready yet, this should also work with v1 endpoints.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The v1 changes for go should be out now based on this release: googleapis/google-cloud-go@b44c4b3

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The changes aren't in a released version yet. So I bumped to beta. I'll send a separate PR to bump to v1 when the support is tagged in a release.

dialer.go Outdated
ts := cfg.tokenSource
if ts == nil {
var err error
ts, err = google.DefaultTokenSource(ctx, "https://www.googleapis.com/auth/cloud-platform")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you could use CloudPlatformScope from options.go

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

dialer.go Outdated
"fmt"
"net"
"strings"
"sync"
"sync/atomic"
"time"

alloydbadmin "cloud.google.com/go/alloydb/apiv1beta"
alloydbadmin "cloud.google.com/go/alloydb/apiv1alpha"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unless v1 generated clients are not ready yet, this should also work with v1 endpoints.

}

// Reset IO deadline before read
err = conn.SetDeadline(time.Now().Add(ioTimeout))
Copy link
Contributor

@cthumuluru cthumuluru Nov 7, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You have to clear the deadline too similar to line # 315?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, good catch. Fixed.

dialer.go Outdated
@@ -42,6 +48,9 @@ const (
defaultTCPKeepAlive = 30 * time.Second
// serverProxyPort is the port the server-side proxy receives connections on.
serverProxyPort = "5433"
// ioTimeout is the maximum amount of time to wait before aborting a
// metadata exhange
ioTimeout = time.Minute
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would 30 seconds be enough?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A minute at least is consistent with the timeout we have on the server side if that matters

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am +1 to making this 30 seconds potentially. I think the decision should come down to does 1 minute serve a purpose? If we change it to 30 seconds would that make any difference? (do we expect anything to change between 30s and 60s) If the metadata exchange has not resolved within 30 seconds than something is probably wrong, why wait until a minute to surface the error?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Made it 30 seconds. We can further adjust as needed. It's probably better for the server to be more lenient on the timeout and have the client wait for only 30 seconds.

return err
}

resp := &connectorspb.MetadataExchangeResponse{
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What if the response code is a parameter to this function so you can add a test to simulate the server returning an error?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, good idea. This will require some restructing of the test code. Let's do that in a separate PR. Meanwhile, thinking about error paths made me realize we weren't closing the TLS connection when the metadata exchange protocol fails. The test cleanup will help put an assertion around this.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Filed #427 for this work.

conn, err := d.Dial(ctx, "/projects/my-project/locations/my-region/clusters/my-cluster/instances/my-instance")
if err != nil {
t.Fatalf("expected Dial to succeed, but got error: %v", err)
for i := 0; i < 10; i++ {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why is this for loop necessary?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added a comment for clarity. Short answer: we're ensuring there's no state left over in our shared buffer.

Copy link
Collaborator

@nancynh nancynh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

left a few comments

dialer.go Outdated
@@ -42,6 +48,9 @@ const (
defaultTCPKeepAlive = 30 * time.Second
// serverProxyPort is the port the server-side proxy receives connections on.
serverProxyPort = "5433"
// ioTimeout is the maximum amount of time to wait before aborting a
// metadata exhange
ioTimeout = time.Minute
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am +1 to making this 30 seconds potentially. I think the decision should come down to does 1 minute serve a purpose? If we change it to 30 seconds would that make any difference? (do we expect anything to change between 30s and 60s) If the metadata exchange has not resolved within 30 seconds than something is probably wrong, why wait until a minute to surface the error?

dialer.go Outdated
@@ -86,6 +95,12 @@ type Dialer struct {
// dialFunc is the function used to connect to the address on the named
// network. By default it is golang.org/x/net/proxy#Dial.
dialFunc func(cxt context.Context, network, addr string) (net.Conn, error)

useIAMAuthN bool
tokenSource oauth2.TokenSource
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should this be iamTokenSource for consistency with cloud sql?: https://github.com/GoogleCloudPlatform/cloud-sql-go-connector/blob/main/dialer.go#L97

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, fixed.

dialer.go Outdated
// response is not OK, return the response's error. If there is no error, the
// metadata exchange has succeeded and the connection is complete.
//
// Subsequent interactions with the test server use the database protocol.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this meant to say "test server"?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nope. Fixed.

@enocom enocom force-pushed the auto-iam-authn branch 2 times, most recently from b32005b to b522e31 Compare November 8, 2023 16:35
AlloyDB uses a metadata exchange prior to handing over to the database
driver to begin the Postgres protocol. As part of this metadata
exchange, the dialer sends the IAM principal's OAuth2 token which the
proxy server uses to verify connection permissions and, if IAM AuthN is
requested, to authenticate to the database.
Copy link
Contributor

@ttosta-google ttosta-google left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Contributor

@cthumuluru cthumuluru left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Dailer changes related to metadata exchange looks good to me.

Copy link
Collaborator

@nancynh nancynh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks Eno

@enocom enocom merged commit e50dd25 into main Nov 15, 2023
16 checks passed
@enocom enocom deleted the auto-iam-authn branch November 15, 2023 16:33
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants