-
Notifications
You must be signed in to change notification settings - Fork 789
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add AWS_MSK_IAM SASL Mechanism #763
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Exciting to see this shape up!
Thanks for the quick response! I think I've addressed all the feedback here. |
Just checking in to see if you want anything else done / what the next steps are |
@cmaher sorry about that, I had a couple other questions I wanted to raise but the last few days have been a little hectic for me. |
sasl/aws_msk_iam/msk_iam.go
Outdated
// The sigv4.Signer to use when signing the request. Required. | ||
Signer *sigv4.Signer | ||
// The host of the kafka broker to connect to. Required. | ||
Host string |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any time I've gone to set up an MSK cluster, each broker (which requires more than 1 in all cases) has it's own distinct host name. So far, I have not found a unified host name like I see in other services with clusters (eg: Redshift, Elasticache) so should we be supporting multiple values here? (it looks like the host name is part of the signing process, so maybe that isn't as easy at it seems?)
Additionally, it seems like the MSK API exposes endpoints to retrieve the broker host names, depending on the specific authentication strategy being used. I think it would be more future-proof to allow using the MSK API to perform this look-up internally, but I think we can pass on this bit of "magic" for now and make it a later enhancement.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're right. I was using a single broker in my experiments, but the host needs to be dynamic. I think the best way to do this while maintaining the existing sasl interface is to set the host on the context. This approach raises some questions, however, which I'll point out in my commit
@@ -258,6 +258,8 @@ func (d *Dialer) connectTLS(ctx context.Context, conn net.Conn, config *tls.Conf | |||
return | |||
} | |||
|
|||
const ContextKeyBrokerAddr = "brokerAddr" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was wondering where the proper place to put this is, assuming you want to use constants instead of literals
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks like this is actually unnecessary now that sasl.MetadataFromContext
is being used. Would you mind removing this const and updating the tests that still reference it?
@@ -6,3 +6,5 @@ require ( | |||
github.com/aws/aws-sdk-go v1.41.3 | |||
github.com/segmentio/kafka-go v0.4.21 | |||
) | |||
|
|||
replace github.com/segmentio/kafka-go => ../.. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is needed for the tests to pass in this pull-request, but it should not be merged. How should this be handled? 2 separate pull-requests?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah yeah, in that case I would think we can just remove this line before merging, once we've gotten tests to pass. I would rather avoid 2 PRs.
Indent json for godoc Co-authored-by: Achille <achille.roussel@gmail.com>
c054939
to
fb5d688
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like we're getting closer!
@@ -258,6 +258,8 @@ func (d *Dialer) connectTLS(ctx context.Context, conn net.Conn, config *tls.Conf | |||
return | |||
} | |||
|
|||
const ContextKeyBrokerAddr = "brokerAddr" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks like this is actually unnecessary now that sasl.MetadataFromContext
is being used. Would you mind removing this const and updating the tests that still reference it?
@@ -6,3 +6,5 @@ require ( | |||
github.com/aws/aws-sdk-go v1.41.3 | |||
github.com/segmentio/kafka-go v0.4.21 | |||
) | |||
|
|||
replace github.com/segmentio/kafka-go => ../.. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah yeah, in that case I would think we can just remove this line before merging, once we've gotten tests to pass. I would rather avoid 2 PRs.
} | ||
signUrl := url.URL{ | ||
Scheme: "kafka", | ||
Host: saslMeta.Host, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Locally, my tests panic because saslMeta
does not exist, which I think will be corrected by updating your test cases.
That being said, should we guard against this and return an explicit error message if we detect this? (rather than panicking)
}{ | ||
{ | ||
ctx: func() context.Context { | ||
return context.WithValue(context.Background(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think these need to be updated to use sasl.WithMetadata
.
Howdy again! Sorry about the delay, I've been trying to get this code to work with a manually-provisioned MSK cluster just to make sure I've at least seen it in action for myself (since as you pointed out, unit-testing this is hard/unfeasible). Thankfully, after some trial and error of my own, I was able to get it to work! That makes me more comfortable moving forward with merging even without some added test coverage beyond what we have here. I'll follow up on the existing threads, as I think we have a few outstanding ones to finish addressing. That being said, this is looking to be really close. |
@@ -40,6 +40,7 @@ jobs: | |||
paths: | |||
- /go/pkg/mod | |||
- run: go test -race -cover ./... | |||
- run: find ./sasl -name go.mod -execdir go test -race -cover ./... \; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I know this was my idea, but it looks like it was a bad one, sorry about that! It looks like the exit code from the -execdir
is being lost, so CircleCI is treating these as passing even though they panic. :(
If we don't have any other ideas, feel free to revert back to what you had before.
Hey, thanks for getting back to this. I probably don't have time this week, but I'll be able to get to it within the next two weeks after the holiday. |
This PR has been addressed by #798 so I will close this. Thank you so much for your contribution! |
Addresses #661
I've verified this against a live MSK cluster and confirmed the behavior relative to other implementations. Unfortunately, live connections can't be unit-tested, so I'm testing the signature generation, which is the heart of this mechanism. Also, as discussed in the original ticket, I've made this a submodule, with the intent that it won't cause conflicts with general user code. Please let me know if I need to change/add anything.