-
Notifications
You must be signed in to change notification settings - Fork 237
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Azure private endpoint support #1817
Azure private endpoint support #1817
Conversation
Skipping CI for Draft Pull Request. |
Cool. So this new field is per spoke cluster, not per hub? |
@2uasimojo yes, that's the idea. Each cluster has their own private endpoint so it has to be on per cluster basis. |
I tested this in context of ARO and it seems to work as expected: without |
Codecov Report
@@ Coverage Diff @@
## master #1817 +/- ##
=======================================
Coverage 41.68% 41.69%
=======================================
Files 353 353
Lines 33128 33156 +28
=======================================
+ Hits 13810 13823 +13
- Misses 18155 18170 +15
Partials 1163 1163
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
make update
should rebuild the saas template that's causing verify
to fail.
But first...
I've been staring at the REST/HTTP code/docs for a while now, and I can't get my head around how Dial plays in relation to Host. My goal was really just to vet that APIServerEndpointIP is a sensible name for the new field. Because the way it's used makes it feel more like a "proxy". Or does Host not get used at all when Dial is set? Would the original code still work if instead of using APIURLOverride to set Host, we used it to do this DialContext thing? And if so, would that mean that we wouldn't need the new field, and you could just use APIURLOverride for this purpose? If not, what happens if you set both fields?
All of this ignorance makes me leery of introducing this field to the permanent API we have to support until we retire v1.
We have a couple of ways we could make this field "softer" to mitigate those concerns.
- We could wrap it in an alpha feature gate, so in order to use it, you have to explicitly opt into it in HiveConfig, and you're agreeing it might go away at any point without a version bump. (Of course we would consult with you before considering such a change.)
- We could feed it in via an annotation instead of making it a real API field. That way it's not part of the documented API, and we've got the flexibility to change/remove it if we come up with something better later on.
Thoughts?
@@ -668,6 +668,11 @@ type ControlPlaneConfigSpec struct { | |||
// active, Hive will use the override URL for further communications with the API server of the remote cluster. | |||
// +optional | |||
APIURLOverride string `json:"apiURLOverride,omitempty"` | |||
|
|||
// APIServerEndpointIP is the optional override of the API server IP address. | |||
// Hive will use this IP address for creating tcp connections using client-go. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit, I wouldn't mention implementation details like client-go in this docstring.
6901df7
to
0018c56
Compare
@2uasimojo
Let's imagine that we have an API server in kubeconfig set to Without a custom dialer we will be making a TCP connection to With
Host will not be used in transport/dial (TCP, layer 4), but will still be used on the application layer (in the
We considered replacing API server hostname in kubeconfig with IP address such as I believe it will be exact the same issue if we set
Given that we have the following:
I think this is what is going to happen:
I'll leave the decision on how to support this up to you. For ARO it is important to get guarantees that support won't get dropped and that this thing is tested (minimising chances of a regression). |
And then what happens? Does the other end of the connection use IIUC the point is to allow the console and cluster workloads to be public, but hide the API behind a private endpoint, right? Hypothetically, would the effect here be the same as if DNS resolved console.example.com to Regarding unit testing, I thought I had left this comment last week, sorry: |
There could potentially be multiple hosts at the same IP address. It is like setting up
We also support fully private clusters where both api server and ingress are private. To access the cluster you have to be on the same network as the cluster. From ARO resource provider perspective we treat all the cluster as if they were private: we connect to all of them (even public ones) via private endpoint. |
0018c56
to
5e3b2e4
Compare
@2uasimojo I updated Please take another look. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I dig the unit tests ✓
After talking with the team:
We're cool making it a real API field, as it seems like it would be generally useful,
BUT
We (okay, I) would like to understand why this wouldn't be solved simply by repointing DNS for the API server URL to the desired IP
Inline suggestions for a couple of renames.
// APIServerEndpointIP is the optional override of the API server IP address. | ||
// Hive will use this IP address for creating TCP connections. | ||
// +optional | ||
APIServerEndpointIP string `json:"apiServerEndpointIP,omitempty"` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Talked about this with the team.
We would like to propose that the name include the word "Override"; and suggest that "Endpoint" is probably redundant.
Probably also useful to mention that the port from the original API server URL will be used.
// APIServerEndpointIP is the optional override of the API server IP address. | |
// Hive will use this IP address for creating TCP connections. | |
// +optional | |
APIServerEndpointIP string `json:"apiServerEndpointIP,omitempty"` | |
// APIServerIPOverride is the optional override of the API server IP address. | |
// Hive will use this IP address (with the port from the original API URL) for creating TCP connections. | |
// +optional | |
APIServerIPOverride string `json:"apiServerEndpointIP,omitempty"` |
|
||
type dialerMock func(ctx context.Context, network, address string) | ||
|
||
func (d dialerMock) DialContext(ctx context.Context, network, address string) (net.Conn, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, so: This DialContext
gets called by the DialContext
in remoteclient.go:345 as a method on the dialer
we pass into the DialContext
called in the test driver (L352 below) which is actually calling the DialContext in remoteclient.go:334 -- phew!
I think this would be (slightly) more scrutable if the func in remoteclient.go:334 were named something different, like CreateDialContext
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@2uasimojo Np, I renamed to createDialContext
.
CreateDialContext
will be returning DialContext
which is a wrapper around go's standard library DialContext
from Dialer
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I also noticed that in this repo mocks are being generated with GoMock for every interface. I was going to use these mocks, but then realised that I cant: there will be a circular import.
If APIServerEndpointIP is present
5e3b2e4
to
70383b9
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We (okay, I) would like to understand why this wouldn't be solved simply by repointing DNS for the API server URL to the desired IP
We were considering solving it with the DNS however the issue is that we have customers who:
- Use our managed DNS
aroapp.io
- Who use their own custom domain.
To handle clusters with custom domains we would have to create private DNS zone on the ARO RP side and there is a max limit of 1k private DNS zones per subscription. And we group multiple regions supported by ARO RP within a single subscription. So in every ARO RP sub, we would have an upper limit of 1k clusters with custom DNS. Sooner or later we will hit this limit.
|
||
type dialerMock func(ctx context.Context, network, address string) | ||
|
||
func (d dialerMock) DialContext(ctx context.Context, network, address string) (net.Conn, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@2uasimojo Np, I renamed to createDialContext
.
CreateDialContext
will be returning DialContext
which is a wrapper around go's standard library DialContext
from Dialer
.
pkg/remoteclient/remoteclient.go
Outdated
@@ -51,6 +53,10 @@ type Builder interface { | |||
UseSecondaryAPIURL() Builder | |||
} | |||
|
|||
type Dialer interface { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will make Dialer
interface and DialContext
/CreateDialContext
non-exported. These are internal implementation details and don't need to be used externally.
|
||
type dialerMock func(ctx context.Context, network, address string) | ||
|
||
func (d dialerMock) DialContext(ctx context.Context, network, address string) (net.Conn, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I also noticed that in this repo mocks are being generated with GoMock for every interface. I was going to use these mocks, but then realised that I cant: there will be a circular import.
@m1kola: all tests passed! Full PR test history. Your PR dashboard. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Functionally this is good, so I'm happy to land it. Thanks for picking this up @m1kola!
Couple items for a followon PR:
- Let's add to the docstring a statement like, "This field can be used when repointing the APIServer's DNS is not viable option."
- I'd like to clean up the mocking. I think moving the Dialer interface to a separate file would probably be the first move; then we can either continue to generate that mock if we're going to use it (moving the interface should let us avoid the circular import) or just get rid of it.
/lgtm
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: 2uasimojo, m1kola The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
/override tide |
@2uasimojo: Overrode contexts on behalf of 2uasimojo: tide In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
This passed CI and nothing has merged since. ARO team is waiting on it, so I'm going to merge it manually. Otherwise it'd end up retesting. |
It includes support for private endpoint: openshift/hive#1817
It includes support for private endpoint: openshift/hive#1817
@@ -289,6 +295,14 @@ func (b *builder) RESTConfig() (*rest.Config, error) { | |||
} | |||
} | |||
|
|||
if override := b.cd.Spec.ControlPlaneConfig.APIServerIPOverride; override != "" { | |||
dialer := &net.Dialer{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nts: net.Dialer implements DialContext -- that's the prod path. We define the interface so we can swap out the impl for UT.
It includes support for private endpoint: openshift/hive#1817
https://issues.redhat.com/browse/HIVE-1955