Provide example of configuring Private Google Access #308

Closed
morgante opened this issue Nov 6, 2019 · 4 comments · Fixed by #310
Labels
enhancement New feature or request triaged Scoped and ready for work

Comments

@morgante
Contributor

morgante commented Nov 6, 2019

We should add a private networking example, equivalent to the existing public networking example, which includes:

  1. Setting up the required VPC subnets
  2. Setting up Private Google Access
  3. Creating the actual GKE cluster
  4. Any required firewall rules

This would help to address issues like this one.
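
As a rough illustration of steps 1 to 3 above, a Terraform sketch might look like the following. It uses the plain `google` provider resources rather than this module, and every name, region, zone, and CIDR range is a hypothetical placeholder, not something taken from the eventual example. Firewall rules (step 4) are left out because they depend on the rest of the network.

```hcl
# Minimal sketch only: all names, the region/zone, and the CIDR ranges
# below are hypothetical placeholders. Adjust them to your environment.

# Step 1: a custom-mode VPC with one subnet and secondary ranges for GKE.
resource "google_compute_network" "example" {
  name                    = "private-gke-example"
  auto_create_subnetworks = false
}

resource "google_compute_subnetwork" "example" {
  name          = "private-gke-example-subnet"
  region        = "us-central1"
  network       = google_compute_network.example.self_link
  ip_cidr_range = "10.0.0.0/17"

  # Step 2: Private Google Access, so that VMs without external IPs
  # can still reach Google APIs such as storage.googleapis.com.
  private_ip_google_access = true

  secondary_ip_range {
    range_name    = "pods"
    ip_cidr_range = "192.168.0.0/18"
  }

  secondary_ip_range {
    range_name    = "services"
    ip_cidr_range = "192.168.64.0/18"
  }
}

# Step 3: a VPC-native cluster with private nodes on that subnet.
resource "google_container_cluster" "example" {
  name               = "private-gke-example"
  location           = "us-central1-a"
  network            = google_compute_network.example.self_link
  subnetwork         = google_compute_subnetwork.example.self_link
  initial_node_count = 1

  ip_allocation_policy {
    cluster_secondary_range_name  = "pods"
    services_secondary_range_name = "services"
  }

  private_cluster_config {
    enable_private_nodes    = true
    enable_private_endpoint = false
    master_ipv4_cidr_block  = "172.16.0.0/28"
  }
}
```

The key line for this issue is `private_ip_google_access = true` on the subnet: the setting lives on the network side, not on the cluster.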

@morgante morgante added enhancement New feature or request triaged Scoped and ready for work labels Nov 6, 2019
@ideasculptor
Contributor

ideasculptor commented Nov 6, 2019

I believe this is still a bug, not an enhancement. I have had Private Google Access enabled throughout all of my tests (which explains why I failed to note the issue, since I had certainly read every word of the page you linked numerous times before I started documenting the issue), and node pools have consistently failed to come up. When a node pool is not private, Private Google Access has no effect (according to the Google docs), and the nodes, with the default configuration, are unable to download necessary resources from storage.googleapis.com over port 443 until I add routes and rules for public access. It is possible that the failure is something different when I run a completely private node pool, but since I can't get that configuration to come up at all, I have limited ability to diagnose the problem.

I can well believe that the nodes need to use a different hostname in order to resolve to the Private Google Access endpoint, but there is no mention of that in the documentation for the module, nor even an obvious way to override the URL.

@morgante
Contributor Author

morgante commented Nov 6, 2019

> I believe this is still a bug, not an enhancement.

It's not a bug. Nothing in our module is meant to configure Private Google Access for you, and we explicitly don't attempt to control networking from the module.

The fact that your chosen network config didn't have a workable path to GCP APIs isn't within the scope of this module.

I think you're overfocusing on this module. Private Google Access isn't specific to GKE and if you'd attempted to use GCE on that network it'd also have similar issues with accessing any GCP samples.

I know you spent a lot of time troubleshooting this, but we really can't cover everything. If you had attempted to manually create the same GKE cluster through the console, you'd have had the same issue.

Anyway, the Private Google Access docs do cover the requirements for setting it up: https://cloud.google.com/vpc/docs/configure-private-google-access#requirements

We can add examples, but claiming this is a bug when it's an issue with your underlying network config is inaccurate. Reminder: most customers we work with explicitly separate network creation/management from the GKE clusters themselves.

I understand your frustration, but continuing to claim this is a bug isn't going to help. We are happy to look at pull requests, and will also prioritize this enhancement in due time, but our staffing is quite constrained and we can't be expected to repeat the GCP docs.

@ideasculptor
Contributor

ideasculptor commented Nov 6, 2019

First, my code explicitly separates network creation/management from the GKE cluster, as you suggest. I'm not sure why you think mine doesn't. I've repeatedly stated that I use the network module for creating networks/subnets.

My subnet DOES enable Private Google Access, and always has.

You keep claiming that my underlying network config is incorrect, but I have yet to see why you believe this to be true. Never mind that I have tried every possible variation of every network and gke variable that isn't purely aesthetic (names and such).

At this point, I have created a private zonal cluster with networking example, which does work correctly. My assumption is therefore that there is a problem with private clusters coexisting with Shared VPC, or else last weekend's 3-day outage permanently corrupted the state of my service project, host project, or both. There isn't an alternative explanation: my code is now identical to the working example I just created, except for the Shared VPC setup, and my projects were created either immediately before the outage or coincident with it (whereas the ci-gke project and its networks were only just created). Additionally, there is nothing that I can find in any of the GCP or module documentation which suggests that Private Google Access is incompatible with Shared VPC networking.

And again, I have no reason to suspect any problem with my Shared VPC setup, which is identical to every Shared VPC example I can find. I use only the project-factory module to create both the host and client projects, and the client project is set up to access the host project network when the project is created by the project-factory module. I have no trouble at all launching individual compute instances into the host network from the same service project I am using to launch the GKE cluster (via the bastion module, for example, which works just fine).

I'll inevitably set about creating a 'private zonal with networking and shared vpc' example, next. But first, I guess I'm going to tear my entire account back down to the studs and recreate everything.

Update: Rather surprisingly, when I tore down my networks and put them back in exactly the same condition, with no code modified, I was able to bring up a private cluster successfully. It would appear that I've been battling subnets/networks that were still in a broken state after last weekend. At least, I can't come up with another obvious explanation for my problems until now, or for it suddenly working after the subnets were recreated.

I'll send a PR for the new example as soon as I get the test verification code to work. I don't know much about the kitchen tests, other than how to run them, and I'm getting odd errors with what I have now.

@ideasculptor
Contributor

Note: the simple example in the PR would be well served by including the firewall network module and an explicitly declared node pool, with tags to enable the minimum network permissions. Still, it at least shows setting up the subnets and getting a private cluster running.


3 participants