[kubernetes] unable to create cluster with custom vnet #120
Comments
cc @colemickens. I can provide private keys to help debug, but it should be fairly easy to get a repro |
We go straight to the route table that is listed in the config file, but we also check the subnet to see if it's properly configured. Options:
I think #1 might possibly be the right thing to do, depending on if we can support multiple subnets of machines with same route table. If we can, then CC: @brendandburns for any thoughts. |
This presents another question though, where does the route table live. Need to find out if the route table for the subnet in the existing vnet can live in both resource groups, only one, etc. If it can live in the existing-vnet's resource group, then we need to support the full identifier string for the |
Just came back here to report the same issue (finally got around to looking at it from #99, sorry for the delay). My two cents would be that full Resource IDs feels like it would be the azure idiomatic way of declaring resources, makes me wonder if point 3 of yours makes more sense here? |
Thanks @colemickens. Would it not be easiest to modify the I think this is what you meant by option 3. As deploying Kubernetes under a custom VNET never worked, there would be no need to start versioning this config......yet. |
@jpoon How would that help anything? The problem is that the code assumes that all resources are in the same resource group specified in the config file. |
In our case, everything is under the same resource group. Are there situations where people would deploy a VNET under a separate resource group? |
I had assumed as much, but I don't actually know that as a solid fact. Is that something you would have data on? @rgardler or @sauryadas for customers who want to deploy clusters (particularly Kubernetes) into existing vnets... are they generally putting the cluster into the existing resource group, or are the existing vnet and new cluster typically in different resource groups? |
As custom vnets don't work at all, would it be reasonable to do a quick fix to support things in the same RG? |
Due to the fact that the apimodel takes vnetSubnetID as the full identifier string (which I agree with), it means that for this "quick fix" we need two template functions - one to extract the vnet name, and another to extract the subnet name. So that in the template we can write PRs are very welcome, I'm not going to be able to get to this for a while. |
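For anyone picking this up, the two helpers described above could look roughly like the Go sketch below. The names `segmentAfter`, `vnetNameFromSubnetID` and `subnetNameFromSubnetID` are hypothetical and are not taken from acs-engine. The sketch also assumes that vnetSubnetID follows the usual ARM layout, with a `virtualNetworks/<name>` segment followed by a `subnets/<name>` segment; the subscription ID and resource names in `main` are placeholders.

```go
package main

import (
	"fmt"
	"strings"
)

// segmentAfter returns the path segment that follows key in a
// slash-separated resource ID. The key is matched case-insensitively.
func segmentAfter(resourceID, key string) (string, error) {
	parts := strings.Split(strings.Trim(resourceID, "/"), "/")
	for i := 0; i+1 < len(parts); i++ {
		if strings.EqualFold(parts[i], key) {
			return parts[i+1], nil
		}
	}
	return "", fmt.Errorf("no %q segment in %q", key, resourceID)
}

// vnetNameFromSubnetID extracts the unqualified vnet name (hypothetical helper).
func vnetNameFromSubnetID(id string) (string, error) {
	return segmentAfter(id, "virtualNetworks")
}

// subnetNameFromSubnetID extracts the unqualified subnet name (hypothetical helper).
func subnetNameFromSubnetID(id string) (string, error) {
	return segmentAfter(id, "subnets")
}

func main() {
	// Placeholder ID, not a real subscription or resource.
	id := "/subscriptions/00000000-0000-0000-0000-000000000000" +
		"/resourceGroups/example-rg/providers/Microsoft.Network" +
		"/virtualNetworks/example-vnet/subnets/example-subnet"

	vnet, err := vnetNameFromSubnetID(id)
	if err != nil {
		panic(err)
	}
	subnet, err := subnetNameFromSubnetID(id)
	if err != nil {
		panic(err)
	}
	fmt.Println(vnet, subnet)
}
```

Helpers like these would only cover the single resource group case discussed here; they discard the resource group part of the ID rather than using it.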
I have a branch here that might fix this issue for deployments into a single RG. I don't have an easy way of testing it. Is anyone here willing to give it a shot? https://github.com/colemickens/acs-engine/tree/colemickens-pr-fix-custom-vnet |
@colemickens I tested the branch with a config very similar to the custom vnet example. New single RG, new vnet, and so on. Basically just changed the vnet and RG names. The result after applying the templates was the same as before. Both vnetName and subnetName are fully qualified in |
I think I fixed it. Could I get you to pull, rebuild and try again? Thanks so much. |
No dice. Error on the deployment:
|
Okay, I pushed another iteration up, let me know if you try it (thanks for guinea-pigging it, I really appreciate it). |
Hmm, something went wrong again. Both the masters and agents look like this now:
|
@lmickh ha, off by one. Just pushed another one if you want to try. |
@colemickens I'm not seeing a new commit on that branch. |
I force pushed.
|
Sorry @lmickh apparently I commit --amended last night, but forgot to push. I've just pushed the change up now... |
Latest one worked. The short names are listed in azure.json properly and all hosts were able to create routes. |
I'd missed your reply, @lmickh. Thanks very much for dogfooding for me and confirming. |
Cole is fixing in #172 |
@colemickens, we still have the issue from the initial post by @jpoon. We've already spent several days trying to deploy an ACS cluster with k8s in a custom VNET. Even with the fix in #172, it is still not working for us. |
@MoTAUser Can you elaborate? I've had other people report it's working. ACS does not support custom vnet, only ACS-Engine does... |
Hi @colemickens,
Do you see anything suspicious so far? Regards, |
Heh, yes, the last one of course: "kubernetes is successfully installed, but no routes are created." Does the apiserver actually start running, or no? That will help me guess as to why routes aren't being created. |
Hey, thanks for your reply. Unfortunately we are sitting here in Germany and aren't at work anymore and can't access the azure resource. We are going to dig into this again tomorrow morning. Anything else we should look out for? Do you want to have logs of any kind? |
The full logs of |
It seems apiserver is running normally. colemickens-[kubernetes] unable to create cluster with custom vnet #120 logs.zip |
This is now fixed by the merging of #172. |
@MoTAUser Please file a new issue if you're still having issues after updating ACS-Engine, rebuilding and redeploying. Thanks. |
@jpoon I created the custom acs cluster today. I am looking for how to connect my cluster using kubectl, could you please share the steps how to connect my k8s cluster using kubectl. |
@jpoon @colemickens @mogthesprog @lmickh can anyone please share the steps to expose our custom acs pods in loadbalancer service?? When i try to create LB service i got the below error. Events: Normal EnsuringLoadBalancer 1s (x6 over 2m) service-controller Ensuring load balancer Regards, |
Most likely a bad service principal: https://docs.microsoft.com/en-us/azure/aks/kubernetes-service-principal |
What happened:
Creating a k8s cluster using an existing vnet, the cluster is unable to create routes in the Azure Route table, and is therefore unable to schedule any pods.
How to reproduce it:
When the cluster is up, the nodes report as ready:
With a NetworkUnavailable message of "RouteController failed to create a route":
Looking at the kube-controller logs (/var/log/containers):
Notice the error message has a malformed resource:
Microsoft.Network/virtualNetworks/subscriptions.
Workaround
We've traced this to /etc/kubernetes/azure.json expecting unqualified names for both the vnet and subnet. Instead, the fully-qualified names are present:
After changing the subnet and vnet to unqualified names and restarting kubelet, we see the routes being created and things are back to normal.
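As a rough illustration of the workaround, the Go sketch below reduces a fully-qualified resource ID to its unqualified name by keeping only the last path segment. The helper name `shortName` and the example values are hypothetical. It assumes the value is either already a short name or a slash-separated ID whose final segment is the name we want.

```go
package main

import (
	"fmt"
	"strings"
)

// shortName returns the last path segment of a resource ID, or the
// input unchanged if it contains no slash (already unqualified).
func shortName(nameOrID string) string {
	trimmed := strings.TrimRight(nameOrID, "/")
	if i := strings.LastIndex(trimmed, "/"); i >= 0 {
		return trimmed[i+1:]
	}
	return trimmed
}

func main() {
	// Placeholder values, not taken from a real deployment.
	qualified := "/subscriptions/00000000-0000-0000-0000-000000000000" +
		"/resourceGroups/example-rg/providers/Microsoft.Network" +
		"/virtualNetworks/example-vnet"

	fmt.Println(shortName(qualified))        // fully-qualified input
	fmt.Println(shortName("example-subnet")) // already a short name
}
```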
Much of the credit in debugging this goes to @jamesbak.