
vRack (Private network) integration #15
Closed
mhurtrel opened this issue Oct 22, 2020 · 44 comments
Labels
functionnal and UX (new functional feature or experience improvement) · Security & Compliance (new certifications or security feature improvements)

Comments

@mhurtrel
Collaborator

mhurtrel commented Oct 22, 2020

As a Managed Kubernetes Service user
I want my worker nodes to be deployed in one of my private networks
so that I can expose and access other IaaS resources deployed in my vRack

Note:
The choice of the private network used for a given cluster will be made at cluster creation.
Beta documentation: here is the full updated documentation: https://dl.plik.ovh/file/lc9J4dqIITkRBVhL/mhppqw02G1gENR7d/vrackbetafinalstage-allexceptmanager.pdf

@mhurtrel added the functionnal and UX, Managed Kubernetes Service and Security & Compliance labels Oct 22, 2020
@mhurtrel changed the title from "MKS vRack (Private network) integration" to "vRack (Private network) integration" Oct 26, 2020
@mhurtrel
Collaborator Author

mhurtrel commented Nov 17, 2020

Private beta has started for a selection of customers.
Our load balancer colleagues plan to deliver the public IP to private nodes integration in January.

@mhurtrel
Collaborator Author

mhurtrel commented Nov 27, 2020

We can now welcome private beta testers.

@mhurtrel
Collaborator Author

mhurtrel commented Jan 15, 2021

The beta is now open to anyone and the feature is self-service.

Please note that there are still known limits at this point:

  • compatibility with LBaaS is targeted for late February
  • the functionality is accessible only through the HTTP API; web control panel and Terraform resource support are planned for late February/early March

@mhurtrel mentioned this issue Feb 3, 2021
@mhurtrel
Collaborator Author

mhurtrel commented Feb 4, 2021

The feature (with LB public to private integration) will be available in early March.

@mhurtrel
Collaborator Author

The current ETA for the feature with full LB public to private support is now the 24th of March

@telenieko

The current ETA for the feature with full LB public to private support is now the 24th of March

🎉 thanks for the update on the ETA!

@Escaflow

We gave it a go as part of our evaluation, but we hit the (expected) wall where the LB can't use the internal CIDR yet:

Error syncing load balancer: failed to ensure load balancer: 400 Bad Request
(EU.ext-1.***.***.****) - https://api.ovh.com/1.0/cloud/project/*****/loadbalancer/*****/configuration: 
{
    "class":"Client::BadRequest::UnprocessableEntity",
    "message": "private IPs are not allowed on this loadbalancer (server:staging-node-03bba2 -\u003e 10.3.174.16)"
}

What would be awesome is if we could define internal load balancers with annotations (https://kubernetes.io/docs/concepts/services-networking/service/#service-tabs-5).

Either use the "official" OpenStack annotation:

[...]
metadata:
  name: my-service
  annotations:
    service.beta.kubernetes.io/openstack-internal-load-balancer: "true"
[...]

Or stick with the OVH branding:

[...]
metadata:
  name: my-service
  annotations:
    service.beta.kubernetes.io/ovh-internal-load-balancer: "true"
[...]

We currently have valid reasons for some internal LBs (e.g. clients that only communicate over VPN access, security for internal apps, customer data flows, ...) where we neither want nor need an external load balancer.
While we could just use a NodePort and DNAT it to the service, we would like to keep the current convenience of simply using the LB internally.
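
For illustration, here is what a complete Service manifest using the upstream OpenStack annotation could look like (a minimal sketch only; OVH has not confirmed support for this annotation, and the names and ports are placeholders):

apiVersion: v1
kind: Service
metadata:
  name: my-service
  annotations:
    # request an internal (private network) load balancer instead of a public one
    service.beta.kubernetes.io/openstack-internal-load-balancer: "true"
spec:
  type: LoadBalancer
  selector:
    app: my-app
  ports:
    - port: 80
      targetPort: 8080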

@mhurtrel
Collaborator Author

We have an additional slight delay, due to our Network colleagues helping customers affected by the SBG incident a few days ago. We now expect to ship the feature next week.

@mhurtrel
Collaborator Author

My Network colleagues are informing me of a new ETA which would make the feature available in the week starting on the 5th of April... really sorry about these multiple delays.

@mhurtrel added the Available in Beta label Mar 30, 2021
@mhurtrel
Collaborator Author

mhurtrel commented Apr 4, 2021

My Network colleagues are unfortunately informing me of an additional delay, and will communicate an updated ETA soon.

@mhurtrel
Collaborator Author

We should be able to activate the LB compatibility next week. The vRack feature will then be GA, and we will release the control panel integration in the following weeks.

@mhurtrel
Collaborator Author

The feature is now compatible with external load balancers orchestrated with Kubernetes.
The feature is available in all Kubernetes regions, for existing and new clusters, whether or not they have already started experimenting with vRack.

The feature is currently available through the OVH HTTP API and will soon be available from the control panel (a.k.a. Manager).

Here is the full updated documentation: https://dl.plik.ovh/file/lc9J4dqIITkRBVhL/mhppqw02G1gENR7d/vrackbetafinalstage-allexceptmanager.pdf

@cambierr

Any chance to have this in GRA11?

@mhurtrel
Collaborator Author

The feature will be available in the control panel (aka Manager) in about 2 days.
@cambierr Kubernetes is not deployed in GRA11 yet (GRA11 was accelerated following the situation in SBG).
As a workaround, any customer with GRA11 should also have GRA5 available where Kubernetes is deployed.
I will check on an ETA for GRA11 too, sorry for the inconvenience.

@ryam4u

ryam4u commented Apr 26, 2021

When I create a cluster with vRack support following the guide, Kubernetes can't provision an IP for the LB; it is stuck in the "PENDING" state. I've waited for over an hour.

Any ideas?

@mhurtrel
Collaborator Author

@ryam4u this looks like a specific incident; I confirm you should get an IP assigned. Please open a support ticket (note that the load balancer team is also available, on a best-effort basis, on https://gitter.im/ovh/cloud-loadbalancer).

@matmicro you can find the documentation as a PDF link in the issue above. The feature should be available in the control panel in about 2 days.

@cambierr

@mhurtrel cool; tell me as soon as you have an ETA... I really want to avoid having to set up networking in a new region :)

@ZuSe

ZuSe commented Apr 27, 2021

@mhurtrel
I am getting a 500 when trying to create a new cluster as described in the docs.
Are you guys on it?


@mhurtrel
Collaborator Author

@ZuSe We really have to improve the quality of our error returns, our apologies for this.
From what I see here, the flavor you asked for is not yet available (we are currently compatible with the B2, R2, C2 and i1 flavor families; D2-4 and D2-8 will come in a few weeks).

However, if this change doesn't solve the issue, please open a ticket or exchange with other users on https://gitter.im/ovh/kubernetes, who I think will figure out what's wrong.

@ZuSe

ZuSe commented Apr 27, 2021

Hi @mhurtrel

I set it up with b2-15 now. However, also note that you need to make sure to write the flavor in lowercase characters (e.g. B2-15, copy-pasted from the web, won't work).

Is there an ETA for when the Discovery nodes will be available?

And a second question: how can I migrate existing clusters to my private network/vRack?

@mhurtrel
Collaborator Author

Concerning the D2, it is a matter of weeks. You can subscribe to #19 to be notified. If you have paid for monthly instances, you can enable the private network and keep those instances by resetting the cluster, but you can't add or change the network used for an existing cluster without resetting it. You may want to use a tool like Velero to move workloads from one cluster to another.

@ZuSe

ZuSe commented Apr 28, 2021

Hi @mhurtrel

thanks for the clarification. I will give Velero a try. Do you know if it works with the OVH Object Storage S3 API?
I know a couple of S3 clients that only accept the AWS implementation.

@ZuSe

ZuSe commented Apr 28, 2021

I just gave it a try. It works ;)

For those who are interested, this is my install command:

velero install --provider aws --plugins velero/velero-plugin-for-aws:v1.2.0 --bucket backups --secret-file ./credentials-velero --backup-location-config s3Url=https://storage.gra.cloud.ovh.net,s3ForcePathStyle=true,region=GRA --snapshot-location-config region=GRA,s3Url=https://storage.gra.cloud.ovh.net --features=EnableCSI
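
As a usage sketch, once Velero is installed this way on both the source and the target cluster (pointing at the same bucket), a backup of selected namespaces can also be declared as a velero.io/v1 Backup object rather than via velero backup create (the namespace and backup names below are placeholders):

apiVersion: velero.io/v1
kind: Backup
metadata:
  name: pre-migration
  namespace: velero
spec:
  # namespaces to include in the backup (placeholder)
  includedNamespaces:
    - my-app
  # backup storage location created by the install command above
  storageLocation: default
  # keep the backup for 30 days
  ttl: 720h0m0s

The backup can then be restored on the target cluster with velero restore create --from-backup pre-migration.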

@mhurtrel removed the Available in Beta label Apr 28, 2021
@mhurtrel
Collaborator Author

The feature is now available in the control panel (a.k.a. Manager).
Full documentation should be made available on https://docs.ovh.com/gb/en/kubernetes/ before the end of the week.

@pzalews

pzalews commented Apr 29, 2021

When could we expect support for this feature in the Terraform provider?

@mhurtrel
Collaborator Author

Hi @pzalews. My colleague @d33d33 is currently working on it.

@mhurtrel
Collaborator Author

mhurtrel commented May 7, 2021

@pzalews The private network feature can now be used within the OVHcloud Terraform provider: ovh/terraform-provider-ovh#189

@vdieulesaint

Hi @mhurtrel ,
How do we add a vRack to an existing K8s cluster?
Thanks.

@giudicelli

How do we add a vRack to an existing K8s cluster?

You cannot; you will have to reset your K8s cluster to be able to attach a vRack to it...

@fkalinowski

Hi @mhurtrel,

I'm currently testing the Managed K8S inside a vRack.

Here is my setup:

  • Subnet 172.16.1.0/24 (VLAN 0) is used by my baremetal hosts in the vRack
  • Subnet 172.16.5.0/24 (VLAN 0) is configured for any OpenStack VM popped in my Public Cloud Private Network (via Horizon UI)
  • Gateway (pfSense) with 2 network interfaces (172.16.5.1 + 172.16.1.252), deployed as an OpenStack instance in my Public Cloud, to route traffic between both subnets 172.16.1.0/24 and 172.16.5.0/24
  • The Public Cloud Subnet is also configured (via Horizon UI) to enable DHCP + provide the Gateway 172.16.5.1 + provide DNS server 172.16.1.1 + push the route 172.16.1.0/24 via 172.16.5.1 (i.e. the pfSense gateway)

To fully test this setup, I've deployed a D2-2 (Ubuntu 20.04) instance in my Public Cloud, here is the result:

  • the instance gets an IP in the expected DHCP range of subnet 172.16.5.0/24 ==> OK
  • I can ping the instance from the same subnet (via pfSense) ==> OK
  • I can ping the instance from the other subnet (via baremetal host) ==> OK
  • the instance has the appropriate routes to reach subnet 172.16.1.0/24 ==> OK
  • the configured DNS servers are available in /etc/resolv.conf and DNS resolution works ==> OK

After validating my setup, I've configured a Managed K8S inside the appropriate Private Network/vRack, here is the result:

  • the worker nodes get an IP in the expected DHCP range of 172.16.5.0/24 ==> OK
  • I can ping the worker node from the same subnet (via pfSense) ==> OK
  • I CANNOT ping the worker node from the other subnet ==> NOK
  • the worker node DOES NOT have the appropriate routes to reach subnet 172.16.1.0/24 ==> NOK
  • If I manually add the appropriate route via a Pod with hostNetwork and NET_ADMIN capability, then the routing works...
  • the configured DNS servers are NOT available in the worker NODES ==> NOK

In conclusion, it seems that OpenStack subnet configuration (Gateway, Routes, DNS servers) is NOT honored by provisioned Managed K8S Worker Nodes.

How can we enforce the DNS + routes + gateway configuration on the worker nodes?

@Escaflow

Escaflow commented Jul 1, 2021

@fkalinowski we hit the same issue, our setup is similar...

Our workaround is to use a DaemonSet that modifies the worker routing table with k8-route and to configure the Corefile accordingly:

[...]
example.com:53 {
    forward . 172.16.5.1:53
    errors
    cache 30
}
[...]

There are considerable drawbacks to doing it this way, mainly if you want to use an on-premises registry and try to download your images from there (since CoreDNS isn't involved in that case).
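
For illustration, such a route-fixing DaemonSet could look like the following minimal sketch (assuming the 172.16.1.0/24 subnet and 172.16.5.1 gateway from the setup described above; the names and images are placeholders, not the exact manifest we run):

apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: vrack-static-route
  namespace: kube-system
spec:
  selector:
    matchLabels:
      app: vrack-static-route
  template:
    metadata:
      labels:
        app: vrack-static-route
    spec:
      hostNetwork: true
      tolerations:
        - operator: Exists
      initContainers:
        - name: add-route
          image: alpine:3.18
          securityContext:
            capabilities:
              add: ["NET_ADMIN"]
          # add the route to the bare-metal subnet via the pfSense gateway on the node itself
          command: ["sh", "-c", "ip route add 172.16.1.0/24 via 172.16.5.1 || true"]
      containers:
        - name: pause
          image: registry.k8s.io/pause:3.9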

@matmicro

matmicro commented Jul 1, 2021

I can read in the documentation that we should not use the 10.2.x.x subnet:
https://docs.ovh.com/gb/en/kubernetes/known-limits/#known-not-compliant-ip-ranges

I can see on my Kubernetes cluster that it uses 10.3.x.x.

Is there a mistake here?
Shall I keep using 10.3 on my vRack? Will there be any conflicts if I want to attach this vRack to my K8s clusters?

@lchdev

lchdev commented Jul 1, 2021

There are considerable drawbacks to doing it this way, mainly if you want to use an on-premises registry and try to download your images from there (since CoreDNS isn't involved in that case).

@Escaflow We worked our way around this by also forcing our own nameserver in the /etc/resolv.conf of the host (mounting the host filesystem in a privileged container also deployed as a DaemonSet). This also avoids the need to edit the CoreDNS configuration, but requires a restart of the CoreDNS pods (DNS configuration is injected by kubelet only when the pod is started).

Seems to be working fine for now...
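
For illustration, that approach could be sketched roughly like this (assuming the internal DNS server 172.16.1.1 from the setup above; note that the custom resolver must still be able to resolve public FQDNs, and the names and images are placeholders):

apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: node-resolv-conf
  namespace: kube-system
spec:
  selector:
    matchLabels:
      app: node-resolv-conf
  template:
    metadata:
      labels:
        app: node-resolv-conf
    spec:
      tolerations:
        - operator: Exists
      initContainers:
        - name: write-resolv-conf
          image: alpine:3.18
          securityContext:
            privileged: true
          # overwrite the node's resolver configuration with the internal DNS server
          command: ["sh", "-c", "printf 'nameserver 172.16.1.1\\n' > /host/etc/resolv.conf"]
          volumeMounts:
            - name: host-etc
              mountPath: /host/etc
      containers:
        - name: pause
          image: registry.k8s.io/pause:3.9
      volumes:
        - name: host-etc
          hostPath:
            path: /etc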

@mhurtrel
Collaborator Author

mhurtrel commented Jul 1, 2021

Thanks a lot for your feedback! I want to stress that we are aware of those limitations and we are checking with the team:

  • to document your approach, @Escaflow and @lchdev, for other users in the short term
  • to offer an easier approach that allows you to define your custom gateway if you want to do so (in the mid term)

@fkalinowski

Hi @mhurtrel,

Just to be sure we are all aligned, there are two distinct problems:

  1. the ability to add static routes (i.e. a custom gateway/interface per subnet)
  2. the ability to provide custom (internal and/or external) DNS servers (in addition to or as a replacement of the default ones)

In summary, any configuration that we are able to provide through the DHCP configuration at the OpenStack level.

@mhurtrel
Collaborator Author

mhurtrel commented Aug 3, 2021

Our colleague @LostInBrittany wrote a guide describing how to manually define routes to manage this advanced use case (a Kubernetes cluster in a vRack private network that needs to access other private networks in the same vRack): https://docs.ovh.com/gb/en/kubernetes/vrack-example-between-private-networks/. This will be useful until the default gateway feature is developed (#116).

@fkalinowski

Hi @mhurtrel @LostInBrittany,

Thanks for the guide!
Any ETA for the definitive solution that will automatically define these additional routes?

Unfortunately it does not cover the second problem, "the ability to provide custom (internal and/or external) DNS servers (in addition to or as a replacement of the default ones)".
Any clue on how to do this manually, or an ETA for an integrated/automatic solution to this problem?

@mhurtrel
Collaborator Author

@fkalinowski you can follow the progress on #116. I will give an ETA there as soon as possible (my current guess would be October or November).

Concerning your DNS, you can change those default DNS servers by deploying a DaemonSet pod. We don't have specific documentation for this use case, but this one should help: https://kubernetes.io/docs/tasks/administer-cluster/nodelocaldns/

@fkalinowski

@mhurtrel, thanks for your reply, but regarding the Node Local DNS approach: this only solves the problem of DNS queries within Pods. It does NOT solve the problem of DNS queries within the node.

It's this second use case we need to address, because we fetch some Docker images from a Docker registry which is only reachable inside the vRack. This is required to avoid exposing our Docker registry on the Internet (and thus increase our security).

At the moment, the only workaround we have identified is to override the resolv.conf file of each node at startup via a DaemonSet...

@mhurtrel
Collaborator Author

@fkalinowski just confirming that we have supported CoreDNS configuration since last October: #184

@fkalinowski

Hi @mhurtrel, thanks for notifying me, but as far as I know CoreDNS solves private DNS resolution at the Pod level, still not at the node level.
This means that overriding the CoreDNS configuration through a custom ConfigMap will not help containerd resolve the service name of our OCI image registry, and so will not allow us to deploy private Docker images on our K8s cluster.
Can you confirm my understanding?

@mhurtrel
Collaborator Author

mhurtrel commented Jan 26, 2023

Indeed, for your use case the DaemonSet forcing a custom DNS resolver is the best solution. Please note that it is critical that your custom node DNS resolver resolves public FQDNs, to ensure the normal functioning of our systems.

We may improve this when DNSaaS becomes part of our public cloud portfolio.

@clement-igonet

@mhurtrel, thanks for your reply, but regarding the Node Local DNS approach: this only solves the problem of DNS queries within Pods. It does NOT solve the problem of DNS queries within the node.

It's this second use case we need to address, because we fetch some Docker images from a Docker registry which is only reachable inside the vRack. This is required to avoid exposing our Docker registry on the Internet (and thus increase our security).

At the moment, the only workaround we have identified is to override the resolv.conf file of each node at startup via a DaemonSet...

So what about isolating an OVH managed Docker registry inside a vRack environment?

@antonin-a
Collaborator

antonin-a commented Dec 20, 2023

Hello @clement-igonet, this feature is on our roadmap for 2024 and it can be tracked using this issue: #541
