
PV Migration across cluster without LB svc #103

Closed
jkroepke opened this issue Oct 6, 2021 · 10 comments
@jkroepke

jkroepke commented Oct 6, 2021

Is your feature request related to a problem? Please describe.
Use case: on-premises cluster migration.
On-premises clusters don't always have the ability to create Service type=LoadBalancer. Currently, pv-migrate only supports lbsvc as the strategy between multiple clusters.

Describe the solution you'd like
Run rsync through the Kubernetes API

With kubectl exec it's possible to pipe data through the Kubernetes API.

The kubectl built-in command kubectl cp uses this, but it only supports tar instead of rsync.

```bash
rsync -avurP --blocking-io --rsync-path= --rsh="kubectl exec $POD_NAME -i -- " /dir rsync:/dir
```
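
This rides entirely on the apiserver connection. As a concrete illustration, here is the same trick in the pull direction - a minimal sketch assuming the pod's image ships an rsync binary (pod name and paths are hypothetical):

```bash
# Hypothetical pod name; the image running in it must contain rsync.
POD_NAME=source-pod

# rsync assembles the remote command as: <rsh> <host> <rsync-path> --server ...
# The dummy host "rsync" plus the emptied --rsync-path collapse this to
# "kubectl exec $POD_NAME -i -- rsync --server ...", so all data is piped
# through the Kubernetes API.
rsync -avurP --blocking-io --rsync-path= \
  --rsh="kubectl exec $POD_NAME -i -- " \
  rsync:/data/ ./data/
```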

Describe alternatives you've considered

kubectl cp

Additional context
As far as I know, the OpenShift CLI supports rsync as a copy mechanism, see: https://github.com/openshift/origin/blob/release-3.11/pkg/oc/cli/rsync/rsync.go
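
For reference, a typical oc rsync invocation looks something like this (pod, directory, and namespace names are made up):

```bash
# Copy a local directory into a pod; oc rsync uses rsync if it is available
# in the container image and falls back to tar otherwise.
oc rsync ./local-dir my-pod:/remote-dir -n my-namespace
```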

@utkuozdemir
Owner

Thanks for the suggestion, it makes sense.

I had the idea to do something like this in the past - transferring the data over the client machine (your laptop) in cases where the only possible "bridge" between the 2 clusters is the client machine that is accessing them. I might look into implementing a new strategy for this, but probably not soon (the earliest I can start looking into it is a month from now).

Of course, PRs welcome :)

@fabiorauber

There is a way to make it work for on-premises clusters: by installing MetalLB (https://metallb.universe.tf), which can provide ServiceType=LoadBalancer services at layer 2.
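
For anyone who goes this route, a minimal layer-2 setup is roughly the following sketch (MetalLB releases of this era are configured via a ConfigMap; the address range is an assumption and must be unused IPs on the nodes' L2 segment):

```bash
# Minimal MetalLB layer-2 address pool (ConfigMap-based configuration).
# Adjust the address range to free IPs in your nodes' network.
kubectl apply -f - <<'EOF'
apiVersion: v1
kind: ConfigMap
metadata:
  namespace: metallb-system
  name: config
data:
  config: |
    address-pools:
    - name: default
      protocol: layer2
      addresses:
      - 192.168.1.240-192.168.1.250
EOF
```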

@utkuozdemir
Owner

That's correct, but it would only work if you have layer 2 connectivity or routing between the 2 clusters, or if you have public IP ranges that you can give to MetalLB to allocate.

There are many on-prem setups where none of these are in place due to how the networking is set up, security reasons, or a lack of public IPs and/or permissions.

@fabiorauber

fabiorauber commented Oct 19, 2021

You are right @utkuozdemir. I just wanted to document the possibility for anyone who may stumble upon this issue in the future.

@gernoteger

Hi,
I am currently exploring the same situation. I have an age-old OpenShift 3 cluster, and installing MetalLB is not an option.
The K8s API is definitely an option from a functionality point of view. Performance-wise, though, you'll really stress your poor API server with lots of data.
Another option could be extending the current LB strategy to use regular TLS together with stunnel; this would move the possibly large amount of data across infrastructure that's built for exactly that. I'll give it a look.

@utkuozdemir
Owner

utkuozdemir commented Nov 26, 2021

@gernoteger Very good point - it will stress out the apiserver. Thank you.

I was actually looking into something similar right now - not over the apiserver, but via a tunnel, to do something like this: https://unix.stackexchange.com/a/183516

Edit: On second thought, this method would not add any value over the lbsvc strategy - we would still need access from the internet, and ServiceLBs would need to work on both clusters.

Even though it would put a lot of load on the apiserver, I think it makes sense to do the transfer over it - this new strategy (let's call it local) would mainly bring value if both clusters are sort of air-gapped (or at least don't offer ServiceLB / are not accessible from the internet) and kubectl on the local machine is the only way to talk to both of them.

@gernoteger

I agree that a local strategy involving the local client is robust from a connectivity point of view. I was rather thinking of an additional strategy like svclb, but with ingress.
In this case the data would never need to be persisted outside the clusters (important from a data protection point of view), and you would only have to transfer it once.
Of course, the pods inside the receiving cluster would have to have access to the ingress endpoints of the sending cluster (or the other way round, but one direction should suffice).

All in all, I think both approaches - API and ingress - have their use cases, albeit slightly different ones.

@jkroepke
Author

Most ingress routers support TLS passthrough. As far as I know, stunnel supports SNI.

So using rsync through stunnel might be a good option, too.

https://charlesreid1.com/wiki/Stunnel/Rsync
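
A rough sketch of what the client side could look like, assuming the receiving cluster exposes an stunnel-wrapped rsync daemon behind a TLS-passthrough ingress (hostnames, ports, and the module name are all made up):

```bash
# Client-side stunnel: plaintext rsync on 127.0.0.1:8873 gets wrapped in
# TLS and sent to the cluster ingress; SNI selects the right backend.
cat > stunnel-client.conf <<'EOF'
client = yes

[rsync]
accept  = 127.0.0.1:8873
connect = ingress.example.com:443
sni     = rsync.example.com
EOF
stunnel stunnel-client.conf

# rsync then speaks the plain rsyncd protocol to the local tunnel endpoint.
rsync -av /dir/ rsync://127.0.0.1:8873/data/dir/
```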

@utkuozdemir
Owner

utkuozdemir commented Nov 28, 2021

I've been working on this branch, which uses a combination of port-forward and a reverse SSH tunnel to do the sync entirely over the client: https://github.com/utkuozdemir/pv-migrate/tree/local-strategy
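
Conceptually, the plumbing is something like the following sketch (this illustrates the idea, not the branch's actual code; contexts, pod names, and ports are made up):

```bash
# 1. Port-forward an sshd pod in each cluster through its apiserver, so
#    both become reachable from the client machine.
kubectl --context source port-forward pod/sshd-src 2201:22 &
kubectl --context dest port-forward pod/sshd-dst 2202:22 &

# 2. SSH into the destination pod with a reverse tunnel (-R) that leads
#    back through the client machine to the source pod's forwarded port,
#    then let rsync in the destination pod pull straight from the source.
ssh -p 2202 -R 2201:127.0.0.1:2201 root@127.0.0.1 \
  'rsync -av -e "ssh -p 2201" root@127.0.0.1:/source/data/ /dest/data/'
```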

I will probably keep this strategy experimental for a while and won't make it part of the default set of strategies, but it can be useful when the only available access to both clusters is kubectl.

Doing the sync over ingresses is a whole other story that would need its own strategy. It would also most likely depend on the ingress controller used.

@utkuozdemir
Owner

Today I released v0.7.3 with initial support for the local strategy. It is an experimental strategy and is not attempted by default. I hope to improve it over time.
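
Opting in explicitly looks something like this (kubeconfig paths, namespaces, and PVC names are placeholders - see pv-migrate migrate --help for the full flag list):

```bash
# Request the experimental "local" strategy explicitly.
pv-migrate migrate \
  --source-kubeconfig ~/.kube/source.yaml --source-namespace apps \
  --dest-kubeconfig ~/.kube/dest.yaml --dest-namespace apps \
  --strategies local \
  old-pvc new-pvc
```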

We can track the improvements in separate tickets.

Closing this issue since its requirements are met by the local strategy.

Any feedback is appreciated.
