
[hack] support unique domain names for CRC clusters to enable multiple clusters on the same network #7244

Merged
merged 8 commits into kiali:master from the hack-crc-expose branch on Apr 3, 2024

Conversation

jmazzitelli
Collaborator

@jmazzitelli jmazzitelli commented Apr 1, 2024

tl;dr This allows you to use CRC to run multiple OpenShift clusters on a local network. Very useful for multi-cluster work.

This is a very useful enhancement to the CRC hack script.

We already had two commands, "expose" and "unexpose", which expose the CRC cluster to remote machines on the local network. This allows you to access the OpenShift cluster via a browser or an oc/kubectl client running anywhere on the network (not just from the same box where CRC is running). This uses firewall rules.

However, the problem is that you can only have one CRC cluster on your network because CRC fixes the domain name to "crc.testing". So if I want to start two CRC clusters, one on machine "foo" and one on machine "bar", they both bind to "crc.testing" and therefore only one can be exposed to remote machines on the network.

This PR introduces a new command "changedomain" which changes the domain of a running CRC cluster from the default crc.testing to a nip.io hostname unique to the IP address of the host machine (for example, 192.168.1.10.nip.io). This command also configures HAProxy so remote machines can access the CRC cluster using that new domain name.
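
As a rough sketch (variable names here are illustrative, not necessarily what the script uses), the new domain is derived from the host's primary IP like this:

  IP_ADDR="$(hostname -I | awk '{print $1}')"   # e.g. 192.168.1.10
  BASE_DOMAIN="${IP_ADDR}.nip.io"               # nip.io resolves this name back to 192.168.1.10
  API_HOST="api.${BASE_DOMAIN}"                 # OpenShift API endpoint (port 6443)
  APPS_DOMAIN="apps.${BASE_DOMAIN}"             # wildcard domain for routes (console, oauth, etc.)
  echo "API: https://${API_HOST}:6443, apps: *.${APPS_DOMAIN}"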

If you have the firewalld service already running, this new command will add some firewall rules to open up the ports necessary to access the CRC cluster (technically, it just runs the "expose" command for you at the end).

==

To test:

  1. Start CRC normally: hack/crc-openshift.sh start
  2. Change the domain: hack/crc-openshift.sh changedomain
  3. Get the IP address of your host machine (e.g. IP_ADDR=$(hostname -I | awk '{print $1}')).
  4. Log into the OpenShift server using the new domain: oc login -u kiali -p kiali https://api.${IP_ADDR}.nip.io:6443
  5. Confirm the server you connected to is using that nip.io domain: oc whoami --show-server
  6. Confirm you can access the cluster: oc get ns
  7. Now go to another machine on your local network and run that same login command from step 4, then repeat steps 5 and 6 to confirm it all works (the steps are collected into a single sketch below).
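
For reference, the test flow above boils down to roughly the following sketch. The kiali/kiali credentials and script name come from the steps above; on a remote machine, set IP_ADDR to the CRC host's address instead of running hostname there:

  hack/crc-openshift.sh start                                       # step 1 (on the CRC host)
  hack/crc-openshift.sh changedomain                                # step 2 (on the CRC host)
  IP_ADDR=$(hostname -I | awk '{print $1}')                         # step 3 (the CRC host's IP)
  oc login -u kiali -p kiali https://api.${IP_ADDR}.nip.io:6443     # step 4
  oc whoami --show-server   # step 5: should show the api.*.nip.io URL
  oc get ns                 # step 6: should list the cluster namespaces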

@jmazzitelli jmazzitelli self-assigned this Apr 1, 2024
@jmazzitelli
Collaborator Author

Note that if you use "changedomain", the firewalld service is stopped on the machine (if there was one running).

This also means the "expose" and "unexpose" commands should not be used when the domain is changed. There is no need to use them since "changedomain" will expose the cluster to remote clients.

@jmazzitelli
Collaborator Author

Some of this work (particularly the part that configures OpenShift with the new base domain name) originated here: https://github.com/iLLeniumStudios/remote-crc-setup/blob/main/install.sh

@jmazzitelli
Collaborator Author

jmazzitelli commented Apr 1, 2024

One thing I have not tested yet: the make targets we use to push and deploy dev images. Things like make cluster-push operator-create kiali-create need to be tested.
UPDATE: I tested that - it works

@jmazzitelli jmazzitelli marked this pull request as draft April 2, 2024 00:13
@jmazzitelli
Collaborator Author

jmazzitelli commented Apr 2, 2024

This doesn't fully work. The oc CLI can log in and get resources, but it looks like OAuth isn't working and you cannot log into the Console UI.

It has to do with the firewall. Shutting down the firewall allows everything to work. But if I turn on the firewall and expose ports 80, 443, and 6443, it fails with "connection refused". Interestingly, if I restart the firewall but do not expose the ports, the error is "no route to host". So I think I need to figure out what additional firewall rules are needed.

With the current firewall rules in place, this happens (from: oc logs -n openshift-console -l component=ui):

failed to get latest auth source data: request to OAuth issuer endpoint
https://oauth-openshift.apps.192.168.1.20.nip.io/oauth/token failed:
Head "https://oauth-openshift.apps.192.168.1.20.nip.io":
dial tcp 192.168.1.20:443: connect: connection refused
$ firewall-cmd --list-all
FedoraWorkstation (default, active)
  target: default
  ingress-priority: 0
  egress-priority: 0
  icmp-block-inversion: no
  interfaces: enp6s0u2u1u2
  sources: 
  services: 
  ports: 
  protocols: 
  forward: no
  masquerade: no
  forward-ports: 
	port=443:proto=tcp:toport=443:toaddr=192.168.130.11
	port=6443:proto=tcp:toport=6443:toaddr=192.168.130.11
	port=80:proto=tcp:toport=80:toaddr=192.168.130.11
  source-ports: 
  icmp-blocks: 
  rich rules: 

Not sure why this error happens. Port 443 is forwarded.

It has to do with the fact that the request is coming from inside the CRC VM. I can "curl" to my IP (http://) from the local machine and from a remote machine, but if I try the same curl command from within a pod running in the cluster, I get the error. If I turn off the firewall, there is no error and the pod can run curl successfully.
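
One quick way to reproduce the in-cluster failure is to run curl from a throwaway pod (the pod name and image here are arbitrary choices, and the IP should be your host machine's address):

  oc run curl-test --image=curlimages/curl --rm -it --restart=Never -- \
    curl -k -sS -o /dev/null -w '%{http_code}\n' \
    https://console-openshift-console.apps.192.168.1.20.nip.io
  # succeeds with the firewall off; with the firewall on it fails as described above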

@jmazzitelli
Collaborator Author

Worst case, I can just have the script shut down the firewall when changing the domain. But I would like this to work with the firewall enabled... so I need to find what rules to create to get this working with the firewall running.

@jmazzitelli
Collaborator Author

jmazzitelli commented Apr 2, 2024

Well, after that last commit, I can't get it to fail anymore. It may be due to the removal of the passthroughs, but I don't see why that would be the issue. These were removed:

firewall-cmd --direct --passthrough ipv4 -I FORWARD -i ${virt_interface} -j ACCEPT
firewall-cmd --direct --passthrough ipv4 -I FORWARD -o ${virt_interface} -j ACCEPT

UPDATE: it started failing again. I think I was only able to log in because of some browser cache that kept the login tokens around.

@jmazzitelli
Collaborator Author

jmazzitelli commented Apr 2, 2024

Here's the thing I need fixed. If anyone knows what firewall rules would fix this, let me know.

The firewall rules in place are:

 sudo firewall-cmd --add-forward-port="port=443:proto=tcp:toport=443:toaddr=192.168.130.11"
 sudo firewall-cmd --add-forward-port="port=6443:proto=tcp:toport=6443:toaddr=192.168.130.11"
 sudo firewall-cmd --add-forward-port="port=80:proto=tcp:toport=80:toaddr=192.168.130.11"

But pods inside the cluster cannot make requests out to the nip.io endpoints. For example, these two are causing problems (the failing URLs resolve to the host machine and point at HAProxy, which should forward the request right back into the cluster); a possible direction for additional rules is sketched after this list.

  • clusteroperators.config.openshift.io named authentication reporting: OuthServerRouteEndpointAccessibleControllerAvailable: Get "https://oauth-openshift.apps.192.168.1.20.nip.io/healthz": dial tcp 192.168.1.20:443: connect: connection refused
  • clusteroperators.config.openshift.io named console reporting: RouteHealthAvailable: failed to GET route (https://console-openshift-console.apps.192.168.1.20.nip.io): Get "https://console-openshift-console.apps.192.168.1.20.nip.io": dial tcp 192.168.1.20:443: connect: connection refused
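
One direction commonly suggested for this kind of hairpin/NAT-reflection problem (not verified here) is to trust the CRC subnet and enable masquerading on the zone. The zone name is taken from the firewall-cmd output above, and 192.168.130.0/24 is the CRC VM network implied by the toaddr=192.168.130.11 forward rules:

  sudo firewall-cmd --zone=FedoraWorkstation --add-source=192.168.130.0/24   # untested assumption
  sudo firewall-cmd --zone=FedoraWorkstation --add-masquerade                # untested assumption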

@jmazzitelli
Collaborator Author

I can't figure out the firewall rules. I was close, but not there. So the script will disable firewalld (if it is running) when you use the changedomain command.

Other than that, this all works.

@jmazzitelli jmazzitelli marked this pull request as ready for review April 2, 2024 19:21
Collaborator

@jshaughn jshaughn left a comment


Fine.

@jmazzitelli jmazzitelli merged commit ec0e90f into kiali:master Apr 3, 2024
9 checks passed
@jmazzitelli jmazzitelli deleted the hack-crc-expose branch April 3, 2024 14:22