KNOX-2690 - Service URLs are missing from Knox UI after a temporary discovery failure #516

Merged
zeroflag merged 1 commit into apache:master from zeroflag:KNOX-2690-disc on Jan 4, 2022

@zeroflag zeroflag commented Nov 12, 2021

What changes were proposed in this pull request?

When Cloudera Manager (CM) is intermittently unavailable, Knox still regenerates topologies, but with an empty service list. It should instead skip topology regeneration and log the error.

This patch fixes that by throwing an exception when discovery fails, so the topology is not regenerated.
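The idea can be illustrated with a minimal, self-contained Java sketch. It is not the Knox code: `DiscoveryFailureSketch`, `DiscoveryFailedException`, `ServiceDiscovery`, and `TopologyRegenerator` are hypothetical names made up for this example, and the example URL is a placeholder. It only shows the general pattern of letting a discovery failure surface as an exception so that the previously generated service list stays in place.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class DiscoveryFailureSketch {

    /** Hypothetical exception: discovery could not be completed. */
    static class DiscoveryFailedException extends Exception {
        DiscoveryFailedException(String message, Throwable cause) {
            super(message, cause);
        }
    }

    /** Hypothetical stand-in for a discovery backend such as CM. */
    interface ServiceDiscovery {
        List<String> discoverServiceUrls(String cluster) throws DiscoveryFailedException;
    }

    /** Keeps the last successfully generated service URLs per topology. */
    static class TopologyRegenerator {
        private final Map<String, List<String>> topologies = new HashMap<>();
        private final List<String> errorLog = new ArrayList<>();

        /** Returns true if the topology was regenerated, false if it was skipped. */
        boolean regenerate(String topology, String cluster, ServiceDiscovery discovery) {
            try {
                List<String> urls = discovery.discoverServiceUrls(cluster);
                topologies.put(topology, Collections.unmodifiableList(new ArrayList<>(urls)));
                return true;
            } catch (DiscoveryFailedException e) {
                // Skip regeneration: the previous entry in 'topologies' is left untouched.
                errorLog.add("Unable to complete service discovery for cluster "
                        + cluster + " topology = " + topology);
                return false;
            }
        }

        List<String> serviceUrls(String topology) {
            return topologies.getOrDefault(topology, Collections.emptyList());
        }

        List<String> errors() {
            return errorLog;
        }
    }

    public static void main(String[] args) {
        TopologyRegenerator regenerator = new TopologyRegenerator();

        // First run: discovery works and the topology gets its service URLs.
        regenerator.regenerate("cdp-proxy", "Cluster 1",
                cluster -> Arrays.asList("https://host.example:8443/service"));

        // Second run: discovery fails, so the topology is not regenerated.
        regenerator.regenerate("cdp-proxy", "Cluster 1", cluster -> {
            throw new DiscoveryFailedException("Failed to connect", null);
        });

        System.out.println(regenerator.serviceUrls("cdp-proxy"));
        System.out.println(regenerator.errors());
    }
}
```

The alternative of returning an empty list on failure is what makes the failure indistinguishable from a cluster with no services, which is why the sketch uses a checked exception instead.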

The patch also removes some unused code around discovery.

How was this patch tested?

Changed discovery-address in /var/lib/knox/gateway/conf/descriptors/cdp-proxy.json to an invalid address.

Checked the logs for the error message:

2021-11-12 11:14:41,394 INFO  discovery.cm (ClouderaManagerServiceDiscovery.java:discoverCluster(215)) - Performing cluster discovery for "Cluster 1"
2021-11-12 11:14:41,396 ERROR discovery.cm (ClouderaManagerServiceDiscovery.java:getClusterServices(276)) - Failed to access the service configurations for cluster (Cluster 1) discovery: com.cloudera.api.swagger.client.ApiException: java.net.ConnectException: Failed to connect to amagyar-2.amagyar.***/***7180
2021-11-12 11:14:41,396 ERROR knox.gateway (DefaultTopologyService.java:onFileChange(870)) - Unable to complete service discovery for cluster Cluster 1 topology = cdp-proxy.

Verified that the URLs are still present in the Knox UI.

@zeroflag zeroflag marked this pull request as draft November 12, 2021 10:14
@zeroflag zeroflag marked this pull request as ready for review November 12, 2021 14:25

@smolnar82 smolnar82 left a comment

LGTM

Just to clarify: the removal of "some unused code" was deleting the public Map<String, Cluster> discover(GatewayConfig gatewayConfig, ServiceDiscoveryConfig discoveryConfig) method because it was only used in tests, right?

Cc. @pzampino

@zeroflag
Contributor Author

> Just to clarify: the removal of "some unused code" was deleting the public Map<String, Cluster> discover(GatewayConfig gatewayConfig, ServiceDiscoveryConfig discoveryConfig) method because it was only used in tests, right?

Yes.

@zeroflag zeroflag merged commit 67e7dd3 into apache:master Jan 4, 2022
dishtikundra added a commit to acceldata-io/knox that referenced this pull request Apr 14, 2025
dishtikundra added a commit to acceldata-io/knox that referenced this pull request Apr 15, 2025

2 participants