
ClusterMetadata: Default cluster potentially uses wrong frontend RPC address #149

Closed
thempatel opened this issue Mar 23, 2021 · 7 comments

Comments


thempatel commented Mar 23, 2021

We're standing up the Temporal service via Helm, and I noticed this while configuring the various YAML files. If a user configures a custom gRPC port for the frontend service, the hardcoded default of 7933 will be incorrect.

rpcAddress: "127.0.0.1:7933"

It also seems that the localhost address 127.0.0.1 would be incorrect in a deployed environment, assuming the various services (history, matching, frontend, worker) are deployed separately.

https://github.com/temporalio/temporal/blob/e2e26004552cbc0867afb342238bb3f9efeee6ce/client/clientBean.go#L87-L96
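
For reference, here's a rough sketch of the cluster metadata block this default feeds into, with the address corrected for a deployed environment. The service name temporal-frontend and port 7233 are only illustrative; the point is that rpcAddress must match the address and gRPC port the frontend is actually configured with:

    clusterMetadata:
      currentClusterName: "active"
      clusterInformation:
        active:
          enabled: true
          initialFailoverVersion: 1
          rpcName: "frontend"
          # must point at the frontend's reachable address and its configured gRPC port,
          # not the hardcoded 127.0.0.1:7933 default
          rpcAddress: "temporal-frontend:7233"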


emmercm commented Nov 17, 2022

@thempatel did you happen to resolve this in your environment? I believe I'm running into a similar issue.

@thempatel
Author

@emmercm we ended up forking the Helm chart for Temporal to fix various bugs in it. For this one, I changed the hard-coded port so that it is sourced from user configuration (values.yaml) instead.
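
As an illustrative sketch (the key names here are hypothetical, not the upstream chart's exact structure), the change amounts to templating the address from values rather than hardcoding it:

    # values.yaml (hypothetical keys)
    server:
      frontend:
        service:
          port: 7233

    # cluster metadata template: derive the address instead of hardcoding "127.0.0.1:7933"
    clusterMetadata:
      clusterInformation:
        active:
          rpcAddress: "{{ .Release.Name }}-frontend:{{ .Values.server.frontend.service.port }}"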

Note: it's been a really long time, so take this with a grain of salt:

IIRC, the localhost address is OK because I think there's actually a proxy listening on localhost for the frontend service, so connecting to localhost just forwards the connection to the internally configured frontend RPC service, which then forwards to the actual services. 🤷🏽‍♂️


emmercm commented Nov 18, 2022

@thempatel we've also forked the chart, though more to better configure our unique Kubernetes environment and multi-cluster setup than anything else.

The proxy would make a ton of sense, but I didn't find any trace of it in GitHub: https://github.com/search?q=org%3Atemporalio+7933&type=code. I would think localhost in this case would be Kube node-local rather than container-local, right? I'm running into issues with multi-cluster where I believe I'm getting some cross-talk, and I've convinced myself it's this localhost config.

@thempatel
Author

@emmercm after reading #333, I noticed you're trying to run two distinct Temporal clusters. You cannot do this without isolating them: the services use a gossip protocol in which they broadcast messages on a port. If both clusters have services broadcasting on the same port but you've configured two different storage instances (SQL, etc.), you're going to run into problems.

The reason I filed this issue (and subsequently forked the chart) was precisely so that we could run multiple clusters, each configured with different ports, so the clusters don't run into each other.
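
As a rough sketch (assuming the standard Temporal server config layout; port numbers are arbitrary), each cluster's services get their own gRPC and membership (gossip) ports so the rings never overlap:

    # cluster A
    services:
      frontend:
        rpc: { grpcPort: 7233, membershipPort: 6933 }
      history:
        rpc: { grpcPort: 7234, membershipPort: 6934 }
      matching:
        rpc: { grpcPort: 7235, membershipPort: 6935 }
      worker:
        rpc: { grpcPort: 7239, membershipPort: 6939 }

    # cluster B uses a disjoint set of ports so its services never gossip with cluster A's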

One thing you could try, if you haven't already, is to put each cluster in its own Kubernetes namespace. If that works, you'll just need to account for the namespace in the address your clients use to connect to the cluster.
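
For example, if the clusters live in namespaces cluster-a and cluster-b (names are illustrative), the frontend address your clients and cluster metadata use would need to carry the namespace, along the lines of:

    # Kubernetes Service DNS takes the form <service>.<namespace>.svc.cluster.local
    rpcAddress: "temporal-frontend.cluster-a.svc.cluster.local:7233"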


emmercm commented Nov 21, 2022

For some reason I swore the gossip behavior was deprecated/removed in Temporal as a step away from Cadence, but a quick tctl admin membership list_gossip shows all the pods I would expect. Thank you for redirecting me on this one; it probably helps explain some of the behavior I'm seeing.

@dmateusp

This was super helpful! After a fresh deployment, no communication with the task queues was working; I was getting context deadline timeouts on tctl tq describe --taskqueue all.

Then, following this thread, I changed the rpcAddress to match the frontend service name and port in my cluster: rpcAddress: "temporal-frontend:7233"

Now I'm able to list task queues, and I can see that the matching server has joined in the output of tctl admin membership list_gossip.

@robholland
Contributor

Fixed by #497, though this setting is no longer used as of Temporal 1.18 anyway.

robholland added a commit that referenced this issue Jun 18, 2024