Improve metrics and logs. Have MuxManager.Start wait 1m so that muxes are connected before we start outbound server. #164

Merged
temporal-nick merged 1 commit into main from nick/sessionhealth on Oct 9, 2025

Conversation

@temporal-nick
Collaborator

What was changed

  • Limited admin service logging to 3 messages per minute.
  • Added a 1m delay to MuxManager startup so that muxes can connect before outbound connections are allowed. This greatly improves load leveling on deployment (e.g. streams no longer all immediately select the first and only mux connection).
  • Reduced yamux logs to debug level for now; they aren't adding new information.
  • Added a once-per-minute dump of the mux manager status.

Why?

These changes were part of a debug session investigating zombie connections on the mux server. Unfortunately, the source hasn't been identified yet, but these changes will make the proxy easier to operate in the future.

… are connected before we start outbound server.
@temporal-nick temporal-nick requested a review from pglass October 9, 2025 16:30
@temporal-nick temporal-nick requested a review from a team as a code owner October 9, 2025 16:30
// The AdminServiceStreams will duplicate the same output for an underlying connection issue hundreds of times.
// Limit their output to three times per minute.
logger = log.NewThrottledLogger(log.With(logger, common.ServiceTag(serviceName)),
	func() float64 { return 3.0 / 60.0 })
Contributor

Would be good to make the rate configurable
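
One possible shape for this suggestion, sketched under assumptions: a small holder for a messages-per-minute setting whose `perSecond` method has the same `func() float64` signature as the hard-coded closure in the snippet under review. The `logRateConfig` type and its methods are hypothetical and are not part of this repository; how the value would be loaded from configuration is left open.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// logRateConfig (hypothetical type) holds a messages-per-minute limit that
// can be changed at runtime without recreating the logger.
type logRateConfig struct {
	perMinute atomic.Int64
}

func newLogRateConfig(perMinute int64) *logRateConfig {
	c := &logRateConfig{}
	c.perMinute.Store(perMinute)
	return c
}

// set updates the limit. Non-positive values are ignored so that a bad
// config value cannot silence the logger entirely.
func (c *logRateConfig) set(perMinute int64) {
	if perMinute > 0 {
		c.perMinute.Store(perMinute)
	}
}

// perSecond converts the per-minute setting to a per-second rate. It matches
// the func() float64 signature of the closure in the reviewed snippet.
func (c *logRateConfig) perSecond() float64 {
	return float64(c.perMinute.Load()) / 60.0
}

func main() {
	cfg := newLogRateConfig(3) // same default as the hard-coded 3.0 / 60.0

	var rate func() float64 = cfg.perSecond
	fmt.Println("initial rate per second:", rate())

	cfg.set(6) // e.g. after a config reload
	fmt.Println("updated rate per second:", rate())
}
```

Because the rate is passed as a function rather than a value, a throttled logger that re-reads it on each call would pick up the new limit without a restart.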

@temporal-nick temporal-nick merged commit 4289c12 into main Oct 9, 2025
5 checks passed
@temporal-nick temporal-nick deleted the nick/sessionhealth branch October 9, 2025 21:39