`wait` has no overall timeout — orchestrators hang indefinitely on permanently-broken dependencies

## Summary

`connectivity wait` is documented as a way to gate other processes on dependency readiness (init containers, deploy scripts, etc.). It polls forever with a fixed 15s sleep and no overall deadline. If a dependency is permanently broken (a typo'd hostname, a service that never comes up, a destination cut off by a firewall change), `wait` hangs indefinitely. The orchestrator (k8s, systemd, GitHub Actions) eventually hits its own timeout and reports a generic "still waiting" failure that says nothing about which destination was unreachable.

For an SRE, the desired UX is:

- `connectivity wait --timeout 5m` exits with a distinct non-zero code on timeout
- The log clearly identifies which destination(s) were not reached
- The error is detectable from process exit code without parsing logs

## Code

`destinations.go:219-230`:
```go
func (dest *Destination) WaitFor() {
	for {
		reachable := dest.Check()
		if reachable {
			LogDestination(dest, "Connected")
			return
		}
		time.Sleep(15 * time.Second)
	}
}
```

`connectivity.go:155-166`:
```go
func WaitLoop(destinations []*Destination) {
	var wg sync.WaitGroup
	for _, dest := range destinations {
		wg.Add(1)
		go func(dest *Destination) {
			defer wg.Done()
			dest.WaitFor()
		}(dest)
	}
	wg.Wait()
}
```

No `context.Context`, no deadline, no progress reporting beyond per-attempt logs.

## Suggested fix

Add a `--timeout` flag and propagate it through a `context.Context`. The default could stay unlimited to preserve current behavior, with the docs recommending that callers set one:

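```go
// Assumes timeout is a *time.Duration (e.g. from flag.Duration) and that
// zero means "no limit". context.WithTimeout with a zero duration would
// expire immediately, so only apply it when a timeout was given.
ctx := context.Background()
if *timeout > 0 {
    var cancel context.CancelFunc
    ctx, cancel = context.WithTimeout(ctx, *timeout)
    defer cancel()
}

var wg sync.WaitGroup
errs := make(chan string, len(destinations)) // buffered so senders never block
for _, dest := range destinations {
    wg.Add(1)
    go func(d *Destination) {
        defer wg.Done()
        if !d.WaitFor(ctx) { // WaitFor returns false once ctx is done
            errs <- d.Label
        }
    }(dest)
}
wg.Wait()
close(errs)

// Drain the closed channel into a slice for reporting.
var failed []string
for label := range errs {
    failed = append(failed, label)
}
if len(failed) > 0 {
    log.Printf("Timed out waiting for: %s", strings.Join(failed, ", "))
    os.Exit(1)
}
```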

`WaitFor(ctx)` should also use exponential backoff (capped) rather than 15s flat — so a fast-failing destination doesn't issue 4 DNS lookups per minute for an hour, and a slow-converging one isn't punished by an aggressive cadence.
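A context-aware `WaitFor` with capped backoff might look roughly like the sketch below. It is not the project's code: the `Destination` type here is a stand-in with `Check` modelled as a function field, and the 1s initial delay, 1m cap, and `nextDelay` helper are assumptions chosen for illustration.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// Destination is a stand-in for the project's type; only the pieces used
// in this sketch are modelled. Check is a function field here so the
// example runs on its own (hypothetical; the real Check is a method).
type Destination struct {
	Label string
	Check func() bool
}

// nextDelay doubles the current delay and clamps it to max.
func nextDelay(cur, max time.Duration) time.Duration {
	next := cur * 2
	if next > max {
		return max
	}
	return next
}

// WaitFor polls until Check succeeds (returns true) or ctx is done
// (returns false). The delay between attempts grows from 1s up to 1m;
// both values are assumptions, not taken from the project.
func (dest *Destination) WaitFor(ctx context.Context) bool {
	delay := time.Second
	const maxDelay = time.Minute
	for {
		if dest.Check() {
			return true
		}
		timer := time.NewTimer(delay)
		select {
		case <-ctx.Done():
			timer.Stop() // release the timer when giving up early
			return false
		case <-timer.C:
		}
		delay = nextDelay(delay, maxDelay)
	}
}

func main() {
	ctx, cancel := context.WithTimeout(context.Background(), 50*time.Millisecond)
	defer cancel()

	broken := &Destination{Label: "db:5432", Check: func() bool { return false }}
	fmt.Println(broken.WaitFor(ctx)) // prints false once the 50ms deadline passes
}
```

Selecting on `ctx.Done()` instead of calling `time.Sleep` means the goroutine returns as soon as the deadline hits, rather than finishing the current sleep first.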

Distinct exit codes (e.g., 1 = timeout, 2 = config error) would also help orchestrator-level alerting distinguish "dependency missing" from "we never started".
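One possible way to map errors to those exit codes is sketched below. The constant names, the `errConfig` sentinel, and the `timeoutErr` helper are hypothetical; the numbering follows the example above, and treating every non-timeout error as a config error is a simplification for this sketch.

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// Exit codes follow the example numbering in this issue; names are hypothetical.
const (
	exitOK      = 0
	exitTimeout = 1 // a dependency was not reached before the deadline
	exitConfig  = 2 // the wait never started (bad flags, unparsable destination)
)

// errConfig is a hypothetical sentinel for configuration problems.
var errConfig = errors.New("invalid configuration")

// timeoutErr wraps context.DeadlineExceeded with the destination label so
// the log line names the destination while errors.Is still matches.
func timeoutErr(label string) error {
	return fmt.Errorf("waiting for %s: %w", label, context.DeadlineExceeded)
}

// exitCode picks the process exit code for an error. Anything that is
// not a deadline error is reported as a config error in this sketch.
func exitCode(err error) int {
	switch {
	case err == nil:
		return exitOK
	case errors.Is(err, context.DeadlineExceeded):
		return exitTimeout
	default:
		return exitConfig
	}
}

func main() {
	// A real command would pass the result to os.Exit; printing keeps
	// the sketch easy to run.
	fmt.Println(exitCode(nil))                   // 0
	fmt.Println(exitCode(timeoutErr("db:5432"))) // 1
	fmt.Println(exitCode(errConfig))             // 2
}
```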
