27158: backport-2.0: cli: Deprecate --wait=live in `node decommission` r=bdarnell a=tschottdorf

This is a bit of an unusual backport. The previous behavior is correctly documented, but it simply isn't that useful to operators who want to feel reasonably certain that cluster health is back to green before carrying on with their day.

I'm not sure what our policy here should be this late into 2.0, but I'm tempted to say that we're not going to cherry-pick this. @bdarnell, what's your take here?

Backport 1/1 commits from #27027.

/cc @cockroachdb/release

---

Refer to issue #26880.

To decommission a node that is down, one currently has to use `decommission --wait=live`, which does not verify whether the down node is still part of any ranges, and then manually verify that the cluster has up-replicated all ranges elsewhere. This is far from ideal, in particular since there is no automated way to reliably check this.

`--wait=live` is deprecated and its behaviour is replaced by that of `--wait=all`. Instead of relying on metrics for replica counts, which may be stale, look at the authoritative source: range metadata.

Release note (cli change): Deprecate the `--wait=live` parameter for `node decommission`. `--wait=all` is the default behaviour. This ensures that CockroachDB checks that ranges on the node to be decommissioned are not under-replicated before the node is decommissioned.

Co-authored-by: neeral <neeral@users.noreply.github.com>
Co-authored-by: Tobias Schottdorf <tobias.schottdorf@gmail.com>