From 331d968c4fd1324bce614f6d5e5acb132f12bfc9 Mon Sep 17 00:00:00 2001
From: David Eliahu
Date: Mon, 9 Nov 2020 09:26:01 -0800
Subject: [PATCH] Add version upgrade checklist

---
 dev/versions.md | 13 +++++++++++++
 1 file changed, 13 insertions(+)

diff --git a/dev/versions.md b/dev/versions.md
index 302c207a9f..f0ba6b7f64 100644
--- a/dev/versions.md
+++ b/dev/versions.md
@@ -1,5 +1,18 @@
 # Upgrade notes
 
+## Things to check when updating versions
+
+* cluster up / info / down (works, and logs look good)
+* check metrics server pod logs
+* check cluster autoscaler pod logs
+* check pod -> cluster autoscaling on cpu or gpu or inferentia
+* check cluster autoscaling on cpu and gpu and inferentia
+* examples
+  * check logs, predictions
+  * check metrics, tracker
+  * make sure to try all 8 base images (tf/onnx/py gpu/cpu, tf/py inferentia)
+  * confirm GPUs are used when requested
+
 ## eksctl
 
 1. Find the latest release on [GitHub](https://github.com/weaveworks/eksctl/releases) and check the changelog