From ecee524c10953c1915b1ab3256950e7a982ba875 Mon Sep 17 00:00:00 2001
From: Claudia
Date: Wed, 12 Mar 2025 17:45:32 -0400
Subject: [PATCH] add autopilot

minor typo fixes

Signed-off-by: Claudia

last typo

Signed-off-by: Claudia

fix to link

Signed-off-by: Claudia

requested changes

Signed-off-by: Claudia
---
 setup.RHOAI-v2.13/CLUSTER-SETUP.md | 27 +++++++++++++++++++++
 setup.RHOAI-v2.16/CLUSTER-SETUP.md | 27 +++++++++++++++++++++
 setup.RHOAI-v2.17/CLUSTER-SETUP.md | 27 +++++++++++++++++++++
 setup.k8s/CLUSTER-SETUP.md         | 29 ++++++++++++++++++++++
 setup.tmpl/CLUSTER-SETUP.md.tmpl   | 39 ++++++++++++++++++++++++++++++
 5 files changed, 149 insertions(+)

diff --git a/setup.RHOAI-v2.13/CLUSTER-SETUP.md b/setup.RHOAI-v2.13/CLUSTER-SETUP.md
index a178853..5daa891 100644
--- a/setup.RHOAI-v2.13/CLUSTER-SETUP.md
+++ b/setup.RHOAI-v2.13/CLUSTER-SETUP.md
@@ -88,6 +88,33 @@ kueue-controller-manager's log:
 ```
+## Autopilot
+
+Helm charts values and how-to for customization can be found [in the official documentation](https://github.com/IBM/autopilot/blob/main/helm-charts/autopilot/README.md). As-is, Autopilot will run on GPU nodes.
+
+- Add the Autopilot Helm repository
+
+```bash
+helm repo add autopilot https://ibm.github.io/autopilot/
+helm repo update
+```
+
+- Install the chart (idempotent command). The config file is for customizing the helm values and it is optional.
+
+```bash
+helm upgrade autopilot autopilot/autopilot --install --namespace=autopilot --create-namespace -f your-config.yml
+```
+
+### Enabling Prometheus metrics
+
+After completing the installation, manually label the namespace to enable metrics to be scraped by Prometheus with the following command:
+
+```bash
+oc label ns autopilot openshift.io/cluster-monitoring=true
+```
+
+The `ServiceMonitor` labeling is not required.
+
 ## Kueue Configuration
 Create Kueue's default flavor:
diff --git a/setup.RHOAI-v2.16/CLUSTER-SETUP.md b/setup.RHOAI-v2.16/CLUSTER-SETUP.md
index cebd9dd..1bde91f 100644
--- a/setup.RHOAI-v2.16/CLUSTER-SETUP.md
+++ b/setup.RHOAI-v2.16/CLUSTER-SETUP.md
@@ -76,6 +76,33 @@ AI configuration as follows:
+## Autopilot
+
+Helm charts values and how-to for customization can be found [in the official documentation](https://github.com/IBM/autopilot/blob/main/helm-charts/autopilot/README.md). As-is, Autopilot will run on GPU nodes.
+
+- Add the Autopilot Helm repository
+
+```bash
+helm repo add autopilot https://ibm.github.io/autopilot/
+helm repo update
+```
+
+- Install the chart (idempotent command). The config file is for customizing the helm values and it is optional.
+
+```bash
+helm upgrade autopilot autopilot/autopilot --install --namespace=autopilot --create-namespace -f your-config.yml
+```
+
+### Enabling Prometheus metrics
+
+After completing the installation, manually label the namespace to enable metrics to be scraped by Prometheus with the following command:
+
+```bash
+oc label ns autopilot openshift.io/cluster-monitoring=true
+```
+
+The `ServiceMonitor` labeling is not required.
+
 ## Kueue Configuration
 Create Kueue's default flavor:
diff --git a/setup.RHOAI-v2.17/CLUSTER-SETUP.md b/setup.RHOAI-v2.17/CLUSTER-SETUP.md
index 3fee15f..e064046 100644
--- a/setup.RHOAI-v2.17/CLUSTER-SETUP.md
+++ b/setup.RHOAI-v2.17/CLUSTER-SETUP.md
@@ -76,6 +76,33 @@ AI configuration as follows:
+## Autopilot
+
+Helm charts values and how-to for customization can be found [in the official documentation](https://github.com/IBM/autopilot/blob/main/helm-charts/autopilot/README.md). As-is, Autopilot will run on GPU nodes.
+
+- Add the Autopilot Helm repository
+
+```bash
+helm repo add autopilot https://ibm.github.io/autopilot/
+helm repo update
+```
+
+- Install the chart (idempotent command). The config file is for customizing the helm values and it is optional.
+
+```bash
+helm upgrade autopilot autopilot/autopilot --install --namespace=autopilot --create-namespace -f your-config.yml
+```
+
+### Enabling Prometheus metrics
+
+After completing the installation, manually label the namespace to enable metrics to be scraped by Prometheus with the following command:
+
+```bash
+oc label ns autopilot openshift.io/cluster-monitoring=true
+```
+
+The `ServiceMonitor` labeling is not required.
+
 ## Kueue Configuration
 Create Kueue's default flavor:
diff --git a/setup.k8s/CLUSTER-SETUP.md b/setup.k8s/CLUSTER-SETUP.md
index 74f6791..0fb3c9f 100644
--- a/setup.k8s/CLUSTER-SETUP.md
+++ b/setup.k8s/CLUSTER-SETUP.md
@@ -7,6 +7,7 @@ The cluster setup installs and configures the following components:
 + Kueue
 + AppWrappers
 + Cluster roles and priority classes
++ Autopilot
 ## Priorities
@@ -73,6 +74,34 @@ operators as follows:
 - `queueName` is set to `default-queue`,
 - pod priorities, resource requests and limits have been adjusted.
+## Autopilot
+
+Helm charts values and how-to for customization can be found [in the official documentation](https://github.com/IBM/autopilot/blob/main/helm-charts/autopilot/README.md). As-is, Autopilot will run on GPU nodes.
+
+- Add the Autopilot Helm repository
+
+```bash
+helm repo add autopilot https://ibm.github.io/autopilot/
+helm repo update
+```
+
+- Install the chart (idempotent command). The config file is for customizing the helm values and it is optional.
+
+```bash
+helm upgrade autopilot autopilot/autopilot --install --namespace=autopilot --create-namespace -f your-config.yml
+```
+
+### Enabling Prometheus metrics
+
+The `ServiceMonitor` object is the one that enables Prometheus to scrape the metrics produced by Autopilot.
+In order for Prometheus to find the right objects, the `ServiceMonitor` needs to be annotated with the Prometheus' release name. It is usually `prometheus`, and that's the default added in the Autopilot release.
+If that is not the case in your cluster, the correct release label can be found by checking in the `ServiceMonitor` of Prometheus itself, or the name of Prometheus helm chart.
+Then, Autopilot's `ServiceMonitor` can be labeled with the following command
+
+```bash
+kubectl label servicemonitors.monitoring.coreos.com -n autopilot autopilot-metrics-monitor release= --overwrite
+```
+
 ## Kueue Configuration
 Create Kueue's default flavor:
diff --git a/setup.tmpl/CLUSTER-SETUP.md.tmpl b/setup.tmpl/CLUSTER-SETUP.md.tmpl
index d88edd0..307ae25 100644
--- a/setup.tmpl/CLUSTER-SETUP.md.tmpl
+++ b/setup.tmpl/CLUSTER-SETUP.md.tmpl
@@ -12,6 +12,7 @@ The cluster setup installs and configures the following components:
 + Kueue
 + AppWrappers
 + Cluster roles and priority classes
++ Autopilot
 {{- end }}
@@ -154,6 +155,44 @@ operators as follows:
 {{- end }}
+## Autopilot
+
+Helm charts values and how-to for customization can be found [in the official documentation](https://github.com/IBM/autopilot/blob/main/helm-charts/autopilot/README.md). As-is, Autopilot will run on GPU nodes.
+
+- Add the Autopilot Helm repository
+
+```bash
+helm repo add autopilot https://ibm.github.io/autopilot/
+helm repo update
+```
+
+- Install the chart (idempotent command). The config file is for customizing the helm values and it is optional.
+
+```bash
+helm upgrade autopilot autopilot/autopilot --install --namespace=autopilot --create-namespace -f your-config.yml
+```
+
+### Enabling Prometheus metrics
+
+{{ if .OPENSHIFT -}}
+After completing the installation, manually label the namespace to enable metrics to be scraped by Prometheus with the following command:
+
+```bash
+{{ .KUBECTL }} label ns autopilot openshift.io/cluster-monitoring=true
+```
+
+The `ServiceMonitor` labeling is not required.
+{{- else -}}
+The `ServiceMonitor` object is the one that enables Prometheus to scrape the metrics produced by Autopilot.
+In order for Prometheus to find the right objects, the `ServiceMonitor` needs to be annotated with the Prometheus' release name. It is usually `prometheus`, and that's the default added in the Autopilot release.
+If that is not the case in your cluster, the correct release label can be found by checking in the `ServiceMonitor` of Prometheus itself, or the name of Prometheus helm chart.
+Then, Autopilot's `ServiceMonitor` can be labeled with the following command
+
+```bash
+{{ .KUBECTL }} label servicemonitors.monitoring.coreos.com -n autopilot autopilot-metrics-monitor release= --overwrite
+```
+{{- end }}
+
 ## Kueue Configuration
 Create Kueue's default flavor: