wao-estimator

WAO-Estimator provides power consumption estimation capabilities to help schedule Pods in a power-efficient way.

Overview

WAO-Estimator provides APIs for estimating the increase in power consumption when deploying Pods in a cluster.

Specifically, it responds to queries like this:

Q. Tell me the increase in power consumption when deploying 5 Pods each requiring 500 mCPU.

{ "cpu_milli": 500, "num_workloads": 5 }

A. The power consumption of the entire cluster will increase by at least 5W with 1 Pod, 9W with 2 Pods, 13W with 3 Pods, 15W with 4 Pods, and 16W with 5 Pods.

{ "watt_increases": [ 5, 9, 13, 15, 16 ] }

Each value indicates the following information:

  • cpu_milli: the amount of CPU consumed by a single workload
  • num_workloads: the number of workloads
  • watt_increases: the estimated increase in power consumption when the workloads are placed
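As an illustration of how a client might read watt_increases, the sketch below converts the cumulative values from the example response into the extra watts attributed to each additional Pod. The function name marginal_increases is hypothetical and not part of WAO-Estimator; the sketch assumes, as the example answer above suggests, that the i-th value is the cumulative increase for i+1 Pods.

```python
import json


def marginal_increases(watt_increases):
    """Return the extra watts attributed to each additional Pod.

    Assumes watt_increases is cumulative: element i is the total
    increase when i+1 Pods are placed (hypothetical helper).
    """
    result = []
    previous = 0
    for total in watt_increases:
        result.append(total - previous)
        previous = total
    return result


# Example response taken from the Overview above.
response = json.loads('{ "watt_increases": [ 5, 9, 13, 15, 16 ] }')
print(marginal_increases(response["watt_increases"]))  # [5, 4, 4, 2, 1]
```

In this example the first Pod costs the most (5W) and the fifth the least (1W), which is the kind of information a scheduler can use when deciding how many Pods to place.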

For use cases, see WAO-Scheduler-v2, which uses WAO-Estimator to place pods on the cluster, and WAOFed, which works with KubeFed to optimally place Pods in multi-cluster environments.

Getting Started

To start using WAO-Estimator, you need a Kubernetes cluster that meets the following conditions:

  • Each worker node is a physical machine
  • Each worker node supports IPMI or Redfish
  • Power consumption models are available for each type of physical machine
  • Environmental information required by the model is obtainable
    • e.g. static pressure difference between front and back of a server

Supported Kubernetes versions: 1.19 or higher

💡 Mainly tested with 1.25. Older versions may work but may require some effort.

Installation

Make sure cert-manager is deployed on the cluster where the KubeFed control plane is deployed, as it is used to generate webhook certificates.

kubectl apply -f https://github.com/cert-manager/cert-manager/releases/download/v1.10.0/cert-manager.yaml

⚠️ You may have to wait a moment for cert-manager to become ready.

Deploy the Operator with the following command. It creates the wao-estimator-system namespace and deploys CRDs, controllers, and other resources.

kubectl apply -f https://github.com/Nedopro2022/wao-estimator/releases/download/v0.1.1/wao-estimator.yaml

💡 Verify that three Service objects have been created. webhook-service and metrics-service are the usual operator Services, and estimator-service exposes the WAO-Estimator APIs.

$ kubectl get svc -n wao-estimator-system
NAME                                                 TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)    AGE
wao-estimator-controller-manager-estimator-service   ClusterIP   10.96.222.246   <none>        5656/TCP   5m
wao-estimator-controller-manager-metrics-service     ClusterIP   10.96.161.177   <none>        8443/TCP   5m
wao-estimator-webhook-service                        ClusterIP   10.96.217.214   <none>        443/TCP    5m

Set up WAO-Estimator by deploying an Estimator resource.

This is a simple example of an Estimator resource.

apiVersion: waofed.bitmedia.co.jp/v1beta1
kind: Estimator
metadata:
  namespace: default
  name: default
spec:
  defaultNodeConfig:
    nodeMonitor:
      agents: []
    powerConsumptionPredictor:
      type: None

The namespace and name will affect the URL of the API endpoint.

http://<yourhost>:5656/namespaces/<namespace>/estimators/<name>/values/powerconsumption

In this case, we specified namespace: default and name: default, so it will be:

http://<yourhost>:5656/namespaces/default/estimators/default/values/powerconsumption
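The URL pattern above can be sketched as a small helper. The function name power_consumption_url and its parameters are hypothetical; only the path layout and port 5656 come from this README.

```python
from urllib.parse import quote


def power_consumption_url(host, namespace="default", name="default", port=5656):
    """Build the power consumption endpoint URL for one Estimator.

    Hypothetical helper: follows the pattern
    http://<yourhost>:5656/namespaces/<namespace>/estimators/<name>/values/powerconsumption
    """
    # Percent-encode the path segments so unusual characters cannot break the URL.
    ns = quote(namespace, safe="")
    est = quote(name, safe="")
    return f"http://{host}:{port}/namespaces/{ns}/estimators/{est}/values/powerconsumption"


print(power_consumption_url("localhost"))
# http://localhost:5656/namespaces/default/estimators/default/values/powerconsumption
```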

💡 This allows for running multiple Estimators. Using default/default is usually fine.

For simplicity, this example leaves nodeMonitor.agents empty and sets powerConsumptionPredictor.type to None. The Estimator cannot produce meaningful estimates in this state, but it still lets us verify the behavior of the API.

Check operation with estimator-cli

estimator-cli is a CLI tool for using WAO-Estimator APIs. It is essentially an HTTP client.

First, execute the following command to allow local access to the estimator-service Service resource.

💡 You can also use LoadBalancer or Ingress in supported environments.

kubectl port-forward -n wao-estimator-system svc/wao-estimator-controller-manager-estimator-service 5656:5656

Next, run estimator-cli in another terminal.

💡 estimator-cli for linux-x86_64 is available on the releases page. If you have Go installed, the following command will work as well.

go run github.com/Nedopro2022/wao-estimator/pkg/cmd/estimator-cli@latest -p 500,5 pc

$ ./estimator-cli -p 500,5 pc
[+Inf +Inf +Inf +Inf +Inf]

Since no NodeMonitor agents are configured and the PowerConsumptionPredictor type is None, the response (the increases in power consumption) is not meaningful, but it confirms that the WAO-Estimator APIs are working.

💡 You can see the actual HTTP request by adding the -v option to estimator-cli. For example, running ./estimator-cli -v -p 500,5 pc shows the following request. See the help (-h) for details.

curl -X 'POST' -d '{"cpu_milli":500,"num_workloads":5}' -H 'Content-Type: application/json' 'http://localhost:5656/namespaces/default/estimators/default/values/powerconsumption'
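For clients that prefer not to shell out to curl, the same request can be sketched with the Python standard library. The function name build_request is hypothetical; the URL, JSON body, and header mirror the curl command above. The sketch only constructs the request, and sending it is left commented out because it requires the port-forward to be running.

```python
import json
import urllib.request

# Endpoint copied from the curl command above (default/default Estimator).
DEFAULT_URL = (
    "http://localhost:5656/namespaces/default/estimators/default"
    "/values/powerconsumption"
)


def build_request(cpu_milli, num_workloads, url=DEFAULT_URL):
    """Build (but do not send) the POST request for a power consumption query."""
    body = json.dumps(
        {"cpu_milli": cpu_milli, "num_workloads": num_workloads}
    ).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = build_request(500, 5)
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))

# To send it (requires the kubectl port-forward shown earlier):
# with urllib.request.urlopen(req, timeout=10) as resp:
#     print(resp.read().decode("utf-8"))
```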

Detailed configuration of Estimator resource

// TODO

NodeMonitor

    nodeMonitor:
      refreshInterval: 30s
      agents:
        - type: None
          endpoint: ""
        - type: Fake
          endpoint: ""

| Type | Description | Endpoint value | Example |
|---|---|---|---|
| None | do nothing | ignored | "" |
| Fake | a fake NodeMonitor for test | ignored | "" |
| MetricsAPI | Kubernetes Metrics API | currently ignored (only in-cluster config is supported) | "" |
| DifferentialPressureAPI | WAO Differential API | API Endpoint | http://hogehoge:5000/api/sensor/101037B |
| IPMIExporter | IPMI Exporter | API Endpoint | http://hogehoge:9290/metrics |
| Redfish | Redfish | API Endpoint | https://10.0.0.1/redfish/v1 |

| NodeMonitor | Provided NodeStatus Type | Implementation Details |
|---|---|---|
| Fake | NodeStatusCPUUsage | fetch node label waofed.bitmedia.co.jp/node-status.cpuusage |
| Fake | NodeStatusAmbientTemp | fetch node label waofed.bitmedia.co.jp/node-status.cpuusage |
| Fake | NodeStatusStaticPressureDiff | fetch node label waofed.bitmedia.co.jp/node-status.staticpressurediff |
| MetricsAPI | NodeStatusCPUUsage | client-go access Metrics API |
| MetricsAPI | NodeStatusLogicalProcessors | client-go access Metrics API |
| DifferentialPressureAPI | NodeStatusStaticPressureDiff | via WAO DifferentialPressureAPI |
| IPMIExporter | NodeStatusAmbientTemp | via IPMI |
| Redfish | NodeStatusAmbientTemp | via Redfish REST API |
| Redfish | NodeStatusStaticPressureDiff | via Redfish REST API |

PowerConsumptionPredictor

    powerConsumptionPredictor:
      type: None
      endpoint: ""

| Type | Description | Endpoint value | Example | NodeStatus required |
|---|---|---|---|---|
| None | do nothing | ignored | "" | none |
| Fake | a fake PowerConsumptionPredictor for test, always returns ( base_watts + requestCPUMilli / 1000 * watt_per_core ) | ignored | "" | none |
| MLServer | WAO power model with MLServer | REST API MLServer instance in format {scheme+server}/v2/models/{model}/versions/{version}/** | http://hogehoge:8080/v2/models/model1/versions/v0.1.0/infer | NodeStatusCPUUsage, NodeStatusLogicalProcessors, NodeStatusAmbientTemp, NodeStatusStaticPressureDiff |
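
The formula listed for the Fake predictor can be worked through with a short sketch. The function name fake_power_watts is hypothetical, and the values 50 and 10 for base_watts and watt_per_core are made up for illustration. This README does not say how those two values are configured.

```python
def fake_power_watts(request_cpu_milli, base_watts, watt_per_core):
    """Evaluate base_watts + requestCPUMilli / 1000 * watt_per_core.

    Hypothetical re-statement of the formula listed for the Fake
    PowerConsumptionPredictor; not the project's actual code.
    """
    return base_watts + request_cpu_milli / 1000 * watt_per_core


# Illustrative values only: a 500 mCPU request is half a core.
print(fake_power_watts(500, base_watts=50, watt_per_core=10))  # 55.0
```

With these assumed values, half a core adds 0.5 * 10 = 5W on top of the 50W base.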

Uninstallation

Delete the Operator and resources with the following command.

kubectl delete -f https://github.com/Nedopro2022/wao-estimator/releases/download/v0.1.1/wao-estimator.yaml

💡 Demo using kind and FakeNodeMonitor / FakePCPredictor

// TODO

Technical Details

// TODO

Prediction models

Estimation algorithms

NodeMonitor implementations

PowerConsumptionPredictor implementations

HTTP APIs

Developing

This Operator uses Kubebuilder (v3.8.0), so development generally follows Kubebuilder conventions. See the Kubebuilder Documentation for details.

Prerequisites

Make sure you have the following tools installed:

  • Git
  • Make
  • Go
  • Docker

Run a development cluster with kind

./hack/dev-kind-reset-cluster.sh # create a K8s cluster `kind-wao-estimator`
./hack/dev-kind-deploy.sh # build and deploy the Operator

License

Copyright 2022 Bitmedia Inc.

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.