Merged
4 changes: 3 additions & 1 deletion .github/workflows/test-build.yml
@@ -29,6 +29,8 @@ jobs:
uses: actions/setup-go@v5
with:
go-version-file: 'go.mod'
- name: List open ports and processes
run: sudo lsof -i -P -n | grep LISTEN
- name: Unit test
run: make utest
race:
@@ -42,4 +44,4 @@ jobs:
with:
go-version-file: 'go.mod'
- name: Test race
run: go test -count=1 -parallel 1 -race ./...
run: go test -count=1 -parallel 1 -race -skip 'TestTracesForSanity/rdt_trace|TestPowerForSanity/power_efficiency' ./...
2 changes: 1 addition & 1 deletion Dockerfile
@@ -1,7 +1,7 @@
# Copyright (c) 2022 Intel Corporation
# SPDX-License-Identifier: Apache-2.0

FROM golang:1.24.2 AS build
FROM golang:1.24.6 AS build

WORKDIR /app

12 changes: 9 additions & 3 deletions Makefile
@@ -3,8 +3,9 @@ SCALEOUT_PLUGIN=scale_out
RMPOD_PLUGIN=rm_pod
RDT_PLUGIN=rdt
CPU_PLUGIN=cpu_scale
ENERGY_PLUGIN=energy
GO_CILINT_CHECKERS=errcheck,goimports,gosec,gosimple,govet,ineffassign,nilerr,revive,staticcheck,unused
DOCKER_IMAGE_VERSION=0.3.0
DOCKER_IMAGE_VERSION=0.4.0

api:
hack/generate_code.sh
@@ -32,7 +33,10 @@ build-plugin-rdt:
build-plugin-cpu:
CGO_ENABLED=0 go build -o bin/plugins/${CPU_PLUGIN} plugins/${CPU_PLUGIN}/cmd/${CPU_PLUGIN}.go

build-plugins: build-plugin-scaleout build-plugin-rmpod build-plugin-rdt build-plugin-cpu
build-plugin-energy:
CGO_ENABLED=0 go build -o bin/plugins/${ENERGY_PLUGIN} plugins/${ENERGY_PLUGIN}/cmd/${ENERGY_PLUGIN}.go

build-plugins: build-plugin-scaleout build-plugin-rmpod build-plugin-rdt build-plugin-cpu build-plugin-energy

controller-images:
docker build -t planner:${DOCKER_IMAGE_VERSION} . --no-cache --pull
@@ -42,6 +46,7 @@ plugin-images:
docker build -t rmpod:${DOCKER_IMAGE_VERSION} -f plugins/rm_pod/Dockerfile . --no-cache --pull
docker build -t rdt:${DOCKER_IMAGE_VERSION} -f plugins/rdt/Dockerfile . --no-cache --pull
docker build -t cpuscale:${DOCKER_IMAGE_VERSION} -f plugins/cpu_scale/Dockerfile . --no-cache --pull
docker build -t energy:${DOCKER_IMAGE_VERSION} -f plugins/energy/Dockerfile . --no-cache --pull

all-images: controller-images plugin-images

@@ -57,7 +62,8 @@ prepare-build:
go mod tidy

utest:
go test -count=1 -parallel 1 -v ./...
# Skipping certain trace tests, as they cannot be run safely on public runners.
go test -count=1 -parallel 1 -v -skip 'TestTracesForSanity/rdt_trace|TestPowerForSanity/power_efficiency' ./...

test:
hack/run_test.sh
1 change: 0 additions & 1 deletion README.md
@@ -1,5 +1,4 @@


# Intent Driven Orchestration Planner

![planner.png](planner.png)
2 changes: 1 addition & 1 deletion artefacts/deploy/manifest.yaml
@@ -199,7 +199,7 @@ spec:
serviceAccountName: planner-service-account
containers:
- name: planner
image: 127.0.0.1:5000/planner:0.3.0
image: 127.0.0.1:5000/planner:0.4.0
ports:
- containerPort: 33333
imagePullPolicy: Always
12 changes: 7 additions & 5 deletions artefacts/examples/default_profiles.yaml
@@ -5,35 +5,37 @@ metadata:
name: p50latency
spec:
type: "latency"
description: "Measures P50 latency in ms over a 30ms time window as reported by Linkerd service mesh."
description: "Measures P50 latency in ms over a 30s time window as reported by Linkerd service mesh."
---
apiVersion: "ido.intel.com/v1alpha1"
kind: KPIProfile
metadata:
name: p95latency
spec:
type: "latency"
description: "Measures P95 latency in ms over a 30ms time window as reported by Linkerd service mesh."
description: "Measures P95 latency in ms over a 30s time window as reported by Linkerd service mesh."
---
apiVersion: "ido.intel.com/v1alpha1"
kind: KPIProfile
metadata:
name: p99latency
spec:
type: "latency"
description: "Measures P99 latency in ms over a 30ms time window as reported by Linkerd service mesh."
description: "Measures P99 latency in ms over a 30s time window as reported by Linkerd service mesh."
---
apiVersion: "ido.intel.com/v1alpha1"
kind: KPIProfile
metadata:
name: throughput
spec:
type: "throughput"
description: "Measures requests per second aggregated over a 30ms time window as reported by Linkerd service mesh."
minimize: False
description: "Measures requests per second aggregated over a 30s time window as reported by Linkerd service mesh."
---
apiVersion: "ido.intel.com/v1alpha1"
kind: KPIProfile
metadata:
name: availability
spec:
type: "availability"
type: "availability"
minimize: False
2 changes: 1 addition & 1 deletion artefacts/examples/example_deployment.yaml
@@ -16,7 +16,7 @@ spec:
spec:
containers:
- name: sample-function
image: testfunction/rust_function:0.1
image: testfunction/rust_function:0.2
ports:
- containerPort: 8080
env:
19 changes: 16 additions & 3 deletions artefacts/intents_crds_v1alpha1.yaml
@@ -20,7 +20,7 @@ spec:
type: object
properties:
kind:
description: 'Kind of the owner.'
description: 'Kind of the owner (defaults to Deployment kind).'
type: string
enum:
- Deployment
@@ -33,11 +33,15 @@ spec:
- name
priority:
type: number
description: "Priority for a set of PODs"
description: "Priority for a set of PODs (defaults to 0.01)."
format: float
minimum: 0.01 # prevents any div 0!
maximum: 1.0
default: 0.01
active:
type: boolean
description: "Indicates if the planner should actively manage this intent (defaults to true)."
default: true
objectives:
type: array
description: "Objectives for a set of PODs."
@@ -54,6 +58,12 @@ spec:
measuredBy:
type: string
description: "Defines what kind of an objective this is. Also defines if the objective is an upper or lower bound objective."
tolerance:
type: number
description: "Indicates a tolerance as a fraction of the specified target value for the objective (defaults to 0.0) - e.g. 0.1 & target 10ms ==> 11ms."
format: float
minimum: 0.0
default: 0.0
required:
- name
- value
@@ -108,13 +118,16 @@ spec:
spec:
type: object
properties:
# TODO: add weight.
query:
type: string
description: "This is an optional parameter - if defined, the user needs to provide a query string defining how to capture the objective's KPI. Optional parameters - in accordance with the provided documentation - can be detailed under props."
description:
type: string
description: "Ideally includes a description on what is measured by the query - including e.g. information on units etc."
minimize:
type: boolean
description: "Indicates whether the planner should try to minimize this or not (defaults to true)."
default: true
type:
type: string
description: "Defines the type of the KPI."
31 changes: 29 additions & 2 deletions cmd/main.go
@@ -2,6 +2,8 @@ package main

import (
"flag"
"io"
"os"
"time"

"github.com/intel/intent-driven-orchestration/pkg/controller"
@@ -38,7 +40,32 @@ func main() {
}
cfg, err := common.ParseConfig(config)
if err != nil {
klog.Fatalf("Error loading planner config: %s", err)
klog.Fatalf("Error loading planner config: %v", err)
}

// set logFile
if cfg.Generic.LogFile != "" {
err := flag.Set("logtostderr", "false")
if err != nil {
klog.Fatalf("Error setting flag logtostderr: %v", err)
}
err = flag.Set("alsologtostderr", "true")
if err != nil {
klog.Fatalf("Error setting flag alsologtostderr: %v", err)
}

logFile, err := os.OpenFile(cfg.Generic.LogFile, os.O_CREATE|os.O_WRONLY|os.O_APPEND, 0600)
if err != nil {
klog.Fatalf("Failed to open log file: %v", err)
}
defer logFile.Close()

multiWriter := io.MultiWriter(os.Stdout, logFile)
klog.SetOutput(multiWriter)
defer func() {
klog.Flush()
}()
klog.Infof("Successfully added the log file to the klog output: %s", cfg.Generic.LogFile)
}

// K8s genClient setup
@@ -65,7 +92,7 @@ func main() {
planner := astar.NewAPlanner(actuatorList, cfg)
defer planner.Stop()

// This is main controller.
// This is the main controller.
tracer := controller.NewMongoTracer(cfg.Generic.MongoEndpoint)
c := controller.NewController(cfg, tracer, k8sClient, podInformerFactory.Core().V1().Pods())
c.SetPlanner(planner)
4 changes: 4 additions & 0 deletions docs/actuators.md
@@ -68,6 +68,10 @@ functions in the next section to better understand how the system selects what t
Furthermore, the implementation of ***NextState()*** can support the opportunistic planning capabilities by adding new
states that, although they do not satisfy the desired state, still move the system in the right direction.

Note that the parameters of the actions are typed as **interface{}**. Ideally, a map is used to represent the
parameters, for example a _map[string]int64_ or a _map[string]string_. Other
types of values in the map will be cast to string to support the gRPC plugin mechanism.
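
To illustrate the kind of conversion described above, here is a minimal sketch in Go. The helper `toStringParams` is hypothetical and not part of the framework; it only assumes that parameters arrive as an **interface{}** holding one of a few map types and that all values end up as strings before crossing a plugin boundary. The framework's actual conversion logic may differ.

```go
package main

import (
	"fmt"
	"strconv"
)

// toStringParams is a hypothetical helper (an assumption for illustration,
// not the framework's API). It flattens action parameters into a
// map[string]string, as a string-based plugin interface would need.
func toStringParams(params interface{}) (map[string]string, error) {
	out := make(map[string]string)
	switch p := params.(type) {
	case map[string]string:
		// Already strings: copy as-is.
		for k, v := range p {
			out[k] = v
		}
	case map[string]int64:
		// Integers are rendered in base 10.
		for k, v := range p {
			out[k] = strconv.FormatInt(v, 10)
		}
	case map[string]interface{}:
		// Any other value type is rendered with its default formatting.
		for k, v := range p {
			out[k] = fmt.Sprintf("%v", v)
		}
	default:
		return nil, fmt.Errorf("unsupported parameter type %T", params)
	}
	return out, nil
}

func main() {
	params, err := toStringParams(map[string]interface{}{
		"cpu":     int64(500),
		"option":  "performance",
		"enabled": true,
	})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(params["cpu"], params["option"], params["enabled"])
}
```

Sticking to _map[string]int64_ or _map[string]string_ avoids the lossy default formatting used for other value types in this sketch.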

#### Utility/Cost functions

Utilities are used to steer the planner. Planners will deem an action to be favorable if the actuator returns a low
Binary file modified docs/fig/intents_objectives_kpis.png
5 changes: 4 additions & 1 deletion docs/fig/intents_objectives_kpis.puml
@@ -14,11 +14,13 @@ hide class circle
class Intent {
targetKey
targetKind
Priority
priority
active
}
class Objective {
name
value
tolerance
}
enum KPIType {
latency
@@ -30,6 +32,7 @@ class KPIProfile {
query: string
endpoint: address
external: bool
minimize: bool
}
Intent "1" *-right- "1..n" Objective: objectives
Objective "0..*" -- "1" KPIProfile: measuredBy
32 changes: 17 additions & 15 deletions docs/getting_started.md
@@ -117,6 +117,7 @@ sections for each of the [framework's major components](framework.md) as well as
| Property | Description |
|----------------|-----------------------------------------------------------------------------|
| mongo_endpoint | URI for the Mongo database - representing the knowledge base of the system. |
| log_file       | (Optional) Path to a log file to which klog output is also written.         |

### Controller

@@ -185,21 +186,22 @@ Each actuator will have its own configuration.

### cpu scale actuator

| Property | Description |
|------------------------------|------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| interpreter | Path to a python interpreter. |
| analytics_script | Path to the analytics python script used to determine the scaling model. |
| cpu_max | Maximum CPU resource units (in millis) that the actuator will allow. |
| cpu_rounding | Multiple of 10 defining how to round up CPU resource units. |
| cpu_safeguard_factor | Define the factor the actuator will use to stay below the targeted objective. |
| look_back | Time in minutes defining how old the ML model can be. |
| max_proactive_cpu | Maximum CPU resource units (in millis) that the actuator will allow when proactively scaling. If set to 0, proactive planning is disabled. A fraction of this value is used for proactive scale ups/downs. |
| proactive_latency_percentage | Float defining the potential percentage change in latency by scaling the resources. |
| endpoint | Name of the endpoint to use for registering this plugin. |
| port | Port this actuator should listen on. |
| mongo_endpoint | URI for the Mongo database - representing the knowledge base of the system. |
| plugin_manager_endpoint | String defining the plugin manager's endpoint to which actuators can register themselves. |
| plugin_manager_port | Port number of the plugin manager's endpoint to which actuators can register themselves. |
| Property | Description |
|------------------------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| interpreter | Path to a python interpreter. |
| analytics_script | Path to the analytics python script used to determine the scaling model. |
| cpu_max | Maximum CPU resource units (in millis) that the actuator will allow. |
| cpu_rounding | Multiple of 10 defining how to round up CPU resource units. |
| cpu_safeguard_factor | Define the factor the actuator will use to stay below the targeted objective. |
| boostFactor | Defines the multiplication factor for calculating resource limits from requests. If set to 1.0 PODs will be in a Guaranteed QoS, smaller or larger values lead to a BestEffort or Burstable QoS accordingly. |
| look_back | Time in minutes defining how old the ML model can be. |
| max_proactive_cpu | Maximum CPU resource units (in millis) that the actuator will allow when proactively scaling. If set to 0, proactive planning is disabled. A fraction of this value is used for proactive scale ups/downs. |
| proactive_latency_percentage | Float defining the potential percentage change in latency by scaling the resources. |
| endpoint | Name of the endpoint to use for registering this plugin. |
| port | Port this actuator should listen on. |
| mongo_endpoint | URI for the Mongo database - representing the knowledge base of the system. |
| plugin_manager_endpoint | String defining the plugin manager's endpoint to which actuators can register themselves. |
| plugin_manager_port | Port number of the plugin manager's endpoint to which actuators can register themselves. |
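
As a rough illustration of the `boostFactor` property, the sketch below derives a CPU limit from a CPU request by simple multiplication. The function `limitFromRequest` and the rounding behaviour are assumptions made for this example; the actuator's real computation is not shown in this change and may differ.

```go
package main

import (
	"fmt"
	"math"
)

// limitFromRequest is a hypothetical helper (an assumption, not the
// actuator's code). It multiplies a CPU request in millis by boostFactor
// and rounds to the nearest milli to obtain the CPU limit.
func limitFromRequest(requestMillis int64, boostFactor float64) int64 {
	return int64(math.Round(float64(requestMillis) * boostFactor))
}

func main() {
	// With a factor of 1.0 the limit equals the request; a larger factor
	// leaves headroom above the request.
	for _, factor := range []float64{1.0, 1.5} {
		fmt.Printf("request=500m boostFactor=%.1f limit=%dm\n",
			factor, limitFromRequest(500, factor))
	}
}
```

With a request of 500 millis, a factor of 1.0 yields a limit of 500 millis (limit equal to request), while 1.5 yields 750 millis.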

### RDT actuator
