Merged
14 changes: 7 additions & 7 deletions README.md
@@ -66,11 +66,11 @@ helm repo add skyhook https://helm.ngc.nvidia.com/nvidia/skyhook
helm repo update
helm search repo skyhook ## should show the latest version

-# basic install
+# basic install
 helm install skyhook skyhook/skyhook-operator \
---version v0.9.1 \
+--version v0.9.2 \
 --namespace skyhook \
---create-namespace
+--create-namespace
```

### Configure Image Pull Secrets (if needed)
@@ -102,7 +102,7 @@ kubectl wait --for=condition=Ready pod -l control-plane=controller-manager -n sk
kubectl get pods -l control-plane=controller-manager -n skyhook -o jsonpath='{.items[0].status.conditions[?(@.type=="Ready")].status}'

# Verify the CRDs are installed
-kubectl get crd | grep skyhook
+kubectl get crd | grep skyhook

# Verify packages are working
kubectl apply -f - <<EOF
@@ -174,13 +174,13 @@ The Status will show the overall package status as well as the status of each no
# View node state annotations for a specific Skyhook
kubectl get nodes -o jsonpath='{range .items[*]}{.metadata.name}{": "}{.metadata.annotations.skyhook\.nvidia\.com/nodeState_<skyhook-name>}{"\n"}{end}'
```

### Stages
The operator will apply steps in a package throughout different lifecycle stages. This ensures that the right steps are applied in the right situations and in the correct order.
 - Upgrade: This stage will be run whenever a package's version is upgraded in the SCR.
 - Uninstall: This stage will be run whenever a package's version is downgraded or the package is removed from the SCR.
 - Apply: This stage will always be run at least once.
-- Config: This stage will run when a configmap is changed and on the first SCR application.
+- Config: This stage will run when a configmap is changed and on the first SCR application.
 - Interrupt: This stage will run when a package has an interrupt defined, or when a key in the package's configmap that has a config interrupt defined changes value.
 - Post-Interrupt: This stage will run when a package's interrupt has finished.

@@ -199,7 +199,7 @@ This ensures that when operations like kernel module unloading or system reboots

**NOTE**: If a package is removed from the SCR, then only the uninstall stage for that package will be run.

-**Semantic versioning is strictly enforced in the operator** in order to support upgrade and uninstall. Semantic versioning allows the
+**Semantic versioning is strictly enforced in the operator** in order to support upgrade and uninstall. Semantic versioning allows the
operator to know which way the package is going while also enforcing best versioning practices.
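
As a rough illustration of how comparing two versions reveals "which way the package is going", the sketch below classifies a version change as an upgrade or a downgrade. It is not the operator's implementation: `version_direction` is a hypothetical helper, and it assumes GNU `sort -V`, which orders plain `MAJOR.MINOR.PATCH` versions correctly but does not implement semver pre-release precedence.

```shell
# Hypothetical helper (not part of Skyhook): classify a package version change.
# Assumes GNU coreutils `sort -V`; pre-release tags are not ordered per semver.
version_direction() {
  local old new lowest
  old="${1#v}"   # drop an optional leading "v"
  new="${2#v}"
  if [ "$old" = "$new" ]; then
    echo "unchanged"
  else
    # The first line of the version-sorted pair is the lower version.
    lowest="$(printf '%s\n%s\n' "$old" "$new" | sort -V | head -n 1)"
    if [ "$lowest" = "$old" ]; then
      echo "upgrade"     # new version is higher: the Upgrade stage applies
    else
      echo "downgrade"   # new version is lower: the Uninstall stage applies
    fi
  fi
}

version_direction v0.9.1 v0.9.2   # prints "upgrade"
version_direction 1.10.0 1.9.3    # prints "downgrade"
```

The second call shows why a version-aware comparison is needed: a plain string comparison would rank `1.10.0` below `1.9.3`.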

**For detailed information about our versioning strategy, git tagging conventions, and component release process, see [docs/versioning.md](docs/versioning.md) and [docs/release-process.md](docs/release-process.md).**
4 changes: 2 additions & 2 deletions chart/Chart.yaml
@@ -5,9 +5,9 @@ type: application
# This is the chart version. This version number must be incremented each time you make changes to the helm chart, or
# if the agent version or operator version is updated.
# Versions are expected to follow Semantic Versioning (https://semver.org/)
-version: v0.9.0
+version: v0.9.2
# This is the version number of the operator container being deployed.
# Versions are expected to follow Semantic Versioning (https://semver.org/)
appVersion: v0.9.0
# This is the minimum version of Kubernetes that the operator supports and is tested against.
-kubeVersion: ">=1.30.0"
+kubeVersion: ">=1.30.0-0"
10 changes: 5 additions & 5 deletions chart/values.yaml
@@ -57,16 +57,16 @@ controllerManager:
 drop:
 - ALL
 env:
-## copyDirRoot is the directory for which the operator will work from on the host.
+## copyDirRoot is the directory for which the operator will work from on the host.
## Some environments may require this to be set to a specific directory.
copyDirRoot: /var/lib/skyhook
-## agentLogRoot is the directory for which the agent will write logs.
+## agentLogRoot is the directory for which the agent will write logs.
## Some environments may require this to be set to a specific directory.
agentLogRoot: /var/log/skyhook
## leaderElection: "true" will enable leader election for the operator controller
## Default is "true" and is required for production.
leaderElection: "true"
-## logLevel: "info" is the log level for the operator controller.
+## logLevel: "info" is the log level for the operator controller.
## If you want more or less logs, change this value to "debug" or "error".
logLevel: info
metricsPort: :8080
@@ -85,7 +85,7 @@ controllerManager:
## agentImage: is the image used for the agent container. This image is the default for this install, but can be overridden in the CR at package level.
agent:
repository: nvcr.io/nvidia/skyhook/agent
-tag: "v6.3.0"
+tag: "v6.3.1"

# resources: If this is defined it will override the default calculation for resources
# from estimatedNodeCount and estimatedPackageCount. The below values are
@@ -147,7 +147,7 @@ webhook:
secretName: webhook-cert
## serviceName: name of the service to expose the webhook
serviceName: skyhook-operator-webhook-service
-## enable: "true" will enable the webhook setup in the operator controller.
+## enable: "true" will enable the webhook setup in the operator controller.
## Default is "true" and is required for production.
enable: true
