Commit
moved deployment to kubernetes files
Summary:
Terraform does not support deployments on GCP using a GPU at the moment, so we need to deploy such cases using plain Kubernetes configuration files. The buildbot mlir-nvidia is configured in `deployment-mlir-nvidia.yaml` in this folder.

Reviewers: tra

Subscribers: mehdi_amini, rriddle, jpienaar, shauheen, antiagainst, nicolasvasilache, arpith-jacob, mgester, lucyrfox, aartbik, liufengdb, stephenneuendorffer, Joonsoo, grosul1, Kayjukh, jurahul, msifontes

Differential Revision: https://reviews.llvm.org/D82434
1 parent d99cf8c, commit 2446cff
Showing 3 changed files with 87 additions and 81 deletions.
@@ -0,0 +1,63 @@
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: mlir-nvidia
spec:
  # number of instances we want to run
  replicas: 1
  selector:
    matchLabels:
      app: buildbot-mlir-nvidia
  # define strategy for updating the images
  strategy:
    rollingUpdate:
      # do not deploy more replicas, as the buildbot server
      # can't handle multiple workers with the same credentials
      maxSurge: 0
      # Allow 0 replicas during updates.
      maxUnavailable: 1
    type: RollingUpdate
  template:
    metadata:
      labels:
        app: buildbot-mlir-nvidia
    spec:
      containers:
      # the image and version we want to run
      - image: gcr.io/sanitizer-bots/buildbot-mlir-nvidia:9
        name: mlir-nvidia
        # reserve "<number of cores>-1" for this image, kubernetes also
        # needs <1 core for management tools
        resources:
          limits:
            cpu: "15"
            memory: 10Gi
            # also request to use the GPU
            # Note: this does not work in terraform at the moment
            nvidia.com/gpu: "1"
          requests:
            cpu: "15"
            memory: 10Gi
            nvidia.com/gpu: "1"
        volumeMounts:
        # mount the secrets into a folder
        - mountPath: /secrets
          mountPropagation: None
          name: buildbot-token
      # specify the node pool on which to deploy
      nodeSelector:
        pool: nvidia-16core-pool
      restartPolicy: Always
      # FIXME: do we need this if we requested a GPU?
      #tolerations:
      #- effect: NoSchedule
      #  key: nvidia.com/gpu
      #  operator: Equal
      #  value: present
      volumes:
      # declare the secret as a volume so we can mount it
      - name: buildbot-token
        secret:
          optional: false
          secretName: password-mlir-nvidia
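
The manifest mounts a Secret named `password-mlir-nvidia` at `/secrets` with `optional: false`, so the pod will not start until that Secret exists in the same namespace. The commit does not show how the Secret is created; the fragment below is only a sketch of what such an object could look like. The key name `token` and the placeholder value are assumptions for illustration, not taken from the repository.

```yaml
# Hypothetical sketch: only the Secret name comes from the manifest above.
# The key name `token` and its value are assumptions.
apiVersion: v1
kind: Secret
metadata:
  # must match `secretName` in the Deployment's volume definition
  name: password-mlir-nvidia
type: Opaque
stringData:
  # each key becomes a file under the mount path, here /secrets/token
  token: "<buildbot worker password>"
```

With a Secret of this shape, the container would read the credential from a file under `/secrets`, one file per key.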