
feat: allow monitoring image to collect gpu metrics #8

Merged
ovesh merged 2 commits into dg-ci from feat/support-gpu-metrics on Oct 25, 2024

Conversation

ovesh commented Oct 25, 2024

To support pynvml without inflating the size of the monitoring Docker image, tasks need to run on the non-default batch-debian VM image. With that image, Google Batch sets up the NVIDIA drivers on the VM and mounts them into the container, so the drivers do not have to be installed in the monitoring image.

An added benefit is reduced Docker image sizes for all tasks that require a GPU.
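
For illustration, here is a rough sketch of what the relevant part of a Google Batch job's allocation policy might look like when expressed as a Python dict. The helper name `build_allocation_policy` and its parameters are hypothetical and are not taken from this PR. The field names (`bootDisk.image`, `accelerators`, `installGpuDrivers`) follow the public Batch API, but whether this change sets exactly these fields is an assumption.

```python
import json


def build_allocation_policy(machine_type, gpu_type=None, gpu_count=0):
    """Return a Batch AllocationPolicy fragment as a plain dict.

    Hypothetical helper for illustration only; the actual backend code in
    this PR is not shown here and may build the policy differently.
    """
    instance_policy = {
        "machineType": machine_type,
        # Non-default VM image described in the PR text.
        "bootDisk": {"image": "batch-debian"},
    }
    instance = {"policy": instance_policy}

    if gpu_type and gpu_count > 0:
        instance_policy["accelerators"] = [{"type": gpu_type, "count": gpu_count}]
        # Assumption: this is the flag that asks Batch to install the
        # NVIDIA drivers on the VM for GPU tasks.
        instance["installGpuDrivers"] = True

    return {"instances": [instance]}


if __name__ == "__main__":
    policy = build_allocation_policy("n1-standard-4", "nvidia-tesla-t4", 1)
    print(json.dumps(policy, indent=2))
```

In this sketch a non-GPU task gets the same boot image but no accelerators and no driver installation request.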

ovesh commented Oct 25, 2024

This change was verified manually by running a workflow with https://github.com/deepgenomics/cromwell-monitor/ as the monitoring image, containing both a GPU-enabled task and a non-GPU task, and then inspecting the Google Batch logs in the console.

This area of the code is not covered by existing unit tests.
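
Since the monitoring image runs alongside both GPU and non-GPU tasks, a collector built on pynvml has to tolerate the driver being absent. The following is a minimal sketch of that idea, not the cromwell-monitor implementation: the function name `collect_gpu_metrics`, the returned keys, and the optional `nvml` parameter (used to pass in a stub for testing) are all hypothetical. The `nvml*` calls are standard pynvml functions.

```python
def collect_gpu_metrics(nvml=None):
    """Return one dict of metrics per visible GPU, or [] if none are usable.

    `nvml` is a hypothetical injection point so a stub can stand in for
    pynvml in tests; by default the real pynvml module is imported.
    """
    if nvml is None:
        try:
            import pynvml as nvml
        except ImportError:
            # pynvml is not installed in this environment.
            return []

    try:
        nvml.nvmlInit()
    except Exception:
        # Typically raised when no NVIDIA driver library is available,
        # for example on a task without a GPU.
        return []

    try:
        metrics = []
        for index in range(nvml.nvmlDeviceGetCount()):
            handle = nvml.nvmlDeviceGetHandleByIndex(index)
            utilization = nvml.nvmlDeviceGetUtilizationRates(handle)
            memory = nvml.nvmlDeviceGetMemoryInfo(handle)
            metrics.append({
                "index": index,
                "gpu_util_percent": utilization.gpu,
                "memory_used_bytes": memory.used,
                "memory_total_bytes": memory.total,
            })
        return metrics
    finally:
        # Release NVML once initialization has succeeded.
        nvml.nvmlShutdown()


if __name__ == "__main__":
    print(collect_gpu_metrics())
```

Passing a stub object in place of pynvml is one way such logic could be exercised without GPU hardware.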

ovesh merged commit 2c87015 into dg-ci on Oct 25, 2024
SophiaPerzan-DG pushed a commit that referenced this pull request Feb 12, 2025
In order to support using pynvml without blowing up the size
of the monitoring Docker image, tasks need to run on the non-default
batch-debian VM image. This makes Google Batch set up the NVIDIA
drivers on the VM and mount them to the container instead of
relying on the drivers being installed on the monitor image.

An added benefit is reduced Docker image sizes for all tasks
that require a GPU.

Also: fix GitHub Actions syntax error
SophiaPerzan-DG pushed a commit that referenced this pull request Feb 12, 2025
SophiaPerzan-DG pushed a commit that referenced this pull request Feb 20, 2025
SophiaPerzan-DG pushed a commit that referenced this pull request Feb 21, 2025
SophiaPerzan-DG pushed a commit that referenced this pull request Mar 3, 2025
SophiaPerzan-DG pushed a commit that referenced this pull request Jun 4, 2025
SophiaPerzan-DG pushed a commit that referenced this pull request Dec 2, 2025
SophiaPerzan-DG pushed a commit that referenced this pull request Dec 4, 2025