added a config option for easy gpu sharing to DockerSpawner #473

Closed
wants to merge 4 commits into from
44 changes: 44 additions & 0 deletions dockerspawner/dockerspawner.py
@@ -1113,6 +1113,42 @@ def _cast_cpu_limit(self, proposal):
"""cast cpu_limit to a float if it's callable"""
return self._eval_if_callable(proposal.value)

gpu_ids = Union(
[Callable(), Unicode(allow_none=True)],
help="""
GPU IDs to share with containers.

Default is None, which means no GPUs are shared.

Acceptable values are None, the literal string 'all', or a string
containing a comma-separated list of GPU integer IDs or UUIDs.

Examples:
c.DockerSpawner.gpu_ids = None # Do not share GPUs with containers
c.DockerSpawner.gpu_ids = 'all' # Share all GPUs with containers
c.DockerSpawner.gpu_ids = '0' # Share the GPU with ID 0
c.DockerSpawner.gpu_ids = '0,1,2' # Share the GPUs with IDs 0, 1 and 2

Alternatively, you can pass a callable that takes the spawner as
its only argument and returns one of the acceptable values above:

def per_user_gpu_ids(spawner):
    username = spawner.user.name
    gpu_assign = {'alice': '0', 'bob': '1,2'}
    return gpu_assign.get(username, None)
c.DockerSpawner.gpu_ids = per_user_gpu_ids

Note that before using this config option, you have to:
1- install the NVIDIA Container Toolkit and make sure Docker can run
containers with GPU access
2- use an image with a CUDA version supported by your host's GPU driver
""",
).tag(config=True)

@validate('gpu_ids')
def _cast_gpu_ids(self, proposal):
"""cast gpu_ids to a string if it's callable"""
return self._eval_if_callable(proposal.value)
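Like `_cast_cpu_limit` above, the `gpu_ids` validator resolves a callable to its return value, so a per-user assignment function collapses to a plain string (or None) before the container is created. A minimal self-contained sketch of that resolution logic; `FakeSpawner` and `resolve_gpu_ids` are illustrative stand-ins, not dockerspawner APIs:

```python
class FakeSpawner:
    """Stand-in for a spawner: just a user name and a gpu_ids setting."""

    class _User:
        def __init__(self, name):
            self.name = name

    def __init__(self, username, gpu_ids):
        self.user = self._User(username)
        self.gpu_ids = gpu_ids


def resolve_gpu_ids(spawner):
    """Return gpu_ids as a string or None, calling it if it's callable."""
    value = spawner.gpu_ids
    if callable(value):
        value = value(spawner)
    return value


def per_user_gpu_ids(spawner):
    # Static assignment table, as in the docstring example
    gpu_assign = {'alice': '0', 'bob': '1,2'}
    return gpu_assign.get(spawner.user.name, None)


print(resolve_gpu_ids(FakeSpawner('alice', per_user_gpu_ids)))  # 0
print(resolve_gpu_ids(FakeSpawner('carol', per_user_gpu_ids)))  # None
print(resolve_gpu_ids(FakeSpawner('dave', 'all')))              # all
```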

async def create_object(self):
"""Create the container/service object"""

@@ -1135,6 +1171,7 @@ async def create_object(self):
binds=self.volume_binds,
links=self.links,
mounts=self.mount_binds,
device_requests=[],
)

if getattr(self, "mem_limit", None) is not None:
@@ -1149,6 +1186,13 @@
)
host_config["cpu_quota"] = int(self.cpu_limit * cpu_period)

if getattr(self, "gpu_ids", None) is not None:
host_config['device_requests'].append(
docker.types.DeviceRequest(
device_ids=[f"{self.gpu_ids}"], capabilities=[['gpu']]
)
)

if not self.use_internal_ip:
host_config["port_bindings"] = {self.port: (self.host_ip,)}
host_config.update(self._render_templates(self.extra_host_config))
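The `gpu_ids` string ultimately has to be translated into the Docker Engine API's `DeviceRequests` payload: `'all'` is conventionally expressed as a count of -1 (what `docker run --gpus all` sends), while explicit IDs or UUIDs each become one entry in `DeviceIDs`. A sketch of that mapping using plain dicts instead of `docker.types.DeviceRequest`; `gpu_device_request` is a hypothetical helper, not part of dockerspawner:

```python
def gpu_device_request(gpu_ids):
    """Build one DeviceRequests-shaped dict from a gpu_ids string.

    'all' maps to Count=-1 (request every GPU); otherwise each
    comma-separated GPU ID or UUID becomes its own DeviceIDs entry.
    """
    if gpu_ids == 'all':
        return {
            'Driver': '',
            'Count': -1,           # -1 means "all available GPUs"
            'DeviceIDs': None,
            'Capabilities': [['gpu']],
        }
    return {
        'Driver': '',
        'Count': 0,
        'DeviceIDs': gpu_ids.split(','),  # one entry per ID/UUID
        'Capabilities': [['gpu']],
    }


print(gpu_device_request('0,1,2')['DeviceIDs'])  # ['0', '1', '2']
print(gpu_device_request('all')['Count'])        # -1
```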