Resource Management

Resource slots

Backend.AI abstracts each different type of computing resources as a "resource slot". Resource slots are distinguished by its name consisting of two parts: the device name and the slot name.

Resource slot name	Device name	Slot name
`cpu`	`cpu`	(implicitly defined as `root`)
`mem`	`mem`	(implicitly defined as `root`)
`cuda.device`	`cuda`	`device`
`cuda.shares`	`cuda`	`shares`
`cuda.mig-2c10g`	`cuda`	`mig-2c10g`

Each resource slot has a slot type as follows:

Slot type	Meaning	Examples
`COUNT`	The value of the resource slot is an integer or decimal to represent how many of the device(s) are available/allocated. It may also represent fractions of devices.	`cpu`, `cuda.device`, `cuda.shares`
`BYTES`	The value of the resource slot is an integer to represent how many bytes of the resources are available/allocated.	`mem`
`UNIQUE`	Only "each one" of the device can be allocated to each different kernel exclusively.	`cuda.mig-10g`

Compute plugins

Backend.AI administrators may install one or more compute plugins to each agent. Without any plugin, only the intrinsic cpu and mem resource slots are available.

Each compute plugin may declare one or more resource slots. The plugin is invoked upon startup of the agent to get the list of devices and the resource slots to report. Administrators can inspect the per-agent accelerator details provided by the compute plugins in the control panel.

The most well-known compute plugin is cuda_open, which is included in the open source version. It declares cuda.device resource slot that represents each NVIDIA GPU as one unit.

There is a special compute plugin to simulate non-existent devices: mock. Developers may put a local configuration to declare an arbitrary set of devices and resource slots to test the schedulers and the frontend. It is useful to develop integrations with new hardware devices before you get the actual devices on your hands.

Resource group is a logical group of the Agents with independent schedulers. Each agent belongs to a single resource group only. It self-reports which resource group to join when sending the heartbeat messages, but the specified resource group must exist in prior.

concept-scheduler

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

resources.rst

resources.rst

Resource Management

Resource slots

Compute plugins

Files

resources.rst

Latest commit

History

resources.rst

File metadata and controls

Resource Management

Resource slots

Compute plugins