lablup/backend.ai-agent#70 will allow size configuration and enforcement of per-container scratch directories.
The current manager does not take the disk space into account when scheduling, so we need to improve that.
manager: Add "scratch" intrinsic resource slot type
agent:
Add an intrinsic compute resource plugin, "ScratchDevice" and "ScratchPlugin", to ai.backend.agent.intrinsic
Report the space availability info of the disk where the scratch root resides to the manager
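As a rough illustration of the reporting step, the sketch below reads the capacity of the filesystem that holds the scratch root using only the Python standard library. The function name `scratch_space_report` and the shape of the returned dict are hypothetical; they are not taken from the Backend.AI agent code, and the real plugin would report through whatever interface the intrinsic plugins use.

```python
import shutil


def scratch_space_report(scratch_root):
    """Return disk capacity figures (in bytes) for the filesystem
    on which `scratch_root` resides.

    Hypothetical helper for illustration only; not an existing
    Backend.AI API.
    """
    # shutil.disk_usage() reports on the filesystem containing the path,
    # so it reflects the disk where the scratch root lives.
    usage = shutil.disk_usage(scratch_root)
    return {
        "total": usage.total,
        "used": usage.used,
        "free": usage.free,
    }


if __name__ == "__main__":
    # The current directory stands in for the real scratch root here.
    print(scratch_space_report("."))
```

The "free" figure is what a scheduler would compare against the requested scratch size before placing a container on this agent.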
By default, Docker containers (Docker 18.09) mount 320 MiB of writable tmpfs space in total (five mounts of 64 MiB each):
tmpfs on /dev type tmpfs (rw,nosuid,size=65536k,mode=755)
tmpfs on /proc/kcore type tmpfs (rw,nosuid,size=65536k,mode=755)
tmpfs on /proc/keys type tmpfs (rw,nosuid,size=65536k,mode=755)
tmpfs on /proc/timer_list type tmpfs (rw,nosuid,size=65536k,mode=755)
tmpfs on /proc/sched_debug type tmpfs (rw,nosuid,size=65536k,mode=755)
User programs are unlikely to fill up all of the above spaces, but for system stability we need to consider that scenario.
So we need to subtract the tmpfs sizes listed above when setting the memory limit for the container (#52).
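The arithmetic can be checked with a short sketch that parses the mount lines above, sums the tmpfs sizes, and subtracts the total from a requested memory limit. The function names and the 1 GiB request are assumptions made for illustration; how the agent actually applies the adjustment is up to the implementation in #52.

```python
import re

# Mount lines copied from the listing above.
MOUNT_LINES = """\
tmpfs on /dev type tmpfs (rw,nosuid,size=65536k,mode=755)
tmpfs on /proc/kcore type tmpfs (rw,nosuid,size=65536k,mode=755)
tmpfs on /proc/keys type tmpfs (rw,nosuid,size=65536k,mode=755)
tmpfs on /proc/timer_list type tmpfs (rw,nosuid,size=65536k,mode=755)
tmpfs on /proc/sched_debug type tmpfs (rw,nosuid,size=65536k,mode=755)
"""

# Matches the "size=<N>k" mount option (size in KiB).
SIZE_RE = re.compile(r"size=(\d+)k")


def tmpfs_total_bytes(mount_output):
    """Sum the sizes of all tmpfs mounts found in `mount_output`."""
    total = 0
    for line in mount_output.splitlines():
        if " type tmpfs " not in line:
            continue
        match = SIZE_RE.search(line)
        if match:
            total += int(match.group(1)) * 1024
    return total


def adjusted_memory_limit(requested_bytes, mount_output):
    """Subtract the tmpfs overhead from a requested memory limit.

    Hypothetical helper; raises if nothing would be left over.
    """
    overhead = tmpfs_total_bytes(mount_output)
    if overhead >= requested_bytes:
        raise ValueError("requested memory does not cover tmpfs overhead")
    return requested_bytes - overhead


if __name__ == "__main__":
    mib = 1024 * 1024
    print(tmpfs_total_bytes(MOUNT_LINES) // mib)               # 320
    print(adjusted_memory_limit(1024 * mib, MOUNT_LINES) // mib)  # 704
```

With the five 64 MiB mounts, a container that requests 1 GiB would be left with 704 MiB after the subtraction.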