Consider scratch disk sizes in the scheduler #49

Open
4 tasks
achimnol opened this issue Aug 20, 2019 · 3 comments
Labels
type:feature Add new features
Milestone

Comments

@achimnol
Member

achimnol commented Aug 20, 2019

lablup/backend.ai-agent#70 will allow configuring and enforcing the size of per-container scratch directories.
The current manager does not take disk space into account when scheduling, so the scheduler needs to be extended to consider it.

  • manager: Add "scratch" intrinsic resource slot type
  • agent
    • Add an intrinsic compute resource plugin, "ScratchDevice" and "ScratchPlugin", to ai.backend.agent.intrinsic
    • Report the space availability info of the disk where the scratch root resides to the manager
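
As a rough illustration of the reporting step, the agent could measure the filesystem that holds the scratch root with the standard library. This is only a sketch: the function name `scratch_space_report` and the shape of the returned payload are hypothetical and not part of the existing agent or manager API.

```python
import shutil


def scratch_space_report(scratch_root: str) -> dict:
    """Return capacity and free space (in bytes) of the disk holding scratch_root.

    The "scratch" key and the payload layout are assumptions for illustration;
    the real slot name and report format are up to the manager/agent protocol.
    """
    usage = shutil.disk_usage(scratch_root)  # (total, used, free) in bytes
    return {
        "scratch": {
            "capacity": usage.total,
            "available": usage.free,
        }
    }
```

The manager could then treat `available` like any other resource slot amount when deciding where a session fits.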


@achimnol
Member Author

achimnol commented Sep 8, 2019

By default, Docker containers (Docker 18.09) get 320 MiB of writable tmpfs space in total, spread over five mounts of 64 MiB each:

tmpfs on /dev type tmpfs (rw,nosuid,size=65536k,mode=755)
tmpfs on /proc/kcore type tmpfs (rw,nosuid,size=65536k,mode=755)
tmpfs on /proc/keys type tmpfs (rw,nosuid,size=65536k,mode=755)
tmpfs on /proc/timer_list type tmpfs (rw,nosuid,size=65536k,mode=755)
tmpfs on /proc/sched_debug type tmpfs (rw,nosuid,size=65536k,mode=755)

User programs are unlikely to fill up all of the above spaces, but for system stability we need to account for that scenario.

So we need to subtract the following values when setting the memory limit for a container (#52):

  • all default tmpfs sizes
  • scratch size if the scratch directory uses tmpfs
  • shared memory size if allocated (default: 64 MiB)
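
The subtraction above can be sketched as follows. The function name, its parameters, and the error handling are hypothetical; the figures only mirror the numbers quoted in this comment (five 64 MiB tmpfs mounts and a 64 MiB shared memory default).

```python
MiB = 2 ** 20


def effective_memory_limit(
    requested: int,
    tmpfs_sizes: list,
    scratch_size: int = 0,
    scratch_on_tmpfs: bool = False,
    shm_size: int = 0,
) -> int:
    """Return the memory limit left after reserving memory-backed spaces.

    All sizes are in bytes. This is a hypothetical helper, not the agent's
    actual implementation.
    """
    reserved = sum(tmpfs_sizes)      # all default tmpfs mounts
    if scratch_on_tmpfs:
        reserved += scratch_size     # scratch only counts if it lives on tmpfs
    reserved += shm_size             # shared memory, if allocated
    limit = requested - reserved
    if limit <= 0:
        raise ValueError("requested memory is smaller than the reserved spaces")
    return limit


# Example: 1 GiB requested, five 64 MiB tmpfs mounts,
# a 128 MiB tmpfs-backed scratch directory, and 64 MiB of shared memory.
limit = effective_memory_limit(
    1024 * MiB,
    [64 * MiB] * 5,
    scratch_size=128 * MiB,
    scratch_on_tmpfs=True,
    shm_size=64 * MiB,
)
```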

@achimnol
Member Author

achimnol commented Sep 8, 2019

We could list the above /proc/ mounts from the docker inspect result, but we would still need to figure out which of them are writable and what their size limits are:

"MaskedPaths": [
    "/proc/asound",
    "/proc/acpi",
    "/proc/kcore",
    "/proc/keys",
    "/proc/latency_stats",
    "/proc/timer_list",
    "/proc/timer_stats",
    "/proc/sched_debug",
    "/proc/scsi",
    "/sys/firmware"
],
"ReadonlyPaths": [
    "/proc/bus",
    "/proc/fs",
    "/proc/irq",
    "/proc/sys",
    "/proc/sysrq-trigger"
]
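
A minimal sketch of pulling these two lists out of a parsed `docker inspect` entry is shown below. It assumes both keys sit under `HostConfig`, as they do in the inspect output of recent Docker versions; the helper name `proc_path_sets` is hypothetical. Note that this yields only the paths, not their writability or size limits.

```python
from typing import List, Tuple


def proc_path_sets(inspect_entry: dict) -> Tuple[List[str], List[str]]:
    """Return (masked_paths, readonly_paths) from one `docker inspect` entry.

    Assumes the lists are stored under "HostConfig"; missing or null values
    are treated as empty lists.
    """
    host_config = inspect_entry.get("HostConfig") or {}
    masked = list(host_config.get("MaskedPaths") or [])
    readonly = list(host_config.get("ReadonlyPaths") or [])
    return masked, readonly
```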

@achimnol
Member Author

achimnol commented Sep 8, 2019

The agent can list those in-container /proc mounts by reading /proc/{host-pid-of-container-process}/mounts on the host.
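
As a sketch of that approach, the mounts file can be parsed to find tmpfs mounts, whether each is writable, and its size limit. The helper names are hypothetical, and the parser assumes the usual whitespace-separated `/proc/<pid>/mounts` layout (device, mount point, filesystem type, options, ...) with sizes reported as `size=<n>k`.

```python
from pathlib import Path

_UNITS = {"k": 2 ** 10, "m": 2 ** 20, "g": 2 ** 30}


def _parse_size(value: str) -> int:
    """Convert a size option value such as '65536k' into bytes."""
    suffix = value[-1].lower()
    if suffix in _UNITS:
        return int(value[:-1]) * _UNITS[suffix]
    return int(value)


def parse_tmpfs_mounts(mounts_text: str) -> dict:
    """Map each tmpfs mount point to its writability and size limit in bytes."""
    result = {}
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4 or fields[2] != "tmpfs":
            continue
        mount_point, options = fields[1], fields[3].split(",")
        size = None  # stays None when no explicit size option is present
        for opt in options:
            if opt.startswith("size="):
                size = _parse_size(opt[len("size="):])
        result[mount_point] = {"writable": "rw" in options, "size": size}
    return result


def read_container_tmpfs_mounts(host_pid: int) -> dict:
    """Read the mount table of a container process via its host PID."""
    return parse_tmpfs_mounts(Path(f"/proc/{host_pid}/mounts").read_text())
```

Summing the `size` values of the writable entries gives the amount to reserve for the default tmpfs mounts.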

@achimnol achimnol modified the milestones: 19.09, 20.03 Jan 20, 2020