From a66dda5ae816a6a8936645fe0520cb4dc6354137 Mon Sep 17 00:00:00 2001
From: Bohumir Zamecnik
Date: Mon, 22 Oct 2018 12:01:13 +0200
Subject: [PATCH] Install as a service via systemd (Docker cannot be used).

---
 DOCKER.md           | 30 +++++++++++++++++++
 Makefile            | 12 ++++----
 README.md           | 71 ++++++++++++++++++++++++++++++++-------------
 nvgpu-agent.service | 15 ++++++++++
 4 files changed, 102 insertions(+), 26 deletions(-)
 create mode 100644 DOCKER.md
 create mode 100644 nvgpu-agent.service

diff --git a/DOCKER.md b/DOCKER.md
new file mode 100644
index 0000000..6dbeca4
--- /dev/null
+++ b/DOCKER.md
@@ -0,0 +1,30 @@
+## Installation and running via Docker
+
+For easier deployment of the agent apps we can use Docker.
+
+> `psutil` cannot see process details (user, creation time, command) of processes
+on the host OS, since a container is by design isolated from the host. This is an
+inherent limitation of the dockerized solution and I'm not sure it can work at all.
+See https://github.com/rossumai/nvgpu/issues/2.
+
+It requires [nvidia-docker](https://github.com/NVIDIA/nvidia-docker) to be installed.
+
+```bash
+# build the image
+docker build -t nvgpu .
+
+# run CLI
+nvidia-docker run --rm nvgpu nvl
+
+# run agent
+nvidia-docker run --rm -p 1080:80 nvgpu
+
+# run the master with agents specified in ~/nvgpu_master.cfg
+nvidia-docker run --rm -p 1080:80 -v $(pwd)/nvgpu_master.cfg:/etc/nvgpu.cfg nvgpu
+
+open http://localhost:1080
+```
+
+You can set the containers to start automatically with the `--restart always` option.
+
+Note: Docker containers use a generated hash as their hostname (not the host machine's hostname).
diff --git a/Makefile b/Makefile
index a72b08f..560fe14 100644
--- a/Makefile
+++ b/Makefile
@@ -31,9 +31,9 @@ docker_build:
 
 docker_run_nvl:
 	nvidia-docker run --rm nvgpu nvl
-
-docker_run_agent:
-	nvidia-docker run --rm -p 1080:80 nvgpu
-
-docker_run_master:
-	nvidia-docker run --rm -p 1080:80 -v $(pwd)/nvgpu_master.cfg:/etc/nvgpu.cfg nvgpu
+#
+#docker_run_agent:
+#	nvidia-docker run --rm -p 1080:80 nvgpu
+#
+#docker_run_master:
+#	nvidia-docker run --rm -p 1080:80 -v $(pwd)/nvgpu_master.cfg:/etc/nvgpu.cfg nvgpu
diff --git a/README.md b/README.md
index 8025359..3400331 100644
--- a/README.md
+++ b/README.md
@@ -21,15 +21,23 @@ status in a web application.
 
 ## Installing
 
+For the current user:
+
+```bash
+pip install -U nvgpu
 ```
-pip install nvgpu
+
+or system-wide:
+
+```bash
+sudo -H pip install -U nvgpu
 ```
 
 ## Usage examples
 
 Command-line interface:
 
-```
+```bash
 # grab all available GPUs
 CUDA_VISIBLE_DEVICES=$(nvgpu available)
 
@@ -60,7 +68,7 @@ $ nvl
 
 Python API:
 
-```
+```python
 import nvgpu
 
 nvgpu.available_gpus()
@@ -97,7 +105,7 @@ Agents can also display their status by default.
 
 ### Agent
 
-```
+```bash
 FLASK_APP=nvgpu.webapp flask run --host 0.0.0.0 --port 1080
 ```
 
@@ -118,37 +126,60 @@ AGENTS = [
 ]
 ```
 
-```
+```bash
 NVGPU_CLUSTER_CFG=/path/to/nvgpu_master.cfg FLASK_APP=nvgpu.webapp flask run --host 0.0.0.0 --port 1080
 ```
 
 Open the master in the web browser: http://node01:1080.
 
-## Installation and running via Docker
+## Installing as a service
 
-For easier deployment of the agent apps we can use Docker.
-
-It needs [nvidia-docker](https://github.com/NVIDIA/nvidia-docker) installed.
+On Ubuntu with `systemd` we can install the agents/master as a service to be
+run automatically on system start.
 
 ```bash
-# build the image
-docker build -t nvgpu .
+# create an unprivileged system user
+sudo useradd -r nvgpu
+```
+
+Copy [nvgpu-agent.service](nvgpu-agent.service) to:
+
+```bash
+sudo vi /etc/systemd/system/nvgpu-agent.service
+```
+
+Set the list of agents in the configuration file for the master:
+
+```bash
+sudo vi /etc/nvgpu.conf
+```
+
+```python
+AGENTS = [
+    # direct access without using HTTP
+    'self',
+    'http://node01:1080',
+    'http://node02:1080',
+    'http://node03:1080',
+    'http://node04:1080',
+]
+```
+
+Set up and start the service:
+
+```bash
+# enable for automatic startup at boot
+sudo systemctl enable nvgpu-agent.service
+# start
+sudo systemctl start nvgpu-agent.service
+# check the status
+sudo systemctl status nvgpu-agent.service
+```
+
+```bash
+# check the service
+open http://localhost:1080
+```
 
 ## Author
 
diff --git a/nvgpu-agent.service b/nvgpu-agent.service
new file mode 100644
index 0000000..76609ae
--- /dev/null
+++ b/nvgpu-agent.service
@@ -0,0 +1,15 @@
+[Unit]
+Description=NVGPU agent
+
+[Service]
+User=nvgpu
+Group=nvgpu
+Environment=NVGPU_CLUSTER_CFG=/etc/nvgpu.conf
+Environment=FLASK_APP=nvgpu.webapp
+Environment=FLASK_HOST=0.0.0.0
+Environment=FLASK_PORT=1080
+ExecStart=/bin/bash -c "/usr/local/bin/flask run --host $FLASK_HOST --port $FLASK_PORT"
+Restart=always
+
+[Install]
+WantedBy=multi-user.target
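
For checking the installed agent without a browser, the status page can also be probed from the command line. Below is a minimal sketch using only the Python standard library; it assumes the agent listens on `http://localhost:1080` as configured in `nvgpu-agent.service`, so adjust the URL for a remote node or a different port. It is not part of nvgpu itself.

```python
#!/usr/bin/env python3
"""Probe the nvgpu agent's status page (a sketch, not part of nvgpu)."""
import sys
from urllib.error import URLError
from urllib.request import urlopen

# Assumption: the agent listens on localhost:1080 (FLASK_HOST/FLASK_PORT above).
URL = "http://localhost:1080"

try:
    with urlopen(URL, timeout=5) as response:
        # Any successful HTTP response means the Flask app is up and serving.
        print(f"{URL} -> HTTP {response.status}")
        sys.exit(0 if response.status == 200 else 1)
except URLError as err:
    print(f"{URL} check failed: {err}", file=sys.stderr)
    sys.exit(1)
```

The same probe can be pointed at each URL listed in `AGENTS` to verify all nodes once the service is enabled everywhere.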