procwatch

A process supervisor written in Python. Define a set of services in a YAML config, and procwatch will start them, watch them, restart them when they crash, and stream their logs to disk. There's a live terminal dashboard and a small HTTP API for controlling things at runtime.

Linux only: procwatch reads /proc/{pid}/stat for metrics and uses POSIX signals for shutdown.

install

Requires Python 3.11+ and Linux.

pip install -e .

run

procwatch examples/sample.yaml

From another terminal:

curl localhost:8080/status

Press Ctrl+C to stop. procwatch sends SIGTERM to all child processes, waits up to 5 seconds, then sends SIGKILL to anything still running.

config

log_dir: /tmp/procwatch/logs
api_port: 8080

services:
  - name: web
    command: "python3 server.py"
    restart: always
    max_restarts: 5
    backoff_base: 1.0
    env:
      PORT: "9000"
    cwd: /opt/myapp

  - name: worker
    command: "python3 worker.py"
    restart: on-failure
    max_restarts: 3

  - name: migrate
    command: "python3 manage.py migrate"
    restart: never
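
To make the shape of a service entry concrete, here is a minimal validation sketch for one already-parsed entry (a plain dict, as a YAML loader would produce). The names ServiceSpec and parse_service are hypothetical and not taken from procwatch. It assumes name, command and restart are required and the remaining keys are optional, which is only inferred from the sample above.

```python
from dataclasses import dataclass, field
from typing import Optional

RESTART_POLICIES = {"always", "on-failure", "never"}


@dataclass
class ServiceSpec:
    # Hypothetical container; field names mirror the keys in the sample config.
    name: str
    command: str
    restart: str
    max_restarts: Optional[int] = None    # absent in the "migrate" example
    backoff_base: Optional[float] = None  # absent in the "worker" example
    env: dict = field(default_factory=dict)
    cwd: Optional[str] = None


def parse_service(raw: dict) -> ServiceSpec:
    """Validate one entry of the `services` list and return a ServiceSpec."""
    # Assumption: these three keys are mandatory (every sample entry has them).
    for key in ("name", "command", "restart"):
        if key not in raw:
            raise ValueError(f"service is missing required key {key!r}")
    if raw["restart"] not in RESTART_POLICIES:
        raise ValueError(f"unknown restart policy {raw['restart']!r}")
    return ServiceSpec(
        name=raw["name"],
        command=raw["command"],
        restart=raw["restart"],
        max_restarts=raw.get("max_restarts"),
        backoff_base=raw.get("backoff_base"),
        env=dict(raw.get("env", {})),
        cwd=raw.get("cwd"),
    )
```

How procwatch itself treats missing or unknown keys may differ from this sketch.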

restart options:

always — restart regardless of exit code
on-failure — only restart if exit code != 0
never — run once

The restart delay starts at backoff_base seconds and doubles after each restart (1s, 2s, 4s, ...), capped at 30s.
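
As an illustration of the doubling schedule, a capped exponential backoff can be written in a few lines. The function name backoff_delay and its parameters are hypothetical. It assumes the first restart waits exactly the base delay, which matches the 1s, 2s, 4s sequence above.

```python
def backoff_delay(restart_count: int, base: float = 1.0, cap: float = 30.0) -> float:
    """Delay in seconds before restart number `restart_count` (0-based).

    Doubles on every restart and never exceeds `cap`.
    """
    return min(base * (2 ** restart_count), cap)


# With base=1.0 the first six delays are 1, 2, 4, 8, 16 and then the 30s cap.
delays = [backoff_delay(n) for n in range(6)]
```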

logs

Each service gets its own log files:

/tmp/procwatch/logs/
  web.stdout.log
  web.stderr.log

Internal procwatch logs go to /tmp/procwatch/procwatch.log.

API

GET  /routes                        list all endpoints
GET  /status                        all services + state, pid, cpu, memory, uptime
POST /services/{name}/start         start a stopped service
POST /services/{name}/stop          stop a running service
POST /services/{name}/restart       restart a service

Example:

curl localhost:8080/status
curl -X POST localhost:8080/services/web/restart
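
The same control calls can be made from Python's standard library. This is a sketch that only builds the request. The helper name control_request is hypothetical, and the base URL assumes the api_port from the sample config. The format of the response body is not documented here, so the sketch does not parse it.

```python
import urllib.parse
import urllib.request

ACTIONS = {"start", "stop", "restart"}


def control_request(name: str, action: str,
                    base_url: str = "http://localhost:8080") -> urllib.request.Request:
    """Build a POST request for /services/{name}/{action}."""
    if action not in ACTIONS:
        raise ValueError(f"unknown action {action!r}")
    # Quote the service name so unusual characters cannot alter the path.
    path = f"/services/{urllib.parse.quote(name, safe='')}/{action}"
    return urllib.request.Request(base_url + path, method="POST")


req = control_request("web", "restart")
# To actually send it against a running procwatch instance:
#     with urllib.request.urlopen(req, timeout=5) as resp:
#         print(resp.status)
```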

/status response:

[
  {
    "name": "web",
    "state": "running",
    "pid": 12345,
    "restart_count": 1,
    "uptime_seconds": 42.3,
    "memory_kb": 18432
  }
]
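
A response like the one above is easy to consume from a script. The sketch below turns the JSON body into one summary line per service. The function name summarize_status is hypothetical, and it assumes every entry carries exactly the fields shown in the sample; entries for stopped services may look different.

```python
import json


def summarize_status(body: str) -> list:
    """Return one human-readable line per service in a /status response."""
    lines = []
    for svc in json.loads(body):
        lines.append(
            f"{svc['name']}: {svc['state']} pid={svc['pid']} "
            f"restarts={svc['restart_count']} "
            f"up={svc['uptime_seconds']:.0f}s mem={svc['memory_kb']}kB"
        )
    return lines


sample = """
[
  {
    "name": "web",
    "state": "running",
    "pid": 12345,
    "restart_count": 1,
    "uptime_seconds": 42.3,
    "memory_kb": 18432
  }
]
"""
summary = summarize_status(sample)
```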

States: pending, running, restarting, stopped, failed

failed means the service hit max_restarts and won't be retried.
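
The interaction between the restart policy and max_restarts can be sketched as a small decision function. The name next_state is hypothetical. Two assumptions are not confirmed by the text above: a service that exits and is not eligible for restart ends up in "stopped", and the limit is reached when restart_count equals max_restarts.

```python
from typing import Optional


def next_state(policy: str, exit_code: int, restart_count: int,
               max_restarts: Optional[int]) -> str:
    """Decide what happens to a service after its process exits."""
    wants_restart = policy == "always" or (
        policy == "on-failure" and exit_code != 0
    )
    if not wants_restart:
        return "stopped"  # assumption: clean, final exit maps to "stopped"
    if max_restarts is not None and restart_count >= max_restarts:
        return "failed"   # restart budget exhausted, no further retries
    return "restarting"
```

For example, a worker with restart: on-failure and max_restarts: 3 that crashes for the fourth time would land in failed under these assumptions.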

how it works

Each service runs in its own asyncio.Task. The supervisor sits in the same event loop and waits on a stop event that gets set when SIGTERM or SIGINT arrives.
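
The stop-event pattern described above might look roughly like this with asyncio's signal support. This is a sketch, not procwatch's actual code: run_until_signalled and signal_self are hypothetical names, and signal_self only stands in for an external kill or Ctrl+C so the example can be run on its own.

```python
import asyncio
import os
import signal


def signal_self(sig: signal.Signals = signal.SIGTERM) -> None:
    """Stand-in for an external `kill`: deliver `sig` to this process."""
    os.kill(os.getpid(), sig)


async def run_until_signalled(on_ready=None) -> str:
    """Block until SIGTERM or SIGINT arrives; return the signal's name."""
    loop = asyncio.get_running_loop()
    stop = asyncio.Event()
    received = []

    def request_stop(sig: signal.Signals) -> None:
        received.append(sig)
        stop.set()

    # add_signal_handler runs the callback inside the event loop (Unix only,
    # and only when the loop runs in the main thread).
    for sig in (signal.SIGTERM, signal.SIGINT):
        loop.add_signal_handler(sig, request_stop, sig)
    try:
        if on_ready is not None:
            on_ready()
        await stop.wait()  # service tasks would keep running meanwhile
    finally:
        for sig in (signal.SIGTERM, signal.SIGINT):
            loop.remove_signal_handler(sig)
    return received[0].name
```

Calling asyncio.run(run_until_signalled(signal_self)) returns as soon as the self-delivered SIGTERM has been processed by the loop.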

When a process exits, its worker checks the restart policy — if it should restart, it waits the backoff period and spawns again. stdout and stderr are drained concurrently with asyncio.gather alongside proc.wait(), so a process writing a lot of output doesn't block anything else.
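
A minimal sketch of the concurrent drain described above, using asyncio subprocess pipes. The names drain and run_once are hypothetical, and for brevity the lines are collected into lists; procwatch writes them to log files instead.

```python
import asyncio
import sys


async def drain(stream: asyncio.StreamReader, sink: list) -> None:
    """Read lines until EOF so the child never blocks on a full pipe."""
    while True:
        line = await stream.readline()
        if not line:  # empty bytes means EOF
            break
        sink.append(line)


async def run_once(argv: list):
    """Spawn argv, drain both pipes while waiting, return (code, out, err)."""
    proc = await asyncio.create_subprocess_exec(
        *argv,
        stdout=asyncio.subprocess.PIPE,
        stderr=asyncio.subprocess.PIPE,
    )
    out, err = [], []
    # All three awaitables progress together: neither pipe can fill up
    # while we are waiting on the other one or on process exit.
    await asyncio.gather(
        drain(proc.stdout, out),
        drain(proc.stderr, err),
        proc.wait(),
    )
    return proc.returncode, out, err


# Small child used for demonstration: one line on each stream, exit code 3.
DEMO_ARGV = [
    sys.executable, "-c",
    "import sys; print('a'); print('b', file=sys.stderr); sys.exit(3)",
]
```

Waiting on the process without reading its pipes is the classic way to deadlock a supervisor, which is why the reads and the wait are gathered together.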

Shutdown sends SIGTERM to the whole process group (not just the top-level PID) so child-of-child processes get cleaned up too. If they don't exit within 5 seconds they get SIGKILL.
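
The group-wide shutdown can be illustrated with the standard library. This sketch uses the synchronous subprocess module for brevity rather than asyncio, and stop_group is a hypothetical name. It assumes the child was started in its own session (start_new_session=True), so that it leads its own process group and signalling the group does not hit the supervisor itself.

```python
import os
import signal
import subprocess


def stop_group(proc: subprocess.Popen, grace: float = 5.0) -> int:
    """SIGTERM the child's whole process group, SIGKILL it after `grace` s."""
    pgid = os.getpgid(proc.pid)
    os.killpg(pgid, signal.SIGTERM)
    try:
        proc.wait(timeout=grace)
    except subprocess.TimeoutExpired:
        # Still alive after the grace period: force-kill the whole group.
        os.killpg(pgid, signal.SIGKILL)
        proc.wait()
    return proc.returncode


# Usage: the child must lead its own group, otherwise killpg would also
# signal the calling process.
#     proc = subprocess.Popen(["sleep", "60"], start_new_session=True)
#     stop_group(proc)
```

Note that this sketch only waits for the group leader; descendants that ignore SIGTERM could outlive it unless the group is also checked or killed afterwards.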

Memory and CPU come from /proc/{pid}/stat — no psutil. CPU is calculated by diffing the cumulative tick counters between samples, which is the same approach top uses.
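
To show what diffing tick counters means in practice, here is a sketch that parses a /proc/{pid}/stat line and derives CPU and memory figures. The function names are hypothetical. Field positions follow the proc(5) man page (utime is field 14, stime is field 15, rss is field 24, counted from 1). Whether procwatch reports rss specifically as its memory figure is an assumption.

```python
import os


def parse_stat(line: str) -> dict:
    """Extract utime, stime (clock ticks) and rss (pages) from a stat line."""
    # Field 2 (comm) is wrapped in parentheses and may itself contain spaces
    # or ')' characters, so split only what follows the LAST ')'.
    rest = line[line.rindex(")") + 2:].split()
    # rest[0] is field 3 (state), so field n lives at rest[n - 3].
    return {
        "utime": int(rest[11]),
        "stime": int(rest[12]),
        "rss_pages": int(rest[21]),
    }


def cpu_percent(prev_ticks: int, cur_ticks: int, elapsed_s: float,
                clk_tck: int = 0) -> float:
    """CPU usage between two samples of utime + stime, as a percentage."""
    if not clk_tck:
        clk_tck = os.sysconf("SC_CLK_TCK")  # ticks per second
    return 100.0 * (cur_ticks - prev_ticks) / clk_tck / elapsed_s


def rss_kb(rss_pages: int, page_size: int = 0) -> int:
    """Convert resident pages to kB (1024-byte units)."""
    if not page_size:
        page_size = os.sysconf("SC_PAGE_SIZE")
    return rss_pages * page_size // 1024


# Sampling a live process would read the file twice, some seconds apart:
#     with open(f"/proc/{pid}/stat") as f:
#         sample = parse_stat(f.read())
```

A process that accumulates 100 ticks over 2 seconds on a system with 100 ticks per second has used one CPU-second in two wall-clock seconds, so it reports 50%.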
