Proposal: teleprobe server split into api and worker #28

@lulf

I would like to propose a set of changes to the teleprobe architecture that, if accepted, should allow it to scale for running multiple jobs:

  • Split teleprobe-server into two parts: teleprobe-api (a single instance) and teleprobe-worker (many instances)
    • The teleprobe-api accepts requests to run a job
      • A job includes a list of binaries, each with associated tags that identify which targets the binary should run on.
      • Maintains an in-memory queue of jobs and schedules them across workers.
      • Is public facing and authenticates requests to run jobs
    • The teleprobe-worker runs a binary and reports result and logs back to teleprobe-api.
      • A worker is configured with a list of targets. Each target contains the same information as today, plus a set of tags/labels.
      • At startup, each worker announces its identity to teleprobe-api, along with the list of targets it supports and their tags/labels.
      • Workers poll the teleprobe-api for binaries to run (long-polling with a timeout) and run those binaries. A worker can run multiple binaries in parallel, and the api knows whether a worker is busy.
      • Workers report logs/results back to the teleprobe-api
      • Workers are not public facing and are assumed to reach the teleprobe-api over an internal network
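
To make the scheduling idea above more concrete, here is a minimal sketch of the api-side queue in Rust, using only the standard library. All type and function names (`BinaryRequest`, `Target`, `Queue`, `tags`) are hypothetical and not taken from the current teleprobe code. It also assumes one possible tag semantics: a binary may run on a target only if the target carries every tag the binary asks for. Long-polling, authentication, and tracking of busy workers are left out.

```rust
use std::collections::{BTreeSet, VecDeque};

/// One binary of a job, plus the tags a target must carry to run it.
/// (Hypothetical type, for illustration only.)
#[derive(Debug, Clone, PartialEq)]
pub struct BinaryRequest {
    pub name: String,
    pub required_tags: BTreeSet<String>,
}

/// A target as a worker would announce it at startup.
#[derive(Debug, Clone)]
pub struct Target {
    pub name: String,
    pub tags: BTreeSet<String>,
}

/// In-memory queue held by the api process.
#[derive(Default)]
pub struct Queue {
    pending: VecDeque<BinaryRequest>,
}

impl Queue {
    /// Accept a job: every binary in it is queued individually.
    pub fn submit(&mut self, job: Vec<BinaryRequest>) {
        self.pending.extend(job);
    }

    /// Called when a worker polls. Hands out the oldest pending binary
    /// whose required tags are all present on one of the worker's targets,
    /// together with the name of the matching target.
    pub fn poll(&mut self, targets: &[Target]) -> Option<(BinaryRequest, String)> {
        for i in 0..self.pending.len() {
            let matching = targets
                .iter()
                .find(|t| self.pending[i].required_tags.is_subset(&t.tags));
            if let Some(t) = matching {
                let target_name = t.name.clone();
                let binary = self.pending.remove(i)?;
                return Some((binary, target_name));
            }
        }
        None
    }

    /// Number of binaries still waiting for a worker.
    pub fn pending_count(&self) -> usize {
        self.pending.len()
    }
}

/// Small helper to build a tag set from string literals.
pub fn tags(items: &[&str]) -> BTreeSet<String> {
    items.iter().map(|s| s.to_string()).collect()
}
```

With this shape, a worker whose only target is tagged `stm32` and `usb` would receive a queued binary requiring `stm32`, while a binary requiring `nrf52840` would stay in the queue until a suitable worker polls.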

A further improvement could be to split teleprobe-api itself into an api part and a scheduler part, allowing job information to persist across restarts, multiple API instances to run for failover, and so on. That would require introducing persistence and some form of coordination, so I consider it a future step and a natural evolution of the above should the need arise.

I'm happy to fork teleprobe for this capability instead, but the use cases seem to overlap a lot.
