Architecture

Fábio Luciano edited this page Jun 11, 2026 · 2 revisions

The relay is a single stateless-ish Go service: an HTTP receiver that decodes Tekton CloudEvents, pushes them through a processing chain, and fans them out to action handlers built from your configuration.

System view

flowchart TB
    subgraph cluster [Kubernetes cluster]
        TKC[Tekton Pipelines controller]
        subgraph relay [tekton-events-relay]
            HTTP[HTTP receiver<br/>:8080]
            CHAIN[Processing chain]
            REG[Handler registry]
            STORE[(State store<br/>memory / Valkey / Olric)]
            DLQ[(Dead letter queue<br/>JSONL)]
        end
        VK[(Valkey<br/>optional)]
    end
    TKC -- CloudEvents over HTTP --> HTTP
    HTTP --> CHAIN --> REG
    CHAIN <--> STORE
    STORE -. backend=valkey .-> VK
    HTTP -- permanent failures --> DLQ
    REG --> GH[GitHub API]
    REG --> GL[GitLab API]
    REG --> OTH[Gitea · Bitbucket · Azure · SourceHut]
    REG --> NOT[Slack · Teams · Discord · PagerDuty<br/>Datadog · Grafana · Sentry · Webhooks]

The processing chain

Every event flows through a fixed Chain of Responsibility. Each link can stop the event (filtered/duplicate) or pass it on; each link is instrumented with Prometheus timers.
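The stop-or-pass contract can be sketched in a few lines of Go. This is a minimal illustration, not the relay's actual code: the `Event`, `Link`, `runChain`, and `demoChain` names are hypothetical, only two of the five links are modeled, and the Prometheus timing is left out.

```go
package main

import "fmt"

// Event is a trimmed-down, hypothetical stand-in for the relay's event model.
type Event struct {
	SHA          string
	ResourceType string
}

// Link is one step of the chain. Run returns false to stop the event.
type Link struct {
	Name string
	Run  func(Event) bool
}

// runChain passes ev through links in order and returns the name of the
// link that stopped it, or "" if every link passed it on.
func runChain(links []Link, ev Event) string {
	for _, l := range links {
		if !l.Run(ev) {
			return l.Name
		}
	}
	return ""
}

// demoChain builds a two-link chain: a validator and a resource-type filter.
func demoChain() []Link {
	allowed := map[string]bool{"pipelinerun": true}
	return []Link{
		{Name: "validator", Run: func(ev Event) bool { return ev.SHA != "" }},
		{Name: "filter", Run: func(ev Event) bool { return allowed[ev.ResourceType] }},
	}
}

func main() {
	chain := demoChain()
	fmt.Printf("%q\n", runChain(chain, Event{SHA: "abc123", ResourceType: "pipelinerun"}))
	fmt.Printf("%q\n", runChain(chain, Event{ResourceType: "pipelinerun"}))
	fmt.Printf("%q\n", runChain(chain, Event{SHA: "abc123", ResourceType: "taskrun"}))
}
```

Because the order is fixed, cheap checks run first and an event that fails validation never reaches the later links.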

flowchart LR
    IN([CloudEvent]) --> V[Validator<br/>required fields]
    V --> F[Filter<br/>resource-type allowlist]
    F --> D[Deduper<br/>CloudEvent ID, via store]
    D --> E[Enricher<br/>dashboard TargetURL]
    E --> DI[Dispatcher<br/>concurrent fan-out]
    DI --> H1[handler 1]
    DI --> H2[handler 2]
    DI --> HN[handler N]

  • Validator: drops malformed events (missing provider/SHA/run name) before they reach providers.
  • Filter: resource-type allowlist, controlled by filter.allow_taskrun / allow_pipelinerun / allow_customrun / allow_eventlistener.
  • Deduper: first-seen check on the CloudEvent ID against the state store; duplicates from Tekton retransmissions are dropped here.
  • Enricher: fills TargetURL with a Tekton Dashboard link when dashboard_url is set.
  • Dispatcher: runs all matching handlers concurrently (bounded by max_concurrency, with a per-handler handler_timeout) and records per-handler status for /readyz.
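
The dispatcher's bounded fan-out can be approximated with a buffered channel used as a semaphore and one context timeout per handler. The sketch below is an assumption about the shape of that logic, not the relay's implementation: `Handler`, `dispatch`, and `demo` are hypothetical names, and the two parameters only mirror the roles of max_concurrency and handler_timeout.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"sync"
	"time"
)

// Handler is a hypothetical action handler signature.
type Handler func(ctx context.Context) error

// dispatch runs every handler concurrently, with at most maxConcurrency
// in flight, and gives each handler its own timeout. The result has one
// error slot per handler, in input order.
func dispatch(ctx context.Context, handlers []Handler, maxConcurrency int, timeout time.Duration) []error {
	if maxConcurrency < 1 {
		maxConcurrency = 1 // a zero-capacity semaphore would block forever
	}
	errs := make([]error, len(handlers))
	sem := make(chan struct{}, maxConcurrency)
	var wg sync.WaitGroup
	for i, h := range handlers {
		wg.Add(1)
		go func(i int, h Handler) {
			defer wg.Done()
			sem <- struct{}{}        // acquire a slot
			defer func() { <-sem }() // release it
			hctx, cancel := context.WithTimeout(ctx, timeout)
			defer cancel()
			errs[i] = h(hctx)
		}(i, h)
	}
	wg.Wait()
	return errs
}

// demo dispatches one fast and one slow handler and reports the outcome.
func demo() (fastOK, slowTimedOut bool) {
	fast := func(ctx context.Context) error { return nil }
	slow := func(ctx context.Context) error {
		select {
		case <-time.After(time.Second):
			return nil
		case <-ctx.Done():
			return ctx.Err()
		}
	}
	errs := dispatch(context.Background(), []Handler{fast, slow}, 2, 20*time.Millisecond)
	return errs[0] == nil, errors.Is(errs[1], context.DeadlineExceeded)
}

func main() {
	fastOK, slowTimedOut := demo()
	fmt.Println("fast handler ok:", fastOK)
	fmt.Println("slow handler timed out:", slowTimedOut)
}
```

Keeping one error slot per handler is what makes per-handler status reporting possible: a slow provider times out on its own without failing the handlers that succeeded.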

Each handler is additionally wrapped, at build time, with its action's CEL when guard, task/pipeline name filters, and (for commit_status with context_per_task) the per-task context rewriter.
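
This wrapping is a decorator pattern: each wrapper takes a handler and returns a handler. The sketch below shows the idea under loud assumptions. The `Guard` type is a plain Go predicate standing in for a compiled CEL expression, and `withGuard`, `withPipelineFilter`, `Event`, and `Handler` are hypothetical names rather than the relay's API.

```go
package main

import "fmt"

// Event and Handler are hypothetical, simplified types.
type Event struct {
	Pipeline string
	Status   string
}

type Handler func(ev Event) error

// Guard stands in for a compiled CEL `when` expression (assumption:
// the real guard evaluates CEL, this one is an ordinary Go function).
type Guard func(ev Event) bool

// withGuard calls next only when guard accepts the event.
func withGuard(guard Guard, next Handler) Handler {
	return func(ev Event) error {
		if !guard(ev) {
			return nil // skipped, which is not an error
		}
		return next(ev)
	}
}

// withPipelineFilter calls next only for the listed pipeline names.
func withPipelineFilter(names []string, next Handler) Handler {
	allowed := make(map[string]struct{}, len(names))
	for _, n := range names {
		allowed[n] = struct{}{}
	}
	return func(ev Event) error {
		if _, ok := allowed[ev.Pipeline]; !ok {
			return nil
		}
		return next(ev)
	}
}

// demo wraps a recording handler and returns the events that reached it.
func demo() []string {
	var calls []string
	base := func(ev Event) error {
		calls = append(calls, ev.Pipeline+"/"+ev.Status)
		return nil
	}
	onlyFailed := func(ev Event) bool { return ev.Status == "failed" }
	// Guard outermost, then the name filter, then the action handler.
	h := withGuard(onlyFailed, withPipelineFilter([]string{"build"}, base))

	_ = h(Event{Pipeline: "build", Status: "failed"})
	_ = h(Event{Pipeline: "build", Status: "successful"})
	_ = h(Event{Pipeline: "deploy", Status: "failed"})
	return calls
}

func main() {
	fmt.Println(demo())
}
```

Doing the wrapping once at build time means the dispatcher sees plain handlers and needs no knowledge of guards or filters.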

Life of an event

sequenceDiagram
    participant T as Tekton controller
    participant R as Relay (HTTP)
    participant C as Chain
    participant S as State store
    participant G as GitHub API

    T->>R: POST / (CloudEvent: pipelinerun.successful)
    R->>R: auth (HMAC) · rate limit · body limit
    R->>C: decoded envelope
    C->>S: FirstSeen(event-id)?
    S-->>C: true (new)
    C->>G: create commit status / upsert PR comment
    alt provider returns 429/5xx
        C->>G: retry with backoff + jitter (honors Retry-After)
    end
    C-->>R: ok
    R-->>T: 200 OK
    Note over R,T: Retryable overload → 503,<br/>Tekton retransmits later.<br/>Permanent failure → 200 + event preserved in DLQ.

The HTTP status code is the back-pressure protocol. 503 means "retry me" (queue or backend overload). 200 always acknowledges the event, including permanent failures, which go to the dead letter queue instead of being retried forever by Tekton.
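
A rough sketch of that mapping with net/http follows. It assumes a sentinel error marks retryable overload; `errOverloaded`, `receiver`, and `demo` are hypothetical names, and authentication, rate limiting, and body limits are omitted.

```go
package main

import (
	"errors"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// errOverloaded is a hypothetical sentinel for retryable overload.
var errOverloaded = errors.New("overloaded")

// receiver maps the outcome of process to the status code sent to Tekton.
func receiver(process func(*http.Request) error, deadLetter func(error)) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		err := process(r)
		switch {
		case err == nil:
			w.WriteHeader(http.StatusOK)
		case errors.Is(err, errOverloaded):
			// Retryable: ask the sender to retransmit later.
			w.WriteHeader(http.StatusServiceUnavailable)
		default:
			// Permanent: preserve the event, then acknowledge it so
			// the sender does not retry it forever.
			deadLetter(err)
			w.WriteHeader(http.StatusOK)
		}
	})
}

// demo runs three outcomes through the receiver: success, overload,
// and a permanent failure.
func demo() (codes [3]int, deadLettered [3]bool) {
	outcomes := []error{nil, errOverloaded, errors.New("provider rejected the payload")}
	for i, outcome := range outcomes {
		i, outcome := i, outcome
		h := receiver(
			func(*http.Request) error { return outcome },
			func(error) { deadLettered[i] = true },
		)
		rec := httptest.NewRecorder()
		h.ServeHTTP(rec, httptest.NewRequest(http.MethodPost, "/", nil))
		codes[i] = rec.Code
	}
	return codes, deadLettered
}

func main() {
	codes, deadLettered := demo()
	fmt.Println(codes, deadLettered)
}
```

Only the overload case returns a non-2xx status, so the sender's retry behavior is driven entirely by whether a retry could help.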

Configuration → handlers

flowchart TB
    CFG[config.yaml] --> FACT[Factories<br/>per provider/notifier]
    SEC[/mounted Secrets/] --> FACT
    FACT --> W1[CEL when guard]
    W1 --> W2[task/pipeline filter]
    W2 --> H[Action handler]
    H --> REG[Registry]
    RELOAD[SIGHUP / file watch] -. rebuild & atomic swap .-> FACT

On hot reload, factories rebuild the whole registry and chain from the new config and swap them atomically; in-flight events finish on the old one. The dedupe store is not rebuilt, so no duplicates slip through a reload.
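
One common way to get this behavior in Go is an atomic pointer to an immutable snapshot. The sketch below assumes Go 1.19 or later for `atomic.Pointer`; the `pipeline` and `relay` types and their methods are hypothetical and say nothing about how the relay actually stores its registry.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// pipeline is a hypothetical bundle of everything built from one config.
// It is never mutated after construction.
type pipeline struct {
	version  string
	handlers []string
}

// relay holds the currently active pipeline behind an atomic pointer.
type relay struct {
	current atomic.Pointer[pipeline]
}

// reload builds a fresh pipeline and publishes it in one atomic store.
func (r *relay) reload(version string, handlers []string) {
	r.current.Store(&pipeline{version: version, handlers: handlers})
}

// snapshot returns the pipeline an event should use from start to finish.
func (r *relay) snapshot() *pipeline {
	return r.current.Load()
}

// demo captures a snapshot, reloads, and compares what each side sees.
func demo() (inFlight, next string) {
	r := &relay{}
	r.reload("v1", []string{"github"})

	p := r.snapshot() // an in-flight event holds on to v1
	r.reload("v2", []string{"github", "slack"})

	return p.version, r.snapshot().version
}

func main() {
	inFlight, next := demo()
	fmt.Println("in-flight event uses:", inFlight)
	fmt.Println("next event uses:", next)
}
```

Each event loads the pointer once and keeps that snapshot, so it never observes a half-rebuilt registry, and the old pipeline is garbage collected once the last in-flight event releases it.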

High availability topology

The deduper and the accumulator hold state. With the default memory backend that state is per-pod — correct only with one replica. Shared backends make replicas equivalent:

flowchart LR
    subgraph valkey [backend: valkey]
        P1[relay pod] --> V[(Valkey)]
        P2[relay pod] --> V
    end
    subgraph olric [backend: olric]
        Q1[relay pod] <-. gossip .-> Q2[relay pod]
        Q1 <-. gossip .-> Q3[relay pod]
        Q2 <-. gossip .-> Q3
    end

See Operations → State backends for trade-offs.
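
To see why the memory backend is only correct with one replica, consider a minimal in-process dedupe store. The `FirstSeen` name comes from the sequence diagram above, but its signature here is an assumption, and `memoryStore` is a hypothetical type that never expires IDs.

```go
package main

import (
	"fmt"
	"sync"
)

// memoryStore is a hypothetical per-process dedupe store.
type memoryStore struct {
	mu   sync.Mutex
	seen map[string]struct{}
}

func newMemoryStore() *memoryStore {
	return &memoryStore{seen: make(map[string]struct{})}
}

// FirstSeen records id and reports whether this is its first appearance.
// The check and the insert happen under one lock, so two concurrent
// deliveries of the same ID cannot both be treated as new.
func (s *memoryStore) FirstSeen(id string) bool {
	s.mu.Lock()
	defer s.mu.Unlock()
	if _, ok := s.seen[id]; ok {
		return false
	}
	s.seen[id] = struct{}{}
	return true
}

// demo delivers the same event ID twice to pod A and once to pod B.
func demo() (podARetry, podBFirst bool) {
	podA, podB := newMemoryStore(), newMemoryStore()
	podA.FirstSeen("evt-1")
	return podA.FirstSeen("evt-1"), podB.FirstSeen("evt-1")
}

func main() {
	podARetry, podBFirst := demo()
	fmt.Println("pod A treats the retransmission as new:", podARetry)
	fmt.Println("pod B treats the same ID as new:", podBFirst)
}
```

Pod A rejects the retransmission, but pod B has its own map and accepts the same ID, which is the duplicate a shared backend prevents.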

Source layout

| Package | Responsibility |
|---|---|
| cmd/receiver | Wiring: config load, store/DLQ construction, chain build, hot reload, lifecycle |
| internal/event, internal/event/tekton | CloudEvent decoding (TaskRun, PipelineRun, CustomRun, EventListener) and annotation extraction |
| internal/domain | The neutral Event model all handlers consume |
| internal/pipeline | Chain links: validator, filter, deduper, enricher, dispatcher, status tracker |
| internal/store | Pluggable state backends (memory / Valkey / Olric) |
| internal/dlq | Dead letter queue (JSONL) |
| internal/notifier/scm/* | One package per SCM provider |
| internal/notifier/* | Chat/alerting notifiers |
| internal/factory | Config → handler construction, CEL/filter wrapping |
| internal/cel | CEL compilation, custom macros, event field exposure |
| internal/http, internal/httpx | Receiver server/middleware · outbound HTTP client with retry policy |
| internal/metrics, internal/tracing | Prometheus collectors · OpenTelemetry |