Distributed Tracing feature #70

@Aki-7

Dependencies

Feature Request

Distributed tracing for Jenkins Remoting.

Purpose

Monitoring and troubleshooting Jenkins agents by tracing the remoting behavior.

Challenges

How to instrument remoting

  • Use EngineListener and ChannelListener

Attempt PR: #49
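To make the listener approach concrete, here is a minimal sketch of turning connection callbacks into spans. It is not the code from #49. `ConnectionEvents`, `RecordedSpan`, `SpanRecordingListener`, and the span names are all hypothetical; the interface is only loosely modeled on listener-style callbacks and is not the real remoting API. A real implementation would hand spans to an OpenTelemetry tracer instead of keeping them in a list.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.LongSupplier;

// Hypothetical callback interface (an assumption for this sketch, NOT the
// real hudson.remoting listener API).
interface ConnectionEvents {
    void onConnecting();
    void onConnected();
    void onDisconnected();
}

// A finished span: a name plus start and end timestamps in nanoseconds.
final class RecordedSpan {
    final String name;
    final long startNanos;
    final long endNanos;

    RecordedSpan(String name, long startNanos, long endNanos) {
        this.name = name;
        this.startNanos = startNanos;
        this.endNanos = endNanos;
    }

    long durationNanos() {
        return endNanos - startNanos;
    }
}

// Each callback closes the span in progress (if any) and may open the next one.
final class SpanRecordingListener implements ConnectionEvents {
    private final LongSupplier clock;
    private final List<RecordedSpan> finished = new ArrayList<>();
    private String openName; // name of the span in progress, or null
    private long openStart;

    SpanRecordingListener(LongSupplier clock) {
        this.clock = clock;
    }

    @Override
    public void onConnecting() {
        begin("agent.connect"); // hypothetical span name
    }

    @Override
    public void onConnected() {
        begin("agent.connected"); // hypothetical span name
    }

    @Override
    public void onDisconnected() {
        end();
    }

    private void begin(String name) {
        end(); // close the previous span before starting a new one
        openName = name;
        openStart = clock.getAsLong();
    }

    private void end() {
        if (openName == null) {
            return;
        }
        finished.add(new RecordedSpan(openName, openStart, clock.getAsLong()));
        openName = null;
    }

    // Returns a copy so callers cannot mutate the internal list.
    List<RecordedSpan> finishedSpans() {
        return new ArrayList<>(finished);
    }
}
```

In production the clock would be `System::nanoTime`; injecting it keeps the sketch testable. The sketch also shows the limitation noted in this issue: only events that the listener interface exposes can become spans.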

What we can trace is restricted

  • Modify the remoting module to instrument more

Attempt PR: jenkinsci/remoting#471

A completely different method might work well.

  • Sniffing packet payload?

How to collect spans when the connection is not established

The easiest way to use EngineListener is to send a listener from the controller and register it.
But then we cannot collect spans before the initial connection.
We also may not be able to collect spans between a connection closing and the next one being established; see #65.

  • Set up instrumentation when launching the agent.

Attempt PR: jenkinsci/remoting#471
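One way to picture instrumentation that starts at agent launch is a local buffer that holds spans while no connection exists and flushes them once one is established. The sketch below is an illustration only, not what jenkinsci/remoting#471 does. `BufferingSpanSink` and its methods are hypothetical names, spans are reduced to plain strings, and dropping the oldest entry when the buffer is full is an assumed policy.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.function.Consumer;

// Hypothetical helper: created when the agent process starts, before any
// connection to the controller or a telemetry backend exists.
final class BufferingSpanSink {
    private final int capacity;
    private final Deque<String> pending = new ArrayDeque<>();
    private Consumer<String> exporter; // null while disconnected

    BufferingSpanSink(int capacity) {
        if (capacity <= 0) {
            throw new IllegalArgumentException("capacity must be positive");
        }
        this.capacity = capacity;
    }

    // Exports immediately when connected; otherwise buffers the span.
    synchronized void record(String span) {
        if (exporter != null) {
            exporter.accept(span);
            return;
        }
        if (pending.size() == capacity) {
            pending.removeFirst(); // assumed policy: drop the oldest span
        }
        pending.addLast(span);
    }

    // Called once a connection is up: flush buffered spans in order.
    synchronized void onConnected(Consumer<String> newExporter) {
        exporter = newExporter;
        while (!pending.isEmpty()) {
            exporter.accept(pending.removeFirst());
        }
    }

    // Called when the connection is lost: go back to buffering.
    synchronized void onDisconnected() {
        exporter = null;
    }

    synchronized int pendingCount() {
        return pending.size();
    }
}
```

The bounded buffer keeps memory use predictable during long outages, at the cost of losing the oldest spans. Whether that trade-off is acceptable depends on which spans turn out to be useful for troubleshooting.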

How can we contribute to a better monitoring and troubleshooting experience?

The OpenTelemetry Plugin already traces the time spent allocating a node to a job, which includes the time to provision a new node if needed.

We are trying to create more detailed spans, but it is difficult to know which spans are helpful for monitoring and troubleshooting.

Here is the draft of the spans: https://docs.google.com/document/d/1gjRamLWz3NwenVifC5pYyBMmxsUjl9MjspZF0mRYeaI/edit#heading=h.6xn68iwvd7gz
