Status | |
---|---|
Stability | development: traces, alpha: metrics |
Distributions | contrib |
Code Owners | @adrielp, @andrzej-stencel, @crobert-1, @TylerHelmuth |
The GitHub receiver receives data from GitHub via two methods:
- Scrapes version control system metrics from GitHub repositories and organizations using the GraphQL and REST APIs.
- Receives GitHub Actions events by serving a webhook endpoint, converting those events into traces.
The current default set of metrics can be found in documentation.md.
These metrics can be used as leading indicators (capabilities) for the DORA metrics, helping provide insight into modern-day engineering practices.
The collection interval is common to all scrapers and is set to 30 seconds by default.
Note: Generally speaking, if the vendor allows for anonymous API calls, then you won't have to configure any authentication, but you may only see public repositories and organizations. You may also run into significantly more rate limiting.
github:
  collection_interval: <duration> # default = 30s, recommended 300s
  scrapers:
    scraper/config-1:
    scraper/config-2:
    ...
A more complete example using the GitHub scrapers with authentication is as follows:
extensions:
  bearertokenauth/github:
    token: ${env:GH_PAT}

receivers:
  github:
    initial_delay: 1s
    collection_interval: 60s
    scrapers:
      scraper:
        metrics:
          vcs.contributor.count:
            enabled: true
        github_org: myfancyorg
        search_query: "org:myfancyorg topic:o11yalltheway" # Recommended optional query override, defaults to "{org,user}:<github_org>"
        endpoint: "https://selfmanagedenterpriseserver.com"
        auth:
          authenticator: bearertokenauth/github

service:
  extensions: [bearertokenauth/github]
  pipelines:
    metrics:
      receivers: [..., github]
      processors: []
      exporters: [...]
Important:
- The GitHub scraper does not emit metrics for branches that have not had changes since creation from the default branch (trunk).
- Due to GitHub API limitations, it is possible for the branch time metric to change when rebases occur, recreating the commits with new timestamps.
For additional context on GitHub scraper limitations and inner workings, please see the Scraping README.
Workflow tracing support is accomplished through the processing of GitHub
Actions webhook events for workflows and jobs. The `workflow_job` and
`workflow_run` event payloads are then constructed into trace telemetry.
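As a rough illustration of how a job event maps onto spans, the following Python sketch turns a trimmed `workflow_job` payload into one parent record for the job and one child record per step. It is not the receiver's implementation: `job_to_spans`, `parse_timestamp`, and `SAMPLE_EVENT` are hypothetical names, the payload is cut down to a handful of fields, and the dictionaries only stand in for real spans.

```python
from datetime import datetime
from typing import Any, Dict, List

# Hypothetical, heavily trimmed workflow_job payload used only for illustration.
SAMPLE_EVENT: Dict[str, Any] = {
    "action": "completed",
    "workflow_job": {
        "name": "build",
        "started_at": "2024-01-01T12:00:00Z",
        "completed_at": "2024-01-01T12:02:30Z",
        "conclusion": "failure",
        "steps": [
            {"name": "checkout", "conclusion": "success",
             "started_at": "2024-01-01T12:00:05Z",
             "completed_at": "2024-01-01T12:00:15Z"},
            {"name": "test", "conclusion": "failure",
             "started_at": "2024-01-01T12:00:15Z",
             "completed_at": "2024-01-01T12:02:25Z"},
        ],
    },
}


def parse_timestamp(value: str) -> datetime:
    """Parse an ISO 8601 timestamp, accepting a trailing "Z" for UTC."""
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def job_to_spans(event: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Return a parent record for the job followed by one record per step."""
    job = event["workflow_job"]
    spans = [{
        "name": job["name"],
        "parent": None,
        "start": parse_timestamp(job["started_at"]),
        "end": parse_timestamp(job["completed_at"]),
        "ok": job["conclusion"] == "success",
    }]
    for step in job.get("steps", []):
        spans.append({
            "name": step["name"],
            "parent": job["name"],
            "start": parse_timestamp(step["started_at"]),
            "end": parse_timestamp(step["completed_at"]),
            "ok": step["conclusion"] == "success",
        })
    return spans


if __name__ == "__main__":
    for span in job_to_spans(SAMPLE_EVENT):
        seconds = (span["end"] - span["start"]).total_seconds()
        print(span["name"], span["parent"], seconds, span["ok"])
```

Each record carries a start time, an end time, and a success flag, which is the information needed to observe execution times and failure rates per job and per step.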
Each GitHub Actions workflow or job, along with its steps, is converted
into trace spans, allowing the observation of workflow execution times and
success and failure rates. Each trace ID and span ID is deterministic. This
enables the underlying actions to emit telemetry from any command running in
any step, for example by using tools like the run-with-telemetry action and
otel-cli. The key is generating IDs in the same way that this GitHub receiver
does. The trace_event_handling.go file contains the `new*ID` functions that
generate deterministic IDs.
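The general idea of a deterministic ID can be sketched in Python as follows: hash values that are known both to the receiver and to the running workflow, then truncate the digest to the size of a trace ID (16 bytes) or span ID (8 bytes). The function names and the exact input string layout below are assumptions made for illustration only. The authoritative format is defined by the `new*ID` functions in trace_event_handling.go, and a tool must match that format exactly for its spans to join the receiver's traces.

```python
import hashlib


def derive_trace_id(run_id: int, run_attempt: int) -> str:
    """Return a 32-character hex string (16 bytes) derived from the run.

    The "<run_id><run_attempt>t" input layout is hypothetical; check
    trace_event_handling.go for the layout the receiver actually uses.
    """
    digest = hashlib.sha256(f"{run_id}{run_attempt}t".encode()).hexdigest()
    return digest[:32]


def derive_span_id(run_id: int, run_attempt: int, name: str) -> str:
    """Return a 16-character hex string (8 bytes), same hypothetical scheme."""
    digest = hashlib.sha256(f"{run_id}{run_attempt}{name}s".encode()).hexdigest()
    return digest[:16]


if __name__ == "__main__":
    # The same inputs always produce the same IDs, on any machine.
    print(derive_trace_id(123456789, 1))
    print(derive_span_id(123456789, 1, "build"))
```

Because the IDs depend only on the inputs, a step inside the workflow can compute the same trace ID the receiver will later assign, without any coordination between the two.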
IMPORTANT - Ensure your WebHook endpoint is secured with a secret and a Web Application Firewall (WAF) or other security measure.
The WebHook configuration exposes the following settings:
- `endpoint`: (default = `localhost:8080`) - The address and port to bind the WebHook to.
- `path`: (default = `/events`) - The path for Action events to be sent to.
- `health_path`: (default = `/health`) - The path for health checks.
- `secret`: (optional) - The secret used to validate the payload.
- `required_headers`: (optional) - The required header key and value for incoming requests.
- `service_name`: (optional) - The service name for the traces. See the Configuring Service Name section for more information.
The WebHook configuration block also accepts all the confighttp settings.
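For background on what validating a payload with a secret involves: GitHub signs each webhook delivery with an HMAC-SHA256 of the raw request body, keyed by the shared secret, and sends the result in the `X-Hub-Signature-256` header as `sha256=<hex digest>`. The Python sketch below reproduces that check so you can test a delivery by hand. It is not the receiver's code, and `sign_payload` and `is_valid_signature` are hypothetical helper names.

```python
import hashlib
import hmac


def sign_payload(secret: str, body: bytes) -> str:
    """Compute the header value GitHub would send for this body and secret."""
    mac = hmac.new(secret.encode(), body, hashlib.sha256)
    return "sha256=" + mac.hexdigest()


def is_valid_signature(secret: str, body: bytes, header_value: str) -> bool:
    """Compare the expected signature with the received one in constant time."""
    expected = sign_payload(secret, body)
    return hmac.compare_digest(expected.encode(), header_value.encode())


if __name__ == "__main__":
    body = b'{"action": "completed"}'
    header = sign_payload("my-secret", body)
    print(is_valid_signature("my-secret", body, header))
    print(is_valid_signature("other-secret", body, header))
```

The signature covers the raw bytes of the body, so any proxy or WAF in front of the endpoint must forward the body unmodified for validation to succeed.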
An example configuration is as follows:
receivers:
  github:
    webhook:
      endpoint: localhost:19418
      path: /events
      health_path: /health
      secret: ${env:SECRET_STRING_VAR}
      required_headers:
        WAF-Header: "value"
For tracing, all configuration is set under the `webhook` key. The full set
of exposed configuration values can be found in `config.go`.
The `service_name` option in the WebHook configuration can be used to set a
pre-defined `service.name` resource attribute for all traces emitted by the
receiver. This takes priority over the internal generation of the
`service.name`. With this approach, you would need to create one GitHub
receiver per GitHub App, with each App configured for the set of repositories
that match your `service.name`.
However, a more efficient approach would be to leverage the default generation
of `service.name` by configuring Custom Properties in each GitHub repository.
To do that, add a `service_name` key with the desired value in each
repository, and all events sent to the GitHub receiver will be associated with
that `service.name`. If no such property is set, the `service_name` will be
derived from the repository name.
The precedence for setting the `service.name` resource attribute is as follows:
1. `service_name` configuration in the WebHook configuration.
2. `service_name` key in the repository's Custom Properties per repository.
3. `service_name` derived from the repository name.
4. `service.name` set to `unknown_service` per the semantic conventions as a fallback.
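The precedence order above amounts to a simple fallback chain, sketched here in Python. This is an illustration of the ordering only, not the receiver's code: `resolve_service_name` and its parameters are hypothetical, and any normalization the receiver may apply when deriving a name from the repository is left out, so the repository name is used as is.

```python
from typing import Mapping, Optional


def resolve_service_name(
    configured: Optional[str],
    custom_properties: Mapping[str, str],
    repository_name: Optional[str],
) -> str:
    """Pick the service name using the documented precedence order."""
    # 1. Explicit service_name in the WebHook configuration wins.
    if configured:
        return configured
    # 2. Otherwise use the repository's service_name Custom Property.
    from_property = custom_properties.get("service_name")
    if from_property:
        return from_property
    # 3. Otherwise fall back to the repository name (used verbatim here).
    if repository_name:
        return repository_name
    # 4. Last resort, per the semantic conventions.
    return "unknown_service"


if __name__ == "__main__":
    print(resolve_service_name(None, {"service_name": "checkout"}, "shop-checkout"))
    print(resolve_service_name(None, {}, "shop-checkout"))
```

Leaving the WebHook `service_name` unset lets a single receiver serve many repositories, since the name is then resolved per event from the second or third source.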
To configure a GitHub App, you will need to create a new GitHub App within your
organization. Refer to the general GitHub App documentation for how to
create a GitHub App. During the subscription phase, subscribe to `workflow_run`
and `workflow_job` events.