cherry pick otel update commit #6924

Draft: wants to merge 7 commits into base: main
52 changes: 52 additions & 0 deletions CHANGELOG.md
@@ -80,6 +80,58 @@ Main (unreleased)
- Resync defaults for `otelcol.processor.k8sattributes` with upstream. (@hainenber)

- Resync defaults for `otelcol.exporter.otlp` and `otelcol.exporter.otlphttp` with upstream. (@hainenber)

- Upgrade from OpenTelemetry Collector v0.96.0 to v0.99.0:
- `otelcol.processor.batch`: Prevent starting unnecessary goroutines.
https://github.com/open-telemetry/opentelemetry-collector/issues/9739
- `otelcol.exporter.otlp`: Check for a port in the config validation for the otlpexporter.
https://github.com/open-telemetry/opentelemetry-collector/issues/9505
- `otelcol.receiver.otlp`: Fix a bug where the OTLP receiver did not properly respond over HTTP
with a retryable error code when possible.
https://github.com/open-telemetry/opentelemetry-collector/pull/9357
- `otelcol.receiver.vcenter`: Fix the resource attribute model to more accurately support multi-cluster deployments.
https://github.com/open-telemetry/opentelemetry-collector-contrib/issues/30879
For more information on the impact, refer to:
https://github.com/open-telemetry/opentelemetry-collector-contrib/pull/31113
The main impact is that `vcenter.resource_pool.name`, `vcenter.resource_pool.inventory_path`,
and `vcenter.cluster.name` are reported with more accuracy on VM metrics.
- `otelcol.receiver.vcenter`: Remove the `vcenter.cluster.name` resource attribute from Host resources if the Host is standalone (no cluster).
https://github.com/open-telemetry/opentelemetry-collector-contrib/issues/32548
- `otelcol.receiver.vcenter`: Collect VMs and VM performance metrics more efficiently, using a single call for all VMs.
https://github.com/open-telemetry/opentelemetry-collector-contrib/issues/31837
- `otelcol.connector.servicegraph`: Add a new `database_name_attribute` config argument to allow users to
specify a custom attribute name for identifying the database name in span attributes.
https://github.com/open-telemetry/opentelemetry-collector-contrib/pull/30726
- `otelcol.connector.servicegraph`: Fix the 'failed to find dimensions for key' error caused by a race condition in metrics cleanup.
https://github.com/open-telemetry/opentelemetry-collector-contrib/issues/31701
- `otelcol.connector.spanmetrics`: Add `metrics_expiration` option to enable expiration of metrics if spans are not received within a certain time frame.
By default, the expiration is disabled (set to 0).
https://github.com/open-telemetry/opentelemetry-collector-contrib/issues/30559
- `otelcol.connector.spanmetrics`: Change default value of `metrics_flush_interval` from 15s to 60s.
https://github.com/open-telemetry/opentelemetry-collector-contrib/issues/31776
- `otelcol.connector.spanmetrics`: Discard counter span metric exemplars after each flush interval to avoid unbounded memory growth.
This aligns exemplar discarding for counter span metrics with the existing logic for histogram span metrics.
https://github.com/open-telemetry/opentelemetry-collector-contrib/issues/31683
- `otelcol.exporter.loadbalancing`: Fix panic when a sub-exporter is shut down while still handling requests.
https://github.com/open-telemetry/opentelemetry-collector-contrib/issues/31410
- `otelcol.exporter.loadbalancing`: Fix memory leaks on shutdown.
https://github.com/open-telemetry/opentelemetry-collector-contrib/pull/31050
- `otelcol.exporter.loadbalancing`: Allow the timeout period of the k8s resolver list watch to be configured.
https://github.com/open-telemetry/opentelemetry-collector-contrib/issues/31757
- `otelcol.processor.transform`: Change metric unit for metrics extracted with `extract_count_metric()` to be the default unit (`1`).
https://github.com/open-telemetry/opentelemetry-collector-contrib/issues/31575
- `otelcol.receiver.opencensus`: Refactor the receiver to pass lifecycle tests and avoid leaking gRPC connections.
https://github.com/open-telemetry/opentelemetry-collector-contrib/issues/31643
- `otelcol.extension.jaeger_remote_sampling`: Fix leaking goroutine on shutdown.
https://github.com/open-telemetry/opentelemetry-collector-contrib/issues/31157
- `otelcol.receiver.kafka`: Fix panic on shutdown.
https://github.com/open-telemetry/opentelemetry-collector-contrib/issues/31926
- `otelcol.processor.resourcedetection`: Only attempt to detect Kubernetes node resource attributes when they're enabled.
https://github.com/open-telemetry/opentelemetry-collector-contrib/issues/31941
- `otelcol.processor.resourcedetection`: Fix memory leak on AKS.
https://github.com/open-telemetry/opentelemetry-collector-contrib/pull/32574
- `otelcol.processor.resourcedetection`: Update the ec2 scraper so that core attributes are not dropped if describeTags returns an error (likely due to permissions).
https://github.com/open-telemetry/opentelemetry-collector-contrib/pull/30672

v0.40.5 (2024-05-15)
--------------------
3 changes: 1 addition & 2 deletions docs/developer/updating-otel.md
@@ -4,7 +4,6 @@ The Agent depends on various OpenTelemetry (Otel) modules such as these:
```
github.com/open-telemetry/opentelemetry-collector-contrib/exporter/jaegerexporter
github.com/open-telemetry/opentelemetry-collector-contrib/extension/sigv4authextension
github.com/open-telemetry/opentelemetry-collector-contrib/processor/spanmetricsprocessor
go.opentelemetry.io/collector
go.opentelemetry.io/collector/component
go.opentelemetry.io/otel
@@ -24,7 +23,7 @@ Unfortunately, updating Otel dependencies is not straightforward:
* This is mostly so that we can include metrics of Collector components with the metrics shown under the Agent's `/metrics` endpoint.
* All Collector and Collector-Contrib dependencies should be updated at the same time, because they
are kept in sync on the same version.
* E.g. if we use `v0.85.0` of `go.opentelemetry.io/collector`, we also use `v0.85.0` of `spanmetricsprocessor`.
* E.g. if we use `v0.85.0` of `go.opentelemetry.io/collector`, we also use `v0.85.0` of `spanmetricsconnector`.
* This is in line with how the Collector itself imports dependencies.
* It helps us avoid bugs.
* It makes it easier to communicate to customers the version of Collector which we use in the Agent.
@@ -59,13 +59,14 @@ otelcol.connector.servicegraph "LABEL" {

`otelcol.connector.servicegraph` supports the following arguments:

Name | Type | Description | Default | Required
---- | ---- | ----------- | ------- | --------
`latency_histogram_buckets` | `list(duration)` | Buckets for latency histogram metrics. | `["2ms", "4ms", "6ms", "8ms", "10ms", "50ms", "100ms", "200ms", "400ms", "800ms", "1s", "1400ms", "2s", "5s", "10s", "15s"]` | no
`dimensions` | `list(string)` | A list of dimensions to add with the default dimensions. | `[]` | no
`cache_loop` | `duration` | Configures how often to delete series which have not been updated. | `"1m"` | no
`store_expiration_loop` | `duration` | The time to expire old entries from the store periodically. | `"2s"` | no
`metrics_flush_interval` | `duration` | The interval at which metrics are flushed to downstream components. | `"0s"` | no
Name | Type | Description | Default | Required
----------------------------|------------------|-------------------------------------------------------------------- |---------|---------
`latency_histogram_buckets` | `list(duration)` | Buckets for latency histogram metrics. | `["2ms", "4ms", "6ms", "8ms", "10ms", "50ms", "100ms", "200ms", "400ms", "800ms", "1s", "1400ms", "2s", "5s", "10s", "15s"]` | no
`dimensions` | `list(string)` | A list of dimensions to add with the default dimensions. | `[]` | no
`cache_loop` | `duration` | Configures how often to delete series which have not been updated. | `"1m"` | no
`store_expiration_loop` | `duration` | The time to expire old entries from the store periodically. | `"2s"` | no
`metrics_flush_interval` | `duration` | The interval at which metrics are flushed to downstream components. | `"0s"` | no
`database_name_attribute` | `string` | The attribute name used to identify the database name from span attributes. | `"db.name"` | no
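
As a rough illustration of the `database_name_attribute` argument listed above, the following is a minimal sketch rather than a recommended configuration. It assumes spans carry the database name under the default `db.name` attribute and that a downstream component labeled `otelcol.exporter.otlp.default` exists elsewhere in the configuration; both the label and the exporter are hypothetical.

```river
otelcol.connector.servicegraph "default" {
  // Attribute read from spans to identify the database name.
  // "db.name" is the documented default; change it only if your
  // instrumentation uses a different attribute key.
  database_name_attribute = "db.name"

  output {
    // Hypothetical downstream component, defined elsewhere.
    metrics = [otelcol.exporter.otlp.default.input]
  }
}
```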

Service graphs work by inspecting traces and looking for spans with
parent-children relationship that represent a request.
@@ -72,7 +72,8 @@ otelcol.connector.spanmetrics "LABEL" {
| `aggregation_temporality` | `string` | Configures whether to reset the metrics after flushing. | `"CUMULATIVE"` | no |
| `dimensions_cache_size` | `number` | How many dimensions to cache. | `1000` | no |
| `exclude_dimensions` | `list(string)` | List of dimensions to be excluded from the default set of dimensions. | `[]` | no |
| `metrics_flush_interval` | `duration` | How often to flush generated metrics. | `"15s"` | no |
| `metrics_flush_interval` | `duration` | How often to flush generated metrics. | `"60s"` | no |
| `metrics_expiration` | `duration` | Time period after which metrics are considered stale and are removed from the cache. | `"0s"` | no |
| `namespace` | `string` | Metric namespace. | `""` | no |
| `resource_metrics_cache_size` | `number` | The size of the cache holding metrics for a service. | `1000` | no |
| `resource_metrics_key_attributes` | `list(string)` | Limits the resource attributes used to create the metrics. | `[]` | no |
@@ -86,6 +87,8 @@ The supported values for `aggregation_temporality` are:

If `namespace` is set, the generated metric name is prefixed with `namespace.`.

Setting `metrics_expiration` to `"0s"` means that the metrics will never expire.
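
To make the interaction between `metrics_flush_interval` and `metrics_expiration` more concrete, here is a minimal sketch. The `"5m"` expiration is an arbitrary illustrative value, not a recommendation, and `otelcol.exporter.otlp.default` is a hypothetical downstream component assumed to be defined elsewhere.

```river
otelcol.connector.spanmetrics "default" {
  histogram {
    // Use explicit buckets with their default boundaries.
    explicit { }
  }

  // Flush generated metrics once a minute (the new default).
  metrics_flush_interval = "60s"

  // Drop metrics for which no spans arrived in the last 5 minutes.
  // Leaving this at "0s" keeps metrics indefinitely.
  metrics_expiration = "5m"

  output {
    // Hypothetical downstream component, defined elsewhere.
    metrics = [otelcol.exporter.otlp.default.input]
  }
}
```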

`resource_metrics_cache_size` is mostly relevant for cumulative temporality. It helps avoid issues with increasing memory and with incorrect metric timestamp resets.

`resource_metrics_key_attributes` can be used to avoid situations where resource attributes may change across service restarts,
@@ -153,6 +153,7 @@ Name | Type | Description | Default | Required
---- | ---- | ----------- | ------- | --------
`service` | `string` | Kubernetes service to resolve. | | yes
`ports` | `list(number)` | Ports to use with the IP addresses resolved from `service`. | `[4317]` | no
`timeout` | `duration` | Resolver timeout. | `"1s"` | no

If no namespace is specified inside `service`, an attempt will be made to infer the namespace for this Agent.
If this fails, the `default` namespace will be used.
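
The following sketch shows where the `timeout` argument would sit in a `kubernetes` resolver block. The service name `lb-svc.lb-ns` and the `"5s"` value are placeholders chosen for illustration; the surrounding `protocol` block uses default client settings.

```river
otelcol.exporter.loadbalancing "default" {
  resolver {
    kubernetes {
      // Placeholder service name in "<service>.<namespace>" form.
      service = "lb-svc.lb-ns"

      // Raise the resolver timeout from the "1s" default.
      timeout = "5s"
    }
  }

  protocol {
    otlp {
      client { }
    }
  }
}
```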