Add the otelarrow receiver scaffold #26519
Changes from all commits
ee43bd3
399eac3
6605a23
7c0a26b
67aee1e
d3490a8
6817b9e
6556aca
8d9c990
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,27 @@ | ||
# Use this changelog template to create an entry for release notes. | ||
|
||
# One of 'breaking', 'deprecation', 'new_component', 'enhancement', 'bug_fix' | ||
change_type: new_component | ||
|
||
# The name of the component, or a single word describing the area of concern, (e.g. filelogreceiver) | ||
component: otelarrowreceiver | ||
|
||
# A brief description of the change. Surround your text with quotes ("") if it needs to start with a backtick (`). | ||
note: The OTel Arrow receiver receives telemetry data using the OTel-Arrow protocol via gRPC and standard OTLP protocol via gRPC or HTTP. | ||
|
||
# Mandatory: One or more tracking issues related to the change. You can use the PR number here if no issue exists. | ||
issues: [26491] | ||
|
||
# (Optional) One or more lines of additional information to render under the primary note. | ||
# These lines will be padded with 2 spaces and then inserted directly into the document. | ||
# Use pipe (|) for multiline entries. | ||
subtext: | ||
|
||
# If your change doesn't affect end users or the exported elements of any package, | ||
# you should instead start your pull request title with [chore] or use the "Skip Changelog" label. | ||
# Optional: The change log or logs in which this entry should be included. | ||
# e.g. '[user]' or '[user, api]' | ||
# Include 'user' if the change is relevant to end users. | ||
# Include 'api' if there is a change to a library API. | ||
# Default: '[user]' | ||
change_logs: [user] |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
include ../../Makefile.Common |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,184 @@ | ||
# OTel-Arrow Receiver | ||
|
||
<!-- status autogenerated section --> | ||
| Status | | | ||
| ------------- |-----------| | ||
| Stability | [development]: traces, metrics, logs | | ||
| Distributions | [contrib] | | ||
| Issues | [![Open issues](https://img.shields.io/github/issues-search/open-telemetry/opentelemetry-collector-contrib?query=is%3Aissue%20is%3Aopen%20label%3Areceiver%2Fotelarrow%20&label=open&color=orange&logo=opentelemetry)](https://github.com/open-telemetry/opentelemetry-collector-contrib/issues?q=is%3Aopen+is%3Aissue+label%3Areceiver%2Fotelarrow) [![Closed issues](https://img.shields.io/github/issues-search/open-telemetry/opentelemetry-collector-contrib?query=is%3Aissue%20is%3Aclosed%20label%3Areceiver%2Fotelarrow%20&label=closed&color=blue&logo=opentelemetry)](https://github.com/open-telemetry/opentelemetry-collector-contrib/issues?q=is%3Aclosed+is%3Aissue+label%3Areceiver%2Fotelarrow) | | ||
| [Code Owners](https://github.com/open-telemetry/opentelemetry-collector-contrib/blob/main/CONTRIBUTING.md#becoming-a-code-owner) | [@jmacd](https://www.github.com/jmacd), [@lquerel](https://www.github.com/lquerel) | | ||
|
||
[development]: https://github.com/open-telemetry/opentelemetry-collector#development | ||
[contrib]: https://github.com/open-telemetry/opentelemetry-collector-releases/tree/main/distributions/otelcol-contrib | ||
<!-- end autogenerated section --> | ||
|
||
This is a multi-protocol telemetry receiver. It receives telemetry data | ||
using the standard forms of [OTLP (gRPC, HTTP/proto, HTTP/json)]( | ||
https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/protocol/otlp.md) | ||
and [OTel-Arrow](https://github.com/open-telemetry/otel-arrow) (gRPC). | ||
|
||
## Getting Started | ||
|
||
The OTel-Arrow receiver is an extension of the core OpenTelemetry | ||
Collector [OTLP | ||
receiver](https://github.com/open-telemetry/opentelemetry-collector/tree/main/receiver/otlpreceiver) | ||
component with additional support for the | ||
[OTel-Arrow](https://github.com/open-telemetry/otel-arrow) protocol. | ||
|
||
OTel-Arrow supports column-oriented data transport using the Apache | ||
Arrow data format. The OTel-Arrow | ||
exporter | ||
converts OTLP data into an optimized representation and then sends | ||
batches of data using Apache Arrow to encode the stream. This | ||
component contains logic to reverse the process used in the OTel-Arrow | ||
exporter. | ||
|
||
The use of an OTel-Arrow exporter-receiver pair is recommended when | ||
the network is expensive. Typically, expect to see a 50% reduction in | ||
bandwidth compared with the same data being sent using standard | ||
OTLP/gRPC and gzip compression. | ||
|
||
This component includes all the features and configuration of the core | ||
OTLP receiver, making it possible to upgrade from the core component | ||
simply by replacing "otlp" with "otelarrow" as the component name in | ||
the collector configuration. | ||
|
||
To enable the OTel-Arrow receiver, include it in the list of receivers | ||
for a pipeline. No further configuration is needed. This receiver | ||
listens on the standard OTLP/gRPC port 4317 and serves standard OTLP | ||
over gRPC out of the box. | ||
|
||
```yaml | ||
receivers: | ||
otelarrow: | ||
``` | ||
|
||
## Advanced Configuration | ||
|
||
Users may wish to configure gRPC settings, for example: | ||
|
||
```yaml | ||
receivers: | ||
otelarrow: | ||
protocols: | ||
grpc: | ||
... | ||
``` | ||
|
||
- `endpoint` (default = 0.0.0.0:4317 for grpc protocol, 0.0.0.0:4318 for http protocol): | ||
host:port on which the receiver listens for data. The valid syntax is | ||
described at https://github.com/grpc/grpc/blob/master/doc/naming.md. | ||
|
||
Several common configuration structures provide additional capabilities automatically: | ||
|
||
- [gRPC settings](https://github.com/open-telemetry/opentelemetry-collector/blob/main/config/configgrpc/README.md) | ||
- [TLS and mTLS settings](https://github.com/open-telemetry/opentelemetry-collector/blob/main/config/configtls/README.md) | ||
|
||
### Arrow-specific Configuration | ||
|
||
In the `arrow` configuration block, the following settings are available: | ||
|
||
- `memory_limit` (default: 128MiB): limits the amount of concurrent memory used by Arrow data buffers. | ||
|
||
When the limit is reached, the receiver will return RESOURCE_EXHAUSTED | ||
error codes to the exporter, which are [conditionally retryable, see | ||
exporter retry configuration](https://github.com/open-telemetry/opentelemetry-collector/blob/main/exporter/exporterhelper/README.md). | ||
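
As a rough illustration of how a shared byte budget can lead to RESOURCE_EXHAUSTED responses, here is a minimal Go sketch. The names `byteBudget`, `acquire`, `release`, and `errResourceExhausted` are hypothetical and are not taken from the receiver; a plain Go error stands in for the gRPC status code.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// errResourceExhausted is a hypothetical stand-in for the gRPC
// RESOURCE_EXHAUSTED status described above.
var errResourceExhausted = errors.New("resource exhausted: arrow memory limit reached")

// byteBudget is a hypothetical non-blocking limiter shared by all streams.
type byteBudget struct {
	mu    sync.Mutex
	limit uint64 // total bytes allowed in flight
	used  uint64 // bytes currently reserved; never exceeds limit
}

func newByteBudget(limit uint64) *byteBudget {
	return &byteBudget{limit: limit}
}

// acquire reserves n bytes, or fails immediately when the budget is full.
func (b *byteBudget) acquire(n uint64) error {
	b.mu.Lock()
	defer b.mu.Unlock()
	if n > b.limit-b.used {
		return errResourceExhausted
	}
	b.used += n
	return nil
}

// release returns n bytes to the budget once a buffer is no longer needed.
func (b *byteBudget) release(n uint64) {
	b.mu.Lock()
	defer b.mu.Unlock()
	if n > b.used {
		n = b.used
	}
	b.used -= n
}

func main() {
	budget := newByteBudget(128 << 20) // 128MiB, matching the documented default

	fmt.Println(budget.acquire(100 << 20)) // fits within the budget
	fmt.Println(budget.acquire(64 << 20))  // would exceed the budget
	budget.release(100 << 20)
	fmt.Println(budget.acquire(64 << 20)) // fits again after the release
}
```

Failing fast rather than blocking matches the behavior described above, where the exporter is expected to retry.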
|
||
### Keepalive configuration | ||
|
||
As a gRPC streaming service, the OTel Arrow receiver is able to limit | ||
stream lifetime through configuration of the underlying http/2 | ||
connection via keepalive settings. | ||
|
||
Keepalive settings are vital to the operation of OTel Arrow, because | ||
longer-lived streams use more memory and streams are fixed to a single | ||
host. Since every stream of data is different, we recommend | ||
experimenting to find a good balance between memory usage, stream | ||
lifetime, and load balance. | ||
|
||
gRPC libraries do not provide a built-in facility for long-lived RPCs to learn | ||
about impending http/2 connection state changes, including the event | ||
that initiates connection reset. While the receiver knows its own | ||
keepalive settings, a shorter maximum connection lifetime can be | ||
imposed by intermediate http/2 proxies, and therefore the receiver and | ||
exporter are expected to independently configure these limits. | ||
|
||
```yaml | ||
receivers: | ||
otelarrow: | ||
protocols: | ||
grpc: | ||
keepalive: | ||
server_parameters: | ||
max_connection_age: 1m | ||
max_connection_age_grace: 10m | ||
``` | ||
|
||
In the example configuration above, OTel-Arrow streams will have reset | ||
initiated after 10 minutes. Note that `max_connection_age` is set to | ||
a small value and we recommend tuning `max_connection_age_grace`. | ||
|
||
OTel Arrow exporters are expected to configure their | ||
`max_stream_lifetime` property to a value that is slightly smaller | ||
than the receiver's `max_connection_age_grace` setting, which causes | ||
the exporter to cleanly shut down streams, allowing requests to | ||
complete before the http/2 connection is forcibly closed. While the | ||
exporter will retry data that was in-flight during an unexpected | ||
stream shutdown, instrumentation about the telemetry pipeline will show | ||
RPC errors when the exporter's `max_stream_lifetime` is not configured | ||
correctly. | ||
|
||
[See the exporter README for more | ||
guidance](https://github.com/open-telemetry/opentelemetry-collector/blob/main/exporter/exporterhelper/README.md). For the | ||
example where `max_connection_age_grace` is set to 10 minutes, the | ||
exporter's `max_stream_lifetime` should be set to the same number | ||
minus a reasonable timeout to allow in-flight requests to complete. | ||
For example, an exporter with `9m30s` stream lifetime: | ||
|
||
```yaml | ||
exporters: | ||
otelarrow: | ||
timeout: 30s | ||
arrow: | ||
max_stream_lifetime: 9m30s | ||
endpoint: ... | ||
tls: ... | ||
``` | ||
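
The arithmetic behind the `9m30s` value can be sketched in Go as follows. The helper name `streamLifetime` is hypothetical and is not part of either component; it only subtracts the exporter timeout from the receiver's grace period and rejects combinations that leave no room for a stream.

```go
package main

import (
	"fmt"
	"time"
)

// streamLifetime is a hypothetical helper: it suggests an exporter
// max_stream_lifetime from the receiver's max_connection_age_grace and
// the exporter's request timeout.
func streamLifetime(grace, timeout time.Duration) (time.Duration, error) {
	if timeout <= 0 || timeout >= grace {
		return 0, fmt.Errorf("timeout %v must be positive and smaller than grace %v", timeout, grace)
	}
	return grace - timeout, nil
}

func main() {
	d, err := streamLifetime(10*time.Minute, 30*time.Second)
	if err != nil {
		panic(err)
	}
	fmt.Println(d) // prints 9m30s
}
```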
|
||
### Receiver metrics | ||
|
||
In addition to the standard | ||
[obsreport](https://pkg.go.dev/go.opentelemetry.io/collector/obsreport) | ||
metrics, this component provides network-level measurement instruments | ||
which we anticipate will become part of `obsreport` in the future. At | ||
the `normal` level of metrics detail: | ||
|
||
- `receiver_recv`: uncompressed bytes received, prior to compression | ||
- `receiver_recv_wire`: compressed bytes received, on the wire. | ||
|
||
Arrow's compression performance can be derived by dividing the average | ||
`receiver_recv` value by the average `receiver_recv_wire` value. | ||
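
As a small worked example of that division, assuming the two averages have already been read from a metrics backend (the function name `compressionRatio` and the sample values are hypothetical):

```go
package main

import "fmt"

// compressionRatio divides average uncompressed bytes (receiver_recv) by
// average on-the-wire bytes (receiver_recv_wire). The boolean is false
// when no wire bytes were observed, so the ratio is undefined.
func compressionRatio(recv, recvWire float64) (float64, bool) {
	if recvWire <= 0 {
		return 0, false
	}
	return recv / recvWire, true
}

func main() {
	// Hypothetical averages: 1,000,000 uncompressed bytes, 250,000 on the wire.
	ratio, ok := compressionRatio(1_000_000, 250_000)
	fmt.Println(ratio, ok) // prints 4 true
}
```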
|
||
At the `detailed` metrics detail level, information about the stream | ||
of data being returned from the receiver will be instrumented: | ||
|
||
- `receiver_sent`: uncompressed bytes sent, prior to compression | ||
- `receiver_sent_wire`: compressed bytes sent, on the wire. | ||
|
||
## HTTP-specific documentation | ||
|
||
To enable optional OTLP/HTTP support, the HTTP protocol must be | ||
explicitly listed. It will use port 4318 by default. | ||
|
||
```yaml | ||
receivers: | ||
otelarrow: | ||
protocols: | ||
http: | ||
``` | ||
|
||
See the core OTLP receiver for documentation specific to HTTP | ||
connections, including: | ||
|
||
- [Writing with HTTP/JSON](https://github.com/open-telemetry/opentelemetry-collector/tree/main/receiver/otlpreceiver#writing-with-httpjson) | ||
- [CORS (Cross-origin resource sharing)](https://github.com/open-telemetry/opentelemetry-collector/tree/main/receiver/otlpreceiver#cors-cross-origin-resource-sharing) |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,119 @@ | ||
// Copyright The OpenTelemetry Authors | ||
// SPDX-License-Identifier: Apache-2.0 | ||
|
||
package otelarrowreceiver // import "github.com/open-telemetry/opentelemetry-collector-contrib/receiver/otelarrowreceiver" | ||
|
||
import ( | ||
"errors" | ||
"fmt" | ||
"net/url" | ||
"path" | ||
|
||
"go.opentelemetry.io/collector/component" | ||
"go.opentelemetry.io/collector/config/configgrpc" | ||
"go.opentelemetry.io/collector/config/confighttp" | ||
"go.opentelemetry.io/collector/confmap" | ||
) | ||
|
||
const ( | ||
// Protocol values. | ||
protoGRPC = "protocols::grpc" | ||
protoHTTP = "protocols::http" | ||
) | ||
|
||
type httpServerSettings struct { | ||
*confighttp.HTTPServerSettings `mapstructure:",squash"` | ||
|
||
// The URL path to receive traces on. If omitted "/v1/traces" will be used. | ||
TracesURLPath string `mapstructure:"traces_url_path,omitempty"` | ||
|
||
// The URL path to receive metrics on. If omitted "/v1/metrics" will be used. | ||
MetricsURLPath string `mapstructure:"metrics_url_path,omitempty"` | ||
|
||
// The URL path to receive logs on. If omitted "/v1/logs" will be used. | ||
LogsURLPath string `mapstructure:"logs_url_path,omitempty"` | ||
} | ||
|
||
// Protocols is the configuration for the supported protocols. | ||
type Protocols struct { | ||
GRPC *configgrpc.GRPCServerSettings `mapstructure:"grpc"` | ||
HTTP *httpServerSettings `mapstructure:"http"` | ||
Arrow *ArrowSettings `mapstructure:"arrow"` | ||
} | ||
|
||
// ArrowSettings support configuring the Arrow receiver. | ||
type ArrowSettings struct { | ||
// MemoryLimit is the size of a shared memory region used by | ||
// all Arrow streams. When too much load is passing through, they | ||
// will see ResourceExhausted errors. | ||
MemoryLimit uint64 `mapstructure:"memory_limit"` | ||
} | ||
|
||
// Config defines configuration for OTel Arrow receiver. | ||
type Config struct { | ||
// Protocols is the configuration for the supported protocols, currently gRPC and HTTP (Proto and JSON). | ||
Protocols `mapstructure:"protocols"` | ||
} | ||
|
||
var _ component.Config = (*Config)(nil) | ||
var _ confmap.Unmarshaler = (*Config)(nil) | ||
|
||
// Validate checks the receiver configuration is valid | ||
func (cfg *Config) Validate() error { | ||
if cfg.GRPC == nil && cfg.HTTP == nil { | ||
return errors.New("must specify at least one protocol when using the OTel Arrow receiver") | ||
} | ||
if cfg.Arrow != nil && cfg.GRPC == nil { | ||
return errors.New("must specify the gRPC protocol when using the OTel Arrow receiver") | ||
|
||
} | ||
return nil | ||
} | ||
|
||
// Unmarshal a confmap.Conf into the config struct. | ||
func (cfg *Config) Unmarshal(conf *confmap.Conf) error { | ||
// first load the config normally | ||
err := conf.Unmarshal(cfg, confmap.WithErrorUnused()) | ||
if err != nil { | ||
return err | ||
} | ||
|
||
// Note: since this is the OTel-Arrow receiver, not the core component, | ||
Is http only not a valid configuration for this receiver? if it is, then i would remove this comment and re-enable the check below
HTTP only means you've disabled OTel Arrow and you're identical to an OTLP receiver. I'm not sure why the user would want this. (Again, open-telemetry/otel-arrow#43 comes up.) README states:
and later
if we added OTel Arrow support for HTTP streams, possibly, then it would make sense to apply the change you described, but I think we should separate the two transports into separate components. |
||
// we allow a configuration that is free of an explicit protocol, i.e., | ||
// we assume gRPC but we do not assume HTTP, whereas the core component | ||
// also has: | ||
// | ||
// if !conf.IsSet(protoGRPC) { | ||
// cfg.GRPC = nil | ||
// } | ||
|
||
if !conf.IsSet(protoHTTP) { | ||
cfg.HTTP = nil | ||
} else { | ||
var err error | ||
|
||
if cfg.HTTP.TracesURLPath, err = sanitizeURLPath(cfg.HTTP.TracesURLPath); err != nil { | ||
return err | ||
} | ||
if cfg.HTTP.MetricsURLPath, err = sanitizeURLPath(cfg.HTTP.MetricsURLPath); err != nil { | ||
return err | ||
} | ||
if cfg.HTTP.LogsURLPath, err = sanitizeURLPath(cfg.HTTP.LogsURLPath); err != nil { | ||
return err | ||
} | ||
} | ||
|
||
return nil | ||
} | ||
|
||
// Verify signal URL path sanity | ||
func sanitizeURLPath(urlPath string) (string, error) { | ||
u, err := url.Parse(urlPath) | ||
if err != nil { | ||
return "", fmt.Errorf("invalid HTTP URL path set for signal: %w", err) | ||
} | ||
|
||
if !path.IsAbs(u.Path) { | ||
u.Path = "/" + u.Path | ||
} | ||
return u.Path, nil | ||
} |
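
To see what `sanitizeURLPath` does with a few inputs, the function can be copied into a standalone program. This is only a sketch for trying it out: the sample inputs are chosen for illustration, and the `main` wrapper is not part of the PR.

```go
package main

import (
	"fmt"
	"net/url"
	"path"
)

// sanitizeURLPath is copied from the diff above: it parses the input as a
// URL and returns its path component, adding a leading "/" when missing.
func sanitizeURLPath(urlPath string) (string, error) {
	u, err := url.Parse(urlPath)
	if err != nil {
		return "", fmt.Errorf("invalid HTTP URL path set for signal: %w", err)
	}

	if !path.IsAbs(u.Path) {
		u.Path = "/" + u.Path
	}
	return u.Path, nil
}

func main() {
	// Illustrative inputs: relative path, absolute path, empty string,
	// and a string with an invalid percent-escape.
	for _, in := range []string{"v1/traces", "/v1/metrics", "", "%zz"} {
		out, err := sanitizeURLPath(in)
		fmt.Printf("%q -> %q (err: %v)\n", in, out, err)
	}
}
```

Note that an empty string is normalized to `/` by this function alone; the `/v1/traces`-style defaults mentioned in the struct comments would have to be supplied elsewhere.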
Does this need a separate section for configuration? If the only setting is a memory limit, any reason not to include it at the top level?
Relates to open-telemetry/otel-arrow#43. This code is designed to drop-in where an OTLP receiver once stood, so leaving the Arrow settings in a separate section for future compatibility (was the idea).