Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Configuration API changes #10

Open
jjatria opened this issue Dec 3, 2023 · 4 comments
Open

Configuration API changes #10

jjatria opened this issue Dec 3, 2023 · 4 comments

Comments

@jjatria
Copy link
Owner

jjatria commented Dec 3, 2023

Currently, the SDK does not make it easy to customise the configuration too much. It does allow for a great deal of configuration via environment variables, but it would be good to support a more fine-grained, in-code configuration (see also #9 for a case where this feature was desired).

The changes in #6 t add support for the logs API experiment with exposing the internal methods the SDK currently uses to configure individual parts of the SDK. It does this selfishly so that testing is easier, but maybe this could be the way forward.

The SDK currently internally uses

If we decide to go this way, we will undoubtedly need a configure_metrics_provider eventually, but there'd be room to add it.

Currently, both the current configure_span_processors (or configure_tracer_provider) and the proposed configure_logger_provider create a default provider and attach processors on it before setting it as the default. One possible extension here would be to make these accept a provider from the user (which can default to the same current behaviour), and a list of processors / keys to processors to attach to it. The interface could be (more or less) consistent for all three possible flavours of this configurator function (logger, tracer, and metrics).

An entirely alternative route would be to do something like what the Ruby implementation does, which is to expose some sort of "configurator" object which has access to some specific SDK internals that we don't want to (or should) expose publicly.

Of these, I think I tend to prefer the former approach, since it feels more Perlish and straightforward, but the interface I think could benefit from some thinking. It would be good to get an idea of what parts of this configuration process need to be customisable, but if these functions allow the user to pass the base provider instance, then they are allowed also to create that instance however they see fit.

Ideas about this are welcome.

@abh
Copy link
Contributor

abh commented Jan 1, 2024

For what it's worth:

In Go I have an internal helper package that does all the configuration of the trace provider, exporter, default span resources, etc.

I struggled (as you saw in my feedback!) with how awkward this was to replicate in Perl. In parallel I installed the OpenTelemetry operator in my kubernetes cluster. The operator can easily be made to setup many of the relevant environment variables, so it took away my need to make the application easier to configure in the "normal way I configure the application".

@jjatria
Copy link
Owner Author

jjatria commented Apr 30, 2024

Thinking about this a little bit, it seems like really, the part that users are most likely to want to be able to hook into is the creation of the provider. That would have helped the situation in #9 by allowing something like

my $provider = OpenTelemetry::SDK::Trace::TracerProvider->new(
    resource => OpenTelemetry::SDK::Resource->new( ... ),
    ...
);

OpenTelemetry::SDK->configure_tracer_provider($provider);

which would internally apply the default exporters / span processors, etc.

In my previous comment I suggested the idea of allowing also a list of processors to attach to the provider, but really, that's just

$provider->add_span_processor($_) for @processors;

so I'm not sure of what benefit that would be (and truth be told, in 99% of real-world cases, that will be a batch processor anyway). But it would be useful to be able to hook into the default exporter detection and attachment, etc.

So I think exposing the configure_{logger,tracer,metrics}_provider class method and allowing it to receive the provider on which to apply the exporters / processors, etc, is probably enough. If the user wants to do anything more involved, there's nothing stopping them from

my $exporter = My::Exporter->new( ... );
my $processor = My::Processor->new( exporter => $exporter, ... );
my $provider = My::Provider->new( ... );
$provider->add_foo_processor($processor);
OpenTelemetry->foo_provider = $provider;

And if this API is not enough, this will also not prevent us from extending it in the future: we could accept more positional parameters, or switch to named ones since we'll still be able to figure out if the user is passing a key/value list of just a provider.

As before, comments on a suggested use case or the needs of an API are more than welcome.

@jjatria
Copy link
Owner Author

jjatria commented May 1, 2024

An alternative I thought of overnight would be to allow some callbacks to be set for specific parts of the setup. I still think this might be a little too early for this approach, but I'm thinking of something like

OpenTelemetry::SDK->configure_tracer_provider(
    # A little verbose if all we want is to pass in a provider,
    # but would be the most consistent
    new_provider => sub {
        My::Custom::Provider->new( ... );
    },

    new_processor => sub ( $exporter ) {
        # $exporter is the one created by the previous callback
        return Some::Processor->new( exporter => $exporter, ... );
    },

    new_exporter => sub ( $name ) {
        # $name could be the slug that we read from the env (eg. "otlp")
        return Some::Exporter->new( ... );
    },
);

Then the code of the function would be similar to

sub configure_tracer_provider ( %args ) {
    state %map = (
        jaeger  => '::Jaeger',
        otlp    => '::OTLP',
        zipkin  => '::Zipkin',
        console => 'OpenTelemetry::SDK::Exporter::Console',
    );
 
    my $new_provider = $args{new_provider} // sub {
        OpenTelemetry::SDK::Trace::TracerProvider->new;
    };

    my $new_processor = $args{new_processor} // sub ( $exporter ) {
        my $class = 'OpenTelemetry::SDK::Trace::Span::Processor::'
            . ( $exporter isa OpenTelemetry::SDK::Exporter::Console ? 'Simple' : 'Batch' );
 
        Module::Runtime::require_module $class;
        $class->new( exporter => $exporter );
    };

    my $new_exporter = $args{new_exporter} // sub ( $slug ) {
        my $class = $map{$slug} or return;

        $class = 'OpenTelemetry::Exporter' . $class if $class =~ /^::/;

        Module::Runtime::require_module $class;
        $class->new;
    };
 
    my @names = split ',',
        ( config('TRACES_EXPORTER') // 'otlp' ) =~ s/\s//gr;
 
    my $provider = $new_provider->();
 
    my %seen = ( none => 1 );
    for my $name ( @names ) {
        next if $seen{$name}++;

        try {
            my $exporter = $new_exporter->($name);

            unless ( $exporter ) {
                $logger->warnf("Unknown exporter '%s' cannot be configured", $name);
                next;
            }

            my $processor = $new_processor->( $exporter );
            $provider->add_span_processor( $processor );
        }
        catch ($e) {
            $logger->warnf("Error configuring '%s' span exporter: %s", $name, $e);
        }
    }
 
    OpenTelemetry->tracer_provider = $provider;
}

with I guess some additional error checks, etc. Would need to figure out exactly what we want if eg. the callbacks fail, or if they don't return what we expect, etc.

@jjatria
Copy link
Owner Author

jjatria commented Jul 30, 2024

Would need to figure out exactly what we want if eg. the callbacks fail, or if they don't return what we expect, etc.

I think we can solve the first of these, as well as how to allow users to extend the defaults rather than completely overwrite them. We could allow the user to define a callback, and call it inside a try block. If the user did not define a callback, or the function failed, or it returned falsy, then we call our default handler. If the user provided a callback, and the callback returned a true value, then we could either accept that as-is, or validate, and call our default handler again if it did not pass validation.

So, in pseudo-code, something like

my $default = sub { ... };
my $callback = $args{new_provider} // $default;

my $provider;
try { $provider = $callback->() }
catch ($e) {
    # log error
}

undef $provider if $provider && $provider->$is_not_valid;

$provider //= $default->();

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants