Add OpenTelemetry support during function proxy #1684

Open
@LucasRoesler

Description

My actions before raising this issue

Expected Behaviour

During function proxy, the Gateway should be able to produce OpenTelemetry spans.

Current Behaviour

The Gateway currently produces no tracing spans.

List All Possible Solutions and Workarounds

Which Solution Do You Recommend?

I recently did a walk-through for integrating OpenTelemetry with OpenFaaS functions and think it would be nice if the Gateway could produce OpenTelemetry spans during function invocation. Adding tracing during the function proxy would provide a more accurate picture of the networking in the cluster and enable accurate assessments of the overhead (or lack thereof) from the Gateway.

We previously discussed this in general in #1354, but OpenTelemetry was not an active project at the time, only OpenTracing. OpenTelemetry makes this integration much more feasible now because we can more easily provide support for multiple exporters. Additionally, the OpenTelemetry providers generally allow all of the required configuration via env variables, which means the integration should require only minimal changes to the Gateway.

During Gateway startup we would initialize and set the global tracing provider using something like this:

shutdownTracing, err := tracing.Provider(config.Version, config.Commit)
if err != nil {
	log.Fatal(err)
}
// Cleanly shutdown and flush telemetry when the application exits.
defer shutdownTracing(ctx)

We can then encapsulate all of the tracing-specific code in the Provider implementation:

func Provider(version, commit string) (shutdown Shutdown, err error) {
	exporter := Exporter(os.Getenv("OTEL_EXPORTER"))

	var exp tracesdk.TracerProviderOption
	switch exporter {
	case JaegerExporter:
		// configure the collector from the env variables,
		// OTEL_EXPORTER_JAEGER_ENDPOINT/USER/PASSWORD
		j, e := jaeger.New(jaeger.WithCollectorEndpoint())
		exp, err = tracesdk.WithBatcher(j), e
	case LogExporter:
		w := os.Stdout
		opts := []stdouttrace.Option{stdouttrace.WithWriter(w)}
		if truthyEnv("OTEL_EXPORTER_LOG_PRETTY_PRINT") {
			opts = append(opts, stdouttrace.WithPrettyPrint())
		}
		if !truthyEnv("OTEL_EXPORTER_LOG_TIMESTAMPS") {
			opts = append(opts, stdouttrace.WithoutTimestamps())
		}

		s, e := stdouttrace.New(opts...)
		exp, err = tracesdk.WithSyncer(s), e
	// additional exporters
	default:
		logrus.Warn("tracing disabled")
		// We explicitly DO NOT set the global TracerProvider using otel.SetTracerProvider().
		// The unset TracerProvider returns a no-op "non-recording" span, but still passes through context.
		otel.SetTextMapPropagator(
			propagation.NewCompositeTextMapPropagator(propagation.TraceContext{}, propagation.Baggage{}),
		)
		// return no-op shutdown function
		return func(_ context.Context) {}, nil
	}
	if err != nil {
		return nil, err
	}
	
	// some additional work to
	// finish initializing the provider 
	
	otel.SetTracerProvider(provider)

	shutdown = func(ctx context.Context) {
		// Do not let the application hang forever when it is shutdown.
		ctx, cancel := context.WithTimeout(ctx, time.Second*5)
		defer cancel()

		err := provider.Shutdown(ctx)
		if err != nil {
			logrus.WithError(err).Error("tracing provider did not gracefully shutdown")
		}
	}
	return shutdown, nil
}

Inside the function invocation handler here

baseURL := baseURLResolver.Resolve(r)
we would add

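	// err must be assigned (not shadowed with :=) by the code that follows,
	// otherwise the deferred func below never sees the failure.
	var err error
	_, span := otel.Tracer("Gateway").Start(r.Context(), "Proxy")
	defer func() {
		// Mark the span as failed and attach the error before ending it.
		if err != nil {
			span.SetStatus(codes.Error, err.Error())
			span.RecordError(err)
		}
		span.End()
	}()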

This would then show as a new span named "Proxy" between the ingress and the function (if they have tracing enabled). There are a few other things we could do, e.g. adding the status code, original URL, and request URL as metadata to the span, but this is optional for a minimal implementation.

Steps to Reproduce (for bugs)

  1. Follow this walkthrough https://github.com/LucasRoesler/openfaas-tracing-walkthrough

Context

https://github.com/LucasRoesler/openfaas-tracing-walkthrough
