Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add otel tracing to authorino #380

Merged
merged 11 commits into from
Mar 1, 2023
Merged

Conversation

Rohith-Raju
Copy link
Contributor

@Rohith-Raju Rohith-Raju commented Feb 14, 2023

Hello, This pr is #163

Tasks Done

  • Import open tracing packages and create a new exporter
  • Invoke Otel function in main()

Verification steps courtesy @guicassolato :

make cluster install build

docker run --rm --name jaeger -d \
  -e COLLECTOR_ZIPKIN_HTTP_PORT=9411 \
  -p 5775:5775/udp \
  -p 6831:6831/udp \
  -p 6832:6832/udp \
  -p 5778:5778 \
  -p 16686:16686 \
  -p 14268:14268 \
  -p 9411:9411 \
  jaegertracing/all-in-one:1.6

bin/authorino server --observability-service-endpoint http://localhost:14268/api/traces

Then, in a separate shell:

curl http://localhost:5001/check

kubectl apply -f -<<EOF
apiVersion: authorino.kuadrant.io/v1beta1
kind: AuthConfig
metadata:
  name: my-authconfig
spec:
  hosts:
  - localhost
EOF

curl http://localhost:5001/check

Navigate to http://localhost:16686 to check the traces in the Jaeger UI.

Signed-off-by: ROHITH RAJU rohithraju488@gmail.com

Signed-off-by: ROHITH RAJU <rohithraju488@gmail.com>
Signed-off-by: ROHITH RAJU <rohithraju488@gmail.com>
@Rohith-Raju Rohith-Raju marked this pull request as draft February 14, 2023 15:18
@Rohith-Raju
Copy link
Contributor Author

What should my next steps be ?

@guicassolato
Copy link
Collaborator

What should my next steps be ?

This is a very good start, @Rohith-Raju. Thank you again for putting this together!

I've left a few comments. After that, I imagine your next move would be instrumenting the AuthService.Check func at

func (a *AuthService) Check(parentContext gocontext.Context, req *envoy_auth.CheckRequest) (*envoy_auth.CheckResponse, error) {

@Rohith-Raju
Copy link
Contributor Author

Rohith-Raju commented Feb 15, 2023

I've left a few comments.

HI @guicassolato, I don't see them here.

main.go Outdated Show resolved Hide resolved
main.go Outdated Show resolved Hide resolved
pkg/trace/exporter.go Outdated Show resolved Hide resolved
pkg/trace/exporter.go Outdated Show resolved Hide resolved
pkg/trace/exporter.go Outdated Show resolved Hide resolved
pkg/trace/exporter.go Outdated Show resolved Hide resolved
pkg/trace/exporter.go Outdated Show resolved Hide resolved
pkg/trace/exporter.go Outdated Show resolved Hide resolved
@guicassolato
Copy link
Collaborator

I've left a few comments.

HI @guicassolato, I don't see them here.

My bad @Rohith-Raju. The review was pending in GH.

pkg/trace/exporter.go Outdated Show resolved Hide resolved
main.go Outdated Show resolved Hide resolved
Signed-off-by: ROHITH RAJU <rohithraju488@gmail.com>
Signed-off-by: ROHITH RAJU <rohithraju488@gmail.com>
@guicassolato
Copy link
Collaborator

PR looking great, @Rohith-Raju!

Do you mind writing down a few verification steps in the description, so people can try this out manually and see how it looks like? It usually starts with make local-setup.

Thanks!

pkg/trace/exporter.go Outdated Show resolved Hide resolved
main.go Outdated Show resolved Hide resolved
@guicassolato
Copy link
Collaborator

Verification steps I used to try this out locally:

make cluster install build

docker run --rm --name jaeger -d \
  -e COLLECTOR_ZIPKIN_HTTP_PORT=9411 \
  -p 5775:5775/udp \
  -p 6831:6831/udp \
  -p 6832:6832/udp \
  -p 5778:5778 \
  -p 16686:16686 \
  -p 14268:14268 \
  -p 9411:9411 \
  jaegertracing/all-in-one:1.6

bin/authorino server --observability-service-endpoint http://localhost:14268/api/traces

Then, in a separate shell:

curl http://localhost:5001/check

kubectl apply -f -<<EOF
apiVersion: authorino.kuadrant.io/v1beta1
kind: AuthConfig
metadata:
  name: my-authconfig
spec:
  hosts:
  - localhost
EOF

curl http://localhost:5001/check

Navigate to http://localhost:16686 to check the traces in the Jaeger UI.

Signed-off-by: ROHITH RAJU <rohithraju488@gmail.com>
@Rohith-Raju Rohith-Raju marked this pull request as ready for review February 22, 2023 12:44
pkg/service/auth.go Outdated Show resolved Hide resolved
Signed-off-by: ROHITH RAJU <rohithraju488@gmail.com>
main.go Outdated
Comment on lines 40 to 42
otel_grpc "go.opentelemetry.io/contrib/instrumentation/google.golang.org/grpc/otelgrpc"
otel_http "go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp"
otel_propagation "go.opentelemetry.io/otel/propagation"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you mind separating this block? We try to keep the imports separated in at least 3 blocks:

  • Go standard library packages
  • github.com/kuadrant/authorino/* packages
  • Third-party packages

main.go Outdated Show resolved Hide resolved
main.go Show resolved Hide resolved
pkg/service/auth.go Outdated Show resolved Hide resolved
pkg/service/auth.go Outdated Show resolved Hide resolved
@guicassolato
Copy link
Collaborator

Hi @Rohith-Raju. Thanks again for putting this work together.

I finally had time to finish reviewing the PR today. I'll leave a few suggestions here since I cannot push to your branch:

diff --git a/main.go b/main.go
index 93df9c8..dfa82b0 100644
--- a/main.go
+++ b/main.go
@@ -37,16 +37,12 @@ import (
 	"github.com/kuadrant/authorino/pkg/service"
 	"github.com/kuadrant/authorino/pkg/trace"
 	"github.com/kuadrant/authorino/pkg/utils"
-	otel_grpc "go.opentelemetry.io/contrib/instrumentation/google.golang.org/grpc/otelgrpc"
-	otel_http "go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp"
-	otel_propagation "go.opentelemetry.io/otel/propagation"
 
 	envoy_auth "github.com/envoyproxy/go-control-plane/envoy/service/auth/v3"
 	grpc_prometheus "github.com/grpc-ecosystem/go-grpc-prometheus"
 	"github.com/prometheus/client_golang/prometheus/promhttp"
 	"github.com/spf13/cobra"
 	"github.com/spf13/pflag"
-	"go.opentelemetry.io/otel"
 	"google.golang.org/grpc"
 	"google.golang.org/grpc/credentials"
 	healthpb "google.golang.org/grpc/health/grpc_health_v1"
@@ -56,6 +52,11 @@ import (
 	_ "k8s.io/client-go/plugin/pkg/client/auth/gcp"
 	ctrl "sigs.k8s.io/controller-runtime"
 	"sigs.k8s.io/controller-runtime/pkg/healthz"
+
+	otel_grpc "go.opentelemetry.io/contrib/instrumentation/google.golang.org/grpc/otelgrpc"
+	otel_http "go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp"
+	"go.opentelemetry.io/otel"
+	otel_propagation "go.opentelemetry.io/otel/propagation"
 	// +kubebuilder:scaffold:imports
 )
 
@@ -89,7 +90,7 @@ var (
 	enableLeaderElection           bool
 	maxHttpRequestBodySize         int64
 	observabilityServiceEndpoint   string
-	observabilityServiveSeed       string
+	observabilityServiceSeed       string
 
 	scheme = runtime.NewScheme()
 
@@ -134,6 +135,7 @@ func main() {
 	cmdServer.PersistentFlags().BoolVar(&enableLeaderElection, "enable-leader-election", false, "Enable leader election for status updater - ensures only one instance of Authorino tries to update the status of reconciled resources")
 	cmdServer.PersistentFlags().Int64Var(&maxHttpRequestBodySize, "max-http-request-body-size", utils.EnvVar("MAX_HTTP_REQUEST_BODY_SIZE", int64(8192)), "Maximum size of the body of requests accepted in the raw HTTP interface of the authorization server - in bytes")
 	cmdServer.PersistentFlags().StringVar(&observabilityServiceEndpoint, "observability-service-endpoint", "", "Endpoint URL of the OpenTelemetry collector service")
+	cmdServer.PersistentFlags().StringVar(&observabilityServiceSeed, "observability-service-seed", "", "Seed attribute of the OpenTelemetry resource")
 
 	cmdVersion := &cobra.Command{
 		Use:   "version",
@@ -180,7 +182,7 @@ func run(cmd *cobra.Command, _ []string) {
 
 	if observabilityServiceEndpoint != "" {
 		otel.SetLogger(logger)
-		tp, err := trace.CreateTraceProvider(observabilityServiceEndpoint, version)
+		tp, err := trace.CreateTraceProvider(observabilityServiceEndpoint, version, observabilityServiceSeed)
 		if err != nil {
 			logger.Error(err, "unable to create traceprovider")
 			os.Exit(1)
diff --git a/pkg/context/context.go b/pkg/context/context.go
index 3438844..062fb53 100644
--- a/pkg/context/context.go
+++ b/pkg/context/context.go
@@ -21,18 +21,18 @@ type options struct {
 	timeout time.Duration
 }
 
-type option func(options)
+type option func(*options)
 
 // WithParent returns an option to create a new context based on an existing parent context.
 func WithParent(parent gocontext.Context) option {
-	return func(opts options) {
+	return func(opts *options) {
 		opts.parent = parent
 	}
 }
 
 // WithTimeout return an option to create a new context that cancels itself automatically after the timeout.
 func WithTimeout(timeout time.Duration) option {
-	return func(opts options) {
+	return func(opts *options) {
 		opts.timeout = timeout
 	}
 }
@@ -41,15 +41,16 @@ func WithTimeout(timeout time.Duration) option {
 // If a parent context is provided, creates a copy of the parent with further options.
 // If a timeout option is provided, creates a context that cancels itself automatically after the timeout.
 func New(opts ...option) gocontext.Context {
-	o := options{}
+	o := &options{}
 	for _, opt := range opts {
 		opt(o)
 	}
 
-	ctx := gocontext.Background()
-
+	var ctx gocontext.Context
 	if o.parent != nil {
 		ctx = o.parent
+	} else {
+		ctx = gocontext.Background()
 	}
 
 	if o.timeout > 0 {
diff --git a/pkg/service/auth.go b/pkg/service/auth.go
index 01b7f4e..229191a 100644
--- a/pkg/service/auth.go
+++ b/pkg/service/auth.go
@@ -11,11 +11,6 @@ import (
 	"strings"
 	"time"
 
-	"go.opentelemetry.io/otel"
-	otel_attr "go.opentelemetry.io/otel/attribute"
-	otel_codes "go.opentelemetry.io/otel/codes"
-	otel_propagation "go.opentelemetry.io/otel/propagation"
-	otel_trace "go.opentelemetry.io/otel/trace"
 	gocontext "golang.org/x/net/context"
 
 	"github.com/kuadrant/authorino/pkg/auth"
@@ -32,6 +27,11 @@ import (
 	"google.golang.org/protobuf/types/known/structpb"
 	v1 "k8s.io/api/admission/v1"
 	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
+
+	"go.opentelemetry.io/otel"
+	otel_attr "go.opentelemetry.io/otel/attribute"
+	otel_codes "go.opentelemetry.io/otel/codes"
+	otel_trace "go.opentelemetry.io/otel/trace"
 )
 
 const (
@@ -88,14 +88,10 @@ func NewAuthService(index index.Index, timeout time.Duration, maxHttpRequestBody
 // The body can be any JSON object; in case the input is a Kubernetes AdmissionReview resource,
 // the response is compatible with the Dynamic Admission API
 func (a *AuthService) ServeHTTP(resp http.ResponseWriter, req *http.Request) {
-	ctx := context.New(context.WithTimeout(a.Timeout))
-	ctx = otel.GetTextMapPropagator().Extract(ctx, otel_propagation.HeaderCarrier(req.Header))
 	requestId := fmt.Sprintf("%x", sha256.Sum256([]byte(fmt.Sprint(req))))
 
-	attr := otel_trace.WithAttributes(otel_attr.String("authorino.request_id", requestId))
-	ctx, span := otel.Tracer("AuthService").Start(ctx, "ServeHTTP", attr)
-
-	otel.GetTextMapPropagator().Inject(ctx, otel_propagation.HeaderCarrier(resp.Header()))
+	ctx := context.New(context.WithParent(req.Context()), context.WithTimeout(a.Timeout))
+	ctx, span := otel.Tracer("AuthService").Start(ctx, "ServeHTTP", otel_trace.WithAttributes(otel_attr.String("authorino.request_id", requestId)))
 	defer span.End()
 
 	logger := log.
@@ -241,18 +237,17 @@ func (a *AuthService) ServeHTTP(resp http.ResponseWriter, req *http.Request) {
 // Check performs authorization check based on the attributes associated with the incoming request,
 // and returns status `OK` or not `OK`.
 func (a *AuthService) Check(parentContext gocontext.Context, req *envoy_auth.CheckRequest) (*envoy_auth.CheckResponse, error) {
-	requestId := req.Attributes.Request.Http.GetId()
-	attr := otel_trace.WithAttributes(otel_attr.String("authorino.request_id", requestId))
-	_, span := otel.Tracer("AuthService").Start(parentContext, "Check", attr)
+	requestData := req.Attributes.Request.Http
+	requestId := requestData.GetId()
+
+	ctx, span := otel.Tracer("AuthService").Start(parentContext, "Check", otel_trace.WithAttributes(otel_attr.String("authorino.request_id", requestId)))
 	defer span.End()
 
 	requestLogger := log.WithName("service").WithName("auth").WithValues("request id", requestId)
-	ctx := log.IntoContext(context.New(context.WithParent(parentContext), context.WithTimeout(a.Timeout)), requestLogger)
+	ctx = log.IntoContext(context.New(context.WithParent(ctx), context.WithTimeout(a.Timeout)), requestLogger)
 
 	a.logAuthRequest(req, ctx)
 
-	requestData := req.Attributes.Request.Http
-
 	// service config
 	var host string
 	if h, overridden := req.Attributes.ContextExtensions[X_LOOKUP_KEY_NAME]; overridden {
diff --git a/pkg/trace/exporter.go b/pkg/trace/exporter.go
index 3a41ac6..13a18e0 100644
--- a/pkg/trace/exporter.go
+++ b/pkg/trace/exporter.go
@@ -1,6 +1,7 @@
 package trace
 
 import (
+	"go.opentelemetry.io/otel/attribute"
 	"go.opentelemetry.io/otel/exporters/jaeger"
 	"go.opentelemetry.io/otel/sdk/resource"
 	"go.opentelemetry.io/otel/sdk/trace"
@@ -14,26 +15,27 @@ func newExporter(url string) (*jaeger.Exporter, error) {
 	return jaeger.New(collector)
 }
 
-func newResource(version string) *resource.Resource {
-	r, _ := resource.Merge(
-		resource.Default(),
-		resource.NewWithAttributes(
-			semconv.SchemaURL,
-			semconv.ServiceNameKey.String("authorino"),
-			semconv.ServiceVersionKey.String(version),
-		),
-	)
+func newResource(version, seed string) *resource.Resource {
+	attrs := []attribute.KeyValue{
+		semconv.ServiceNameKey.String("authorino"),
+		semconv.ServiceVersionKey.String(version),
+	}
+	if seed != "" {
+		attrs = append(attrs, attribute.String("seed", seed))
+	}
+	res := resource.NewWithAttributes(semconv.SchemaURL, attrs...)
+	r, _ := resource.Merge(resource.Default(), res)
 	return r
 }
 
-func CreateTraceProvider(url string, version string) (*trace.TracerProvider, error) {
+func CreateTraceProvider(url, version, seed string) (*trace.TracerProvider, error) {
 	exp, err := newExporter(url)
 	if err != nil {
 		return nil, err
 	}
 	tp := trace.NewTracerProvider(
 		trace.WithBatcher(exp),
-		trace.WithResource(newResource(version)),
+		trace.WithResource(newResource(version, seed)),
 	)
 	return tp, nil
 }

The most important changes are:

  1. Fix the creation of the context with parent in pkg/context/context that was preventing proper nesting of the spans.
  2. No need to call Extract as it's handled automatically by the instrumentation libraries.
  3. Stop injecting the trace in the HTTP response.
  4. Implement the tracing seed option, a fixed attribute of the tracing resource – useful to differentiate between Authorino instances.

To test these changes

Build and start the server:

make cluster install build && bin/authorino server --observability-service-endpoint=http://localhost:14268/api/traces --observability-service-seed=$(date +%s) --log-level=debug

The value of the observability-service-seed option will show up in the Authorino logs. It can be used in the Jaeger UI to search for traces with the tag seed=<value>.

Send requests to the Raw HTTP Authorization interface:

curl -H "Traceparent: 00-$(openssl rand -hex 16)-7359b90f4355cfd9-01" http://localhost:5001/check -v
curl -H "Traceparent: 00-$(openssl rand -hex 16)-7359b90f4355cfd9-01" http://localhost:5001/check -X PUT -v
curl http://localhost:5001/check -v

Requests should show up in the Jaeger UI properly nested:

Screenshot 2023-02-27 at 20 58 34

@guicassolato
Copy link
Collaborator

In case one wants to try the changes by hitting the default GRPC interface instead of the Raw HTTP Authorization one:

  1. Download the libs and protobufs for the grpc client:
for repo in \
envoyproxy/data-plane-api \
googleapis/googleapis \
envoyproxy/protoc-gen-validate \
cncf/udpa
do
git clone git@github.com:$repo /tmp/authorino-grpc-client/$repo
done
  1. Install grpcurl (example for macos):
brew install grpcurl
  1. Send requests to Authorino (with and without traceparent):
grpcurl -import-path /tmp/authorino-grpc-client/envoyproxy/data-plane-api  \
        -import-path /tmp/authorino-grpc-client/googleapis/googleapis \
        -import-path /tmp/authorino-grpc-client/envoyproxy/protoc-gen-validate \
        -import-path /tmp/authorino-grpc-client/cncf/udpa \
        -proto /tmp/authorino-grpc-client/envoyproxy/data-plane-api/envoy/service/auth/v3/external_auth.proto \
        -plaintext -d "{\"attributes\":{\"request\":{\"http\":{\"host\":\"localhost\",\"headers\":{\"traceparent\":\"00-$(openssl rand -hex 16)-7359b90f4355cfd9-01\"}}}}}" \
        localhost:50051 \
        envoy.service.auth.v3.Authorization.Check

grpcurl -import-path /tmp/authorino-grpc-client/envoyproxy/data-plane-api  \
        -import-path /tmp/authorino-grpc-client/googleapis/googleapis \
        -import-path /tmp/authorino-grpc-client/envoyproxy/protoc-gen-validate \
        -import-path /tmp/authorino-grpc-client/cncf/udpa \
        -proto /tmp/authorino-grpc-client/envoyproxy/data-plane-api/envoy/service/auth/v3/external_auth.proto \
        -plaintext -d '{"attributes":{"request":{"http":{"host":"localhost"}}}}' \
        localhost:50051 \
        envoy.service.auth.v3.Authorization.Check

Authorino must be running locally and Jaeger in a docker as instructions shared before.

The traces in the Jaeger UI in this case should look like this:

Screenshot 2023-02-27 at 21 11 58

@Rohith-Raju
Copy link
Contributor Author

Thanks for the feedback @guicassolato, I'll make the changes and get back 😄

Signed-off-by: Rohith-Raju <rohithraju488@gmail.com>
main.go Show resolved Hide resolved
Signed-off-by: Rohith-Raju <rohithraju488@gmail.com>
Signed-off-by: Rohith-Raju <rohithraju488@gmail.com>
@guicassolato guicassolato requested a review from a team February 28, 2023 10:19
Copy link
Collaborator

@guicassolato guicassolato left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some automated unit (and perhaps also integration) tests needed, plus docs, but I thing those can come in a subsequent iteration, given the amount of manual testing already performed here.

I've left a few comments for other future improvements as well.

semconv.ServiceVersionKey.String(version),
}
if seed != "" {
attrs = append(attrs, attribute.String("seed", seed))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thinking again about this. Perhaps it could expect a comma-separated key=value list instead. As long as it's in the same release, it can be done in a separate iteration.

os.Exit(1)
}
otel.SetTracerProvider(tp)
otel.SetTextMapPropagator(otel_propagation.NewCompositeTextMapPropagator(otel_propagation.TraceContext{}, otel_propagation.Baggage{}))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Worth enabling propagation even when observability reporting is off? Right now, it's not injecting trace IDs or baggage into outgoing requests (e.g. HTTP requests to external sources of metadata), but it'll be useful once added.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was going to ask you the same, thanks for covering this 👍

Comment on lines +137 to +138
cmdServer.PersistentFlags().StringVar(&observabilityServiceEndpoint, "observability-service-endpoint", "", "Endpoint URL of the OpenTelemetry collector service")
cmdServer.PersistentFlags().StringVar(&observabilityServiceSeed, "observability-service-seed", "", "Seed attribute of the OpenTelemetry resource")
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Need support by the Authorino Operator.

Comment on lines +137 to +138
cmdServer.PersistentFlags().StringVar(&observabilityServiceEndpoint, "observability-service-endpoint", "", "Endpoint URL of the OpenTelemetry collector service")
cmdServer.PersistentFlags().StringVar(&observabilityServiceSeed, "observability-service-seed", "", "Seed attribute of the OpenTelemetry resource")
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

return r
}

func CreateTraceProvider(url, version, seed string) (*trace.TracerProvider, error) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing unit tests.

@guicassolato guicassolato requested a review from a team February 28, 2023 15:00
@eguzki
Copy link
Collaborator

eguzki commented Mar 1, 2023

Should tracing be opt-in/opt-out feature? Some "switch" to disable tracing might be interesting?

main.go Show resolved Hide resolved
@eguzki
Copy link
Collaborator

eguzki commented Mar 1, 2023

Question: this is only about tracing or it also exposes new metrics in the metric endpoint ?

@guicassolato
Copy link
Collaborator

Question: this is only about tracing or it also exposes new metrics in the metric endpoint ?

For now, it's only about tracing, @eguzki.

Authorino does expose endpoints for collectors to pull metrics (Prometheus format). In the future, we can extend the OpenTelemetry instrumentation added here to push the metrics to the observability service endpoint as well.

@guicassolato guicassolato merged commit 6070280 into Kuadrant:main Mar 1, 2023
@thomasmaas
Copy link
Collaborator

Amazing job @Rohith-Raju. Thank you!

@Rohith-Raju Rohith-Raju changed the title Wip Feat: Add tracing to authorino Add otel tracing to authorino Mar 4, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants