Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

OSSM-5444 Support Distributed Tracing 3 #1516

Merged
merged 3 commits into from
Jan 9, 2024

Conversation

yxun
Copy link
Member

@yxun yxun commented Dec 9, 2023

  1. This PR is adding SMCP v2.5 spec APIs for the following three MeshConfig.ExtensionProvider :
  • opentelemetry
  • envoyOtelAls

Reference: https://istio.io/latest/docs/reference/config/istio.mesh.v1alpha1/#MeshConfig-ExtensionProvider

This should unblock the testing of OSSM-5444. And connecting OSSM mesh with Distributed tracing 3 with opentelemetry provider.

The envoyOtelAls is not used from tracing side but it can be used in envoy log collection with opentelemetry.
Note: this PR evoyOtelAls is not including field logFormat. The field default will follow Envoy access logging formatting.
So using default is more consistent with Envoy side config.

  1. This PR replaces the strategy_v2.5.go counter if statements with reflection.
    This makes the code being able to dynamically check all fields in the ExtensionProviderConfig instead of hard coding each field names.

@yxun
Copy link
Member Author

yxun commented Dec 10, 2023

I run make gen but no change from pkg/apis/maistra/v2/zz_generated.deepcopy.go.
It looks like my local env does not generate that file. Does anyone know how to fix that ?

Comment on lines 214 to 220
if counter < 2 {
// ExtensionProvider has only Name field or all nil field
allErrors = append(allErrors, fmt.Errorf("extension provider '%s' does not define any provider - "+
"it must specify one of: prometheus, envoyExtAuthzHttp, or envoyExtAuthzGrpc", ext.Name))
} else if counter > 1 {
"it must specify one of: %s", ext.Name, fieldNames[1:]))
} else if counter > 2 {
// ExtensionProvider defines more than one provider + Name field
allErrors = append(allErrors, fmt.Errorf("extension provider '%s' must specify only one type of provider: "+
"prometheus, envoyExtAuthzHttp, or envoyExtAuthzGrpc", ext.Name))
"%s", ext.Name, fieldNames[1:]))
}
Copy link
Contributor

@luksa luksa Dec 12, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I'm reading this correctly, this code assumes that if there are two fields set, then one of them is name and the other is one of the providers. But this isn't always the case. What if I set the fields prometheus and envoyExtAuthzHTTP, without setting the name field.

Also, does the error message list name as one of the options (e.g. "it must specify one of: name, prometheus, envoyExtAuthzHTTP, envoyExtAuthzGRPC, ...")? I don't think it should.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

line 202: https://github.com/maistra/istio-operator/pull/1516/files#diff-e668dab162c3d0f9417174bd4eecacac65846a7279393b7fe61d318c52a7a08bR202
You can't have empty name field. The line 202 checks the follow first
allErrors = append(allErrors, fmt.Errorf("extension provider name cannot be empty"))

Should we change the existing name empty if check ?

Copy link
Member Author

@yxun yxun Dec 12, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, the error message does not list name as one option:
"it must specify one of: %s", ext.Name, fieldNames[1:]))
fieldNames[1:] I skipp the name field from the beginning of the slice

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why wouldn't we just use a map implementation, with casting? Wouldn't that be easier than using reflection, especially if we're just using simply types?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I missed the [1:]. Are we sure that name will always be at index 0?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why wouldn't we just use a map implementation, with casting? Wouldn't that be easier than using reflection, especially if we're just using simply types?

Hi Rob and Marko,
I am not clear about how a map works with SMCP spec.
I read the pkg/apis/maistra/conversion/extension_provider.go code. I am clear How the v2.ControlPlanceSpec struct code populates to helm chart values and also the helm chart values to this v2 spec struct.

However, it's not clear how the SMCP yaml spec initiates the v2.ControlPlanceSpec struct code.
If we have a SMCP CR yaml below, does that mean each of the field names in the yaml (e.g. meshconfig, extensionProviders, opentelemetry, service) need match a struct field in the v2 package .go code ?

spec:
  meshconfig:
    extensionProviders:
      opentelemetry:
        service:
  

Were you suggesting replacing the type ExtensionProviderConfig struct with a map named ExtensionProviderConfig ?

Copy link
Member Author

@yxun yxun Dec 12, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I missed the [1:]. Are we sure that name will always be at index 0?

Yes, that name always be at index 0.

@rcernich
Copy link
Contributor

Is this a better approach than creating a new tracing provider type? I'd also be concerned how this works with the existing Tracing fields, since those manipulate the zipkin field when Jaeger is specified (not sure what they do when None is specified.

@yxun
Copy link
Member Author

yxun commented Dec 12, 2023

Is this a better approach than creating a new tracing provider type? I'd also be concerned how this works with the existing Tracing fields, since those manipulate the zipkin field when Jaeger is specified (not sure what they do when None is specified.

If I read correctly, we can only have one name with one extensionProvider type.
When jaeger is specified, the code only allows existing jaeger spec field and those new fields are not allowed under that name. However, we may have two providers:

  • name: jaeger
    jaeger:
    ...
  • name: zipkin
    zipkin:
    ...

I can test more if there are any conflict with multiple providers.

(not sure what they do when None is specified. The code also does not allow only name field.
counter < 2 means all fields nil or only name field specified and then error.

@yxun yxun changed the title OSSM-5444 Support Distributed Tracing 3 WIP: OSSM-5444 Support Distributed Tracing 3 Dec 12, 2023
@yxun yxun changed the title WIP: OSSM-5444 Support Distributed Tracing 3 OSSM-5444 Support Distributed Tracing 3 Dec 12, 2023
@yxun yxun changed the title OSSM-5444 Support Distributed Tracing 3 WIP: OSSM-5444 Support Distributed Tracing 3 Dec 13, 2023
@jewertow
Copy link
Member

jewertow commented Dec 13, 2023

However, we may have two providers:

name: jaeger
jaeger:
...
name: zipkin
zipkin:
...

I don't like this idea, because it's not the upstream API.

Is this a better approach than creating a new tracing provider type?

@rcernich I don't think so, because the existing Jaeger/Stackdriver settings are mapped to the old telemetry v2 settings, which are no longer developed. And what's more, the old API can't be used with the new Telemetry API. So I think it's better to extend extension providers and document that the new API should be preferred. In OSSM 3, we will have only extension providers, so it's also better from the perspective of migration.

I'd also be concerned how this works with the existing Tracing fields

Very good point. I think we should implement a validation, which don't allow to set both spec.tracing.type: Jaeger and spec.meshConfig.extensionProviders[].zipkin at the same time, and additionally we should make sure that telemetry.v2 settings are disabled when Zipkin extension provider is used.

Btw. it would be easier to review this change if Zipkin was added in another PR.

@yxun
Copy link
Member Author

yxun commented Dec 14, 2023

However, we may have two providers:
name: jaeger
jaeger:
...
name: zipkin
zipkin:
...

I don't like this idea, because it's not the upstream API.

Is this a better approach than creating a new tracing provider type?

@rcernich I don't think so, because the existing Jaeger/Stackdriver settings are mapped to the old telemetry v2 settings, which are no longer developed. And what's more, the old API can't be used with the new Telemetry API. So I think it's better to extend extension providers and document that the new API should be preferred. In OSSM 3, we will have only extension providers, so it's also better from the perspective of migration.

I'd also be concerned how this works with the existing Tracing fields

Very good point. I think we should implement a validation, which don't allow to set both spec.tracing.type: Jaeger and spec.meshConfig.extensionProviders[].zipkin at the same time, and additionally we should make sure that telemetry.v2 settings are disabled when Zipkin extension provider is used.

Btw. it would be easier to review this change if Zipkin was added in another PR.

Hi, I have created another PR only for Zipkin:
#1531

Signed-off-by: yxun <yuanlin.yxu@gmail.com>
@yxun yxun changed the title WIP: OSSM-5444 Support Distributed Tracing 3 OSSM-5444 Support Distributed Tracing 3 Jan 2, 2024
@yxun yxun changed the title OSSM-5444 Support Distributed Tracing 3 WIP: OSSM-5444 Support Distributed Tracing 3 Jan 2, 2024
@yxun yxun changed the title WIP: OSSM-5444 Support Distributed Tracing 3 OSSM-5444 Support Distributed Tracing 3 Jan 5, 2024
@yxun
Copy link
Member Author

yxun commented Jan 5, 2024

I tested this PR change works with "Red Hat build of OpenTelemetry operator" in the following task:
https://istio.io/v1.16/docs/tasks/observability/logs/otel-provider/
https://istio.io/v1.16/docs/tasks/observability/distributed-tracing/zipkin/

This PR is ready for review. instruction doc is in the following JIRA comment.
https://issues.redhat.com/browse/OSSM-5609

Copy link
Member

@jewertow jewertow left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Istio Operator should reject SMCP v2.4 with opentelemetry and envoyOtelAls providers, like here https://github.com/maistra/istio-operator/pull/1531/files#diff-b86b0b1932da7be356a0558098c3fe6c57d214953652145e2420b357b315ec01.

@openshift-merge-bot openshift-merge-bot bot merged commit d02719c into maistra:maistra-2.5 Jan 9, 2024
4 checks passed
@yxun yxun deleted the OSSM-5444 branch May 6, 2024 18:24
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants