Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix duplicate traceId when use forward scheme in spring.cloud.gateway.routes.uri properties #672

Merged
merged 2 commits into from
Mar 7, 2024

Conversation

cylx3126
Copy link
Contributor

@cylx3126 cylx3126 commented Mar 6, 2024

Fix duplicate traceId when use forward scheme in spring.cloud.gateway.routes.uri properties

I provide a Minimal, Reproducible Example to explain when and why this bug happens.

  • Add a unit test to verify that the fix works.

  • Explain briefly why the bug exists and how to fix it.

  • Update the CHANGES log.

Comment on lines 48 to 56
if (isAlreadyRouted(exchange) || !isHttpScheme(exchange)) {
// In these cases, NettyRoutingFilter will not send request and HttpClientFinalizer#send will not be
// invoked.
// If HttpClientFinalizer#send don't invoke, "SpringCloudGateway/RoutingFilter" span created below will
// not close and produce a duplicate traceId bug.
// Return before create span to avoid this bug.
return;
}

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If there is no span created, will the trace be broken? In the forward case, how the tracing context is going to propagate? I don't see the new workflow in your provided MVP. The bug description is clear about why tracing context was leaked.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These conditions are as same as NettyRoutingFilter.

If the request is a http/https request, tracing context is going to propagate like NettyRoutingFilter#filter -> HttpClientFinalizer#send -> HttpClientFinalizer#responseConnection .

If the request is not a http/https request, tracing context is going to propagate like NettyRoutingFilter#filter -> ForwardRoutingFilter#filter -> DispatcherHandler#handle, in this case, HttpClientFinalizer will not participate in the propagation.

By the way, in the Spring Cloud Gateway releases before 2021/05/24, conditions there are use equals instead of equalsIgnoreCase. Tracing context will still leaked when use a older version Spring Cloud Gateway and the scheme of request is HTTP/HTTPS ,

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

By the way, in the Spring Cloud Gateway releases before 2021/05/24, conditions there are use equals instead of equalsIgnoreCase.

2021 seems not long ago, could you try to fix that case too?
Could we use witness class to flag those versions and use a different interceptor, in order to match the target codes?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No witness class around that Spring Cloud Gateway release,

How about we add some code in NettyRoutingFilterInterceptor#afterMethod instead of the check at NettyRoutingFilterInterceptor#beforeMethod, the code in NettyRoutingFilterInterceptor#afterMethod may like this:

    @Override
    public Object afterMethod(EnhancedInstance objInst, Method method, Object[] allArguments, Class<?>[] argumentsTypes,
                              Object ret) throws Throwable {
        if (ContextManager.isActive()) {
            // if HttpClientFinalizerSendInterceptor does not invoke, we will stop the span there to avoid context leak.
            ContextManager.stopSpan();
        }
        return ret;
    }

That is the way apm-spring-cloud-gateway-2.0.x-plugin to stop the span.

Consider two propagates above, have a check and stop the span in NettyRoutingFilterInterceptor#afterMethod may better than have a check at NettyRoutingFilterInterceptor#beforeMethod.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we can't find a way(witness class or method) to separate the two cases, it is better we could close the span in the after to avoid context leak.

@wu-sheng wu-sheng added bug Something isn't working plugin labels Mar 6, 2024
@cylx3126
Copy link
Contributor Author

cylx3126 commented Mar 7, 2024

Ok, I will modify and test the code that close the span in afterMethod hook.

…t invoke, the span created at NettyRoutingFilterInterceptor can not stop.
@cylx3126
Copy link
Contributor Author

cylx3126 commented Mar 7, 2024

Could you review this PR? I have done a refactor on my code.

@wu-sheng
Copy link
Member

wu-sheng commented Mar 7, 2024

impala case seems to have some issues, I am not sure the reason.

@cylx3126
Copy link
Contributor Author

cylx3126 commented Mar 7, 2024

Pulling postgres (parrotstream/postgres:10.5)...
The image for the service you're trying to recreate has been removed. If you continue, volume data could be lost. Consider backing up your data before continuing.

pull access denied for parrotstream/postgres, repository does not exist or may require 'docker login': denied: requested access to the resource is denied
docker startup failure!
Error: Process completed with exit code 1.

It seems like docker pull failed in impala test case.

@wu-sheng
Copy link
Member

wu-sheng commented Mar 7, 2024

I am trying to remove that case. This PR will be merged after I fix this.

@wu-sheng wu-sheng added this to the 9.2.0 milestone Mar 7, 2024
@wu-sheng wu-sheng merged commit f227543 into apache:main Mar 7, 2024
188 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working plugin
Projects
None yet
2 participants