Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Broken context propagation in HTTP request that involves hibernate-reactive-panache, resteasy-reactive and qute #27229

Closed
mkouba opened this issue Aug 10, 2022 · 27 comments

Comments

@mkouba
Copy link
Contributor

mkouba commented Aug 10, 2022

Describe the bug

The CDI request context is not active when a template (that leverages a @RequestScoped bean) is rendered in a Uni chain:

    @GET
    public Uni<TemplateInstance> get(@QueryParam("id") Long id) {
        // Fruit extends io.quarkus.hibernate.reactive.panache.PanacheEntity
        return Fruit.findById(id).onItem().transform(f -> Templates.fruit((Fruit) f));
    }

How to Reproduce?

  1. Checkout this quarkus-quickstarts branch
  2. Run the modified hibernate-reactive-panache-quickstart in the dev mode (can be reproduced in the prod mode as well)
  3. HTTP GET http://localhost:8080/test/template?id=1

You should see a ContextNotActiveException in the console.

UPDATE: I've added two more simplified resource methods which produce a slightly different error:

javax.enterprise.inject.IllegalProductException: Normal scoped producer method may not return null: io.quarkus.vertx.http.runtime.CurrentVertxRequest.getCurrent()

It seems that the request context is active but the CurrentVertxRequest bean instance is not propagated.

How to get this error:

  1. HTTP GET http://localhost:8080/test/simple?id=1 - this one does not involve a qute template
  2. HTTP GET http://localhost:8080/test/no-hibernate?id=1 - this one does not involve hibernate-reactive-panache

It might be as well a PEBKAC on my side. I would be more than happy if someone explains to me what's going on here ;-)

@quarkus-bot
Copy link

quarkus-bot bot commented Aug 10, 2022

@DavideD
Copy link
Contributor

DavideD commented Aug 11, 2022

Isn't this the same issue we had with ActivateRequestContext?
The context gets created/closed at the wrong time because the method is reactive.

Having a quick look at InvocationInterceptor, I can see that it doesn't check if the method is reactive. Is this correct?

By the way, it seems that only the first request fails, after that I don't have any erros.

@DavideD
Copy link
Contributor

DavideD commented Aug 11, 2022

When calling http://localhost:8080/test/template?id=1 - on the first request - the RequestContext gets deactivated just before calling Templates.fruit((Fruit) f). Could this be the reason for the exception?

@mkouba
Copy link
Contributor Author

mkouba commented Aug 11, 2022

Isn't this the same #26551?
The context gets created/closed at the wrong time because the method is reactive.

I'm not sure but @ActivateRequestContext is not used here...

Having a quick look at InvocationInterceptor, I can see that it doesn't check if the method is reactive. Is this correct?

InvocationInterceptor (which is a dev mode feature) makes no difference - as I mentioned above it's reproducible in the prod mode as well.

By the way, it seems that only the first request fails, after that I don't have any erros.

You're right! And that's really an interesting clue...

@DavideD
Copy link
Contributor

DavideD commented Aug 11, 2022

I'm not sure but @ActivateRequestContext is not used here...

Similar because the interceptors involved here don't keep track of the fact that the method is reactive, so the context is destroyed/deactivated at the wrong time.

@mkouba
Copy link
Contributor Author

mkouba commented Aug 11, 2022

I'm not sure but @ActivateRequestContext is not used here...

Similar because the interceptors involved here don't keep track of the fact that the method is reactive, so the context is destroyed/deactivated at the wrong time.

Like I mentioned above the interceptors do not need to be involved at all...

@mkouba
Copy link
Contributor Author

mkouba commented Aug 11, 2022

By the way, it seems that only the first request fails, after that I don't have any erros.

You're right! And that's really an interesting clue...

FTR this does not apply to org.acme.hibernate.orm.panache.TestResource.noHibernate(Long) which fails all the time.

@mkouba
Copy link
Contributor Author

mkouba commented Aug 11, 2022

It seems that the failure for the /test/simple?id=1 and /test/no-hibernate?id=1 resource methods is caused by the fact that resteasy-reactive removes the current request on CDI request context deactivation which happens when a ResteasyReactiveRequestContext is suspended. TestResource.simple() only fails if the deactivation happens before the Uni is processed. It always fails for TestResource.noHibernate() where the transformation is delayed a little bit.

In other words, for these cases the request context is propagated correctly but the state of the CurrentVertxRequest bean is explicitly set to null.

I'm still not sure about the other issue with a template. We know that the UniResponseHandler first suspends the ResteasyReactiveRequestContext and the CDI request context is deactivated once the UniResponseHandler.handle() method returns. Now the failure does not occur if the cached value is used (the Fruit entity is annotated with @javax.persistence.Cacheable) most likely because in this case the Uni processing is finished before the ResteasyReactiveRequestContext manages to deactivate the CDI request context. But the ContextNotActiveException does not make sense it actually makes sense if we look at the current ArcContextProvider implementation where no context is activated if no context was active during capture.

UPDATE: Except it seems that a context is active during capture 🤷 I was actually right the NullContextSnapshot is used.

@geoand
Copy link
Contributor

geoand commented Aug 22, 2022

So what's the conclusion on this one?

@mkouba
Copy link
Contributor Author

mkouba commented Aug 22, 2022

So what's the conclusion on this one?

No conclusion so far. We know the cause but I have no idea how to fix it.

@geoand
Copy link
Contributor

geoand commented Aug 22, 2022

Is there something that shoyud be done in RR do you think?

@maxandersen
Copy link
Contributor

related to #27354 ?

@geoand
Copy link
Contributor

geoand commented Aug 23, 2022

It could be, but I don't really think so

@geoand
Copy link
Contributor

geoand commented Aug 25, 2022

Based on the discussion we had yesterday, I think that the changes that @mkouba and @cescoffier have in store will address this. Even if they don't, I am sure they will get us closer to fixing this.
Essentially this issue comes up because RR removes the request scope when encountering a Uni. Normally this is not a problem because of context propagation, but there might be something additional we need to do in Qute to get context propagation working.

@cescoffier
Copy link
Member

Qute use Completion Stage. Completion Stage creation needs to be decorated to be managed by context propagation.
This will create a dependency on CP.

Another solution is to switch to Uni.

@mkouba
Copy link
Contributor Author

mkouba commented Aug 25, 2022

Based on the discussion we had yesterday, I think that the changes that @mkouba and @cescoffier have in store will address this. Even if they don't, I am sure they will get us closer to fixing this.

Yes, #27443 if merged can solve part of the problem.

Essentially this issue comes up because RR removes the request scope when encountering a Uni. Normally this is not a problem because of context propagation, but there might be something additional we need to do in Qute to get context propagation working.

Note that only one of the broken use case involves qute. Two other broken cases fail for another reason. I will repeat myself: resteasy-reactive removes the current request on CDI request context deactivation which happens when a ResteasyReactiveRequestContext is suspended. Now if you attempt to access an injected RoutingContext or HttpServerRequest that was not initialized yet then you get the IllegalProductException because CurrentVertxRequest.current is null, i.e. the request context is propagated correctly but the state of the CurrentVertxRequest bean is explicitly set to null.

A workaround exists though - it should help if you initialize the injected HttpServerRequest in the resource method (HttpServerRequest is stored in the request context and the CurrentRequestProducer.getCurrentRequest() producer does not need to be called later).

@cescoffier
Copy link
Member

I'm trying to quickly see if switching to Uni fixes the issue.

@mkouba
Copy link
Contributor Author

mkouba commented Aug 25, 2022

I'm trying to quickly see if switching to Uni fixes the issue.

What exactly do you have in mean? Qute won't internally switch to Uni.

@cescoffier
Copy link
Member

That's what I'm trying to do. If you do not do that, you need to track all the instantiation of CompletionStage and decorate them.

@cescoffier
Copy link
Member

cescoffier commented Aug 25, 2022

@mkouba the other solution is to decorate like this every time you create a new CompletionStage:

 ThreadContext threadContext = SmallRyeThreadContext.getCurrentThreadContextOrDefaultContexts();
return threadContext.withContextCapture(t);

Ref: https://github.com/smallrye/smallrye-mutiny/blob/main/context-propagation/src/main/java/io/smallrye/mutiny/context/MutinyContextManagerExtension.java#L17-L23

(I let you decide which way you prefer because, switching to Uni will require a bit of work, but not missing any CompletionStage creation is also a source of bugs).

@cescoffier
Copy link
Member

Decoration of the completion stage will also require large modifications of the code base.

@cescoffier
Copy link
Member

Just to be clear, the issue here is not due to the removal of the request scope on suspension. Even with the PR changing, the request scope storage (which also avoids the removal) does not prevent this issue.

@geoand
Copy link
Contributor

geoand commented Aug 25, 2022

I need to check and see what happens if the current context is not set to null

@geoand
Copy link
Contributor

geoand commented Aug 25, 2022

If we move to the DuplicatedContext backed storage, then again there should be no reason to clear this (by calling CurrentRequestManager.set(null)).

@geoand
Copy link
Contributor

geoand commented Aug 25, 2022

And I also completely agree with @cescoffier that we need the Qute CS usages to be Context Propagation enabled

geoand added a commit to geoand/quarkus that referenced this issue Aug 25, 2022
This does not completely solve quarkusio#27229, but it is a step in the right
direction.
mkouba added a commit that referenced this issue Aug 25, 2022
Ensure that CDI request scope is active in TemplateResponseUniHandler
fercomunello pushed a commit to fercomunello/quarkus that referenced this issue Aug 31, 2022
This does not completely solve quarkusio#27229, but it is a step in the right
direction.
gsmet pushed a commit to gsmet/quarkus that referenced this issue Sep 5, 2022
This does not completely solve quarkusio#27229, but it is a step in the right
direction.

(cherry picked from commit edfd611)
@mkouba
Copy link
Contributor Author

mkouba commented Sep 8, 2022

I'm going to close this one because I was not able to reproduce any of the problems after #27497 and #27443 were merged!

@mkouba mkouba closed this as completed Sep 8, 2022
@Sanne
Copy link
Member

Sanne commented Sep 8, 2022

Awesome @mkouba !

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

6 participants