[BUG] EventhubReader stops working #39544

Open
the-mod opened this issue Apr 4, 2024 · 6 comments
Labels
Client: This issue points to a problem in the data-plane of the library.
customer-reported: Issues that are reported by GitHub users external to the Azure organization.
Event Hubs
needs-team-attention: This issue needs attention from Azure service team or SDK team.
question: The issue doesn't require a change to the product in order to be resolved. Most issues start as that.

Comments

the-mod commented Apr 4, 2024

Describe the bug
In our scenario, two applications read all messages from an Eventhub, each on its own consumer group.
One application (always the same one) stops reading from this Eventhub at irregular intervals.
To me it looks like the PartitionPumps are dying one after the other, because the Outgoing Messages metric (which should be 2x Incoming Messages) steadily drops toward the level of Incoming Messages. See the chart.
[Chart: hsi-downtime-2-1]
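
The 2x expectation above can be turned into a rough estimate of how many partitions have stopped being read. The following is a minimal sketch, not part of the reporter's code: it assumes messages are spread evenly across partitions, and the class name, method name, and the numbers in `main` (1000 msg/s, 4 partitions) are hypothetical values chosen for illustration.

```java
public final class OutgoingRatio {
    private OutgoingRatio() {}

    /**
     * Estimates how many partition reads are missing, given the Incoming and
     * Outgoing Messages metrics. Assumption: every consumer group reads every
     * message, and traffic is spread evenly across partitions.
     */
    public static long estimateStalledPartitions(double incoming, double outgoing,
                                                 int consumerGroups, int partitions) {
        if (incoming <= 0 || consumerGroups <= 0 || partitions <= 0) {
            throw new IllegalArgumentException("incoming, consumerGroups and partitions must be positive");
        }
        // With N consumer groups reading everything, outgoing should be N x incoming.
        double expectedOutgoing = incoming * consumerGroups;
        double missing = Math.max(0.0, expectedOutgoing - outgoing);
        // Under the even-distribution assumption, each partition carries this share.
        double perPartition = incoming / partitions;
        long stalled = Math.round(missing / perPartition);
        return Math.min(stalled, (long) partitions * consumerGroups);
    }

    public static void main(String[] args) {
        System.out.println(estimateStalledPartitions(1000, 2000, 2, 4)); // 0: both readers healthy
        System.out.println(estimateStalledPartitions(1000, 1500, 2, 4)); // 2: two partitions unread
        System.out.println(estimateStalledPartitions(1000, 1000, 2, 4)); // 4: one reader fully stopped
    }
}
```

If the estimate climbs step by step over time rather than jumping straight to the partition count, that would be consistent with partitions dropping out one after the other, although the metric alone cannot show which partitions are affected.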

Strangely, both applications share the same Eventhub reader implementation, which is built on the Event Processor Host.
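
One way to confirm which partitions stop being read in the affected application is to track the time of the last event seen per partition. The sketch below is a hypothetical diagnostic helper using only the Java standard library; it is not part of the SDK or of the shared reader implementation, and the class name, method names, and threshold are assumptions. `recordEvent` would be called from whatever per-event callback the reader uses.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.stream.Collectors;

public final class PartitionWatchdog {
    // Last time an event was observed, keyed by partition id.
    private final Map<String, Instant> lastSeen = new ConcurrentHashMap<>();
    private final Duration threshold;

    public PartitionWatchdog(Duration threshold) {
        this.threshold = threshold;
    }

    /** Call once per received event with the partition it came from. */
    public void recordEvent(String partitionId, Instant when) {
        lastSeen.put(partitionId, when);
    }

    /** Returns the ids of partitions with no event for longer than the threshold, sorted. */
    public List<String> stalledPartitions(Instant now) {
        return lastSeen.entrySet().stream()
                .filter(e -> Duration.between(e.getValue(), now).compareTo(threshold) > 0)
                .map(Map.Entry::getKey)
                .sorted()
                .collect(Collectors.toList());
    }
}
```

A partition that is simply idle (no incoming traffic) would also be flagged, so the output only narrows things down when the Incoming Messages metric shows steady traffic. Partitions that never delivered an event are not tracked at all.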

In the Logs I was able to catch some Traces:

Did not observe any item or terminal signal within 60000ms in 'filter' (and no fallback has been configured)

Stacktrace:

java.util.concurrent.TimeoutException: Did not observe any item or terminal signal within 60000ms in 'filter' (and no fallback has been configured)
    at reactor.core.publisher.FluxTimeout$TimeoutMainSubscriber.handleTimeout(FluxTimeout.java:296)
    at reactor.core.publisher.FluxTimeout$TimeoutMainSubscriber.doTimeout(FluxTimeout.java:281)
    at reactor.core.publisher.FluxTimeout$TimeoutTimeoutSubscriber.onNext(FluxTimeout.java:420)
    at io.opentelemetry.javaagent.shaded.instrumentation.reactor.v3_1.TracingSubscriber.lambda$onNext$1(TracingSubscriber.java:64)
    at io.opentelemetry.javaagent.shaded.instrumentation.reactor.v3_1.TracingSubscriber.withActiveSpan(TracingSubscriber.java:100)
    at io.opentelemetry.javaagent.shaded.instrumentation.reactor.v3_1.TracingSubscriber.withActiveSpan(TracingSubscriber.java:91)
    at io.opentelemetry.javaagent.shaded.instrumentation.reactor.v3_1.TracingSubscriber.onNext(TracingSubscriber.java:64)
    at reactor.core.publisher.FluxOnErrorReturn$ReturnSubscriber.onNext(FluxOnErrorReturn.java:162)
    at io.opentelemetry.javaagent.shaded.instrumentation.reactor.v3_1.TracingSubscriber.lambda$onNext$1(TracingSubscriber.java:64)
    at io.opentelemetry.javaagent.shaded.instrumentation.reactor.v3_1.TracingSubscriber.withActiveSpan(TracingSubscriber.java:100)
    at io.opentelemetry.javaagent.shaded.instrumentation.reactor.v3_1.TracingSubscriber.withActiveSpan(TracingSubscriber.java:91)
    at io.opentelemetry.javaagent.shaded.instrumentation.reactor.v3_1.TracingSubscriber.onNext(TracingSubscriber.java:64)
    at reactor.core.publisher.MonoDelay$MonoDelayRunnable.propagateDelay(MonoDelay.java:270)
    at reactor.core.publisher.MonoDelay$MonoDelayRunnable.run(MonoDelay.java:285)
    at reactor.core.scheduler.SchedulerTask.call(SchedulerTask.java:68)
    at reactor.core.scheduler.SchedulerTask.call(SchedulerTask.java:28)
    at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:317)
    at java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:304)
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144)
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642)
    at java.base/java.lang.Thread.run(Thread.java:1583)

After turning on debug logs for com.azure.messaging and com.azure.core.amqp, I found the following, but I am not sure whether it is related to the issue:

[DEBUG] 2024-04-04T14:13:01,679 - reactor-executor-3 - com.azure.core.amqp.implementation.ReactorReceiver - {"az.sdk.message":"There are no credits to add.","connectionId":"MF_d66b18_1712238562943","entityPath":"eventhub-01/ConsumerGroups/cg/Partitions/3","linkName":"3_d8df1f_1712238562943","credits":"0"} 
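
To see whether these "no credits" messages cluster on particular partitions, the debug log can be scanned and counted per `entityPath`. This is a minimal sketch that assumes every relevant line has the same shape as the single line quoted above; the class and method names are hypothetical, and whether a zero-credit link is actually related to the stall is not established here.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public final class NoCreditLogScan {
    // Field layout taken from the one sample log line; other lines may differ.
    private static final Pattern ENTITY_PATH = Pattern.compile("\"entityPath\":\"([^\"]+)\"");
    private static final String NO_CREDITS =
            "\"az.sdk.message\":\"There are no credits to add.\"";

    private NoCreditLogScan() {}

    /** Counts "There are no credits to add." messages per entityPath, sorted by path. */
    public static Map<String, Integer> countByEntityPath(List<String> logLines) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String line : logLines) {
            if (!line.contains(NO_CREDITS)) {
                continue; // not a zero-credit message
            }
            Matcher m = ENTITY_PATH.matcher(line);
            if (m.find()) {
                counts.merge(m.group(1), 1, Integer::sum);
            }
        }
        return counts;
    }
}
```

Feeding it the lines of a log file (for example via `java.nio.file.Files.readAllLines`) gives one count per partition path, which can then be compared with the partitions that stopped delivering events.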

I tested it with
com.azure:azure-messaging-eventhubs:5.18.1 and com.azure:azure-messaging-eventhubs-checkpointstore-blob:1.19.1
as well as
com.azure:azure-messaging-eventhubs:5.17.1 and com.azure:azure-messaging-eventhubs-checkpointstore-blob:1.18.1

I can provide more logs and source code if needed.

github-actions bot added the Client, customer-reported, Event Hubs, needs-team-attention, and question labels Apr 4, 2024

github-actions bot commented Apr 4, 2024

@anuchandy @conniey @lmolkova

github-actions bot commented Apr 4, 2024

Thank you for your feedback. Tagging and routing to the team member best able to assist.

conniey (Member) commented Apr 17, 2024

Hey @the-mod, thanks for reporting this. I'm not sure where the stack trace originates, because none of our code appears in it. Can you provide some more logs from around the time of this error, in addition to the ReactorReceiver logs?

Cheers,
Connie

conniey added the needs-author-feedback label Apr 18, 2024
github-actions bot removed the needs-team-attention label Apr 18, 2024

Hi @the-mod. Thank you for opening this issue and giving us the opportunity to assist. To help our team better understand your issue and the details of your scenario please provide a response to the question asked above or the information requested above. This will help us more accurately address your issue.

Hi @the-mod, we're sending this friendly reminder because we haven't heard back from you in 7 days. We need more information about this issue to help address it. Please be sure to give us your input. If we don't hear back from you within 14 days of this comment the issue will be automatically closed. Thank you!

github-actions bot added the no-recent-activity label Apr 25, 2024

the-mod (Author) commented Apr 26, 2024

@conniey sorry for the late reply. I will provide log traces via email. Thanks in advance.

github-actions bot added the needs-team-attention label and removed the needs-author-feedback and no-recent-activity labels Apr 26, 2024

2 participants