Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[BUG] Frequent cases with Starting lock renewal messages in log and messages are not delivered #39833

Closed
3 tasks done
sboland-bigw opened this issue Apr 22, 2024 · 3 comments
Closed
3 tasks done
Assignees
Labels
Client This issue points to a problem in the data-plane of the library. customer-reported Issues that are reported by GitHub users external to the Azure organization. needs-team-attention This issue needs attention from Azure service team or SDK team question The issue doesn't require a change to the product in order to be resolved. Most issues start as that Service Bus

Comments

@sboland-bigw
Copy link

sboland-bigw commented Apr 22, 2024

Describe the bug
We have a problem with a Kubernetes service when it consumes messages from a ServiceBus topic. In some instances when extra processing is performed, it will stop processing further messages and the log message shows repeated instances of lock renewals. We are using the latest stable version of Azure Service Bus client for Java - Version 7.15.2. https://learn.microsoft.com/en-us/java/api/overview/azure/messaging-servicebus-readme?view=azure-java-stable.

Example logs:

2024-04-22 08:56:14,796 INFO [parallel-1] LoggingEventBuilder: {"az.sdk.message":"Starting lock renewal.","isSession":false,"lockToken":"a12845bf-58e9-4650-9d3e-5dba5b234e4f"}
2024-04-22 08:56:14,804 INFO  [reactor-executor-1] LoggingEventBuilder: {"az.sdk.message":"Starting lock renewal.","isSession":false,"lockToken":"a12845bf-58e9-4650-9d3e-5dba5b234e4f","nextExpiration":"2024-04-22T08:56:44.788Z","next":"PT29.983430032S"}
2024-04-22 08:56:34,788 INFO [parallel-1] LoggingEventBuilder: {"az.sdk.message":"Starting lock renewal.","isSession":false,"lockToken":"a12845bf-58e9-4650-9d3e-5dba5b234e4f"}
2024-04-22 08:56:34,794 INFO [reactor-executor-1] LoggingEventBuilder: {"az.sdk.message":"Starting lock renewal.","isSession":false,"lockToken":"a12845bf-58e9-4650-9d3e-5dba5b234e4f","nextExpiration":"2024-04-22T08:57:04.788Z","next":"PT29.993437838S"}

This will often happen in less than 60 seconds from the message consumption even the message is received. Furthermore the message will not get cleared from the topic in the Azure ServiceBus console. It requires doing a restart of the Kubernetes service.

The Message Lock Duration is set to 30 seconds in the topic. If this is changed to 60 seconds then the logs above are spaced approximately 50 seconds apart.

Exception or Stack Trace
We don't observe any exceptions.

To Reproduce
Steps to reproduce the behavior:
I haven't created an exact cut-down / minimal example, however, the code used is trimmed below to describe what we're doing.

Code Snippet

The example code here is split into 4 files. The first 2 files exist in a common library while the second 2 files use this library in the Kubernetes service.

  1. ServiceBusConsumerHandler - We use ServiceBusProcessorClient and PEEK_LOCK receive mode.
@Builder
public class ServiceBusConsumerFactory<T> {
    private @NonNull String connectionString;
    private @NonNull String topicName;
    private @NonNull String subscriptionName;
    private @NonNull Long maxAutoLockRenewDurationMinutes;
    private @NonNull Class<T> cls;
    private @NonNull Function<Flux<T>, Mono<Void>> processorFunction;

    public ServiceBusConsumer create() {
        final var consumerHandler = ServiceBusConsumerHandler.<T>builder()
                .processor(processorFunction)
                .build();

        ServiceBusProcessorClient serviceBusProcessorClient = new ServiceBusClientBuilder()
                .connectionString(connectionString)
                .processor()
                .topicName(topicName)
                .subscriptionName(subscriptionName)
                .maxAutoLockRenewDuration(Duration.ofMinutes(maxAutoLockRenewDurationMinutes))
                .receiveMode(ServiceBusReceiveMode.PEEK_LOCK)
                .processMessage(consumerHandler::processMessage)
                .processError(consumerHandler::processError)
                .buildProcessorClient();

        return new ServiceBusConsumer(topicName, subscriptionName, serviceBusProcessorClient);
    }
}

  1. ServiceBusConsumerHander.java - I haven't included the deadLetter() method below. It simply looks for exceptions of instance RetryableMessageException. In this case it calls context.abandon() or else it calls context.deadLetter(option). There's no evidence of any exceptions in the logs.
@Builder
@Slf4j
public class ServiceBusConsumerHandler<T> {

    private @NonNull Function<Flux<T>, Mono<Void>> processor;

    public void processMessage(ServiceBusReceivedMessageContext context) {
        ServiceBusReceivedMessage message = context.getMessage();
        log.info(
                "Processing service bus message Id: {}, Sequence #: {}}, Contents: {}",
                message.getMessageId(),
                message.getSequenceNumber(),
                message.getBody());

        Flux.just(context)
                .transform(processor)
                .doOnComplete(context::complete)
                .doOnError(ex -> deadLetter(context, ex))
                .then()
                .onErrorComplete()
                .block();
    }

   public void processError(ServiceBusErrorContext context) {
        log.error(
                "Error when receiving message from namespace {} Entity {} error {}",
                context.getFullyQualifiedNamespace(),
                context.getEntityPath(),
                context.getException().getMessage());
    }
  1. ServiceBusConfig.
@Slf4j
@Configuration
public class ServiceBusConfig {

    @Bean
    public ServiceBusConsumer transactionServiceBusConsumer(
            @Autowired NotificationProcessor notificationProcessor,
            @Value("${servicebus.endpoint}") String endpoint,
            @Value("${servicebus.subscription-name}") String subscriptionName,
            @Value("${servicebus.topic-name}") String topicName,
            @Value("${servicebus.max-autolock-renew-duration-minutes}") String maxAutoLockRenewDurationMinutes) {
        final ServiceBusConsumerFactory<EventData> build =
                ServiceBusConsumerFactory.<EventData>builder()
                .connectionString(endpoint)
                .subscriptionName(subscriptionName)
                .topicName(topicName)
                .processorFunction(notificationProcessor)
                .maxAutoLockRenewDurationMinutes(Long.parseLong(maxAutoLockRenewDurationMinutes))
                .cls(EventData.class)
                .build();
        return build.create();
    }
}

maxAutoLockRenewDurationMinutes above is set to 5 minutes.

  1. NotificationProcessor.java
@Slf4j
@Service
public class NotificationProcessor implements Function<Flux<EventData>, Mono<Void>> {

    @Override
    public Mono<Void> apply(Flux<EventData> eventFlux) {
        return eventFlux
                .map(this::toContext) // simple transformation logic 
                .flatMap(this::processNotificationEvent)
                .then()
                .doOnError(err -> {
                    log.error("Processing event data failed. ", err);
                });
    }

The code inside processNotificationEvent makes a database query. In the cases where I can reproduce the problem it also performs a remote procedure call, however, this takes less than 2 seconds. In these instances, it will quite frequently result in producing the lock messages and the services stops processing the messages.

Expected behavior
I don't expect to see the lock messages occurring and the service should be able to make the remote procedure call.

Screenshots
If applicable, add screenshots to help explain your problem.

Setup (please complete the following information):

  • OS: Azure Kubernetes Service
  • IDE: N/A
  • Library/Libraries: com.azure:azure-messaging-servicebus:7.15.2, com.azure:azure-cosmos:4.57.0, azureSdkBom 1.2.22
  • Java version: 17
  • App Server/Environment: Reactor/Netty
  • Frameworks: Spring Boot 3.2

If you suspect a dependency version mismatch (e.g. you see NoClassDefFoundError, NoSuchMethodError or similar), please check out Troubleshoot dependency version conflict article first. If it doesn't provide solution for the problem, please provide:

  • verbose dependency tree (mvn dependency:tree -Dverbose)
  • exception message, full stack trace, and any available logs

Additional context
Add any other context about the problem here.

Information Checklist
Kindly make sure that you have added all the following information above and checkoff the required fields otherwise we will treat the issuer as an incomplete report

  • Bug Description Added
  • Repro Steps Added
  • Setup information Added
@github-actions github-actions bot added Client This issue points to a problem in the data-plane of the library. customer-reported Issues that are reported by GitHub users external to the Azure organization. needs-team-attention This issue needs attention from Azure service team or SDK team question The issue doesn't require a change to the product in order to be resolved. Most issues start as that Service Bus labels Apr 22, 2024
Copy link

@anuchandy @conniey @lmolkova

Copy link

Thank you for your feedback. Tagging and routing to the team member best able to assist.

@sboland-bigw
Copy link
Author

We found the issue. It was due to our client code not returning and this caused the lock messages. So I'm closing the issue.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Client This issue points to a problem in the data-plane of the library. customer-reported Issues that are reported by GitHub users external to the Azure organization. needs-team-attention This issue needs attention from Azure service team or SDK team question The issue doesn't require a change to the product in order to be resolved. Most issues start as that Service Bus
Projects
None yet
Development

No branches or pull requests

2 participants