This repository has been archived by the owner on Apr 17, 2023. It is now read-only.

Aerogear 9672 - Create a retry/queuing mechanism for JMS messages #1028

Merged
merged 3 commits into from Aug 5, 2019

secondsun
Contributor

Motivation

We're currently using enmasse as one of our supported messaging systems. However, enmasse does not currently support DLQ/retry configuration. This PR lets UPS control requeuing those JMS messages where we would previously have relied on the message broker.

What

This PR wraps the notification dispatcher in a try/catch block that handles runtime exceptions by looking at new metadata in the JMS message and optionally requeuing it. This behavior is configurable through environment variables.

This PR adds checks for two environment variables: AMQ_MAX_RETRIES and AMQ_BACKOFF_SECONDS. AMQ_MAX_RETRIES is the number of retries that will be sent.* AMQ_BACKOFF_SECONDS is the number of seconds by which to increase the delay timeout.

  * Behavior note: if a message fails and all three of its retries also fail, you will see four error messages, one for the original attempt and three for the retries.
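
For readers who want a concrete picture of the retry accounting described above, here is a minimal sketch in plain Java. It is not the code from this PR: the class name `RetryPolicySketch`, the default values (3 retries, 10 seconds), and the assumption that the delay grows linearly with each retry are all illustrative. Only the two environment variable names come from the description.

```java
import java.util.Map;

/** Illustrative retry policy; names and defaults are assumptions, not the UPS implementation. */
public final class RetryPolicySketch {
    private final int maxRetries;
    private final int backoffSeconds;

    public RetryPolicySketch(int maxRetries, int backoffSeconds) {
        this.maxRetries = maxRetries;
        this.backoffSeconds = backoffSeconds;
    }

    /** Builds a policy from environment-style settings, falling back to assumed defaults. */
    public static RetryPolicySketch fromEnv(Map<String, String> env) {
        return new RetryPolicySketch(
                parseOrDefault(env.get("AMQ_MAX_RETRIES"), 3),
                parseOrDefault(env.get("AMQ_BACKOFF_SECONDS"), 10));
    }

    private static int parseOrDefault(String raw, int fallback) {
        if (raw == null) {
            return fallback;
        }
        try {
            return Integer.parseInt(raw.trim());
        } catch (NumberFormatException e) {
            return fallback;
        }
    }

    /** retriesSoFar is the retry count carried in the message metadata (0 for the original). */
    public boolean shouldRetry(int retriesSoFar) {
        return retriesSoFar < maxRetries;
    }

    /** Delay before the next attempt, assuming it grows linearly with each retry. */
    public long nextDelayMillis(int retriesSoFar) {
        return 1000L * backoffSeconds * (retriesSoFar + 1);
    }

    /** Simulates a message that always fails and returns how many failures are observed. */
    public int countFailures() {
        int failures = 0;
        int retriesSoFar = 0;
        while (true) {
            failures++; // the current attempt failed
            if (!shouldRetry(retriesSoFar)) {
                return failures; // retries exhausted, the message is dropped
            }
            retriesSoFar++; // requeue with incremented metadata
        }
    }

    public static void main(String[] args) {
        RetryPolicySketch policy = fromEnv(System.getenv());
        System.out.println("failures for an always-failing message: " + policy.countFailures());
    }
}
```

With three retries configured, `countFailures()` returns 4, which matches the behavior note: one failure for the original message plus one for each retry.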

Verification Steps

  1. Configure a push application normally
  2. Confirm it works
  3. Lobotomize the push server somehow (block the connection to the external push services, set up a bad iOS certificate, make a sender throw a runtime exception and rebuild/redeploy with that, etc.).
    The point is that the failure has to be really rude and really bad.
  4. Send a push notification and confirm that your messages are being requeued

@psturc

psturc commented Aug 1, 2019

👀

@psturc

psturc commented Aug 2, 2019

I configured UPS on an OpenShift cluster with the AMQ_MAX_RETRIES and AMQ_BACKOFF_SECONDS env vars, created an iOS variant with an invalid certificate, and verified that the retry mechanism works:

07:59:49,030 DEBUG [org.jboss.aerogear.unifiedpush.message.NotificationDispatcher] (Thread-5 (ActiveMQ-client-global-threads)) Sending retry message 9688a236-d882-4cee-af24-e1f28e8f891e-1-1

One question though. I set the AMQ_MAX_RETRIES env var to 3, and when I searched for the "Sending retry message" log, I found it ~20 times, all logged within 300 ms. Do you know why that is?

Another question. After sending 1 message to 1 registered (fake) device, UPS generated over 80k lines of logs with exceptions within a couple of seconds. Would it make sense to limit that somehow?

@secondsun
Contributor Author

@psturc So there are several places where things get logged, but I can def take a look at shortening that. Does the logging block the merging of this PR?

@psturc

psturc commented Aug 2, 2019

@secondsun I don't think so, it can definitely be done in a separate PR.

@secondsun secondsun merged commit c18920b into aerogear:master Aug 5, 2019