Retry initiation message sends upon "VERY BAD!" scenario. #36

Open
stolsvik opened this issue Feb 13, 2021 · 0 comments
As mentioned in (now closed) centiservice/mats#27, "Initiations - VERY BAD! Reconsider transaction demarcation in initiations", with an initiation, there is really no JMS transactional demarcation going on: You have not consumed a message - the only point is to produce one or several messages. Thus, if the actual sending of the message fails, you could just try to send it again. (Read centiservice/mats#27 now!)

There is a slight issue with possible double-sending: If the initial attempt actually went through, but you didn't get the "TCP packets" from the MQ informing you about this, you could now possibly send multiple identical messages. This must be handled on the receiving side. This might imply that such retrying logic would have to be opt-in.
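Handling such duplicates on the receiving side could look roughly like the sketch below. It assumes every message carries an identifier that stays the same when the initiation is resent; the `SeenMessageIds` class and the `messageId` parameter are hypothetical names for illustration and are not part of Mats. The set is bounded and kept only in memory, so it catches a resend arriving shortly after the original, not one arriving after a receiver restart.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper: remembers recently seen message ids so a resent message can be skipped. */
final class SeenMessageIds {
    private final Map<String, Boolean> seen;

    SeenMessageIds(final int capacity) {
        // Insertion-ordered map that evicts the oldest id once capacity is exceeded.
        this.seen = new LinkedHashMap<String, Boolean>(16, 0.75f, false) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, Boolean> eldest) {
                return size() > capacity;
            }
        };
    }

    /** Returns true the first time an id is seen, false if it is a (recent) duplicate. */
    synchronized boolean firstTime(String messageId) {
        return seen.put(messageId, Boolean.TRUE) == null;
    }
}
```

A receiver would call `firstTime(messageId)` before processing and drop the message when it returns `false`. Where duplicates must be caught across restarts, the ids would instead have to be stored in the database in the same transaction as the processing.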

Also, there is an issue with the DB commit: The very point about this situation is that the DB commit has gone through, but you have not gotten the messages on their way. The messages now reside only in memory. You might lose power immediately afterwards. Thus, you'd really want to notify the invoker immediately, so that they could possibly issue compensating transactions to get back to a correct state (i.e. no jobs were allocated after all). However, if you are now going into a retry-cycle, you do not want to exit out just yet.

Also, there is an issue with logging: To be prudent about the situation occurring, you'd like to output a log-line that documents the problem immediately. This is because "all bets are off" at this point: We are only holding the message in memory. You might lose power right after, and then those messages really are gone forever. You might then want to output a "Possible VERY BAD!" first, and after retrying for e.g. 7 seconds, either output a "Cancelled VERY BAD!" if you got the message through, or an "Actual VERY BAD!" if you gave up.
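The three-stage logging around a bounded retry could be sketched as follows. This is only an illustration under assumptions: `InitiationSendRetrier`, `Send`, `Outcome` and the `log` callback are hypothetical names, not Mats API, and `send` stands in for whatever actually puts the already-built messages on the MQ. The retry budget and the pause between attempts are parameters; the 7 seconds mentioned above would be one possible budget.

```java
import java.time.Duration;
import java.util.function.Consumer;

/** Hypothetical sketch: retry a failed initiation send within a time budget, logging each stage. */
final class InitiationSendRetrier {
    enum Outcome { SENT, CANCELLED_VERY_BAD, ACTUAL_VERY_BAD }

    /** Stand-in for the actual MQ send of the messages held in memory. */
    interface Send { void run() throws Exception; }

    static Outcome sendWithRetry(Send send, Duration budget, Duration pause, Consumer<String> log) {
        try {
            send.run();
            return Outcome.SENT; // Normal case: nothing to report.
        } catch (Exception first) {
            // Log at once: from here on the messages exist only in memory.
            log.accept("Possible VERY BAD! Send failed, retrying for " + budget + ": " + first);
        }
        long deadline = System.nanoTime() + budget.toNanos();
        while (System.nanoTime() - deadline < 0) {
            try {
                Thread.sleep(pause.toMillis());
                send.run();
                log.accept("Cancelled VERY BAD! Send succeeded on retry.");
                return Outcome.CANCELLED_VERY_BAD;
            } catch (InterruptedException ie) {
                Thread.currentThread().interrupt(); // Preserve the interrupt and give up.
                break;
            } catch (Exception retryFailure) {
                // Still failing; keep trying until the budget is spent.
            }
        }
        log.accept("Actual VERY BAD! Gave up sending after " + budget + ".");
        return Outcome.ACTUAL_VERY_BAD;
    }
}
```

Returning the outcome lets the caller decide what to tell the invoker: on `ACTUAL_VERY_BAD` the invoker still needs to be told that the DB commit went through without the messages being sent. Because a retried send may duplicate a message that actually got through, such a helper would fit the opt-in approach suggested earlier.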

These problems very much point to the outbox pattern (#77) really being the way to go here, as that offloads the message storage to the database on the same commit as the database changes happened.

However, a rather simple retrying logic within mats itself would realistically alleviate most of the actual problems occurring with initiations, since dropped MQ connections are probably more often due to a rebooted MQ broker than to disastrous data center crashes.

@stolsvik stolsvik transferred this issue from another repository Sep 27, 2021