The Greenwave message consumer will cause Nackloops #3302

Closed
bowlofeggs opened this issue Jun 7, 2019 · 6 comments
Labels: Crash (Issues related to an unhandled crash); Critical (We can't go on living in this squalor, drop everything and fix it!); reliability (Issues pertaining to Bodhi's reliability)

Comments

@bowlofeggs
Contributor

bowlofeggs commented Jun 7, 2019

I saw that #3200 got merged today, and while looking over the final patch I noticed that it raises BodhiExceptions when it receives a message it cannot process. This will cause the Bodhi consumer to send an error e-mail to the bodhi-admin list and then Nack the message to the broker, which will put the message back into the queue. This will cause the message to loop in the queue forever:

    try:
        self.greenwave_handler(msg)
    except Exception as e:
        error_msg = f'{str(e)}: Unable to handle message: {msg}'
        log.exception(error_msg)
        raise fedora_messaging.exceptions.Nack(error_msg)

I believe that messages looping on the queue forever will also lead to a halt of message processing, because eventually the head of the queue will only be messages that Bodhi cannot process and all other messages will not be delivered to the consumer.
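The stall described above can be illustrated with a toy model. This is only a sketch under stated assumptions: the broker is modelled as a `deque`, a Nacked message is assumed to be requeued at the head, and the `run` function, the `ok` flag, and the `max_deliveries` cap are all hypothetical names invented for the illustration. Real broker behavior also depends on settings such as prefetch.

```python
from collections import deque


def run(messages, on_failure, max_deliveries=20):
    """Toy broker loop (hypothetical): deliver messages until the queue is
    empty or max_deliveries is reached.

    on_failure is "nack" (requeue at the head, an assumption of this model)
    or "discard" (drop the message).
    """
    queue = deque(messages)
    processed = []
    deliveries = 0
    while queue and deliveries < max_deliveries:
        msg = queue.popleft()
        deliveries += 1
        if msg["ok"]:
            processed.append(msg["id"])
        elif on_failure == "nack":
            # The unprocessable message goes straight back to the head,
            # so it is the next one delivered again.
            queue.appendleft(msg)
        # With "discard" the message is simply dropped.
    return processed, deliveries


messages = [
    {"id": 1, "ok": True},
    {"id": 2, "ok": False},  # a message the consumer cannot process
    {"id": 3, "ok": True},
]

nack_result = run(messages, "nack")
discard_result = run(messages, "discard")
```

In this model the Nack variant uses up every allowed delivery on message 2 and never reaches message 3, while the discard variant finishes after three deliveries.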

If we receive a message we cannot process, we should log it and discard it (and in this case probably not at the error level, if messages with these sorts of defects are actually received, because that would lead to error e-mails that are not actionable).
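A minimal sketch of the log-and-discard approach might look like the following. It is not the actual Bodhi patch: `UnprocessableMessage`, `greenwave_handler`, `consume`, and the `subject_identifier` field are hypothetical stand-ins for illustration, and the sketch assumes that returning normally from the callback lets the messaging framework acknowledge the message rather than requeue it.

```python
import logging

log = logging.getLogger(__name__)


class UnprocessableMessage(Exception):
    """Hypothetical error raised by a handler for a malformed message."""


def greenwave_handler(msg: dict) -> str:
    """Hypothetical handler; 'subject_identifier' is an assumed field name."""
    if "subject_identifier" not in msg:
        raise UnprocessableMessage("missing subject_identifier")
    return msg["subject_identifier"]


def consume(msg: dict) -> bool:
    """Return True if the message was handled, False if it was discarded.

    Nothing is re-raised here, so the caller never sees an exception and
    has no reason to Nack the message.
    """
    try:
        greenwave_handler(msg)
    except UnprocessableMessage as e:
        # Logged below the error level, on the assumption that only
        # error-level records trigger e-mails to the admin list.
        log.warning("Discarding message we cannot process (%s): %r", e, msg)
        return False
    return True
```

Catching only the specific "cannot process" error, rather than every `Exception`, is a design choice of this sketch: unexpected failures would still surface loudly.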

This is critical to fix - we cannot release in the current state.

@bowlofeggs bowlofeggs added the Critical (We can't go on living in this squalor, drop everything and fix it!), Crash (Issues related to an unhandled crash), and reliability (Issues pertaining to Bodhi's reliability) labels Jun 7, 2019
@tdawson tdawson self-assigned this Jun 7, 2019
@tdawson
Contributor

tdawson commented Jun 7, 2019

Should be fixed in this pull request: #3303

@cverna
Contributor

cverna commented Jun 8, 2019

If I understand this correctly, we should not raise any exceptions in the consumers, so we also need to fix the following consumers:
https://github.com/fedora-infra/bodhi/blob/develop/bodhi/server/consumers/automatic_updates.py
https://github.com/fedora-infra/bodhi/blob/develop/bodhi/server/consumers/updates.py
https://github.com/fedora-infra/bodhi/blob/develop/bodhi/server/consumers/composer.py

Also, maybe we could add a comment about that in each consumer.

@bowlofeggs
Contributor Author

bowlofeggs commented Jun 10, 2019 via email

@cverna
Contributor

cverna commented Jun 10, 2019

Oh yes, that makes sense, thanks for the clarification 😄. So I think automatic_updates will need to be modified too, since it raises an exception if information is missing from the message.
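One way to avoid raising on missing information is to check the message body up front and return early. The sketch below is an assumption-laden illustration, not the automatic_updates code: `REQUIRED_KEYS`, `missing_keys`, `handle`, and the `build_id` and `tag` field names are all hypothetical.

```python
import logging

log = logging.getLogger(__name__)

# Hypothetical field names; the real consumer may need different ones.
REQUIRED_KEYS = ("build_id", "tag")


def missing_keys(body: dict, required=REQUIRED_KEYS) -> list:
    """Return the required keys that are absent from the message body."""
    return [key for key in required if key not in body]


def handle(body: dict):
    """Process a message body, or skip it quietly if fields are missing."""
    missing = missing_keys(body)
    if missing:
        # Return instead of raising, so the message is not Nacked.
        log.debug("Skipping message, missing fields: %s", ", ".join(missing))
        return None
    return (body["build_id"], body["tag"])
```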

@bowlofeggs
Contributor Author

Good catch @cverna - I filed #3302 for this.

@cverna
Contributor

cverna commented Jun 13, 2019

This issue was fixed by #3303, so let's close it.

@cverna cverna closed this as completed Jun 13, 2019

3 participants