
Events other than job results are forwarded to a random master #35588

Closed
szjur opened this issue Aug 19, 2016 · 3 comments
Labels: Pending-Discussion (the issue or pull request needs more discussion before it can be closed or merged), stale
Milestone: Blocked

Comments

szjur commented Aug 19, 2016

Description of Issue/Question

In a multi-master / multi-syndic environment, events other than job results (such as minion_start) are forwarded to a random master. If you want to watch the event bus for an event such as a minion starting after a reboot, you never know which top-level master gets it, which makes those events pretty hard to use. Suppose you start a reboot job on a minion from one of the masters; the event that the minion came back up may then arrive at the other one.

The reason for that behaviour is that MultiSyndic._forward_events() calls the _call_syndic() method without any master_id, which effectively forwards self.raw_events to the first syndic on the list. This can easily be changed so that a special master_id value (such as '*') makes _call_syndic() forward events to every available master. At least, this works for me.
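
To make the routing concrete, here is a minimal standalone sketch of that dispatch logic. This is not the actual salt/minion.py code: the class and the delivery callables are invented for illustration, and only the _call_syndic() / _forward_events() names and the '*' sentinel match what I describe above.

```python
# Toy model of the syndic's event dispatch. Everything here is
# illustrative; only _call_syndic()/_forward_events() and the '*'
# sentinel correspond to the behaviour described in this issue.

class ToyMultiSyndic(object):
    def __init__(self, masters):
        # master id -> callable that delivers events to that master
        self.masters = masters
        self.raw_events = []

    def _call_syndic(self, func, events, master_id=None):
        for mid, deliver in sorted(self.masters.items()):
            if master_id not in (None, '*') and mid != master_id:
                continue  # caller asked for one specific master
            try:
                deliver(func, events)
            except IOError:
                continue  # master unreachable; best effort, try the next
            if master_id != '*':
                return  # current behaviour: first success wins

    def _forward_events(self, master_id=None):
        if self.raw_events:
            self._call_syndic('_fire_master', self.raw_events,
                              master_id=master_id)
            self.raw_events = []


if __name__ == '__main__':
    seen = {'master1': [], 'master2': []}
    syndic = ToyMultiSyndic({
        mid: (lambda m: lambda func, ev: seen[m].extend(ev))(mid)
        for mid in seen
    })

    syndic.raw_events = [{'tag': 'minion_start'}]
    syndic._forward_events()               # today: only one master gets it
    print(seen)

    seen['master1'][:] = []
    seen['master2'][:] = []
    syndic.raw_events = [{'tag': 'minion_start'}]
    syndic._forward_events(master_id='*')  # patched: every master gets it
    print(seen)
```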

Setup

2 masters, 2 syndics connecting to both of them.

Steps to Reproduce Issue

Restart the minion on an end node a couple of times and watch the minion_start event appear on one of the masters (semi-randomly).

Versions Report

2016.3.2

Ch3LL (Contributor) commented Aug 19, 2016

Looks like the docs here do indeed state:

Since each syndic is connected to each master, jobs sent from any master are forwarded to minions that are connected to each syndic. If the master_id value is set in the master config on the higher level masters, job results are returned to the master that originated the request in a best effort fashion. Events/jobs without a master_id are returned to any available master.

Which I believe is the behavior you are running into, where events without a master_id are returned to any available master. Since my knowledge is not as refined in this area, though... ping @cachedout: is there a way that we could improve on this? Maybe an option to specify using all masters in the list? Also, just to confirm, am I understanding these docs correctly?
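
For reference, the topology the docs describe maps to config along these lines. The master ids are examples; master_id and order_masters are documented options, and a list of masters in syndic_master is, as far as I know, what enables the multi-syndic code path:

```yaml
# /etc/salt/master on each top-level master (ids are examples)
order_masters: True   # this master sits above syndics
master_id: master1    # unique id used to route job returns back here

# /etc/salt/master on each syndic node, pointing at both masters
syndic_master:
  - master1
  - master2
```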

Ch3LL added the Pending-Discussion label Aug 19, 2016
Ch3LL added this to the Blocked milestone Aug 19, 2016
szjur (Author) commented Aug 19, 2016

Yes, I guess you understand the docs correctly. I didn't know that job results with a master_id are returned to that master only in a best-effort fashion, but that can probably be alleviated easily by using a shared job cache. On the other hand, events such as minion_start don't go to any job cache, as they aren't even jobs. I may be talking rubbish now, but I guess they are gone if you don't read them from the event bus at the time they arrive. For me, forwarding such events to just any master is a bit pointless.

As said previously, changing that behaviour is really easy. When _forward_events() parses self.raw_events (https://github.com/saltstack/salt/blob/develop/salt/minion.py#L2577), it can set master_id to '*' or another special value, which can then be treated appropriately by _call_syndic() (https://github.com/saltstack/salt/blob/develop/salt/minion.py#L2423).
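
Roughly, the shape of the change is as follows. This is paraphrased, not the actual salt/minion.py source, which passes more kwargs and does its own error handling:

```python
# Paraphrased sketch of the patch; not the real salt/minion.py code.

def _forward_events(self):
    if self.raw_events:
        # Pass the special '*' master_id so _call_syndic() fans the
        # events out to every connected master instead of only the
        # first one that accepts them.
        self._call_syndic('_fire_master',
                          kwargs={'events': self.raw_events},
                          master_id='*')
```

with _call_syndic() adjusted so that a master_id of '*' keeps iterating over all connected masters instead of returning after the first successful delivery, as sketched in my first comment.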

Anyway guys, that is only my suggestion. I patched it myself, and the least I can do is report it here so that the community may potentially benefit.

stale bot commented Jun 20, 2018

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

If this issue is closed prematurely, please leave a comment and we will gladly reopen the issue.

stale bot added the stale label Jun 20, 2018
stale bot closed this as completed Jun 27, 2018