don't allow proxying requests to self #497

ryane · 2015-07-16T11:31:46Z

Even with the change in #463, it still appears possible for a supposedly non-leader instance to get its own path back as the leader path. When this happens, in RedirectFilter, jobScheduler.isLeader == false but jobScheduler.getLeader returns the path of the current instance. Chronos then redirects all requests to itself and, ultimately, the REST API becomes unresponsive. It never appears to recover from this state. If the other Chronos instances think the stuck node is the leader, they also become unresponsive, leaving the whole Chronos cluster unusable.

Unfortunately, the problem is somewhat difficult to reproduce. One way I have found to reproduce it fairly consistently is to set up 3 servers, each running ZooKeeper, Chronos, and Mesos, and then reboot each server serially. Often (but not always), one of the Chronos instances ends up proxying all requests to itself and becomes unresponsive.

I am not very familiar with the Chronos code (or Scala, for that matter), so there may be a better way to handle this. This change also does not address the root cause: why and how JobScheduler ends up thinking it is not the leader yet still returns the current instance's path from getLeader. I have not been able to figure that out. However, this commit does seem to prevent Chronos from proxying requests to itself and ending up in the unresponsive state.
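
The guard described above can be sketched roughly as follows. This is a minimal illustration, not the code from the commit: `Routing`, `SelfProxyGuard`, `route`, and `selfPath` are hypothetical names, and handling the request locally when the leader path equals the instance's own path is only one possible choice.

```scala
// Hypothetical sketch; names and behavior are assumptions, not the actual commit.
sealed trait Routing
case object HandleLocally extends Routing
final case class ProxyTo(leader: String) extends Routing

object SelfProxyGuard {
  /** Decide how to route an incoming request.
    *
    * @param isLeader   the value reported by jobScheduler.isLeader
    * @param leaderPath the value reported by jobScheduler.getLeader
    * @param selfPath   this instance's own path (assumed to be known locally)
    */
  def route(isLeader: Boolean, leaderPath: String, selfPath: String): Routing =
    if (isLeader || leaderPath == selfPath) HandleLocally // never proxy to ourselves
    else ProxyTo(leaderPath)
}
```

With this check, the inconsistent state (isLeader == false while getLeader returns the current instance's path) no longer produces a redirect loop, although the inconsistency itself remains.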

kolloch · 2015-08-05T17:03:24Z

Hi @ryane, thanks for your pull request.

If we get a request and we do not have consistent leadership information, we should probably wait for consistent leadership information or reject the request.

That's what we have done for Marathon. See here:

https://github.com/mesosphere/marathon/blob/master/src/main/scala/mesosphere/marathon/api/LeaderProxyFilter.scala#L121-L152
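
As a rough illustration of the "wait or reject" idea, here is a minimal sketch. It is not the Marathon code linked above; `ConsistentLeadership`, `decide`, `Decision`, and the retry parameters are hypothetical names chosen for this example, and the leadership lookups are passed in as plain functions.

```scala
import scala.annotation.tailrec

// Hypothetical sketch; not taken from Marathon or Chronos.
sealed trait Decision
case object Serve extends Decision                      // we are the leader
final case class Forward(leader: String) extends Decision
case object Reject extends Decision                     // e.g. answer with HTTP 503

object ConsistentLeadership {
  /** Re-reads leadership state until it is consistent or retries run out.
    *
    * Consistent means: we are the leader, or a leader other than us is known.
    */
  @tailrec
  def decide(
      isLeader: () => Boolean,
      leader: () => Option[String],
      self: String,
      retriesLeft: Int,
      waitMs: Long
  ): Decision = {
    val weLead = isLeader()
    val current = leader()
    if (weLead) Serve
    else current match {
      case Some(l) if l != self => Forward(l)
      case _ if retriesLeft <= 0 => Reject // still inconsistent: give up
      case _ =>
        Thread.sleep(waitMs) // give the election time to settle
        decide(isLeader, leader, self, retriesLeft - 1, waitMs)
    }
  }
}
```

Rejecting after a bounded wait keeps the instance responsive even if the leadership information never settles, at the cost of failing some requests during an election.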

pdericson · 2016-03-07T05:25:47Z

👍

brndnmtthws · 2016-04-08T15:38:33Z

Thanks!

don't allow proxying requests to self

989d07f

brndnmtthws merged commit 8c77669 into mesos:master Apr 8, 2016

pereile mentioned this pull request Jul 28, 2016

Chronos Proxying Request to same Node #706

Open

gkleiman pushed a commit that referenced this pull request Apr 26, 2017

don't allow proxying requests to self (#497)

002138d
