[FLINK-3981] don't log duplicate TaskManager registrations
Duplicate TaskManager registrations shouldn't be logged with exceptions
in the ResourceManager. Duplicate registrations can happen if the
TaskManager resends its registration message while the reply to an
earlier one is still in transit rather than lost.

The ResourceManager should simply acknowledge the duplicate
registrations, leaving it up to the JobManager to decide how to treat
the duplicate registrations (currently it will send an AlreadyRegistered
to the TaskManager).

This closes #2045
mxm committed May 30, 2016
1 parent 254054f commit 9328006
Showing 1 changed file with 5 additions and 4 deletions.
@@ -354,18 +354,19 @@ private void handleRegisterResource(ActorRef jobManager, ActorRef taskManager,
ResourceID resourceID = msg.resourceId();
try {
Preconditions.checkNotNull(resourceID);
WorkerType newWorker = workerRegistered(resourceID);
WorkerType oldWorker = registeredWorkers.put(resourceID, newWorker);
// check if resourceID is already registered (TaskManager may send duplicate register messages)
WorkerType oldWorker = registeredWorkers.get(resourceID);
if (oldWorker != null) {
LOG.warn("TaskManager {} had been registered before.", resourceID);
LOG.debug("TaskManager {} had been registered before.", resourceID);
} else {
WorkerType newWorker = workerRegistered(resourceID);
registeredWorkers.put(resourceID, newWorker);
LOG.info("TaskManager {} has registered.", resourceID);
}
jobManager.tell(decorateMessage(
new RegisterResourceSuccessful(taskManager, msg)),
self());
} catch (Exception e) {
// This may happen on duplicate task manager registration message to the job manager
LOG.warn("TaskManager resource registration failed for {}", resourceID, e);

// tell the JobManager about the failure
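
The pattern behind this change is a lookup before the registration step, so that a repeated message is acknowledged without re-running `workerRegistered` or raising an exception. The following is only a minimal standalone sketch of that idea, not Flink code: the `WorkerRegistry` class, the `String` identifiers (standing in for `ResourceID` and `WorkerType`), and the `registrationCalls` counter are all hypothetical names introduced for illustration.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

/** Hypothetical stand-in for the ResourceManager's worker bookkeeping; not Flink code. */
class WorkerRegistry {
    // Maps a resource ID to its worker, like registeredWorkers in the diff above.
    private final Map<String, String> registeredWorkers = new HashMap<>();

    // Counts how often the registration step actually ran (illustration only).
    private int registrationCalls = 0;

    /** Stand-in for workerRegistered(resourceID), which may not tolerate repeated calls. */
    private String workerRegistered(String resourceId) {
        registrationCalls++;
        return "worker-" + resourceId;
    }

    /**
     * Registers a worker at most once.
     * Returns true for a first registration and false for a duplicate.
     * In both cases the caller can still send a success acknowledgement.
     */
    boolean register(String resourceId) {
        Objects.requireNonNull(resourceId, "resourceId");
        if (registeredWorkers.get(resourceId) != null) {
            // Duplicate message: skip the registration step entirely.
            return false;
        }
        registeredWorkers.put(resourceId, workerRegistered(resourceId));
        return true;
    }

    int registrationCalls() {
        return registrationCalls;
    }
}
```

In this sketch, calling `register("tm-1")` twice runs the registration step only once. The second call reports a duplicate instead of failing, which mirrors the commit's intent of acknowledging duplicates and leaving the decision about them to the JobManager.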