Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

Already on GitHub? Sign in to your account

Enforcing ordering of Terminated wrt remote/local #2835 #997

Merged
merged 2 commits into from Jan 4, 2013

Conversation

Projects
None yet
5 participants
Member

drewhk commented Jan 4, 2013

Due to the undefined ordering between Terminated messages it was possible for the RemoteDaemon to shut down and proceed with shutting down the Remoting before a Terminated was sent to remote watchers -- causing the message to be dropped.

I am not sure this is the best way to distinguish between local/remote ActorRefs, but the bug is confirmed to be fixed. Also, some feedback how this might affect cluster deployed actors is welcome.

@viktorklang viktorklang and 2 others commented on an outdated diff Jan 4, 2013

...or/src/main/scala/akka/actor/dungeon/DeathWatch.scala
}
+
+ val (remote, local) = watchedBy.partition(_.path.address.host.isDefined)
+
+ remote foreach sendTerminated
+ local foreach sendTerminated
@viktorklang

viktorklang Jan 4, 2013

Owner

Can we come up with a more efficient solution?

@drewhk

drewhk Jan 4, 2013

Member

Two loops is more efficient, but uglier as well
or we can keep the watchedBy collection sorted
or we can have to watchedBy collections.
or any more ideas you have

@viktorklang

viktorklang Jan 4, 2013

Owner

Since the above solution is O(2*N) anyway, there's no need for creating an intermediate collection.

def sendTerminated(ifLocal: Boolean)(watcher: ActorRef): Unit = 
  watcher match {
    case a: ActorRefScope if a.isLocal == ifLocal =>
      try watcher.tell(terminated, self) catch {
        case NonFatal(t) ⇒ publish(Error(t, self.path.toString, clazz(actor), "deathwatch"))
      }
    case _ => // N/A
  }

watchedBy foreach sendTerminated(ifLocal = false)
watchedBy foreach sendTerminated(ifLocal = true)
@drewhk

drewhk Jan 4, 2013

Member

That's the two loops approach, but much prettier than I thought. I like it.

@drewhk

drewhk Jan 4, 2013

Member

Is it safe to ignore ActorRefs without ActorRefScope?

@viktorklang

viktorklang Jan 4, 2013

Owner

@drewhk Yes, it should be: private[akka] abstract class InternalActorRef extends ActorRef with ScalaActorRef { this: ActorRefScope ⇒

However, you can add ERROR logging there to make sure.

@patriknw

patriknw Jan 4, 2013

Owner

I think that is good
do we have any ActorRef without ActorRefScope? should it be silently ignored?

@patriknw

patriknw Jan 4, 2013

Owner

or just remove the case, if it should never happen

@drewhk

drewhk Jan 4, 2013

Member

How does logging work from here? I am inside DeathWatch (and ActorCell via self type)

@viktorklang

viktorklang Jan 4, 2013

Owner

Did you see my example code above?

@drewhk

drewhk Jan 4, 2013

Member

I saw publish, but I was not sure if it applies to my case.

@viktorklang

viktorklang Jan 4, 2013

Owner

Why wouldn't it?

protected final def publish(e: LogEvent): Unit = try system.eventStream.publish(e) catch { case NonFatal(_) ⇒ }
@patriknw

patriknw Jan 4, 2013

Owner

I would let it blow up instead (remove the case _)

@viktorklang

viktorklang Jan 4, 2013

Owner

The irony is that it will blow up if the default case is removed, since the local-check is done in the guard. So if you remove the default, you haveto make the guard into an if-expression in the closure body instead.

@drewhk

drewhk Jan 4, 2013

Member

I noticed that already. I just added another case statement without the guard but with matching on ActorRefScope. Of course I can move the guard inside the case statement as well. Which one do you prefer?

@viktorklang

viktorklang Jan 4, 2013

Owner

perhaps just scrap the match all together and just do:

if (watcher.asInstanceOf[ActorRefScope].isLocal == ifLocal)
  watcher.tell(terminated, self)  

@patriknw patriknw and 3 others commented on an outdated diff Jan 4, 2013

akka-actor/src/main/scala/akka/actor/Address.scala
@@ -19,6 +19,7 @@ import scala.collection.immutable
*/
@SerialVersionUID(1L)
final case class Address private (protocol: String, system: String, host: Option[String], port: Option[Int]) {
+ // Please note that local/non-local distinction must be preserved: host.isDefined == !isLocal
@patriknw

patriknw Jan 4, 2013

Owner

perhaps add a

def isLocal: Boolean = host.isEmpty
@drewhk

drewhk Jan 4, 2013

Member

Do we need this comment anymore if we use ActorRefScope wich Viktor proposed?

@rkuhn

rkuhn Jan 4, 2013

Collaborator

I am at least half-certain that Addresses may also be interrogated for local-ness, and there ActorRefScope does not help; I’d like to keep this comment; and we should also add an isLocal method which encodes this convention.

@drewhk

drewhk Jan 4, 2013

Member

But if the Address is the same as our listen address then isLocal should return true.

@rkuhn

rkuhn Jan 4, 2013

Collaborator

no, Address has nothing to do with the remoting implementation: the address either specifies host/port, in which case it is non-local, or it does not. Address.isLocal only means that this Address cannot be migrated to other systems or nodes without additional information.

@drewhk

drewhk Jan 4, 2013

Member

Maybe we should use a different name for that function.
isParial? migratable?
WDYT?

@rkuhn

rkuhn Jan 4, 2013

Collaborator

isLocalScope?

@drewhk

drewhk Jan 4, 2013

Member

locallyDefined? totallyDefined?

@rkuhn

rkuhn Jan 4, 2013

Collaborator

no, it’s really about the scope of the address, not unlike IPv6 addresses (host-only, link-local, global); it is basically the realm within which this address is supposed to be unique; hence I think there should be both

def hasLocalScope = host.isEmpty
def hasGlobalScope = host.isDefined

@patriknw patriknw commented on the diff Jan 4, 2013

...or/src/main/scala/akka/actor/dungeon/DeathWatch.scala
@@ -51,12 +51,14 @@ private[akka] trait DeathWatch { this: ActorCell ⇒
if (!watchedBy.isEmpty) {
val terminated = Terminated(self)(existenceConfirmed = true, addressTerminated = false)
try {
- watchedBy foreach {
- watcher ⇒
- try watcher.tell(terminated, self) catch {
@patriknw

patriknw Jan 4, 2013

Owner

why do we have the try-catch around tell?

@drewhk

drewhk Jan 4, 2013

Member

My intuition would be that not all ActorRef's are actual actors, so we might get exceptions when invoking tell. But I am not a dungeon-dweller :)

@viktorklang

viktorklang Jan 4, 2013

Owner

I think the try-catch is an artifact from the past where tell was allowed to throw exceptions. However, if it does throw an exception, if we don't catch and continue, the system is broken.

@patriknw

patriknw Jan 4, 2013

Owner

but we rely on that tell should not throw exception in other places, to avoid sprinkling this all over the place

@viktorklang

viktorklang Jan 4, 2013

Owner

That's true, let's remove and pray :-)

@drewhk

drewhk Jan 4, 2013

Member

Could sendSystemMessage throw an exception? Because there is a similar catch construct just one method down.

@viktorklang

viktorklang Jan 4, 2013

Owner

Not likely, but lets open a ticket about fixing all such occurences separately.

@rkuhn

rkuhn Jan 4, 2013

Collaborator

+1

Owner

patriknw commented Jan 4, 2013

LGTM after some optimization

Member

drewhk commented Jan 4, 2013

Updated according to review comments.

Owner

viktorklang commented Jan 4, 2013

+1!

Member

drewhk commented Jan 4, 2013

Umm, looking again, I should probably add a comment why the ordering is important.

Collaborator

akka-ci commented Jan 4, 2013

Started jenkins job akka-pr-validator at https://jenkins.akka.io/job/akka-pr-validator/274/

Collaborator

akka-ci commented Jan 4, 2013

jenkins job akka-pr-validator: Success - https://jenkins.akka.io/job/akka-pr-validator/274/

Member

drewhk commented Jan 4, 2013

Added comment describing why the specific ordering is needed, and squashed commits.

Owner

viktorklang commented Jan 4, 2013

👍

Member

drewhk commented Jan 4, 2013

PLS REBUILD ALL

Member

drewhk commented Jan 4, 2013

@rkuhn Can you look at this one before I merge?

Collaborator

rkuhn commented Jan 4, 2013

LGTM

Member

drewhk commented Jan 4, 2013

Updated: added scope queries to Address

Collaborator

rkuhn commented Jan 4, 2013

thanks, merge!

@drewhk drewhk added a commit that referenced this pull request Jan 4, 2013

@drewhk drewhk Merge pull request #997 from drewhk/wip-2835-ordering-of-terminated
Enforcing ordering of Terminated wrt remote/local #2835
cdd86cb

@drewhk drewhk merged commit cdd86cb into akka:master Jan 4, 2013

Collaborator

akka-ci commented Jan 4, 2013

Started jenkins job akka-pr-validator at https://jenkins.akka.io/job/akka-pr-validator/279/

Collaborator

akka-ci commented Jan 4, 2013

jenkins job akka-pr-validator: Failed - https://jenkins.akka.io/job/akka-pr-validator/279/

Member

drewhk commented Jan 4, 2013

CallingThreadDispatcherModelSpec failed. I added the failure to #2821.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment