supervisor does not appear to handle gen-server crashing #63

Closed
sandhu opened this issue Sep 22, 2016 · 9 comments

sandhu commented Sep 22, 2016

I'm trying to get a handle on the supervisor in pulsar and running into an issue when using it to manage gen-servers.

The supervisor does not appear to do anything if the gen-server throws an exception in the init.

The gen-server code is as follows:

(defn test-gen-server
  [name]
  (gen-server
   (reify Server
     (init [_]
       (println "Starting Server...")
       ;; Intentionally crash the gen-server
       (throw (Exception. "Blah"))
       (register! name @self)
       (println "Started Server."))
     (terminate [_ cause]
       (println "Stopping Server...")
       (unregister! @self)
       (println "Stopped Server."))
     (handle-call [_ from id [command param]]
       ))))

And I'm launching it as follows:

(defn run-server-via-supervisor
  []
  (spawn
   (supervisor "entry-point" :one-for-one
               (fn []
                 [["test/test-server" :permanent 20 5 :sec 100
                   (test-gen-server "test/test-server")]]))))

Running it produces:

> (run-server-via-supervisor)
#object[co.paralleluniverse.actors.behaviors.Supervisor 0x41f14811 "Supervisor{ActorRef@41f14811{SupervisorActor@entry-point[owner: entry-point]}}"]
Starting Server...
Stopping Server...
Stopped Server.

It does not appear that the supervisor is attempting the 20 restarts as indicated in the spec.

Additionally, the output is identical to simply spawning the gen-server directly, without a supervisor:

> (spawn (test-gen-server "test/test-server"))
#object[co.paralleluniverse.actors.behaviors.Server 0x129ff808 "Server{ActorRef@129ff808{ServerActor@6becac41[owner: fiber-10000007]}}"]
Starting Server... 
Stopping Server...
Stopped Server.

I've attached a minimum test project for reference — pulsar-test.zip

Please let me know if there is any additional information I can provide to help debug this, or if my understanding of the supervisor is incorrect.

Author

sandhu commented Sep 22, 2016

I'll add that things work as expected when the supervised child is a plain actor function:

pulsar-test.core> (spawn
                   (supervisor "entry-point" :one-for-one
                               (fn []
                                 [["test/test-server" :permanent 20 5 :sec 100
                                   (fn []
                                     (println "Starting actor")
                                     (throw (Exception. "from actor")))]])))
#object[co.paralleluniverse.actors.behaviors.Supervisor 0x2153fbbc "Supervisor{ActorRef@2153fbbc{SupervisorActor@entry-point[owner: entry-point]}}"]
Starting actor
Starting actor
Starting actor
Starting actor
Starting actor
Starting actor
Starting actor
Starting actor
Starting actor
Starting actor
Starting actor
Starting actor
Starting actor
Starting actor
Starting actor
Starting actor
Starting actor
Starting actor
Starting actor
Starting actor
Starting actor

Author

sandhu commented Sep 22, 2016

I get a similar result when throwing from handle-timeout:

(defn test-gen-server
  [name]
  (gen-server :timeout 2000
              (reify Server
                (init [_]
                  (println "Starting Server...")
                  (register! name @self)
                  (println "Started Server."))
                (terminate [_ cause]
                  (println "Stopping Server..." cause)
                  (unregister! @self)
                  (println "Stopped Server."))
                (handle-call [_ from id [command param]]
                  )
                (handle-timeout [_]
                  (println "Throwing from handle-timeout")
                  (throw (Exception. "Blah timeout"))))))
pulsar-test.core> (def s (run-server-via-supervisor)) 
#'pulsar-test.core/s
Starting Server...
Started Server.
Throwing from handle-timeout
Stopping Server... #error {
 :cause Blah timeout
 :via
 [{:type java.lang.Exception
   :message Blah timeout
   :at [pulsar_test.core$test_gen_server$reify__25316 handle_timeout form-init1441529214695591630.clj 27]}]
 :trace
 [[pulsar_test.core$test_gen_server$reify__25316 handle_timeout form-init1441529214695591630.clj 27]
  [co.paralleluniverse.pulsar.actors$Server$reify__25164 handleTimeout actors.clj 665]
  [co.paralleluniverse.actors.behaviors.ServerActor handleTimeout ServerActor.java 360]
  [co.paralleluniverse.actors.behaviors.ServerActor behavior ServerActor.java 199]
  [co.paralleluniverse.actors.behaviors.BehaviorActor doRun BehaviorActor.java 293]
  [co.paralleluniverse.actors.behaviors.BehaviorActor doRun BehaviorActor.java 36]
  [co.paralleluniverse.actors.Actor run0 Actor.java 691]
  [co.paralleluniverse.actors.ActorRunner run ActorRunner.java 51]
  [co.paralleluniverse.fibers.Fiber run Fiber.java 1072]
  [co.paralleluniverse.fibers.Fiber run1 Fiber.java 1067]
  [co.paralleluniverse.fibers.Fiber exec Fiber.java 767]
  [co.paralleluniverse.fibers.FiberForkJoinScheduler$FiberForkJoinTask exec1 FiberForkJoinScheduler.java 266]
  [co.paralleluniverse.concurrent.forkjoin.ParkableForkJoinTask doExec ParkableForkJoinTask.java 117]
  [co.paralleluniverse.concurrent.forkjoin.ParkableForkJoinTask exec ParkableForkJoinTask.java 74]
  [jsr166e.ForkJoinTask doExec ForkJoinTask.java 261]
  [jsr166e.ForkJoinPool$WorkQueue runTask ForkJoinPool.java 988]
  [jsr166e.ForkJoinPool runWorker ForkJoinPool.java 1628]
  [jsr166e.ForkJoinWorkerThread run ForkJoinWorkerThread.java 107]]}
Stopped Server.
pulsar-test.core> 

sandhu changed the title from "supervisor does not appear to handle gen-server crashing in the (init ...)" to "supervisor does not appear to handle gen-server crashing" Sep 23, 2016
Author

sandhu commented Sep 30, 2016

@pron, @circlespainter — Apologies for tagging you guys directly, but this issue is blocking my application.

Could you please take a look? I may very well be using gen-server and/or supervisor incorrectly, but it'd be helpful to know.

A minimal example that reproduces the issue is attached to the original post.

Thank you.

Contributor

pron commented Oct 3, 2016

I'll take a look shortly, and respond within the week.

Author

sandhu commented Oct 3, 2016

Thank you @pron. Very much appreciated.

pron added the bug label Oct 7, 2016
Contributor

pron commented Oct 7, 2016

Found the problem. I'll push a fix tomorrow.

Author

sandhu commented Oct 7, 2016

That's great. Thank you.

pron closed this as completed in 2c75786 Oct 8, 2016
Contributor

pron commented Oct 8, 2016

Alright, you'll need to use 0.7.7-SNAPSHOT.
Instead of (test-gen-server "test/test-server") write:

(actor-builder test-gen-server "test/test-server")

or

(actor-builder #(test-gen-server "test/test-server"))

to tell the supervisor that this is a function that (re)creates the actor.
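
For reference, a minimal sketch of how the launcher from the original report might look with this change applied. It assumes 0.7.7-SNAPSHOT, assumes that spawn, supervisor and actor-builder are all exposed by the co.paralleluniverse.pulsar.actors namespace, and assumes test-gen-server is defined earlier in the same namespace exactly as in the report. It has not been run against the attached project.

```clojure
;; Sketch only. Assumptions: Pulsar 0.7.7-SNAPSHOT, and spawn, supervisor
;; and actor-builder are referred from co.paralleluniverse.pulsar.actors.
(ns pulsar-test.core
  (:require [co.paralleluniverse.pulsar.actors :refer :all]))

;; test-gen-server is the function from the original report and must be
;; defined in this namespace before the launcher below.

(defn run-server-via-supervisor
  []
  (spawn
   (supervisor "entry-point" :one-for-one
               (fn []
                 ;; Same child spec as before; only the last element changes.
                 ;; It is wrapped in actor-builder instead of calling
                 ;; test-gen-server directly, so the supervisor gets a
                 ;; function that (re)creates the actor.
                 [["test/test-server" :permanent 20 5 :sec 100
                   (actor-builder test-gen-server "test/test-server")]]))))
```

The only difference from the original launcher is the last element of the child spec; the name, restart mode and restart limits are unchanged.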

Author

sandhu commented Oct 9, 2016

Thanks @pron, it works as expected in my example code. Will try it in the full application later today and let you know if there are any issues.

Is there a timeframe for the 0.7.7 release?
