EMR Action on failure not working #1361
I concur, although I don't think it is boto's fault. Boto sets the ActionOnFailure parameter to the string passed in. If CANCEL_AND_WAIT is used and the step is then inspected in the AWS EMR web console, the step lists:
Action on failure: Terminate cluster
Does this work for anyone? I thought a workaround would be to set keep_alive=True, and I do see Auto-terminate: No. However, since the step still lists "Terminate cluster" as its action on failure, the cluster shuts down anyway.
Basic initiation script (pigtest.csv can probably be any short file, and testscript1.pig just loads it and then writes to a database). This errors out for some reason, and the cluster terminates.
Perhaps even more strangely, merely adding the pigstep with conn.add_jobflow_steps() to a cluster launched as follows causes the cluster to shut down after the step fails, even though the cluster has CONTINUE specified under ActionOnFailure. This doesn't happen if a failing step is submitted to the cluster from the console or the command line:
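For anyone debugging this, one thing worth checking is what each step object actually carries before it is handed to conn.add_jobflow_steps(). The sketch below assumes the step objects expose a writable `action_on_failure` attribute (boto's JarStep appears to, but treat that as an assumption for other step classes). `force_action_on_failure` and `FakeStep` are hypothetical names made up for this illustration and are not part of boto.

```python
# Values accepted by the EMR API for a step's ActionOnFailure.
VALID_ACTIONS = {
    "TERMINATE_JOB_FLOW",
    "TERMINATE_CLUSTER",
    "CANCEL_AND_WAIT",
    "CONTINUE",
}


def force_action_on_failure(steps, action="CANCEL_AND_WAIT"):
    """Set `action_on_failure` on every step and report what changed.

    Assumes each step has (or tolerates) an `action_on_failure` attribute.
    Returns a list of (step_name, old_value, new_value) tuples.
    """
    if action not in VALID_ACTIONS:
        raise ValueError("unknown ActionOnFailure value: %r" % (action,))
    changes = []
    for step in steps:
        old = getattr(step, "action_on_failure", None)
        if old != action:
            step.action_on_failure = action
            name = getattr(step, "name", repr(step))
            changes.append((name, old, action))
    return changes


class FakeStep(object):
    """Hypothetical stand-in for a step object, used only to try the helper."""

    def __init__(self, name, action_on_failure="TERMINATE_JOB_FLOW"):
        self.name = name
        self.action_on_failure = action_on_failure
```

If the returned list is non-empty for the Pig step, the step was about to be submitted with a different action than intended, which would match the "Terminate cluster" shown in the console. The adjusted steps can then be passed to conn.add_jobflow_steps() as before.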
+1 I was able to fix the issue for the
This is in