Nemesis crashes on concurrent iptables operations #63

Closed
jhalterman opened this issue Jun 15, 2015 · 4 comments

Comments

@jhalterman
Contributor

I'm seeing fairly frequent nemesis crashes in the partition! call in a test I'm working on. As far as I can tell, they are caused by concurrent iptables operations:

18:00:15.321 WARN  [jepsen nemesis]: jepsen.core - Nemesis crashed evaluating {:time 79828671383, :process :nemesis, :type :info, :f :start}
java.util.concurrent.ExecutionException: java.util.concurrent.ExecutionException: java.lang.RuntimeException: Another app is currently holding the xtables lock. Perhaps you want to use the -w option?


  at java.util.concurrent.FutureTask.report(FutureTask.java:122)
  at java.util.concurrent.FutureTask.get(FutureTask.java:192)
  at clojure.core$deref_future.invoke(core.clj:2180)
  at clojure.core$future_call$reify__6320.deref(core.clj:6420)
  at clojure.core$deref.invoke(core.clj:2200)
  at clojure.core$map$fn__4245.invoke(core.clj:2559)
  at clojure.lang.LazySeq.sval(LazySeq.java:40)
  at clojure.lang.LazySeq.seq(LazySeq.java:49)
  at clojure.lang.Cons.next(Cons.java:39)
  at clojure.lang.RT.next(RT.java:598)
  at clojure.core$next.invoke(core.clj:64)
  at clojure.core$dorun.invoke(core.clj:2856)
  at jepsen.nemesis$partition_BANG_.invoke(nemesis.clj:28)
  at jepsen.nemesis$partitioner$reify__3494.invoke_BANG_(nemesis.clj:85)
  at jepsen.core$nemesis_worker$fn__3163$fn__3168.invoke(core.clj:192)
  at jepsen.core$nemesis_worker$fn__3163.invoke(core.clj:190)
  at clojure.core$binding_conveyor_fn$fn__4145.invoke(core.clj:1910)
  at clojure.lang.AFn.call(AFn.java:18)
  at java.util.concurrent.FutureTask.run(FutureTask.java:266)
  at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
  at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
  at java.lang.Thread.run(Thread.java:745)

I'm not sure what the best approach for avoiding this is, but I wanted to at least file it and see what you think.

@aphyr
Collaborator

aphyr commented Jun 15, 2015

Probably best to do a short retry loop with a timeout there.
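
For readers following along, here is a rough sketch of what a short retry loop with a timeout could look like. It is written in Python purely for illustration (Jepsen itself is Clojure), and `retry_with_timeout`, its parameters, and the choice to retry on `RuntimeError` are all hypothetical names and assumptions, not anything from the Jepsen codebase.

```python
import time


def retry_with_timeout(op, timeout_s=5.0, delay_s=0.1,
                       retry_on=(RuntimeError,),
                       clock=time.monotonic, sleep=time.sleep):
    """Call op() until it succeeds or the time budget is used up.

    Hypothetical helper: op stands in for whatever runs the iptables
    command, and retry_on is an assumed exception type for the
    "xtables lock" failure.
    """
    deadline = clock() + timeout_s
    while True:
        try:
            return op()
        except retry_on:
            # Give up (re-raising the last error) if another sleep
            # would push us past the deadline.
            if clock() + delay_s > deadline:
                raise
            sleep(delay_s)
```

The deadline is computed once up front, so the total time spent is bounded no matter how many attempts fit inside it, and the last error is re-raised unchanged when the budget runs out.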

@jkni
Contributor

jkni commented Jun 18, 2015

I've seen this too - I've done short retry loops, but failing that, --wait on the iptables commands hasn't caused any issues either.

@aphyr
Collaborator

aphyr commented Jun 18, 2015

Oh, yeah, let's just add --wait then :)

@jhalterman
Contributor Author

@jkni Good call. -w seems to work fine!
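
As an illustration of the --wait approach settled on above, here is a small Python sketch that appends the flag to an iptables argument list before it is run. The helper name `with_xtables_wait` and the example rule (including the 10.0.0.2 address) are made up for this sketch; the only thing taken from the thread is that iptables accepts -w / --wait, which makes it wait for the xtables lock instead of failing immediately.

```python
def with_xtables_wait(argv):
    """Return a copy of argv with --wait appended to iptables commands.

    Hypothetical helper for illustration only. Commands that are not
    iptables, or that already pass -w / --wait, are returned unchanged.
    """
    argv = list(argv)
    if not argv or argv[0] != "iptables":
        return argv
    if "-w" in argv or "--wait" in argv:
        return argv
    return argv + ["--wait"]


# Example rule: drop inbound traffic from one (made-up) peer address.
rule = ["iptables", "-A", "INPUT", "-s", "10.0.0.2", "-j", "DROP"]
print(" ".join(with_xtables_wait(rule)))
```

Compared with a retry loop, this leaves the waiting to iptables itself, so callers issuing rules concurrently queue on the lock rather than each polling for it.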

def- pushed a commit to def-/jepsen that referenced this issue May 24, 2023
* Bank improved scenarios

Co-authored-by: Dmitry Sherstobitov <dsherstobitov@yugabyte.com>