Parallel verrou_dd crashes with ValueError #10

HadrienG2 · 2018-06-19T16:27:34Z

When I run verrou_dd in parallel on a test workload of mine, it systematically crashes on the second iteration with this kind of backtrace. Sequential runs work fine on the same workload.

$ VERROU_DD_NUM_THREADS=4 VERROU_DD_NRUNS=4 verrou_dd `pwd`/run.sh `pwd`/cmp.sh
[...]
dd (run #1): trying 6275 + 6275
/root/acts-core/build/IntegrationTests/dd.sym/ca2681d399ee504572a37d53b1416f6f  --( run )-> 
Traceback (most recent call last):
  File "/usr/local/bin/verrou_dd", line 633, in <module>
    main(runScript, cmpScript, algoSearch=ddAlgo)
  File "/usr/local/bin/verrou_dd", line 605, in main
    (refSym, confSymsTab) = ddSym(run, compare)
  File "/usr/local/bin/verrou_dd", line 438, in ddSym
    conf = dd.ddmax(deltas)
  File "/usr/local/lib/python2.7/site-packages/valgrind/DD.py", line 733, in ddmax
    return self.ddgen(c, 0, 1)
  File "/usr/local/lib/python2.7/site-packages/valgrind/DD.py", line 607, in ddgen
    outcome = self._dd(c, n)
  File "/usr/local/lib/python2.7/site-packages/valgrind/DD.py", line 670, in _dd
    (t, cs[i]) = self.test_mix(cs[i], c, self.REMOVE)
  File "/usr/local/lib/python2.7/site-packages/valgrind/DD.py", line 580, in test_mix
    directionbar)
  File "/usr/local/lib/python2.7/site-packages/valgrind/DD.py", line 384, in test_and_resolve
    t = self.test(csubr)
  File "/usr/local/lib/python2.7/site-packages/valgrind/DD.py", line 313, in test
    outcome = self._test(c)
  File "/usr/local/bin/verrou_dd", line 409, in _test
    return vT.run()
  File "/usr/local/bin/verrou_dd", line 127, in run
    return self.runParMax(maxNbPROC)
  File "/usr/local/bin/verrou_dd", line 202, in runParMax
    run=self.pidRunTab.index(pid)                
ValueError: 50 is not in list
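
For reference, the last frame is a plain `list.index` lookup, which raises `ValueError` whenever the value is absent from the list. Below is a minimal sketch of that failure and of a defensive lookup. The function name and the sample pids are hypothetical, chosen only for illustration; this is not verrou_dd's actual scheduler code and it says nothing about why pid 50 was missing.

```python
# Hypothetical helper and data: an illustration, not verrou_dd's real code.

def slot_for_pid(pid_run_tab, pid):
    """Return the slot index of `pid`, or None if the pid is not tracked."""
    try:
        return pid_run_tab.index(pid)
    except ValueError:
        # list.index raises ValueError when the value is missing, which is
        # the error shown at the bottom of the backtrace.
        return None


tracked = [48, 49, None, 51]      # made-up pids of the runs being tracked
print(slot_for_pid(tracked, 49))  # 1
print(slot_for_pid(tracked, 50))  # None
```

A lookup like this would only turn the crash into a case the caller has to handle; whether an untracked pid should be ignored or reported is a separate design decision.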

My test workload is a bit complicated, but I have it inside of a docker container if that can be useful. Or maybe we can find a simpler reproducer.

This is somewhat related to #8, in the sense that if the end decision is to change the verrou_dd parallelization algorithm, it may not be worth spending too much energy on fixing the existing one.

lathuili · 2018-06-19T18:09:15Z

I think you have set a record with 12500 symbols.
It looks like a bug in our Python scheduler, which I want to rewrite in Python 3.
If you can keep this test workload for later, I'm interested.

HadrienG2 · 2018-06-19T18:20:36Z

That's a C++ binary that uses boost + Eigen and is built in O0 mode. I'm not surprised that the symbol table got crazy :) Please ping me when you are done with the python3 port, and it will be my pleasure to torture it as well.

lathuili · 2021-06-01T09:22:12Z

Since v2.3.1, the delta-debug should be robust enough to handle this problem. If not, you can open a new issue.

lathuili closed this as completed Jun 1, 2021
