Parallel verrou_dd crashes with ValueError #10

Closed
HadrienG2 opened this issue Jun 19, 2018 · 3 comments


@HadrienG2
Contributor

HadrienG2 commented Jun 19, 2018

When I run verrou_dd in parallel on a test workload of mine, it systematically crashes on the second iteration with a backtrace like the one below. Sequential runs of the same workload work fine.

$ VERROU_DD_NUM_THREADS=4 VERROU_DD_NRUNS=4 verrou_dd `pwd`/run.sh `pwd`/cmp.sh
[...]
dd (run #1): trying 6275 + 6275
/root/acts-core/build/IntegrationTests/dd.sym/ca2681d399ee504572a37d53b1416f6f  --( run )-> 
Traceback (most recent call last):
  File "/usr/local/bin/verrou_dd", line 633, in <module>
    main(runScript, cmpScript, algoSearch=ddAlgo)
  File "/usr/local/bin/verrou_dd", line 605, in main
    (refSym, confSymsTab) = ddSym(run, compare)
  File "/usr/local/bin/verrou_dd", line 438, in ddSym
    conf = dd.ddmax(deltas)
  File "/usr/local/lib/python2.7/site-packages/valgrind/DD.py", line 733, in ddmax
    return self.ddgen(c, 0, 1)
  File "/usr/local/lib/python2.7/site-packages/valgrind/DD.py", line 607, in ddgen
    outcome = self._dd(c, n)
  File "/usr/local/lib/python2.7/site-packages/valgrind/DD.py", line 670, in _dd
    (t, cs[i]) = self.test_mix(cs[i], c, self.REMOVE)
  File "/usr/local/lib/python2.7/site-packages/valgrind/DD.py", line 580, in test_mix
    directionbar)
  File "/usr/local/lib/python2.7/site-packages/valgrind/DD.py", line 384, in test_and_resolve
    t = self.test(csubr)
  File "/usr/local/lib/python2.7/site-packages/valgrind/DD.py", line 313, in test
    outcome = self._test(c)
  File "/usr/local/bin/verrou_dd", line 409, in _test
    return vT.run()
  File "/usr/local/bin/verrou_dd", line 127, in run
    return self.runParMax(maxNbPROC)
  File "/usr/local/bin/verrou_dd", line 202, in runParMax
    run=self.pidRunTab.index(pid)                
ValueError: 50 is not in list
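
The last frame fails because `list.index` raises `ValueError` when the value it is asked for is not in the list. The sketch below only illustrates that behaviour and one defensive way to handle it. It is not taken from the verrou_dd sources: `find_run_slot` and the pid values are hypothetical, and only the name `pidRunTab` and the pid `50` come from the traceback.

```python
def find_run_slot(pid_run_tab, pid):
    """Return the index of pid in pid_run_tab, or None if pid is not tracked.

    Hypothetical helper: stands in for a bare pidRunTab.index(pid) lookup.
    """
    try:
        return pid_run_tab.index(pid)
    except ValueError:
        # list.index raises ValueError when the value is absent,
        # which is the error reported at the bottom of the traceback.
        return None


tracked = [47, 48, 49]              # hypothetical pids of runs still in flight
print(find_run_slot(tracked, 48))   # a tracked pid yields its slot index
print(find_run_slot(tracked, 50))   # an untracked pid yields None instead of raising
```

Whether ignoring an untracked pid is the right policy depends on why the pid is missing from the list in the first place, which the traceback alone does not show.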

My test workload is a bit complicated, but I have it inside a Docker container if that would be useful. Alternatively, we could try to find a simpler reproducer.

This is somewhat related to #8, in the sense that if the end decision is to change the verrou_dd parallelization algorithm, it may not be worth spending too much energy on fixing the existing one.

@lathuili
Contributor

I think you set a high score with 12500 symbols.
It looks like a bug in the Python scheduler, which I want to rewrite in Python 3.
If you can keep this test workload around for later, I'm interested.

@HadrienG2
Contributor Author

That's a C++ binary that uses boost + Eigen and is built in O0 mode. I'm not surprised that the symbol table got crazy :) Please ping me when you are done with the python3 port, and it will be my pleasure to torture it as well.

@lathuili
Contributor

lathuili commented Jun 1, 2021

Since v2.3.1, the delta-debugging should be robust enough to handle this problem. If not, you can open a new issue.

@lathuili lathuili closed this as completed Jun 1, 2021