Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ctrl-c to abort running tests broken in 4.0.5 #408

Closed
jrwagz opened this issue Nov 13, 2018 · 13 comments
Closed

ctrl-c to abort running tests broken in 4.0.5 #408

jrwagz opened this issue Nov 13, 2018 · 13 comments

Comments

@jrwagz
Copy link

jrwagz commented Nov 13, 2018

First of all, great framework! I've been very happy with it.

I first started using v3.9.0 and noticed that when running the simulations in parallel that I could easily abort them all with ctrl-c and I'd get a nice report of what ran, and what was aborted.

Now I grabbed v4.0.5 and I've got simulations running, however I am unable to abort the execution with a ctrl-c like I used to be able to.

Has anyone else noticed this?

@jrwagz
Copy link
Author

jrwagz commented Nov 13, 2018

On second attempt, it appears to finally work, but did require 4 separate ctrl-c commands (i had p set to 4)

Starting lib.top.serial_channel
Output file: vunit_out/test_output/lib.top.serial_channel_fe65354585007cab7ed52542326197aee220c79b/output.txt
Starting lib.top.ethernet_channel
Output file: vunit_out/test_output/lib.top.ethernet_channel_dc92795136e4223f7bc16a18cdbbc55d157b4a35/output.txt
Starting lib.top.embedded_channel
Output file: vunit_out/test_output/lib.top.embedded_channel_832b083981f2f139055d7c5f89c31a8f31c50987/output.txt
Starting lib.top.arl_filter
Output file: vunit_out/test_output/lib.top.arl_filter_885977bf85dcef8dea96c4889ec24ec5d0a74db1/output.txt
^C^C
==== Summary ====================================
skip lib.top.arl_filter       (12.3 seconds)
skip lib.top.embedded_channel (12.4 seconds)
skip lib.top.serial_channel   (12.4 seconds)
skip lib.top.ethernet_channel (12.4 seconds)
=================================================
pass 0 of 4
skip 4 of 4
=================================================
Total time was 49.5 seconds
Elapsed time was 12.5 seconds
=================================================
Some skipped!
WARNING: Test execution aborted after running 4 out of 11 tests
^CException ignored in: <bound method Process.__del__ of <vunit.ostools.Process object at 0x7fac161fcbe0>>
Traceback (most recent call last):
  File "/data/engtmp/jrwagner/python_scripts/vunit/vunit/ostools.py", line 243, in __del__
    self.terminate()
  File "/data/engtmp/jrwagner/python_scripts/vunit/vunit/ostools.py", line 238, in terminate
    self._reader.join()
  File "/opt/rh/rh-python36/root/usr/lib64/python3.6/threading.py", line 1056, in join
    self._wait_for_tstate_lock()
  File "/opt/rh/rh-python36/root/usr/lib64/python3.6/threading.py", line 1072, in _wait_for_tstate_lock
    elif lock.acquire(block, timeout):
KeyboardInterrupt
^CException ignored in: <module 'threading' from '/opt/rh/rh-python36/root/usr/lib64/python3.6/threading.py'>
Traceback (most recent call last):
  File "/opt/rh/rh-python36/root/usr/lib64/python3.6/threading.py", line 1294, in _shutdown
    t.join()
  File "/opt/rh/rh-python36/root/usr/lib64/python3.6/threading.py", line 1056, in join
    self._wait_for_tstate_lock()
  File "/opt/rh/rh-python36/root/usr/lib64/python3.6/threading.py", line 1072, in _wait_for_tstate_lock
    elif lock.acquire(block, timeout):
KeyboardInterrupt

@jrwagz
Copy link
Author

jrwagz commented Nov 13, 2018

Notice the two instances of ^C and then two instances of KeyboardInterrupt which correspond to when I was inputting ctrl-c

@kraigher
Copy link
Collaborator

Unfortunately I cannot reproduce this.
I have reviewed the changes between v4.0.5 and v.3.9.0 and I cannot spot anything in the environment which is related Ctrl-C during test running.

Could you run with --log-level=debug and post the output?

@jrwagz
Copy link
Author

jrwagz commented Nov 14, 2018

Hmm, ok. Sounds like it could very well possibly be an environment issue on my end. I’ll reboot my windows machine (I’m using Cygwin on Windows 10 to ssh into the Linux machines running the actual code in question), and test again, and specifically test the two versions. I’ll report back. Thanks for the quick response.

@slaweksiluk
Copy link
Contributor

I've also encountered this problem a couple of times. I will paste debug log when it will happen again.

@jrwagz
Copy link
Author

jrwagz commented Nov 14, 2018

Ok, I performed a reboot, and I still get the same behavior. So I again ran with the debug output as requested. Here is the results:

The command I used was: python3 ./run.py -p 4 -x vunit_sim_results.xml --log-level=debug

I've skipped lots of the debug output that occurred before the tests started running, here is the output from the point where it determined that no recompile was necessary till the end. Again, I needed to press ctrl-c 4 times for it to completely stop, and each time you'll see the ^C in the log indicating where that occurred.

Re-compile not needed

Starting lib.top.serial_channel
Output file: vunit_out/test_output/lib.top.serial_channel_fe65354585007cab7ed52542326197aee220c79b/output.txt
Starting lib.top.ethernet_channel
Output file: vunit_out/test_output/lib.top.ethernet_channel_dc92795136e4223f7bc16a18cdbbc55d157b4a35/output.txt
Starting lib.top.embedded_channel
Output file: vunit_out/test_output/lib.top.embedded_channel_832b083981f2f139055d7c5f89c31a8f31c50987/output.txt
Starting lib.top.arl_filter
Output file: vunit_out/test_output/lib.top.arl_filter_885977bf85dcef8dea96c4889ec24ec5d0a74db1/output.txt
  DEBUG - Started process with pid=15562: '/apps/questasim-10.2d/questasim/linux_x86_64/vsim -c -l .../vunit_out/modelsim/transcript139838126200576 -do .../vunit/vunit/tcl_read_eval_loop.tcl'
  DEBUG - Started process with pid=15566: '/apps/questasim-10.2d/questasim/linux_x86_64/vsim -c -l .../vunit_out/modelsim/transcript139838070445824 -do .../vunit/vunit/tcl_read_eval_loop.tcl'
  DEBUG - Started process with pid=15572: '/apps/questasim-10.2d/questasim/linux_x86_64/vsim -c -l .../vunit_out/modelsim/transcript139838059955968 -do .../vunit/vunit/tcl_read_eval_loop.tcl'
  DEBUG - Started process with pid=15584: '/apps/questasim-10.2d/questasim/linux_x86_64/vsim -c -l .../vunit_out/modelsim/transcript139838049466112 -do .../vunit/vunit/tcl_read_eval_loop.tcl'
^C  DEBUG - Terminating process with pid=15562
  DEBUG - Killing process with pid=15562
  DEBUG - Waiting for process with pid=15562
  DEBUG - Waiting for process with pid=15562 to stop
  DEBUG - Process with pid=15562 terminated with code=-9
^C  DEBUG - TestRunner: Caught Ctrl-C shutting down
  DEBUG - Terminating process with pid=15566
  DEBUG - Terminating process with pid=15584
  DEBUG - Killing process with pid=15566
  DEBUG - Waiting for process with pid=15566
  DEBUG - Terminating process with pid=15572
  DEBUG - Killing process with pid=15584
  DEBUG - Waiting for process with pid=15584
  DEBUG - Killing process with pid=15572
  DEBUG - Waiting for process with pid=15572
  DEBUG - TestRunner: Leaving

  DEBUG - _main: Caught Ctrl-C shutting down
==== Summary ====================================
skip lib.top.arl_filter       (19.2 seconds)
skip lib.top.serial_channel   (19.3 seconds)
skip lib.top.embedded_channel (19.3 seconds)
skip lib.top.ethernet_channel (19.4 seconds)
=================================================
pass 0 of 4
skip 4 of 4
=================================================
Total time was 77.2 seconds
Elapsed time was 19.4 seconds
=================================================
Some skipped!
WARNING: Test execution aborted after running 4 out of 10 tests
  DEBUG - Process with pid=15562 terminated with code=-9
^CException ignored in: <bound method Process.__del__ of <vunit.ostools.Process object at 0x7f2e9684f470>>
Traceback (most recent call last):
  File "../../git_cores/python_scripts/vunit/vunit/ostools.py", line 243, in __del__
    self.terminate()
  File "../../git_cores/python_scripts/vunit/vunit/ostools.py", line 238, in terminate
    self._reader.join()
  File "/opt/rh/rh-python36/root/usr/lib64/python3.6/threading.py", line 1056, in join
    self._wait_for_tstate_lock()
  File "/opt/rh/rh-python36/root/usr/lib64/python3.6/threading.py", line 1072, in _wait_for_tstate_lock
    elif lock.acquire(block, timeout):
KeyboardInterrupt
  DEBUG - Process with pid=15566 terminated with code=-9
  DEBUG - Process with pid=15572 terminated with code=-9
  DEBUG - Process with pid=15584 terminated with code=-9
^CException ignored in: <module 'threading' from '/opt/rh/rh-python36/root/usr/lib64/python3.6/threading.py'>
Traceback (most recent call last):
  File "/opt/rh/rh-python36/root/usr/lib64/python3.6/threading.py", line 1294, in _shutdown
    t.join()
  File "/opt/rh/rh-python36/root/usr/lib64/python3.6/threading.py", line 1056, in join
    self._wait_for_tstate_lock()
  File "/opt/rh/rh-python36/root/usr/lib64/python3.6/threading.py", line 1072, in _wait_for_tstate_lock
    elif lock.acquire(block, timeout):
KeyboardInterrupt

@jrwagz
Copy link
Author

jrwagz commented Nov 14, 2018

It's completely possible that this was present in 3.9.0 and that I just happened to never notice it. I had only been using 3.9.0 for a couple weeks tops.

@jrwagz
Copy link
Author

jrwagz commented Nov 14, 2018

I'm using python 3.6.3:

python3 --version
Python 3.6.3

@kraigher
Copy link
Collaborator

I will add more detailed debug logging tomorrow and ask you to try again.

@kraigher
Copy link
Collaborator

I was able to reproduce an instance of the problem now and fixed it.
I hope it was the same problem instance as you experience.
Also I added more debug logging in case it was not fixed.
Could you try it out and report back?

@jrwagz
Copy link
Author

jrwagz commented Nov 15, 2018

Thanks! That definitely affected the behavior I was seeing, now only a single ctrl-c was needed to abort the process. Here is the output, just for reference:

Starting lib.top.test1
Output file: vunit_out/test_output/lib.top.test1_fe65354585007cab7ed52542326197aee220c79b/output.txt
Starting lib.top.test2
Output file: vunit_out/test_output/lib.top.test2_dc92795136e4223f7bc16a18cdbbc55d157b4a35/output.txt
Starting lib.top.test3
Output file: vunit_out/test_output/lib.top.test3_832b083981f2f139055d7c5f89c31a8f31c50987/output.txt
Starting lib.top.test4
Output file: vunit_out/test_output/lib.top.test4_885977bf85dcef8dea96c4889ec24ec5d0a74db1/output.txt
  DEBUG - Started process with pid=23497: '/apps/questasim-10.2d/questasim/linux_x86_64/vsim -c -l .../vunit_out/modelsim/transcript139694629865216 -do .../vunit/tcl_read_eval_loop.tcl'
  DEBUG - Started process with pid=23500: '/apps/questasim-10.2d/questasim/linux_x86_64/vsim -c -l .../vunit_out/modelsim/transcript139694619375360 -do .../vunit/tcl_read_eval_loop.tcl'
  DEBUG - Started process with pid=23511: '/apps/questasim-10.2d/questasim/linux_x86_64/vsim -c -l .../vunit_out/modelsim/transcript139694760331008 -do .../vunit/tcl_read_eval_loop.tcl'
  DEBUG - Started process with pid=23519: '/apps/questasim-10.2d/questasim/linux_x86_64/vsim -c -l .../vunit_out/modelsim/transcript139694608885504 -do .../vunit/tcl_read_eval_loop.tcl'
^C  DEBUG - MainWorkerThread: Caught Ctrl-C shutting down
  DEBUG - TestRunner: Caught Ctrl-C shutting down
  DEBUG - ProgramStatus.shutdown
  DEBUG - TestRunner: Leaving

  DEBUG - _main: Caught Ctrl-C shutting down
==== Summary ====================================
skip lib.top.test4 (21.0 seconds)
skip lib.top.test3 (21.1 seconds)
skip lib.top.test1 (21.1 seconds)
skip lib.top.test2 (21.1 seconds)
=================================================
pass 0 of 4
skip 4 of 4
=================================================
Total time was 84.2 seconds
Elapsed time was 21.1 seconds
=================================================
Some skipped!
WARNING: Test execution aborted after running 4 out of 10 tests
  DEBUG - PersistentTclShell.__del__: Ignoring KeyboardInterrupt
  DEBUG - Terminating process with pid=23497
  DEBUG - Killing process with pid=23497
  DEBUG - Waiting for process with pid=23497
  DEBUG - Process.__del__: Ignoring KeyboardInterrupt
  DEBUG - Terminating process with pid=23500
  DEBUG - Killing process with pid=23500
  DEBUG - Waiting for process with pid=23500
  DEBUG - Process.__del__: Ignoring KeyboardInterrupt
  DEBUG - Terminating process with pid=23511
  DEBUG - Killing process with pid=23511
  DEBUG - Waiting for process with pid=23511
  DEBUG - Process.__del__: Ignoring KeyboardInterrupt
  DEBUG - Terminating process with pid=23519
  DEBUG - Killing process with pid=23519
  DEBUG - Waiting for process with pid=23519
  DEBUG - Process.__del__: Ignoring KeyboardInterrupt

@jrwagz jrwagz closed this as completed Nov 15, 2018
@lunamoonli
Copy link

@kraigher Hi I'm just wondering how you fixed the issue? I seem to be experiencing the same error

@a-panella
Copy link

a-panella commented Jul 7, 2021

Hi all, just bumping this thread because like @lunamoonli I have the same issue. On older versions of vunit that I can remember, this never occurred. Now it occurs when I use python v3.8.10, installed vunit-hdl via pip3, and and using ghdl with the gcc backend
It happen for every single vunit TB I have written, but it consistently happens for every TB that is broken. Here is some output:

^CException ignored in: <module 'threading' from '/usr/lib/python3.8/threading.py'>
Traceback (most recent call last):
  File "/usr/lib/python3.8/threading.py", line 1388, in _shutdown
    lock.acquire()
KeyboardInterrupt: 

Will run an update and see if that fixes it....

Edit: It seems that the lockups are due to running distributed processes, and not waiting until starting the second one. I.e your process that uses

if running_test_case = "x" then ...

should wait until a start condition, and then start. This fixed the lockup for me. Example:

checker_proc : process begin
    wait until start_stim; --important!
    if running_test_case = "test_1" then

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants