MSVCR - aborting on theories test as worker_set is NULL in worker_read_event #58

Closed
am11 opened this issue Sep 25, 2015 · 12 comments


@am11
Contributor

am11 commented Sep 25, 2015

On Windows x64 with VS2015, the current bleeding branch (ref. 4cc826e) aborts when running the theories test with the following command sequence:

:: in cmd
c: && mkdir \temp && cd \temp
git clone https://github.com/Snaipe/Criterion --recursive
cd Criterion
cmake . -DCTESTS=ON
cmake --build .
cmake --build . --target criterion_tests
ctest

The following dialog is shown when building with the Debug configuration, and debugging with the IDE suggests that it is aborting from src/core/worker.c#L72 (in Release, it shows a different, more cryptic message):

(screenshot of the error dialog)

Here is the call stack:

    ucrtbased.dll!issue_debug_notification(const wchar_t * const message) Line 125  C++
    ucrtbased.dll!__acrt_report_runtime_error(const wchar_t * message) Line 142 C++
    ucrtbased.dll!abort() Line 51   C++
>   criterion.dll!worker_read_event(worker_set * workers, _iobuf * pipe) Line 72    C
    criterion.dll!run_tests_async(criterion_test_set * set, criterion_global_stats * stats) Line 357    C
    criterion.dll!criterion_run_all_tests_impl(criterion_test_set * set) Line 396   C
    criterion.dll!criterion_run_all_tests(criterion_test_set * set) Line 419    C
    criterion.dll!main(int argc, char * * argv) Line 9  C
    theories.c.bin.exe!invoke_main() Line 74    C++
    theories.c.bin.exe!__scrt_common_main_seh() Line 264    C++
    theories.c.bin.exe!__scrt_common_main() Line 309    C++
    theories.c.bin.exe!mainCRTStartup() Line 17 C++
    kernel32.dll!@BaseThreadInitThunk@12() Unknown
    ntdll.dll!__RtlUserThreadStart()    Unknown
    ntdll.dll!__RtlUserThreadStart@8() Unknown

While still at the breakpoint, here is what the locals look like:

+       workers 0x0055f8cc {workers=0x0074a3f0 {0x00000000 <NULL>} max_workers=8 }  worker_set *
+       pipe    0x00754e50 {_Placeholder=0x00754e65 }   _iobuf *
+       ev  0x00756100 {pid=144115188075855918 kind=-1006632960 data=0x00000000 ...}    event *
@Snaipe
Owner

Snaipe commented Sep 25, 2015

Just to be sure, does running the test with -j1 still make the runner hit abort()?

@am11
Contributor Author

am11 commented Sep 25, 2015

Just to be sure, does running the test with -j1 still make the runner hit abort()?

Yup. Here is the working directory: http://1drv.ms/1PDoaZg

@Snaipe
Owner

Snaipe commented Sep 26, 2015

Huh.

I downloaded your work directory and ran the test -- everything works fine.
I triple-checked that it was using the correct .dll, and sure enough, it's using the one I downloaded.

... I hope I'm not going to lose my mind on this. In any case, I don't see how I can help beyond that. I'm going to check whether this happens on my Windows 7 setup, but I think this has something to do with your setup, somehow...

Could you maybe try this:

  1. log in to Windows as another user
  2. grab theories.cc.bin.exe
  3. put the .exe in its own directory
  4. execute the .exe (if it does not complain about not being able to load criterion.dll, then it's picking up a criterion.dll from somewhere else, and that's most likely the issue)
  5. grab criterion.dll, and put it in the same directory as the .exe
  6. execute it again

and report your findings?

@am11
Contributor Author

am11 commented Sep 26, 2015

Thanks for the steps, and sure thing; I will report my findings soon. I am actually working on two virtual machines. Once I finish the tasks at hand, I will reboot the system and take a fresh stab at it.

Incidentally, I closed the running command prompt and copied theories.cc.bin.exe to a separate directory. When I ran it from File Explorer (double-clicking the exe), it complained about a missing dll. Then I placed c:\temp\Criterion\Debug\criterion.dll next to the relocated theories.cc.bin.exe; it loaded successfully but aborted with the same error. Could it be that my system resources are overloaded at the moment (I am running 2 VMs and many other projects on the host), so that some memory corruption or thread overlapping is taking place?

Nonetheless, I will keep you apprised if I find anything interesting. Meanwhile, this can be considered a "maybe bug", with low probability. :)

@am11
Contributor Author

am11 commented Sep 26, 2015

Finally got a chance to reboot. On my system, it still hits the same error, but I cannot reproduce it on a Windows 10 VM with the same steps. I will try to investigate what went wrong with my setup.

As this issue is specific to my system, I presume this ticket should be closed?

@Snaipe
Owner

Snaipe commented Sep 26, 2015

I'm going to leave this open a bit longer until we get more information; maybe there is a bug somewhere that only manifests itself under very specific conditions.

I'm currently testing multiple low-resource configurations to see if that is the cause.

@Snaipe
Owner

Snaipe commented Sep 26, 2015

Ok, it seems that I am able to reproduce it sometimes.

I have configured the Windows 10 VM to use 1GB of RAM, and 2 cores running with a 50% resource attribution.

From a git shell, I am running: while ./theories.cc.bin.exe -j8 --no-early-exit; do true; done. Eventually the same abort() is hit. Without -j8, the error does not seem to trigger.

Edit: I have further lifted the resources to 2GB of RAM, and 2 dedicated cores; it still happens with the command above.

@Snaipe Snaipe added Normal and removed Unconfirmed labels Sep 26, 2015
@Snaipe
Owner

Snaipe commented Sep 26, 2015

I've linked the error to the effective number of maximum jobs. Calling ./theories.cc.bin.exe -j30 immediately triggers the abort(). This should provide enough ground to work with.

@Snaipe Snaipe added the Windows label Sep 26, 2015
@Snaipe
Owner

Snaipe commented Sep 26, 2015

After further investigation, it seems that the problem arises from the pipe created by Windows. Currently, to create a pipe, CreatePipe is invoked with a size of 0, which means that it uses a default that is system and context dependent (which is probably why I wasn't able to reproduce it at first). I suspect that, when writing the event from the test to the runner, not everything gets written to the pipe, most likely because it happens to be full.

Explicitly specifying a large size for the buffer (like 10 * 4096) fixes the issue, but would be the wrong way of fixing it. What I don't understand is why writing to a full pipe does not block the fwrite call, when CreatePipe returns file handles that are explicitly blocking.
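
For readers following along, here is a minimal sketch of the usual defensive pattern for the failure mode suspected above: loop until the whole fixed-size record has been written or read, so a short transfer never leaves the receiver with a half-filled struct. This is not the change made on fix/58 or fix/58-alt (the thread does not describe those commits). It uses POSIX pipe(), read() and write() so that it runs anywhere a POSIX toolchain is available; on Windows, the same loops would wrap ReadFile and WriteFile. The names demo_event, write_full, read_full and roundtrip_demo are hypothetical and do not come from Criterion.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical record layout, for illustration only (not Criterion's event). */
struct demo_event {
    unsigned long long pid;
    int kind;
};

/* Write exactly len bytes to fd. Returns 0 on success, -1 on error. */
static int write_full(int fd, const void *buf, size_t len)
{
    const char *p = buf;
    while (len > 0) {
        ssize_t n = write(fd, p, len);
        if (n < 0) {
            if (errno == EINTR)
                continue;          /* interrupted, retry */
            return -1;
        }
        p += n;                    /* a short write just advances the cursor */
        len -= (size_t) n;
    }
    return 0;
}

/* Read exactly len bytes from fd. Returns 0 on success,
 * -1 on error or if EOF arrives before the record is complete. */
static int read_full(int fd, void *buf, size_t len)
{
    char *p = buf;
    while (len > 0) {
        ssize_t n = read(fd, p, len);
        if (n < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        if (n == 0)
            return -1;             /* truncated record */
        p += n;
        len -= (size_t) n;
    }
    return 0;
}

/* Sends one record through a pipe in two pieces, then reads it back whole. */
int roundtrip_demo(void)
{
    int fds[2];
    struct demo_event in, out;
    size_t half = sizeof in / 2;

    memset(&in, 0, sizeof in);
    memset(&out, 0, sizeof out);
    in.pid = 1234ULL;
    in.kind = 7;

    if (pipe(fds) != 0)
        return -1;

    /* Two separate writes mimic a record that arrives split. */
    if (write_full(fds[1], &in, half) != 0 ||
        write_full(fds[1], (const char *) &in + half, sizeof in - half) != 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    close(fds[1]);

    int rc = read_full(fds[0], &out, sizeof out);
    close(fds[0]);
    if (rc != 0)
        return -1;

    assert(out.pid == in.pid);
    assert(out.kind == in.kind);
    return 0;
}
```

With loops like these, the reader either gets a complete record or a clear error, rather than a struct with garbage fields such as the pid and kind values in the locals dump earlier in this thread.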

@Snaipe
Owner

Snaipe commented Sep 26, 2015

@am11 I'm trying an experimental fix that so far hasn't triggered the error on my end; could you try it on your system? The fix is on the fix/58 branch.

Edit: I tried another (and probably more correct) approach on the fix/58-alt branch, which also doesn't trigger the abort(). Could you test that one on your end as well?

@am11
Contributor Author

am11 commented Sep 27, 2015

@Snaipe, I can confirm that neither the fix/58 nor the fix/58-alt branch exhibits this issue, which I still get on the same system with the tip of bleeding (f1dfff5 as of now).
Thanks for the quick fixes! 👍

@Snaipe
Owner

Snaipe commented Sep 27, 2015

Great. I'm merging fix/58-alt, as I suspect the commit on fix/58 works more by accident and does not address the root cause, which fix/58-alt does.

@Snaipe Snaipe closed this as completed Sep 27, 2015
@Snaipe Snaipe modified the milestone: v2.2.0 Nov 27, 2015