Simulation models compiled with -shared do not exit cleanly upon failure #803

Closed
umarcor opened this issue Apr 25, 2019 · 5 comments

Comments

@umarcor
Member

umarcor commented Apr 25, 2019

Ref #640, #670, #800, #804, #805, #1053 and #1398.

It is possible to dynamically load a PIE binary or shared library generated by GHDL. This is useful for co-simulation, as it allows memory buffers to be allocated in any language compatible with C-like objects (C/C++, Ada, Python, Golang, Rust, etc.) and processed by a GHDL simulation without intermediate files. It is also possible to inspect shared buffers which are used in the VHDL design. See ghdl/ghdl-cosim.


Currently, a dynamically loaded simulation that fails aborts the whole process. This forces any C or Python wrapper/caller to exit immediately, without running any post-checks. See ghdl/ghdl-cosim#15.

@tgingold:
About abort(): I don't think it is used during normal operations (even simulation failure). A reproducer would be useful.


@tgingold
For the mcode backend, maybe a shared library of ghdl_mcode should be provided, just as Python provides a shared library of its interpreter.
I will think about it.


-fPIC -pie -fPIE

@umarcor umarcor changed the title Dynamically loading design binaries : shared libraries co-simulation: dynamically loading designs/artifacts built with ghdl Sep 14, 2019
@umarcor umarcor mentioned this issue Oct 13, 2019
1 task
@umarcor umarcor changed the title co-simulation: dynamically loading designs/artifacts built with ghdl Simulation models compiled with -shared do not exit cleanly upon failure Jan 24, 2021
@umarcor
Member Author

umarcor commented Jan 24, 2021

Description

This is a MWE to illustrate the issue in ghdl/ghdl-cosim#15:

How to reproduce?

entity tb is
end entity;

architecture arch of tb is
begin
  process begin
    report "Hello!" severity failure;
    wait;
  end process;
end;

--:file: tb.vhd
#include <dlfcn.h>
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char** argv) {

  // Load the shared library produced by 'ghdl -e'
  void* h = dlopen("./tb.so", RTLD_LAZY);
  if (!h){
    fprintf(stderr, "%s\n", dlerror());
    exit(1);
  }

  typedef int main_t(int, char**);

  // Look up the simulation entry point
  main_t* ghdl_main = (main_t*)dlsym(h, "ghdl_main");
  if (!ghdl_main){
    fprintf(stderr, "%s\n", dlerror());
    exit(2);
  }

  // Not reached on failure: the process exits inside ghdl_main
  printf("ghdl_main return: %d\n", ghdl_main(argc, argv));

  dlclose(h);

  return 0;

}

//:file: main.c
#!/usr/bin/env sh

gcc main.c -o main -ldl
ghdl -a tb.vhd
ghdl -e -Wl,-fPIC -Wl,-shared -Wl,-Wl,-u,ghdl_main -o tb.so tb
./main

#:file: run.sh

The output is the following:

# ./test.sh 
tb.vhd:7:5:@0ms:(report failure): Hello!
D:\tmp-abort\main.exe:error: report failed
in process .tb(arch).P0
D:\tmp-abort\main.exe:error: simulation failed

If the report in the VHDL source is changed to error, the output is the following:

# ./test.sh 
tb.vhd:7:5:@0ms:(report error): Hello!
ghdl_main return: 0

Expected behaviour

I would expect the output when failure to be:

# ./test.sh 
tb.vhd:7:5:@0ms:(report failure): Hello!
D:\tmp-abort\main.exe:error: report failed
in process .tb(arch).P0
D:\tmp-abort\main.exe:error: simulation failed
ghdl_main return: 1

Or any return code other than 0. However, the problem is that the wrapper (the C program) exits immediately, so the ghdl_main return: 1 line (or any code after the call to ghdl_main) is not executed.

Context

MINGW64:

# ghdl --version
GHDL 1.0-dev (0.37.0.r1342.geeb25ef2) [Dunoon edition]
 Compiled with GNAT Version: 10.2.0
 llvm code generator
Written by Tristan Gingold.

Copyright (C) 2003 - 2021 Tristan Gingold.
GHDL is free software, covered by the GNU General Public License.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

Additional context

Other tests (https://github.com/ghdl/ghdl-cosim/runs/1738969727?check_suite_focus=true#step:3:1558) show that an abort signal (SIGABRT) is generated:

   /usr/local/lib/python3.7/dist-packages/vunit/vhdl/core/src/stop_body_93-2002.vhd:10:5:@0ms:(report failure): Stopping simulation with status 0
  /src/vhpidirect/quickstart/wrapping/exitcb/py/vunit_out/test_output/lib.tb_abrt.all_775e0a86ed093bdf996e118af0a0791ccd72f0fd/ghdl/tb_abrt-tb:error: report failed
  in process .tb_abrt(tb).main
  SIGABRT caught 6!
  Aborted (core dumped)

@tgingold
Member

tgingold commented Jan 25, 2021 via email

@umarcor
Member Author

umarcor commented Jan 25, 2021

😆 Did I tell you that you are sometimes so annoyingly good? You are obviously correct. In fact, you can test this on Linux as-is.

Yet, dlfcn.h is available on MSYS2:

# pacman -F dlfcn.h
mingw32/mingw-w64-i686-dlfcn 1.2.0-1
    mingw32/include/dlfcn.h
mingw32/mingw-w64-i686-postgresql 12.4-1
    mingw32/include/postgresql/server/port/win32/dlfcn.h
mingw32/mingw-w64-i686-python-autopxd2 1.1.0-1
    mingw32/lib/python3.8/site-packages/autopxd/include/dlfcn.h
mingw64/mingw-w64-x86_64-dlfcn 1.2.0-1 [installed: 1.2.0-2]
    mingw64/include/dlfcn.h
mingw64/mingw-w64-x86_64-postgresql 12.4-1
    mingw64/include/postgresql/server/port/win32/dlfcn.h
mingw64/mingw-w64-x86_64-python-autopxd2 1.1.0-1
    mingw64/lib/python3.8/site-packages/autopxd/include/dlfcn.h
msys/msys2-runtime-devel 3.1.7-4 (msys2-devel) [installed]
    usr/include/dlfcn.h

Specifically, it is provided by mingw-w64-x86_64-dlfcn. You can install it with pacman -S mingw-w64-x86_64-dlfcn. My guess is that it's a wrapper around Windows' native DLL open/close features.

FTR, this is what I use in Python for achieving the same result: https://github.com/ghdl/ghdl-cosim/blob/master/vhpidirect/shared/pycb/utils.py#L10-L34

Anyway, the code above is extracted from https://github.com/ghdl/ghdl-cosim/actions/runs/500013976. There you can see it tested on Windows and Linux, using different versions of GHDL installed through different procedures; the examples include C-only ones and also Python-based ones. This error/bug seems to be consistent across all runs, regardless of the platform and the wrapper language. The difference is that C allows capturing abort signals, whereas Python does not.

@tgingold
Member

tgingold commented Jan 25, 2021 via email

@umarcor
Member Author

umarcor commented Jan 25, 2021

My intuitive guess is that GHDL binaries can exit abruptly on failure. Note that the "crash" is not produced with reports of level note or error. However, exiting abruptly from a shared library is undesirable. Therefore, this is not a bug per se, but a missing enhancement resulting from the addition of -shared to GHDL's elaboration options some months ago.

Yet, this issue predates the -shared option being added to GHDL. Before that, I used GCC for elaboration (through --bind and --list-link), which produced the same failure/abort. It was not considered a bug back then because loading GHDL models dynamically was explicitly hackish.

As a result, my guess is that GHDL should handle failure reports similarly to error: exiting, but not aborting. Maybe the return value can be different so that users can tell them apart.

These are the places I tried to look at:
