sjlj vs dwarf. Why should be sjlj preferred over dwarf (especially for C code) #20

Closed
ZsoltKantor opened this issue Jun 26, 2020 · 57 comments

Comments

@ZsoltKantor

ZsoltKantor commented Jun 26, 2020

Hello to everybody!

I will provide some info about sjlj and I can also give some arguments supporting the idea that gcc sjlj variants should be released.

The sjlj functionality is specified in the C standard from C90 onwards.
The standard describes sjlj (setjmp/longjmp) as a way to jump from one function to another (goto can only jump within the current function).
Also, if somebody works only with C code, dwarf exception handling is probably useless because, as far as I know, only C++ code can use dwarf exceptions. In fact, the C programming language does not even have the concept of exception handling. SJLJ is defined in the C standard as a way to do "long jumps" in the code, not as an exception handling mechanism. It is true that an exception handling mechanism can be implemented on top of the sjlj functionality, so exception handling can be emulated in C code.

Also, if you work with pure C code, dwarf is a waste of space and execution time.
C code compiled with a dwarf-enabled compiler gets extra sections and extra function calls.
So the code size increases, and there is no gcc option to completely exclude the dwarf sections and code from the final binary.
The gcc dwarf compiler inserts 4 functions into your code, one of which is called even if you don't use exception handling in your program. This extra call is GetModuleHandleA(), which is called in every dwarf-enabled program before the main function is called.

So my opinion and conclusion is that gcc variants with sjlj should also be released for newer gcc compilers (e.g. 10.1.0), because if somebody writes code only in the C language, the sjlj variant is preferable to dwarf, which, remember, adds no benefits to C code.

@ZsoltKantor ZsoltKantor changed the title sjlj vs dwarf. Why should be sjlj preferred over dwarf sjlj vs dwarf. Why should be sjlj preferred over dwarf (especially for C code) Jun 26, 2020
@brechtsanders
Owner

See also an earlier discussion in issue #4 .

For Windows 64-bit, SEH was chosen over SJLJ because it is the native Windows exception handling and winlibs aims to be as native as possible. It also has less overhead.
For Windows 32-bit, Dwarf was chosen because SEH is not yet available on 32-bit and SJLJ introduces more overhead.

Considering the reality of what people actually want by looking at the downloads from old builds at:
https://sourceforge.net/projects/mingw-w64/files/Toolchains%20targetting%20Win32/Personal%20Builds/mingw-builds/8.1.0/threads-posix/
https://sourceforge.net/projects/mingw-w64/files/Toolchains%20targetting%20Win64/Personal%20Builds/mingw-builds/8.1.0/threads-posix/
the SJLJ downloads are less than 4% and 9% of the respective builds.

I don't like the idea of offering both SJLJ and Dwarf or SEH downloads because then a lot of users have no idea what they should be downloading. Another reason is that winlibs.com has further plans to make a build ecosystem with thousands of packages, and offering the choice would imply that all (C++) packages would need to be built in all the variations of GCC (32-bit/64-bit, Dwarf/SEH/SJLJ).

Can you tell me if there are examples of any open source packages that don't build with non-SJLJ exception handling GCC?

@ZsoltKantor
Author

I understand your position, and what you are saying is true, but consider the following case.
I'm a C programmer developing a C-only program without exception handling. As in a traditional C program, errors are checked by reading the function return values or the errno variable (or both). In this case dwarf is not a good choice because it increases the code size, inserts extra exception handling sections in the code and calls a function (only one with the latest dwarf + gcc versions, but what if this changes in the future and more functions are called at init?).
I saw the download statistics. It could also be that most people downloaded dwarf because they read that it has no overhead compared to sjlj, but maybe they did not know that if you don't use exception handling in your code, sjlj produces smaller binaries.
I'm talking about sjlj and dwarf, not seh. I know seh is the native exception handling for Windows, and that is fine.
I also understand your effort; it is hard and takes a lot of time to produce so many variants. But if somebody writes pure C code, I think dwarf is not a good option (as I mentioned, as far as I know C can't use dwarf exception handling at all; please do some more research if you can, to be sure).
I'm speaking only from the C programmer's point of view. Of course, on 32-bit dwarf is a good choice if you want to be generic (C and C++). But for a C programmer the sjlj variant fits better because of the above arguments.
For 64-bit, sure, seh is a good choice.
Normally there should be no project that builds only with sjlj, but if a project that doesn't use exception handling is compiled with dwarf, there will be unnecessary data in your object and binary files.

It is up to you how you produce your releases; I just wrote down my opinion and some facts about dwarf.
Yes, I also know that sjlj is slow, but don't forget I'm talking about programs and projects written in C, not C++.

@brechtsanders
Owner

Thank you for your insights.
As you say, exception handling is for C++ and not for C.
I doubt that, for C, where no exception handling exists, GCC would add overhead for something that isn't used.

To check this I took the C program Mbrowse, which I built, together with all of its dependencies, with my 32-bit compiler (which uses Dwarf exception handling).
The resulting program, mbrowse.exe, has absolutely no dependency on libgcc_s_dw2-1.dll, which would be present if any Dwarf stuff was compiled/linked into the .exe.
So I don't see evidence for your statement that C is impacted by the exception handling configured in the GCC build.

image

@ZsoltKantor
Author

ZsoltKantor commented Jun 26, 2020

No, I did not say dwarf adds a dll dependency, but that it inflates the code by adding additional sections (the eh_frame sections) and extra function references.

I did a test for better understanding. Using the same compiler in two variants: gcc 7.3.0 32bit dwarf and gcc 7.3.0 32bit sjlj
Compile command: gcc test.c
test.c contains an empty main function.

Produced binaries:
a_dwarf.exe -> 47710 bytes (46KB)
a_sjlj.exe -> 43235 bytes (42KB)

To see what I'm talking about, you need to use the objdump command with the -x option to display all sections, symbols and functions that are referenced in the program. I redirected the output to 2 separate files, a_dwarf.txt and a_sjlj.txt.

I will show just the differences that I pointed out earlier (there are minor differences which should not be taken into consideration now).

Below you can see the extra functions referenced in the dwarf binary.
Fortunately, of those 4 functions only GetModuleHandleA is actually called. You can see on the right side that the sjlj binary doesn't reference such functions.
[Screenshot 1]

Here you can see the extra section compiled in the dwarf file (eh_frame)
[Screenshot 2]

And here you can see the symbols which exist only in the dwarf variant. _deregister_frame_fn should be a function.
The .eh_frame symbol is present multiple times with different addresses. I'm displaying only two occurrences of it.
[Screenshot 3]

These are the differences. So if you are writing C code, then what is marked in red is useless in your code.
(it is useful only if you write C++ code with exceptions)

I also did a quick comparison between 64-bit seh and sjlj code under the same test conditions. There are no differences between the seh and sjlj symbols and sections, so seh does not bloat the code.

@tomay3000

tomay3000 commented Jun 26, 2020

Sorry to interfere, but I felt like joining the party.
So I have some opinions too:

  • Exception handling is just for C++ not C.
  • SEH exception handling is the most compatible choice for 64 bits.
  • It is said that DWARF is faster than SJLJ for 32 bits, but the difference is not noticeable to the eye, even in big GUI applications.
  • Firebird SQL https://firebirdsql.org/en/about-firebird/ is a cross-platform open source RDBMS project written in C++ (which I personally use as my main RDBMS).
    The thing about it is that their dev team dropped support for gcc under Windows, so our only option is to generate a .dll.a link library from its .dll shared one:
gendef fbclient.dll
dlltool --dllname fbclient.dll --input-def fbclient.def --output-lib libfbclient.dll.a --kill-at

And if you try to link to that generated libfbclient.dll.a link library using the DWARF exception handling model, then your built application is likely to crash when exceptions are caught. I faced this problem a lot when using DWARF, and when I switched back to SJLJ, the built application ran very smoothly.

  • And, finally, what is the point of using Winlibs GCC toolchains if they provide exactly the same models as MSYS2?

Thanks for your understanding.

@ZsoltKantor
Author

ZsoltKantor commented Jun 26, 2020

for tomay3000

Thanks for your intervention. You hit the nail on the head when you mentioned the possible crash issues with DWARF on Windows. I knew about this aspect (I had read about it), but forgot to mention it.

My opinion is that for 32-bit C code SJLJ should be used. Why? SEH is not available for 32-bit. C can't use DWARF exception handling plus (maybe useless to say when the previous statement is true) DWARF bloats C code.
For 32-bit C++: DWARF is faster but code can crash, SJLJ may be slower, but runs without issues.
For 64-bit C/C++ code probably SEH should be used. DWARF is not available for 64-bit, SJLJ is slower than SEH.

@brechtsanders
Owner

brechtsanders commented Jun 27, 2020

You guys give compelling arguments. You convinced me. I was never really a fan of Dwarf, but chose it because talking to the end-users seemed to point in its direction.

In an ideal world there would be SEH support for Windows 32-bit in GCC. Unfortunately I believe this was initially not started due to patent issues. These issues no longer exist, but attempts to start GCC support for SEH for the MinGW i386 platform didn't really take off (see: https://gcc.gnu.org/legacy-ml/gcc-help/2008-02/msg00117.html).

For my next build I will look into switching GCC for 32-bit Windows to SJLJ exception handling, but as I also try to build a matching LLVM/CLang I will also need to look into how it's supported over there.

I will keep this issue open to inform you how things are going.

@nenin-sc

Dear brechtsanders, please don't abandon the DWARF version for 32-bit. The limitations of DWARF are well known, however it is better than SJLJ for a lot of reasons. I have C++ software which runs data acquisition 24x7x365, linked against 3+ hardware drivers from various vendors, including the monstrous NI VISA, and I have never observed any "crash issues with DWARF" (nor with SJLJ, but I dropped it as soon as the first versions of MinGW with DWARF support appeared).

Dear ZsoltKantor, the size of the empty-main exe is ~13K (gcc-8.1.0). You can check it using the "-O2 -static -s" options, which are more or less usual for "Release".

@brechtsanders
Owner

brechtsanders commented Jun 27, 2020

I also read (including here: https://stackoverflow.com/questions/15670169/what-is-difference-between-sjlj-vs-dwarf-vs-seh) that SJLJ makes execution a lot slower. Its advantage is mainly interoperation with other libraries that have no call-stack unwinding information, which is not a goal of winlibs.com; see the winlibs philosophy statement for more information.

Does anybody know more about the future of SEH support in GCC for Windows 32-bit?

@nenin-sc

nenin-sc commented Jun 27, 2020

Does anybody know more about the future of SEH support in GCC for Windows 32-bit?

I'm afraid that there is no future. It requires significant intervention in the gcc code: "Note that 32-bit SEH is stack-based and requires code-generation". So someone with the proper skills has to do it, but 32-bit x86 is mostly considered deprecated, and finding such a person is not easy.

@tomay3000

tomay3000 commented Jun 27, 2020

I suppose keeping the 3 of em (DWARF, SJLJ and SEH) is an optimal solution, it also overcomes Msys2 and tdm-gcc.
The only reason we don't compile em ourselves is that we don't have strong powerful computers to do so.
For example mine is only (i3 3rd gen CPU and 4 Gigs of RAM).

TIA.

@brechtsanders
Owner

For the winlibs.com project the idea for the future is to have a whole ecosystem of packages all built with the same compiler.
Separate builds are already needed for 32-bit and 64-bit.
Keeping around 2 EH versions would result in 4 separate builds, and I'm not willing to do that.
Not yet at least. Maybe in the future when the whole thing builds in the cloud :-)

@ZsoltKantor
Author

ZsoltKantor commented Jun 28, 2020

Could somebody provide example code of dwarf exception handling to see how it works in practice?
Is it enough to use try-catch in C++? Or do you need to do some extra configuration in the source code and/or at compile time?

@ZsoltKantor
Author

ZsoltKantor commented Jun 28, 2020

Surprising results when timing try-catch exception handling in 32-bit code.

Used gcc 7.3.0 32-bit dwarf and sjlj compiler

The test code is:

    int main(void)
    {
        unsigned a = 0;
        do {
            try {
                throw 20;
            }
            catch (int e) {
                ; // doing nothing in the catch block
            }
        } while(++a < 1000000);
        return 0;
    }

I've used python time function to measure elapsed time of the do-while loop. I made the configuration in the gdbinit file, so the time measurement starts when GDB is started.
In the gdbinit file:

python from time import time
#the executable to load
file a.exe
#this was the address in a.exe where the loop starts
break *0x0040154e
run
python starttime=time()
continue
python print (time()-starttime)

As I read everywhere sjlj should be slower, but what I measured is the opposite

Average execution time for dwarf: 4.461 seconds
Average execution time for sjlj: 2.059 seconds

Somebody else should also try this with different gcc versions for 32-bit

P.S.
I also did a test with your gcc 9.3.0 32-bit dwarf and sjlj.
I used now performance counters to do the measurements (QueryPerformanceCounter). The results are the same.
This is very, very surprising: the sjlj try-catch code is faster than dwarf (am I doing something wrong? shouldn't dwarf be faster?).

Avg time dwarf: 3.885 sec.
Avg time sjlj: 1.573 sec.

@nenin-sc

Dear ZsoltKantor, the low efficiency of sjlj is a well-known issue, see for example https://sourceforge.net/p/mingw-w64/mailman/message/30532139/
or one reference from it https://bugreports.qt.io/browse/QTBUG-29653
There are a lot of reasons why your code is faster on sjlj. An obvious one is that this code is very simple.
PS. On my work PC (gcc 9.2 winlibs, i7-6700, Win 10) this code:


    #include <windows.h>
    #include <stdio.h>

    int main(int argc, char** argv)
    {
        LARGE_INTEGER startTime, endTime;
        LARGE_INTEGER frequency;
        QueryPerformanceFrequency(&frequency);
        QueryPerformanceCounter(&startTime);
        unsigned a = 0;
        do {
            try {
                throw 20;
            }
            catch (int e) {
                ; // doing nothing in the catch block
            }
        } while(++a < 1000000);
        QueryPerformanceCounter(&endTime);
        // elapsed time in milliseconds
        double rt = (endTime.QuadPart - startTime.QuadPart) * 1000.0 / frequency.QuadPart;
        printf("%f\nPress <ENTER>", rt/1000.0); // printed in seconds
        getchar();
        return 0;
    }

produces output like

1.152894
Press <ENTER>

@ZsoltKantor
Author

ZsoltKantor commented Jun 29, 2020

nenin-sc, shouldn't you compare SJLJ vs DWARF code?
Displaying just one value is not enough, and I don't know which compiler (sjlj or dwarf) you used.
The link you posted is old, it is from 2013 . . .
What if the SJLJ exception handling mechanism was revised, changed or enhanced in the meantime?

Or it can be that my tests are wrong . . .

@ZsoltKantor
Author

ZsoltKantor commented Jun 30, 2020

Made further tests comparing sjlj with dwarf (using gcc 7.3.0 though).
If an exception is thrown, sjlj is faster (6.630 sec. vs 17.186 sec.). Test done with 1000000 loops, averaged over 3 runs.
If no exception is thrown, dwarf is somewhat faster (2.180 sec. vs 2.220 sec.). Test done with 600000000 loops, averaged over 3 runs.

The code used for testing is below. Compile command used: g++ except.cpp -s -Wall
except_test.zip

@nenin-sc

Your code produced: 6.192514s on gcc version 8.1.0 (i686-win32-dwarf-rev0, Built by MinGW-W64 project) Thread model: win32 , win7 64b i7-4700S

@ZsoltKantor
Author

ZsoltKantor commented Jun 30, 2020

Test code updated. Test parameters can be controlled via macros.
except.zip

Your code produced: 6.192514s on gcc version 8.1.0 (i686-win32-dwarf-rev0, Built by MinGW-W64 project) Thread model: win32 , win7 64b i7-4700S

it is better to compare the two gcc variants - DWARF vs SJLJ, otherwise you have only one test value and can't compare against another value.

I will do a test with Brecht's gcc 9.3.0 compiler DWARF and SJLJ variants.

@ZsoltKantor
Author

ZsoltKantor commented Jun 30, 2020

Test results

Compiler: gcc 9.3.0 32-bit
Compile command: g++ except.cpp -s -Wall
CPU: Intel Mobile Core 2 Duo T5750

1000000 loops, average calculated from 6 runs, compiled to throw exceptions
sjlj : 6.762414 sec.
dwarf: 23.280274 sec.

900000000 loops, average calculated from 6 runs, compiled to not throw exceptions
sjlj : 3.241277 sec.
dwarf: 3.232620 sec.

@nenin-sc

Intel Mobile Core 2 Duo T5750

It should not be 4 times slower than the i7-4700S. Or can it?
Apropos, in your code one can find this:
#define AVG_COUNT 3
and after all this:
std::printf("\n%f", total/3);

When you are using c++ you can enjoy expressions like this:
const size_t avg_count(3);

You probably need to try throwing an exception out of a function that is called from a different compilation module.
But if you are really interested in doing something useful, you can offer your expertise to the gcc team to finalize Win32-SEH.

@ZsoltKantor
Author

ZsoltKantor commented Jun 30, 2020

It should not be 4 times slower than the i7-4700S. Or can it?

The CPU speed is not relevant if you provide both results (sjlj and dwarf). I put it there for informational purposes.

Apropos, in your code one can find this:

From the point of view of the tests, the code change is irrelevant.

You probably need to try throwing an exception out of a function

That can be done. But you must remember that dwarf has issues (the code can crash) when exceptions are thrown across modules which were not dwarf enabled.
tomay3000 pointed out this weakness of dwarf in the previous comments.
I'm not saying that dwarf should not be used because of this, but it is a drawback.

But if you are really interested in doing something useful, you can offer your expertise to the gcc team to finalize Win32-SEH.

I'm not interested in SEH.
Also if you read the title of this issue it is about SJLJ vs DWARF, and not SEH.
Plus I've tried gcc SEH. The MinGW-W64 implementation of it is tricky to use. You actually must call __try1, and not __try as Microsoft specifies. You also need to define an exception handler function . . . and in the end I gave up testing it.

nenin-sc : Thank you for the provided ideas!

P. S.
nenin-sc, maybe you know . . . When gcc C++ library functions generate exceptions, are they using the same throw keyword (as in my test code)?

@tomay3000

You probably need to try throwing an exception out of a function that is called from a different compilation module.

  • Using SJLJ EH: Just go safe.
  • Using DWARF EH: I dare you try it in production.

@brechtsanders
Owner

brechtsanders commented Jul 1, 2020

I have to emphasize: the goal of winlibs is not to have the best option for interoperability with modules compiled with other C/C++ compilers, but rather to have the best performance and stability.

@ZsoltKantor
Author

ZsoltKantor commented Jul 1, 2020

I have to emphasize: the goal of winlibs is not to have the best option for interoperability with modules compiled with other C/C++ compilers, but rather to have the best performance and stability.

I understand. As for the best performance between sjlj and dwarf, well, the test results show interesting things.
If exceptions are thrown, sjlj is much faster (I don't know why it is stated on the Internet that dwarf is faster). I'm talking now only about 32-bit gcc.
But if exceptions are not thrown, dwarf seems to be a little bit faster (based on the test results).
I updated my test code as pointed out by nenin-sc. Now exceptions are generated/thrown by the C++ library, and not by my code. With the new test code the results are the same.

Updated test code
except2.zip

Compiler: MingW-W64 gcc 7.3.0 32-bit posix
Compile command: g++ except2.cpp -s -Wall
CPU used: Intel Mobile Core 2 Duo T5750

1000000 loops, average calculated from 6 runs, compiled to throw exceptions
sjlj : 6.987930
dwarf: 26.661495

900000000 loops, average calculated from 6 runs, compiled to not throw exceptions
sjlj : 3.327127
dwarf: 3.269322

@nenin-sc

nenin-sc commented Jul 1, 2020

Dear tomay3000

* Using `DWARF` EH: I dare you try it in production.

You can look at GoldenDict, for example.

Dear ZsoltKantor
You need to at least try throwing exceptions out of different compilation modules.

are they using the same throw keyword (as in my test code)?

in C++ this is part of the language: http://www.cplusplus.com/doc/tutorial/exceptions/ and there are no other options.

@ZsoltKantor
Author

You need to at least try throwing exceptions out of different compilation modules.

Hello nenin-sc, thanks for the reply!
Could you please try your proposal? I'm quite busy now and I have no time for that.
(I kind of 'predict' the execution time results will be similar to what I did)

@nenin-sc

nenin-sc commented Jul 1, 2020

Could you please try your proposal?

Dear ZsoltKantor,
I don't think it is wise to install yet another version of the compiler because someone has no time to verify his ideas properly.
I absolutely trust the opinion of the Qt, mingw-w64 and msys2 developers. I also somehow still trust myself, and once upon a time I already did similar tests. Maybe in 2004 or 2006?
Apropos, it is better for usability to replace
std::printf("\n%f", total_time/AVG_COUNT);
with
std::printf("\n%f\n Please press <ENTER> to finish.", total_time/AVG_COUNT); getchar();

@ZsoltKantor
Author

nenin-sc: 2004-2006? That was a long time ago. I think software and hardware have evolved a lot since 2006, so those . . . old tests are not really relevant now. Installing a MingW gcc compiler takes at most 5 minutes. You download and extract the archive and, if you have only one compiler, add the bin directory to the PATH variable. If you have more, as in my case, you create a cmd file for every compiler and configure the PATH there (I have 4 gcc versions installed).
The thing is that I'm busy these days, and I will go on a trip for a week starting Friday, which will be a computer-free week. I can do some tests after that.
Yes, the pause at the end of the program is a good idea if you run the program by double-clicking (I always ran the program from the command prompt).
A question. I'm not a C++ programmer, so I'm asking: why should the exception handling execution time change if exceptions are thrown from external, foreign libraries? I assume the throw mechanism is the same, no?

@tomay3000

tomay3000 commented Jul 1, 2020

Considering the reality of what people actually want by looking at the downloads from old builds at:
https://sourceforge.net/projects/mingw-w64/files/Toolchains%20targetting%20Win32/Personal%20Builds/mingw-builds/8.1.0/threads-posix/
https://sourceforge.net/projects/mingw-w64/files/Toolchains%20targetting%20Win64/Personal%20Builds/mingw-builds/8.1.0/threads-posix/
the SJLJ downloads are less than 4% and 9% of the respective builds.

I think this battle is lost from the beginning because we live in a world where people do just follow and don't study, and there is nothing we can do about it :(

  • First, the main reason for all of this is that the main MinGW-W64 https://sourceforge.net/projects/mingw-w64 has not released an update in more than 2 years.
  • Second, regular developers like me don't have computers powerful enough to run the GCC build process with SJLJ EH (which takes forever to complete).
  • Third, why would we choose WinLibs over MSYS2 (no offence here) when the latter also uses DWARF EH and has more than 1800 pre-built libraries (updated not daily, but hourly)? And there is this list of alternatives too:
| Toolchain | Website | 32-Bit EH flavor | GCC/LLVM version |
|---|---|---|---|
| MinGW-w64 | https://sourceforge.net/projects/mingw-w64 | DWARF and SJLJ | GCC v8.1.0 |
| Cygwin | https://www.cygwin.com | DWARF only | GCC v9.3.0 |
| MSYS2 | https://www.msys2.org | DWARF only | GCC v10.1.0 |
| LLVM MinGW | https://github.com/mstorsjo/llvm-mingw | Uses libc++ instead of libstdc++ | LLVM v10.0.0 |
| TDM-GCC | https://jmeubank.github.io/tdm-gcc | SJLJ only | GCC v9.2.0 |
| MinGW Distro - nuwen.net | https://nuwen.net/mingw.html | SEH x64-native only | GCC v9.2.0 |
| WinLibs | http://winlibs.com | DWARF only | GCC v10.1.0 |

  • Finally, if I decide to build a recent SJLJ EH GCC version, it is very simple: I only have to sacrifice my computer for a day or two for the build process.

Thanks.

@brechtsanders
Owner

@tomay3000 I have ported over 1800 sources to Windows using my winlibs.com personal build, so please stay tuned for those packages in the future. I'm working on the package manager, after that I should be ready to release.

I'm not trying to replace MSYS2 - in fact I use it myself as a shell to build everything - but my goal is to build an environment that follows certain principles.

@nenin-sc

nenin-sc commented Jul 2, 2020

just follow and don't study, and there is nothing we can do about it :(

Dear tomay3000 ,
I "just follow and don't study" mingw since gcc 2 and C++ since first Borland Turbo. And I have some experience I can share:

  1. C++ as a language has no mechanism that allows adopting alien C++ libraries, neither static (.lib or .a) nor dynamic (.dll or .so). Alien means not only a different compiler, but also different versions of the same compiler. And it is hard to predict whether a *.a built by version ..1 of gcc will work smoothly with code built with ..2 or not.
  2. C++ as a language has no safe mechanism to support statically linked dlls (not linked statically to main). If you take a look at the GoldenDict distro, you can see that it includes the whole mingw runtime to keep the dlls dynamically linked. This issue originates from the nature of modern C++, and it makes the use of alien C++ libs even more problematic.
  3. Mingw gcc is alien for Windows. It even depends on the MSVC runtime. The native Windows exception handler is SEH, and there is no way to replace it 100% with SJLJ or DWARF. SJLJ has more relaxed constraints than DWARF, and that's all. That is not enough to compensate for the call to setjmp on each try.
    To summarize: if someone needs to go deep into software development under Windows, he has to use a native compiler (MSVC, Embarcadero C++Builder, etc.). GCC offers a good optimization level, it follows the Standard, it is open source, but it is not native for Windows and barely has a chance to become native because of license limitations. SJLJ is not a solution, and it never was.

@brechtsanders
Owner

Installing a MingW gcc compiler takes at most 5 minutes.

That's why I try to provide the winlibs.com personal build as a simple folder to be extracted.
Added advantage is that multiple versions (or e.g. 32-bit and 64-bit) can coexist on one system.

Problem is that Windows cmd is pretty ugly. But, there is another one invention: IDE! I can recommend CodeBlocks. OpenSource, full support for Mingw, gdb with Python scripting, profiling...

I use MinGW-w64 mostly from Code::Blocks for developing and from the MSYS2 shell for porting or compiling sources that come with makefiles or configuration scripts (autotools, cmake, meson, ...)

About testing, it may be best to compare apples with apples, so I have built both Dwarf/SEH and SJLJ versions of gcc-10.1.1-snapshot20200627 for MinGW-w64. You can find them at:
https://github.com/brechtsanders/winlibs_mingw/releases/tag/10.1.1-snapshot20200627-7.0.0
@ZsoltKantor Maybe it's a good idea to do your performance comparisons with this version as it's the latest version and it's configured exactly the same way except for exception handling.

@brechtsanders
Owner

@nenin-sc
Technically I don't entirely agree with your points 1. and 2., but my experience has taught me that for best stability you should just not mix things built with different compilers.

Mingw gcc is alien for Windows

With this statement I quite disagree. For Cygwin I consider that to be true, but MinGW-w64 really does build native Windows libraries. Of course they call functions in the operating system's libraries, but the binaries really can be run on another Windows without shipping any additional dependencies (apart from C/C++ shared runtimes, but this can be solved by statically linking).

@ZsoltKantor
Author

ZsoltKantor commented Jul 2, 2020

@brechtsanders

nenin-sc is a little bit impertinent. Please block it if it continues with offensive comments. Otherwise I will close the issue.
I'm not here to fight with a haughty guy. I wanted to talk about sjlj and dwarf, and not to read comments like "did you hear about IDE's" and such pompous comments.

One more comment from nenin-sc, and I close the issue, it is up to you brechtsanders if you open it again.
I don't want to talk with haughty people, not here.

I'm not here to hear somebody's disgracing words.

@ZsoltKantor
Author

ZsoltKantor commented Jul 2, 2020

@brechtsanders

I will test the new snapshots.
I found a nice C++ functionality to measure execution time using the steady_clock.
I will make 3 different time measurements.
One with the Windows performance counters.
One with the builtin C++ steady_clock.
And one to measure only the kernel+user times (Windows functionality). This is good if you want to see only the time that the CPU spent on behalf of the program. With this, idle times like sleeps and waits are excluded.
It is a good idea to read the CPU cycle count also.

@ZsoltKantor
Author

ZsoltKantor commented Jul 2, 2020

@brechtsanders
Thank you for your great builds. They work well. The only problem for me is the gdb debugger backspace issue (because I use gdb a lot to inspect code). Everything else works fine!

P.S.
I think you made a wrong configuration for the 10.1.1-snapshot20200627-7.0.0 dwarf variant.
The dwarf variant is in fact sjlj, it links with sjlj and the gcc version (gcc -v) displays sjlj exception handling.

@brechtsanders
Owner

Hi, I'm still working on the gdb backspace issue, and I believe I found a solution.
I'm rebuilding the lot now to test it.

I will look into i686-posix-dwarf-gcc-10.1.1-snapshot20200627 to see what went wrong...

@brechtsanders
Owner

brechtsanders commented Jul 6, 2020

Just built the latest snapshot in all EH variants: https://github.com/brechtsanders/winlibs_mingw/releases/tag/10.1.1-snapshot20200704-7.0.0 .
The mixup issue from 10.1.1-snapshot20200627-7.0.0 is not present here.

@ZsoltKantor
Author

ZsoltKantor commented Jul 12, 2020

Just built the latest snapshot in all EH variants: https://github.com/brechtsanders/winlibs_mingw/releases/tag/10.1.1-snapshot20200704-7.0.0 .
The mixup issue from 10.1.1-snapshot20200627-7.0.0 is not present here.

Thank you, backspace in GDB is working now.
There is only one issue related to command history recall (when you use the up arrow on the keyboard).
It looks like this:
[screenshot: Clipboard01]

It displays the previously used commands on one line, appending them one after the other.

Regarding the exceptions: do you think it makes any sense to write a simple DLL that throws exceptions and use that DLL from an exe to measure exception time? I think it is useless; C++ library exceptions do exactly the same thing.

P.S.
or maybe somebody could provide a real world exception handling example . . .

@brechtsanders
Owner

I can't reproduce the command history issue with https://github.com/brechtsanders/winlibs_mingw/releases/tag/10.2.0-11.0.0-9.0.0-r5 . Is it ok for you now too?

Your exception question is beyond the scope of what winlibs.com is trying to do. Maybe you should pose that question on https://stackoverflow.com/ ?

@Scr3amer

Scr3amer commented Mar 9, 2021

Oh man, I read the whole thread, even the arrogant comments from the other dude, and @ZsoltKantor didn't conclude with his findings :O !!! You can't leave us hanging like that after building so much suspense :P.

@ZsoltKantor Did you at least try your original benchmark, just to see? I would love to have an answer, just to see whether things have changed compared to all these reports, which I am sure were right but each within its own context.

I will try too when I have some time to burn in the upcoming months (yeah, I am super hyper busy lately) and do a whole comprehensive test with your tailored builds @brechtsanders.

@brechtsanders
Owner

If it were up to me I would go SEH all the way, because it is native to Windows, which improves compatibility and interoperability.
Unfortunately SEH for 32-bit MinGW is not available yet, and Dwarf still seems to be the best choice for this platform.

Building SJLJ releases separately would be a lot of extra work for me, and looking at the number of downloads from
https://sourceforge.net/projects/mingw-w64/files/Toolchains%20targetting%20Win64/Personal%20Builds/mingw-builds/8.1.0/threads-posix/
we can conclude there is hardly any demand for it anyway.

@GitMensch

Unfortunately SEH for 32-bit MinGW is not available yet

Do you have more info on why it is available for 64-bit but not for 32-bit?
Any news about it becoming available in future versions?

@brechtsanders
Owner

As explained in #4, 32-bit SEH isn't in GCC yet due to some non-technical reasons such as patents. For the real reason and the current status you would need to check with the GCC development team.

@GitMensch

Sadly I missed the explanation part and only saw a guess ("seems like") and a "likely not, patent expired".
Can you please point me to the explanation and, ideally, references?

@brechtsanders
Owner

I have no information about this, even though I would also very much like to see 32-bit SEH support.
Please check with the GCC developers.

@revelator

32-bit SEH will most likely never get added to gcc. The reason is that sjlj uses almost the same mechanism (both rely on code injection), so it would not really net us any benefit. I know of someone (http://blog.davidegrayson.com/2016/02/windows-32-bit-structured-exception.html) who made a test-case library providing SEH for 32-bit gcc, but the code was never accepted upstream.

@asheplyakov

@ZsoltKantor your benchmark measures an irrelevant quantity

With sjlj every try/catch block always incurs substantial overhead, that is, on every call, even if the exception hasn't been thrown. With dwarf (and x86_64 seh) one has to spend extra cycles only when the exception actually happens.

@revelator

I won't say it is that bad. In most cases sjlj is actually faster than dwarf when an exception is thrown and just a tiny bit slower when not :) Not sure how big an impact a 32-bit SEH exception model would have, but I reckon it might be about the same.

@asheplyakov

asheplyakov commented Jun 14, 2022

sjlj is actually faster than dwarf when an exception is thrown and just a tiny bit slower when not :)

How slow exactly is "just a tiny bit slower"? Let's measure it!
Consider the following example.

extern "C" int __attribute__((noinline)) doit(int flag) {
    try {
        if (flag) {
            throw 1;
        }
    } catch (int x) {
        return x;
    }
    return 0;
}

int main(int argc, char** argv) {
    volatile int zero = 0;
    for (int i = 0; i < 1000*1000*100; i++) {
        if (doit(zero)) {
            return 1;
        }
    }
    return 0;
}

Notice that this program actually does not throw any exceptions (the condition is always false).

First I'll compile and run it with the distro (cross-)compiler, which uses dwarf:

i686-w64-mingw32-gcc -O2 -o except_bench_dwarf.exe except_bench.cc
cp -a /usr/lib/gcc/i686-w64-mingw32-gcc/lib*.dll .
perf stat wine except_bench_dwarf.exe

The result is

Performance counter stats for 'wine except_bench_dwarf.exe':

            929,59 msec task-clock:u              #    0,939 CPUs utilized          
                 0      context-switches:u        #    0,000 /sec                   
                 0      cpu-migrations:u          #    0,000 /sec                   
            71 702      page-faults:u             #   77,133 K/sec                  
     1 676 901 975      cycles:u                  #    1,804 GHz                    
     4 320 492 235      instructions:u            #    2,58  insn per cycle         
       963 024 749      branches:u                #    1,036 G/sec                  
         4 639 936      branch-misses:u           #    0,48% of all branches        
     8 354 117 120      slots:u                   #    8,987 G/sec                  
     3 447 018 435      topdown-retiring:u        #     31,3% retiring              
     5 227 248 388      topdown-bad-spec:u        #     47,4% bad speculation       
     1 137 193 797      topdown-fe-bound:u        #     10,3% frontend bound        
     1 208 898 899      topdown-be-bound:u        #     11,0% backend bound         

       0,990363195 seconds time elapsed

       0,113093000 seconds user
       0,016870000 seconds sys

Next I'll use a home-brewed (cross-)compiler with sjlj exceptions (which is the default):

/opt/mingw-sjlj/bin/i686-w64-mingw32-gcc -O2 -o except_bench_sjlj.exe except_bench.cc
cp /opt/mingw-sjlj/lib/gcc/i686-w64-mingw32/lib*.dll .
perf stat wine except_bench_sjlj.exe

This gives

 Performance counter stats for 'wine except_bench_sjlj.exe':

          4 092,81 msec task-clock:u              #    0,992 CPUs utilized          
                 0      context-switches:u        #    0,000 /sec                   
                 0      cpu-migrations:u          #    0,000 /sec                   
            71 549      page-faults:u             #   17,482 K/sec                  
    15 269 909 107      cycles:u                  #    3,731 GHz                    
    36 824 660 446      instructions:u            #    2,41  insn per cycle         
     8 364 190 652      branches:u                #    2,044 G/sec                  
         4 699 311      branch-misses:u           #    0,06% of all branches        
    75 988 430 270      slots:u                   #   18,566 G/sec                  
    46 437 132 473      topdown-retiring:u        #     54,4% retiring              
    12 865 554 048      topdown-bad-spec:u        #     15,1% bad speculation       
     8 640 406 320      topdown-fe-bound:u        #     10,1% frontend bound        
    17 359 267 691      topdown-be-bound:u        #     20,4% backend bound         

       4,127270998 seconds time elapsed

       3,257917000 seconds user
       0,016739000 seconds sys

which is almost 30x slower in user time (3.26 s vs. 0.11 s; "seconds user" counts the time the process has actually been on the CPU).

What gives? Let's look at the asm:

/opt/mingw-sjlj/bin/i686-w64-mingw32-gcc -O2 -save-temps -o except_bench_sjlj.exe except_bench.cc
cp except_bench.s except_bench_sjlj.s

	.globl	_doit
	.def	_doit;	.scl	2;	.type	32;	.endef
_doit:
	pushl	%ebp	 #
	pushl	%edi	 #
	pushl	%esi	 #
	pushl	%ebx	 #
	subl	$92, %esp	 #,
	leal	28(%esp), %eax	 #, tmp100
	movl	$___gxx_personality_sj0, 52(%esp)	 #,
	movl	%eax, (%esp)	 # tmp100,
	movl	$LLSDA0, 56(%esp)	 #,
	movl	%ebp, 60(%esp)	 #,
	movl	$L5, 64(%esp)	 #,
	movl	%esp, 68(%esp)	 #,
	call	__Unwind_SjLj_Register	 #
 # except_bench.cc:4: 	       	if (flag)
	movl	112(%esp), %eax	 # flag,
	testl	%eax, %eax	 #
	jne	L13	 #,
L8:
	leal	28(%esp), %eax	 #, tmp102
	movl	%eax, (%esp)	 # tmp102,
	call	__Unwind_SjLj_Unregister	 #
 # except_bench.cc:10: }
	movl	112(%esp), %eax	 # flag,
	addl	$92, %esp	 #,
	popl	%ebx	 #
	popl	%esi	 #
	popl	%edi	 #
	popl	%ebp	 #
	ret	
L13:
 # except_bench.cc:5: 			throw 1;
	movl	$4, (%esp)	 #,
	call	___cxa_allocate_exception	 #
 # except_bench.cc:5: 			throw 1;
	movl	$1, (%eax)	 #, MEM[(int *)_7]
 # except_bench.cc:5: 			throw 1;
	movl	$0, 8(%esp)	 #,
	movl	$__ZTIi, 4(%esp)	 #,
	movl	%eax, (%esp)	 # tmp87,
	movl	$1, 32(%esp)	 #,
	call	___cxa_throw	 #
L5:
 # except_bench.cc:6: 	} catch (int x) {
	movl	40(%esp), %edx	 #, tmp101
	movl	36(%esp), %eax	 #, tmp94
	movl	%edx, 112(%esp)	 # tmp101, flag
	subl	$1, %edx	 #,
 # except_bench.cc:6: 	} catch (int x) {
	movl	%eax, (%esp)	 # tmp94,
 # except_bench.cc:6: 	} catch (int x) {
	jne	L14	 #,
 # except_bench.cc:6: 	} catch (int x) {
	call	___cxa_begin_catch	 #
	call	___cxa_end_catch	 #
	jmp	L8	 #
L14:
	movl	$-1, 32(%esp)	 #,
	call	__Unwind_SjLj_Resume	 #

Roughly speaking, every function with try/catch block(s) has to call __Unwind_SjLj_Register and __Unwind_SjLj_Unregister on every invocation, whether or not anything is thrown. Doing that 100 million times is expensive.
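
To make that cost visible without reading assembly, here is a conceptual model of what the generated code above does. It is not libgcc's implementation: `HandlerContext`, `handler_chain`, `register_context`, `unregister_context` and `doit_model` are hypothetical names. The only point is that the push and the pop run on every call, exception or not.

```cpp
// Conceptual model only; these are not the real libgcc data structures.
struct HandlerContext {
    HandlerContext* prev;  // context of the enclosing (older) registration
    const void* lsda;      // stand-in for the pointer to the handler table
    int call_site;         // which region of the function is executing
};

// One chain per thread, newest entry first.
thread_local HandlerContext* handler_chain = nullptr;

inline void register_context(HandlerContext* ctx) {
    ctx->prev = handler_chain;
    handler_chain = ctx;
}

inline void unregister_context(HandlerContext* ctx) {
    handler_chain = ctx->prev;
}

// Shape of a function containing try/catch under this model.
int doit_model(int flag) {
    HandlerContext ctx{nullptr, nullptr, 0};
    register_context(&ctx);    // paid on every call
    int result = 0;
    if (flag) {
        result = 1;            // real code would throw and resume at the handler
    }
    unregister_context(&ctx);  // paid on every call
    return result;
}
```

In the real assembly the function also stores the personality routine, the frame pointer, the landing-pad address and the stack pointer into the context before registering it, so the actual per-call work is larger than in this model.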

And this is what doit looks like with DWARF exceptions

_doit:
LFB0:
        .cfi_startproc
        .cfi_personality 0,___gxx_personality_v0
        .cfi_lsda 0,LLSDA0
        pushl   %ebx     #
        .cfi_def_cfa_offset 8
        .cfi_offset 3, -8
        subl    $24, %esp        #,
        .cfi_def_cfa_offset 32
 # except_bench.cc:2: extern "C" int __attribute__((noinline)) doit(int flag) {
        movl    32(%esp), %ebx   # flag, flag
 # except_bench.cc:4:           if (flag)
        testl   %ebx, %ebx       # flag
        jne     L10      #,
L7:
 # except_bench.cc:10: }
        addl    $24, %esp        #,
        .cfi_def_cfa_offset 8
        movl    %ebx, %eax       # flag,
        popl    %ebx     #
        .cfi_restore 3
        .cfi_def_cfa_offset 4
        ret
        .def    ___gxx_personality_v0;  .scl    2;      .type   32;     .endef

Here the data necessary to handle an exception (in particular, to unwind the stack) is generated at compile time and stored in the .eh_frame section, so the non-throwing path runs no extra instructions. Actually handling an exception takes quite a bit longer, though (it takes a while to unwind the stack, parse the tables, etc.). This is similar to 64-bit Windows SEH, which is also table-based.
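
Both paths can be timed from inside the program instead of with perf. The following is a sketch, not a benchmark anyone in this thread has run: `doit` is the function from the example above, while `time_calls` is a hypothetical helper. No numbers are claimed, since they depend on the compiler build and the EH model.

```cpp
#include <chrono>

// Same function as in the example above (GCC-specific noinline attribute).
extern "C" int __attribute__((noinline)) doit(int flag) {
    try {
        if (flag) {
            throw 1;
        }
    } catch (int x) {
        return x;
    }
    return 0;
}

// Calls doit(flag) `iterations` times and returns the elapsed seconds.
// `sum` accumulates the return values so the calls cannot be optimized away.
double time_calls(int flag, long iterations, long& sum) {
    volatile int f = flag;  // keeps the compiler from folding the condition
    const auto start = std::chrono::steady_clock::now();
    for (long i = 0; i < iterations; ++i) {
        sum += doit(f);
    }
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double>(stop - start).count();
}
```

Calling `time_calls(0, n, sum)` measures the path where nothing is thrown, and `time_calls(1, n, sum)` measures a throw and catch on every call. The throwing path is likely to need a much smaller `n` to finish in reasonable time.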

@GitMensch

sjlj is actually faster than dwarf when an exception is thrown and just a tiny bit slower when not :)

How slow exactly is "just a tiny bit slower"? Let's measure it! Consider the following example:

That's a good check. Which GCC version was used exactly?

While most programs should not expect a lot of exceptions... as you already have that setup:
How fast is "actually faster" when throwing from each of those invocations?
What are the times when there is no try-catch at all?

... and, as this was the main part of the original point: what time differences does it show with "plain c"?

@revelator

i was wondering that myself, since i anticipated a C example. But can you even use try/catch in C?

@revelator

revelator commented Jun 14, 2022

no, it seems try/catch won't work with C :S only setjmp/longjmp or gotos (evil thingies xD)
https://stackoverflow.com/questions/10586003/try-catch-statements-in-c

you can emulate them, though it's not very nice to look at.
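
A minimal sketch of such an emulation, assuming a single global recovery point. All names (`recovery_point`, `checked_double`, `try_double`, `ERR_NEGATIVE_INPUT`) are invented for illustration. It is written as C++ to match the other code in this thread, but it uses only the setjmp/longjmp facilities that plain C provides.

```cpp
#include <csetjmp>

// Hypothetical error code for this sketch.
enum { ERR_NEGATIVE_INPUT = 1 };

// Where to jump on error; plays the role of the active "try" block.
static std::jmp_buf recovery_point;

// "Throws" by jumping back to the setjmp call with a non-zero code.
static int checked_double(int value) {
    if (value < 0) {
        std::longjmp(recovery_point, ERR_NEGATIVE_INPUT);
    }
    return value * 2;
}

// Returns the doubled value, -1 if checked_double "threw" ERR_NEGATIVE_INPUT,
// or -2 for any other code.
int try_double(int value) {
    switch (setjmp(recovery_point)) {
    case 0:                   // first pass: the "try" body
        return checked_double(value);
    case ERR_NEGATIVE_INPUT:  // reached via longjmp: the "catch" body
        return -1;
    default:
        return -2;
    }
}
```

With these definitions `try_double(21)` returns 42 and `try_double(-5)` returns -1. Note that longjmp does not run C++ destructors, so in C++ code this is only safe when no objects with non-trivial destructors are skipped by the jump.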

@asheplyakov

asheplyakov commented Jun 14, 2022

since i anticipated a C example.

There are no exceptions in standard C (C89/C99/C11). Hence the dwarf/sjlj choice is irrelevant for pure C code.

@revelator

indeed, so the only thing we need to be aware of is the extra sections that dwarf adds to the libraries.
