-
Notifications
You must be signed in to change notification settings - Fork 2.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Raise stack size to 8MB for pthreads #2303
Conversation
… size instead of relying on linker defaults which may be platform-specific. Also raises default stack size on OSX to current Linux default(8MB) just to be safe. Note: it seems there is no other way to specify stack size on std::thread than linker flags and the effective flags are named differently in many toolchains. No functional change
For Mac OS X default stack size for main and auxiliary threads, a good documentation is the platform_thread_posix.cc file they use at Gggogle to build Chromium: https://chromium.googlesource.com/chromium/chromium/+/master/base/threading/platform_thread_posix.cc I have tested the result of ulimit -s on my two computers, running Mac OS X 10.9.5 and 10.13.6, in both cases the result was 8Mb. |
@snicolet the created threads have a different stack size than the limit one sees with ulimit -s. In particular on macOS (512Kb). That's why we have the code. I think @noobpwnftw patch is a good idea. |
I think too it is a good idea! |
@vondele you are correct , I had tried ulimit -s and although it did return 8mb, it did not stop the crashes until you inserted the specific code for macOS. |
I would have one question on this patch, is there a limit on the stack size for 32bit MINGW ? 2Mb might be a limit on some 32 bit systems. See also https://sourceforge.net/p/mingw-w64/bugs/369/#3fc7
So, I guess we have to check this actually increases stack size as expected. Note that 2Mb would still be enough to run safely. |
OK, I've checked the patch, and it is fine as is, at least on my system it does reserve 8Mb stack. From my point of view, it would be ready to merge. |
Here are the reasons that I didn't take the Makefile approach:
Personally I use MSVC to compile Stockfish on Windows, it builds out-of-box by importing the source folder into an empty solution, while missing all custom flags(bmi2, sse, etc) from the Makefile and using native std::thread implementation(MSVC doesn't have pthread anyway). This is more than fine compared what you'd normally get after skipping Makefile. Such users should be fully responsible for taking care of missing Makefile flags and other compiler-specific flags(the default 1MB stack size may have problems but fixing that would either need a MSVC project file or introduce platform-dependent code). |
The 1MB of MSCV would be really close to the limit I think. For a search leading to a PV line of depth 245, I'm estimating a stack usage of ~.97MB:
|
Yes, in cases of a TB probe, it may go even higher due to some recursion in probing code, that I'm not sure if it is indeed stack intensive, but along with some alignments(the TCEC binary has 4184 on |
I propose to merge the pull request in current form, and we can deal with the MSCV case in a later commit, if necessary (or describe a MSCV project setting precisely in the wiki, if that is the only solution). |
By the way, before I merge the pull request, I would like a confirmation that 8MB is not too much for each thread. a) with 512 threads, that makes 4GB of memory for Stockfish, I suppose that it is peanuts for the monsters with 512 cores? |
8MB per thread is the default size on almost all 64-bit Linux, while this patch only affects OSX and MINGW32/64 so I think it is pretty harmless. |
I beware of MINGW32/64 users these days... Any opinion on points (b) and (c) above? |
Well, for (b), our fearless MINGW users are on Windows, they probably wouldn't care for a typical 8-core machine having 64MB of stack size. For tablets and phones, they don't use MINGW nor OSX, so they are not affected. As for (c), time to buy bigger machines. :) |
I don't think c) is a significant concern. Even if you launch multiple stockfish instances, you don't want more instances than your computer has threads. So it will still be 8MB * max threads of the CPU as with a single instance. And this is much less than the hash table will require for long search efficiency. |
I don't think 8MB is a problem wrt to a), b), or c) SF is not an application that uses 100s of threads, but typically only 1 per core, and consumer hardware typically has >512Mb/core. While 8Mb is more than what is needed (right now ~<2Mb), the value of the patch is that it now has the same value for more OSes (ie. linux default), removing untested corner cases. |
Merged via a858def, thanks! |
This PR hase been merged with master without tests. Perhaps we need to do a least a [-3,1] STC test to be 100% sure there is no problem ? |
Well, most of the test machines are not affected with the change at all. |
TCEC S16 is showing us that we must be very careful with rare configurations. I don't know how to increase the probability of triggering the modification @Fichtest |
For that however, please refer to #2307 for a living example of how bugs affecting all platforms can still slip through. |
The regression/progression of master after this patch against SF10 is underperforming. I have pushed some [-3..1] tests and will try to tell apart 7- and 8+ cores machines if necessary to estimate the situation better. @MJZ1977 Thanks Moez! |
Removed most 8-core testers and there are a few 4-cores ones triggered by accident, removed them too. |
Doubts are fine, paranoia is not. |
I agree. Getting changes for the test positions or bounds considered let alone done is much more complicated than it should be. |
It seems there is no other way to specify stack size on std::thread than linker flags and the effective flags are named differently in many toolchains. On toolchains where pthread is always available, this patch put the stack size change in the C++ code to ensure minimum stack size, instead of relying on linker defaults which may be platform-specific.
Also raises default stack size on OSX to current Linux default (8MB) just to be safe.
No functional change