Investigate/fix strncat issue #343
Comments
The edge labels are samples, not calls. See Edge Information. The slowdown is only 2x on rzansel but 1000x on sierra. Why so much bigger? And why are we slow on rzansel and sierra but not on other systems like rzgenie and jade? We discovered the issue when regression tests on sierra, which normally take 1 minute, were killed by ats after running for 60 minutes. Which tests time out differs every night: e.g., on Monday 10 tests timed out, on Tuesday 8 tests timed out, and only 1 test timed out on both days. For about 80% of the tests that time out, the last message printed to stdout says Writing Restart or Writing Graphics.
Well, I may be misinterpreting the numbers on the edges. If those are samples and not actual calls, then I apologize. I interpreted them as actual calls. That said, I am able to produce cases on my macOS system where strncat is taking ~30% of the total time. And HDF5-1.12 is about 2-3x slower than HDF5-1.8.14 for the same operations. So there is definitely some issue that I am sure I can improve.
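For context on why strncat can show up so prominently in a profile: every call has to walk the destination string from the start to find its terminator, so building one long string from many small appends costs time quadratic in the final length. The sketch below is a generic illustration of that effect, not a reconstruction of how Silo or HDF5 actually call strncat (the thread does not show that code). The function names `join_with_strncat` and `join_with_cursor` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Append `piece` to dst `count` times using strncat.
 * Returns the number of bytes that had to be walked just to find the end
 * of dst before each append (the strlen here models the scan strncat
 * performs internally on every call). */
size_t join_with_strncat(char *dst, size_t cap, const char *piece, int count)
{
    size_t scanned = 0;
    assert(cap > 0);
    dst[0] = '\0';
    for (int i = 0; i < count; i++) {
        size_t len = strlen(dst);
        scanned += len;
        if (len + 1 >= cap)
            break;                          /* no room left */
        strncat(dst, piece, cap - len - 1); /* always NUL-terminates */
    }
    return scanned;
}

/* Same result, but the current length is remembered between appends,
 * so no byte of dst is ever rescanned. Returns the final length. */
size_t join_with_cursor(char *dst, size_t cap, const char *piece, int count)
{
    size_t len = 0;
    size_t plen = strlen(piece);
    assert(cap > 0);
    dst[0] = '\0';
    for (int i = 0; i < count; i++) {
        size_t room = cap - len - 1;
        size_t n = plen < room ? plen : room;
        if (n == 0)
            break;
        memcpy(dst + len, piece, n);
        len += n;
        dst[len] = '\0';
    }
    return len;
}
```

Both functions produce the same string; the first rescans a growing prefix on every append, while the second does a constant amount of bookkeeping per append. If a profile is dominated by strncat, checking whether a hot loop follows the first pattern is a reasonable first step.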
I have investigated this issue with a modified
I have then tested Silo 4.11 and 4.10.3 with HDF5-1.8 and HDF5-1.14 on my macOS laptop and on a Lassen login node in both my home dir on Lassen and

In my tests, I was doing 10,000

I tried creating directories which contain a slew of long-named objects that differ only in the last few characters. This did have some effect on performance (as would be expected) but again, nothing approaching 1000x slower or even 10x slower. More like 2-3x slower (on Lassen).

I did discover that better args to

I did compare the stack dumps provided and there is a difference in the

The newer Silo includes a couple of additional calls to the HDF5 library to interrogate entries found and decide if they are symlinks or not. These are the calls to

Here is some performance data I gathered while in
I have not tested with HDF5-1.10 or HDF5-1.12, both of which have known performance issues in certain circumstances. Next, I tested Silo 4.11.1-pre1 with both HDF5-1.8 and HDF5-1.14
Ok, so I discovered an issue with the really good results in some of the tests above. I had adjusted args to
I also used gprof of modified (
On lassen, I am unable to reproduce issues observed on sierra. I will close this issue for now but arrange a time to investigate on sierra.
I got onto sierra and performed tests similar to those above, and I was not able to reproduce the performance issues with DBGetToc that were originally reported. I did expand my test to create many directories with very long (>200 character), almost identical names (differing only in the last few characters). Although there was some performance degradation for this case (as I would expect), there was nothing even approaching one order of magnitude, let alone three. I did notice that for a similar test involving 10,000 directory names and DBGetToc calls, the PDB driver does about 5x better than HDF5 1.14.0 on sierra. Again, a problem to be investigated but nothing worth holding up the release.
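One generic reason that long, almost identical names are expected to cost somewhat more: any byte-wise comparison of two such names has to walk the entire shared prefix before it reaches the characters that differ. The sketch below builds names of that shape and counts the positions a comparison inspects. It is only an illustration of the expected effect; it does not use Silo or HDF5, and the helpers `make_name` and `chars_compared` are hypothetical, not part of either library.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Write a name made of prefix_len 'x' characters followed by a
 * six-digit, zero-padded index, e.g. "xxxx...x000042". */
void make_name(char *buf, size_t cap, size_t prefix_len, int index)
{
    assert(prefix_len + 8 < cap);   /* room for the digits and the NUL */
    memset(buf, 'x', prefix_len);
    snprintf(buf + prefix_len, cap - prefix_len, "%06d", index);
}

/* Count how many character positions a byte-wise comparison must
 * inspect before it can tell the two names apart (or reach the end). */
size_t chars_compared(const char *a, const char *b)
{
    size_t i = 0;
    while (a[i] != '\0' && a[i] == b[i])
        i++;
    return i + 1;                   /* include the deciding position */
}
```

With a 200-character shared prefix, two names that differ only in their last digit need 206 positions inspected per comparison, versus 6 for the same suffixes with no prefix. That is a modest constant-factor cost per lookup, consistent with the small slowdown reported above rather than a 1000x one.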
When we write a restart file in mercury on sierra, it looks like we are spending 61% of our time in DBGetToc and 48% of the time in strncat() way down the call stack.
This only shows up in the "new" silo just released.
This confluence page has 2 .pdf files comparing the profile of the old vs new silo.
https://rzlc.llnl.gov/confluence/display/MER/Release+Issues