Tests segfault on startup #5663
Comments
Not observed in the nightly job for 4.1.0-dev.40.
I just observed the segfault while running core-backend tests on Ubuntu, on a branch newly created from master.
Dev.41 is using Node 18.16.1 and dev.40 is using 18.16.0; I'm not sure whether that would be enough to cause the issue. 18.16.1 also came out within the last 24 hours, I believe. Chuck also ran into segfaults when updating c-ares, and 18.16.1 appears to include fixes for some c-ares vulnerabilities. The fix in Chuck's case was to hide the symbols from the global namespace (similar to what Affan did to fix our segfaulting OpenSSL), because Node was stepping on the symbols from our version of c-ares in libsrc and causing segfaults. My guess is that this is the cause. We haven't produced a new addon with the fix, but it is already in master: https://github.com/iTwin/imodel-native/pull/297/files#diff-88ba601cab81905cdeda950a5c1189da911b5b755ac49e752ab75f19c5031eab:~:text=ifdef%20__unix,%25endif We could possibly pin our Node dependency to 18.16.0 to get around this until we have an addon out.
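One way to check whether an addon build still exposes c-ares symbols globally is to list its dynamic symbol table and look for names with the `ares_` prefix. The sketch below is illustrative only and is not the fix from imodel-native. It assumes GNU `nm` on Linux; the helper names, the sample `nm` text, and the addon path in the comment are all hypothetical.

```python
import subprocess


def run_nm(library_path: str) -> str:
    """Return the text of `nm -D --defined-only` for a shared library.

    Assumes GNU binutils `nm` is on PATH (Linux).
    """
    result = subprocess.run(
        ["nm", "-D", "--defined-only", library_path],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


def exported_symbols(nm_output: str, prefix: str = "ares_") -> list[tuple[str, str]]:
    """Return (name, type letter) for each defined dynamic symbol with `prefix`."""
    found = []
    for line in nm_output.splitlines():
        parts = line.split()
        # A defined symbol line has three fields: <address> <type> <name>.
        if len(parts) != 3:
            continue
        _address, sym_type, name = parts
        # Drop a symbol-version suffix such as "@@VERSION" if nm printed one.
        name = name.split("@", 1)[0]
        if name.startswith(prefix):
            found.append((name, sym_type))
    return found


# Made-up sample input in the shape nm prints; not output from a real addon.
SAMPLE = """\
0000000000004a10 T ares_init
0000000000004b20 T ares_library_init
0000000000012340 T napi_register_module_v1
0000000000201040 V ares_example_weak_object
"""

if __name__ == "__main__":
    # Against a real build this would be (hypothetical path):
    #   exported_symbols(run_nm("/path/to/addon.node"))
    for name, sym_type in exported_symbols(SAMPLE):
        print(sym_type, name)
```

In `nm` output, `T` marks a global symbol in the text section and `V` marks a weak object. If the symbols have been hidden as described above, a check like this would be expected to return an empty list for the addon, since hidden symbols are not placed in the dynamic symbol table.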
I repro'ed with 18.16.0, but only that once. I suppose it's possible the crash is sporadic but more likely to occur with 18.16.1? I'll update to that.
It still could be the same problem with a different library; the symbols are weak objects, which means that the linker will choose one definition, and its choice can change if one of the libraries changes. This article talks about the symbols pretty well, although in the context of a different problem.
Tests passed on macOS after @nick4598 forced them to use 18.16.0. Linux is still running; no segfaults yet.
Yep, that's the same call stack we got when debugging Chuck's branch, which also updated c-ares. The fix is in master on imodel-native, but not in an addon yet.
The build to publish the new addon keeps hanging on Linux. It stops producing output while running the following parts (the same parts both times): ['ECPresentation:UnitTests-NonPublished', 'ECDb:RunGtest', 'iModelPlatform:UnitTests-NonPublished', 'iModelPlatform:BuildIModelEvolutionTests', 'Visualization:UnitTests'] I repro'ed locally on Ubuntu (it freezes my shell), but I failed to note the list of parts that were running when it hung. I rebuilt single-threaded (
The build is taking the Linux boxes offline. I keep rerunning the one build and watching another machine go down. Possibly the tests use a lot more memory than they did previously? What seems to be happening is that the box stops contacting the server, so it is marked offline. My suspicion is that it is memory-starved. I'm going to look at more logs to see if I can learn anything more.
Reopening this issue until we have a Node addon for 3.x and 4.x that resolves the segfault. The fix is already in both the main branch and the release/3.x branch.
Were you able to get any information from the logs? I noticed there are a few new 'Prepare' tests added shortly before we attempted the new addon. Maybe those tests aren't at fault but were just enough to push our memory usage too high and make it more likely that the Linux boxes would crash?
Nothing new. The logs say that the connection was lost. The boxes don't crash, but they stop talking to the server, so they get listed as "offline".
@nick4598, you mentioned that the fix was already in the branch and that you reopened this issue so it could be closed after verifying in 4.x. Can you check and mark this issue accordingly?
The 3.x fix is in itwinjs-core versions 3.7.11 and greater. The fix is also in all versions of itwinjs-core 4.1.x, and in master as well.
Describe the bug
rush cover in CI jobs is producing segmentation faults on macOS and Linux.
To Reproduce
Steps to reproduce the behavior:
Screenshots
Desktop (please complete the applicable information):
Additional context
Failing build pipeline.
First observed in #5660. Occurred on all 3 runs of the pipeline.
#5661 produces similar results, with no code changes vs master.
Each test suite crashes without completing a single test. Only test suites that use @itwin/core-backend are affected.
The most recent addon included upgrades of several third-party libraries. These failures were not observed at the time the new addon was integrated.
I have been unable to reproduce the problem running rush cover on Ubuntu 22.04.