Frequent clangd test failures on Windows buildbot #1712

HighCommander4 · 2023-07-24T08:13:40Z

It appears there are frequent clangd test failures on the Windows buildbots.

A few samples:

https://lab.llvm.org/buildbot/#/builders/123/builds/20148/steps/4/logs/stdio

Failed Tests (3):
  Clangd Unit Tests :: ./ClangdTests.exe/LSPTest/DiagnosticsHeaderSaved
  Clangd Unit Tests :: ./ClangdTests.exe/LSPTest/FeatureModulesThreadingTest
  Clangd Unit Tests :: ./ClangdTests.exe/LSPTest/ModulesTest

https://lab.llvm.org/buildbot/#/builders/123/builds/20145/steps/4/logs/stdio

Failed Tests (2):
  Clangd Unit Tests :: ./ClangdTests.exe/LSPTest/FeatureModulesThreadingTest
  Clangd Unit Tests :: ./ClangdTests.exe/LSPTest/ModulesTest

https://lab.llvm.org/buildbot/#/builders/123/builds/20142/steps/4/logs/stdio

Failed Tests (1):
  Clangd Unit Tests :: ./ClangdTests.exe/LSPTest/DiagnosticsHeaderSaved

https://lab.llvm.org/buildbot/#/builders/123/builds/20137/steps/4/logs/stdio

Failed Tests (8):
  Clangd Unit Tests :: ./ClangdTests.exe/12/38
  Clangd Unit Tests :: ./ClangdTests.exe/31/38
  Clangd Unit Tests :: ./ClangdTests.exe/7/38
  Clangd Unit Tests :: ./ClangdTests.exe/ClangdServerTest/ReparseOnHeaderChange
  Clangd Unit Tests :: ./ClangdTests.exe/ClangdServerTest/TestStackOverflow
  Clangd Unit Tests :: ./ClangdTests.exe/ClangdThreadingTest/NoConcurrentDiagnostics
  Clangd Unit Tests :: ./ClangdTests.exe/CompletionTest/DynamicIndexIncludeInsertion
  Clangd Unit Tests :: ./ClangdTests.exe/LSPTest/CDBConfigIntegration

https://lab.llvm.org/buildbot/#/builders/123/builds/20134/steps/4/logs/stdio

Failed Tests (3):
  Clangd Unit Tests :: ./ClangdTests.exe/28/38
  Clangd Unit Tests :: ./ClangdTests.exe/LSPTest/GoToDefinition
  Clangd Unit Tests :: ./ClangdTests.exe/LSPTest/RecordsLatencies

https://lab.llvm.org/buildbot/#/builders/123/builds/20131/steps/4/logs/stdio

Failed Tests (1):
  Clangd Unit Tests :: ./ClangdTests.exe/LSPTest/CDBConfigIntegration

https://lab.llvm.org/buildbot/#/builders/123/builds/20022/steps/4/logs/stdio

Failed Tests (2):
  Clangd Unit Tests :: ./ClangdTests.exe/28/38
  Clangd Unit Tests :: ./ClangdTests.exe/LSPTest/GoToDefinition

A common element seems to be the following message printed before the failures:

No result from call after 10 seconds!

cc @kadircet @hokein @sam-mccall


AaronBallman · 2023-07-26T11:34:48Z

I'm noticing a new message that might explain some of the instability.

https://lab.llvm.org/buildbot/#/builders/123/builds/20277

has a failure:

C:\b\slave\clang-x64-windows-msvc\llvm-project\llvm\include\llvm/Support/PointerLikeTypeTraits.h(58): fatal error C1088: Cannot flush compiler intermediate file: 'C:\Users\buildbot\AppData\Local\Temp\_CL_3eb71680ex': No space left on device

so I'm contacting the bot owner to see if they can address that; hopefully that will resolve the mysterious failures. However, if the only error you get is "no result from call after 10 seconds" but the root cause is disk space issues, perhaps the diagnostic can be improved in clangd?

AaronBallman · 2023-08-09T18:46:23Z

Any update on this?

Spot-checking https://lab.llvm.org/buildbot/#/builders/clang-x64-windows-msvc shows most of the failures are spurious ones from clangd tests. Given that this has been happening for a few weeks it's become kind of disruptive; should we be exploring disabling these tests?

kadircet · 2023-08-21T14:49:47Z

Considering that even the Clangd Unit Tests :: ./ClangdTests.exe/LSPTest/RecordsLatencies test is failing (I can't get the logs for it), I am inclined to believe that the load on this buildbot is just too high and that our default 10-second timeout for waiting on async work to finish is not enough. This test literally calls a non-existent function and gets a reply from the ClangdLSPServer, without even reaching ClangdServer.

So sending out a patch to bump the deadlines. It'd be great to get rid of them completely, but they're the only thing preventing clangd from hanging the buildbots forever if things go wrong.
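To illustrate the trade-off described above, here is a minimal sketch of a deadline-bounded wait on an async result. It is not clangd's test code: `waitWithDeadline` is a hypothetical helper built only on the standard library, and it assumes the result arrives through a `std::future`.

```cpp
#include <cassert>
#include <chrono>
#include <future>
#include <optional>

// Hypothetical helper (an assumption for illustration, not clangd's API):
// waits up to Timeout for an async result and returns std::nullopt if the
// deadline passes first, so a stuck call cannot hang the test run forever.
template <typename T>
std::optional<T> waitWithDeadline(std::future<T> &Result,
                                  std::chrono::milliseconds Timeout) {
  assert(Result.valid() && "future must refer to a shared state");
  if (Result.wait_for(Timeout) != std::future_status::ready)
    return std::nullopt; // A test would report "no result" at this point.
  return Result.get();
}
```

On an overloaded machine a short `Timeout` fails even when nothing is actually wrong, while removing the deadline entirely would let a genuinely stuck call block the bot indefinitely. That is why raising the deadline is the safer of the two options.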

kadircet · 2023-08-21T15:03:05Z

The patch for the LSP tests is in https://reviews.llvm.org/D158426. I will take a look at ManyUpdates separately, as it's a little more delicate.

We seem to be hitting limits on some Windows buildbots, see clangd/clangd#1712 (comment). So this bumps the timeouts to 60 seconds and drops them completely for sync requests. As mentioned in the comment above, this should improve things, considering that even the tests that don't touch any complicated scheduler are failing. Differential Revision: https://reviews.llvm.org/D158426

We started seeing a lot of timeouts that align with the change in lit to execute gtests in shards. The logic there assumes tests are single-threaded, which is the case for most of LLVM, hence it picks #shards ~ #cores (slightly overshooting). Enough unittests in clangd rely on multi-threading: they can create arbitrarily many threads, but we limit the amount of meaningful work to ~4 threads per process. This change ensures that we account for that parallelism when executing clangd tests and do not overload the test executors. In theory the change overestimates the requirements, since not all tests are multi-threaded, but it doesn't seem to result in any regressions in my local runs. Fixes llvm/llvm-project#64964. Fixes clangd/clangd#1712.
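The arithmetic behind that reasoning can be sketched as follows. This only illustrates the idea and is not the code from the patch: `maxConcurrentShards` is a hypothetical name, and the value 4 for threads per shard is taken from the "~4 threads per process" estimate in the commit message.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical illustration (not the actual patch): if each test shard may
// keep up to ThreadsPerShard threads busy, then running one shard per core
// oversubscribes the machine. Dividing the core count by the per-shard
// thread estimate gives a safer number of shards to run at once.
unsigned maxConcurrentShards(unsigned Cores, unsigned ThreadsPerShard) {
  assert(ThreadsPerShard > 0 && "each shard runs at least one thread");
  // Always allow at least one shard, even on very small machines.
  return std::max(1u, Cores / ThreadsPerShard);
}
```

For example, on a 16-core executor with an estimate of 4 threads per shard, this yields 4 concurrent shards instead of roughly 16. With an estimate of 1 thread per shard it falls back to the single-threaded assumption of one shard per core.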


HighCommander4 mentioned this issue Aug 19, 2023

ClangdTests.exe/TUSchedulerTests.ManyUpdates flaky on windows llvm/llvm-project#50117

Closed

HighCommander4 mentioned this issue Aug 24, 2023

clangd/unittests/TUSchedulerTests.cpp fails and timesout frequently llvm/llvm-project#64964

Closed

kadircet mentioned this issue Sep 6, 2023

[clangd][unittests] Limit paralelism for clangd unittests llvm/llvm-project#65444

Closed

kadircet closed this as completed in llvm/llvm-project@9a26d2c Sep 6, 2023
