-
Couldn't load subscription status.
- Fork 15k
Avoid stalls when MainLoop::Interrupt fails to wake up the MainLoop #164905
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
@llvm/pr-subscribers-lldb Author: None (jimingham) ChangesTurns out there's a bug in the current lldb sources that if you fork, set the stdio file handles to close on exec and then exec lldb with some commands and the But I would like to keep lldb from stalling on the way out when this happens. The reason we stall is that we have a MainLoop waiting for signals, and we try to Interrupt it, but because stdio was closed, the interrupt pipe for the MainLoop gets the file descriptor 0, which gets closed by the Python session handler if you run some script command. So the Interrupt fails. We were running the Write to the interrupt pipe wrapped in I made Interrupt (and AddCallback & AddPendingCallback) return a bool for "interrupt success" instead. All the places where code was requesting termination, I added checks for that failure, and skip the std::thread::join call on the MainLoop thread, since that is almost certainly going to stall at this point. I didn't do the same for the Windows MainLoop, as I don't know if/when the WSASetEvent call can fail, so I always return true here. I also didn't turn the test off for Windows. According to the Python docs all the API's I used should work on Windows... If that turns out not to be true I'll make the test Darwin/Unix only. Full diff: https://github.com/llvm/llvm-project/pull/164905.diff 14 Files Affected:
diff --git a/lldb/include/lldb/Host/MainLoopBase.h b/lldb/include/lldb/Host/MainLoopBase.h
index be9a2676e7443..9529f2c214784 100644
--- a/lldb/include/lldb/Host/MainLoopBase.h
+++ b/lldb/include/lldb/Host/MainLoopBase.h
@@ -57,18 +57,23 @@ class MainLoopBase {
// Add a pending callback that will be executed once after all the pending
// events are processed. The callback will be executed even if termination
// was requested.
- void AddPendingCallback(const Callback &callback) {
- AddCallback(callback, std::chrono::steady_clock::time_point());
+ // Returns false if an interrupt was needed to get the loop to act on the new
+ // callback, but the interrupt failed, true otherwise. Mostly used when the
+ // pending callback is a RequestTermination, since if the interrupt fails for
+ // that callback, waiting for the MainLoop thread to terminate could stall.
+ bool AddPendingCallback(const Callback &callback) {
+ return AddCallback(callback, std::chrono::steady_clock::time_point());
}
// Add a callback that will be executed after a certain amount of time has
- // passed.
- void AddCallback(const Callback &callback, std::chrono::nanoseconds delay) {
- AddCallback(callback, std::chrono::steady_clock::now() + delay);
+ // passed. See AddPendingCallback comment for the return value.
+ bool AddCallback(const Callback &callback, std::chrono::nanoseconds delay) {
+ return AddCallback(callback, std::chrono::steady_clock::now() + delay);
}
// Add a callback that will be executed after a given point in time.
- void AddCallback(const Callback &callback, TimePoint point);
+ // See AddPendingCallback comment for the return value.
+ bool AddCallback(const Callback &callback, TimePoint point);
// Waits for registered events and invoke the proper callbacks. Returns when
// all callbacks deregister themselves or when someone requests termination.
@@ -85,8 +90,9 @@ class MainLoopBase {
virtual void UnregisterReadObject(IOObject::WaitableHandle handle) = 0;
- // Interrupt the loop that is currently waiting for events.
- virtual void Interrupt() = 0;
+ /// Interrupt the loop that is currently waiting for events. Return true if
+ /// the interrupt succeeded, false if it failed.
+ virtual bool Interrupt() = 0;
void ProcessCallbacks();
diff --git a/lldb/include/lldb/Host/posix/MainLoopPosix.h b/lldb/include/lldb/Host/posix/MainLoopPosix.h
index e9ac798b948df..92cdbe9d87ec3 100644
--- a/lldb/include/lldb/Host/posix/MainLoopPosix.h
+++ b/lldb/include/lldb/Host/posix/MainLoopPosix.h
@@ -54,7 +54,7 @@ class MainLoopPosix : public MainLoopBase {
void UnregisterReadObject(IOObject::WaitableHandle handle) override;
void UnregisterSignal(int signo, std::list<Callback>::iterator callback_it);
- void Interrupt() override;
+ bool Interrupt() override;
private:
void ProcessReadObject(IOObject::WaitableHandle handle);
diff --git a/lldb/include/lldb/Host/windows/MainLoopWindows.h b/lldb/include/lldb/Host/windows/MainLoopWindows.h
index 705e7e78ba48a..65b44aa1582c3 100644
--- a/lldb/include/lldb/Host/windows/MainLoopWindows.h
+++ b/lldb/include/lldb/Host/windows/MainLoopWindows.h
@@ -50,7 +50,7 @@ class MainLoopWindows : public MainLoopBase {
protected:
void UnregisterReadObject(IOObject::WaitableHandle handle) override;
- void Interrupt() override;
+ bool Interrupt() override;
private:
llvm::Expected<size_t> Poll();
diff --git a/lldb/source/Host/common/MainLoopBase.cpp b/lldb/source/Host/common/MainLoopBase.cpp
index 64a57e65849e9..232b9bc0aa354 100644
--- a/lldb/source/Host/common/MainLoopBase.cpp
+++ b/lldb/source/Host/common/MainLoopBase.cpp
@@ -12,8 +12,9 @@
using namespace lldb;
using namespace lldb_private;
-void MainLoopBase::AddCallback(const Callback &callback, TimePoint point) {
+bool MainLoopBase::AddCallback(const Callback &callback, TimePoint point) {
bool interrupt_needed;
+ bool interrupt_succeeded = true;
{
std::lock_guard<std::mutex> lock{m_callback_mutex};
// We need to interrupt the main thread if this callback is scheduled to
@@ -22,7 +23,8 @@ void MainLoopBase::AddCallback(const Callback &callback, TimePoint point) {
m_callbacks.emplace(point, callback);
}
if (interrupt_needed)
- Interrupt();
+ interrupt_succeeded = Interrupt();
+ return interrupt_succeeded;
}
void MainLoopBase::ProcessCallbacks() {
diff --git a/lldb/source/Host/posix/MainLoopPosix.cpp b/lldb/source/Host/posix/MainLoopPosix.cpp
index 19a7128fbe407..e5fa32e4da2b7 100644
--- a/lldb/source/Host/posix/MainLoopPosix.cpp
+++ b/lldb/source/Host/posix/MainLoopPosix.cpp
@@ -387,10 +387,13 @@ void MainLoopPosix::ProcessSignal(int signo) {
}
}
-void MainLoopPosix::Interrupt() {
+bool MainLoopPosix::Interrupt() {
if (m_interrupting.exchange(true))
- return;
+ return true;
char c = '.';
- cantFail(m_interrupt_pipe.Write(&c, 1));
+ llvm::Expected<size_t> result = m_interrupt_pipe.Write(&c, 1);
+ if (result && *result != 0)
+ return true;
+ return false;
}
diff --git a/lldb/source/Host/windows/MainLoopWindows.cpp b/lldb/source/Host/windows/MainLoopWindows.cpp
index 9b7df10258bcd..c973e1c61162f 100644
--- a/lldb/source/Host/windows/MainLoopWindows.cpp
+++ b/lldb/source/Host/windows/MainLoopWindows.cpp
@@ -272,4 +272,7 @@ Status MainLoopWindows::Run() {
return Status();
}
-void MainLoopWindows::Interrupt() { WSASetEvent(m_interrupt_event); }
+bool MainLoopWindows::Interrupt() {
+ WSASetEvent(m_interrupt_event);
+ return true;
+}
diff --git a/lldb/source/Plugins/Protocol/MCP/ProtocolServerMCP.cpp b/lldb/source/Plugins/Protocol/MCP/ProtocolServerMCP.cpp
index 390cf3eeb16a5..77a3ba6574cde 100644
--- a/lldb/source/Plugins/Protocol/MCP/ProtocolServerMCP.cpp
+++ b/lldb/source/Plugins/Protocol/MCP/ProtocolServerMCP.cpp
@@ -133,11 +133,12 @@ llvm::Error ProtocolServerMCP::Stop() {
}
// Stop the main loop.
- m_loop.AddPendingCallback(
+ bool addition_succeeded = m_loop.AddPendingCallback(
[](lldb_private::MainLoopBase &loop) { loop.RequestTermination(); });
- // Wait for the main loop to exit.
- if (m_loop_thread.joinable())
+ // Wait for the main loop to exit, but not if we didn't succeed in inserting
+ // our pending callback or we'll wait forever.
+ if (addition_succeeded && m_loop_thread.joinable())
m_loop_thread.join();
m_accept_handles.clear();
diff --git a/lldb/test/API/driver/stdio_closed/TestDriverWithClosedSTDIO.py b/lldb/test/API/driver/stdio_closed/TestDriverWithClosedSTDIO.py
new file mode 100644
index 0000000000000..304e548c8c8a1
--- /dev/null
+++ b/lldb/test/API/driver/stdio_closed/TestDriverWithClosedSTDIO.py
@@ -0,0 +1,46 @@
+"""
+Test that if you exec lldb with the stdio file handles
+closed, it is able to exit without hanging.
+"""
+
+
+import lldb
+import os
+import sys
+import socket
+import fcntl
+
+import lldbsuite.test.lldbutil as lldbutil
+from lldbsuite.test.lldbtest import *
+
+
+class RenameThisSampleTestTestCase(TestBase):
+ # If your test case doesn't stress debug info, then
+ # set this to true. That way it won't be run once for
+ # each debug info format.
+ NO_DEBUG_INFO_TESTCASE = True
+
+ def test_run_lldb_and_wait(self):
+ """This test forks, closes the stdio channels and exec's lldb.
+ Then it waits for it to exit and asserts it did that successfully"""
+ pid = os.fork()
+ if pid == 0:
+ fcntl.fcntl(sys.stdin, fcntl.F_SETFD, fcntl.FD_CLOEXEC)
+ fcntl.fcntl(sys.stdout, fcntl.F_SETFD, fcntl.FD_CLOEXEC)
+ fcntl.fcntl(sys.stderr, fcntl.F_SETFD, fcntl.FD_CLOEXEC)
+ lldb = lldbtest_config.lldbExec
+ print(f"About to run: {lldb}")
+ os.execlp(lldb, lldb, "-x", "-b", "-o", "script print(lldb.debugger.GetNumTargets())", "--batch")
+ else:
+ if pid == -1:
+ print("Couldn't fork a process.")
+ return
+ ret_pid, status = os.waitpid(pid, 0)
+ # We're really just checking that lldb doesn't stall.
+ # At the time this test was written, if you close stdin
+ # in an asserts build, lldb aborts. So handle both
+ # of those cases. The failure will just be that the
+ # waitpid doesn't return, and the test times out.
+ self.assertFalse(os.WIFSTOPPED(status), "We either exited or crashed.")
+
+
diff --git a/lldb/tools/driver/Driver.cpp b/lldb/tools/driver/Driver.cpp
index ba0041111045b..f9cf4c41bf682 100644
--- a/lldb/tools/driver/Driver.cpp
+++ b/lldb/tools/driver/Driver.cpp
@@ -902,9 +902,10 @@ int main(int argc, char const *argv[]) {
}
#if !defined(_WIN32)
- signal_loop.AddPendingCallback(
- [](MainLoopBase &loop) { loop.RequestTermination(); });
- signal_thread.join();
+ // Try to interrupt the signal thread. If that succeeds, wait for it to exit.
+ if (signal_loop.AddPendingCallback(
+ [](MainLoopBase &loop) { loop.RequestTermination(); }))
+ signal_thread.join();
#endif
return exit_code;
diff --git a/lldb/tools/lldb-dap/DAP.cpp b/lldb/tools/lldb-dap/DAP.cpp
index 3c4f2253d1ad5..c54d1a2b43092 100644
--- a/lldb/tools/lldb-dap/DAP.cpp
+++ b/lldb/tools/lldb-dap/DAP.cpp
@@ -1104,9 +1104,11 @@ llvm::Error DAP::Loop() {
"unhandled packet");
}
- m_loop.AddPendingCallback(
- [](MainLoopBase &loop) { loop.RequestTermination(); });
- thread.join();
+ // Don't wait to join the mainloop thread if our callback wasn't added
+ // successfully, or we'll wait forever.
+ if (m_loop.AddPendingCallback(
+ [](MainLoopBase &loop) { loop.RequestTermination(); }))
+ thread.join();
if (m_error_occurred)
return llvm::createStringError(llvm::inconvertibleErrorCode(),
diff --git a/lldb/unittests/DAP/TestBase.cpp b/lldb/unittests/DAP/TestBase.cpp
index 83a303554ad6b..8cb459964f7d8 100644
--- a/lldb/unittests/DAP/TestBase.cpp
+++ b/lldb/unittests/DAP/TestBase.cpp
@@ -55,8 +55,9 @@ void TransportBase::SetUp() {
}
void TransportBase::Run() {
- loop.AddPendingCallback(
+ bool addition_succeeded = loop.AddPendingCallback(
[](lldb_private::MainLoopBase &loop) { loop.RequestTermination(); });
+ EXPECT_TRUE(addition_succeeded);
EXPECT_THAT_ERROR(loop.Run().takeError(), llvm::Succeeded());
}
diff --git a/lldb/unittests/Host/JSONTransportTest.cpp b/lldb/unittests/Host/JSONTransportTest.cpp
index 54f1372ca0fff..26d8913b20898 100644
--- a/lldb/unittests/Host/JSONTransportTest.cpp
+++ b/lldb/unittests/Host/JSONTransportTest.cpp
@@ -269,12 +269,13 @@ template <typename T> class JSONTransportTest : public PipePairTest {
loop.RequestTermination();
});
}
- loop.AddCallback(
+ bool addition_succeeded = loop.AddCallback(
[](MainLoopBase &loop) {
loop.RequestTermination();
FAIL() << "timeout";
},
timeout);
+ EXPECT_TRUE(addition_succeeded);
auto handle = transport->RegisterMessageHandler(loop, message_handler);
if (!handle)
return handle.takeError();
@@ -367,7 +368,9 @@ class TransportBinderTest : public testing::Test {
}
void Run() {
- loop.AddPendingCallback([](auto &loop) { loop.RequestTermination(); });
+ bool addition_succeeded = loop.AddPendingCallback([](auto &loop)
+ { loop.RequestTermination(); });
+ EXPECT_TRUE(addition_succeeded);
EXPECT_THAT_ERROR(loop.Run().takeError(), Succeeded());
}
};
@@ -435,8 +438,9 @@ TEST_F(HTTPDelimitedJSONTransportTest, ReadPartialMessage) {
EXPECT_CALL(message_handler, Received(Request{5, "foo", std::nullopt}));
ASSERT_THAT_EXPECTED(input.Write(part1.data(), part1.size()), Succeeded());
- loop.AddPendingCallback(
+ bool addition_succeeded = loop.AddPendingCallback(
[](MainLoopBase &loop) { loop.RequestTermination(); });
+ EXPECT_TRUE(addition_succeeded);
ASSERT_THAT_ERROR(Run(/*close_stdin=*/false), Succeeded());
ASSERT_THAT_EXPECTED(input.Write(part2.data(), part2.size()), Succeeded());
input.CloseWriteFileDescriptor();
@@ -454,15 +458,17 @@ TEST_F(HTTPDelimitedJSONTransportTest, ReadWithZeroByteWrites) {
ASSERT_THAT_EXPECTED(input.Write(part1.data(), part1.size()), Succeeded());
// Run the main loop once for the initial read.
- loop.AddPendingCallback(
+ bool addition_succeeded = loop.AddPendingCallback(
[](MainLoopBase &loop) { loop.RequestTermination(); });
+ EXPECT_TRUE(addition_succeeded);
ASSERT_THAT_ERROR(Run(/*close_stdin=*/false), Succeeded());
// zero-byte write.
ASSERT_THAT_EXPECTED(input.Write(part1.data(), 0),
Succeeded()); // zero-byte write.
- loop.AddPendingCallback(
+ addition_succeeded = loop.AddPendingCallback(
[](MainLoopBase &loop) { loop.RequestTermination(); });
+ EXPECT_TRUE(addition_succeeded);
ASSERT_THAT_ERROR(Run(/*close_stdin=*/false), Succeeded());
// Write the remaining part of the message.
@@ -569,8 +575,9 @@ TEST_F(JSONRPCTransportTest, ReadPartialMessage) {
EXPECT_CALL(message_handler, Received(Request{42, "foo", std::nullopt}));
ASSERT_THAT_EXPECTED(input.Write(part1.data(), part1.size()), Succeeded());
- loop.AddPendingCallback(
+ bool addition_succeeded = loop.AddPendingCallback(
[](MainLoopBase &loop) { loop.RequestTermination(); });
+ EXPECT_TRUE(addition_succeeded);
ASSERT_THAT_ERROR(Run(/*close_input=*/false), Succeeded());
ASSERT_THAT_EXPECTED(input.Write(part2.data(), part2.size()), Succeeded());
diff --git a/lldb/unittests/Host/MainLoopTest.cpp b/lldb/unittests/Host/MainLoopTest.cpp
index ae16d02101819..b546b0f71286a 100644
--- a/lldb/unittests/Host/MainLoopTest.cpp
+++ b/lldb/unittests/Host/MainLoopTest.cpp
@@ -179,9 +179,13 @@ TEST_F(MainLoopTest, PipeDelayBetweenRegisterAndRun) {
ASSERT_THAT_EXPECTED(pipe.Write(&X, len), llvm::HasValue(1));
};
// Add a write that triggers a read events.
- loop.AddCallback(cb, std::chrono::milliseconds(500));
- loop.AddCallback([](MainLoopBase &loop) { loop.RequestTermination(); },
+ bool addition_succeeded = loop.AddCallback(cb,
+ std::chrono::milliseconds(500));
+ ASSERT_TRUE(addition_succeeded);
+ addition_succeeded = loop.AddCallback([](MainLoopBase &loop)
+ { loop.RequestTermination(); },
std::chrono::milliseconds(1000));
+ ASSERT_TRUE(addition_succeeded);
ASSERT_TRUE(error.Success());
ASSERT_TRUE(handle);
@@ -310,8 +314,10 @@ TEST_F(MainLoopTest, NoSpuriousSocketReads) {
error);
ASSERT_THAT_ERROR(error.ToError(), llvm::Succeeded());
// Terminate the loop after one second.
- loop.AddCallback([](MainLoopBase &loop) { loop.RequestTermination(); },
+ bool addition_succeeded = loop.AddCallback([](MainLoopBase &loop)
+ { loop.RequestTermination(); },
std::chrono::seconds(1));
+ ASSERT_TRUE(addition_succeeded);
ASSERT_THAT_ERROR(loop.Run().ToError(), llvm::Succeeded());
// Make sure the callback was called only once.
@@ -388,10 +394,11 @@ TEST_F(MainLoopTest, PendingCallbackTrigger) {
MainLoop loop;
std::promise<void> add_callback2;
bool callback1_called = false;
- loop.AddPendingCallback([&](MainLoopBase &loop) {
+ bool addition_succeeded = loop.AddPendingCallback([&](MainLoopBase &loop) {
callback1_called = true;
add_callback2.set_value();
});
+ EXPECT_TRUE(addition_succeeded);
Status error;
ASSERT_THAT_ERROR(error.ToError(), llvm::Succeeded());
bool callback2_called = false;
@@ -416,9 +423,11 @@ TEST_F(MainLoopTest, ManyPendingCallbacks) {
// caused a deadlock when the pipe filled up (either because the main loop was
// not running, because it was slow, or because it was busy/blocked doing
// something else).
- for (int i = 0; i < 65536; ++i)
- loop.AddPendingCallback(
+ for (int i = 0; i < 65536; ++i) {
+ bool addition_succeeded = loop.AddPendingCallback(
[&](MainLoopBase &loop) { loop.RequestTermination(); });
+ EXPECT_TRUE(addition_succeeded);
+ }
ASSERT_TRUE(loop.Run().Success());
}
@@ -444,8 +453,10 @@ TEST_F(MainLoopTest, TimedCallbacksRunInOrder) {
add_cb(2);
add_cb(4);
add_cb(1);
- loop.AddCallback([](MainLoopBase &loop) { loop.RequestTermination(); },
+ bool addition_succeeded = loop.AddCallback([](MainLoopBase &loop)
+ { loop.RequestTermination(); },
start + 5 * epsilon);
+ EXPECT_TRUE(addition_succeeded);
ASSERT_THAT_ERROR(loop.Run().takeError(), llvm::Succeeded());
EXPECT_GE(std::chrono::steady_clock::now() - start, 5 * epsilon);
ASSERT_THAT(order, testing::ElementsAre(1, 2, 3, 4));
@@ -455,22 +466,24 @@ TEST_F(MainLoopTest, TimedCallbackShortensSleep) {
MainLoop loop;
auto start = std::chrono::steady_clock::now();
bool long_callback_called = false;
- loop.AddCallback(
+ bool addition_succeeded = loop.AddCallback(
[&](MainLoopBase &loop) {
long_callback_called = true;
loop.RequestTermination();
},
std::chrono::seconds(30));
+ EXPECT_TRUE(addition_succeeded);
std::future<Status> async_run =
std::async(std::launch::async, &MainLoop::Run, std::ref(loop));
std::this_thread::sleep_for(std::chrono::milliseconds(100));
bool short_callback_called = false;
- loop.AddCallback(
+ addition_succeeded = loop.AddCallback(
[&](MainLoopBase &loop) {
short_callback_called = true;
loop.RequestTermination();
},
std::chrono::seconds(1));
+ EXPECT_TRUE(addition_succeeded);
ASSERT_THAT_ERROR(async_run.get().takeError(), llvm::Succeeded());
EXPECT_LT(std::chrono::steady_clock::now() - start, std::chrono::seconds(10));
EXPECT_TRUE(short_callback_called);
diff --git a/lldb/unittests/Protocol/ProtocolMCPServerTest.cpp b/lldb/unittests/Protocol/ProtocolMCPServerTest.cpp
index 45464db958e04..97f32e2fbb1bf 100644
--- a/lldb/unittests/Protocol/ProtocolMCPServerTest.cpp
+++ b/lldb/unittests/Protocol/ProtocolMCPServerTest.cpp
@@ -150,8 +150,9 @@ class ProtocolServerMCPTest : public testing::Test {
/// Runs the MainLoop a single time, executing any pending callbacks.
void Run() {
- loop.AddPendingCallback(
+ bool addition_succeeded = loop.AddPendingCallback(
[](MainLoopBase &loop) { loop.RequestTermination(); });
+ EXPECT_TRUE(addition_succeeded);
EXPECT_THAT_ERROR(loop.Run().takeError(), Succeeded());
}
|
You can test this locally with the following command:git-clang-format --diff origin/main HEAD --extensions h,cpp -- lldb/include/lldb/Host/MainLoopBase.h lldb/include/lldb/Host/posix/MainLoopPosix.h lldb/include/lldb/Host/windows/MainLoopWindows.h lldb/source/Host/common/MainLoopBase.cpp lldb/source/Host/posix/MainLoopPosix.cpp lldb/source/Host/windows/MainLoopWindows.cpp lldb/source/Plugins/Protocol/MCP/ProtocolServerMCP.cpp lldb/tools/driver/Driver.cpp lldb/tools/lldb-dap/DAP.cpp lldb/unittests/DAP/TestBase.cpp lldb/unittests/Host/JSONTransportTest.cpp lldb/unittests/Host/MainLoopTest.cpp lldb/unittests/Protocol/ProtocolMCPServerTest.cpp --diff_from_common_commit
View the diff from clang-format here.diff --git a/lldb/source/Host/windows/MainLoopWindows.cpp b/lldb/source/Host/windows/MainLoopWindows.cpp
index 5e5888aee..96e871c8a 100644
--- a/lldb/source/Host/windows/MainLoopWindows.cpp
+++ b/lldb/source/Host/windows/MainLoopWindows.cpp
@@ -272,6 +272,4 @@ Status MainLoopWindows::Run() {
return Status();
}
-bool MainLoopWindows::Interrupt() {
- return WSASetEvent(m_interrupt_event);
-}
+bool MainLoopWindows::Interrupt() { return WSASetEvent(m_interrupt_event); }
|
|
✅ With the latest revision this PR passed the Python code formatter. |
| from lldbsuite.test.lldbtest import * | ||
|
|
||
|
|
||
| class RenameThisSampleTestTestCase(TestBase): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should the test name be updated?
| lldb, | ||
| lldb, | ||
| "-x", | ||
| "-b", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does it matter that -b and --batch are both present? Should we remove one?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It doesn't matter to lldb, each time we see it we set a boolean to true... But it's more regular to mention it once. I'll fix that.
|
|
||
| char c = '.'; | ||
| cantFail(m_interrupt_pipe.Write(&c, 1)); | ||
| llvm::Expected<size_t> result = m_interrupt_pipe.Write(&c, 1); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we get a errno EAGAIN or EINTR during this write? Looking at the ::write
| ssize_t result = ::write(fd, buf, size); |
Maybe we can wrap the write call in
| inline decltype(auto) RetryAfterSignal(const FailT &Fail, const Fun &F, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The error in the particular case I was addressing was that the interrupt pipe that we were trying to write to got closed out from under us. The error was "bad file handle". Since Write wasn't actually at fault here, I wasn't setting out to make the Write more robust. In this patch, I'm just trying to make sure the error gets reported so we don't stall waiting to join a thread that's never going to exit.
Making Write more robust in against interrupts would be a fine thing to do but is outside the scope of this patch.
|
|
||
| void MainLoopWindows::Interrupt() { WSASetEvent(m_interrupt_event); } | ||
| bool MainLoopWindows::Interrupt() { | ||
| WSASetEvent(m_interrupt_event); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
WSASetEvent returns a bool that indicates if it succeeded or not, so we could return that here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I made that change.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Premise makes sense to me. With the limited scope of this change, I don't have any objections. I left one minor suggestion otherwise.
| llvm::Expected<size_t> result = m_interrupt_pipe.Write(&c, 1); | ||
| if (result && *result != 0) | ||
| return true; | ||
| return false; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Minor Suggestion: return result && *result != 0
|
Note the buildbot lldb-x86_64-win is broken after this patch. The fcntl module is not available on Windows. |
https://github.com/jimingham/from-apple-llvm-project/pull/new/no-fcntl-on-windows It's okay to apply this sort of fix yourself if you know the reason for it... |
…lvm#164905) Turns out there's a bug in the current lldb sources that if you fork, set the stdio file handles to close on exec and then exec lldb with some commands and the `--batch` flag, lldb will stall on exit. The first cause of the bug is that the Python session handler - and probably other places in lldb - think 0, 1, and 2 HAVE TO BE the stdio file handles, and open and close and dup them as needed. NB: I am NOT trying to fix that bug. I'm not convinced running the lldb driver headless is worth a lot of effort, it's just as easy to redirect them to /dev/null, which does work. But I would like to keep lldb from stalling on the way out when this happens. The reason we stall is that we have a MainLoop waiting for signals, and we try to Interrupt it, but because stdio was closed, the interrupt pipe for the MainLoop gets the file descriptor 0, which gets closed by the Python session handler if you run some script command. So the Interrupt fails. We were running the Write to the interrupt pipe wrapped in `llvm::cantFail`, but in a no asserts build that just drops the error on the floor. So then lldb went on to call std::thread::join on the still active MainLoop, and that stalls I made Interrupt (and AddCallback & AddPendingCallback) return a bool for "interrupt success" instead. All the places where code was requesting termination, I added checks for that failure, and skip the std::thread::join call on the MainLoop thread, since that is almost certainly going to stall at this point. I didn't do the same for the Windows MainLoop, as I don't know if/when the WSASetEvent call can fail, so I always return true here. I also didn't turn the test off for Windows. According to the Python docs all the API's I used should work on Windows... If that turns out not to be true I'll make the test Darwin/Unix only.
|
Please approve #165417 to fix the buildbot lldb-remote-linux-win. |
Turns out there's a bug in the current lldb sources that if you fork, set the stdio file handles to close on exec and then exec lldb with some commands and the
--batchflag, lldb will stall on exit. The first cause of the bug is that the Python session handler - and probably other places in lldb - think 0, 1, and 2 HAVE TO BE the stdio file handles, and open and close and dup them as needed. NB: I am NOT trying to fix that bug. I'm not convinced running the lldb driver headless is worth a lot of effort, it's just as easy to redirect them to /dev/null, which does work.But I would like to keep lldb from stalling on the way out when this happens. The reason we stall is that we have a MainLoop waiting for signals, and we try to Interrupt it, but because stdio was closed, the interrupt pipe for the MainLoop gets the file descriptor 0, which gets closed by the Python session handler if you run some script command. So the Interrupt fails.
We were running the Write to the interrupt pipe wrapped in
llvm::cantFail, but in a no asserts build that just drops the error on the floor. So then lldb went on to call std::thread::join on the still active MainLoop, and that stallsI made Interrupt (and AddCallback & AddPendingCallback) return a bool for "interrupt success" instead. All the places where code was requesting termination, I added checks for that failure, and skip the std::thread::join call on the MainLoop thread, since that is almost certainly going to stall at this point.
I didn't do the same for the Windows MainLoop, as I don't know if/when the WSASetEvent call can fail, so I always return true here. I also didn't turn the test off for Windows. According to the Python docs all the API's I used should work on Windows... If that turns out not to be true I'll make the test Darwin/Unix only.