Skip to content
This repository has been archived by the owner on Aug 10, 2021. It is now read-only.

[runtime] Fix possible race in terminate handler #4194

Merged
merged 5 commits into from Sep 28, 2020

Conversation

knebekaizer
Copy link
Contributor

No description provided.

@knebekaizer knebekaizer force-pushed the vi/fix_termination_race branch 3 times, most recently from 7065123 to d3c5515 Compare June 4, 2020 14:49
Copy link
Member

@projedi projedi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Incidentally, is it possible to write a test that triggers concurrent termination?

runtime/src/main/cpp/Exceptions.cpp Outdated Show resolved Hide resolved
runtime/src/main/cpp/Exceptions.cpp Outdated Show resolved Hide resolved
runtime/src/main/cpp/Exceptions.cpp Show resolved Hide resolved
runtime/src/main/cpp/Exceptions.cpp Outdated Show resolved Hide resolved
runtime/src/main/cpp/Exceptions.cpp Show resolved Hide resolved
runtime/src/main/cpp/Exceptions.cpp Show resolved Hide resolved
RuntimeCheck(oldTerminateHandler != nullptr, "Underlying exception handler is not set.");
oldTerminateHandler();
}
typedef __attribute__((noreturn)) void (*QH)();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: What about using using instead of typedef? The former would look like:

using QH = __attribute__((noreturn)) void(*)();

Which I personally find more readable (the name of a thing comes before the thing definition). But if you prefer typedefs then it's fine.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, maybe use RUNTIME_NORETURN instead of __attribute__((noreturn)) to keep consistency?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would agree, but macro RUNTIME_NORETURN has its own issue: it hides the fact it is language extension, not the standard [[noreturn]] as one may expect (or as it may change some day). However, in that particular case [[noreturn]] will not work (function pointer), so I have to stay with gcc extension here, or use explicit cast. So at this point I want to may the choice clear (gcc extension vs. standard language feature). Moreover, I believe the macro RUNTIME_NORETURN is a sort of abuse: it reinvents language feature, hides important aspects and does not add value.

runtime/src/main/cpp/Exceptions.cpp Outdated Show resolved Hide resolved
runtime/src/main/cpp/Exceptions.cpp Outdated Show resolved Hide resolved
: queuedHandler_((QH)std::set_terminate(kotlinHandler)) {}

static TerminateHandler *instance() {
static TerminateHandler *singleton = new TerminateHandler();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

By the way, it's possible to use clang's no_destroy here:

static TerminateHandler& instance() {
  [[clang::no_destroy]]
  static TerminateHandler singleton;
  return singleton;
}

It'll avoid heap allocation, but that's about it.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this attribute standard or compiler-specific?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this attribute standard or compiler-specific?

I think it's clang only for now (there's a proposal but I can't find it's current status). However, I don't think it's a huge problem: we are tied to LLVM, and so unlikely to use any compiler but clang in the foreseeable future.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, we are generally free to compile runtime with any C++ compiler.
There are indeed number of blockers, but none of them are fundamental.
So I would prefer moving to compiler-agnostic code, not the opposite.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At the moment it's clang extension and WG21 proposal (P1247) for C++ standard which probably will be accepted. A think it is safe anyway, as this is an optimization technique only and may be ignored if not supported.

@knebekaizer
Copy link
Contributor Author

knebekaizer commented Jun 9, 2020

Incidentally, is it possible to write a test that triggers concurrent termination?

@projedi Yes I wrote a standalone test, i.e. c++ app which does a sort of concurrent stress test for the same implementation. Not related to kotlin per se but sufficient to test the logic. I haven't found a simple way to include such test into our auto test environment.
Here is the gist: https://gist.github.com/knebekaizer/a9fd742d092fa1ba7c37d127b9c14907
You can see there two variants of implementation, both seems to be correct but I choose TerminateHandler2 for better encapsulation.

: queuedHandler_((QH)std::set_terminate(kotlinHandler)) {}

static TerminateHandler* instance() {
static TerminateHandler* singleton [[clang::no_destroy]] = new TerminateHandler();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are pointers necessary here still? Because I don't think no_destroy means anything for primitive types.


static SimpleMutex konanTerminateHandlerInitializationMutex;
// Copy, move and assign would be safe, but not much useful, so let's delete all (rule of 5)
TerminateHandler(const TerminateHandler&) = delete;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Formatting: tab indent here, but 2-space indent in the code just above.

}


Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Formatting: a stray new line?

@projedi
Copy link
Member

projedi commented Jun 9, 2020

I haven't found a simple way to include such test into our auto test environment.
Here is the gist: https://gist.github.com/knebekaizer/a9fd742d092fa1ba7c37d127b9c14907

@knebekaizer What about something like this: interop test with objc?

  1. create two threads from objc;
  2. initialize kotlin runtime in both;
  3. install an unhandled exception handler (Kotlin's setUnhandledExceptionHook) that prints some string x and just loops indefinitely;
  4. do @try { throw [NSException …] } @catch (...) { objc_terminate() } trick in both threads.

We expect this to print x and terminate at some point. Also, if there'd be diagnostic messages for (1) concurrent termination detected, and (2) force exiting by timeout while waiting for the concurrent termination; we could expect them in output also.

// will not reconstruct handler anyway, so let's keep dtor deleted to avoid confusion.
~TerminateHandler() = delete;
public:
/// First call will do the job, all consecuent will do nothing.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: consequent. Sorry for not spotting this earlier.


// Use one public funuction to limit access to the class declaration
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: function

sleep(timeoutSec);
// We come here when another terminate handler hangs for 5 sec, that looks fatally broken. Go to forced exit now.
}
_Exit(EXIT_FAILURE); // force exit
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm afraid one of my questions got lost. What about logging to stderr facts (1) that there's a concurrent termination attempt and (2) that one of them got tired of waiting and is force quitting?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This may be dangerous, as output itself is a sort of conspicuous regarding races, etc. I consider hanging termination (which already happens here) as extremely emergence case, something is awfully wrong - it may be heap corruption or whatever that affects any printing. So I'd prefer to minimize any actions.

@projedi projedi self-requested a review June 19, 2020 07:48
@knebekaizer
Copy link
Contributor Author

I haven't found a simple way to include such test into our auto test environment.
Here is the gist: https://gist.github.com/knebekaizer/a9fd742d092fa1ba7c37d127b9c14907

@knebekaizer What about something like this: interop test with objc?

1. create two threads from objc;

2. initialize kotlin runtime in both;

3. install an unhandled exception handler (Kotlin's `setUnhandledExceptionHook`) that prints some string `x` and just loops indefinitely;

4. do `@try { throw [NSException …] } @catch (...) { objc_terminate() }` trick in both threads.

We expect this to print x and terminate at some point. Also, if there'd be diagnostic messages for (1) concurrent termination detected, and (2) force exiting by timeout while waiting for the concurrent termination; we could expect them in output also.

I suggest stress-test (a number of threads) as a separate commit

public:
template <class Fun> RUNTIME_NORETURN void operator()(Fun block) {
if (!compareAndSet(&terminatingFlag, 0, 1)) {
block();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why does it invoke the block if compareAndSet failed?


void reportUnhandledException(KRef throwable) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This file now has two function with names different only by case of first letters. Doesn't seem ok.

@SvyatoslavScherbina
Copy link
Collaborator

Incidentally, is it possible to write a test that triggers concurrent termination?

I guess we need at least simple non-concurrent tests. Because nothing else prevented us from having the barely noticeable but fatal typo.


---

int test_ConcurrentTerminate();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
int test_ConcurrentTerminate();
int test_ConcurrentTerminate(void);

(probably in async.h too)

for (size_t i = 0; i < 100; ++i) {
futures.emplace_back(std::async(std::launch::async,
[](size_t param) {
std::this_thread::sleep_for(std::chrono::milliseconds(param));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just for the record (i.e. not doubting your approach), I used the following pattern to induce races:

int launchedThreads = 0;
bool threadsCanContinue = false;
for (int i = 0; i < numberOfThreads; ++i) {
  startAThread([]() {
    launchedThreads += 1; // increment atomically
    while (!threadsCanContinue) {} // read atomically
    // Do a racy thing here.
  });
}
while (launchedThreads < numberOfThreads) {} // read atomically
threadsCanContinue = true;  // write atomically

On my machine this pattern turned out to be quite reliable.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! Indeedfuture seems to be a bit "indirect" way, in comparison with startAtThread.
My snippet is derived from more complicate test involving exception propagation with promise and set_exception.

Copy link
Member

@projedi projedi Sep 25, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think std::async with futures would work just fine either way. My point was in using spinlocks for synchronisation as opposed to sleeping.

@knebekaizer knebekaizer merged commit 0058928 into master Sep 28, 2020
@knebekaizer knebekaizer deleted the vi/fix_termination_race branch September 28, 2020 12:18
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants