Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

scripted-diff: Restore AssertLockHeld after #19668, remove LockAssertion #19865

Closed
wants to merge 2 commits into from

Conversation

ryanofsky
Copy link
Contributor

@ryanofsky ryanofsky commented Sep 3, 2020

This PR is a simple 3-line scripted diff followed by a documentation cleanup. The scripted diff does three things:

  1. Deletes all existing AssertLockHeld calls. Since Do not hide compile-time thread safety warnings #19668, AssertLockHeld has had an EXCLUSIVE_LOCKS_REQUIRED annotation which has guaranteed (though ongoing travis and cirrus CI builds since Do not hide compile-time thread safety warnings #19668 was merged) that these AssertLockHeld calls are redundant with EXCLUSIVE_LOCKS_REQUIRED or EXCLUSIVE_LOCK_FUNCTION annotations already present in surrounding code, and specifically that:
  • There is no way the assert calls can trigger any behavior at runtime.
  • The assert calls provide no new thread safety information to the compiler.
  1. Reverts AssertLockHeld implementation which got changed in Do not hide compile-time thread safety warnings #19668 to it's original pre-Do not hide compile-time thread safety warnings #19668 state. In Do not hide compile-time thread safety warnings #19668, AssertLockHeld was changed to have an EXCLUSIVE_LOCKS_REQUIRED annotation instead of having an ASSERT_EXCLUSIVE_LOCK annotation. As described above, having an EXCLUSIVE_LOCKS_REQUIRED annotation on an assert statement is only useful as proof that the assert statement doesn't do anything or convey any new information. By contrast, having an ASSERT_EXCLUSIVE_LOCK annotation on an assert statement can be an actually useful way of conveying to the compiler than a specific mutex is locked at a specific place in the code, when the compiler thread safety analysis can't determine that on its own (because of lost type information).

  2. Removes LockAssertion class, replacing all current uses with calls to AssertLockHeld. Ever since the LockAssertion class was first introduced in Refactor: Start to separate wallet from node #14437 (as LockAnnotation), it has always been an inferior alternative to AssertLockHeld and not had a reason to exist. (Refactor: Start to separate wallet from node #14437 was originally written before [net] Thread safety annotations in net_processing #13423 which added ASSERT_EXCLUSIVE_LOCK annotation. It was justified when the code was first written, but no longer necessary by the time it was merged).

Motivation for this PR is to get rid of confusion between different types of lock assertions and only keep one assertion: AssertLockHeld which goes back to the implementation it's had since #13423 was merged until #19668 was merged. After this PR, AssertLockHeld only needs to be used sparingly to augment compile-time thread safety annotations, which are a superior alternative to runtime checks for guaranteeing thread safety.

PR is compared with alternatives in https://github.com/bitcoin-core/bitcoin-devwiki/wiki/AssertLockHeld-PRs

@DrahtBot
Copy link
Contributor

DrahtBot commented Sep 4, 2020

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Conflicts

Reviewers, this pull request conflicts with the following ones:

If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

@ajtowns
Copy link
Contributor

ajtowns commented Sep 4, 2020

-0.5 on concept, I don't think dropping the runtime checks has any advantage.

For approach, I think just doing git grep -l AssertLockHeld src | grep -v 'sync.h$' | xargs sed -i '/^ *AssertLockHeld(.*);/d' would be better. The current code has AssertLockHeld and AssertLockNotHeld behave in the same way (a runtime check that the mutex is held by this thread or not), which is worth preserving imo.

The updated docs in doc/developer-notes.md would also need updating if this change were to be merged.

@hebasto
Copy link
Member

hebasto commented Sep 4, 2020

I agree with @ajtowns' comment:

We have three types of assertions related for locking:

  • marking a function as EXCLUSIVE_LOCKS_REQUIRED has a compile time check with clang that the caller has already obtained the lock

  • declaring a LockAssertion instance overrides the clang compile time checks by saying that we've already somehow acquired the lock in a way that we can't prove via clang thread safety annotations. This is useful for lambda functions (which could be annotated) that are called from some generic dispatcher like ForEach (that can't be annotated).

  • adding a call to AssertLockHeld does a runtime check if DEBUG_LOCKORDER is enabled. it's annotated with EXCLUSIVE_LOCKS_REQUIRED so is redundant with the compile time checks, but the compile time checks aren't available outside of clang. it's automatically called from LockAssertion to ensure that the compile time checks aren't overridden incorrectly.

As a result:

  • using AssertLockHeld everywhere remains fine, and no longer needlessly disables the compile time checks

  • LockAssertion should only be used very rarely -- and in fact it's only used in net_processing in some ForEach/ForEachNode functions.

Probably, AssertLockHeld deserves a better name, but its functionality and usage are ok.

EXCLUSIVE_LOCKS_REQUIRED is used in a header file. If a function definition is placed in a *.cpp file, using AssertLockHeld has the following benefits (besides a run time check):

  • it shows to a code reader the expected state of lock without referencing to a header file
  • it will warn about missed proper EXCLUSIVE_LOCKS_REQUIRED annotation

AssertLockHeld is a great tool to transit from RecursiveMutex to Mutex in a safe and proven manner. See: #19303, #19833, #19854.

Concept NACK.

@DrahtBot DrahtBot mentioned this pull request Sep 4, 2020
3 tasks
@ryanofsky
Copy link
Contributor Author

If a function definition is placed in a *.cpp file, using AssertLockHeld has the following benefits (besides a run time check):

  • it shows to a code reader the expected state of lock without referencing to a header file
  • it will warn about missed proper EXCLUSIVE_LOCKS_REQUIRED annotation

You are literally talking about adding an annotation to check for the presence of another annotation. This is an absurd idea to me, but to take it seriously, what should the developer guidelines say about using AssertLockHeld this way? Should every function that is annotated with EXCLUSIVE_LOCKS_REQUIRED also have an AssertLockHeld at the top? Is there going to be a linter to check for this, or is this going to be another source of nits in review comments?

AssertLockHeld is a great tool to transit from RecursiveMutex to Mutex in a safe and proven manner. See: #19303, #19833, #19854.

Before the lock annotations added in #19668 this was true. It was a good way to discover where to add EXCLUSIVE_LOCKS_REQUIRED annotations. But now those annotations are added AssertLockHeld is only functioning as annotation checking the presence of another annotation, and doesn't impact the work in those other PRs or improve thread safety in any way.

@hebasto
Copy link
Member

hebasto commented Sep 4, 2020

You are literally talking about adding an annotation to check for the presence of another annotation.

This is one benefit among others (run time check is the main purpose of AssertLockHeld).

This is an absurd idea to me...

Annotations that was missed and added in 3ddc150 and 2ee7743 justify the #19668 approach.

... but to take it seriously, what should the developer guidelines say about using AssertLockHeld this way? Should every function that is annotated with EXCLUSIVE_LOCKS_REQUIRED also have an AssertLockHeld at the top?

Why not?

... is this going to be another source of nits in review comments?

I think it is desirable for a new code.

@ryanofsky
Copy link
Contributor Author

Good! So we agree this PR has no detrimental effects on thread safety, and the NACK is based on a style preference?

@hebasto
Copy link
Member

hebasto commented Sep 4, 2020

Good! So we agree this PR has no detrimental effects on thread safety, and the NACK is based on a style preference?

No. It is based on thread safety.

While migrating from RecursiveMutex to Mutex how one could be confident in the fact that a mutex is actually locked without run time assertion?

Try to remove confusing and no longer useful lock asserts.

At least, could this change be postponed until getting rid of RecursiveMutexs?

@ryanofsky
Copy link
Contributor Author

While migrating from RecursiveMutex to Mutex how one could be confident in the fact that a mutex is actually locked without run time assertion?

If it is annotated with EXCLUSIVE_LOCKS_REQUIRED, it seems you should be confident either that the mutex is actually locked or that LockAssertion was used earlier and would have triggered a runtime error where it was used. This PR isn't removing all runtime checks, just runtime checks redundant with compile time checks. AssertLockHeld is still available whenever you want to use it. It just returns to functioning like a normal runtime check, and not a strange compile time check enforcing the presence of a different compile time check. I can see how the strange check was useful during development of #19668, but it doesn't serve a purpose for thread safety going forward or help with future PRs.

If you want to make an argument for keeping all AssertLockHelds based on readability, that's fine, but then I think you should make a developer guideline saying that AssertLockHeld should be called first thing in any function annotated with EXCLUSIVE_LOCKS_REQUIRED, and ideally have a linter to enforce this. Otherwise if the assert is only used in some places but not others, that is just adding confusion and inconsistency.

@jonatack
Copy link
Member

jonatack commented Sep 4, 2020

I need to review this PR just for what I'll learn. 🐳

@maflcko
Copy link
Member

maflcko commented Sep 5, 2020

I think by default ./configure will pick up gcc, which does not check lock annotations, so the current AssertLockHeld in master have a slight benefit of telling ./configure --enable-debug devs (with gcc) who run the tests before creating a pull that something with their locks is wrong. Though, you correctly say that travis will compile with clang and fail if there is an inconsistency.

The redundant run time checks also serve as a insurance against bugs in clang.

Ideally, they'd be inserted by the compiler whenever a function is annotated. Though, I don't see a way to do this in C++ without wrapping everything into more macros. Another option would be to have a preprocessing step in our ci scripts to insert the redundant run-time checks in enable-debug builds. At least that would make me feel more comfortable removing them.

@maflcko maflcko removed the Wallet label Sep 5, 2020
@ryanofsky
Copy link
Contributor Author

At least that would make me feel more comfortable removing them.

Can you clarify what is uncomfortable? If a function is annotated with EXCLUSIVE_LOCKS_REQUIRED(mutex), and the developer is not using clang, and the developer makes a change that calls the function without locking mutex, having a redundant AssertLockHeld(mutex) isn't going to impact the thread safety of the codebase, because CI ensures we will not merge this code.

So you are uncomfortable about the inconvenience that removing asserts which are already haphazardly and inconsistently placed will cause for developers who are not using clang, but who are building in debug mode, and who are removing locks or calling functions in new places, and who are doing some kind of manual or automated testing that would happen to trigger these assertions at runtime?

Or uncomfortable about something different?

@maflcko
Copy link
Member

maflcko commented Sep 5, 2020

I simply wouldn't put too much trust into clang, since it may have bugs.

@vasild
Copy link
Contributor

vasild commented Sep 7, 2020

We managed to misuse the compiler directives:

  • Our AssertLockHeldInternal() does a runtime check and would not return (aka abort()) if it fails. There is an attribute exactly for that: ASSERT_EXCLUSIVE_LOCK, but we don't use it. We use EXCLUSIVE_LOCKS_REQUIRED, meant for something else. This is a misuse because AssertLockHeldInternal() does not do anything that requires holding the mutex (like reading variables protected by that mutex).

  • Our LockAssertion::LockAssertion() does the same as AssertLockHeldInternal(), but is tagged with EXCLUSIVE_LOCK_FUNCTION. A misuse because LockAssertion() does not acquire the mutex.

I think we shouldn't be doing that for no matter what reason - misleading the compiler that we do one thing while we do another.

The OP in #19668 boils down to:

 1 int x GUARDED_BY(cs_main);
 2
 3 void f() // not annotated with EXCLUSIVE_LOCKS_REQUIRED(cs_main)
 4 {
 5     // AssertLockHeld was properly annotated with ASSERT_EXCLUSIVE_LOCK() before #19668
 6     AssertLockHeld(cs_main);
 7
 8     // no warning here that we access x without holding cs_main
 9     x = 5;
10 }

The compiler does not issue a compile-time warning for line 9 because it knows line 9 is unreachable at run-time if cs_main is not held - AssertLockHeld() from line 6 would not return in that case. IMO we shouldn't try to extort the compiler to produce a warning here. Mis-labeling AssertLockHeld() with EXCLUSIVE_LOCKS_REQUIRED() only to ensure that f() has that attribute indeed looks like adding an annotation to check for the presence of another annotation.

I agree with @MarcoFalke that clang may have bugs and we better not have it as our sole protection. The thread safety analysis look a bit immature to me - for example the warnings produced depend on the order in which attributes are defined (:-O). IMO compile time checks are good and they are an addition to runtime checks, not a replacement.

What about going back to using the proper attributes? Tag AssertLockHeldInternal() with ASSERT_EXCLUSIVE_LOCK when DEBUG_LOCKORDER is defined and no attributes otherwise (because it is a noop, don't fool the compiler) and remove the confusing LockAssertion()?

@hebasto
Copy link
Member

hebasto commented Sep 7, 2020

@vasild

The compiler does not issue a compile-time warning for line 9 because it knows line 9 is unreachable at run-time if cs_main is not held - AssertLockHeld() from line 6 would not return in that case.

But it is not known for a code reader.

@ajtowns
Copy link
Contributor

ajtowns commented Sep 7, 2020

The compiler does not issue a compile-time warning for line 9 because it knows line 9 is unreachable at run-time if cs_main is not held

Line 9 is reachable at run-time if cs_main is not held and DEBUG_LOCKORDER is not specified. If there is not a test that exercises that code path, CI will not detect the bug and prevent the merge.

@vasild
Copy link
Contributor

vasild commented Sep 7, 2020

@ajtowns this is because of another misuse (or "trying to fool the compiler") - in master before #19668 in the non-DEBUG_LOCKORDER case AssertLockHeldInternal() was defined like this:

void static inline AssertLockHeldInternal(...) ASSERT_EXCLUSIVE_LOCK(cs) {}

So, we lied the compiler that we will check and abort(), but we did not do that.

In this case, IMO ASSERT_EXCLUSIVE_LOCK() should not be present. Then line 9 is reachable and we get a compilation warning.

@hebasto
Copy link
Member

hebasto commented Sep 8, 2020

@vasild

@ajtowns this is because of another misuse (or "trying to fool the compiler") - in master before #19668 in the non-DEBUG_LOCKORDER case AssertLockHeldInternal() was defined like this:

void static inline AssertLockHeldInternal(...) ASSERT_EXCLUSIVE_LOCK(cs) {}

So, we lied the compiler that we will check and abort(), but we did not do that.

In this case, IMO ASSERT_EXCLUSIVE_LOCK() should not be present. Then line 9 is reachable and we get a compilation warning.

So, the 23d71d1 commit from the #19668 is a correct change, and it should not be reverted, right?

@vasild
Copy link
Contributor

vasild commented Sep 8, 2020

23d71d1 contains 2 changes. IMO the first one should be reverted and the second change should stay.

@ajtowns
Copy link
Contributor

ajtowns commented Sep 14, 2020

So, we lied the compiler that we will check and abort(), but we did not do that.

I don't think it's useful to put this in moral terms. We're trying to prevent buggy code, by in order of preference, (a) making it impossible to write (eg RAII so locks are free automatically); (b) making the compiler complain about it (eg thread safety annotations); (c) getting predictable safe errors at runtime rather than crashes, hangs or undefined behaviour that only happen randomly (eg lock order checks).

In our code, it's never correct to call AssertLockHeld (that is as it stands prior to this PR, not after LockAssertion is renamed) without already holding the lock, so the additional behaviours allowed by ASSERT_EXCLUSIVE_LOCK aren't helpful, and make it easier to write buggy code.

…kAssertion

-BEGIN VERIFY SCRIPT-
git grep -l AssertLockHeldInternal | xargs sed -i /AssertLockHeldInternal/s/EXCLUSIVE_LOCKS_REQUIRED/ASSERT_EXCLUSIVE_LOCK/
git grep -l AssertLockHeld ':!src/sync.h' | xargs sed -i '/^ *AssertLockHeld(.*);/d'
git grep -l LockAssertion | xargs sed -i 's/LockAssertion lock(\(.*\));/AssertLockHeld(\1);/g'
-END VERIFY SCRIPT-
@DrahtBot
Copy link
Contributor

🐙 This pull request conflicts with the target branch and needs rebase.

Want to unsubscribe from rebase notifications on this pull request? Just convert this pull request to a "draft".

@maflcko
Copy link
Member

maflcko commented Oct 19, 2020

Can be closed?

@maflcko
Copy link
Member

maflcko commented Oct 25, 2020

Let me know if I should reopen

@maflcko maflcko closed this Oct 25, 2020
@bitcoin bitcoin locked as resolved and limited conversation to collaborators Feb 15, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

7 participants