
proposal: sync: prohibit unlocking mutex in a different goroutine #9201

Open
dvyukov opened this issue Dec 4, 2014 · 26 comments

@dvyukov (Member) commented Dec 4, 2014

sync.Mutex allows lock/unlock in different goroutines:

http://golang.org/pkg/sync/#Mutex.Unlock
"A locked Mutex is not associated with a particular goroutine. It is allowed for
one goroutine to lock a Mutex and then arrange for another goroutine to unlock it."

And the same for RWMutex:
http://golang.org/pkg/sync/#RWMutex.Unlock

The ability to unlock a mutex in a different goroutine is very rarely used in
real code. And if you really need something more complex than lock/unlock in a
single goroutine, you can always use channels.

But it creates several problems:

  1. Deadlock detection becomes impossible, as there is no notion of "critical
    sections".
  2. Similarly, static lock annotations become impossible for the same reason.
  3. Optimizations like hardware lock elision (see e.g. Intel HLE) become
    impossible.
  4. Another optimization that becomes impossible is priority boosting inside
    critical sections. Namely, if a goroutine is preempted inside a critical
    section, the scheduler may give it another tiny time slot to finish the
    critical section (or perhaps it can then voluntarily switch in Unlock).
    Solaris did this for a long time.

We should prohibit unlocking a mutex in a different goroutine in Go 2.

@davecheney (Contributor) commented Dec 4, 2014

Comment 1:

For Go 1.x, could detecting unlocking from another goroutine be added as an
experimental runtime feature, similar to the memory fence code that was added a
while back?

@dvyukov (Member, Author) commented Dec 4, 2014

Comment 2:

Detect and then what?

@davecheney (Contributor) commented Dec 4, 2014

Comment 3:

Panic, I guess. This would only be an experimental mode, like the memory fence
checker.

@dvyukov (Member, Author) commented Dec 4, 2014

Comment 4:

What's the benefit here?

@davecheney (Contributor) commented Dec 4, 2014

Comment 5:

It sounds like this is already a programming error; I would like an opportunity
to build a version of Go that detects this.

@cznic (Contributor) commented Dec 4, 2014

Comment 6:

Dmitry, I think the idea is too close to the situation in some other languages
where calling certain functions is prohibited from anything but a
specific/privileged thread. IMHO, goroutines having no implicit user-facing
identity is an important property, and I suggest we keep it.

@dvyukov (Member, Author) commented Dec 4, 2014

Comment 7:

It's not a programming error today. The docs say:
http://golang.org/pkg/sync/#Mutex.Unlock
"A locked Mutex is not associated with a particular goroutine. It is allowed for one
goroutine to lock a Mutex and then arrange for another goroutine to unlock it."
It can also be in third-party code, so even if you detect it, you may not be
able to change it.

@dvyukov (Member, Author) commented Dec 4, 2014

Comment 8:

Re #6: what you say makes sense, but I think the benefits do not outweigh the
drawbacks in this case. It does not look exactly the same as restricting all
GUI calls to the main thread. And we do not take this possibility away; more
complex patterns are still possible with channels. Mutex is for:
mu.Lock()
defer mu.Unlock()
// something

@ianlancetaylor (Contributor) commented Dec 4, 2014

Comment 9:

It should be easy enough for the compiler and/or static analysis tools to see,
for most uses of mutexes, that the mutex is locked and unlocked in the same
goroutine. So I believe that HLE is entirely implementable in the common case,
which is the only case in which it can plausibly be used anyhow.
It should also be straightforward for the compiler to annotate lock/unlock in
the same goroutine for the benefit of dynamic analysis, if that seems useful.
Not that I'm necessarily opposed to this idea, but I think you need stronger
arguments.

Labels changed: added repo-main, release-none.

@dvyukov (Member, Author) commented Dec 4, 2014

Comment 10:

Is it really easy to do?
I see at least 3 problems:
1. If you do obj.subobj.mu.Lock/obj.subobj.mu.Unlock, the compiler must prove
that it's the same mutex in both cases, that is, that the obj.subobj pointer
has not changed. Since these objects are shared (they contain a mutex), it may
not be that simple.
2. If compiler sees:
mu.Lock()
...
mu.Unlock()
and proves that mu refers to the same object in both places, the situation may
actually be:
mu.Lock()
...
// pass responsibility to unlock to a different goroutine
...
// receive responsibility to unlock from yet another goroutine
...
mu.Unlock()
3. Non-trivial control flow like:
func foo() *T {
  t := ...
  ...
  if ... {
    t.mu.Lock()
  }
  ...
  return t
}
t := foo()
...
if ... {
  t.mu.Unlock()
}
...

@ianlancetaylor (Contributor) commented Dec 4, 2014

Comment 11:

If you can't resolve those issues anyhow, I don't understand how you can use HLE.  That
is, I don't see how the restriction that the mutex is manipulated from a single
goroutine is sufficient to use HLE in cases where you can't resolve the issues you
describe.

@dvyukov (Member, Author) commented Dec 4, 2014

Comment 12:

There is nothing to resolve. With the new rules, there must be a pairing
lock/unlock. Even if the identity of obj.subobj.mu has changed in the function,
there still must be a pairing unlock for the original value of obj.subobj.mu
and a pairing lock for the new value of obj.subobj.mu somewhere else. For HLE
and dynamic analysis you don't need to find the pairs statically; they will
manifest themselves at runtime.

@ianlancetaylor (Contributor) commented Dec 4, 2014

Comment 13:

I guess I don't grasp how that would work. You lock a mutex, presumably with an
XACQUIRE instruction. You make a system call. You block. The scheduler runs.
You unblock. You come back in a different thread. You unlock the mutex with an
XRELEASE instruction. The unlock fails. Execution resumes back at the XACQUIRE
lock. But you're running on a different core. Can that really work?
My point is that you need to know what happens between the XACQUIRE and the
XRELEASE. Or so it seems to me.

@dvyukov (Member, Author) commented Dec 4, 2014

Comment 14:

XACQUIRE/XRELEASE transparently fall back to real locking, using some
heuristics, if transactions fail.
If you use RTM (restricted transactional memory), then you fall back to locking
manually, using explicit heuristics, if you see something you can't handle
(e.g. a syscall). But this happens at runtime; no prior static knowledge is
required.
Intel prototyped libpthread support for HLE where you could switch any C
program that uses pthread_mutex to HLE. And it worked.

@ianlancetaylor (Contributor) commented Dec 4, 2014

Comment 15:

OK, if that is how it works, then why can't we use the same heuristics to fall
back if the mutex unlock happens in a different goroutine? By definition we
must be on the same core/thread, or we would have already done a fallback when
we entered the scheduler.
(To be clear, I'm not saying you're wrong, I'm just saying that I don't
understand.)

@dvyukov (Member, Author) commented Dec 4, 2014

Comment 16:

Yes, we can fall back to normal locking with the current mutex semantics. It
will work.
But good heuristics are critical for efficient HLE. With the proposed new
semantics, we permanently fall back to locking only if we encounter something
we can't handle within a transaction (e.g. a syscall). But with the current
semantics, an unlock in a different thread will look like normal contention (it
will be detected earlier, when the goroutine passes responsibility for
unlocking the mutex to a different goroutine), so the condition for permanent
fallback to locking becomes moot; and if we don't fall back (i.e. always try to
execute transactionally several times first), we will waste lots of work.
Static analysis doesn't have dynamic information, so this won't work for static
analysis. Dynamic analysis may or may not work; I don't know yet how to
implement it.

@dvyukov dvyukov added new Go2 labels Dec 4, 2014

@bradfitz bradfitz removed the new label Dec 18, 2014

@rsc rsc added this to the Unplanned milestone Apr 10, 2015

@rsc rsc removed release-none labels Apr 10, 2015

@rsc rsc changed the title sync: prohibit unlocking mutex in a different goroutine proposal: sync: prohibit unlocking mutex in a different goroutine Jun 17, 2017

@ghasemloo commented Jul 12, 2017

@dvyukov, we had a case for passing locked objects around in C++: we didn't want the object to be unlocked while it was passed around. So there are cases where we might want to lock in one goroutine and unlock in another, I think. Feel free to email me for details.

@dvyukov (Member, Author) commented Jul 12, 2017

@ghasemloo I believe there are cases like this, but that does not mean you have to use Mutex for them; a chan will do just fine.
Note: it is absolutely illegal to lock/unlock almost all C++ mutexes in different threads (e.g. pthread_mutex_lock, EnterCriticalSection, std::mutex, etc.). And if you build your own semaphore and call it a mutex, you can do the same in Go as well.

@pciet (Contributor) commented Dec 20, 2017

Here's a solution to a deadlock case I had recently that requires unlocking in a different goroutine:

	// if the lock can't be acquired we have to try a read
	// here because the notifier could be holding it waiting to send
	//
	// this won't work serially: lock blocks us from 
	// trying to read and trying to read first gives 
	// the notifier a chance to lock before we do
	acq := make(chan struct{})
	go func() {
		gameMonitorsLock.Lock()
		acq <- struct{}{}
	}()
OUTER2:
	for {
		select {
		case <-channels.move:
		case <-channels.done:
		case <-acq:
			break OUTER2
		}
	}
	delete(gameMonitors, gameid)
	gameMonitorsLock.Unlock()

One of the senders called in an independent HTTP POST handling function:

	go func() {
		gameMonitorsLock.RLock()
		c, has := gameMonitors[g.ID]
		if has {
			c.move <- time.Now()
		}
		gameMonitorsLock.RUnlock()
	}()

Normally a send on c.move causes some reloads and checks of a changed database row, but in this case the race is when a move happens while timeout detection (triggered by a timer instead of an HTTP request) is tearing down the goroutine.

I don't know the quality of my design, but here's the one case I have where lock/unlock has to happen in separate goroutines.

@dvyukov (Member, Author) commented Dec 21, 2017

@pciet no, it doesn't require unlocking a Mutex in a different goroutine. See the previous comment:
#9201 (comment)

@ianlancetaylor (Contributor) commented Jan 3, 2018

@griesemer suggests that perhaps instead of changing the existing sync.Mutex type, we should introduce a different kind of construct. For example, we could add a new method to sync.Mutex, Critical(section func()), which evaluates the function with the mutex held. That would provide the guarantee you are looking for, and in some cases might be easier and safer for people to use than separate Lock and Unlock calls.

@dvyukov (Member, Author) commented Jan 4, 2018

Unless switching to Go 2 includes rewriting all Go code from scratch (as far as I understand, it doesn't), this won't give any significant benefit. If Lock/Unlock stay and keep the current semantics, we may as well not do this at all. The point is that 99% of Mutex uses already comply.

@dvyukov (Member, Author) commented Jan 4, 2018

Another potential optimization that becomes possible is priority boosting inside critical sections. Namely, if a goroutine is preempted inside a critical section, the scheduler may give it another tiny time slot to finish the critical section (or perhaps it can then voluntarily switch in Unlock). Solaris did this for a long time.

@ianlancetaylor (Contributor) commented Jan 4, 2018

If people convert over time to a Critical method, then it seems to me that all of your suggested improvements apply to uses of that method.

On the other hand, if we change the meaning of Lock and Unlock, then we break the 1% of programs that do not follow the new guidelines. That's only going to be OK if we can reliably statically detect the programs that will break, which as far as I can see we can't. So I don't see how we can implement this proposal as written. We can introduce a Critical method. We can introduce a new kind of Mutex that only permits locking and unlocking in the same goroutine. But I don't think we can change sync.Mutex. Or, to put it a different way, the cost of changing sync.Mutex is high; we need a correspondingly high benefit, and I don't see it here.

@nathanjsweet commented Oct 8, 2018

You could create your own trylock pretty simply:
https://play.golang.org/p/DcITzWTNlrD
