Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Avoid deadlock in button renderer by aquiring read lock twice. #5115

Merged
merged 1 commit into from
Sep 5, 2024

Conversation

williambrode
Copy link

@williambrode williambrode commented Sep 5, 2024

Description:

Fixes #5114

Checklist:

  • Tests included.
  • Lint and formatter run with no errors.
  • Tests all pass.

Where applicable:

  • Public APIs match existing style and have Since: line.
  • Any breaking changes have a deprecation path or have been discussed.
  • Check for binary size increases when importing new modules.

@coveralls
Copy link

Coverage Status

coverage: 66.055% (+0.002%) from 66.053%
when pulling 110aba7 on williambrode:issue-5114
into 5fb3d75 on fyne-io:develop.

@dweymouth
Copy link
Contributor

Acquiring a read lock twice should not cause a deadlock - RWMutex allows for multiple read locks at the same time. Are you sure this PR is actually solving a deadlock or have you just not seen it because of timing luck when you try to reproduce it again?

@williambrode
Copy link
Author

Yes, it will cause deadlock - it's a bit hard for me to explain but this answer seems to do a better job: https://stackoverflow.com/a/30549188/1139197

The TLDR is you can never acquire the same read lock twice in the same goroutine.

@dweymouth dweymouth changed the base branch from develop to release/v2.5.x September 5, 2024 15:57
@dweymouth dweymouth changed the base branch from release/v2.5.x to develop September 5, 2024 15:57
@dweymouth dweymouth merged commit 367ea0a into fyne-io:develop Sep 5, 2024
12 checks passed
@beeblebrox
Copy link

@dweymouth these can be tricky to spot. Do you think there might be any other double read locks hidden in the engine? Coincidently I think I hit this deadlock using the demo app yesterday and was coming to report it and why I felt obliged to comment :) However I lost the stack trace and not 100% sure it was the same deadlock.

But here is some clarity if anyone else stumbles upon this and asks why you can't take RLock() twice in the same goroutine without a potential deadlock situation.

Taking the read lock twice (RLock) in the same routine won't cause issues unless there is another go routine that calls Lock on the same mutex between the multiple RLock's. Go uses a writer prefer RW lock strategy and in addition will cause Lock() to force all other future RLocks to block until the Lock() obtains the lock and releases it.

The problem here is with this timeline:

go1 RLock()'A - takes the lock
go2 Lock() - blocks waiting for go1 RLock()'A release.
go1 RLock()'B - blocks until go2 Lock() is released.

Deadlock because:

  1. go1 RLock()'A will never release waiting for go1 RLock()'B to obtain the lock and finish.
  2. go1 RLock()'B always blocked waiting for go2 Lock()/
  3. go2 Lock() will always be blocked waiting for RLock()'A

When go2 Lock occurs anywhere else go won't deadlock so this kind of issue can hide for quite awhile. I suspect that this is only seen in Fyne when someone has implemented a GUI doing some dynamic updates to the button icon's asynchronously with some other mechanisms force a re-render.

Most UI systems get around this issue (and locks in general) requiring all UI property state changes to happen on a single thread, or go routine in the case of go. The databinding is a nice new feature which I think helps prevents these bugs due to the channels and maybe that can be extended to more properties?

One last thing, I thought maybe go run -race might catch this RLock() called twice without the need of a Lock() running at the same time but I just checked and it doesn't seem to. Funny enough though there is another race condition I detected though just when updating a button with an icon, but I'll save that for a different issue and I am unsure if it has any real impact in that case.

@williambrode
Copy link
Author

@beeblebrox I opened this one last year: #3886 and doing a quick review of the code I see its still an issue.

@beeblebrox
Copy link

beeblebrox commented Sep 6, 2024 via email

@williambrode
Copy link
Author

@beeblebrox as I understand it the fyne team is working on a single-threaded approach which likely will resolve most race conditions. I wasn't able to wait on this one because it was actively affecting my customers.

@andydotxyz
Copy link
Member

That's right. More info to come, will be discussed in Fyne Conf 2024 at the latest :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants