Skip to content

Fix WaitForLock() and config.Set() lock#1370

Merged
mdrohmann merged 25 commits intomasterfrom
martind/lock-read-and-save-together-177913373
May 19, 2021
Merged

Fix WaitForLock() and config.Set() lock#1370
mdrohmann merged 25 commits intomasterfrom
martind/lock-read-and-save-together-177913373

Conversation

@mdrohmann
Copy link
Copy Markdown
Contributor

@mdrohmann mdrohmann commented May 18, 2021

@mdrohmann mdrohmann requested a review from MDrakos May 18, 2021 18:32
Copy link
Copy Markdown
Member

@MDrakos MDrakos left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me just some minor cleanup

Comment thread internal/osutils/lockfile/pidlock.go
Comment thread internal/config/instance.go Outdated
@mdrohmann mdrohmann requested a review from MDrakos May 18, 2021 19:17
MDrakos
MDrakos previously approved these changes May 18, 2021
Comment thread changelog.md Outdated
Naatan
Naatan previously approved these changes May 18, 2021
@mdrohmann mdrohmann force-pushed the martind/lock-read-and-save-together-177913373 branch from 74890f2 to e2aa435 Compare May 18, 2021 23:20
@mdrohmann mdrohmann force-pushed the martind/lock-read-and-save-together-177913373 branch from e8838b3 to 412038e Compare May 19, 2021 20:10
@mdrohmann
Copy link
Copy Markdown
Contributor Author

mdrohmann commented May 19, 2021

Okay, this works now, I had to re-iterate because:

  • there was another bug in the pidlock implementation that only happened on Windows:
    When a process successfully locks the lockfile, but the file contents have a PID that is still running, TryLock returned false, but the file was locked (and never released).
  • Next problem: I realized that I wanted to just do file locking without the PID check. I remember that when I initially wrote the pidlock.go code, I added the PID check in order to ensure that we can recognize lock files that try to lock processes that are not running anymore. But that is already handled by flock or LockProcessEx. So, it is unnecessary, and unnecessary complexity leads to errors as we have seen, BUT:
    • in process.NewActivation we use the PID functionality of the pidlock.go implementation (but not the actually locking). That leads to follow-up story: https://www.pivotaltracker.com/story/show/178223007
    • the PID check actually did something: it ensured some INNER process synchronization. If TryLock was run in the same process, the flock.TryLock call always succeeds (that's only on Linux and Mac), because it allows several locks by the same process(!). But as soon as the first TryLock call writes its PID in there, the second one notices that there is already a running PID. => So, when we remove the PID check, our locking process needs a different kind of inner-process mutex to synchronize for these cases.

The suggested solution implemented in this PR is as follows:

  1. As the external package gofrs/flock already ensures inner process synchronization, I used that for the config synchronization
  2. I fixed the pidlock implementation as much as I could
  3. I filed a story to get rid of the pidlock implementation: https://www.pivotaltracker.com/story/show/178223007
  4. Because gofrs/flock provides shared and exclusive file locking, I changed the lock for Reload() to a shared lock ie., it can be run in parallel by several processes. It was easy and should speed things up, especially as we know have several processes contending for resources.

@mdrohmann mdrohmann requested review from MDrakos and Naatan May 19, 2021 22:02
Comment thread changelog.md Outdated
Co-authored-by: Nathan Rijksen <nathanr@activestate.com>
@mdrohmann mdrohmann merged commit fde7b19 into master May 19, 2021
@mdrohmann mdrohmann deleted the martind/lock-read-and-save-together-177913373 branch May 19, 2021 23:04
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants