
Remove recursive read locks #177

Merged: 6 commits into go-co-op:master on May 25, 2021

Conversation

migopsdinesh

What does this do?

Removes recursive read locks, which were causing the scheduler to get stuck.

List any changes that modify/break current functionality

No behavioral changes

Have you included tests for your changes?

No tests included; the existing tests are sufficient.

Did you document any new/modified functionality?

Notes

Thank you very much for this wonderful library; we are planning to use it in one of our tools.
While stress testing the library, we found that the scheduler engine gets stuck due to recursive read locks in the jobPresent() function in scheduler.go. jobPresent() acquires the read lock twice: once directly, and once again through the s.Jobs() function.

After going through the code, I do not believe jobPresent() needs the explicit read lock, and once the explicit read locks are removed (which is what this PR does), the scheduler no longer gets stuck.
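For context, the problematic pattern looks roughly like the sketch below (a simplified illustration rather than the exact scheduler.go code; the s.jobs field name is an assumption based on the diff):

func (s *Scheduler) Jobs() []*Job {
	s.jobsMutex.RLock()
	defer s.jobsMutex.RUnlock()
	return s.jobs
}

func (s *Scheduler) jobPresent(j *Job) bool {
	s.jobsMutex.RLock() // first read lock
	defer s.jobsMutex.RUnlock()
	for _, job := range s.Jobs() { // Jobs() acquires the read lock again on the same goroutine
		if job == j {
			return true
		}
	}
	return false
}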

To demonstrate this behavior, we reproduced it with the simple Go program below (Go 1.16).


package main

import (
	"fmt"
	"sync"
	"time"
)

// The zero value of sync.RWMutex is ready to use, so no init is needed.
var lock sync.RWMutex

func Reader1() {
	lock.RLock()
	defer lock.RUnlock()

	fmt.Println("reader1")
}

func Reader2() {
	lock.RLock() // outer read lock
	defer lock.RUnlock()
	Reader1() // acquires the read lock again while the outer one is still held
	fmt.Println("reader2")
}

func Writer1() {
	lock.Lock()
	defer lock.Unlock()
	fmt.Println("writer1")
}

func main() {

	// Continuously acquire recursive read locks in a loop.
	go func() {
		for {
			Reader2()
		}
	}()

	fmt.Println("acquire write lock after 2 seconds")
	time.Sleep(2 * time.Second)

	// Start a writer; once its Lock() call is pending, new RLock calls block,
	// so the nested RLock inside Reader2 deadlocks.
	go func() {
		for {
			Writer1()
		}
	}()

	time.Sleep(100 * time.Second)
}
$ go build . ; ./main
...
...
...
reader1
reader2
writer1 (Stuck here)

Here the program gets stuck. Now remove
lock.RLock()
defer lock.RUnlock()
from Reader2(), rebuild, and run again; the process no longer gets stuck.
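After that change, Reader2 simply becomes:

func Reader2() {
	Reader1()
	fmt.Println("reader2")
}

and the writer can make progress.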

This is exactly what happens in the scheduler under stress testing. The sync.RWMutex documentation explicitly prohibits recursive read locking: once a writer is blocked in Lock(), new readers also block, so a goroutine that already holds the read lock and tries to acquire it again deadlocks against the waiting writer.
Hence this PR removes the unnecessary, recursive read locks.

Once again, thank you so much for the wonderful library.

@migopsdinesh migopsdinesh changed the title Remove duplicate read Remove reursive read locks May 16, 2021
@migopsdinesh migopsdinesh changed the title Remove reursive read locks Remove recursive read locks May 16, 2021
@JohnRoesler (Contributor) left a comment:

Thanks for your kind words and the contribution!

@@ -576,8 +574,6 @@ func (s *Scheduler) TaskPresent(j interface{}) bool {
}

func (s *Scheduler) jobPresent(j *Job) bool {
-	s.jobsMutex.RLock()
@JohnRoesler (Contributor) commented on this line:
Actually, it looks like removing this lock breaks the Update() func. Since it's a private function, we should be able to control any reads to it, and if the race you're seeing is related, we can address it a different way.

@migopsdinesh (Author) commented:

@JohnRoesler, yes, jobPresent() is a private function, and read access to it is already controlled by the s.Jobs() function.
Another way to solve this would be to introduce a separate mutex, but I don't think that is required.

I am unable to locate where the Update() func relies on this lock; it already uses the s.Jobs() function to get a snapshot of the currently running jobs. Please correct me if I am wrong, and I'm happy to submit a new patch for this problem.

@JohnRoesler (Contributor) commented:
There is a trace in the test results when running with -race; you can see it in the Actions output. I'm traveling and on mobile so I can't dig in at the moment, but if you look there you'll see why the test is failing with your change.
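For anyone reproducing this locally, the race report referred to here can be generated by running the package tests with Go's race detector enabled, for example:

$ go test -race ./...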

@JohnRoesler JohnRoesler merged commit 092c5ec into go-co-op:master May 25, 2021