Ensure that status lockfile is closed before trying to release work #320

shanemcd · 2021-05-13T15:40:58Z

For context, see:

- ansible#319
- ansible/awx#9961

chrismeyersfsu · 2021-05-13T17:16:32Z

I can't put my finger on it, but this feels like the wrong way to solve this.

ghjm · 2021-05-13T18:17:11Z

pkg/workceptor/workunitbase.go

@@ -366,8 +368,14 @@ func (bwu *BaseWorkUnit) UnredactedStatus() *StatusFileData {
 func (bwu *BaseWorkUnit) Release(force bool) error {
 	bwu.statusLock.Lock()
 	defer bwu.statusLock.Unlock()
+	if bwu.status.LockFile != nil {
+		// There seems to be a race condition with the `defer`s that


If we're right here, having already passed the != nil check, and some other goroutine or thread gains control and sets bwu.status.LockFile to nil, then the following call to Close() will panic because it is trying to call a function on a nil pointer.
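A minimal sketch of one way to avoid that check-then-use gap: do the nil check, the `Close()`, and the reset to nil while holding a single mutex. The `unit` type, its `lockFile` field, and `closeLockFile` are hypothetical stand-ins for illustration, not Receptor's actual code.

```go
package main

import (
	"fmt"
	"os"
	"sync"
)

// unit is a hypothetical stand-in for BaseWorkUnit with only the
// fields this sketch needs.
type unit struct {
	statusLock sync.Mutex
	lockFile   *os.File
}

// closeLockFile checks, closes, and clears the handle under one mutex,
// so no other goroutine can nil the field between the check and Close.
func (u *unit) closeLockFile() error {
	u.statusLock.Lock()
	defer u.statusLock.Unlock()
	if u.lockFile == nil {
		return nil // already closed, nothing to do
	}
	err := u.lockFile.Close()
	u.lockFile = nil
	return err
}

func main() {
	f, err := os.CreateTemp("", "status-*.lock")
	if err != nil {
		panic(err)
	}
	defer os.Remove(f.Name())

	u := &unit{lockFile: f}

	// Several goroutines race to close; only the first one does any work.
	var wg sync.WaitGroup
	for i := 0; i < 8; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			if err := u.closeLockFile(); err != nil {
				fmt.Println("close error:", err)
			}
		}()
	}
	wg.Wait()
	fmt.Println("lock file cleared:", u.lockFile == nil)
}
```

This only holds if every other place that assigns the field takes the same mutex; a writer that skips the lock reintroduces the race.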

fosterseth · 2021-05-13

Eek, good point. My first instinct is that we ought to make more use of the statusLock mutex. Some methods read and write the status file without taking that lock. For example, monitorLocalStatus calls Load, which grabs the status file lock without first taking the mutex.

ghjm · 2021-05-13

I used way too many mutex locks in Receptor. A better and more Go-idiomatic way to do this would be to have a single goroutine that does all the status file changes, with channels for other processes to send updates or receive current data.
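A rough sketch of that single-owner pattern, assuming a trimmed-down status struct. The names `status`, `statusOwner`, `Update`, `Get`, and `Stop` are made up for illustration and do not exist in Receptor.

```go
package main

import "fmt"

// status is a hypothetical, trimmed-down stand-in for StatusFileData.
type status struct {
	State  int
	Detail string
}

// statusOwner exposes the status only through channels.
type statusOwner struct {
	updates chan func(*status)
	reads   chan chan status
	done    chan struct{}
}

func newStatusOwner() *statusOwner {
	o := &statusOwner{
		updates: make(chan func(*status)),
		reads:   make(chan chan status),
		done:    make(chan struct{}),
	}
	go o.run()
	return o
}

// run is the only goroutine that ever touches cur, so no mutex is needed.
func (o *statusOwner) run() {
	var cur status
	for {
		select {
		case fn := <-o.updates:
			fn(&cur) // a real version would also write the status file here
		case reply := <-o.reads:
			reply <- cur // hand back a copy
		case <-o.done:
			return
		}
	}
}

// Update sends a mutation to the owner goroutine.
func (o *statusOwner) Update(fn func(*status)) { o.updates <- fn }

// Get asks the owner goroutine for a copy of the current status.
func (o *statusOwner) Get() status {
	reply := make(chan status, 1)
	o.reads <- reply
	return <-reply
}

// Stop ends the owner goroutine; Update and Get must not be called afterwards.
func (o *statusOwner) Stop() { close(o.done) }

func main() {
	o := newStatusOwner()
	defer o.Stop()

	o.Update(func(s *status) {
		s.State = 1
		s.Detail = "running"
	})
	fmt.Printf("%+v\n", o.Get())
}
```

Because the owner handles one message at a time, an `Update` that has returned is always applied before a later `Get` is answered.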

I remember there was a lot of subtlety to this, that I spent a lot of time on it, and that the absence of mutex locks in the status file updates is not just an accident; I had something in mind. But looking at it now, it seems pretty reasonable that you should hold a read lock on BaseWorkUnit.statusLock while performing a file update, or a full lock if you're reading back from the file to memory (i.e., changing the in-memory data).
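A small sketch of that locking split, assuming statusLock is a `sync.RWMutex` and the status is stored as JSON. The `unit` type, `save`, `load`, and the file layout are illustrative assumptions, not Receptor's implementation.

```go
package main

import (
	"encoding/json"
	"fmt"
	"os"
	"path/filepath"
	"sync"
)

// status is a hypothetical stand-in for the in-memory status data.
type status struct {
	State  int    `json:"state"`
	Detail string `json:"detail"`
}

// unit is a hypothetical stand-in for BaseWorkUnit.
type unit struct {
	statusLock sync.RWMutex
	status     status
	path       string
}

// save copies memory to the file. It only reads u.status, so a read
// lock is enough to keep the in-memory data stable while marshalling.
func (u *unit) save() error {
	u.statusLock.RLock()
	defer u.statusLock.RUnlock()
	data, err := json.Marshal(u.status)
	if err != nil {
		return err
	}
	return os.WriteFile(u.path, data, 0o600)
}

// load copies the file back into memory. It changes u.status, so it
// takes the full (write) lock.
func (u *unit) load() error {
	u.statusLock.Lock()
	defer u.statusLock.Unlock()
	data, err := os.ReadFile(u.path)
	if err != nil {
		return err
	}
	return json.Unmarshal(data, &u.status)
}

func main() {
	dir, err := os.MkdirTemp("", "unit")
	if err != nil {
		panic(err)
	}
	defer os.RemoveAll(dir)

	u := &unit{
		path:   filepath.Join(dir, "status"),
		status: status{State: 1, Detail: "running"},
	}
	if err := u.save(); err != nil {
		panic(err)
	}
	if err := u.load(); err != nil {
		panic(err)
	}
	fmt.Printf("%+v\n", u.status)
}
```

A read lock does not stop two goroutines from writing the file at the same time; this sketch leaves that to a separate mechanism such as the status lock file mentioned in the thread.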

shanemcd · 2021-05-15T23:06:41Z

Replaced by #321

shanemcd force-pushed the really-close-lockfile branch from 513920a to 71fab47 on May 13, 2021 15:43

Ensure that status lockfile is closed before trying to release work

a3134f8

For context, see: - ansible#319 - ansible/awx#9961

shanemcd force-pushed the really-close-lockfile branch from 71fab47 to a3134f8 on May 13, 2021 17:23

ghjm reviewed May 13, 2021

shanemcd closed this May 15, 2021

shanemcd commented May 13, 2021

chrismeyersfsu commented May 13, 2021

ghjm May 13, 2021

fosterseth May 13, 2021

ghjm May 13, 2021

ghjm May 13, 2021

shanemcd commented May 15, 2021
