Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: Allow provisionerd to cleanup acquired job #159

Merged
merged 2 commits into from
Feb 4, 2022
Merged

Conversation

kylecarbs
Copy link
Member

If a job is acquired from the database, then provisionerd was
killed, the job would be left in an idle state where it was
technically in-progress.

@kylecarbs kylecarbs self-assigned this Feb 4, 2022
Comment on lines -187 to -189
if p.isClosed() {
return
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So in this case, if provisionerd is killed - we let the job run if it was just acquired?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like it could make it to runJob at the bottom - would we want it to cancelActiveJob in this case?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I changed it so cancel or acquire can run at once, so it would properly cancel now!

@codecov
Copy link

codecov bot commented Feb 4, 2022

Codecov Report

Merging #159 (10e8ae2) into main (94f71fe) will decrease coverage by 0.33%.
The diff coverage is 52.17%.

Impacted file tree graph

@@            Coverage Diff             @@
##             main     #159      +/-   ##
==========================================
- Coverage   67.47%   67.14%   -0.34%     
==========================================
  Files         103      103              
  Lines        5132     5120      -12     
  Branches       68       68              
==========================================
- Hits         3463     3438      -25     
- Misses       1359     1371      +12     
- Partials      310      311       +1     
Flag Coverage Δ
unittest-go-macos-latest 64.63% <52.17%> (+0.13%) ⬆️
unittest-go-ubuntu-latest 66.20% <43.47%> (-0.18%) ⬇️
unittest-go-windows-latest 63.30% <45.65%> (-0.52%) ⬇️
unittest-js 64.21% <ø> (ø)
Impacted Files Coverage Δ
provisionerd/provisionerd.go 71.25% <52.17%> (-1.09%) ⬇️
coderd/provisionerdaemons.go 44.06% <0.00%> (-4.23%) ⬇️
peer/conn.go 81.28% <0.00%> (+2.30%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 94f71fe...10e8ae2. Read the comment docs.

@kylecarbs kylecarbs force-pushed the provisionerdebug branch 2 times, most recently from 97f31fa to 942795f Compare February 4, 2022 02:03
@@ -67,26 +66,28 @@ func New(clientDialer Dialer, opts *Options) io.Closer {
type provisionerDaemon struct {
opts *Options

mutex sync.Mutex
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool, I like that you consolidated the closeMutex and jobMutex into a single one - seems simpler!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd suggest adding a comment to state when this lock needs to be acquired, and what it's protecting against, so that people know when to acquire/release this lock

@kylecarbs kylecarbs force-pushed the provisionerdebug branch 2 times, most recently from 7b5ca01 to 9985ff3 Compare February 4, 2022 02:24
close(p.jobRunning)
}()
// It's safe to cast this ProvisionerType. This data is coming directly from coderd.
provisioner, hasProvisioner := p.opts.Provisioners[job.Provisioner]
if !hasProvisioner {
go p.cancelActiveJob(fmt.Sprintf("provisioner %q not registered", job.Provisioner))
go p.lockCancelActiveJob(fmt.Sprintf("provisioner %q not registered", job.Provisioner))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

given all the call sites that immediately invoke fmt.Sprintf, what do you think about adding a variant (e.g. p.lockCancelActiveJobf(format string, args ...interface{})) that can do this?

@kylecarbs kylecarbs force-pushed the provisionerdebug branch 6 times, most recently from 1792a01 to c548133 Compare February 4, 2022 16:03
If a job is acquired from the database, then provisionerd was
killed, the job would be left in an idle state where it was
technically in-progress.
@kylecarbs kylecarbs merged commit 2eab1b5 into main Feb 4, 2022
@kylecarbs kylecarbs deleted the provisionerdebug branch February 4, 2022 16:49
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants