Context instead of timeout #7
Comments
"Under the hood" the timeout is using context but I do agree this is a better solution. Thank you for your feedback! |
@gammazero Again, I cannot thank you enough for your feedback! I have been toying with this today and I think I understand what you are getting at now. I should have some updated code later today - whenever you are free, I'd love another review :) |
@gammazero I was finally able to get this knocked out. While I still need to clean up the code, am I using context correctly or could it still use some work? Thanks again! |
I went ahead and cleaned up the code, please, in your own time, open a new issue if you find something that looks fishy or could be improved. This was an excellent idea and I can clearly see how it improves things a ton. Cannot thank you enough. |
The implementation is not quite what I think is needed. You need a separate context per request (per job) so that individual requests can time out. Given that, there should probably not be a "default timeout" at all, whether in the form of a duration or a context. IMO, the job timeout should be determined by a context passed into the call that submits the job:

func (wp *WorkerPoolXT) SubmitXT(ctx context.Context, job *Job) {
	wp.Submit(wp.wrap(ctx, job))
}

Then your wrap function becomes:

func (wp *WorkerPoolXT) wrap(ctx context.Context, job *Job) func() {
	// This is the func we ultimately pass to workerpool
	return func() {
		// Allow job options to override default pool options
		if job.Options == nil {
			job.Options = wp.options
		}
		job.result = make(chan Result)
		job.startedAt = time.Now()
		go job.run()
		// Wait for the job to complete or for its context to time out or be canceled
		select {
		case r := <-job.result:
			wp.result <- r
		case <-ctx.Done():
			// Report the error from the context that was passed in for this job
			wp.result <- job.errResult(ctx.Err())
		}
	}
}

There are a few things to note above:
|
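To make the per-job context idea concrete, here is a minimal, self-contained sketch. It does not use workerpoolxt itself: `Job`, `Result`, `wrap`, and `runDemo` are hypothetical stand-ins written only for this illustration, and the jobs are started with plain goroutines where real code would submit them to the pool.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// Result and Job are hypothetical stand-ins, not the workerpoolxt types.
type Result struct {
	Name string
	Err  error
}

type Job struct {
	Name string
	Task func() error
}

// wrap returns the func that would be handed to the pool. Each job gets its
// own ctx, so one job timing out does not affect any other job.
func wrap(ctx context.Context, job *Job, results chan<- Result) func() {
	return func() {
		// Buffered so the task goroutine can always send, even after a timeout.
		done := make(chan error, 1)
		go func() { done <- job.Task() }()

		select {
		case err := <-done:
			results <- Result{Name: job.Name, Err: err}
		case <-ctx.Done():
			// The task itself may still be running in the background here.
			results <- Result{Name: job.Name, Err: ctx.Err()}
		}
	}
}

// runDemo runs a fast job and a slow job, each with its own deadline.
func runDemo() (fastErr, slowErr error) {
	results := make(chan Result, 2)

	fastCtx, cancelFast := context.WithTimeout(context.Background(), time.Second)
	defer cancelFast()
	slowCtx, cancelSlow := context.WithTimeout(context.Background(), 20*time.Millisecond)
	defer cancelSlow()

	fast := &Job{Name: "fast", Task: func() error { return nil }}
	slow := &Job{Name: "slow", Task: func() error {
		time.Sleep(500 * time.Millisecond)
		return nil
	}}

	// Assumption: real code would submit these to the pool instead.
	go wrap(fastCtx, fast, results)()
	go wrap(slowCtx, slow, results)()

	for i := 0; i < 2; i++ {
		r := <-results
		if r.Name == "fast" {
			fastErr = r.Err
		} else {
			slowErr = r.Err
		}
	}
	return fastErr, slowErr
}

func main() {
	fastErr, slowErr := runDemo()
	fmt.Println("fast job error:", fastErr)
	fmt.Println("slow job timed out:", errors.Is(slowErr, context.DeadlineExceeded))
}
```

Note that the slow task keeps sleeping after its timeout is reported, because it never looks at the context. The sketch only stops waiting for it.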
@gammazero I seriously cannot thank you enough. Your response and explanation mean more to me than you could imagine! This makes so much sense to me now. One day I will be like you lol. Sad to admit, but this has been kicking my a** for like 10 hours... Seriously. Thank you. You are a life saver. |
FYI - this fix demonstrated in playground |
@gammazero - I think the tricky part here is how to combine the ctx timeout with retry. From my POV the timeout should always win, and if there is a timeout the job should be killed to free resources, no matter how many retries are left in the queue. WDYT? |
@BredSt if we solve the issue with "lingering jobs" that also solves it for retry. As of now, the timeout does win out over retry. There are tests for this as well. Just that jobs will keep running behind the scenes, while appearing to cancel/timeout. |
@BredSt see these links: |
@oze4 @BredSt Comments here may be helpful: gammazero/workerpool#40 (comment) For handling the retries, the problem is the same whether workerpoolxt is involved or not. Consider that you are given a function by some outside caller. You have no idea what the function does, or if it will ever return. All you can do is to call it in a separate goroutine and wait for that goroutine to complete or give up waiting. The retry can happen when the task function completes and returns an error, but while the task function is running you are waiting on that completion or for timeout. If a timeout happens then there will be no retry when (if ever) the task completes. Even if an error is reported at this point, it is best to keep waiting for the task goroutine so that the pooled goroutine is still occupied until the task finishes. Note that tracking task information (execution time, number of retries, etc.) is a separate concern from limiting concurrency. In other words, most of the things that workerpoolxt does could also be done if run in any goroutine. As a design idea, consider creating a |
Instead of handling the timeout yourself, by waiting for |
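The retry reasoning above can be sketched roughly as follows. This is only an illustration under assumptions: `runWithRetry`, `demoFlaky`, and `demoTimeout` are hypothetical names, not part of workerpool or workerpoolxt, and the task is assumed to accept the context so that it can stop itself once the deadline passes. A retry happens only when the task returns an error, and a timeout ends the retries.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// runWithRetry is a hypothetical helper. It runs task up to retries+1 times
// and returns the number of attempts made and the final error.
func runWithRetry(ctx context.Context, retries int, task func(context.Context) error) (int, error) {
	var err error
	attempts := 0
	for attempts <= retries {
		attempts++
		done := make(chan error, 1)
		go func() { done <- task(ctx) }()

		select {
		case err = <-done:
			if err == nil {
				return attempts, nil
			}
			// The timeout wins even if the task failed at the same moment.
			if ctx.Err() != nil {
				return attempts, ctx.Err()
			}
			// Otherwise loop again and retry, if attempts remain.
		case <-ctx.Done():
			// Keep waiting for the task goroutine so this worker stays
			// occupied until the task has really finished.
			<-done
			return attempts, ctx.Err()
		}
	}
	return attempts, err
}

// demoFlaky: the task fails twice, then succeeds well inside the deadline.
func demoFlaky() (int, error) {
	ctx, cancel := context.WithTimeout(context.Background(), time.Second)
	defer cancel()
	calls := 0
	return runWithRetry(ctx, 3, func(context.Context) error {
		calls++
		if calls < 3 {
			return errors.New("transient failure")
		}
		return nil
	})
}

// demoTimeout: the task is slower than the deadline but honors ctx.
func demoTimeout() (int, error) {
	ctx, cancel := context.WithTimeout(context.Background(), 20*time.Millisecond)
	defer cancel()
	return runWithRetry(ctx, 5, func(ctx context.Context) error {
		select {
		case <-time.After(500 * time.Millisecond):
			return errors.New("too slow")
		case <-ctx.Done():
			return ctx.Err()
		}
	})
}

func main() {
	fmt.Println(demoFlaky())
	fmt.Println(demoTimeout())
}
```

If the task ignored the context, the `<-done` in the timeout branch would block until the task returned on its own, which is the trade-off described in the comment above: the result is reported late, but the pool never runs more tasks than it has workers.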
Excellent idea. I really like this. I had that thought yesterday - instead of having the caller return something from a Task, pass a "done" func to each Task which the caller can use to aggregate results. Like I also plan on testing all of the solutions you've provided. I also like your idea of returning a chan a lot. Cannot thank you enough. |
I just had an idea... What if instead of (or in addition to) passing in the context, the caller passes in the cancel func? Wouldn't this offer us more control over things? |
@oze4 Do not pass in the cancel function. The task has no idea when to call cancel. Only the caller (or wherever ctx was created) knows when to cancel. |
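A small sketch of that ownership rule, assuming a hypothetical `task` and `callerDecides` (neither is from workerpoolxt): the caller creates the context, keeps the cancel func to itself, and the task only ever receives `ctx`.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// task is a hypothetical unit of work. It receives only the context, so it
// can observe cancellation but cannot trigger it.
func task(ctx context.Context) error {
	select {
	case <-time.After(500 * time.Millisecond):
		return nil
	case <-ctx.Done():
		return ctx.Err()
	}
}

// callerDecides creates the context and is the only place cancel is called.
func callerDecides() error {
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel() // always release the context's resources

	errc := make(chan error, 1)
	go func() { errc <- task(ctx) }()

	// Stand-in for the caller learning the result is no longer needed.
	time.Sleep(10 * time.Millisecond)
	cancel()

	return <-errc
}

func main() {
	fmt.Println("task ended with:", callerDecides())
}
```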
When creating a new WorkerPoolXT, let the caller pass in a Context instead of specifying a timeout. The direct caller may need to use a context because:
Consider the case where a server is processing a client's request, and receives a context with the request. The server wants to use WorkerPoolXT to process the request, and needs to use the request's context to know if processing should be abandoned.
Also, this makes the user choose a timeout, or none at all, whichever is most appropriate. This means that the default timeout should also probably go away. A default timeout is just a guess at how long something should take, and WPXT has no way to know this. Often having no timeout is the correct thing, and is typically the expected default behavior.
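As a rough illustration of the server case, here is a sketch using only the standard library. `process`, `handler`, and `demoStatuses` are hypothetical names, and `process` merely stands in for work that would be handed to the pool. The handler passes the request's own context down, so there is no default timeout: the work is abandoned only if the request context ends.

```go
package main

import (
	"context"
	"fmt"
	"net/http"
	"net/http/httptest"
	"time"
)

// process is a hypothetical stand-in for work submitted to the pool. It
// stops as soon as the supplied context is done.
func process(ctx context.Context, work time.Duration) error {
	select {
	case <-time.After(work):
		return nil
	case <-ctx.Done():
		return ctx.Err()
	}
}

// handler uses the request's context; it adds no timeout of its own.
func handler(work time.Duration) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		if err := process(r.Context(), work); err != nil {
			http.Error(w, err.Error(), http.StatusServiceUnavailable)
			return
		}
		fmt.Fprintln(w, "done")
	}
}

// demoStatuses calls the handler twice: once with a request that has no
// deadline, and once with a request whose context expires mid-work.
func demoStatuses() (okCode, timedOutCode int) {
	rec := httptest.NewRecorder()
	req := httptest.NewRequest(http.MethodGet, "/", nil)
	handler(5*time.Millisecond)(rec, req)
	okCode = rec.Code

	ctx, cancel := context.WithTimeout(context.Background(), 20*time.Millisecond)
	defer cancel()
	rec = httptest.NewRecorder()
	req = httptest.NewRequest(http.MethodGet, "/", nil).WithContext(ctx)
	handler(500*time.Millisecond)(rec, req)
	timedOutCode = rec.Code

	return okCode, timedOutCode
}

func main() {
	ok, timedOut := demoStatuses()
	fmt.Println("no deadline:", ok)
	fmt.Println("expired request context:", timedOut)
}
```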