When the context is cancelled the node is marked dead #484
Thanks for the helpful issue report. I will look into it asap. |
@AndreKR Can you review the above code, please? I think it's the correct fix, but maybe I've been missing something.
|
The second result in your example should also fail because you reused the canceled/deadlined context from request 1. If you specify a new context with a timeout of 4s, the output would be:
|
Yep, works like a charm. (With the fixed test code and in my real application.) From looking at the code it seems that if an error were to happen during a retry it would still mark the node dead, but that might actually be a sensible thing to do. |
Will be fixed in 3.0.67 and 5.0.31. |
This issue has returned. I'm not sure why it worked back then, but maybe we were both using Go 1.7 at that point? Anyway, the problem is that Fortunately the
res, err := c.c.Do((*http.Request)(req).WithContext(ctx))
if uerr, ok := err.(*url.Error); ok {
	if uerr.Err == context.Canceled || uerr.Err == context.DeadlineExceeded {
		// Proceed, but don't mark the node as dead
		return nil, err
	}
}
|
The problem seems to occur, e.g., when a redirect fails. I will add another check for that case. I was already using Go 1.8 back then. |
In certain cases, the error returned for a canceled or deadlined request is not `context.Canceled` or `context.DeadlineExceeded` itself, but a `*url.Error` whose `Err` field carries one of those context errors. In the standard library, there is a specific test for redirects that checks this case (https://golang.org/src/net/http/client_test.go#L329), so we fix this in the same way in `PerformRequest`. See #484
The origin of this seems to be 5 years ago :-) |
Just pushed 5.0.44. |
Fantastic. :) For the record, this is not related to redirects. Try this:
client := http.Client{}
ctx, _ := context.WithTimeout(context.Background(), 1*time.Second) // requests will time out after 1 second
req, _ := http.NewRequest("GET", "https://httpbin.org/delay/3", nil) // every request will take about 3 seconds
_, err := client.Do(req.WithContext(ctx))
fmt.Println(err)
fmt.Println(reflect.TypeOf(err).String())
fmt.Println(err == context.Canceled)
fmt.Println(err == context.DeadlineExceeded)
Output:
|
@olivere it seems that we are facing the same problem on elastic.v6.
Version
How to reproduce:
package main

import (
	"context"
	"log"
	"os"
	"time"

	"gopkg.in/olivere/elastic.v6"
)

func main() {
	var err error
	client, err := elastic.NewClient(
		elastic.SetURL(
			"http://localhost:9200",
		),
		elastic.SetSniff(false),
		elastic.SetErrorLog(log.New(os.Stderr, "", log.LstdFlags)),
		elastic.SetInfoLog(log.New(os.Stdout, "", log.LstdFlags)),
	)
	if err != nil {
		log.Fatal(err)
	}
	for i := 0; i < 50; i++ {
		func(i int) {
			log.Println("Running request ", i)
			ctx, cancelFunc := context.WithTimeout(context.Background(), 1*time.Millisecond)
			defer cancelFunc()
			_, err := client.Get().Index("index-name").Type("_doc").Id("35642796").Do(ctx)
			if err != nil {
				log.Println("Err: " + err.Error())
			}
			log.Println("Finished request ", i)
			log.Print("\n\n\n")
		}(i)
	}
}
Output log
|
I'm on vacation now, but re-opening to look into it when I'm back. |
Thank you, disabling HealthCheck seems like the better solution for now. The code below will not produce this error:
client, err := elastic.NewClient(
	elastic.SetURL(
		"http://localhost:9200",
	),
	elastic.SetSniff(false),
	elastic.SetHealthcheck(false),
	elastic.SetErrorLog(log.New(os.Stderr, "", log.LstdFlags)),
	elastic.SetInfoLog(log.New(os.Stdout, "", log.LstdFlags)),
)
|
@wedneyyuri So the problem seems to be that the healthcheck context runs into a timeout, hence the context is canceled. If the healthcheck runs into a timeout, what is it going to do? I think it's correct to mark the connection as dead if the healthcheck doesn't return in time. Am I missing something? Notice that even if all nodes are dead, |
@olivere Seems like I guess it's causing safe requests to be cancelled before they hit the server. |
When the retrier fails due to a context timeout, don't mark the node as dead. This is a possible fix to #484.
@wedneyyuri Can you try the context-canceled.issue-484 branch to see if that fixes the problem? |
Will merge this in the next release then. Thanks for your support. |
This commit ignores *url.Error errors that are marked as temporary. It also removes the context timeout checks for retrier added earlier because that cannot happen in that code path. Fix #484
I still get the following error: no available connection: no Elasticsearch node available. I am running my code in a multithreaded environment (10 goroutines). Each goroutine has its own client instance. In each goroutine, I get and insert documents using context.Background(). If the code is fixed, I am not sure where I am at fault. I have disabled sniffing and healthcheck (elastic.SetSniff(false), elastic.SetHealthcheck(false)). Any insights would help. |
If you get |
Get http://127.0.0.1:9200/instruments/doc/IDBBGLOBAL138357_IDBBUNIQUE138357: dial tcp 127.0.0.1:9200:connect: can't assign requested address. This is the message I get before the "no available connection" error. I also changed my code so that I use a single client for all my goroutines instead of an individual client per goroutine. I don't get the error message all the time, but I do get it off and on. |
@arunplayground When seeing |
You are absolutely right. There were rogue connections alive to Redis. I implemented proper pooling of connections to Redis, and the client connections to Elasticsearch stopped complaining about the dial tcp errors. I have to look into how to increase the number of sockets available to a user on macOS. |
I am still getting the same problem/behaviour. I can only make it work by disabling the health check. I have imported the library using
[[projects]]
  digest = "1:995fe8c9729e94587361e836896824337bbceab8030f698a40552b1a63bf2c59"
  name = "github.com/olivere/elastic"
  packages = [
    ".",
    "config",
    "uritemplates",
  ]
  pruneopts = "UT"
  revision = "1619150b007041b6dba8aa447f0e2d151cc2b4c5"
  version = "v6.2.14"
My golang version is
Any advice, @olivere?
|
Version
elastic.v5 (for Elasticsearch 5.x)
How to reproduce:
Actual
Expected
Something like (I edited that "log" myself):