Description
Go version
go version go1.22.7 linux/amd64
Output of go env
in your module/workspace:
GO111MODULE=''
GOARCH='amd64'
GOBIN='/home/monitoring/go/bin'
GOCACHE='/home/monitoring/.cache/go-build'
GOENV='/home/monitoring/.config/go/env'
GOEXE=''
GOEXPERIMENT=''
GOFLAGS=''
GOHOSTARCH='amd64'
GOHOSTOS='linux'
GOINSECURE=''
GOMODCACHE='/home/monitoring/go/pkg/mod'
GONOPROXY=''
GONOSUMDB=''
GOOS='linux'
GOPATH='/home/monitoring/go'
GOPRIVATE=''
GOPROXY='https://proxy.golang.org,direct'
GOROOT='/usr/lib/go'
GOSUMDB='sum.golang.org'
GOTMPDIR=''
GOTOOLCHAIN='auto'
GOTOOLDIR='/usr/lib/go/pkg/tool/linux_amd64'
GOVCS=''
GOVERSION='go1.22.7'
GCCGO='gccgo'
GOAMD64='v1'
AR='ar'
CC='gcc'
CXX='g++'
CGO_ENABLED='1'
GOMOD='/home/monitoring/go-test/go.mod'
GOWORK=''
CGO_CFLAGS='-O2 -g'
CGO_CPPFLAGS=''
CGO_CXXFLAGS='-O2 -g'
CGO_FFLAGS='-O2 -g'
CGO_LDFLAGS='-O2 -g'
PKG_CONFIG='pkg-config'
GOGCCFLAGS='-fPIC -m64 -pthread -Wl,--no-gc-sections -fmessage-length=0 -ffile-prefix-map=/tmp/go-build4029050489=/tmp/go-build -gno-record-gcc-switches'
What did you do?
I have a bunch of tools written in golang that use net/http to do queries in parallel. Recently something happened, not sure yet what exactly, but all of the queries started to time out.
All of my tools were using http.DefaultTransport, either explicitly or implicitly.
After investigating, it seems like replacing http.DefaultTransport with its .Clone(), all the problems went away. So I assume there's something mutating http.DefaultTransport or in some other way making it unusable.
This also seem to happen only on one of my servers, and I was not able to reproduce it on my local laptop, not sure what is the difference between the two.
I used this script to reproduce the behaviour and I was successfully able to reproduce it on my server:
package main
import (
"net/http"
"fmt"
"io/ioutil"
"sync"
)
func main() {
urls := []string{
"https://rest.provider-sentry-01.rs-testnet.polypore.xyz/cosmos/slashing/v1beta1/signing_infos/cosmosvalcons1d8h57qw3d2upngacprs4v2gdztq2k3q3ztrqkm",
"https://rest.provider-sentry-01.rs-testnet.polypore.xyz/interchain_security/ccv/provider/consumer_validators/pion-1",
"https://rest.provider-sentry-01.rs-testnet.polypore.xyz/cosmos/staking/v1beta1/params",
"https://rest.provider-sentry-01.rs-testnet.polypore.xyz/interchain_security/ccv/provider/consumer_chains_per_validator/cosmosvalcons1d8h57qw3d2upngacprs4v2gdztq2k3q3ztrqkm",
"https://rest.provider-sentry-01.rs-testnet.polypore.xyz/interchain_security/ccv/provider/validator_consumer_addr?chain_id=pion-1&provider_address=cosmosvalcons1d8h57qw3d2upngacprs4v2gdztq2k3q3ztrqkm",
"https://rest.provider-sentry-01.rs-testnet.polypore.xyz/cosmos/staking/v1beta1/validators/cosmosvaloper1zyqledl5wr5rr8l4sx7lkuf9x4qrudmmm72xxp/delegations/cosmos1zyqledl5wr5rr8l4sx7lkuf9x4qrudmm727n2j",
"https://rest.provider-sentry-01.rs-testnet.polypore.xyz/cosmos/distribution/v1beta1/validators/cosmosvaloper1zyqledl5wr5rr8l4sx7lkuf9x4qrudmmm72xxp/commission",
"https://rest.provider-sentry-01.rs-testnet.polypore.xyz/cosmos/bank/v1beta1/supply?pagination.limit=10000&pagination.offset=0",
"https://rest.provider-sentry-01.rs-testnet.polypore.xyz/cosmos/staking/v1beta1/validators?pagination.count_total=true&pagination.limit=1000",
"https://rest.provider-sentry-01.rs-testnet.polypore.xyz/cosmos/distribution/v1beta1/delegators/cosmos1zyqledl5wr5rr8l4sx7lkuf9x4qrudmm727n2j/rewards/cosmosvaloper1zyqledl5wr5rr8l4sx7lkuf9x4qrudmmm72xxp",
"https://rest.provider-sentry-01.rs-testnet.polypore.xyz/interchain_security/ccv/provider/consumer_chains",
"https://rest.provider-sentry-01.rs-testnet.polypore.xyz/cosmos/staking/v1beta1/validators/cosmosvaloper1zyqledl5wr5rr8l4sx7lkuf9x4qrudmmm72xxp/delegations?pagination.count_total=true&pagination.limit=1",
"https://rest.provider-sentry-01.rs-testnet.polypore.xyz/cosmos/slashing/v1beta1/params",
"https://rest.provider-sentry-01.rs-testnet.polypore.xyz/interchain_security/ccv/provider/consumer_commission_rate/pion-1/cosmosvalcons1d8h57qw3d2upngacprs4v2gdztq2k3q3ztrqkm",
"https://rest.provider-sentry-01.rs-testnet.polypore.xyz/cosmos/mint/v1beta1/inflation",
"https://rest.provider-sentry-01.rs-testnet.polypore.xyz/cosmos/staking/v1beta1/validators/cosmosvaloper1zyqledl5wr5rr8l4sx7lkuf9x4qrudmmm72xxp/unbonding_delegations?pagination.count_total=true&pagination.limit=1",
"https://rest.provider-sentry-01.rs-testnet.polypore.xyz/cosmos/base/tendermint/v1beta1/node_info",
"https://rest.provider-sentry-01.rs-testnet.polypore.xyz/cosmos/bank/v1beta1/balances/cosmos1zyqledl5wr5rr8l4sx7lkuf9x4qrudmm727n2j",
}
var wg sync.WaitGroup
for {
for _, url := range urls {
wg.Add(1)
go func (url string) {
defer wg.Done()
client := &http.Client{
Timeout: 10 * 1000000000,
Transport: http.DefaultTransport,
}
req, err := http.NewRequest(http.MethodGet, url, nil)
if err != nil {
fmt.Printf("query failed: %s\n", err)
return
}
fmt.Printf("doing query\n")
res, err := client.Do(req)
if err != nil {
fmt.Printf("query failed: %s\n", err)
return
}
body, _ := ioutil.ReadAll(res.Body)
res.Body.Close()
fmt.Printf("query done: %s\n", body)
}(url)
}
wg.Wait()
}
}
What did you see happen?
After running it for some time, I started seeing this repeatedly (later it starts working correctly, then stops, and all over again):
doing query
doing query
doing query
doing query
doing query
doing query
doing query
doing query
doing query
doing query
doing query
doing query
doing query
doing query
doing query
doing query
doing query
doing query
query failed: Get "https://rest.provider-sentry-01.rs-testnet.polypore.xyz/cosmos/staking/v1beta1/validators?pagination.count_total=true&pagination.limit=1000": context deadline exceeded (Client.Timeout exceeded while awaiting headers)
query failed: Get "https://rest.provider-sentry-01.rs-testnet.polypore.xyz/cosmos/staking/v1beta1/validators/cosmosvaloper1zyqledl5wr5rr8l4sx7lkuf9x4qrudmmm72xxp/delegations/cosmos1zyqledl5wr5rr8l4sx7lkuf9x4qrudmm727n2j": context deadline exceeded (Client.Timeout exceeded while awaiting headers)
query failed: Get "https://rest.provider-sentry-01.rs-testnet.polypore.xyz/interchain_security/ccv/provider/consumer_chains_per_validator/cosmosvalcons1d8h57qw3d2upngacprs4v2gdztq2k3q3ztrqkm": context deadline exceeded (Client.Timeout exceeded while awaiting headers)
query failed: Get "https://rest.provider-sentry-01.rs-testnet.polypore.xyz/interchain_security/ccv/provider/consumer_validators/pion-1": context deadline exceeded (Client.Timeout exceeded while awaiting headers)
query failed: Get "https://rest.provider-sentry-01.rs-testnet.polypore.xyz/cosmos/staking/v1beta1/validators/cosmosvaloper1zyqledl5wr5rr8l4sx7lkuf9x4qrudmmm72xxp/unbonding_delegations?pagination.count_total=true&pagination.limit=1": context deadline exceeded (Client.Timeout exceeded while awaiting headers)
query failed: Get "https://rest.provider-sentry-01.rs-testnet.polypore.xyz/cosmos/staking/v1beta1/params": context deadline exceeded (Client.Timeout exceeded while awaiting headers)
query failed: Get "https://rest.provider-sentry-01.rs-testnet.polypore.xyz/cosmos/mint/v1beta1/inflation": context deadline exceeded (Client.Timeout exceeded while awaiting headers)
query failed: Get "https://rest.provider-sentry-01.rs-testnet.polypore.xyz/cosmos/bank/v1beta1/balances/cosmos1zyqledl5wr5rr8l4sx7lkuf9x4qrudmm727n2j": context deadline exceeded (Client.Timeout exceeded while awaiting headers)
query failed: Get "https://rest.provider-sentry-01.rs-testnet.polypore.xyz/cosmos/bank/v1beta1/supply?pagination.limit=10000&pagination.offset=0": context deadline exceeded (Client.Timeout exceeded while awaiting headers)
query failed: Get "https://rest.provider-sentry-01.rs-testnet.polypore.xyz/cosmos/distribution/v1beta1/validators/cosmosvaloper1zyqledl5wr5rr8l4sx7lkuf9x4qrudmmm72xxp/commission": context deadline exceeded (Client.Timeout exceeded while awaiting headers)
query failed: Get "https://rest.provider-sentry-01.rs-testnet.polypore.xyz/interchain_security/ccv/provider/validator_consumer_addr?chain_id=pion-1&provider_address=cosmosvalcons1d8h57qw3d2upngacprs4v2gdztq2k3q3ztrqkm": context deadline exceeded (Client.Timeout exceeded while awaiting headers)
query failed: Get "https://rest.provider-sentry-01.rs-testnet.polypore.xyz/cosmos/base/tendermint/v1beta1/node_info": context deadline exceeded (Client.Timeout exceeded while awaiting headers)
query failed: Get "https://rest.provider-sentry-01.rs-testnet.polypore.xyz/cosmos/slashing/v1beta1/signing_infos/cosmosvalcons1d8h57qw3d2upngacprs4v2gdztq2k3q3ztrqkm": context deadline exceeded (Client.Timeout exceeded while awaiting headers)
query failed: Get "https://rest.provider-sentry-01.rs-testnet.polypore.xyz/interchain_security/ccv/provider/consumer_commission_rate/pion-1/cosmosvalcons1d8h57qw3d2upngacprs4v2gdztq2k3q3ztrqkm": context deadline exceeded (Client.Timeout exceeded while awaiting headers)
query failed: Get "https://rest.provider-sentry-01.rs-testnet.polypore.xyz/cosmos/staking/v1beta1/validators/cosmosvaloper1zyqledl5wr5rr8l4sx7lkuf9x4qrudmmm72xxp/delegations?pagination.count_total=true&pagination.limit=1": context deadline exceeded (Client.Timeout exceeded while awaiting headers)
query failed: Get "https://rest.provider-sentry-01.rs-testnet.polypore.xyz/cosmos/slashing/v1beta1/params": context deadline exceeded (Client.Timeout exceeded while awaiting headers)
query failed: Get "https://rest.provider-sentry-01.rs-testnet.polypore.xyz/interchain_security/ccv/provider/consumer_chains": context deadline exceeded (Client.Timeout exceeded while awaiting headers)
query failed: Get "https://rest.provider-sentry-01.rs-testnet.polypore.xyz/cosmos/distribution/v1beta1/delegators/cosmos1zyqledl5wr5rr8l4sx7lkuf9x4qrudmm727n2j/rewards/cosmosvaloper1zyqledl5wr5rr8l4sx7lkuf9x4qrudmmm72xxp": context deadline exceeded (Client.Timeout exceeded while awaiting headers)
This seems to happen on all (or almost all) hosts that are queried by a client that is utilising this http.DefaultTransport.
What did you expect to see?
All the queries should work and not time out.