Object PUT operations fail after hard shut down one of the storage nodes #1746

Closed
anatoly-bogatyrev opened this issue Sep 1, 2022 · 4 comments · Fixed by #1762
Labels: bug (Something isn't working), U0 (Needs to be resolved immediately)

Comments

@anatoly-bogatyrev

Object PUT operations fail after a hard shutdown of one of the storage nodes.
The cluster contains 4 nodes (az, buky, vedi, glagoli).
Load is generated against az, buky, and vedi.
Glagoli was shut down by turning off its power.
The PUT flow to the other nodes then freezes, and the attempts end by timeout.

@fyrchik
Contributor

fyrchik commented Sep 1, 2022

Could you please attach logs and configs? Also, how long did the hang last (the timeout) on the client side?

@anatoly-bogatyrev
Author

Yes, sure. Logs and details will be added ASAP.

@andreysbazhenov andreysbazhenov added the bug Something isn't working label Sep 1, 2022
@fyrchik
Contributor

fyrchik commented Sep 2, 2022

Here are the relevant parts of the hanging goroutines. It seems that, because we use the same pool for client operations and the policer, we hang in a HEAD call and can't really enforce the PUT timeout, because the PUT cannot even start to execute.

257 @ 0x43b6b6 0x44b132 0x83e8fc 0x8b83be 0x8b83ab 0x8b7825 0x8b6493 0x8b753f 0x9cdf4d 0x9ce389 0x46b901
#	0x83e8fb	google.golang.org/grpc/internal/transport.(*Stream).waitOnHeader+0x7b			google.golang.org/grpc@v1.48.0/internal/transport/transport.go:324
#	0x8b83bd	google.golang.org/grpc/internal/transport.(*Stream).RecvCompress+0xbd			google.golang.org/grpc@v1.48.0/internal/transport/transport.go:339
#	0x8b83aa	google.golang.org/grpc.(*csAttempt).recvMsg+0xaa					google.golang.org/grpc@v1.48.0/stream.go:980
#	0x8b7824	google.golang.org/grpc.(*clientStream).RecvMsg.func1+0x24				google.golang.org/grpc@v1.48.0/stream.go:845
#	0x8b6492	google.golang.org/grpc.(*clientStream).withRetry+0xd2					google.golang.org/grpc@v1.48.0/stream.go:709
#	0x8b753e	google.golang.org/grpc.(*clientStream).RecvMsg+0x11e					google.golang.org/grpc@v1.48.0/stream.go:844
#	0x9cdf4c	github.com/nspcc-dev/neofs-api-go/v2/rpc/grpc.streamWrapper.ReadMessage.func1+0x2c	github.com/nspcc-dev/neofs-api-go/v2@v2.13.2-0.20220827080658-9e17cdfc7647/rpc/grpc/init.go:39
#	0x9ce388	github.com/nspcc-dev/neofs-api-go/v2/rpc/grpc.(*streamWrapper).withTimeout.func1+0x28	github.com/nspcc-dev/neofs-api-go/v2@v2.13.2-0.20220827080658-9e17cdfc7647/rpc/grpc/init.go:56

256 @ 0x43b6b6 0x44b132 0x9ce28f 0x9cdeca 0xb83868 0xb83011 0xbc0ac6 0xbcd51a 0xcc86bc 0xcc7924 0xcc77fb 0xcc776f 0xcc85a7 0xe0f31a 0xbd6055 0xdda29c 0xddf286 0xddeaf3 0xde08e5 0xbbe057 0x46b901
#	0x9ce28e	github.com/nspcc-dev/neofs-api-go/v2/rpc/grpc.(*streamWrapper).withTimeout+0x10e		github.com/nspcc-dev/neofs-api-go/v2@v2.13.2-0.20220827080658-9e17cdfc7647/rpc/grpc/init.go:62
#	0x9cdec9	github.com/nspcc-dev/neofs-api-go/v2/rpc/grpc.streamWrapper.ReadMessage+0xe9			github.com/nspcc-dev/neofs-api-go/v2@v2.13.2-0.20220827080658-9e17cdfc7647/rpc/grpc/init.go:38
#	0xb83867	github.com/nspcc-dev/neofs-api-go/v2/rpc/client.rwGRPC.ReadMessage+0x67				github.com/nspcc-dev/neofs-api-go/v2@v2.13.2-0.20220827080658-9e17cdfc7647/rpc/client/init.go:59
#	0xb83010	github.com/nspcc-dev/neofs-api-go/v2/rpc/client.SendUnary+0xd0					github.com/nspcc-dev/neofs-api-go/v2@v2.13.2-0.20220827080658-9e17cdfc7647/rpc/client/flows.go:25
#	0xbc0ac5	github.com/nspcc-dev/neofs-api-go/v2/rpc.HeadObject+0x125					github.com/nspcc-dev/neofs-api-go/v2@v2.13.2-0.20220827080658-9e17cdfc7647/rpc/object.go:152
#	0xbcd519	github.com/nspcc-dev/neofs-sdk-go/client.(*Client).ObjectHead+0x299				github.com/nspcc-dev/neofs-sdk-go@v1.0.0-rc.6.0.20220829114550-ee92df32032e/client/object_get.go:441
#	0xcc86bb	github.com/nspcc-dev/neofs-node/pkg/network/cache.(*multiClient).ObjectHead.func1+0x9b		github.com/nspcc-dev/neofs-node/pkg/network/cache/multi.go:168
#	0xcc7923	github.com/nspcc-dev/neofs-node/pkg/network/cache.(*multiClient).iterateClients.func1+0xc3	github.com/nspcc-dev/neofs-node/pkg/network/cache/multi.go:107
#	0xcc77fa	github.com/nspcc-dev/neofs-node/pkg/network.AddressGroup.IterateAddresses+0xba			github.com/nspcc-dev/neofs-node/pkg/network/group.go:37
#	0xcc776e	github.com/nspcc-dev/neofs-node/pkg/network/cache.(*multiClient).iterateClients+0x2e		github.com/nspcc-dev/neofs-node/pkg/network/cache/multi.go:95
#	0xcc85a6	github.com/nspcc-dev/neofs-node/pkg/network/cache.(*multiClient).ObjectHead+0x86		github.com/nspcc-dev/neofs-node/pkg/network/cache/multi.go:167
#	0xe0f319	main.(*reputationClient).ObjectHead+0x79							github.com/nspcc-dev/neofs-node/cmd/neofs-node/object.go:494
#	0xbd6054	github.com/nspcc-dev/neofs-node/pkg/services/object/internal/client.HeadObject+0x1f4		github.com/nspcc-dev/neofs-node/pkg/services/object/internal/client/client.go:239
#	0xdda29b	github.com/nspcc-dev/neofs-node/pkg/services/object/head.(*RemoteHeader).Head+0x43b		github.com/nspcc-dev/neofs-node/pkg/services/object/head/remote.go:93
#	0xddf285	github.com/nspcc-dev/neofs-node/pkg/services/policer.(*Policer).processNodes+0x4e5		github.com/nspcc-dev/neofs-node/pkg/services/policer/check.go:127
#	0xddeaf2	github.com/nspcc-dev/neofs-node/pkg/services/policer.(*Policer).processObject+0xad2		github.com/nspcc-dev/neofs-node/pkg/services/policer/check.go:79
#	0xde08e4	github.com/nspcc-dev/neofs-node/pkg/services/policer.(*Policer).shardPolicyWorker.func1+0x1a4	github.com/nspcc-dev/neofs-node/pkg/services/policer/process.go:65
#	0xbbe056	github.com/panjf2000/ants/v2.(*goWorker).run.func1+0x96						github.com/panjf2000/ants/v2@v2.4.0/worker.go:68

144 @ 0x43b6b6 0x44c213 0x44c1ed 0x4677a5 0x472dd2 0xbd7c9c 0xbd7870 0xbbb31a 0xbbc627 0xbbb5de 0xbdbb63 0xbdb468 0xddac6c 0xdcb213 0xdbbf22 0xac26a6 0xdbbc48 0xdbd082 0xa43a19 0xdbcd45 0xdbb86d 0xdc0ebb 0xa1405f 0x8aefe6 0x8b0816 0x8a9a98 0x46b901
#	0x4677a4	sync.runtime_Semacquire+0x24											runtime/sema.go:56
#	0x472dd1	sync.(*WaitGroup).Wait+0x51											sync/waitgroup.go:136
#	0xbd7c9b	github.com/nspcc-dev/neofs-node/pkg/services/object/put.(*distributedTarget).iteratePlacement+0x21b		github.com/nspcc-dev/neofs-node/pkg/services/object/put/distributed.go:213
#	0xbd786f	github.com/nspcc-dev/neofs-node/pkg/services/object/put.(*distributedTarget).Close+0x10f			github.com/nspcc-dev/neofs-node/pkg/services/object/put/distributed.go:127
#	0xbbb319	github.com/nspcc-dev/neofs-node/pkg/services/object_manager/transformer.(*formatter).Close+0x3d9		github.com/nspcc-dev/neofs-node/pkg/services/object_manager/transformer/fmt.go:103
#	0xbbc626	github.com/nspcc-dev/neofs-node/pkg/services/object_manager/transformer.(*payloadSizeLimiter).release+0x126	github.com/nspcc-dev/neofs-node/pkg/services/object_manager/transformer/transformer.go:197
#	0xbbb5dd	github.com/nspcc-dev/neofs-node/pkg/services/object_manager/transformer.(*payloadSizeLimiter).Close+0x1d	github.com/nspcc-dev/neofs-node/pkg/services/object_manager/transformer/transformer.go:76
#	0xbdbb62	github.com/nspcc-dev/neofs-node/pkg/services/object/put.(*validatingTarget).Close+0x82				github.com/nspcc-dev/neofs-node/pkg/services/object/put/validation.go:129
#	0xbdb467	github.com/nspcc-dev/neofs-node/pkg/services/object/put.(*Streamer).Close+0x47					github.com/nspcc-dev/neofs-node/pkg/services/object/put/streamer.go:263
#	0xddac6b	github.com/nspcc-dev/neofs-node/pkg/services/object/put/v2.(*streamer).CloseAndRecv+0x6b			github.com/nspcc-dev/neofs-node/pkg/services/object/put/v2/streamer.go:113
#	0xdcb212	github.com/nspcc-dev/neofs-node/pkg/services/object/acl/v2.putStreamBasicChecker.CloseAndRecv+0x32		github.com/nspcc-dev/neofs-node/pkg/services/object/acl/v2/service.go:476
#	0xdbbf21	github.com/nspcc-dev/neofs-node/pkg/services/object.(*ResponseService).Put.func2+0x21				github.com/nspcc-dev/neofs-node/pkg/services/object/response.go:87
#	0xac26a5	github.com/nspcc-dev/neofs-node/pkg/services/util/response.(*ClientMessageStreamer).CloseAndRecv+0x25		github.com/nspcc-dev/neofs-node/pkg/services/util/response/client_stream.go:30
#	0xdbbc47	github.com/nspcc-dev/neofs-node/pkg/services/object.(*putStreamResponser).CloseAndRecv+0x27			github.com/nspcc-dev/neofs-node/pkg/services/object/response.go:67
#	0xdbd081	github.com/nspcc-dev/neofs-node/pkg/services/object.(*SignService).Put.func2+0x21				github.com/nspcc-dev/neofs-node/pkg/services/object/sign.go:98
#	0xa43a18	github.com/nspcc-dev/neofs-node/pkg/services/util.(*RequestMessageStreamer).CloseAndRecv+0x38			github.com/nspcc-dev/neofs-node/pkg/services/util/sign.go:99
#	0xdbcd44	github.com/nspcc-dev/neofs-node/pkg/services/object.(*putStreamSigner).CloseAndRecv+0x24			github.com/nspcc-dev/neofs-node/pkg/services/object/sign.go:78
#	0xdbb86c	github.com/nspcc-dev/neofs-node/pkg/services/object.putStreamMetric.CloseAndRecv+0xec				github.com/nspcc-dev/neofs-node/pkg/services/object/metrics.go:163
#	0xdc0eba	github.com/nspcc-dev/neofs-node/pkg/network/transport/object/grpc.(*Server).Put+0x1ba				github.com/nspcc-dev/neofs-node/pkg/network/transport/object/grpc/service.go:38
#	0xa1405e	github.com/nspcc-dev/neofs-api-go/v2/object/grpc._ObjectService_Put_Handler+0x9e				github.com/nspcc-dev/neofs-api-go/v2@v2.13.2-0.20220827080658-9e17cdfc7647/object/grpc/service_grpc.pb.go:657
#	0x8aefe5	google.golang.org/grpc.(*Server).processStreamingRPC+0xf45							google.golang.org/grpc@v1.48.0/server.go:1558
#	0x8b0815	google.golang.org/grpc.(*Server).handleStream+0x9d5								google.golang.org/grpc@v1.48.0/server.go:1640
#	0x8a9a97	google.golang.org/grpc.(*Server).serveStreams.func1.2+0x97							google.golang.org/grpc@v1.48.0/server.go:932

119 @ 0x43b6b6 0x44b132 0x81e0ae 0x82ab35 0x8b805c 0x8b7353 0x8b66b6 0x8b7025 0x9ce0cd 0x9ce389 0x46b901
#	0x81e0ad	google.golang.org/grpc/internal/transport.(*writeQuota).get+0x6d			google.golang.org/grpc@v1.48.0/internal/transport/flowcontrol.go:59
#	0x82ab34	google.golang.org/grpc/internal/transport.(*http2Client).Write+0x134			google.golang.org/grpc@v1.48.0/internal/transport/http2_client.go:970
#	0x8b805b	google.golang.org/grpc.(*csAttempt).sendMsg+0x2db					google.golang.org/grpc@v1.48.0/stream.go:954
#	0x8b7352	google.golang.org/grpc.(*clientStream).SendMsg.func2+0x52				google.golang.org/grpc@v1.48.0/stream.go:823
#	0x8b66b5	google.golang.org/grpc.(*clientStream).withRetry+0x2f5					google.golang.org/grpc@v1.48.0/stream.go:705
#	0x8b7024	google.golang.org/grpc.(*clientStream).SendMsg+0x324					google.golang.org/grpc@v1.48.0/stream.go:825
#	0x9ce0cc	github.com/nspcc-dev/neofs-api-go/v2/rpc/grpc.streamWrapper.WriteMessage.func1+0x2c	github.com/nspcc-dev/neofs-api-go/v2@v2.13.2-0.20220827080658-9e17cdfc7647/rpc/grpc/init.go:45
#	0x9ce388	github.com/nspcc-dev/neofs-api-go/v2/rpc/grpc.(*streamWrapper).withTimeout.func1+0x28	github.com/nspcc-dev/neofs-api-go/v2@v2.13.2-0.20220827080658-9e17cdfc7647/rpc/grpc/init.go:56

119 @ 0x43b6b6 0x44b132 0x9ce28f 0x9ce04a 0xb8393a 0xbc01cb 0xbcffdb 0xbd6ca5 0xbd9038 0xbd79a2 0xbd8517 0xbbe057 0x46b901
#	0x9ce28e	github.com/nspcc-dev/neofs-api-go/v2/rpc/grpc.(*streamWrapper).withTimeout+0x10e				github.com/nspcc-dev/neofs-api-go/v2@v2.13.2-0.20220827080658-9e17cdfc7647/rpc/grpc/init.go:62
#	0x9ce049	github.com/nspcc-dev/neofs-api-go/v2/rpc/grpc.streamWrapper.WriteMessage+0xe9					github.com/nspcc-dev/neofs-api-go/v2@v2.13.2-0.20220827080658-9e17cdfc7647/rpc/grpc/init.go:44
#	0xb83939	github.com/nspcc-dev/neofs-api-go/v2/rpc/client.rwGRPC.WriteMessage+0x59					github.com/nspcc-dev/neofs-api-go/v2@v2.13.2-0.20220827080658-9e17cdfc7647/rpc/client/init.go:67
#	0xbc01ca	github.com/nspcc-dev/neofs-api-go/v2/rpc.(*PutRequestWriter).Write+0x2a						github.com/nspcc-dev/neofs-api-go/v2@v2.13.2-0.20220827080658-9e17cdfc7647/rpc/object.go:32
#	0xbcffda	github.com/nspcc-dev/neofs-sdk-go/client.(*ObjectWriter).WritePayloadChunk+0x19a				github.com/nspcc-dev/neofs-sdk-go@v1.0.0-rc.6.0.20220829114550-ee92df32032e/client/object_put.go:168
#	0xbd6ca4	github.com/nspcc-dev/neofs-node/pkg/services/object/internal/client.PutObject+0x364				github.com/nspcc-dev/neofs-node/pkg/services/object/internal/client/client.go:398
#	0xbd9037	github.com/nspcc-dev/neofs-node/pkg/services/object/put.(*remoteTarget).Close+0x477				github.com/nspcc-dev/neofs-node/pkg/services/object/put/remote.go:81
#	0xbd79a1	github.com/nspcc-dev/neofs-node/pkg/services/object/put.(*distributedTarget).sendObject+0x101			github.com/nspcc-dev/neofs-node/pkg/services/object/put/distributed.go:139
#	0xbd8516	github.com/nspcc-dev/neofs-node/pkg/services/object/put.(*distributedTarget).iteratePlacement.func1+0x136	github.com/nspcc-dev/neofs-node/pkg/services/object/put/distributed.go:189
#	0xbbe056	github.com/panjf2000/ants/v2.(*goWorker).run.func1+0x96								github.com/panjf2000/ants/v2@v2.4.0/worker.go:68
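
To illustrate the shared-pool hypothesis from the comment above, here is a minimal sketch of pool starvation. Everything in it is hypothetical: `pool`, `submit`, and `putStarts` are invented for the example and are not the neofs-node or ants API. The long-running tasks stand in for policer HEAD calls stuck on a powered-off node. The context passed to `submit` exists only so the demo terminates; a submit call without a deadline would simply block.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// pool is a hypothetical fixed-size worker pool (not the ants API).
type pool struct {
	slots chan struct{}
}

func newPool(size int) *pool {
	return &pool{slots: make(chan struct{}, size)}
}

// submit waits for a free worker slot, then runs task in a new goroutine.
// It gives up with ctx.Err() if no slot becomes free before ctx is done.
func (p *pool) submit(ctx context.Context, task func()) error {
	select {
	case p.slots <- struct{}{}:
	case <-ctx.Done():
		return ctx.Err()
	}
	go func() {
		defer func() { <-p.slots }() // free the slot when the task returns
		task()
	}()
	return nil
}

// putStarts reports whether a PUT task gets a worker while every policer
// worker is blocked. With shared == true both kinds of work use one pool.
func putStarts(shared bool) bool {
	const workers = 2
	policerPool := newPool(workers)
	putPool := policerPool
	if !shared {
		putPool = newPool(workers)
	}

	// Simulate policer HEAD calls that never get a response.
	release := make(chan struct{})
	defer close(release)
	for i := 0; i < workers; i++ {
		_ = policerPool.submit(context.Background(), func() { <-release })
	}

	// The PUT task would apply its own timeout, but only once it runs.
	ctx, cancel := context.WithTimeout(context.Background(), 50*time.Millisecond)
	defer cancel()
	started := make(chan struct{})
	if err := putPool.submit(ctx, func() { close(started) }); err != nil {
		return false
	}
	<-started
	return true
}

func main() {
	fmt.Println("shared pool, PUT started:", putStarts(true))
	fmt.Println("separate pools, PUT started:", putStarts(false))
}
```

In the shared case the PUT never reaches a worker, so any timeout implemented inside the task has no chance to fire. Giving client operations their own pool is one possible mitigation; the sketch makes no claim about what the linked fix actually changed.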

cthulhu-rider pushed a commit that referenced this issue Sep 7, 2022
Signed-off-by: Evgenii Stratonikov <evgeniy@morphbits.ru>
@fyrchik
Contributor

fyrchik commented Sep 7, 2022

The expected behaviour now: some requests can hang for up to apiclient.stream_timeout, and throughput can decrease a little. After that it should return to normal, maybe with slight degradation.

aprasolova pushed a commit to aprasolova/neofs-node that referenced this issue Oct 19, 2022
Signed-off-by: Evgenii Stratonikov <evgeniy@morphbits.ru>

4 participants