Problem
During SH testing, a Storage Manager (SM) process exhausted its thread resources, reaching 541 threads against a normal baseline of roughly 80. Thread dump analysis shows the following distribution (a small sketch for sampling the live count follows the list):
Thread Distribution:
- 209 threads: grpc_core::Executor::ThreadMain (gRPC's global executor pool, all idle)
- 246 threads: Empty stack (0x0000000000000000, likely leaked/zombie threads)
- 32 threads: grpc_threadpool (gRPC internal thread pool)
- 54 threads: Application threads (folly, nuraft, iomanager, etc.) - normal
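For reference, the live thread count can be sampled without taking a full dump. A minimal Linux-only sketch; the helper name is illustrative and not part of the SM codebase:
#include <fstream>
#include <iostream>
#include <string>

// Reads the "Threads:" field from /proc/self/status for the calling process.
static int current_thread_count() {
    std::ifstream status{"/proc/self/status"};
    for (std::string line; std::getline(status, line);) {
        if (line.rfind("Threads:", 0) == 0) { return std::stoi(line.substr(8)); }
    }
    return -1; // field not found
}

int main() {
    std::cout << "live thread count: " << current_thread_count() << std::endl;
    return 0;
}
Sampling this periodically (or through an existing metrics hook) would show whether the count grows steadily or spikes under connection churn.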
AI Analysis - Possible Causes
- gRPC executor unbounded growth: gRPC's internal thread pool auto-scales with load but never shrinks. Without a ResourceQuota limit, a high volume of concurrent RPC calls or connection churn causes threads to accumulate.
- Thread lifecycle issue: the 246 empty-stack threads suggest a thread-cleanup problem, possibly in supporting libraries (folly/nuraft/boost.asio) or at the OS level.
Current State
The sisl gRPC wrapper (sisl/src/grpc/rpc_server.cpp:44-79) does not set resource quotas:
m_builder.SetMaxReceiveMessageSize(max_receive_msg_size);
m_builder.SetMaxSendMessageSize(max_send_msg_size);
// Missing: ResourceQuota to limit executor threads
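If a cap turns out to be appropriate, a quota can be attached to the builder. A minimal sketch, assuming a placeholder limit of 64 threads and a placeholder quota name (the right MaxThreads value is still to be determined, as noted below):
#include <grpcpp/grpcpp.h>
#include <grpcpp/resource_quota.h>

// Illustrative only: bounds the number of threads gRPC may spawn for this
// server. The quota name "sm_grpc_quota" and the value 64 are placeholders.
void set_thread_quota(grpc::ServerBuilder& builder) {
    grpc::ResourceQuota quota{"sm_grpc_quota"};
    quota.SetMaxThreads(64);
    builder.SetResourceQuota(quota);
}
Note that this bounds only gRPC-owned threads; it would not by itself address the 246 empty-stack threads if those originate elsewhere.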
Why This Is Not Being Fixed Now
- Root cause unclear: needs further confirmation by a human
- Impact minimal: the pod auto-restarts when it hits the limit, so there is no persistent service degradation
- Optimal limit unknown: an appropriate MaxThreads value still needs to be determined