Description
We recently observed that reloading segments for a partial upsert table (~90 segment replicas, each around 650 MB) causes a significant increase in CPU usage (~30-40%), even when only a small set of segments is reloaded one at a time.
We looked into one server: even though only 11 segments were reloaded during that period, CPU usage was quite high.
Server (# of cores):

We are using the default max.parallel.refresh.threads = 1 in HelixInstanceDataManagerConfig, which means segments are not reloaded in parallel. In other words, the server load comes mainly from refreshing a single segment at a time.
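For reference, a sketch of how this setting would look if set explicitly in the server configuration. The `pinot.server.instance.` prefix is an assumption about how HelixInstanceDataManagerConfig keys are exposed; it is not copied from our actual config, since we rely on the default.

```properties
# Assumed full property name (prefix is an assumption; we do not set this explicitly).
# 1 is the default: segments are refreshed one at a time, with no parallelism.
pinot.server.instance.max.parallel.refresh.threads=1
```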
Is there a way to improve reload performance? This is blocking schema evolution for our upsert tables, so any suggestions or ideas are appreciated. Thanks!
