Skip to content

Conversation

@BewareMyPower
Copy link
Contributor

Documentation

  • doc
  • doc-required
  • doc-not-needed
  • doc-complete

Matching PR in forked repository

PR in forked repository:

@github-actions github-actions bot added PIP doc Your PR contains doc changes, no matter whether the changes are in markdown or code files. labels Jan 21, 2026
@BewareMyPower
Copy link
Contributor Author

"metadata-store-10-1" #27 [72] prio=5 os_prio=0 cpu=17707.20ms elapsed=5258.47s
"configuration-metadata-store-13-1" #31 [76] prio=5 os_prio=0 cpu=17575.64ms elapsed=5257.48s
"bookkeeper-ml-scheduler-OrderedScheduler-0-0" #54 [98] prio=5 os_prio=0 cpu=105.04ms elapsed=5255.36s
"bookkeeper-ml-scheduler-OrderedScheduler-1-0" #55 [99] prio=5 os_prio=0 cpu=138.76ms elapsed=5255.36s
"bookkeeper-ml-scheduler-OrderedScheduler-2-0" #56 [100] prio=5 os_prio=0 cpu=113.28ms elapsed=5255.36s

The metadata store thread even spends much more CPU time than the bookkeeper-ml worker threads.

@lhotari
Copy link
Member

lhotari commented Jan 22, 2026

The metadata store thread even spends much more CPU time than the bookkeeper-ml worker threads.

It's not related to this PIP, but there's also a possibility to save CPU in derialization. Due to consistency reasons, the MetadataStore cache entries expire after 10 minutes. There's a background refresh in use which means that if the entry has been used before it expires, a new refresh will happen in the background between 5 to 10 minutes from the last refresh.
In many cases, there haven't been any changes since the last refresh. Therefore the deserialization step is completely unnecessary when there haven't been any changes. The previous deserialized value could be used instead of deserializing again.

Another detail related to wasted CPU. When an entry gets modified, it would get refreshed 2 times:

@Override
public CompletableFuture<Void> put(String path, T value, EnumSet<CreateOption> options) {
final byte[] bytes;
try {
bytes = serde.serialize(path, value);
} catch (IOException e) {
return CompletableFuture.failedFuture(e);
}
if (storeExtended != null) {
return storeExtended.put(path, bytes, Optional.empty(), options).thenAccept(__ -> refresh(path));
} else {
return store.put(path, bytes, Optional.empty()).thenAccept(__ -> refresh(path));
}
}

public void accept(Notification t) {
String path = t.getPath();
switch (t.getType()) {
case Created:
case Modified:
refresh(path);
break;

@BewareMyPower
Copy link
Contributor Author

There are much room to improve for metadata store. I will open a series of PRs in next few weeks. Regarding the cache, I think it should be okay because the cache refresh interval is 5 minutes, which is long enough. Actually I don't think the cache here makes sense. The metadata store listener is able to update the cache.

What I can think of is that the cache can prevent outdated metadata in case the listener didn't work correctly. But from such perspective, 5 minutes would be too long.

@BewareMyPower
Copy link
Contributor Author

BewareMyPower commented Jan 22, 2026

BTW, I just ran a round of test with the new threading model (as well as a few improvements to move the compute sensitive tasks out of the metadata store thread).

"metadata-store-serdes-OrderedExecutor-0-0" #27 [83] prio=5 os_prio=0 cpu=288.75ms
"metadata-store-serdes-OrderedExecutor-1-0" #28 [84] prio=5 os_prio=0 cpu=165.31ms
"metadata-store-serdes-OrderedExecutor-2-0" #29 [85] prio=5 os_prio=0 cpu=252.60ms
"metadata-store-batch-flusher-12-1" #30 [86] prio=5 os_prio=0 cpu=1217.19ms
"metadata-store-events-10-1" #59 [114] prio=5 os_prio=0 cpu=333.75ms
"main-EventThread" #32 [88] daemon prio=5 os_prio=0 cpu=89.03ms

Before this change, the tasks executed by batch-flusher (as well as other 3 serdes threads) would be executed by events thread.

@nodece
Copy link
Member

nodece commented Jan 23, 2026

Makes sense. If multiple operators (flush/serialization/deserialization) depend on the same thread, that thread becomes a bottleneck. Using separate threads here looks good to me.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

doc Your PR contains doc changes, no matter whether the changes are in markdown or code files. PIP release/4.0.9 release/4.1.3

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants