GC: Fix GC deleted stats not correctly updated for data stores #18890
agarwal-navin merged 4 commits into microsoft:main
⯅ @fluid-example/bundle-size-tests: +1.21 KB
Baseline commit: 90c4ad2 |
| loaderProps: { configProvider: mockConfigProvider(settings) }, | ||
| }; | ||
| mainContainer = await provider.makeTestContainer(testContainerConfig); | ||
| mainDataStore = (await mainContainer.getEntryPoint()) as ITestDataObject; |
Same for summarizerDataStore below
| beforeEach(async function () { | ||
| provider = getTestObjectProvider({ syncSummarizer: true }); | ||
| // These tests validate the GC stats in summary by calling summarize directly on the container runtime. | ||
| // These tests validate the GC stats in summary by calling summarize directly on the mainContainer runtime. |
I think this should not say "directly on the mainContainer". Maybe update this comment to say more about disabling summarizer heuristics and only calling it/GC explicitly, on a separate container.
Updated the comment.
| const summarizerDataStore = (await summarizerContainer.getEntryPoint()) as ITestDataObject; | ||
| summarizerRuntime = summarizerDataStore._context.containerRuntime as ContainerRuntime; | ||
| summarizerDataStore._root.set("write", "mode"); | ||
| await waitForContainerWriteModeConnectionWrite(summarizerContainer); |
Is this necessary? ensureSynchronized calls throughout should be sufficient, I would think. Seems fine to leave it since it's more explicit, but wanted to ask.
It is needed. I added a comment to explain why.
| ); | ||
| // Deleted stats. Wait for sweep timeout and send an op to update the current reference timestamp. Usually, | ||
| // GC wouldn't run without ops so this step is not needed but its needed here because we are explicitly |
typo
| // GC wouldn't run without ops so this step is not needed but its needed here because we are explicitly | |
| // GC wouldn't run without ops so this step is not needed but it's needed here because we are explicitly |
| mainDataStore._root.set("update", "timestamp"); | ||
| await provider.ensureSynchronized(); | ||
| mainContainer.close(); |
Added a comment to explain why.
| expectedGCStats.dataStoreCount -= 2; | ||
| expectedGCStats.unrefDataStoreCount -= 2; | ||
| expectedGCStats.deletedDataStoreCount += 2; | ||
| expectedGCStats.updatedDataStoreCount = 0; |
I wonder if it makes more sense to include deletions in the "updated" count? It's fine either way as long as we remember how it works when looking at the data :)
The updated count tells whether a node's reference state changed. It's mainly used here to validate that data stores are not re-summarized unexpectedly.
If a data store is deleted, its reference state doesn't change and it won't be re-summarized, so there is no need to update the count.
Like you said, we could change its meaning so that deletions are included, but I don't think it adds anything yet.
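For readers following the stats discussion above, here is a minimal sketch of how the expected counts move when data stores are swept. The four field names come from the test snippet in this thread; the `DataStoreGCStats` interface, the `expectedStatsAfterSweep` helper and the sample numbers are hypothetical and only illustrate the bookkeeping. It assumes no node changes reference state in the same GC run, which is why the updated count is reset to 0.

```typescript
// Hypothetical shape holding only the data store counters mentioned in the test.
interface DataStoreGCStats {
    dataStoreCount: number;
    unrefDataStoreCount: number;
    deletedDataStoreCount: number;
    updatedDataStoreCount: number;
}

// Hypothetical helper: computes the stats expected after `sweptCount`
// unreferenced data stores are deleted in a GC run.
function expectedStatsAfterSweep(
    previous: DataStoreGCStats,
    sweptCount: number,
): DataStoreGCStats {
    return {
        // Deleted data stores no longer count as existing or as unreferenced.
        dataStoreCount: previous.dataStoreCount - sweptCount,
        unrefDataStoreCount: previous.unrefDataStoreCount - sweptCount,
        deletedDataStoreCount: previous.deletedDataStoreCount + sweptCount,
        // Deletion is not a reference state change, so nothing counts as updated.
        updatedDataStoreCount: 0,
    };
}

// Illustrative numbers only: 5 data stores, 2 of them unreferenced, both swept.
const before: DataStoreGCStats = {
    dataStoreCount: 5,
    unrefDataStoreCount: 2,
    deletedDataStoreCount: 0,
    updatedDataStoreCount: 2,
};
const after = expectedStatsAfterSweep(before, 2);
```

If deletions were ever folded into the updated count, as suggested in the thread, only the last field of the helper would change.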
…soft#18890)
## Bug
microsoft#18506 added deleted stats to the GarbageCollection_end logs. However, the calculation for getting deleted data stores count is incorrect. It calls `this.runtime.getNodeType(nodeId)` to get the node's type, which checks if the data store with the given nodeId exists. Since the data store has been deleted, it ends up returning the type as other.
## Fix
In case of deleted nodes, the garbage collector gets the node type itself by looking at the path. This should be fine because it is only used for stats, and it works for the current node types (data store, DDS and blob).
…enable sweep (#19087)
This PR ports the following changes to internal-8.0 branch. internal-8.0 contains changes that will help enable sweep and these are follow ups that did not make it to the branch.
- [Initialize GC state as soon as possible and update it on (re)connecti… · 40ad2d4 (github.com)](40ad2d4)
- [GC: Fix GC deleted stats not correctly updated for data stores by agarwal-navin · Pull Request #18890 · microsoft/FluidFramework (github.com)](#18890)
- [GC: Added whether tombstone is enabled to GarbageCollection_end event by agarwal-navin · Pull Request #18927 · microsoft/FluidFramework (github.com)](https://github.com/microsoft/FluidFramework/pull/18927/files)
…enable sweep (#19204)
This PR ports the following changes to internal-7.4 branch. internal-7.4 contains changes that will help enable sweep and these are follow ups that did not make it to the branch.
- [Initialize GC state as soon as possible and update it on (re)connecti… · 40ad2d4 (github.com)](40ad2d4)
- [GC: Fix GC deleted stats not correctly updated for data stores by agarwal-navin · Pull Request #18890 · microsoft/FluidFramework (github.com)](#18890)
- [GC: Added whether tombstone is enabled to GarbageCollection_end event by agarwal-navin · Pull Request #18927 · microsoft/FluidFramework (github.com)](https://github.com/microsoft/FluidFramework/pull/18927/files)
Bug
#18506 added deleted stats to the GarbageCollection_end logs. However, the calculation for getting deleted data stores count is incorrect. It calls `this.runtime.getNodeType(nodeId)` to get the node's type, which checks if the data store with the given nodeId exists. Since the data store has been deleted, it ends up returning the type as other.

Fix
In case of deleted nodes, the garbage collector gets the node type itself by looking at the path. This should be fine because it is only used for stats, and it works for the current node types (data store, DDS and blob).
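
To make the fix concrete, here is a rough sketch of deriving a node's type from its path alone, so that no lookup against the (already deleted) data store is needed. This is not the code from the PR: the `NodeType` values, the `/_blobs` prefix, the assumed path layout and both helper functions are assumptions made for illustration.

```typescript
// Hypothetical node types, named after the ones the description mentions
// (data store, DDS, blob, other).
type NodeType = "DataStore" | "SubDataStore" | "Blob" | "Other";

// Assumed path layout (not confirmed by the PR text):
//   /<dataStoreId>          -> data store
//   /<dataStoreId>/<ddsId>  -> DDS inside a data store
//   /_blobs/<blobId>        -> attachment blob
const blobsPrefix = "_blobs";

// Hypothetical helper: classifies a node purely by the shape of its path.
function nodeTypeFromPath(nodePath: string): NodeType {
    // Drop empty segments so leading or trailing slashes do not matter.
    const parts = nodePath.split("/").filter((part) => part.length > 0);
    if (parts.length === 0) {
        return "Other";
    }
    if (parts[0] === blobsPrefix) {
        return parts.length === 2 ? "Blob" : "Other";
    }
    if (parts.length === 1) {
        return "DataStore";
    }
    if (parts.length === 2) {
        return "SubDataStore";
    }
    return "Other";
}

// Hypothetical helper: counts deleted data stores without consulting the runtime.
function countDeletedDataStores(deletedNodePaths: string[]): number {
    return deletedNodePaths.filter(
        (nodePath) => nodeTypeFromPath(nodePath) === "DataStore",
    ).length;
}
```

Under these assumptions, `countDeletedDataStores(["/ds1", "/ds1/dds1", "/ds2", "/_blobs/b1"])` counts only `/ds1` and `/ds2`. The trade-off matches the one stated in the description: a path-based check is only as good as the path conventions, which is acceptable when the result feeds stats rather than behavior.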