Memory usage for activation data improved. #5907

SebastianStehle · 2019-08-26T15:57:43Z

This PR creates a memory optimized version of the Grain lifecycle subject. The ConcurrentDictionary creates a lot of sparse arrays and consumes a lot of memory. In the context of the grain I think that concurrency is not a topic.

With concurrent dictionary (100k Grains)

With simple list (100k Grains)

jason-bragg · 2019-08-26T18:00:04Z

Good catch.

I agree that the thread safety of a concurrent dictionary is probably not necessary for grains, however having a common lifecycle implementation (or at least the core of the system being shared) reduces code maintenance costs. Duplicating that logic with a different container doubles our code and testing for this system.

For the effects of this to be meaningful the benchmark allocates 100k grains, which are quite a bit more than what tends to be on a single silo.

I think the change in principle is good, but it's not clear the benefit out ways the maintenance cost of duplicate code. Would you mind taking another pass to see if you can keep the majority of the code common.

Thanks

SebastianStehle · 2019-08-26T18:09:02Z

I was expecting this feedback. For me it seems that there are 3 lifecycle subjects:

Clients
Silos
Grains

For my understanding thread safety is not needed for all three of them, but I am not sure.

# Conflicts: # src/Orleans.Core/Lifecycle/LifecycleSubject.cs # src/Orleans.Runtime/Catalog/GrainLifecycle.cs # test/Benchmarks/Program.cs

SebastianStehle · 2019-08-26T18:37:11Z

I just changed the base implementation to use a list instead.

jason-bragg · 2019-08-26T18:46:04Z

Fair point. When initially introduced it was not clear what threads of control would be modifying the lifecycle. Components adding/removing themselves from the lifecycle while it's running was especially a concern (which we've not fully addressed). Using concurrent containers was more of an expedience than a design decision, because the way the systems were exposed it was not clear (and still isn't) how to either enforce thread safe access, or communicate to users in a clear way how they should modify the system which is safe.

If we consider the lifecycle as immutable once started, maybe we should separate the observable lifecycle from the runnable lifecycle and the observable simply be a builder of observer info, which is passed to a runnable lifecycle...?

ReubenBond · 2019-08-26T18:50:38Z

We could store a simple list of observers & avoid those linq allocations.
Less abstraction is better here, IMHO

jason-bragg · 2019-08-26T19:17:04Z

I think the linq statements are only an issue due to how long those allocations are around in the current impl. They could easily end up in gen 1 as is, which is not great. IMO, the code is more readable and maintainable using them as compared to hand coding the grouping and sorting, but if we did the organizing in a pass before the main fannout loop they shouldn't be harmful. This, to me, seeams another advantage to separating the observer info building from execution.
Thoughts?

SebastianStehle · 2019-08-26T19:38:15Z

Before making more improvements it would be great to also consider other feature requests like:

I think the grain structure in general is hard to use, because it its not easy to add functionalities and it could also be easier to test grains.

sergeybykov · 2019-08-29T16:51:57Z

I like this. If we don't have concurrency in the use cases today, no need to pay the perf price. If we might see a concurrent use, should we simply add an optional lock that would be used only if LifecycleSubject is instantiated with a concurrency flag?

Allocations caused by LINQ are a concern. We see some workloads where grains are activated to handle a single request. The impact of such allocations is even worse in those cases.

SebastianStehle · 2019-08-29T19:10:58Z

The numbers are from a test where 50.000 grains are created, each for one request, but I have not seen the impact of the linq statements in the analysis.

jason-bragg · 2019-08-29T19:45:06Z

The concurrency issue, imo, is not as much a question of what we do today, because the lifecycle is an extensibility point, but more a question of what do we support. If modification of the lifecycle needs be performed in a single thread, then we need have some means of ensuring that users are aware of this and that patterns we support ensure this.

Grains:
The patterns described in the grain lifecycle docs look to be thread safe. Thoughts?

https://dotnet.github.io/orleans/Documentation/grains/grain_lifecycle.html

Silo
The patterns described in the silo lifecycle docs look to be thread safe. Thoughts?

https://dotnet.github.io/orleans/Documentation/clusters_and_clients/silo_lifecycle.html

The patterns described in the lifecycle core system would require thread safe implementations, but thats mainly describing the abstractions, so we can deal with thread safety when we get there.

https://dotnet.github.io/orleans/Documentation/implementation/orleans_lifecycle.html

From the above, I think it's safe to make this change, as the proscribed patterns are safe with this change. Someone please sanity check me here.

sergeybykov · 2019-08-30T22:15:19Z

Thank you, @SebastianStehle!

Memory usage for activation data improved.

2630fd6

SebastianStehle added 6 commits August 26, 2019 20:18

Use single lifecycle subject implementation.

63371c0

Merge branch 'master' into activation-data-memory-usage

6b36845

# Conflicts: # src/Orleans.Core/Lifecycle/LifecycleSubject.cs # src/Orleans.Runtime/Catalog/GrainLifecycle.cs # test/Benchmarks/Program.cs

Improvement implemented again after merge.

bde3e4a

Benchmark removed.

8eee4e4

Reverted change

e3d7c8c

Reverted another change.

f400b75

sergeybykov mentioned this pull request Aug 29, 2019

Optimize GrainLifecycle #5904

Closed

jason-bragg merged commit 1f213bc into dotnet:master Aug 30, 2019

github-actions bot locked and limited conversation to collaborators Dec 2, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Memory usage for activation data improved. #5907

Memory usage for activation data improved. #5907

SebastianStehle commented Aug 26, 2019

jason-bragg commented Aug 26, 2019

SebastianStehle commented Aug 26, 2019

SebastianStehle commented Aug 26, 2019

jason-bragg commented Aug 26, 2019

ReubenBond commented Aug 26, 2019

jason-bragg commented Aug 26, 2019 •

edited

SebastianStehle commented Aug 26, 2019

sergeybykov commented Aug 29, 2019

SebastianStehle commented Aug 29, 2019 •

edited

jason-bragg commented Aug 29, 2019

sergeybykov commented Aug 30, 2019

Memory usage for activation data improved. #5907

Memory usage for activation data improved. #5907

Conversation

SebastianStehle commented Aug 26, 2019

jason-bragg commented Aug 26, 2019

SebastianStehle commented Aug 26, 2019

SebastianStehle commented Aug 26, 2019

jason-bragg commented Aug 26, 2019

ReubenBond commented Aug 26, 2019

jason-bragg commented Aug 26, 2019 • edited

SebastianStehle commented Aug 26, 2019

sergeybykov commented Aug 29, 2019

SebastianStehle commented Aug 29, 2019 • edited

jason-bragg commented Aug 29, 2019

sergeybykov commented Aug 30, 2019

jason-bragg commented Aug 26, 2019 •

edited

SebastianStehle commented Aug 29, 2019 •

edited