Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support dynamic update cache config #13679

Merged
merged 16 commits into from Mar 14, 2022

Conversation

lordcheng10
Copy link
Contributor

@lordcheng10 lordcheng10 commented Jan 8, 2022

Motivation

Support dynamic update cache config :

managedLedgerCacheEvictionTimeThresholdMillis
managedLedgerCacheSizeMB
managedLedgerCacheEvictionWatermark

Documentation

Check the box below or label this PR directly (if you have committer privilege).

Need to update docs?

  • doc-required

    (If you need help on updating docs, create a doc issue)

  • no-need-doc

    (Please explain why)

  • doc

    (If this PR contains doc changes)

@github-actions github-actions bot added the doc-not-needed Your PR changes do not impact docs label Jan 8, 2022
@lordcheng10
Copy link
Contributor Author

@lordcheng10 lordcheng10 changed the title support dynamic update config : managedLedgerCacheEvictionTimeThresh… Support dynamic update cache config Jan 8, 2022
@lordcheng10
Copy link
Contributor Author

/pulsarbot run-failure-checks

1 similar comment
@lordcheng10
Copy link
Contributor Author

/pulsarbot run-failure-checks

managed-ledger/pom.xml Outdated Show resolved Hide resolved
@lordcheng10
Copy link
Contributor Author

/pulsarbot run-failure-checks

2 similar comments
@lordcheng10
Copy link
Contributor Author

/pulsarbot run-failure-checks

@lordcheng10
Copy link
Contributor Author

/pulsarbot run-failure-checks

Copy link
Contributor

@eolivelli eolivelli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A general comment about this implementation.

Referring everything to the Service Configuration is a waste of resource and we should avoid this especially inside synchonized methods.

I suggest to not evaluate the values of the configuration parameters everything but to try to notify the components that the Configuration changed and to update the current local configuration when such notification happens

managed-ledger/pom.xml Outdated Show resolved Hide resolved
@lordcheng10
Copy link
Contributor Author

/pulsarbot run-failure-checks

6 similar comments
@lordcheng10
Copy link
Contributor Author

/pulsarbot run-failure-checks

@lordcheng10
Copy link
Contributor Author

/pulsarbot run-failure-checks

@lordcheng10
Copy link
Contributor Author

/pulsarbot run-failure-checks

@lordcheng10
Copy link
Contributor Author

/pulsarbot run-failure-checks

@lordcheng10
Copy link
Contributor Author

/pulsarbot run-failure-checks

@lordcheng10
Copy link
Contributor Author

/pulsarbot run-failure-checks

@lordcheng10
Copy link
Contributor Author

lordcheng10 commented Jan 11, 2022

A general comment about this implementation.

Referring everything to the Service Configuration is a waste of resource and we should avoid this especially inside synchonized methods.

I suggest to not evaluate the values of the configuration parameters everything but to try to notify the components that the Configuration changed and to update the current local configuration when such notification happens

OK I will fix

@codelipenghui codelipenghui modified the milestones: 2.11.0, 2.10.0 Jan 18, 2022
@github-actions
Copy link

The pr had no activity for 30 days, mark with Stale label.

@lordcheng10
Copy link
Contributor Author

lordcheng10 commented Mar 13, 2022

Fixed.
PTAL,thanks! @eolivelli @Jason918 @codelipenghui

Copy link
Contributor

@aloyszhang aloyszhang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@lordcheng10
Copy link
Contributor Author

lordcheng10 commented Mar 14, 2022

We should also add tests

Fixed.
PTAL,thanks! @eolivelli

Copy link
Contributor

@eolivelli eolivelli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we are on our way.
I left some minor feedback, last corners to be rounded

/**
* @return time threshold for eviction.
* */
long getCacheEvictionTimeThresholdNanos();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please give a name consistent with updateCacheEvictionTimeThreshold (I would remove "nanos")

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.
PTAL,thanks! @eolivelli

admin.brokers().updateDynamicConfiguration("managedLedgerCacheEvictionTimeThresholdMillis", "2000");

// wait config to be updated
for (int i = 0; i < 5; i++) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why only "5" ?

please consider using Awaiatility

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.
PTAL,thanks! @eolivelli

Copy link
Contributor

@eolivelli eolivelli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

awesome work

@eolivelli eolivelli merged commit b0213b2 into apache:master Mar 14, 2022
Copy link
Member

@michaeljmarshall michaeljmarshall left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@lordcheng10 - are you able to provide motivation for this PR?

Comment on lines +48 to +50
private volatile long maxSize;
private volatile long evictionTriggerThreshold;
private volatile double cacheEvictionWatermark;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am concerned about the cost of this change. These three variables are read each time that a message is written to the cache. While I see the generic benefit of increasing configurability, I don't think we should do so in a way that adds cost (even small costs) to the write path. @lordcheng10 - do you have a use case that shows why the benefits out way the costs here?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@merlimat - do you have any thoughts here? I remember that you have raised concerns about volatile in the metadatastore.

I know that we are talking about minor optimizations here, but since this is on the write path, I feel it makes sense to discuss now.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do you have a use case that shows why the benefits out way the costs here?

If we need to adjust the cache parameters, we need to restart the online cluster in turn. Restarting the cluster will have a certain impact on online business.

For different business scenarios, we cannot give a more reasonable cache configuration at once. May need to adjust the cache configuration multiple times

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am concerned about the cost of this change. These three variables are read each time that a message is written to the cache. While I see the generic benefit of increasing configurability, I don't think we should do so in a way that adds cost (even small costs) to the write path.

Do you have some good ideas? Regarding the need to provide a dynamic configuration cache, but also to solve the problems you mentioned.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@lordcheng10 - thanks for the additional context. It would be interesting to see a flame graph for the write path of the broker to identify hot spots. The only reason this PR caught my attention is because of the general heuristic that I've seen @merlimat employ: avoid volatile when possible. Without actual metrics about performance impact, I think we should leave this PR in place.

Nicklee007 pushed a commit to Nicklee007/pulsar that referenced this pull request Apr 20, 2022
nicoloboschi pushed a commit to datastax/pulsar that referenced this pull request Jul 8, 2022
aloyszhang pushed a commit to aloyszhang/pulsar that referenced this pull request Aug 5, 2022
Squash merge branch 'cache_config' into '2.8.1'
Fixes #<xyz>

### Motivation
--story=872772869 支持cache动态配置(apache#13679)


TAPD: --story=872772869
@congbobo184 congbobo184 added cherry-picked/branch-2.9 Archived: 2.9 is end of life release/2.9.4 labels Nov 17, 2022
congbobo184 pushed a commit that referenced this pull request Nov 17, 2022
aloyszhang pushed a commit to aloyszhang/pulsar that referenced this pull request Nov 30, 2022
Squash merge branch 'cache_config' into '2.8.1'
Fixes #<xyz>

### Motivation
--story=872772869 支持cache动态配置(apache#13679)


TAPD: --story=872772869
congbobo184 pushed a commit that referenced this pull request Dec 7, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

8 participants