This repository has been archived by the owner on Aug 2, 2022. It is now read-only.

Add a cluster level collector for node config settings #298

Merged
rguo-aws merged 5 commits into master from rguo-cluster-node-config on Jul 24, 2020

Conversation

@rguo-aws (Contributor) commented Jul 22, 2020

Issue #:
#300

Description of changes:

  1. Add a cluster-level collector for node config settings.
  2. Create a thread-safe node config cache in AppContext to store node config settings and share them among the different RCA vertices in the RCA graph.

Tests:
Tested on Docker.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.


codecov bot commented Jul 22, 2020

Codecov Report

Merging #298 into master will decrease coverage by 0.13%.
The diff coverage is 67.69%.


@@             Coverage Diff              @@
##             master     #298      +/-   ##
============================================
- Coverage     66.71%   66.58%   -0.14%     
- Complexity     1869     1896      +27     
============================================
  Files           276      284       +8     
  Lines         12307    12559     +252     
  Branches        982     1019      +37     
============================================
+ Hits           8211     8362     +151     
- Misses         3764     3848      +84     
- Partials        332      349      +17     
| Impacted Files | Coverage Δ | Complexity Δ |
| --- | --- | --- |
| `...lyzer/rca/store/collector/NodeConfigCollector.java` | `80.00% <ø> (ø)` | `6.00 <0.00> (?)` |
| `...ca/store/collector/NodeConfigClusterCollector.java` | `48.57% <48.57%> (ø)` | `6.00 <6.00> (?)` |
| `...eanalyzer/rca/store/collector/NodeConfigCache.java` | `88.46% <88.46%> (ø)` | `4.00 <4.00> (?)` |
| `.../elasticsearch/performanceanalyzer/AppContext.java` | `92.10% <100.00%> (+0.43%)` | `11.00 <1.00> (+1.00)` |
| `...a/framework/api/flow_units/NodeConfigFlowUnit.java` | `36.36% <100.00%> (+1.98%)` | `6.00 <1.00> (+1.00)` |
| `...h/performanceanalyzer/rca/framework/core/Node.java` | `89.83% <100.00%> (+1.89%)` | `30.00 <1.00> (+2.00)` |
| `...analyzer/threads/exceptions/PAThreadException.java` | `0.00% <0.00%> (-57.15%)` | `0.00% <0.00%> (-1.00%)` |
| `...analyzer/rca/framework/core/MetricsDBProvider.java` | `21.05% <0.00%> (-21.06%)` | `4.00% <0.00%> (-1.00%)` |
| `...ch/performanceanalyzer/threads/ThreadProvider.java` | `64.70% <0.00%> (-17.65%)` | `4.00% <0.00%> (ø%)` |
| `.../performanceanalyzer/rca/framework/api/Metric.java` | `76.19% <0.00%> (-14.29%)` | `8.00% <0.00%> (ø%)` |

... and 13 more

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update dbd8675...3363e16.

Comment on lines 50 to 52
if (ret == null) {
  return Double.NaN;
}
Contributor

Since this value is not going to a SQL database, I don't think we need the NaN value and check. Could we throw a checked exception here instead, for the safety of consumer code? The caller has to handle the missing-value case either way, and an exception protects against callers that forget the NaN check.

Contributor Author

Done. The cache now throws an exception if the config does not exist.
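
For illustration, the change agreed on in this thread could look roughly like the sketch below. It is not the code merged in this PR: `ConfigNotFoundException` and `NodeConfigCacheSketch` are hypothetical names, and plain string keys stand in for the `NodeKey` and `Resource` types, whose definitions are not shown in this conversation.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical checked exception; the name is an assumption, not taken from the PR.
class ConfigNotFoundException extends Exception {
    ConfigNotFoundException(String message) {
        super(message);
    }
}

// Simplified stand-in for the node config cache. Keys are plain strings here,
// whereas the PR keys entries by NodeKey and Resource.
class NodeConfigCacheSketch {
    private final Map<String, Double> cache = new ConcurrentHashMap<>();

    private static String key(String nodeId, String config) {
        return nodeId + "/" + config;
    }

    void put(String nodeId, String config, double value) {
        cache.put(key(nodeId, config), value);
    }

    // Throws instead of returning Double.NaN, so the compiler forces every
    // caller to handle the missing-value case.
    double get(String nodeId, String config) throws ConfigNotFoundException {
        Double ret = cache.get(key(nodeId, config));
        if (ret == null) {
            throw new ConfigNotFoundException(
                "no value for config " + config + " on node " + nodeId);
        }
        return ret;
    }
}
```

With a checked exception, a caller that ignores the missing case no longer compiles, whereas a forgotten `Double.isNaN` check would silently propagate NaN into later calculations.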

}

public void put(NodeKey nodeKey, Resource config, double value) {
  nodeConfigCache.put(new NodeConfigKey(nodeKey, config), value);
Contributor

Do we want to check for non-null, non-negative values here before putting them in the HashMap?

Contributor Author

@rguo-aws, Jul 23, 2020

We perform this check in the node-level NodeConfigCollector when collecting configs from the DB.

private final NodeConfigCollector nodeConfigCollector;

public NodeConfigClusterCollector(final NodeConfigCollector nodeConfigCollector) {
  super(0, 5);
Contributor

The constants here make the code hard to read. Adding a variable whose name says what each constant represents would help.

Additionally, we should later refactor to introduce a defaultIntervalPeriod and use it across all collectors, and similarly for the RCAs.

Contributor Author

@rguo-aws, Jul 23, 2020

Yes, we can create a static constant here. But since another refactoring will remove the 5-second interval from the super() constructor completely, would it be better to leave it as is for now?

Contributor

Makes sense, please add an issue for this.
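
As a sketch of what the reviewer's suggestion could look like if it were applied: the base class and constant names below are hypothetical. The thread confirms only that `5` is an interval in seconds. Treating the first argument as a level is an assumption made for this example.

```java
// Hypothetical base class. Its parameter names are assumptions made for this
// sketch; only the 5-second interval is confirmed by the discussion.
abstract class CollectorBaseSketch {
    final int level;
    final long evaluationIntervalSeconds;

    CollectorBaseSketch(int level, long evaluationIntervalSeconds) {
        this.level = level;
        this.evaluationIntervalSeconds = evaluationIntervalSeconds;
    }
}

class ClusterCollectorSketch extends CollectorBaseSketch {
    // Named constants replace the bare literals in super(0, 5).
    static final int DEFAULT_LEVEL = 0;
    static final long DEFAULT_EVALUATION_INTERVAL_SECONDS = 5;

    ClusterCollectorSketch() {
        super(DEFAULT_LEVEL, DEFAULT_EVALUATION_INTERVAL_SECONDS);
    }
}
```

A shared default such as the `defaultIntervalPeriod` mentioned above would move the interval constant into one place that all collectors and RCAs reference.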

// unbounded cache with eviction timeout set to 10 mins
public NodeConfigCache() {
  nodeConfigCache =
      CacheBuilder.newBuilder()
Contributor

Just curious: is there a particular reason we picked an unbounded cache? We may not have a fixed number of entries to put in the cache, so couldn't leaving it unbounded cause out-of-memory errors? It should be fine if we are sure the number of entries is small.

Contributor Author

The maximum number of entries is (number of nodes) × (node configs on each node), which is not that big, considering that stale entries are evicted once they reach the TTL.
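
To make the sizing argument concrete: with, say, 100 nodes and 10 config settings per node (hypothetical numbers), the cache holds at most 100 × 10 = 1000 live entries, and entries for nodes that stop reporting age out after the TTL. The sketch below shows that expiry behavior with a standard-library stand-in. It is not the Guava `CacheBuilder` cache used in the PR, `TtlCacheSketch` is a hypothetical name, and stale entries are dropped only when they are read.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.LongSupplier;

// Stand-in for a cache whose entries expire a fixed time after they are written.
class TtlCacheSketch<K> {
    private static final class Entry {
        final double value;
        final long writtenAtMillis;

        Entry(double value, long writtenAtMillis) {
            this.value = value;
            this.writtenAtMillis = writtenAtMillis;
        }
    }

    private final Map<K, Entry> entries = new ConcurrentHashMap<>();
    private final long ttlMillis;
    private final LongSupplier clock; // injectable so expiry can be tested without sleeping

    TtlCacheSketch(long ttlMillis, LongSupplier clock) {
        this.ttlMillis = ttlMillis;
        this.clock = clock;
    }

    void put(K key, double value) {
        entries.put(key, new Entry(value, clock.getAsLong()));
    }

    // Returns null when the key is absent or its entry has reached the TTL.
    Double get(K key) {
        Entry entry = entries.get(key);
        if (entry == null) {
            return null;
        }
        if (clock.getAsLong() - entry.writtenAtMillis >= ttlMillis) {
            // Remove only this stale entry, not a newer one written concurrently.
            entries.remove(key, entry);
            return null;
        }
        return entry.value;
    }

    int size() {
        return entries.size();
    }
}
```

In production code the clock would typically be `System::currentTimeMillis` and the TTL 10 minutes (600,000 ms), matching the eviction timeout stated in the code comment under review.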

}

@Test
public void testSetAndGetValue() {
Contributor

Maybe add one unit test that updates the value of a key and asserts that the updated value is returned.
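
The suggested test might be shaped like the sketch below. It uses a plain map-backed stand-in (`SimpleCache`, a hypothetical name) and a hand-written assertion instead of the project's cache class and JUnit setup, neither of which is shown in full in this conversation.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class UpdateValueCheckSketch {
    // Stand-in for the cache under test; not the NodeConfigCache from the PR.
    static final class SimpleCache {
        private final Map<String, Double> values = new ConcurrentHashMap<>();

        void put(String key, double value) {
            values.put(key, value);
        }

        Double get(String key) {
            return values.get(key);
        }
    }

    // Writes the same key twice and returns what a subsequent read sees.
    static double putTwiceAndRead(double first, double second) {
        SimpleCache cache = new SimpleCache();
        cache.put("node1/config", first);
        cache.put("node1/config", second);
        return cache.get("node1/config");
    }

    // Mirrors the suggested unit test: the second write must win.
    static void testUpdateValue() {
        double actual = putTwiceAndRead(1.0, 2.0);
        if (actual != 2.0) {
            throw new AssertionError("expected updated value 2.0 but got " + actual);
        }
    }
}
```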

@rguo-aws linked an issue Jul 23, 2020 that may be closed by this pull request
@rguo-aws added the feature (New feature) label Jul 23, 2020
@rguo-aws merged commit 531b034 into master Jul 24, 2020
@rguo-aws deleted the rguo-cluster-node-config branch July 24, 2020 19:38
Labels
feature New feature
Development

Successfully merging this pull request may close these issues.

Cluster level node config collector
4 participants