Skip to content

[SCB-150] Add Status(success/failed) dimension to operation call count and tps#508

Merged
WillemJiang merged 10 commits intoapache:masterfrom
zhengyangyong:SCB-150
Jan 24, 2018
Merged

[SCB-150] Add Status(success/failed) dimension to operation call count and tps#508
WillemJiang merged 10 commits intoapache:masterfrom
zhengyangyong:SCB-150

Conversation

@zhengyangyong
Copy link
Copy Markdown

@zhengyangyong zhengyangyong commented Jan 12, 2018

Follow this checklist to help us incorporate your contribution quickly and easily:

  • Make sure there is a JIRA issue filed for the change (usually before you start working on it). Trivial changes like typos do not require a JIRA issue. Your pull request should address just this issue, without pulling in other changes.
  • Each commit in the pull request should have a meaningful subject line and body.
  • Format the pull request title like [SCB-XXX] Fixes bug in ApproximateQuantiles, where you replace SCB-XXX with the appropriate JIRA issue.
  • Write a pull request description that is detailed enough to understand what the pull request does, how, and why.
  • Run mvn clean install to make sure basic checks pass. A more thorough check will be performed on your pull request automatically.
  • If this contribution is large, please file an Apache Individual Contributor License Agreement.

Add Status(success/failed) dimension to operation call count and tps,now they will split into three metrics like:
servicecomb.instance.producer.producerCall.tps.{Status=all}
servicecomb.instance.producer.producerCall.tps.{Status=success}
servicecomb.instance.producer.producerCall.tps.{Status=failed}
servicecomb.instance.producer.producerCall.total.{Status=all}
servicecomb.instance.producer.producerCall.total.{Status=success}
servicecomb.instance.producer.producerCall.total.{Status=failed}
consumerCall also applied

In prometheus will use Label show this dimension:
servicecomb_calculator_calculatorRestEndpoint_calculate_producer_producerCall_tps{Status="success",instance="localhost:9696",job="servicecomb"}
servicecomb_calculator_calculatorRestEndpoint_calculate_producer_producerCall_tps{Status="failed",instance="localhost:9696",job="servicecomb"}
servicecomb_calculator_calculatorRestEndpoint_calculate_producer_producerCall_tps{Status="all",instance="localhost:9696",job="servicecomb"}

Update
now support setting output level of status dimension :

servicecomb:
  metrics:
    dimension:
      #status dimension
      status:
        # success_failed  : only output success & failed (default)
        # code_group : output 1xx,2xx,3xx,4xx,5xx and xxx
        # code       : output detail code like 200,404,500... etc
        output_level: success_failed

@coveralls
Copy link
Copy Markdown

Coverage Status

Coverage increased (+0.0004%) to 87.34% when pulling 62f9984 on zhengyangyong:SCB-150 into 361c73c on apache:master.

@wujimin
Copy link
Copy Markdown
Contributor

wujimin commented Jan 12, 2018

maybe it's better add statusCode dimension, not succ/fail

zhengyangyong added 8 commits January 15, 2018 08:46
Signed-off-by: zhengyangyong <yangyong.zheng@huawei.com>
Signed-off-by: zhengyangyong <yangyong.zheng@huawei.com>
Signed-off-by: zhengyangyong <yangyong.zheng@huawei.com>
Signed-off-by: zhengyangyong <yangyong.zheng@huawei.com>
Signed-off-by: zhengyangyong <yangyong.zheng@huawei.com>
Signed-off-by: zhengyangyong <yangyong.zheng@huawei.com>
Signed-off-by: zhengyangyong <yangyong.zheng@huawei.com>
Signed-off-by: zhengyangyong <yangyong.zheng@huawei.com>
@zhengyangyong
Copy link
Copy Markdown
Author

Package Name Updated

@coveralls
Copy link
Copy Markdown

Coverage Status

Coverage increased (+0.006%) to 87.362% when pulling 3c612f4 on zhengyangyong:SCB-150 into 18847ab on apache:master.

@zhengyangyong
Copy link
Copy Markdown
Author

maybe it's better add statusCode dimension, not succ/fail
@wujimin I just think it's necessary split them so detail ? @WillemJiang @seanyinx what's your suggestion thanks

@WillemJiang
Copy link
Copy Markdown
Member

There could be lots's of dimensions if we create the counter per status code. My suggestion is we only provide 2xx, 3xx, 4xx, 5xx status code for it. BTW, we can take the Spring Boot actuator as an example, we don't need lots detail metrics, we just need to tell the administrator something is wrong, and you can find more information by looking up the access log.

@wujimin
Copy link
Copy Markdown
Contributor

wujimin commented Jan 16, 2018

2xx/3xx/4xx/5xx almost equals statusCode
and 3xx need to support special logic: 304 belongs success

i thinks, for metrics kernel data, we care for statusCode
but for data published interface, we can provide different model for different scenes

@zhengyangyong
Copy link
Copy Markdown
Author

I think I can add a switch named servicecomb.metrics.dimension.http_code,default value is false,if user enable it,detail metrics will be record like :
servicecomb.instance.producer.producerCall.tps.{Status=all}
servicecomb.instance.producer.producerCall.tps.{Status=success}
servicecomb.instance.producer.producerCall.tps.{Status=failed}
servicecomb.instance.producer.producerCall.tps.{Status=200}
servicecomb.instance.producer.producerCall.tps.{Status=404}
...

@zhengyangyong
Copy link
Copy Markdown
Author

zhengyangyong commented Jan 16, 2018

I also think most of the time user would like use metrics outputs draw curves directly and don't want do any more aggregation by themself ,only detail items provided may not good idea.

@wujimin
Copy link
Copy Markdown
Contributor

wujimin commented Jan 17, 2018

not "only detail items "
detail is our kernel data, by this data, we can summarize everything we want.

MetricsDimension.DIMENSION_STATUS_SUCCESS,
MetricsDimension.DIMENSION_STATUS_FAILED};
}
throw new ServiceCombException("illegal dimension key : " + dimension);
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if there's only one dimension, there's no need to pass in the parameter. if there are multiple, returning an empty array is preferable

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We will support more dimensions later, and dimension key must correct

}
return tagMap;
}
return null;
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

returning Collections.emptyMap() is preferable

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

familySamples.add(new MetricFamilySamples("Instance Level", Type.UNTYPED, "Instance Level Metrics", samples));

if (registryMetric.getConsumerMetrics().size() != 0) {
samples = new ArrayList<>();
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

use a local list in this scope is better

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

.add(getFamilySamples(producerMetric.getKey() + " Producer Side", producerMetric.getValue().toMap()));

if (registryMetric.getProducerMetrics().size() != 0) {
samples = new ArrayList<>();
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same as above

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

if (this.getKey().equals(value.getKey())) {
return new DoubleMetricValue(this.getKey(), this.getValue() + value.getValue(), this.getDimensions());
}
throw new ServiceCombException("unable merge different key values,source key :" + value.getKey() +
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

any place to catch this exception? we don't want to interrupt user business logic

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@seanyinx
Copy link
Copy Markdown
Member

agreed with @zhengyangyong on option to enable/disable metrics collection based on status code and @WillemJiang on metrics on status range instead of exact status code. we have to be aware of framework memory footprint.

zhengyangyong added 2 commits January 17, 2018 10:19
Signed-off-by: zhengyangyong <yangyong.zheng@huawei.com>
Signed-off-by: zhengyangyong <yangyong.zheng@huawei.com>
@zhengyangyong
Copy link
Copy Markdown
Author

now support setting output level of status dimension :

servicecomb:
  metrics:
    dimension:
      #status dimension
      status:
        # success_failed  : only output success & failed (default)
        # code_group : output 1xx,2xx,3xx,4xx,5xx and xxx
        # code       : output detail code like 200,404,500... etc
        output_level: success_failed

@coveralls
Copy link
Copy Markdown

Coverage Status

Coverage decreased (-0.02%) to 87.337% when pulling 63e9371 on zhengyangyong:SCB-150 into 18847ab on apache:master.

@coveralls
Copy link
Copy Markdown

Coverage Status

Coverage decreased (-0.05%) to 87.308% when pulling 63e9371 on zhengyangyong:SCB-150 into 18847ab on apache:master.

@wujimin
Copy link
Copy Markdown
Contributor

wujimin commented Jan 18, 2018

will this make our logic too complex?
our metrics will not record history data, only some summaries, there is no memory problem?
if kernel data is about statusCode, then metrics is very simple
just when publish data, need to publish different model like: succ/fail, 2xx/3xx/4xx......

@WillemJiang WillemJiang merged commit bd87a72 into apache:master Jan 24, 2018
@zhengyangyong zhengyangyong deleted the SCB-150 branch March 27, 2018 06:30
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants