Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Dubbo-619] Fix #619 and PR #2956 is not enough #3634

Merged
merged 2 commits into from
Apr 2, 2019

Conversation

zhaixiaoxiang
Copy link
Contributor

What is the purpose of the change

Fix #619 , and PR #2956 is not enough in the following use case:

In souche.com, we add 'trace' function in dubbo, such as in FailoverClusterInvoker.doInvoke, we get stackTrace from exception to log detail exception info using the following:

org.apache.commons.lang3.exception.ExceptionUtils.getStackTrace(result.getException())

The above sentence will result the same problem with issue #619 .

So, we should solve this in decode stage.

#619 这个问题, PR #2956 貌似已经解决了, 但是我们公司(souche.com)在对dubbo加全链路埋点时, 还是遇到了之前的问题, 因为 #2956 是在InvokerInvocationHandler.invoke()方法的invoker.invoke(invocation)后解决了问题, 但是我们在FailoverClusterInvoker.doInvoke()方法里坐全链路埋点时, 为了能取到异常栈的信息, 用了如下的语句:
org.apache.commons.lang3.exception.ExceptionUtils.getStackTrace(result.getException())
导致了异常栈又出问题了..

所以应该在decode后, 对exception对象做判断才对.

Brief changelog

RpcResult.java
RpcResultTest.java

Follow this checklist to help us incorporate your contribution quickly and easily:

  • Make sure there is a GITHUB_issue field for the change (usually before you start working on it). Trivial changes like typos do not require a GITHUB issue. Your pull request should address just this issue, without pulling in other changes - one PR resolves one issue.
  • Format the pull request title like [Dubbo-XXX] Fix UnknownException when host config not exist #XXX. Each commit in the pull request should have a meaningful subject line and body.
  • Write a pull request description that is detailed enough to understand what the pull request does, how, and why.
  • Write necessary unit-test to verify your logic correction, more mock a little better when cross module dependency exist. If the new feature or significant change is committed, please remember to add integration-test in test module.
  • Run mvn clean install -DskipTests=false & mvn clean test-compile failsafe:integration-test to make sure unit-test and integration-test pass.
  • If this contribution is large, please follow the Software Donation Guide.

@codecov-io
Copy link

codecov-io commented Mar 11, 2019

Codecov Report

Merging #3634 into master will increase coverage by 0.06%.
The diff coverage is 85.71%.

Impacted file tree graph

@@             Coverage Diff              @@
##             master    #3634      +/-   ##
============================================
+ Coverage     63.44%   63.51%   +0.06%     
  Complexity       71       71              
============================================
  Files           703      706       +3     
  Lines         31000    32096    +1096     
  Branches       5024     5311     +287     
============================================
+ Hits          19669    20386     +717     
- Misses         9066     9370     +304     
- Partials       2265     2340      +75
Impacted Files Coverage Δ Complexity Δ
.../src/main/java/org/apache/dubbo/rpc/RpcResult.java 94.11% <85.71%> (-2.76%) 0 <0> (ø)
.../apache/dubbo/remoting/transport/AbstractPeer.java 58.69% <0%> (-13.05%) 0% <0%> (ø)
...mmon/threadpool/support/AbortPolicyWithReport.java 83.33% <0%> (-5.56%) 0% <0%> (ø)
...pport/zookeeper/ZookeeperDynamicConfiguration.java 77.96% <0%> (-5.37%) 0% <0%> (ø)
...bo/rpc/cluster/support/FailbackClusterInvoker.java 67.21% <0%> (-3.28%) 0% <0%> (ø)
...dubbo/remoting/exchange/support/DefaultFuture.java 72.97% <0%> (-2.71%) 0% <0%> (ø)
.../apache/dubbo/remoting/etcd/jetcd/JEtcdClient.java 0% <0%> (ø) 0% <0%> (ø) ⬇️
...n/java/org/apache/dubbo/config/ArgumentConfig.java 100% <0%> (ø) 0% <0%> (ø) ⬇️
.../dubbo/remoting/etcd/jetcd/JEtcdClientWrapper.java 0% <0%> (ø) 0% <0%> (ø) ⬇️
...ookeeper/ZookeeperDynamicConfigurationFactory.java 100% <0%> (ø) 0% <0%> (ø) ⬇️
... and 20 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update b2bfbc7...13e1ce4. Read the comment docs.

Copy link
Member

@carryxyh carryxyh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Otherwise, looks good to me.

@zhaixiaoxiang
Copy link
Contributor Author

@carryxyh 谢谢你的建议.

大家在看这个 PR 前建议先看下 PR #2956 , 那里分析了出现issue #619 的原因.

PR #2956 那样改在社区版里看上去是没有问题的, 但是在某些情况下会有问题:

比如我们公司在FailoverClusterInvoker.doInvoke()里加了全链路监控埋点的代码, 如下所示:

image

在finally块里, 我们会取详细的exception异常栈(如果有的话), 然后记录在链路日志中, 用的如下这段:
info.setExceptionStr(ExceptionUtils.getStackTrace(result.getException()));

在执行这段代码之前, result.getException()里的stackTrace是null, 这没有问题, 因为服务端本来就抛出了没有异常栈的NullPointerException, (如果这里不懂的话, 建议先看一下 #2956 , 那里有详细的分析).

但是执行完这段代码后, 见如下截图:
image

stackTrace 就变成了错误的异常栈, 也就是 issue #619 中出现那个误导用户的异常栈, 之所以在PR #2956 后还是有这个问题, 是因为是先执行的上面那块代码, 然后再执行RpcResult.recreate(), 而PR #2956 是在recreate里做的修改, 其实应该是在DecodeableRpcResult 类的handleException()方法里做异常栈的判断和修改:
image

image

本 PR 的目的是优化 PR #2956 , 在decode时就实现异常栈的检测和修改, 而不是等到InvokerInvocationHandler.invoke()后, recreate()时再做检测, 否则如果用户做二次开发时, 在诸如FailoverClusterInvoker.doInvoke()里用类似ExceptionUtils.getStackTrace()方法就破坏了异常栈, 所以要在源头就做好检测和修改, 这样更好一点.

@zhaixiaoxiang
Copy link
Contributor Author

@ralf0131 Could you help trigger CI again? This PR failed previously because the junit test failed: #3657

@zhaixiaoxiang
Copy link
Contributor Author

@carryxyh hi, could you tell me why merging is blocked, thank you.

image

image

@carryxyh carryxyh merged commit f932d7b into apache:master Apr 2, 2019
@cvictory cvictory added this to the 2.7.2 milestone May 28, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

这个怎么搞,有人遇到过吗
4 participants