Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support to receive tracing context from previous node and send full traceid to zipkin. #30

Merged
merged 8 commits into from
Jul 28, 2018
Merged

Add support to receive tracing context from previous node and send full traceid to zipkin. #30

merged 8 commits into from
Jul 28, 2018

Conversation

jjtyro
Copy link
Contributor

@jjtyro jjtyro commented Jul 9, 2018

Motivation:

使SpringMvc服务端,能接收sofa tracer上下文。

Modification:

  • 增加接受前一节点送上来的trace上下文的代码;
  • 使用整个traceid串,以十六进制方式,做为zipkin span的traceid:
    1. 与摘要日志文件中traceid相吻合,方便根据traceid相互查找;
    2. 避免原算法中,若traceid接收自前一节点,但本节点无从获知发送进程的进程号,因而无法判断traceid尾部哪一部分属于进程号,没办法将其截取掉再转换成长整形,造成转换成长整形时超范围异常;

Result:

使用命令行进行测试:

curl -v --header "sftc_head:tcid=ac120001153111583733511008899&spid=af63ad4c86019caf&pspid=&sample=false&" 127.0.0.1:8080/zipkin?name="world"

通过zipkin界面,可以查到trace id为ac120001153111583733511008899的条目:

{
    "data": [
        {
            "traceID": "ac120001153111583733511008899",
            "spans": [
                {
                    "traceID": "ac120001153111583733511008899",
                    "spanID": "f3dfc94f1694f82b",
                    "operationName": "http://127.0.0.1:8080/zipkin",
                    "references": [],
                    "startTime": 1531119671279000,
                    "duration": 8000,
                    ...
              }
           ]
       }
}

@coveralls
Copy link

coveralls commented Jul 9, 2018

Pull Request Test Coverage Report for Build 54

  • 106 of 153 (69.28%) changed or added relevant lines in 8 files are covered.
  • 1 unchanged line in 1 file lost coverage.
  • Overall coverage increased (+0.8%) to 49.241%

Changes Missing Coverage Covered Lines Changed/Added Lines %
tracer-sofa-boot-starter/src/main/java/com/alipay/sofa/tracer/boot/zipkin/ZipkinSofaTracerSpanRemoteReporter.java 10 14 71.43%
tracer-core/src/main/java/com/alipay/common/tracer/core/registry/HttpHeadersB3Formatter.java 8 12 66.67%
sofa-tracer-plugins/sofa-tracer-springmvc-plugin/src/main/java/com/alipay/sofa/tracer/plugins/springmvc/SpringMvcHeadersCarrier.java 0 6 0.0%
tracer-core/src/main/java/com/alipay/common/tracer/core/utils/CommonUtils.java 30 42 71.43%
tracer-core/src/main/java/com/alipay/common/tracer/core/registry/AbstractTextB3Formatter.java 45 66 68.18%
Files with Coverage Reduction New Missed Lines %
tracer-core/src/main/java/com/alipay/common/tracer/core/reporter/digest/DiskReporterImpl.java 1 84.62%
Totals Coverage Status
Change from base Build 48: 0.8%
Covered Lines: 1655
Relevant Lines: 3361

💛 - Coveralls

@guanchao-yang guanchao-yang self-requested a review July 9, 2018 16:17
@guanchao-yang guanchao-yang self-assigned this Jul 9, 2018
@guanchao-yang guanchao-yang added the enhancement New feature or request label Jul 9, 2018
@guanchao-yang guanchao-yang added this to the 2.1.2 milestone Jul 9, 2018
for (int i = 0; i < 10000; i++) {
traceIdNum = new BigInteger(125, 100, rand);
traceIdOrig = traceIdNum.toString(16);

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jjtyro 测试用例中的无用的空行咱们删除掉哈。

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

给个示例,哪种空行是必要的

Copy link
Member

@guanchao-yang guanchao-yang Jul 10, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jjtyro 一个方法内的两段表达逻辑可以加个空行。向这个 for 循环内部类似一段完整的逻辑就没必要加上空行了。建议删除空行,包括测试用例中的哈。

charLast--;
i++;
}

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

代码中不很很必要的空行,我们删除掉吧。

}

SofaTracer tracer = springMvcTracer.getSofaTracer();
SofaTracerSpanContext spanContext = (SofaTracerSpanContext) tracer.extract(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jjtyro 获取上游如 HttpClient 的 SpanContext 信息,的确是通过 SOFATracer 的 API 通过这种方式进行支持,先点个赞哈。后续我们也计划支持上 HttpClient 的方式,这样信息就通了,方便的话,可以直接贡献哈。

另外:关于 SpanContext 的透传方式,通过我们提供的 Extract 方式和 API 可以达到效果,这里适用的场景是非通用协议,即在每个公司内部都用私有的通信协议。在通用的协议上,如 HTTP 等方式的调用上,在 Server 端我们是期望通过如下的方式去支持上的哈,具体的就是如下的 HttpHeader 字段及标识含义,参考说明文档

按照目前的方式没有问题进行解析;同时呢,也期望向标准的 HTTP 请求我们可以和社区打通,这样就算是跨语言我们的链路也是通的。

BTW : 建议这次也兼容掉参考说明文档中 HTTP Header 的方式。

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

是同时支持Sofa-tracer内部标准与通用标准,还是单独支持通用标准即可? 对于非sofa中间件用户来讲,这个sofa-tracer好象意义不大。支持了通用标准,只是说Sofa-可以接入其他非sofa的trace请求。那么一些域的取值含义是否也应该一致,比如traceid的组成,是维持sofa的十六进制混合还是完全是十进制什么的方式?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

是同时支持Sofa-tracer内部标准与通用标准,还是单独支持通用标准即可?
@jjtyro HTTP 请求是通用的标准,建议这里我们就支持通用标准即可。

比如traceid的组成,是维持sofa的十六进制混合还是完全是十进制什么的方式?
建议这里还是保持16进制混合,我们可以讨论哈。1.是考虑到已有产品的兼容 2.是考虑到16进制在主要的展示界面中也采用的此方式。

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

有个问题,为什么sofa-tracer的spanid发往zipkin时,要使用FNV64HashCode方法把它编码,与spring-mvc-digest.log的spanid对不上,比如:spring-mvc-digest.log里的是0, zipkin上的是”af63ad4c86019caf“

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jjtyro 这样做的主要原因是 SOFATracer 的 SpanId 的生成规则是这个样子,但是对应到 Zipkin 的模型时,我们的 SpanId 并不完全是 16 进制的表达,而是以点号 . 分隔表达调用层次,再界面展示为 16 进制时也失去了其以点号分隔层次的表达含义。所以这里采用了一种 Hash 算法,来尽量较少碰撞概率同时也要保证性能,因此采用了 FNV64Hash 算法。

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

了解啦 忘了逗点儿"."的事

@popomore
Copy link

我有个疑问,因为 sofa 的 traceId 是非 64 或 128长度的,所以这里加了一层转换。这种情况需要保证 node 也使用相同的算法进行转换,因为传递时的 traceId 是原始的。

还有一种情况是上下游非 sofa 的,传递 traceId 的时候不一定会进行转换,是否我们的 traceId 也符合 64 或 128 长度。

@jjtyro
Copy link
Contributor Author

jjtyro commented Jul 10, 2018

@popomore 我也是此想法

@jjtyro
Copy link
Contributor Author

jjtyro commented Jul 10, 2018

@popomore 又想了想,完全当十六进制来解析,应是没有问题的,因为至少zipkin、jaeger等最后也是以十六进制显示traceid的

@guanchao-yang
Copy link
Member

@popomore @jjtyro 建议我们还是保留 16 进制字符串的表达形式,在类似界面有 Zipkin 只有对整型 API 时,我们可以进行转换,但其展示还是用 16 进制,最后能够还原回即可;在对应非 SOFA 体系的应用请求时,类似本次 PR,他们的 traceId 的确可能是没有进行转换的,将其解析为字符串 traceId,然后向 SOFA 系统传递即可,从对接的界面如 Zipkin 对这个 traceId 字符串最后被解析展示的也是 16 进制,这样也能够更好的在包括日志和界面中对应起来。咱们可以再讨论看下。

@popomore
Copy link

如果微服务使用一套架构是没有问题的,但是其他平台处理 traceId 的方式会不同。比如 MQ 支持 opentracing,那 sofa 生成 traceId 传给 MQ,同时发转换后的 traceId1 到 zipkin,MQ 接收 traceId 后发给 zipkin 的不一定是 traceId1。

我们可以整理一下现在大部分平台 traceId 的生成规则再来看具体是怎样的格式比较好。

@jjtyro
Copy link
Contributor Author

jjtyro commented Jul 11, 2018

在原来sofa tracer Header的基础上,增加对B3 Header的解析与生成。

@@ -133,13 +135,20 @@ public void close() {
//spanId
String spanId = sofaTracerSpanContext.getSpanId();
zipkinSpanBuilder.id(spanIdToLong(spanId));
//parentid
try {
long parentId = Long.parseLong(sofaTracerSpanContext.getParentId());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

这个 SpanId 应该是这个样子 zipkinSpanBuilder.parentId(spanIdToLong(parentSpanId)); 直接获取的话事宜 . 分隔的。SpanId 和 parentSpanId 要做同样的规则转换的。否则在 Zipkin 展示无法一一对应起来。

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

这块我再看看,对方送过来时就是一堆十六进制串。而且zipkin里为表示父子关系就此一处,不太清楚怎么处置,直接转long会超范围,现在是发到jaeger那边,trace是不能与调用者的展现成父子关系的,因为traceid是一样的,所以是放在一块了,也好象按发生时间排序了,但父子关系没有。
zipkin 2.0看json串是用reference元素来表示父子关系的,包括父节点的traceid、spanid。

Copy link
Member

@guanchao-yang guanchao-yang Jul 14, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

这块我再看看,对方送过来时就是一堆十六进制串

突然想起来,这块 SOFATracer 中是一个 . 点号分隔的,不是一个 16 进制串哈(那个是 traceId),所以还是要修正下的。

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

增加了判断,对于Server received parentId,判断如果是十六进制串,则直接转成long,否则按spanIdToLong。

* 替换str中的"&","=" 和 "%"
*/
private static String escapePercentEqualAnd(String str) {
//替换str中的"&","=" 和 "%"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

这里方法注释没必要删除吧。误删了?😀

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

& 不能出现在javadoc格式注释里, mvn 编译时默认要生成javadoc, 就出错了。

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jjtyro 你的 Maven 和 JDK 版本是

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

JDK8 maven 3.3.9

* @param operationName server openration name
* @return SofaTracerSpan
*/
public SofaTracerSpan serverReceive(SofaTracerSpanContext spanContextReceived,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jjtyro 这个方法有些问题哈。没有必要再抽象一个这个方法了哈。客户端发起调用的时候,对应到这个服务端应该就是你之前的那种方式恢复服务端的 Span,这样子 client 和 server 都有对应的 digest 日志和 Span 汇报,这里的 spanId 应该是一样的,而不应该是 asChildOf 关系。

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

这块我也想了会儿,目前的理解是,收到的客户端tracing span,应是我们服务端这边所有span的最顶上的父span,服务端sr span是它的子span。
这个,我从brave例子运行发到zipkin的span json内容上,发现应该是这样,后来又去查了opentracing的规范,我想应是这个样子。但是否影响到digest,还真是没注意,收到的context没建内对应的span,只是在关系引用中存在。

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

测试了一下,没有影响到digest汇报,比如收到对方:

traceId : "503ceaf40cf47ed4"
spanId : "daa447cc1240ef6b"
日志汇报里的是:

{"time":"2018-07-12 14:45:51.275","local.app":"SOFATracerReportZipkin","traceId":"503ceaf40cf47ed4","spanId":"daa447cc1240ef6b.1","request.url":"http://localhost:8080/zipkin","method":"GET","result.code":"200","req.size.bytes":-1,"resp.size.bytes":75,"time.cost.milliseconds":167,"current.thread.name":"http-nio-8080-exec-1","baggage":""}

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jjtyro 这样子做是没有问题,接收到的再新创建一个,但是在 Zipkin 接收到 Span 数据的时候,相当于只有 CS 和 CR 了。正常的 Zipkin 模型是 类似这样,一个数据展示完整是 CS、SR、SS 和 CR,类似这样,
image

以为这里涉及的相当于是两个组件,客户端可能是类似的 HttpClient,服务端是 MVC 框架,但是从模型上说这个就是一对模型共同构成了 CS、SR、SS 和 CR,所以这里建议这里不做 asChildOf

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

嗯,我好象是错了,改回来了。但github应该是被墙了

sofaTracerSpanContext = SofaTracerSpanContext.deserializeFromString(this
.decodedValue(value));
}
}
//if not found sofa trace context, then try to find out if there have zipkin propagation
if (sofaTracerSpanContext == null) {
sofaTracerSpanContext = propagation.extract(carrier);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

propagation 实例 TextB3Propagation 是一个有状态实例,而这里应该是一个无状态实例应该更合适一些,否则状态在并发造作下对 extract 的 SofaTracerSpanContext 会产生错误影响。

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TextB3Propagation 我理解是无状态的,没有任何保存运行数据的成员变量。 只是在内部引用了AbstractTextFormatter实现类的方法,这个不算有状态。

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jjtyro sorry 是的,看错 isGetSampled 这个变量了。

SofaTracerSpanContext parentSofaTracerSpanContext = sofaTracerSpan
.getParentSofaTracerSpan().getSofaTracerSpanContext();
if (parentSofaTracerSpanContext != null) {
String parentSpanId = parentSofaTracerSpanContext.getSpanId();
parentSpanId = parentSofaTracerSpanContext.getSpanId();
//
zipkinSpanBuilder.parentId(spanIdToLong(parentSpanId));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jjtyro parentId 这里也要做同样的操作

if (isHexString(parentSpanId)) {
    zipkinSpanBuilder.parentId(parentIdToId(parentSpanId));
}

/***
* 对指定的值进行编码
* @param value 字符串
* @return 编码后的 value
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jjtyro 新增加代码的注释我们建议是英文哈。


import com.alipay.common.tracer.core.context.span.SofaTracerSpanContext;

public interface Propagation<T> {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jjtyro 这个接口没有存在的必要了。直接使用 com.alipay.common.tracer.core.registry.RegistryExtractorInjector 这个接口即可。而对应的这个接口的实例 TextB3Propagation 完全可以按照已有的接口 RegistryExtractorInjector 在提供一个具体的实现即可,并注册到 com.alipay.common.tracer.core.registry.TracerFormatRegistry

@@ -29,6 +32,17 @@
* @since 2017/06/24
*/
public abstract class AbstractTextFormatter implements RegistryExtractorInjector<TextMap> {
TextB3Propagation propagation = new TextB3Propagation(new PropagationEncoder() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

这里单独代理了这个实现,看起来实在太怪了,也不符合期望哈。直接作为一个新的实例实现 RegistryExtractorInjector 接口并注册到 com.alipay.common.tracer.core.registry.TracerFormatRegistry

carrier.put(FORMATER_KEY_HEAD, this.encodedValue(spanContext.serializeSpanContext()));
//also zipkin propagation trace context head inject
propagation.inject(spanContext, carrier);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jjtyro inject 和 extract 相当于在 AbstractTextFormatter 中都被 propagation 代理了,对其他使用者,也没有必要,所以这里建议是,直接“下沉”到一个具体的 如名字叫 TextB3Propagation 的实现中,这个实现可以通过继承调用父类相应的方法,如果没有获取期望的再使用 B3Propagation 方式。

*/
package com.alipay.common.tracer.core.registry.propagation;

public interface B3Propagation<T> extends Propagation<T> {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jjtyro 如上下文所说,这个接口么必要了,直接下沉到相应的具体实现 RegistryExtractorInjector 实现即可。

* @param value String will be decoded
* @return Decoded string
*/
String decodeValue(String value);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jjtyro 这里的 Encode 和 Decode 也没有必要了。我看用到的地方

@Override
public String encodeValue(String value) {
    return AbstractTextFormatter.this.encodedValue(value);
}

也只是代理了一层,优化下实现可以把不必要的这些接口都统一起来哈,PropagationDecoder 和 PropagationEncoder 可以统一到 com.alipay.common.tracer.core.registry.AbstractTextFormatter#encodedValue 这里哈。

* @return long array: [0] -- High 64 bit, [1] -- low 64 bit
*/
public static long parentIdToId(String hexString) {
//Assert.hasText(hexString, "Can't convert empty hex string to long");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

public static 的 parentIdToId 方法和 isHexString 方法,可以直接放到 com.alipay.common.tracer.core.utils; package 下,并可以统一一个工具类的名称哈如,com.alipay.common.tracer.core.utils.CommonUtils 中,写好注释的测试用例即可。

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok!

* @return long array: [0] -- High 64 bit, [1] -- low 64 bit
*/
public static long[] traceIdToIds(String hexString) {
//Assert.hasText(hexString, "Can't convert empty hex string to long");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

这个工具类也是一样哈,统一放到 com.alipay.common.tracer.core.utils.CommonUtils 下吧。

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok!

@jjtyro
Copy link
Contributor Author

jjtyro commented Jul 27, 2018

@guanchao-yang 重构完了,你看看。 前面一些Commit Message忘了没写,IDE重复了以前的内容。 :(

@guanchao-yang guanchao-yang merged commit a17cbfe into sofastack:master Jul 28, 2018
@jjtyro jjtyro deleted the add_support_springmvc_receive_trace_context branch July 30, 2018 01:29
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants