Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

通过 Arthas Trace 命令将接口性能优化十倍(User Case 投稿) #1892

Closed
liuzhiguo630 opened this issue Aug 19, 2021 · 37 comments
Closed

Comments

@liuzhiguo630
Copy link

背景

Helios 系统要处理的数据量比较大,尤其是查询所有服务一天的评分数据时要返回每日 1440 分钟的所有应用的评分,总计有几十万个数据点,接口有时延迟会达到数秒。本文记录如何利用 Arthas ,将接口从几百几千 ms,优化到几十 ms。

链路:

从链路上看,线上获取一整天的数据时大概 300 多 ms,而查询数据库只有 11ms,说明大部分时间都是程序组装数据时消耗的,于是动起了优化代码的念头。

优化过程

温馨提示:代码可以不用看,没有上下文的情况下很难明白函数什么意思。主要看 Arthas Trace 的结果与优化思路。

初始未优化版本

代码

    private HeliosGetScoreResponse queryScores(HeliosGetScoreRequest request) {
        HeliosGetScoreResponse response = new HeliosGetScoreResponse();

        List<HeliosScore> heliosScores = heliosService.queryScoresTimeBetween(request.getStartTime(), request.getEndTime(), request.getFilterByAppId());
        if (CollectionUtils.isEmpty(heliosScores)) {
            return response;
        }

        Set<String> dateSet = new HashSet<>();

        Map<String, List<HeliosScore>> groupByAppIdHeliosScores = heliosScores.stream().collect(Collectors.groupingBy(HeliosScore::getAppId));
        for (List<HeliosScore> value : groupByAppIdHeliosScores.values()) {
            value.sort(Comparator.comparing(HeliosScore::getTimeFrom));
            HeliosGetScoreResponse.Score score = new HeliosGetScoreResponse.Score();
            score.setNamespace(value.get(0).getNamespace());
            score.setAppId(value.get(0).getAppId());
            for (HeliosScore heliosScore : value) {
                List<HeliosScore> splitHeliosScores = heliosScore.split();
                for (HeliosScore splitHeliosScore : splitHeliosScores) {
                    if (splitHeliosScore.getTimeFrom().compareTo(request.getStartTime()) < 0) {
                        continue;
                    }
                    if (splitHeliosScore.getTimeFrom().compareTo(request.getEndTime()) > 0) {
                        break;
                    }
                    dateSet.add(DateUtils.yyyyMMddHHmm.formatDate(splitHeliosScore.getTimeFrom()));
                    if (splitHeliosScore.getScores() == null) {
                        splitHeliosScore.setScores("100");
                        log.error("查询时发现数据缺失: {}", heliosScore);
                    }
                    score.add(Math.max(0, Integer.parseInt(splitHeliosScore.getScores())), null);
                }
            }
            response.getValues().add(score);
        }

        response.setDates(new ArrayList<>(dateSet).stream().sorted().collect(Collectors.toList()));
        return response;
    }

Arthas Trace

`---ts=2021-08-17 16:28:00;thread_name=http-nio-8080-exec-10;id=81;is_daemon=true;priority=5;TCCL=org.springframework.boot.web.embedded.tomcat.TomcatEmbeddedWebappClassLoader@20864cd1
    `---[4046.398447ms] xxxService.controller.HeliosController:queryScores()
        +---[0.022259ms] xxxService.model.helios.HeliosGetScoreResponse:<init>() #147
        +---[0.007132ms] xxxService.model.helios.HeliosGetScoreRequest:getStartTime() #149
        +---[0.006985ms] xxxService.model.helios.HeliosGetScoreRequest:getEndTime() #149
        +---[0.008704ms] xxxService.model.helios.HeliosGetScoreRequest:getFilterByAppId() #149
        +---[19.284658ms] xxxService.service.HeliosService:queryScoresTimeBetween() #149
        +---[0.017468ms] org.apache.commons.collections.CollectionUtils:isEmpty() #150
        +---[0.008054ms] java.util.HashSet:<init>() #154
        +---[0.027591ms] java.util.List:stream() #156
        +---[0.044229ms] java.util.stream.Collectors:groupingBy() #156
        +---[0.155582ms] java.util.stream.Stream:collect() #156
        +---[0.018318ms] java.util.Map:values() #157
        +---[0.019199ms] java.util.Collection:iterator() #157
        +---[min=3.51E-4ms,max=0.014266ms,total=0.125003ms,count=123] java.util.Iterator:hasNext() #157
        +---[min=5.11E-4ms,max=0.010188ms,total=0.145693ms,count=122] java.util.Iterator:next() #157
        +---[min=4.89E-4ms,max=0.045356ms,total=0.321978ms,count=122] java.util.Comparator:comparing() #158
        +---[min=0.003637ms,max=0.033049ms,total=0.928795ms,count=122] java.util.List:sort() #158
        +---[min=5.94E-4ms,max=0.010442ms,total=0.1485ms,count=122] xxxService.model.helios.HeliosGetScoreResponse$Score:<init>() #159
        +---[min=4.5E-4ms,max=0.010857ms,total=0.12773ms,count=122] java.util.List:get() #160
        +---[min=5.01E-4ms,max=0.007849ms,total=0.123696ms,count=122] xxxService.helios.entity.HeliosScore:getNamespace() #160
        +---[min=6.5E-4ms,max=0.007324ms,total=0.135906ms,count=122] xxxService.model.helios.HeliosGetScoreResponse$Score:setNamespace() #160
        +---[min=3.72E-4ms,max=0.010288ms,total=0.086703ms,count=122] java.util.List:get() #161
        +---[min=5.1E-4ms,max=0.00627ms,total=0.103871ms,count=122] xxxService.helios.entity.HeliosScore:getAppId() #161
        +---[min=5.97E-4ms,max=0.006531ms,total=0.126184ms,count=122] xxxService.model.helios.HeliosGetScoreResponse$Score:setAppId() #161
        +---[min=4.45E-4ms,max=0.020198ms,total=0.138299ms,count=122] java.util.List:iterator() #162
        +---[min=3.42E-4ms,max=0.014615ms,total=0.256056ms,count=366] java.util.Iterator:hasNext() #162
        +---[min=3.59E-4ms,max=0.014974ms,total=0.174396ms,count=244] java.util.Iterator:next() #162
        +---[min=0.071035ms,max=0.148132ms,total=19.444179ms,count=244] xxxService.helios.entity.HeliosScore:split() #163
        +---[min=4.06E-4ms,max=0.022364ms,total=0.210152ms,count=244] java.util.List:iterator() #164
        +---[min=3.07E-4ms,max=0.199649ms,total=143.267893ms,count=351604] java.util.Iterator:hasNext() #164
        +---[min=3.25E-4ms,max=24.863976ms,total=177.15363ms,count=351360] java.util.Iterator:next() #164
        +---[min=3.93E-4ms,max=0.096771ms,total=176.843018ms,count=351360] xxxService.helios.entity.HeliosScore:getTimeFrom() #165
        +---[min=4.07E-4ms,max=18.772715ms,total=205.632183ms,count=351360] xxxService.model.helios.HeliosGetScoreRequest:getStartTime() #165
        +---[min=3.33E-4ms,max=0.045589ms,total=149.24486ms,count=351360] java.util.Date:compareTo() #165
        +---[min=3.93E-4ms,max=0.032972ms,total=86.466793ms,count=175680] xxxService.helios.entity.HeliosScore:getTimeFrom() #168
        +---[min=4.12E-4ms,max=0.061003ms,total=94.294061ms,count=175680] xxxService.model.helios.HeliosGetScoreRequest:getEndTime() #168
        +---[min=3.37E-4ms,max=0.038792ms,total=74.505056ms,count=175680] java.util.Date:compareTo() #168
        +---[min=3.97E-4ms,max=0.036548ms,total=87.693935ms,count=175680] xxxService.helios.entity.HeliosScore:getTimeFrom() #171
     1  +---[min=0.001952ms,max=0.068413ms,total=391.739063ms,count=175680] xxxService.utils.DateUtils$yyyyMMddHHmm:formatDate() #171
        +---[min=4.07E-4ms,max=0.037904ms,total=108.107714ms,count=175680] java.util.Set:add() #171
        +---[min=3.95E-4ms,max=0.031555ms,total=88.173857ms,count=175680] xxxService.helios.entity.HeliosScore:getScores() #172
        +---[min=3.88E-4ms,max=0.033584ms,total=84.689466ms,count=175680] xxxService.helios.entity.HeliosScore:getScores() #176
        +---[min=3.11E-4ms,max=0.038121ms,total=69.708752ms,count=175680] java.lang.Math:max() #176
        +---[min=4.66E-4ms,max=0.03391ms,total=104.476576ms,count=175680] xxxService.model.helios.HeliosGetScoreResponse$Score:add() #176
        +---[min=6.17E-4ms,max=0.01503ms,total=0.159826ms,count=122] xxxService.model.helios.HeliosGetScoreResponse:getValues() #179
        +---[min=6.44E-4ms,max=0.03742ms,total=0.21068ms,count=122] java.util.List:add() #179
        +---[0.108961ms] java.util.ArrayList:<init>() #182
        +---[0.017455ms] java.util.ArrayList:stream() #182
        +---[0.011099ms] java.util.stream.Stream:sorted() #182
        +---[0.013699ms] java.util.stream.Collectors:toList() #182
        +---[0.38178ms] java.util.stream.Stream:collect() #182
        `---[0.004627ms] xxxService.model.helios.HeliosGetScoreResponse:setDates() #182

分析

Arthas 显示总共花了 4 秒,但实际上在链路上看大概是 350~450ms 左右。其他多出来的时间是 Arthas 每一次执行统计的消耗,因为方法里的循环比较多。这也告诉我们,不要用 trace 去看循环很多的方法。会对性能有非常严重的影响。

可以看出整个函数有 3 个循环,第一层循环的数量为 appId 的数量约为 140,第二层是查出来的数据条数,一天的数据已经归并了所以这里应该是 1,第三层是时间区间的分钟数,一天的话就是 1440 个。

Trace 中可以看到消耗最多的是封装的一个 SimpleDateFormat.formatDate()

第一次优化

优化方向

  1. 遍历每个时间点的思路改变,把合并过的大对象拆分成一个个小对象直接遍历,改成先合并起来,通过时间点逻辑上遍历。这样会减少创建几十万个对象。

  2. 将时间点集合 Set<String> dateSet 改为 Set<Date>,这样减少反复 formatDate() 的开销。

  3. 优化字符串转数字的过程,减少 Integer.parseInt方法调用,改为用 Map<String, Integer> 提前创建出 0~100 的字符串数字字典。(后来经过 JMH 测试,还是 Integer.parseInt 最快)

代码

private HeliosGetScoreResponse queryScores(HeliosGetScoreRequest request) {
        HeliosGetScoreResponse response = new HeliosGetScoreResponse();

        List<HeliosScore> heliosScoresRecord = heliosService.queryScoresTimeBetween(request.getStartTime(), request.getEndTime(), request.getFilterByAppId());
        if (CollectionUtils.isEmpty(heliosScoresRecord)) {
            return response;
        }

        Set<Date> dateSet = new HashSet<>();

        List<HeliosScore> heliosScores = HeliosDataMergeJob.mergeData(heliosScoresRecord);

        Map<String, List<HeliosScore>> groupByAppIdHeliosScores = heliosScores.stream().collect(Collectors.groupingBy(HeliosScore::getAppId));

        for (List<HeliosScore> scores : groupByAppIdHeliosScores.values()) {
            HeliosScore heliosScore = scores.get(0);
            HeliosGetScoreResponse.Score score = new HeliosGetScoreResponse.Score();
            score.setNamespace(heliosScore.getNamespace());
            score.setAppId(heliosScore.getAppId());
            score.setScores(new ArrayList<>());
            response.getValues().add(score);

            List<Integer> scoreIntList = HeliosHelper.splitScores(heliosScore);

            // 以 requestTime 为准
            Calendar indexDate = DateUtils.roundDownMinute(request.getStartTime().getTime());
            int index = 0;
            // 如果 timeFrom < requestTime,则增加 timeFrom 到 requestTime
            while (indexDate.getTime().compareTo(heliosScore.getTimeFrom()) > 0) {
                heliosScore.getTimeFrom().setTime(heliosScore.getTimeFrom().getTime() + 60_000);
                index++;
            }

            while (indexDate.getTime().compareTo(request.getEndTime()) <= 0 && indexDate.getTime().compareTo(heliosScore.getTimeTo()) <= 0  && index < scoreIntList.size()) {
                Integer scoreInt = scoreIntList.get(index++);
                score.getScores().add(scoreInt);
                dateSet.add(indexDate.getTime());
                indexDate.add(Calendar.MINUTE, 1);
            }
        }

        response.setDates(new ArrayList<>(dateSet).stream().sorted().map(DateUtils.yyyyMMddHHmm::formatDate).collect(Collectors.toList()));
        return response;
    }

Arthas Trace

---ts=2021-08-17 14:44:11;thread_name=http-nio-8080-exec-10;id=ab;is_daemon=true;priority=5;TCCL=org.springframework.boot.web.embedded.tomcat.TomcatEmbeddedWebappClassLoader@16ea0f22
    `---[6997.005629ms] xxxService.controller.HeliosController:queryScores()
        +---[0.020032ms] xxxService.model.helios.HeliosGetScoreResponse:<init>() #149
        +---[0.007451ms] xxxService.model.helios.HeliosGetScoreRequest:getStartTime() #151
        +---[min=0.001054ms,max=7.458198ms,total=213.19538ms,count=170754] xxxService.model.helios.HeliosGetScoreRequest:getEndTime() #57
        +---[0.007267ms] xxxService.model.helios.HeliosGetScoreRequest:getFilterByAppId() #57
        +---[15.255919ms] xxxService.service.HeliosService:queryScoresTimeBetween() #57
        +---[0.020045ms] org.apache.commons.collections.CollectionUtils:isEmpty() #152
        +---[0.015161ms] java.util.HashSet:<init>() #156
        +---[20.06713ms] xxxService.helios.jobs.HeliosDataMergeJob:mergeData() #158
        +---[0.043042ms] java.util.List:stream() #160
        +---[0.028232ms] java.util.stream.Collectors:groupingBy() #57
        +---[min=0.087087ms,max=1.931641ms,total=2.018728ms,count=2] java.util.stream.Stream:collect() #57
        +---[0.0151ms] java.util.Map:values() #162
        +---[0.019611ms] java.util.Collection:iterator() #57
        +---[min=7.55E-4ms,max=0.015165ms,total=0.201221ms,count=121] java.util.Iterator:hasNext() #57
        +---[min=0.001178ms,max=0.02477ms,total=0.220931ms,count=120] java.util.Iterator:next() #57
        +---[min=8.14E-4ms,max=0.01101ms,total=0.155044ms,count=120] java.util.List:get() #163
        +---[min=0.001049ms,max=0.009425ms,total=0.231297ms,count=120] xxxService.model.helios.HeliosGetScoreResponse$Score:<init>() #164
        +---[min=0.001167ms,max=0.009721ms,total=0.194502ms,count=120] xxxService.helios.entity.HeliosScore:getNamespace() #165
        +---[min=0.001222ms,max=0.020409ms,total=0.264791ms,count=120] xxxService.model.helios.HeliosGetScoreResponse$Score:setNamespace() #57
        +---[min=0.001097ms,max=0.006475ms,total=0.169987ms,count=120] xxxService.helios.entity.HeliosScore:getAppId() #166
        +---[min=0.00121ms,max=0.007106ms,total=0.207877ms,count=120] xxxService.model.helios.HeliosGetScoreResponse$Score:setAppId() #57
        +---[min=8.63E-4ms,max=0.008981ms,total=0.176195ms,count=120] java.util.ArrayList:<init>() #167
        +---[min=0.001225ms,max=0.021948ms,total=0.340375ms,count=120] xxxService.model.helios.HeliosGetScoreResponse$Score:setScores() #57
        +---[min=0.00112ms,max=0.008984ms,total=0.196212ms,count=120] xxxService.model.helios.HeliosGetScoreResponse:getValues() #168
        +---[min=7.64E-4ms,max=0.027237ms,total=154.660479ms,count=170753] java.util.List:add() #57
        +---[min=0.028779ms,max=0.237608ms,total=20.049731ms,count=120] xxxService.helios.HeliosHelper:splitScores() #170
        +---[min=0.001178ms,max=0.008102ms,total=0.199087ms,count=120] xxxService.model.helios.HeliosGetScoreRequest:getStartTime() #173
        +---[min=6.89E-4ms,max=0.048069ms,total=140.74298ms,count=170040] java.util.Date:getTime() #57
        +---[min=0.004686ms,max=0.03805ms,total=0.775394ms,count=120] xxxService.utils.DateUtils:roundDownMinute() #57
        +---[min=7.84E-4ms,max=7.562581ms,total=162.855553ms,count=170040] java.util.Calendar:getTime() #176
      2 +---[min=9.94E-4ms,max=0.029962ms,total=385.371864ms,count=339960] xxxService.helios.entity.HeliosScore:getTimeFrom() #57
      1 +---[min=7.76E-4ms,max=7.936578ms,total=483.361269ms,count=511428] java.util.Date:compareTo() #57
        +---[min=9.95E-4ms,max=0.077109ms,total=192.749805ms,count=169920] xxxService.helios.entity.HeliosScore:getTimeFrom() #177
        +---[min=6.94E-4ms,max=7.358942ms,total=151.184751ms,count=169920] java.util.Date:setTime() #57
        +---[min=7.67E-4ms,max=0.029244ms,total=152.500401ms,count=170753] java.util.Calendar:getTime() #181
        +---[min=7.65E-4ms,max=0.016336ms,total=151.879643ms,count=170635] java.util.Calendar:getTime() #182
        +---[min=0.001011ms,max=0.028133ms,total=196.192946ms,count=170635] xxxService.helios.entity.HeliosScore:getTimeTo() #57
        +---[min=6.93E-4ms,max=0.836104ms,total=141.443001ms,count=170635] java.util.List:size() #57
        +---[min=7.63E-4ms,max=7.940119ms,total=162.285955ms,count=170633] java.util.List:get() #183
      3 +---[min=0.001068ms,max=0.973964ms,total=209.721ms,count=170633] xxxService.model.helios.HeliosGetScoreResponse$Score:getScores() #184
        +---[min=7.71E-4ms,max=0.028856ms,total=154.918574ms,count=170633] java.util.Calendar:getTime() #185
        +---[min=8.07E-4ms,max=8.030316ms,total=186.971072ms,count=170633] java.util.Set:add() #57
        +---[min=7.82E-4ms,max=0.034732ms,total=156.2645ms,count=170633] java.util.Calendar:add() #186
        +---[0.050615ms] java.util.ArrayList:<init>() #190
        +---[0.019114ms] java.util.ArrayList:stream() #57
        +---[0.029096ms] java.util.stream.Stream:sorted() #57
        +---[0.018823ms] java.util.stream.Stream:map() #57
        +---[0.009092ms] java.util.stream.Collectors:toList() #57
        `---[0.006768ms] xxxService.model.helios.HeliosGetScoreResponse:setDates() #57

分析

这一步实际上执行时间优化了 50ms 左右。

从 Trace 中看耗时时间最长的是 Date 的 compareTo,也就是代码中的 if (splitHeliosScore.getTimeFrom().compareTo(request.getStartTime()) < 0)

而比较意外的是从对象中 get 属性居然也是有开销的。

第二次优化

优化方向

结合上一次 Arthas Trace 的结果,在以下几个方向进行优化:

  1. Date 对象的换成 long 型时间戳进行比较
  2. Date 对象反复 getTime、setTime,改为 long 型时间戳 += 60_000 实现,得到结果后只 setTime 一次。
  3. 每次填充数据都往 Set<String> dateSet 放入数据,改为通过标识判断只放入一次。
  4. 存放分数的 ArrayList 在第一次循环之后,可以确认大小,之后循环创建 ArrayList 时直接填入固定的大小,减少内存创建。

代码

    private HeliosGetScoreResponse queryScores(HeliosGetScoreRequest request) {
        HeliosGetScoreResponse response = new HeliosGetScoreResponse();

        List<HeliosScore> heliosScoresRecord = heliosService.queryScoresTimeBetween(request.getStartTime(), request.getEndTime(), request.getFilterByAppId());
        if (CollectionUtils.isEmpty(heliosScoresRecord)) {
            return response;
        }

        Set<Date> dateSet = new HashSet<>();
        boolean isDateSetInitial = false;
        int scoreSize = 16;

        List<HeliosScore> heliosScores = HeliosDataMergeJob.mergeData(heliosScoresRecord);

        Map<String, List<HeliosScore>> groupByAppIdHeliosScores = heliosScores.stream().collect(Collectors.groupingBy(HeliosScore::getAppId));

        for (List<HeliosScore> scores : groupByAppIdHeliosScores.values()) {
            HeliosScore heliosScore = scores.get(0);
            HeliosGetScoreResponse.Score score = new HeliosGetScoreResponse.Score();
            score.setNamespace(heliosScore.getNamespace());
            score.setAppId(heliosScore.getAppId());
            score.setScores(new ArrayList<>(scoreSize));
            response.getValues().add(score);

            List<Integer> scoreIntList = HeliosHelper.splitScores(heliosScore);

            // 以 requestTime 为准
            long indexDateMills = request.getStartTime().getTime();
            int index = 0;
            // 如果 timeFrom < requestTime,则增加 timeFrom 到 requestTime
            long heliosScoreTimeFromMills = heliosScore.getTimeFrom().getTime();
            while (indexDateMills > heliosScoreTimeFromMills) {
                heliosScoreTimeFromMills += 60_000;
                index++;
            }
            heliosScore.getTimeFrom().setTime(heliosScoreTimeFromMills);

            long requestEndTimeMills = request.getEndTime().getTime();
            long heliosScoreTimeToMills = heliosScore.getTimeTo().getTime();
            // 循环条件为 (当前时间 <= 请求最大时间) && (当前时间 <= 数据最大时间) && (index < 数据条数)
            while (indexDateMills <= requestEndTimeMills && indexDateMills <= heliosScoreTimeToMills && index < scoreIntList.size()) {
                score.getScores().add(scoreIntList.get(index++));
                if (!isDateSetInitial) {
                    dateSet.add(new Date(indexDateMills));
                }
                indexDateMills += 60_000;
            }
            // 性能优化,减少重复放入的次数
            isDateSetInitial = true;
            // 性能优化,初始化足够的 size 减少扩容次数。 x1.1 为了万一数据数量不一致,留出一点 buffer。
            scoreSize = (int) (score.getScores().size() * 1.1);
        }

        response.setDates(new ArrayList<>(dateSet).stream().sorted().map(DateUtils.yyyyMMddHHmm::formatDate).collect(Collectors.toList()));
        return response;
    }

Arthas Trace

`---ts=2021-08-17 15:20:41;thread_name=http-nio-8080-exec-7;id=aa;is_daemon=true;priority=5;TCCL=org.springframework.boot.web.embedded.tomcat.TomcatEmbeddedWebappClassLoader@14be750c
    `---[1411.395123ms] xxxService.controller.HeliosController:queryScores()
        +---[0.016102ms] xxxService.model.helios.HeliosGetScoreResponse:<init>() #149
        +---[0.019084ms] xxxService.model.helios.HeliosGetScoreRequest:getStartTime() #151
        +---[0.007879ms] xxxService.model.helios.HeliosGetScoreRequest:getEndTime() #57
        +---[0.006808ms] xxxService.model.helios.HeliosGetScoreRequest:getFilterByAppId() #57
        +---[27.494178ms] xxxService.service.HeliosService:queryScoresTimeBetween() #57
        +---[0.02087ms] org.apache.commons.collections.CollectionUtils:isEmpty() #152
        +---[0.007694ms] java.util.HashSet:<init>() #156
        +---[19.990512ms] xxxService.helios.jobs.HeliosDataMergeJob:mergeData() #160
        +---[0.044161ms] java.util.List:stream() #162
        +---[0.025737ms] java.util.stream.Collectors:groupingBy() #57
        +---[min=0.079651ms,max=2.007048ms,total=2.086699ms,count=2] java.util.stream.Stream:collect() #57
        +---[0.018405ms] java.util.Map:values() #164
        +---[0.021408ms] java.util.Collection:iterator() #57
        +---[min=7.4E-4ms,max=0.015625ms,total=0.177657ms,count=121] java.util.Iterator:hasNext() #57
        +---[min=0.001193ms,max=0.026712ms,total=0.258491ms,count=120] java.util.Iterator:next() #57
        +---[min=7.69E-4ms,max=0.011855ms,total=0.158671ms,count=120] java.util.List:get() #165
        +---[min=0.001045ms,max=0.019788ms,total=0.232004ms,count=120] xxxService.model.helios.HeliosGetScoreResponse$Score:<init>() #166
        +---[min=0.001072ms,max=0.007958ms,total=0.193652ms,count=120] xxxService.helios.entity.HeliosScore:getNamespace() #167
        +---[min=0.001164ms,max=0.007796ms,total=0.201584ms,count=120] xxxService.model.helios.HeliosGetScoreResponse$Score:setNamespace() #57
        +---[min=0.001048ms,max=0.007456ms,total=0.178323ms,count=120] xxxService.helios.entity.HeliosScore:getAppId() #168
        +---[min=0.001137ms,max=0.010225ms,total=0.201887ms,count=120] xxxService.model.helios.HeliosGetScoreResponse$Score:setAppId() #57
        +---[min=0.001627ms,max=0.010431ms,total=0.291395ms,count=120] java.util.ArrayList:<init>() #169
        +---[min=0.00116ms,max=0.0088ms,total=0.20171ms,count=120] xxxService.model.helios.HeliosGetScoreResponse$Score:setScores() #57
        +---[min=0.001076ms,max=0.010293ms,total=0.199407ms,count=120] xxxService.model.helios.HeliosGetScoreResponse:getValues() #170
        +---[min=7.54E-4ms,max=0.086952ms,total=150.86682ms,count=170753] java.util.List:add() #57
        +---[min=0.020428ms,max=0.269554ms,total=19.477128ms,count=120] xxxService.helios.HeliosHelper:splitScores() #172
        +---[min=0.001092ms,max=0.005258ms,total=0.202045ms,count=120] xxxService.model.helios.HeliosGetScoreRequest:getStartTime() #175
        +---[min=7.09E-4ms,max=0.021027ms,total=0.630747ms,count=480] java.util.Date:getTime() #57
        +---[min=0.00106ms,max=0.015055ms,total=0.188439ms,count=120] xxxService.helios.entity.HeliosScore:getTimeFrom() #178
        +---[min=0.001025ms,max=0.009712ms,total=0.171506ms,count=120] xxxService.helios.entity.HeliosScore:getTimeFrom() #183
        +---[min=7.4E-4ms,max=0.092253ms,total=0.251068ms,count=120] java.util.Date:setTime() #57
        +---[min=0.001086ms,max=0.006234ms,total=0.184256ms,count=120] xxxService.model.helios.HeliosGetScoreRequest:getEndTime() #185
        +---[min=0.001036ms,max=0.012332ms,total=0.176491ms,count=120] xxxService.helios.entity.HeliosScore:getTimeTo() #186
      3 +---[min=6.73E-4ms,max=0.066785ms,total=135.009239ms,count=170635] java.util.List:size() #188
      1 +---[min=0.001085ms,max=0.089243ms,total=208.003309ms,count=170633] xxxService.model.helios.HeliosGetScoreResponse$Score:getScores() #189
      2 +---[min=7.31E-4ms,max=0.070823ms,total=145.488732ms,count=170633] java.util.List:get() #57
        +---[min=0.001177ms,max=0.143546ms,total=2.319379ms,count=1440] java.util.Date:<init>() #191
        +---[min=0.001346ms,max=0.064411ms,total=2.839878ms,count=1440] java.util.Set:add() #57
        +---[min=0.001096ms,max=0.009059ms,total=0.190336ms,count=120] xxxService.model.helios.HeliosGetScoreResponse$Score:getScores() #198
        +---[min=6.92E-4ms,max=0.016223ms,total=0.141751ms,count=120] java.util.List:size() #57
        +---[0.069753ms] java.util.ArrayList:<init>() #201
        +---[0.021066ms] java.util.ArrayList:stream() #57
        +---[0.029498ms] java.util.stream.Stream:sorted() #57
        +---[0.014089ms] java.util.stream.Stream:map() #57
        +---[0.013053ms] java.util.stream.Collectors:toList() #57
        `---[0.009818ms] xxxService.model.helios.HeliosGetScoreResponse:setDates() #57

分析

这一步将执行时间又优化了 80ms 左右。现在还剩是 160ms 了。

从 Trace 中看耗时时间最长的是三个方法:

  • getScores。直接 get 了属性啥也没干,但是积少成多
  • list.size()
  • list.get(index)

也就是说虽然这几个函数里也没干什么东西,但是函数调用、指针寻址本身也是有开销的。

第三次优化

优化方向

  1. 减少 list 属性的调用
  2. 一次次 list.add 方法改成 subList 一次性放入

也就是说循环中不做任何耗时操作,不做任何指针/引用。

代码

private HeliosGetScoreResponse queryScores(HeliosGetScoreRequest request) {
        HeliosGetScoreResponse response = new HeliosGetScoreResponse();

        List<HeliosScore> heliosScoresRecord = heliosService.queryScoresTimeBetween(request.getStartTime(), request.getEndTime(), request.getFilterByAppId());
        if (CollectionUtils.isEmpty(heliosScoresRecord)) {
            return response;
        }

        Set<Date> dateSet = new HashSet<>();
        boolean isDateSetInitial = false;
        int scoreSize = 16;

        List<HeliosScore> heliosScores = HeliosDataMergeJob.mergeData(heliosScoresRecord);

        Map<String, List<HeliosScore>> groupByAppIdHeliosScores = heliosScores.stream().collect(Collectors.groupingBy(HeliosScore::getAppId));

        for (List<HeliosScore> scores : groupByAppIdHeliosScores.values()) {
            HeliosScore heliosScore = scores.get(0);
            HeliosGetScoreResponse.Score score = new HeliosGetScoreResponse.Score();
            score.setNamespace(heliosScore.getNamespace());
            score.setAppId(heliosScore.getAppId());
            score.setScores(new ArrayList<>(scoreSize));
            response.getValues().add(score);

            List<Integer> scoreIntList = HeliosHelper.splitScores(heliosScore);

            // 以 requestTime 为准
            long indexDateMills = request.getStartTime().getTime();
            int index = 0;
            // 如果 timeFrom < requestTime,则增加 timeFrom 到 requestTime
            long heliosScoreTimeFromMills = heliosScore.getTimeFrom().getTime();
            while (indexDateMills > heliosScoreTimeFromMills) {
                heliosScoreTimeFromMills += 60_000;
                index++;
            }
            heliosScore.getTimeFrom().setTime(heliosScoreTimeFromMills);

            long requestEndTimeMills = request.getEndTime().getTime();
            long heliosScoreTimeToMills = heliosScore.getTimeTo().getTime();

            // 循环条件为 (当前时间 <= 请求最大时间) && (当前时间 <= 数据最大时间) && (index < 数据条数)
            int scoreIntListSize = scoreIntList.size();
            int indexStart = index;
            while (indexDateMills <= requestEndTimeMills && indexDateMills <= heliosScoreTimeToMills && index++ < scoreIntListSize) {
                if (!isDateSetInitial) {
                    dateSet.add(new Date(indexDateMills));
                }
                indexDateMills += 60_000;
            }
            score.getScores().addAll(scoreIntList.subList(indexStart, index - 1));
            // 性能优化,减少重复放入的次数
            isDateSetInitial = true;
            // 性能优化,初始化足够的 size 减少扩容次数。 x1.1 为了万一数据数量不一致,留出一点 buffer。
            scoreSize = (int) (score.getScores().size() * 1.1);
        }

        response.setDates(new ArrayList<>(dateSet).stream().sorted().map(DateUtils.yyyyMMddHHmm::formatDate).collect(Collectors.toList()));
        return response;
    }

Arthas Trace

`---ts=2021-08-17 15:33:40;thread_name=http-nio-8080-exec-11;id=f1;is_daemon=true;priority=5;TCCL=org.springframework.boot.web.embedded.tomcat.TomcatEmbeddedWebappClassLoader@d1c5cf2
    `---[138.624811ms] xxxService.controller.HeliosController:queryScores()
        +---[0.021852ms] xxxService.model.helios.HeliosGetScoreResponse:<init>() #149
        +---[0.00746ms] xxxService.model.helios.HeliosGetScoreRequest:getStartTime() #151
        +---[0.005838ms] xxxService.model.helios.HeliosGetScoreRequest:getEndTime() #57
        +---[0.006341ms] xxxService.model.helios.HeliosGetScoreRequest:getFilterByAppId() #57
    2   +---[15.227453ms] xxxService.service.HeliosService:queryScoresTimeBetween() #57
        +---[0.02168ms] org.apache.commons.collections.CollectionUtils:isEmpty() #152
        +---[0.008923ms] java.util.HashSet:<init>() #156
    1   +---[22.703926ms] xxxService.helios.jobs.HeliosDataMergeJob:mergeData() #160
        +---[0.047118ms] java.util.List:stream() #162
        +---[0.043183ms] java.util.stream.Collectors:groupingBy() #57
        +---[min=0.095654ms,max=2.183288ms,total=2.278942ms,count=2] java.util.stream.Stream:collect() #57
        +---[0.022906ms] java.util.Map:values() #164
        +---[0.025777ms] java.util.Collection:iterator() #57
        +---[min=9.28E-4ms,max=0.017187ms,total=0.261862ms,count=121] java.util.Iterator:hasNext() #57
        +---[min=9.88E-4ms,max=0.018901ms,total=0.280889ms,count=120] java.util.Iterator:next() #57
        +---[min=9.65E-4ms,max=0.014741ms,total=0.262695ms,count=120] java.util.List:get() #165
        +---[min=0.001215ms,max=0.013928ms,total=0.347762ms,count=120] xxxService.model.helios.HeliosGetScoreResponse$Score:<init>() #166
        +---[min=0.001253ms,max=0.010855ms,total=0.328842ms,count=120] xxxService.helios.entity.HeliosScore:getNamespace() #167
        +---[min=0.001316ms,max=0.014714ms,total=0.372553ms,count=120] xxxService.model.helios.HeliosGetScoreResponse$Score:setNamespace() #57
        +---[min=0.001211ms,max=0.010511ms,total=0.322723ms,count=120] xxxService.helios.entity.HeliosScore:getAppId() #168
        +---[min=0.00132ms,max=0.010201ms,total=0.334627ms,count=120] xxxService.model.helios.HeliosGetScoreResponse$Score:setAppId() #57
        +---[min=0.00116ms,max=0.014504ms,total=0.386879ms,count=120] java.util.ArrayList:<init>() #169
        +---[min=0.00131ms,max=0.014072ms,total=0.344922ms,count=120] xxxService.model.helios.HeliosGetScoreResponse$Score:setScores() #57
        +---[min=0.001261ms,max=0.017312ms,total=0.356444ms,count=120] xxxService.model.helios.HeliosGetScoreResponse:getValues() #170
        +---[min=9.73E-4ms,max=0.016531ms,total=0.275794ms,count=120] java.util.List:add() #57
     3  +---[min=0.023208ms,max=19.808819ms,total=47.196601ms,count=120] xxxService.helios.HeliosHelper:splitScores() #172
        +---[min=0.001289ms,max=0.009578ms,total=0.36878ms,count=120] xxxService.model.helios.HeliosGetScoreRequest:getStartTime() #175
        +---[min=8.85E-4ms,max=0.016405ms,total=0.994157ms,count=480] java.util.Date:getTime() #57
        +---[min=0.001238ms,max=0.016801ms,total=0.34399ms,count=120] xxxService.helios.entity.HeliosScore:getTimeFrom() #178
        +---[min=0.001217ms,max=0.008931ms,total=0.316197ms,count=120] xxxService.helios.entity.HeliosScore:getTimeFrom() #183
        +---[min=9.14E-4ms,max=0.015929ms,total=0.277078ms,count=120] java.util.Date:setTime() #57
        +---[min=0.001238ms,max=0.01061ms,total=0.3375ms,count=120] xxxService.model.helios.HeliosGetScoreRequest:getEndTime() #185
        +---[min=0.001225ms,max=0.018059ms,total=0.315198ms,count=120] xxxService.helios.entity.HeliosScore:getTimeTo() #186
        +---[min=8.79E-4ms,max=0.022669ms,total=0.272356ms,count=120] java.util.List:size() #189
        +---[min=0.002001ms,max=0.056977ms,total=4.32853ms,count=1440] java.util.Date:<init>() #193
        +---[min=0.002174ms,max=0.040594ms,total=4.594415ms,count=1440] java.util.Set:add() #57
        +---[min=0.001302ms,max=0.012925ms,total=0.353165ms,count=120] xxxService.model.helios.HeliosGetScoreResponse$Score:getScores() #197
        +---[min=0.001004ms,max=0.033424ms,total=0.338294ms,count=120] java.util.List:subList() #57
        +---[min=0.004871ms,max=0.051046ms,total=2.945263ms,count=120] java.util.List:addAll() #57
        +---[min=0.001291ms,max=0.009831ms,total=0.314292ms,count=120] xxxService.model.helios.HeliosGetScoreResponse$Score:getScores() #201
        +---[min=8.84E-4ms,max=0.018168ms,total=0.249321ms,count=120] java.util.List:size() #57
        +---[0.054305ms] java.util.ArrayList:<init>() #204
        +---[0.024481ms] java.util.ArrayList:stream() #57
        +---[0.028717ms] java.util.stream.Stream:sorted() #57
        +---[0.013725ms] java.util.stream.Stream:map() #57
        +---[0.0128ms] java.util.stream.Collectors:toList() #57
        `---[0.007166ms] xxxService.model.helios.HeliosGetScoreResponse:setDates() #57

分析

这一步又优化了 100ms 左右,现在还剩 60ms。

现在从 trace 上看耗时操作只有三个了:

  • 查数据库
  • 合并数据
  • 拆分得分字符串 "100,100,100" 为 int 数组 [100,100,100]

第四次优化

优化方向

  1. 查数据库发现由于 SQL 判断不准确,每次会多查出来一条数据,在后边循环的时候会多循环一倍
  2. 合并数据时发现可以针对单条数据的情况直接过滤,减少开销。

代码

  1. 改了 SQL 并验证,减少查询出来的数据量
  2. 单条数据时不再处理合并逻辑

Arthas Trace

`---ts=2021-08-17 16:03:24;thread_name=http-nio-8080-exec-13;id=f1;is_daemon=true;priority=5;TCCL=org.springframework.boot.web.embedded.tomcat.TomcatEmbeddedWebappClassLoader@69e2fe3b
    `---[38.171379ms] xxxService.controller.HeliosController:queryScores()
        +---[0.009463ms] xxxService.model.helios.HeliosGetScoreResponse:<init>() #149
        +---[0.00348ms] xxxService.model.helios.HeliosGetScoreRequest:getStartTime() #151
        +---[0.003233ms] xxxService.model.helios.HeliosGetScoreRequest:getEndTime() #57
        +---[0.003395ms] xxxService.model.helios.HeliosGetScoreRequest:getFilterByAppId() #57
     1  +---[10.157226ms] xxxService.service.HeliosService:queryScoresTimeBetween() #57
        +---[0.009989ms] org.apache.commons.collections.CollectionUtils:isEmpty() #152
        +---[0.003394ms] java.util.HashSet:<init>() #156
        +---[0.083535ms] xxxService.helios.jobs.HeliosDataMergeJob:mergeData() #160
        +---[0.017819ms] java.util.List:stream() #162
        +---[0.011787ms] java.util.stream.Collectors:groupingBy() #57
        +---[min=0.047561ms,max=2.02786ms,total=2.075421ms,count=2] java.util.stream.Stream:collect() #57
        +---[0.015525ms] java.util.Map:values() #164
        +---[0.021965ms] java.util.Collection:iterator() #57
        +---[min=7.25E-4ms,max=0.009733ms,total=0.115783ms,count=121] java.util.Iterator:hasNext() #57
        +---[min=8.43E-4ms,max=0.011422ms,total=0.142771ms,count=120] java.util.Iterator:next() #57
        +---[min=7.81E-4ms,max=0.010883ms,total=0.128809ms,count=120] java.util.List:get() #165
        +---[min=0.001023ms,max=0.004301ms,total=0.150165ms,count=120] xxxService.model.helios.HeliosGetScoreResponse$Score:<init>() #166
        +---[min=0.001066ms,max=0.004648ms,total=0.154698ms,count=120] xxxService.helios.entity.HeliosScore:getNamespace() #167
        +---[min=0.001137ms,max=0.005607ms,total=0.170279ms,count=120] xxxService.model.helios.HeliosGetScoreResponse$Score:setNamespace() #57
        +---[min=0.001023ms,max=0.004292ms,total=0.151767ms,count=120] xxxService.helios.entity.HeliosScore:getAppId() #168
        +---[min=0.001105ms,max=0.004701ms,total=0.164955ms,count=120] xxxService.model.helios.HeliosGetScoreResponse$Score:setAppId() #57
        +---[min=0.001359ms,max=0.007931ms,total=0.233665ms,count=120] java.util.ArrayList:<init>() #169
        +---[min=0.001117ms,max=0.00785ms,total=0.168539ms,count=120] xxxService.model.helios.HeliosGetScoreResponse$Score:setScores() #57
        +---[min=0.001073ms,max=0.004488ms,total=0.156654ms,count=120] xxxService.model.helios.HeliosGetScoreResponse:getValues() #170
        +---[min=7.98E-4ms,max=0.00977ms,total=0.125818ms,count=120] java.util.List:add() #57
        +---[min=0.022304ms,max=0.12093ms,total=8.88628ms,count=120] xxxService.helios.HeliosHelper:splitScores() #172
        +---[min=0.001092ms,max=0.004967ms,total=0.161288ms,count=120] xxxService.model.helios.HeliosGetScoreRequest:getStartTime() #175
        +---[min=7.02E-4ms,max=0.012136ms,total=0.467786ms,count=480] java.util.Date:getTime() #57
        +---[min=0.001022ms,max=0.004944ms,total=0.151353ms,count=120] xxxService.helios.entity.HeliosScore:getTimeFrom() #178
        +---[min=0.001018ms,max=0.004731ms,total=0.148025ms,count=120] xxxService.helios.entity.HeliosScore:getTimeFrom() #183
        +---[min=7.3E-4ms,max=0.009359ms,total=0.120588ms,count=120] java.util.Date:setTime() #57
        +---[min=0.00107ms,max=0.008948ms,total=0.162848ms,count=120] xxxService.model.helios.HeliosGetScoreRequest:getEndTime() #185
        +---[min=0.001034ms,max=0.014003ms,total=0.158614ms,count=120] xxxService.helios.entity.HeliosScore:getTimeTo() #186
        +---[min=6.99E-4ms,max=0.009995ms,total=0.11179ms,count=120] java.util.List:size() #189
        +---[min=6.95E-4ms,max=0.005468ms,total=1.116308ms,count=1440] java.util.Date:<init>() #193
        +---[min=7.79E-4ms,max=0.029909ms,total=1.407528ms,count=1440] java.util.Set:add() #57
        +---[min=0.001097ms,max=0.008616ms,total=0.160597ms,count=120] xxxService.model.helios.HeliosGetScoreResponse$Score:getScores() #197
        +---[min=8.23E-4ms,max=0.0294ms,total=0.153353ms,count=120] java.util.List:subList() #57
        +---[min=0.005771ms,max=0.04465ms,total=1.992151ms,count=120] java.util.List:addAll() #57
        +---[min=0.001098ms,max=0.007013ms,total=0.169555ms,count=120] xxxService.model.helios.HeliosGetScoreResponse$Score:getScores() #201
        +---[min=7.04E-4ms,max=0.01315ms,total=0.120998ms,count=120] java.util.List:size() #57
        +---[0.197732ms] java.util.ArrayList:<init>() #204
        +---[0.018589ms] java.util.ArrayList:stream() #57
        +---[0.025192ms] java.util.stream.Stream:sorted() #57
        +---[0.012544ms] java.util.stream.Stream:map() #57
        +---[0.012188ms] java.util.stream.Collectors:toList() #57
        `---[0.0067ms] xxxService.model.helios.HeliosGetScoreResponse:setDates() #57

分析

可以看到现在最大耗时的地方终于是数据库查询了。现在查询一整天的数据,也只需要 25~40ms 左右。

结果

链路:

链路上看程序代码还是要处理个十几 ms,主要是字符串转 int[] 时的开销,这一步可以再想办法继续优化。

结论

从这次优化我们可以得到一些结论:

  1. 尽量少创建对象
  2. SimpleDateFormat的开销很大
  3. Date.compare 的开销不低
  4. 哪怕最简单的操作如 list.size() list.add次数多了开销也很可观
  5. 对于性能分析和优化一定要有合适工具,才能得出有用的结论并针对性优化。一开始我以为减少对象创建就万事大吉,但实际上性能消耗的大头并不在这里。还是得借助 Arthas 的 Trace 才能真正针对性地优化。

另外强烈推荐 动态追踪技术漫谈

@yuhui0531
Copy link

cool

1 similar comment
@kissss
Copy link

kissss commented Oct 13, 2021

cool

@cjx3071
Copy link

cjx3071 commented Nov 4, 2021

很有意义的分享

@LaughingMeMe
Copy link

mark

@davidricardo1026
Copy link

加个排序可好,看得眼有点花

@davidricardo1026
Copy link

  1. 能用单例就单例,比如2就可以用单例的java类
  2. SimpleDateFormat已经可以废弃了,直接java.time.format.DateTimeFormatter
  3. Date.compare,建议使用新的JavaDateApi,如LocalDate,LocalDatetime,Durtaion,YearMonth之类的,简单测试,发现,巨量调用以后,LocalDatetime.compareTo方法是Date.compare的1/3

@davidricardo1026
Copy link

davidricardo1026 commented Dec 8, 2021

java耗时分类里面

  1. IO最耗时,磁盘,网络都是
  2. 复杂的正则,简单还好
  3. 涉及到语法相关的,比如脚本编译之类的
  4. 还有就是耗时对象的重复创建,比如ObjectMapper本来是单例也可以用了,但是你每次用都new,会非常耗时
    总结下来,一个程序优化到位以后,只有IO占大头,还有count比较大,在循环里面的一些方法占小头,其余的可以忽略不急

@1647855237
Copy link

大佬 截图的那个是什么工具呀
可否加个微信呀,感谢!!!
ju_hao_666

@liuzhiguo630
Copy link
Author

大佬 截图的那个是什么工具呀 可否加个微信呀,感谢!!! ju_hao_666

基于 Skywalking 二开的调用链系统,前端也重写了一下。

@1647855237
Copy link

1647855237 commented Feb 18, 2022 via email

@a139169370
Copy link

mark

@Suphzz
Copy link

Suphzz commented Mar 23, 2022

ngood

@grover121
Copy link

mark

@wudg
Copy link

wudg commented Jul 6, 2022

Mark

@kakakeven
Copy link

Mark

@qqggwm
Copy link

qqggwm commented Aug 10, 2022

感谢分享

@shirley-hpu
Copy link

excellent ^_^

@shirley-hpu
Copy link

shirley-hpu commented Aug 26, 2022 via email

@yangchang1997
Copy link

get下 ,如果能用本地缓存的方式或者分布式缓存 ,程序性能至少提升1.5倍

@shirley-hpu
Copy link

shirley-hpu commented Sep 11, 2022 via email

@zhuhao125
Copy link

m

@Sxxjava
Copy link

Sxxjava commented Oct 10, 2022

mark

@shirley-hpu
Copy link

shirley-hpu commented Oct 10, 2022 via email

@hellogy
Copy link

hellogy commented Oct 30, 2022

总结里面为啥会说 list.size()会有性能问题呢,不是O(1)的时间复杂度吗

@shirley-hpu
Copy link

shirley-hpu commented Oct 30, 2022 via email

@liuzhiguo630
Copy link
Author

总结里面为啥会说 list.size()会有性能问题呢,不是O(1)的时间复杂度吗

是的,应该这么说,list.size() 一方面开销在于虚拟机进栈出栈的开销,一方面是 arthas 追踪方法耗时的开销。后者的占比会更大。

@1160068538
Copy link

Mark

1 similar comment
@wukele
Copy link

wukele commented Nov 3, 2022

Mark

@luckilycheng
Copy link

mark

@shirley-hpu
Copy link

shirley-hpu commented Dec 16, 2022 via email

@yokiweb
Copy link

yokiweb commented Dec 22, 2022

Mark

1 similar comment
@Sxxjava
Copy link

Sxxjava commented Apr 26, 2023

Mark

@shirley-hpu
Copy link

shirley-hpu commented Apr 26, 2023 via email

@JackyYangPassion
Copy link

Best Practices!

@shirley-hpu
Copy link

shirley-hpu commented Aug 23, 2023 via email

@BonjourL
Copy link

mark

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests