[IOTDB-2205] Optimize unaligned int/long read/write functions in BytesUtils #4676
JackieTien97 merged 4 commits into apache:master from
tsfile/src/main/java/org/apache/iotdb/tsfile/utils/BytesUtils.java
JackieTien97 left a comment
Great job! I wonder how much this change improves write performance, since your PR description only mentions the query performance improvements.
Co-authored-by: Jackie Tien <JackieTien@foxmail.com>
With the help of IoTDB-benchmark, I measured the performance improvement. Because the test data differ, the results are slightly different from those previously reported.
The modified IoTDB-benchmark config is listed as follows, while all other settings keep their defaults.
Hi, plz run
Thanks. The format issue is fixed in the new commit.
Note: There was a bug in the code submitted here, IOTDB-4633, which has been fixed in #7669, where has been corrected to
In the current code, the unaligned int/long read and write functions in BytesUtils process only one bit per loop iteration, which is inefficient. They can be optimized to process one byte per iteration.
The following functions are optimized:
A search of the call sites shows that the optimization affects the reading and writing of data encoded with TS_2DIFF.
In the unit performance test, the optimized functions are about 5x faster than the current ones. In the overall test, the optimization reduces query time by about 40% for uncompressed TS_2DIFF-encoded INT32/INT64 data.
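For readers who want to see the idea in isolation, here is a minimal sketch of the bit-per-iteration versus byte-per-iteration approach described above. It is not the code from this PR: the class `UnalignedBitsSketch` and its method names are hypothetical, and it assumes values are packed MSB first with `width` between 1 and 32.

```java
import java.util.Arrays;

// Hypothetical illustration only; not the actual BytesUtils implementation.
final class UnalignedBitsSketch {

  /** Baseline: writes the low `width` bits of `value`, MSB first, one bit per iteration. */
  static void writeBitByBit(int value, byte[] dst, int bitPos, int width) {
    for (int i = 0; i < width; i++) {
      int bit = (value >>> (width - 1 - i)) & 1;
      int p = bitPos + i;
      int shift = 7 - (p & 7); // position of the target bit inside dst[p >> 3]
      dst[p >> 3] = (byte) ((dst[p >> 3] & ~(1 << shift)) | (bit << shift));
    }
  }

  /** Optimized: each iteration writes every bit that fits into the current byte. */
  static void writeByteWise(int value, byte[] dst, int bitPos, int width) {
    int index = bitPos >> 3;
    int used = bitPos & 7; // bits already occupied in dst[index]
    while (width > 0) {
      int n = Math.min(8 - used, width); // bits written in this iteration
      width -= n;
      int chunk = (value >>> width) & ((1 << n) - 1); // next n bits of value
      int shift = 8 - used - n;
      int mask = ((1 << n) - 1) << shift;
      dst[index] = (byte) ((dst[index] & ~mask) | (chunk << shift));
      index++; // either the byte is full or the loop ends
      used = 0;
    }
  }

  /** Optimized read: consumes up to 8 bits from one byte per iteration. */
  static int readByteWise(byte[] src, int bitPos, int width) {
    int index = bitPos >> 3;
    int used = bitPos & 7;
    int value = 0;
    while (width > 0) {
      int n = Math.min(8 - used, width);
      width -= n;
      int shift = 8 - used - n;
      int chunk = ((src[index] & 0xFF) >>> shift) & ((1 << n) - 1);
      value = (value << n) | chunk;
      index++;
      used = 0;
    }
    return value;
  }

  public static void main(String[] args) {
    byte[] slow = new byte[4];
    byte[] fast = new byte[4];
    writeBitByBit(777, slow, 3, 11);
    writeByteWise(777, fast, 3, 11);
    System.out.println(Arrays.equals(slow, fast));
    System.out.println(readByteWise(fast, 3, 11));
  }
}
```

An 11-bit value needs eleven iterations in the baseline but at most three in the byte-wise version, which is the kind of saving the description refers to. The 5x and 40% figures come from the author's measurements, not from this sketch.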