SST File Online Benchmark
db_bench can run many kinds of tests against a DB, but its source data is randomly generated.
Ordinary performance testing, on the other hand, is a full-link test at the level of the entire DB and cannot drill down into a single SST.
Previously, we added a view of the internal structure of a single SST file to ToplingDB's WebView; it supports the various Topling SSTs as well as BlockBasedTable.
Recently (2023-03-03) we added an online Benchmark function to the WebView of the SST file, which can run multiple performance tests on that SST.
| scan | rev scan | scan value | seek | seek value | rand value
The benchmark creates an Iterator from the SST's TableReader and is controlled by adding the following parameters to the URL:
parameter name | type | default value | description |
---|---|---|---|
bench | enum | null | Allowed values: {scan, seek} |
repeat | int | 1 | The number of repetitions of the entire Benchmark, not of a single operation |
reverse | bool | 0 | Forward (0) or reverse (1) direction; valid for both scan and seek |
rand | bool | 0 | Only valid for seek: first load all keys of the SST into a StrVec, then randomly shuffle the StrVec, then traverse the StrVec sequentially and seek to each key |
point (Note 1) | bool | 1 | Special parameter, only effective when bench=seek, rand=1 and fetch_value <= 1, and only meaningful for ToplingZipTable |
fetch_value | int | 0 | When bench=scan, it is treated as a bool indicating whether to read the value during the scan. When bench=seek (Note 2) and rand=1, it is the number of KVs read sequentially after seeking to a position |
Note 1: The point parameter exists because ToplingZipTable uses a patented PForDelta variant to compress the offsets of values (the length of a value is the difference between adjacent offsets), and the Iterator decompresses PForDelta block by block. For a random Seek, only two entries of the whole decompressed block are actually needed. The PForDelta variant can efficiently decompress just those two entries, which greatly improves lookup performance. When point=1, iter->PointGet is called instead of iter->Seek to take advantage of this. The default implementation of PointGet is built on Seek, so it also works for SSTs other than ToplingZipTable.
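The relationship between offsets and values in Note 1 can be illustrated with a minimal sketch. It uses a plain Python list in place of the PForDelta-compressed offset array and does not implement PForDelta itself; the names `value_at`, `blob` and `offsets` are hypothetical. The point is only that a single lookup touches exactly two adjacent offsets.

```python
def value_at(offsets, blob, i):
    """Return the i-th value; only offsets[i] and offsets[i + 1] are read."""
    begin, end = offsets[i], offsets[i + 1]
    return blob[begin:end]  # length of the value == end - begin


# Build a toy value store: all values concatenated, plus n + 1 offsets.
values = [b"alpha", b"", b"bravo-charlie", b"d"]
blob = b"".join(values)
offsets = [0]
for v in values:
    offsets.append(offsets[-1] + len(v))

print(offsets)
print(value_at(offsets, blob, 2))
```

A block-wise decoder would materialize every offset in the block before answering, while a point lookup only needs the two entries used in `value_at`.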
Note 2: When fetch_value > 1, the number of seeks = entries/fetch_value, where entries is the total number of KVs in the SST file. After each seek, the iterator moves forward (Next) or backward (Prev) fetch_value-1 times according to the reverse parameter, so fetch_value KVs are accessed per seek.
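The access pattern in Note 2 can be simulated on a sorted list of keys. This is a rough sketch, not the benchmark's code: `seek_bench` is a hypothetical name, a binary search stands in for iter->Seek, and two details are assumptions that the text does not state, namely that the seek targets are the first entries/fetch_value keys of the shuffled list and that stepping stops when the iterator runs off either end of the SST.

```python
import bisect
import random


def seek_bench(keys, fetch_value, reverse=False, seed=0):
    """Simulate bench=seek&rand=1 on sorted `keys`; return (seeks, kvs_visited)."""
    per_seek = max(1, fetch_value)
    targets = list(keys)
    random.Random(seed).shuffle(targets)      # rand=1: shuffle all keys
    num_seeks = len(keys) // per_seek         # seeks = entries / fetch_value
    step = -1 if reverse else 1
    visited = 0
    for target in targets[:num_seeks]:        # assumption: first N shuffled keys
        pos = bisect.bisect_left(keys, target)  # stand-in for iter->Seek
        for _ in range(per_seek):             # 1 KV at the seek + fetch_value-1 steps
            if not 0 <= pos < len(keys):
                break                         # iterator is no longer valid
            visited += 1
            pos += step                       # Next or Prev
    return num_seeks, visited


keys = [f"key{i:04d}" for i in range(1000)]
print(seek_bench(keys, fetch_value=10))
print(seek_bench(keys, fetch_value=10, reverse=True))
```

With fetch_value=10 and 1000 entries the sketch performs 100 seeks; the number of KVs visited is at most 1000 and can be slightly lower when a seek lands near the end of the key range.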
When bench=seek and rand=1, the result is output as a table; otherwise it is output as plain text.
The individual link buttons are shortcuts for some combinations of the above URL parameters:
link | parameter combination | description |
---|---|---|
scan | bench=scan | sequential forward scan, do not read value |
rev scan | bench=scan&reverse=1 | sequential backward scan, do not read value |
scan value | bench=scan&fetch_value=1 | sequential forward scan, read value |
seek | bench=seek | seek, do not read value |
seek value | bench=seek&fetch_value=1 | seek, read value |
rand value | bench=seek&fetch_value=1&rand=1 | random point get |
To test seeking to a random point and then scanning forward/backward for several steps, you need to enter the URL parameters yourself (you can use the rand value link as a template and modify it).
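As a convenience, the query string for such a custom run can be assembled programmatically. The sketch below is only an illustration: `bench_query` is a hypothetical helper, the parameter names and defaults are copied from the parameter table, and only the query string is built because the WebView address of the SST file depends on your deployment.

```python
from urllib.parse import urlencode

# Parameter names and default values as listed in the parameter table.
DEFAULTS = {"repeat": 1, "reverse": 0, "rand": 0, "point": 1, "fetch_value": 0}


def bench_query(bench, **params):
    """Build the query string for one benchmark run (hypothetical helper)."""
    if bench not in ("scan", "seek"):
        raise ValueError("bench must be 'scan' or 'seek'")
    unknown = set(params) - set(DEFAULTS)
    if unknown:
        raise ValueError(f"unknown parameters: {sorted(unknown)}")
    query = {"bench": bench}
    for name, default in DEFAULTS.items():
        value = int(params.get(name, default))
        if value != default:  # omit defaults to keep the URL short
            query[name] = value
    return urlencode(query)


# Seek to random positions, then step backward so that 8 KVs are read per seek.
print(bench_query("seek", rand=1, fetch_value=8, reverse=1))
# bench=seek&reverse=1&rand=1&fetch_value=8
```

The resulting string is appended after `?` to the URL of the SST file's WebView page, in the same way as the shortcut links.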
In this performance test, fetch_value is implemented through iter->PrepareValue().
BlockBasedTable's PrepareValue works at Block granularity: when the iter is at the first entry of a Block, it reads and decompresses the entire Block at once, so reading the other values of that Block afterwards requires no further work. As a result, the time measured for the fetch_value stage is close to 0.
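The block-granularity behaviour can be mimicked with a toy reader. This is a sketch of the idea only and not BlockBasedTable's actual code or format: the class `BlockReader`, the newline-joined block layout and the use of zlib are all assumptions made for illustration. The whole block is decompressed on the first access, so the remaining values in the same block cost almost nothing.

```python
import zlib


class BlockReader:
    """Toy reader: values are grouped into zlib-compressed blocks."""

    def __init__(self, values, per_block):
        self.per_block = per_block
        # Toy layout: values of one block joined by b"\n" (values must not contain it).
        self.blocks = [
            zlib.compress(b"\n".join(values[i:i + per_block]))
            for i in range(0, len(values), per_block)
        ]
        self.loaded_index = None   # index of the block currently decompressed
        self.loaded_values = []
        self.decompressions = 0    # how many times a block was decompressed

    def value(self, i):
        block_index, slot = divmod(i, self.per_block)
        if block_index != self.loaded_index:
            # First access to this block: decompress all of it at once.
            raw = zlib.decompress(self.blocks[block_index])
            self.loaded_values = raw.split(b"\n")
            self.loaded_index = block_index
            self.decompressions += 1
        return self.loaded_values[slot]  # later values in the block are free


values = [b"value-%d" % i for i in range(10)]
reader = BlockReader(values, per_block=4)
for i in range(len(values)):
    reader.value(i)
print(reader.decompressions)  # 3 blocks for 10 values, 4 per block
```

In a sequential scan the decompression cost is paid once per block rather than once per value, which is why timing the value fetch separately shows almost no cost.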
ToplingZipTable's PrepareValue faithfully reads each value:
- If the value is not compressed, Zero Copy is used to directly return the memory range of the value within the mmap of the SST file
- If the value is compressed, it is decompressed on the spot; the decompression throughput is generally above 1GB per second
- This approach is feasible because on-the-spot decompression is fast enough. For example, for a single 500-byte value, decompression takes only about 500 nanoseconds
The other Topling SSTs are not compressed, so their PrepareValue is Zero Copy
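The 500-nanosecond figure for ToplingZipTable follows directly from the throughput. A small worked calculation, assuming 1GB means 10^9 bytes and using the hypothetical helper name `decompress_ns`:

```python
def decompress_ns(value_bytes, throughput_bytes_per_sec):
    """Time in nanoseconds to decompress one value at the given throughput."""
    return value_bytes * 1_000_000_000 / throughput_bytes_per_sec


# 500 bytes at 1 GB/s (assumed to be 10**9 bytes per second)
print(decompress_ns(500, 10**9))  # 500.0
```

Since 1GB per second is described as a lower bound, the per-value cost at higher throughput is correspondingly smaller.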