
support read parallel. #774

Open
ColinLeeo wants to merge 2 commits into develop from read_parallel


@ColinLeeo
Contributor

No description provided.

@codecov-commenter

codecov-commenter commented Apr 11, 2026

Codecov Report

❌ Patch coverage is 32.79883% with 461 lines in your changes missing coverage. Please review.
✅ Project coverage is 62.02%. Comparing base (3c809d5) to head (7ba1919).
⚠️ Report is 1 commit behind head on develop.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| cpp/src/reader/aligned_chunk_reader.cc | 36.90% | 280 Missing and 38 partials ⚠️ |
| cpp/src/file/tsfile_io_reader.cc | 0.00% | 65 Missing ⚠️ |
| cpp/src/reader/tsfile_series_scan_iterator.cc | 38.66% | 42 Missing and 4 partials ⚠️ |
| cpp/src/common/tsfile_common.h | 4.76% | 20 Missing ⚠️ |
| cpp/src/reader/aligned_chunk_reader.h | 40.00% | 6 Missing ⚠️ |
| cpp/src/reader/tsfile_series_scan_iterator.h | 14.28% | 5 Missing and 1 partial ⚠️ |

Additional details and impacted files
@@             Coverage Diff             @@
##           develop     #774      +/-   ##
===========================================
- Coverage    62.63%   62.02%   -0.62%     
===========================================
  Files          706      706              
  Lines        42688    43134     +446     
  Branches      6296     6419     +123     
===========================================
+ Hits         26739    26754      +15     
- Misses       14963    15387     +424     
- Partials       986      993       +7     


ColinLeeo added a commit that referenced this pull request May 26, 2026
Brings together batch decode infrastructure, multi-value aligned read,
parallel page decode, columnar tablet write, and SIMD micro-optimizations
from the long-lived `final` branch into a single review-ready change.

This change is a code snapshot, not a replay of `final` commit history --
the upstream history was a long sequence of WIP commits that wasn't
fit for review. Supersedes #749, #754, #774.

Read path
- Decoder base gains batch APIs (read_batch_int32/int64/float/double,
  skip_*); PLAIN, TS2DIFF, Gorilla decoders implement them. TS2DIFF
  has block-level peeking so time filters can skip blocks without
  decoding. Gorilla adds a raw-pointer GorillaBitReader that bypasses
  ByteStream overhead.
- ChunkReader / AlignedChunkReader add *_DECODE_TV_BATCH methods that
  decode time + value into a TsBlock in one pass, applying batch time
  filters before append.
- AlignedChunkReader supports a multi-value mode: one time chunk + N
  value chunks decoded in a single pass, sharing the decoded timestamps
  and filter mask. SingleDeviceTsBlockReader auto-detects same-device
  measurements via VectorMeasurementColumnContext.
- Optional page-level parallel decompression via a DecodeThreadPool +
  BlockingQueue when ENABLE_THREADS is set. Page-plan classification
  (SKIP / FULL_PASS / BOUNDARY) lets a scatter-free memcpy fast path
  fire when every row passes and no column has nulls.
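
The page-plan classification described above can be sketched as follows. This is a minimal illustration, assuming each page carries min/max timestamp statistics and the time filter is a single closed range; `PagePlan`, `TimeRange`, `classify_page`, and `can_use_memcpy_fast_path` are hypothetical names, not the identifiers used in the change.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical types for illustration only.
enum class PagePlan { SKIP, FULL_PASS, BOUNDARY };

struct TimeRange {
    int64_t start;  // inclusive
    int64_t end;    // inclusive
};

// Classify a page by comparing its timestamp statistics with the filter range.
inline PagePlan classify_page(const TimeRange& page, const TimeRange& filter) {
    assert(page.start <= page.end && filter.start <= filter.end);
    // No overlap: the page can be skipped without decoding.
    if (page.end < filter.start || page.start > filter.end) {
        return PagePlan::SKIP;
    }
    // Fully contained: every row passes the time filter.
    if (page.start >= filter.start && page.end <= filter.end) {
        return PagePlan::FULL_PASS;
    }
    // Partial overlap: rows must be filtered individually.
    return PagePlan::BOUNDARY;
}

// The bulk-copy path applies only when every row passes and no column has nulls.
inline bool can_use_memcpy_fast_path(PagePlan plan, bool any_column_has_nulls) {
    return plan == PagePlan::FULL_PASS && !any_column_has_nulls;
}
```

Only BOUNDARY pages need a per-row filter mask in this sketch; SKIP pages are never decoded and FULL_PASS pages can be appended wholesale.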

Write path
- ValuePageWriter gains write_batch / write_string_batch that take
  timestamp+value+nullness arrays directly, removing the per-value
  append loop. Tablet exposes set_timestamps / set_column_values /
  set_column_string_repeated / reset for bulk reuse and switches
  StringColumn to an Arrow-compatible offset+buffer layout.
- TS2DIFFEncoder::flush now packs all deltas with a single
  pack_bits_msb + write_buf instead of per-value write_bits, falling
  back to the scalar path for the rare bit_width > 56 case.
- Int64Statistic gains update_batch (NEON-accelerated min/max/sum).
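
As a rough illustration of the single-pass delta packing mentioned for TS2DIFFEncoder::flush, the sketch below packs fixed-width values MSB-first through a 64-bit accumulator. `pack_bits_msb_sketch` is a hypothetical helper with an assumed signature, not the function in the change. With bit_width <= 56, at most 7 leftover bits plus one new value always fit in the accumulator, which is one plausible reason for a scalar fallback above that width.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: pack the low `bit_width` bits of each value, MSB-first.
inline std::vector<uint8_t> pack_bits_msb_sketch(
    const std::vector<uint64_t>& values, unsigned bit_width) {
    assert(bit_width >= 1 && bit_width <= 56);
    std::vector<uint8_t> out;
    out.reserve((values.size() * bit_width + 7) / 8);

    const uint64_t mask = (uint64_t(1) << bit_width) - 1;
    uint64_t acc = 0;      // the low `pending` bits are not yet written
    unsigned pending = 0;  // always < 8 between iterations

    for (uint64_t v : values) {
        acc = (acc << bit_width) | (v & mask);
        pending += bit_width;
        // Emit whole bytes, most significant first.
        while (pending >= 8) {
            pending -= 8;
            out.push_back(static_cast<uint8_t>(acc >> pending));
        }
    }
    // Left-align any trailing bits in a final byte, zero-padded on the right.
    if (pending > 0) {
        out.push_back(static_cast<uint8_t>(acc << (8 - pending)));
    }
    return out;
}
```

Collecting the bytes in one buffer lets the caller hand them to a single bulk write rather than issuing one bit-level write per value.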

Encoding / SIMD
- TS2DIFF batch decode adds AVX2 helpers via SIMDe (already on develop)
  for both i32 and i64; scalar fallback unchanged.
- PLAIN byte-swap path uses ARM NEON (vrev64q_u8 / vrev32q_u8) when
  available, falling back to __builtin_bswap.
- CMakeLists adds ENABLE_SIMD and turns on -O3 -march=native -flto in
  Release builds.
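
The scalar fallback for the PLAIN byte-swap path might look like the sketch below. It assumes PLAIN stores 64-bit values big-endian and that the compiler is GCC or Clang (for `__builtin_bswap64` and the `__BYTE_ORDER__` macros); `decode_be_int64_batch` is a hypothetical name, and the NEON variant is not shown.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical helper: decode `count` big-endian int64 values from `src`.
inline void decode_be_int64_batch(const uint8_t* src, int64_t* dst,
                                  size_t count) {
    for (size_t i = 0; i < count; ++i) {
        uint64_t raw;
        // memcpy avoids unaligned loads and strict-aliasing problems.
        std::memcpy(&raw, src + i * sizeof(raw), sizeof(raw));
#if defined(__BYTE_ORDER__) && (__BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__)
        // On little-endian hosts the stored bytes must be reversed.
        raw = __builtin_bswap64(raw);
#endif
        std::memcpy(&dst[i], &raw, sizeof(raw));
    }
}
```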

Allocator / ByteStream
- ByteStream caches page_mask_ (= page_size - 1) so the hot path uses
  a bitmask instead of modulo; wrap_from rounds buffer sizes up to a
  power of two so the mask remains correct. total_size_ widened to
  uint64_t to support files > 4GB.
- UncompressedCompressor now copies its output instead of aliasing
  caller buffers, letting callers free input safely.
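
The mask trick relies on the identity `x % n == (x & (n - 1))`, which holds only when n is a power of two; that is why the size must be rounded up first. A minimal sketch, with `round_up_pow2` and `PageIndexer` as hypothetical names rather than the actual ByteStream members:

```cpp
#include <cassert>
#include <cstdint>

// Smallest power of two that is >= n (n must be in [1, 2^31]).
inline uint32_t round_up_pow2(uint32_t n) {
    assert(n > 0 && n <= (1u << 31));
    uint32_t p = 1;
    while (p < n) {
        p <<= 1;
    }
    return p;
}

// Hypothetical stand-in for the cached page_size / page_mask pair.
struct PageIndexer {
    uint32_t page_size;
    uint32_t page_mask;  // page_size - 1, valid because page_size is 2^k

    explicit PageIndexer(uint32_t requested_size)
        : page_size(round_up_pow2(requested_size)),
          page_mask(page_size - 1) {}

    // Equivalent to pos % page_size, without a division on the hot path.
    uint32_t offset_in_page(uint64_t pos) const {
        return static_cast<uint32_t>(pos & page_mask);
    }
};
```

The position is held in a 64-bit integer here to mirror the widened total size, so offsets in files larger than 4GB still map to the correct in-page offset.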

C wrapper / Arrow
- Trimmed unused metadata-export surface (TsFileStatisticBase,
  TimeseriesMetadata, DeviceTimeseriesMetadataEntry, tag-filter handles)
  out of the public C API. Internal tag filtering is unaffected.
- arrow_c.cc simplified: per-row offset handling for sliced
  variable-length arrays in place of the InvertArrowBitmap copy.

Tests / benchmarks
- New tsfile_reader_table_batch_test.cc covers the TsBlock batch read
  path. gorilla_codec_test.cc adds Int32/Int64/Float batch decode
  tests. examples/cpp_examples adds bench_read.cpp/.h and an
  examples/read_perf_compare/ target.
- Removed cwrapper_metadata_test.cc and common/path.cc (Path bodies
  inlined into path.h; the C metadata API they covered is gone).

Compatibility
- All new C++ methods are additions; no existing C++ API was removed.
- C wrapper headers lost the metadata export / tag filter symbols
  listed above -- downstream callers (Python wrapper in particular)
  will want a sanity check before merge.
- cpp/third_party/ intentionally left at develop's state so the
  recent MSVC compatibility fixes (WITH_STATIC_CRT OFF, CMP0054 NEW,
  CMAKE_POLICY_VERSION_MINIMUM=3.5, _MSC_VER guards) are preserved.

Verification
- cmake configure + make -j on macOS arm64 (AppleClang, C++11) builds
  cleanly: libtsfile.2.2.1.dev.dylib and TsFile_Test both link, zero
  errors, only unused-lambda-capture warnings in pre-existing tests.
- Full TsFile_Test run and downstream Python binding load are left as
  pre-merge checks.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>