enhance: add delta log stream new format reader and writer #34116

Merged
merged 1 commit into from
Jul 6, 2024

Conversation

@shaoting-huang shaoting-huang commented Jun 24, 2024

issue: #34123

Benchmark case: The benchmark runs the Go benchmark function BenchmarkDeltalogFormat, which is included in this PR's changed files. It measures serialization and deserialization performance for the two data formats on a dataset of 10 million delete logs.

Metrics: The benchmarks measure the average time taken per operation (ns/op), memory allocated per operation (MB/op), and the number of memory allocations per operation (allocs/op).
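
For readers unfamiliar with how these three metrics are collected, here is a minimal sketch using Go's `testing.Benchmark`. It is not the PR's BenchmarkDeltalogFormat: the `deleteRecord` type, the "pk,ts" string layout, and all function names are hypothetical, chosen only to contrast a one-string-per-record encoding with separate primary-key and timestamp columns.

```go
package main

import (
	"errors"
	"fmt"
	"strconv"
	"strings"
	"testing"
)

// deleteRecord is a hypothetical delete-log entry (not the Milvus type).
type deleteRecord struct {
	PK int64
	TS uint64
}

// encodeOneString packs each record into one "pk,ts" string (assumed layout).
func encodeOneString(recs []deleteRecord) []string {
	rows := make([]string, len(recs))
	for i, r := range recs {
		rows[i] = strconv.FormatInt(r.PK, 10) + "," + strconv.FormatUint(r.TS, 10)
	}
	return rows
}

// decodeOneString parses the "pk,ts" strings back into records.
func decodeOneString(rows []string) ([]deleteRecord, error) {
	recs := make([]deleteRecord, len(rows))
	for i, row := range rows {
		pk, ts, ok := strings.Cut(row, ",")
		if !ok {
			return nil, fmt.Errorf("row %d: missing separator", i)
		}
		var err error
		if recs[i].PK, err = strconv.ParseInt(pk, 10, 64); err != nil {
			return nil, err
		}
		if recs[i].TS, err = strconv.ParseUint(ts, 10, 64); err != nil {
			return nil, err
		}
	}
	return recs, nil
}

// encodeSeparate stores primary keys and timestamps in two parallel columns.
func encodeSeparate(recs []deleteRecord) ([]int64, []uint64) {
	pks := make([]int64, len(recs))
	tss := make([]uint64, len(recs))
	for i, r := range recs {
		pks[i], tss[i] = r.PK, r.TS
	}
	return pks, tss
}

// decodeSeparate zips the two columns back into records.
func decodeSeparate(pks []int64, tss []uint64) ([]deleteRecord, error) {
	if len(pks) != len(tss) {
		return nil, errors.New("column length mismatch")
	}
	recs := make([]deleteRecord, len(pks))
	for i := range pks {
		recs[i] = deleteRecord{PK: pks[i], TS: tss[i]}
	}
	return recs, nil
}

func main() {
	testing.Init() // registers testing flags; needed outside "go test"
	recs := make([]deleteRecord, 100_000)
	for i := range recs {
		recs[i] = deleteRecord{PK: int64(i), TS: uint64(i) + 1}
	}
	// report runs fn under the benchmark harness and prints the three metrics.
	report := func(name string, fn func()) {
		res := testing.Benchmark(func(b *testing.B) {
			for i := 0; i < b.N; i++ {
				fn()
			}
		})
		fmt.Printf("%-16s %12d ns/op %12d B/op %10d allocs/op\n",
			name, res.NsPerOp(), res.AllocedBytesPerOp(), res.AllocsPerOp())
	}
	report("one_string", func() {
		if _, err := decodeOneString(encodeOneString(recs)); err != nil {
			panic(err)
		}
	})
	report("pk_ts_separate", func() {
		if _, err := decodeSeparate(encodeSeparate(recs)); err != nil {
			panic(err)
		}
	})
}
```

The absolute numbers from this toy depend on the machine and say nothing about the PR's results; it only shows where ns/op, bytes/op, and allocs/op come from.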

| Test Name | Avg Time (ns/op) | Time Comparison | Memory Allocation (MB/op) | Memory Comparison | Allocation Count (allocs/op) | Allocation Comparison |
| --- | --- | --- | --- | --- | --- | --- |
| one_string_format_reader | 2,781,990,000 | Baseline | 2,422 | Baseline | 20,336,539 | Baseline |
| pk_ts_separate_format_reader | 480,682,639 | -82.72% | 1,765 | -27.14% | 20,396,958 | +0.30% |
| one_string_format_writer | 5,483,436,041 | Baseline | 13,900 | Baseline | 70,057,473 | Baseline |
| pk_and_ts_separate_format_writer | 798,591,584 | -85.43% | 2,178 | -84.34% | 30,270,488 | -56.78% |

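Both read and write operations improve significantly in speed and memory allocation: reads take about 83% less time and allocate about 27% less memory, while writes take about 85% less time, allocate about 84% less memory, and make about 57% fewer allocations. The reader's allocation count is essentially unchanged (+0.30%).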
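
The comparison columns appear to be the relative change against the baseline row, (new - baseline) / baseline. A small sketch of that arithmetic, using the reader timings from the table; the helper name `percentChange` is made up for illustration:

```go
package main

import "fmt"

// percentChange returns the relative change from base to value, in percent.
// A negative result means value is smaller than the baseline.
func percentChange(base, value float64) float64 {
	return (value - base) / base * 100
}

func main() {
	// Reader timings (ns/op): one_string_format_reader vs pk_ts_separate_format_reader.
	fmt.Printf("%+.2f%%\n", percentChange(2781990000, 480682639)) // prints -82.72%
}
```

Some of the other cells may differ from this formula in the last digit, presumably because they were computed from unrounded measurements or truncated rather than rounded.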

@sre-ci-robot sre-ci-robot added the size/L Denotes a PR that changes 100-499 lines. label Jun 24, 2024
@mergify mergify bot added the dco-passed DCO check passed. label Jun 24, 2024

mergify bot commented Jun 24, 2024

@shaoting-huang

Invalid PR Title Format Detected

Your PR submission does not adhere to our required standards. To ensure clarity and consistency, please meet the following criteria:

  1. Title Format: The PR title must begin with one of these prefixes:
     • feat: for introducing a new feature.
     • fix: for bug fixes.
     • enhance: for improvements to existing functionality.
     • test: for adding tests to existing functionality.
     • doc: for modifying documentation.
     • auto: for pull requests from bots.
  2. Description Requirement: The PR must include a non-empty description detailing the changes and their impact.

Required Title Structure:

[Type]: [Description of the PR]

Where Type is one of feat, fix, enhance, test or doc.

Example:

enhance: improve search performance significantly 

Please review and update your PR to comply with these guidelines.
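
The title rule above can be approximated with a short check. This is only a sketch of the stated criteria, not the bot's actual implementation; `allowedPrefixes` and `validTitle` are hypothetical names.

```go
package main

import (
	"fmt"
	"strings"
)

// allowedPrefixes mirrors the prefixes listed in the bot comment.
var allowedPrefixes = []string{"feat", "fix", "enhance", "test", "doc", "auto"}

// validTitle reports whether title has the form "<prefix>: <description>"
// with an allowed prefix and a non-empty description.
func validTitle(title string) bool {
	prefix, desc, ok := strings.Cut(title, ":")
	if !ok || strings.TrimSpace(desc) == "" {
		return false
	}
	for _, p := range allowedPrefixes {
		if prefix == p {
			return true
		}
	}
	return false
}

func main() {
	// The original title lacks a prefix; the updated one has "enhance:".
	fmt.Println(validTitle("add delta log stream new format reader and writer"))
	fmt.Println(validTitle("enhance: add delta log stream new format reader and writer"))
}
```

Under this sketch the original title is rejected and the retitled one is accepted, which matches the label change later in the thread.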

codecov bot commented Jun 24, 2024

Codecov Report

Attention: Patch coverage is 82.25806% with 44 lines in your changes missing coverage. Please review.

Project coverage is 80.64%. Comparing base (d51d095) to head (f4c6a46).
Report is 20 commits behind head on master.

Additional details and impacted files

@@            Coverage Diff             @@
##           master   #34116      +/-   ##
==========================================
- Coverage   80.86%   80.64%   -0.23%     
==========================================
  Files        1094     1117      +23     
  Lines      137817   138814     +997     
==========================================
+ Hits       111447   111941     +494     
- Misses      22142    22605     +463     
- Partials     4228     4268      +40     
| Files | Coverage Δ |
| --- | --- |
| internal/storage/event_data.go | 92.46% <ø> (ø) |
| internal/storage/event_writer.go | 83.73% <100.00%> (+0.64%) ⬆️ |
| internal/storage/serde.go | 94.16% <93.54%> (-0.06%) ⬇️ |
| internal/storage/serde_events.go | 77.01% <79.90%> (+0.67%) ⬆️ |

... and 80 files with indirect coverage changes

@shaoting-huang shaoting-huang changed the title add delta log stream new format reader and writer enhance: add delta log stream new format reader and writer Jun 25, 2024
@mergify mergify bot added kind/enhancement Issues or changes related to enhancement and removed ci-passed do-not-merge/invalid-pr-format labels Jun 25, 2024
@xiaofan-luan

  1. what is the performance comparison?

@mergify mergify bot added ci-passed and removed ci-passed labels Jun 27, 2024
@shaoting-huang shaoting-huang force-pushed the delta_format branch 4 times, most recently from 87604a5 to 6bb784b Compare June 27, 2024 07:18
@mergify mergify bot added the ci-passed label Jun 27, 2024
@sre-ci-robot sre-ci-robot added size/XL Denotes a PR that changes 500-999 lines. and removed size/L Denotes a PR that changes 100-499 lines. labels Jun 27, 2024
@mergify mergify bot removed the ci-passed label Jun 27, 2024
mergify bot commented Jun 27, 2024

@shaoting-huang The E2E Jenkins job failed. Comment /run-cpu-e2e to trigger the job again.

@xiaofan-luan

https://iceberg.apache.org/spec/

Wondering whether we should just follow the Iceberg standard.

@shaoting-huang shaoting-huang force-pushed the delta_format branch 2 times, most recently from bc88066 to 76cf96a Compare June 28, 2024 08:09
@mergify mergify bot added ci-passed and removed ci-passed labels Jun 28, 2024
mergify bot commented Jul 3, 2024

@shaoting-huang The E2E Jenkins job failed. Comment /run-cpu-e2e to trigger the job again.

Signed-off-by: shaoting-huang <shaoting.huang@zilliz.com>
@xiaofan-luan

/lgtm
/approve

@sre-ci-robot

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: shaoting-huang, xiaofan-luan

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@sre-ci-robot sre-ci-robot merged commit f4dd7c7 into milvus-io:master Jul 6, 2024
12 checks passed
Labels
approved, ci-passed, dco-passed (DCO check passed.), kind/enhancement (Issues or changes related to enhancement), lgtm, size/XL (Denotes a PR that changes 500-999 lines.)
4 participants