
feat: new crate trace_metric for collecting metrics in read procedure #714

Merged
merged 23 commits into apache:main on Mar 14, 2023

Conversation

@ShiKaiWi (Member) commented Mar 8, 2023

Which issue does this PR close?

Closes #696

Rationale for this change

Refer to #696

What changes are included in this PR?

  • Add a new trace_metric crate for collecting metrics across the codebase;
  • Add a proc macro to simplify metrics collection;
  • Feed the metrics collected during the scan procedure into the DataFusion analyze framework.
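The collector described above forms a tree of named metric scopes. Below is a minimal, hypothetical sketch of the idea; the types `MetricsCollector`/`Metric` and their methods are simplified assumptions for illustration, not the actual `trace_metric` API:

```rust
use std::cell::RefCell;
use std::rc::Rc;

// One named metric value. Durations, counts and flags are stored as strings
// here for simplicity; the real crate distinguishes metric types.
#[derive(Debug, Clone)]
struct Metric {
    name: String,
    value: String,
}

// A collector owning its metrics and child collectors, forming the tree
// seen in the EXPLAIN ANALYZE output (scan_table -> merge_iter_N -> ...).
#[derive(Clone)]
struct MetricsCollector {
    name: String,
    metrics: Rc<RefCell<Vec<Metric>>>,
    children: Rc<RefCell<Vec<MetricsCollector>>>,
}

impl MetricsCollector {
    fn new(name: &str) -> Self {
        Self {
            name: name.to_string(),
            metrics: Rc::new(RefCell::new(Vec::new())),
            children: Rc::new(RefCell::new(Vec::new())),
        }
    }

    // Record a single metric on this scope.
    fn collect(&self, name: &str, value: impl ToString) {
        self.metrics.borrow_mut().push(Metric {
            name: name.to_string(),
            value: value.to_string(),
        });
    }

    // Open a nested child scope and return a handle to it.
    fn span(&self, name: &str) -> MetricsCollector {
        let child = MetricsCollector::new(name);
        self.children.borrow_mut().push(child.clone());
        child
    }
}

fn main() {
    let root = MetricsCollector::new("scan_table");
    root.collect("do_merge_sort", true);
    let iter = root.span("merge_iter_0");
    iter.collect("num_ssts", 1);
    println!("collected {} root metrics", root.metrics.borrow().len());
}
```

The proc macro mentioned above can then generate the `collect` calls (e.g. when an instrumented struct is dropped) instead of writing them by hand.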

Are there any user-facing changes?

When running explain analyze [sql], the output includes metrics about the table scan. Here is an example:

2023-03-14 18:14:47.699 INFO [query_engine/src/executor.rs:120] Executor executed plan, request_id:3, cost:9889ms, plan_and_metrics: AnalyzeExec verbose=false, metrics=[]
  CoalescePartitionsExec, metrics=[output_rows=7200000, elapsed_compute=463.779µs, spill_count=0, spilled_bytes=0, mem_used=0]
    ProjectionExec: expr=[timestamp@0 as timestamp, tsid@1 as tsid, hostname@2 as hostname, region@3 as region, datacenter@4 as datacenter, rack@5 as rack, os@6 as os, arch@7 as arch, team@8 as team, service@9 as service, service_version@10 as service_version, service_environment@11 as service_environment, usage_user@12 as usage_user, usage_system@13 as usage_system, usage_idle@14 as usage_idle, usage_nice@15 as usage_nice, usage_iowait@16 as usage_iowait, usage_irq@17 as usage_irq, usage_softirq@18 as usage_softirq, usage_steal@19 as usage_steal, usage_guest@20 as usage_guest, usage_guest_nice@21 as usage_guest_nice], metrics=[output_rows=7200000, elapsed_compute=231.975134ms, spill_count=0, spilled_bytes=0, mem_used=0]
      ScanTable: table=cpu, parallelism=8, order=None, , metrics=[
scan_table:
    do_merge_sort=true
    iter_num=6
    merge_iter_0:
        num_memtables=0
        num_ssts=1
        total_rows_fetch_from_one=1200000
        times_fetch_rows_from_one=2490
        times_fetch_row_from_multiple=0
        init_duration=70.616µs
        scan_duration=2.585217663s
        scan_count=148
        scan_sst_311:
            meta_data_cache_hit=false
            read_meta_data_duration=131.260455ms
            parallelism=8
            prune_row_groups:
                use_custom_filter=true
                total_row_groups=147
                row_groups_after_prune=147
                pruned_by_custom_filter=0
                pruned_by_min_max=0
    merge_iter_1:
        num_memtables=0
        num_ssts=1
        total_rows_fetch_from_one=1200000
        times_fetch_rows_from_one=2490
        times_fetch_row_from_multiple=0
        init_duration=41.87µs
        scan_duration=3.035329703s
        scan_count=148
        scan_sst_339:
            meta_data_cache_hit=false
            read_meta_data_duration=158.434246ms
            parallelism=8
            prune_row_groups:
                use_custom_filter=true
                total_row_groups=147
                row_groups_after_prune=147
                pruned_by_custom_filter=0
                pruned_by_min_max=0
    merge_iter_2:
        num_memtables=0
        num_ssts=1
        total_rows_fetch_from_one=1200000
        times_fetch_rows_from_one=2490
        times_fetch_row_from_multiple=0
        init_duration=52.026µs
        scan_duration=2.954758353s
        scan_count=148
        scan_sst_367:
            meta_data_cache_hit=false
            read_meta_data_duration=140.372347ms
            parallelism=8
            prune_row_groups:
                use_custom_filter=true
                total_row_groups=147
                row_groups_after_prune=147
                pruned_by_custom_filter=0
                pruned_by_min_max=0
    merge_iter_3:
        num_memtables=0
        num_ssts=1
        total_rows_fetch_from_one=1200000
        times_fetch_rows_from_one=2490
        times_fetch_row_from_multiple=0
        init_duration=40.135µs
        scan_duration=3.756076722s
        scan_count=148
        scan_sst_397:
            meta_data_cache_hit=false
            read_meta_data_duration=153.09233ms
            parallelism=8
            prune_row_groups:
                use_custom_filter=true
                total_row_groups=147
                row_groups_after_prune=147
                pruned_by_custom_filter=0
                pruned_by_min_max=0
    merge_iter_4:
        num_memtables=0
        num_ssts=1
        total_rows_fetch_from_one=1200000
        times_fetch_rows_from_one=2490
        times_fetch_row_from_multiple=0
        init_duration=38.686µs
        scan_duration=3.495850882s
        scan_count=148
        scan_sst_424:
            meta_data_cache_hit=false
            read_meta_data_duration=197.153733ms
            parallelism=8
            prune_row_groups:
                use_custom_filter=true
                total_row_groups=147
                row_groups_after_prune=147
                pruned_by_custom_filter=0
                pruned_by_min_max=0
    merge_iter_5:
        num_memtables=1
        num_ssts=2
        total_rows_fetch_from_one=1097729
        times_fetch_rows_from_one=2279
        times_fetch_row_from_multiple=202271
        init_duration=638.566019ms
        scan_duration=3.570071905s
        scan_count=181
        scan_memtable_2160:
        scan_sst_438:
            meta_data_cache_hit=false
            read_meta_data_duration=188.863397ms
            parallelism=8
            prune_row_groups:
                use_custom_filter=true
                total_row_groups=147
                row_groups_after_prune=147
                pruned_by_custom_filter=0
                pruned_by_min_max=0
        scan_sst_439:
            meta_data_cache_hit=false
            read_meta_data_duration=11.038658ms
            parallelism=8
            prune_row_groups:
                use_custom_filter=true
                total_row_groups=11
                row_groups_after_prune=11
                pruned_by_custom_filter=0
                pruned_by_min_max=0
=0]
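The indented tree above is produced by walking the collector hierarchy and printing each scope one indent level deeper than its parent. A rough sketch of such a formatter (the actual implementation is the FormatCollectorVisitor in components/trace_metric; the `MetricNode` type and function below are illustrative assumptions):

```rust
// Illustrative owned tree of metric scopes; not the actual crate types.
struct MetricNode {
    name: String,
    metrics: Vec<(String, String)>,
    children: Vec<MetricNode>,
}

// Render a scope as `name:` followed by indented `key=value` lines, then
// recurse into children one indent level (4 spaces) deeper, matching the
// layout of the EXPLAIN ANALYZE output above.
fn format_metrics(node: &MetricNode, level: usize, out: &mut String) {
    let pad = "    ".repeat(level);
    out.push_str(&pad);
    out.push_str(&node.name);
    out.push_str(":\n");
    for (k, v) in &node.metrics {
        out.push_str(&format!("{pad}    {k}={v}\n"));
    }
    for child in &node.children {
        format_metrics(child, level + 1, out);
    }
}

fn main() {
    let tree = MetricNode {
        name: "scan_table".to_string(),
        metrics: vec![("do_merge_sort".to_string(), "true".to_string())],
        children: vec![MetricNode {
            name: "merge_iter_0".to_string(),
            metrics: vec![("num_ssts".to_string(), "1".to_string())],
            children: vec![],
        }],
    };
    let mut out = String::new();
    format_metrics(&tree, 0, &mut out);
    print!("{out}");
}
```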

How is this change tested?

Existing tests.

@ShiKaiWi ShiKaiWi marked this pull request as draft March 8, 2023 13:36
codecov-commenter commented Mar 10, 2023

Codecov Report

Merging #714 (b6ae52c) into main (a803efc) will increase coverage by 0.20%.
The diff coverage is 78.72%.

❗ Current head b6ae52c differs from pull request most recent head a5b82fd. Consider uploading reports for the commit a5b82fd to get more accurate results

@@            Coverage Diff             @@
##             main     #714      +/-   ##
==========================================
+ Coverage   68.26%   68.47%   +0.20%     
==========================================
  Files         287      294       +7     
  Lines       45058    45675     +617     
==========================================
+ Hits        30761    31276     +515     
- Misses      14297    14399     +102     
Impacted Files Coverage Δ
analytic_engine/src/memtable/mod.rs 81.25% <ø> (ø)
analytic_engine/src/row_iter/chain.rs 55.80% <0.00%> (-0.76%) ⬇️
server/src/handlers/error.rs 0.00% <ø> (ø)
server/src/http.rs 0.00% <0.00%> (ø)
table_engine/src/table.rs 57.80% <0.00%> (-0.25%) ⬇️
tools/src/bin/sst-convert.rs 1.23% <0.00%> (-0.02%) ⬇️
table_engine/src/provider.rs 68.60% <11.76%> (-2.67%) ⬇️
server/src/handlers/influxdb.rs 67.02% <67.02%> (ø)
components/trace_metric/src/metric.rs 75.00% <75.00%> (ø)
components/trace_metric_derive/src/builder.rs 98.03% <98.03%> (ø)
... and 16 more

... and 1 file with indirect coverage changes


@ShiKaiWi ShiKaiWi changed the title feat: add ReadMetricsCollector for collecting metrics in read procedure feat: new crate trace_metric for collecting metrics in read procedure Mar 13, 2023
@ShiKaiWi ShiKaiWi marked this pull request as ready for review March 13, 2023 12:19
Files with resolved review discussion:
  • components/trace_metric_derive_tests/src/lib.rs
  • components/trace_metric/src/collector.rs
  • components/trace_metric/src/metric.rs
  • analytic_engine/src/sst/parquet/row_group_pruner.rs
  • analytic_engine/src/row_iter/merge.rs
@jiacai2050 (Contributor) commented:

When using explain analyze [sql], it will output the metrics about the scan table.

Please give an example.

@jiacai2050 (Contributor) left a comment:

👍

@ShiKaiWi ShiKaiWi added this pull request to the merge queue Mar 14, 2023
Merged via the queue into apache:main with commit c0494fe Mar 14, 2023
chunshao90 pushed a commit to chunshao90/ceresdb that referenced this pull request May 15, 2023
…re (apache#714)

* feat: add ReadMetricsCollector for collecting metrics in read procedure

* collect metrics during merge

* add new component trace_metric

* add derive crate

* impl proc macro for trace_metric

* use the trace_metric crate

* support optional collector

* fix license

* make collector is optional

* trace metrics in parquet reader

* rename Collector to MetricsCollector

* rename TracedMetrics to TraceMetricWhenDrop

* support FormatCollectorVisitor

* set the metrics into datafusion

* remove useless file

* remove useless module

* rename constant variable

* fix wrong field name in metrics

* address CR

* attach index to the merge iter's metrics collector

* rename some constant variables

* fix wrong variable name

* rename metric type
@ShiKaiWi ShiKaiWi deleted the feat-analyze-metrics branch May 29, 2023 08:35
Development

Successfully merging this pull request may close these issues.

Collect scan metrics and feed them into the datafusion analyze framework
4 participants