
sync to 1.2, re-pr: remove the old fast ranges and update metrics #16440

Conversation

@gouhongshen (Contributor) commented May 28, 2024

User description

What type of PR is this?

  • API-change
  • BUG
  • Improvement
  • Documentation
  • Feature
  • Test and CI
  • Code Refactoring

Which issue(s) this PR fixes:

issue #16146

What this PR does / why we need it:

  1. Remove the old fast path.
  2. Update ranges metrics.
  3. Remove some unused code related to un-serialized composite primary keys.

PR Type

Enhancement, Bug fix


Description

  • Removed old fast path logic and metrics.
  • Updated Grafana dashboard to include new transaction range metrics.
  • Simplified primary key filter logic by removing composite primary key handling.
  • Enhanced logging for slow path execution in block filtering.
  • Updated metrics observation for both fast and slow paths.

Changes walkthrough 📝

Relevant files

Enhancement

pkg/util/metric/v2/dashboard/grafana_dashboard_txn.go (+45/-61)
Update Grafana dashboard metrics for transaction ranges.

  • Removed old fast path metrics and added new metrics for transaction ranges.
  • Updated histogram configurations for transaction ranges.
  • Added new rows for transaction ranges selectivity and count.

pkg/util/metric/v2/metrics.go (+1/-2)
Update metrics registry for transaction ranges.

  • Removed unused metrics.
  • Added new metrics for transaction ranges.

pkg/util/metric/v2/txn.go (+14/-18)
Update transaction metrics definitions.

  • Removed old metrics related to loaded object meta.
  • Added new histograms for transaction ranges.

pkg/vm/engine/disttae/filter.go (+34/-0)
Enhance logging and metrics for block filtering.

  • Added logging for slow path execution.
  • Updated metrics observation for fast path filters.

pkg/vm/engine/disttae/reader.go (+5/-67)
Simplify primary key filter logic in reader.

  • Removed composite primary key filter logic.
  • Simplified primary key filter logic.

pkg/vm/engine/disttae/txn_table.go (+57/-300)
Remove old fast ranges and enhance slow path logging.

  • Removed old fast ranges logic.
  • Added logging for slow path execution.
  • Updated metrics observation for slow path.

pkg/vm/engine/disttae/util.go (+3/-564)
Remove composite primary key handling utilities.

  • Removed functions related to composite primary key handling.
  • Simplified utility functions for primary key handling.

Tests

pkg/vm/engine/disttae/util_test.go (+0/-397)
Remove tests for composite primary key handling.

  • Removed tests related to composite primary key handling.

    💡 PR-Agent usage:
    Comment /help on the PR to get a list of all available PR-Agent tools and their descriptions


    PR Review 🔍

    ⏱️ Estimated effort to review [1-5]

    3, because the PR involves significant changes to the Grafana dashboard configurations and metrics handling in the code. It requires understanding of the metrics system and how these changes affect the monitoring and performance analysis.

    🧪 Relevant tests

    No

    ⚡ Possible issues

    Possible Bug: The removal of certain metrics and the addition of new ones without corresponding updates in the tests or documentation could lead to issues in monitoring and tracking performance accurately.

    🔒 Security concerns

    No

    @matrix-meow (Contributor) commented:
    @gouhongshen Thanks for your contributions!

    Here are review comments for file pkg/util/metric/v2/metrics.go:

    Pull Request Review:

    Title:

    The title of the pull request is clear and concise, indicating that the changes are related to syncing to version 1.2 and removing old fast ranges while updating metrics.

    Body:

    The body of the pull request provides a structured overview of the changes made, including the type of PR, the related issue, and a description of the changes. It also lists the relevant files that were modified along with a brief summary of the changes made in each file.

    Changes Made:

    1. grafana_dashboard_txn.go:

      • Improvement: Updated Grafana dashboard metrics for transaction ranges.
      • Suggestions: Consider adding comments to explain the purpose of the new metrics for better code readability.
    2. metrics.go:

      • Improvement: Updated metrics registry for transaction ranges.
      • Suggestions: Ensure that all unused metrics are removed to maintain a clean codebase.
    3. txn.go:

      • Improvement: Updated transaction metrics definitions.
      • Suggestions: Double-check that all old metrics related to loaded object meta are removed as intended.
    4. filter.go:

      • Improvement: Enhanced logging and metrics for block filtering.
      • Suggestions: Consider adding more detailed logging information for better troubleshooting.
    5. reader.go:

      • Improvement: Simplified primary key filter logic in reader.
      • Suggestions: Verify that the removal of composite primary key filter logic does not impact other functionalities.
    6. txn_table.go:

      • Improvement: Removed old fast ranges and enhanced slow path logging.
      • Suggestions: Review the significant increase in lines of code added to ensure it aligns with the intended changes.
    7. util.go:

      • Improvement: Removed composite primary key handling utilities.
      • Suggestions: Ensure that the utility functions for primary key handling are simplified without losing any essential functionality.
    8. util_test.go:

      • Improvement: Removed tests for composite primary key handling.
      • Suggestions: Consider adding new tests to cover any changes made to primary key handling.

    Security Concerns:

    1. Code Cleanup: The removal of unused code and metrics is a good practice, but it's essential to ensure that no critical functionality is inadvertently removed during this cleanup process.

    Suggestions for Optimization:

    1. Code Review: Conduct a thorough code review to ensure that all changes align with the intended improvements and do not introduce any regressions.

    2. Testing: Verify that the changes are adequately tested to prevent any unforeseen issues in production.

    3. Documentation: Update relevant documentation to reflect the changes made, especially regarding the updated metrics and removed functionalities.

    Overall, the pull request seems to be focused on enhancing performance and maintainability by removing old code and updating metrics. However, it is crucial to ensure that the changes are thoroughly reviewed and tested to avoid any unintended consequences.


    PR Code Suggestions ✨

    Possible bug
    Handle composite primary keys correctly in the tryExtractPKFilter function

    The tryExtractPKFilter function should handle the case where pk.CompPkeyCol is not nil to
    ensure composite primary keys are processed correctly.

    pkg/vm/engine/disttae/txn_table.go [1756-1788]

     if expr == nil || pk == nil {
         return
     }
     
    -pkColumn := tbl.tableDef.Cols[tbl.primaryIdx]
    -retPKFilter = getPKFilterByExpr(expr, pkColumn.Name, types.T(pkColumn.Typ.Id), tbl.proc.Load())
    -if retPKFilter.isNull || !retPKFilter.isValid || retPKFilter.isVec {
    -    return
    -}
    +if pk.CompPkeyCol != nil {
    +    pkVals := make([]*plan.Literal, len(pk.Names))
    +    _, hasNull := getCompositPKVals(expr, pk.Names, pkVals, tbl.proc.Load())
    +    if hasNull {
    +        retPKFilter.SetNull()
    +        return
    +    }
    +    cnt := getValidCompositePKCnt(pkVals)
    +    if cnt != 0 {
    +        var packer *types.Packer
    +        put := tbl.getTxn().engine.packerPool.Get(&packer)
    +        for i := 0; i < cnt; i++ {
    +            serialTupleByConstExpr(pkVals[i], packer)
    +        }
    +        v := packer.Bytes()
    +        packer.Reset()
    +        pkValue := logtailreplay.EncodePrimaryKey(v, packer)
    +        pkValue = pkValue[0 : len(pkValue)-1]
    +        put.Put()
    +        if cnt == len(pk.Names) {
    +            retPKFilter.SetFullData(function.EQUAL, false, pkValue)
    +        } else {
    +            retPKFilter.SetFullData(function.PREFIX_EQ, false, pkValue)
    +        }
    +    }
    +} else {
    +    pkColumn := tbl.tableDef.Cols[tbl.primaryIdx]
    +    retPKFilter = getPKFilterByExpr(expr, pkColumn.Name, types.T(pkColumn.Typ.Id), tbl.proc.Load())
    +    if retPKFilter.isNull || !retPKFilter.isValid || retPKFilter.isVec {
    +        return
    +    }
     
    -var packer *types.Packer
    +    var packer *types.Packer
     
    -put := tbl.getTxn().engine.packerPool.Get(&packer)
    -val := logtailreplay.EncodePrimaryKey(retPKFilter.val, packer)
    -put.Put()
    +    put := tbl.getTxn().engine.packerPool.Get(&packer)
    +    val := logtailreplay.EncodePrimaryKey(retPKFilter.val, packer)
    +    put.Put()
     
    -if retPKFilter.op == function.EQUAL {
    -    retPKFilter.SetFullData(function.EQUAL, false, val)
    -} else {
    -    val = val[0 : len(val)-1]
    -    retPKFilter.SetFullData(function.PREFIX_EQ, false, val)
    +    if retPKFilter.op == function.EQUAL {
    +        retPKFilter.SetFullData(function.EQUAL, false, val)
    +    } else {
    +        val = val[0 : len(val)-1]
    +        retPKFilter.SetFullData(function.PREFIX_EQ, false, val)
    +    }
     }
     
     return
     
    Suggestion importance[1-10]: 9

    Why: Proper handling of composite primary keys is essential for correct database operations and to prevent potential bugs in key handling logic. This is a significant improvement in functionality.

    Increment loadObjCnt in a consistent manner to ensure accurate counting

    The loadObjCnt variable should be incremented in a consistent manner to ensure accurate
    counting.

    pkg/vm/engine/disttae/txn_table.go [787-793]

     s3BlkCnt += obj.BlkCnt()
     if auxIdCnt > 0 {
    +    loadObjCnt++
         location := obj.ObjectLocation()
    -    loadObjCnt++
         if objMeta, err2 = objectio.FastLoadObjectMeta(
             errCtx, &location, false, tbl.getTxn().engine.fs,
         ); err2 != nil {
             return
         }
     }
     
    Suggestion importance[1-10]: 7

    Why: Consistent incrementation of loadObjCnt is important for accurate metrics and debugging, although it's not a critical bug.

    Possible issue
    Reset slowPathCounter to zero in a thread-safe manner to avoid potential race conditions

    The slowPathCounter should be reset to zero in a thread-safe manner to avoid potential
    race conditions.

    pkg/vm/engine/disttae/txn_table.go [732-738]

    +if slowPathCounter.Add(1) >= 1000 {
    +    slowPathCounter.Store(0)
    +    logutil.Info(
    +        "SLOW-RANGES:",
    +        zap.String("table", tbl.tableDef.Name),
    +        zap.String("exprs", plan2.FormatExprs(exprs)),
    +    )
    +}
     
    -
    Suggestion importance[1-10]: 8

    Why: Ensuring thread safety in the reset operation of slowPathCounter is crucial to avoid race conditions, which is a significant concern in concurrent environments.

    Maintainability
    Refactor repeated histogram creation code into a helper function to reduce redundancy

    The initTxnRangesSelectivityRow and initTxnRangesCountRow functions have repeated code for
    creating histograms. Consider refactoring to a helper function to reduce redundancy and
    improve maintainability.

    pkg/util/metric/v2/dashboard/grafana_dashboard_txn.go [94-110]

    -c.getMultiHistogram(
    +c.createMultiHistogram(
         []string{
    -        c.getMetricWithFilter(`mo_txn_ranges_selectivity_percentage_bucket`, `type="slow_path_block_selectivity"`),
    -        c.getMetricWithFilter(`mo_txn_ranges_selectivity_percentage_bucket`, `type="fast_path_block_selectivity"`),
    -        c.getMetricWithFilter(`mo_txn_ranges_selectivity_percentage_bucket`, `type="fast_path_obj_sort_key_zm_selectivity"`),
    -        c.getMetricWithFilter(`mo_txn_ranges_selectivity_percentage_bucket`, `type="fast_path_obj_column_zm_selectivity"`),
    -        c.getMetricWithFilter(`mo_txn_ranges_selectivity_percentage_bucket`, `type="fast_path_blk_column_zm_selectivity"`),
    +        `mo_txn_ranges_selectivity_percentage_bucket`,
         },
         []string{
             "slow_path_block_selectivity",
             "fast_path_block_selectivity",
             "fast_path_obj_sort_key_zm_selectivity",
             "fast_path_obj_column_zm_selectivity",
             "fast_path_blk_column_zm_selectivity",
         },
         []float64{0.50, 0.8, 0.90, 0.99},
         []float32{3, 3, 3, 3},
         axis.Min(0))...,
     
    Suggestion importance[1-10]: 7

    Why: The suggestion correctly identifies code redundancy in histogram creation and proposes a useful refactor to improve maintainability. This is a good practice, though not critical for functionality.

    Break down the large defer block into smaller functions to improve readability and maintainability

    The defer block in ExecuteBlockFilter function is quite large and complex. Consider
    breaking it into smaller functions to improve readability and maintainability.

    pkg/vm/engine/disttae/filter.go [1033-1048]

    -defer func() {
    -    v2.TxnRangesFastPathLoadObjCntHistogram.Observe(loadHit)
    -    v2.TxnRangesFastPathSelectedBlockCntHistogram.Observe(float64(outBlocks.Len() - 1))
    -    if fastFilterTotal > 0 {
    -        v2.TxnRangesFastPathObjSortKeyZMapSelectivityHistogram.Observe(fastFilterHit / fastFilterTotal)
    -    }
    -    if objFilterTotal > 0 {
    -        v2.TxnRangesFastPathObjColumnZMapSelectivityHistogram.Observe(objFilterHit / objFilterTotal)
    -    }
    -    if blkFilterTotal > 0 {
    -        v2.TxnRangesFastPathBlkColumnZMapSelectivityHistogram.Observe(blkFilterHit / blkFilterTotal)
    -    }
    -    if totalBlocks > 0 {
    -        v2.TxnRangesFastPathBlkTotalSelectivityHistogram.Observe(float64(outBlocks.Len()-1) / totalBlocks)
    -    }
    -}()
    +defer c.observeMetrics(loadHit, outBlocks.Len()-1, fastFilterTotal, fastFilterHit, objFilterTotal, objFilterHit, blkFilterTotal, blkFilterHit, totalBlocks)
     
    Suggestion importance[1-10]: 7

    Why: The suggestion to refactor a complex defer block into a separate function is valid and would enhance code readability and maintainability. It's a good suggestion but not critical.

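One way to realize the suggested extraction without changing behavior is to gather the hit/total counters into a small struct whose method reproduces the `total > 0` guards from the original defer block. `rangesStats` and `selectivity` below are hypothetical names for illustration, not APIs in this PR:

```go
package main

import "fmt"

// rangesStats groups one hit/total counter pair from the defer block.
type rangesStats struct {
	hit, total float64
}

// selectivity guards against division by zero, mirroring the
// `if total > 0` checks before each Observe call in the original code.
func (s rangesStats) selectivity() (float64, bool) {
	if s.total == 0 {
		return 0, false
	}
	return s.hit / s.total, true
}

func main() {
	if sel, ok := (rangesStats{hit: 3, total: 4}).selectivity(); ok {
		fmt.Println(sel) // prints 0.75
	}
	_, ok := (rangesStats{}).selectivity()
	fmt.Println(ok) // prints false
}
```

The defer then shrinks to one call per histogram: observe only when `ok` is true.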
    Move the prometheus.HistogramOpts initialization to a separate function to improve readability and reduce redundancy

    The prometheus.HistogramOpts initialization for txnTableRangeTotalHistogram can be moved
    to a separate function to improve readability and reduce redundancy.

    pkg/util/metric/v2/txn.go [242-249]

    -txnTableRangeTotalHistogram = prometheus.NewHistogramVec(
    -    prometheus.HistogramOpts{
    -        Namespace: "mo",
    -        Subsystem: "txn",
    -        Name:      "ranges_selected_block_cnt_total",
    -        Help:      "Bucketed histogram of txn table ranges selected block cnt.",
    -        Buckets:   prometheus.ExponentialBuckets(1, 2.0, 15),
    -    }, []string{"type"})
    +txnTableRangeTotalHistogram = createHistogramVec("mo", "txn", "ranges_selected_block_cnt_total", "Bucketed histogram of txn table ranges selected block cnt.", prometheus.ExponentialBuckets(1, 2.0, 15), []string{"type"})
     
    Suggestion importance[1-10]: 7

    Why: The suggestion to encapsulate the histogram initialization in a separate function is a good practice to reduce redundancy and improve code organization. This is beneficial for maintainability.

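Any such helper must preserve the bucket layout the call sites rely on. `prometheus.ExponentialBuckets(1, 2.0, 15)` is documented to return `count` buckets, the first at `start` and each subsequent one multiplied by `factor`; a stdlib re-implementation of that documented behavior makes the resulting bounds concrete:

```go
package main

import "fmt"

// exponentialBuckets mirrors the documented behavior of
// prometheus.ExponentialBuckets: `count` upper bounds, starting at
// `start`, each multiplied by `factor`. (Sketch for illustration only.)
func exponentialBuckets(start, factor float64, count int) []float64 {
	buckets := make([]float64, count)
	for i := range buckets {
		buckets[i] = start
		start *= factor
	}
	return buckets
}

func main() {
	b := exponentialBuckets(1, 2.0, 15)
	fmt.Println(b[0], b[14]) // prints "1 16384"
}
```

So the histogram in this PR buckets selected-block counts from 1 up to 16384 (2^14) blocks; a refactored helper must not silently change these bounds.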
    Organize import statements to follow a consistent order for better readability

    The import statements should be organized to follow a consistent order, typically with
    standard library imports first, followed by third-party imports, and then local imports.
    This helps improve readability and maintainability.

    pkg/vm/engine/disttae/util.go [20-27]

    +"fmt"
    +"strings"
     
    +"github.com/matrixorigin/matrixone/pkg/clusterservice"
    +"github.com/matrixorigin/matrixone/pkg/common/moerr"
    +"github.com/matrixorigin/matrixone/pkg/common/util"
    +"github.com/matrixorigin/matrixone/pkg/container/batch"
    +"github.com/matrixorigin/matrixone/pkg/container/types"
     
    Suggestion importance[1-10]: 6

    Why: Organizing imports improves code readability and maintainability, but it's a minor style improvement.

    Enhancement
    Simplify the getReadFilter function by directly returning the result of getNonCompositPKFilter

    The getReadFilter function can be simplified by directly returning the result of
    getNonCompositPKFilter when pk is not nil.

    pkg/vm/engine/disttae/reader.go [136-141]

    +if mixin.filterState.expr == nil || pk == nil {
    +    mixin.filterState.evaluated = true
    +    mixin.filterState.filter = nil
    +    return
    +}
    +return mixin.getNonCompositPKFilter(proc, blkCnt)
     
    -
    Suggestion importance[1-10]: 6

    Why: The suggestion to simplify the getReadFilter function is correct and would make the code cleaner and more direct. However, it's a minor enhancement.


    Labels: Bug fix, Enhancement, Review effort [1-5]: 3, size/XL (denotes a PR that changes [1000, 1999] lines)
    Projects
    None yet
    Development

    Successfully merging this pull request may close these issues.

    3 participants