Convert hashSets in parallel before merge #50748

Merged
11 commits merged on Jul 27, 2023

Commits on May 31, 2023

  1. Merge pull request #2 from ClickHouse/master

    latest upstream master
    jiebinn committed May 31, 2023
    f3b3d70

Commits on Jun 29, 2023

  1. Merge pull request #4 from ClickHouse/master

    Update to the master
    jiebinn committed Jun 29, 2023
    1bbf378
  2. Convert hashSets in parallel before merge

    Before a merge, if one of lhs and rhs is a singleLevelSet and the other is a twoLevelSet,
    the singleLevelSet calls convertToTwoLevel(). This conversion is not parallelized,
    and it costs many cycles when every singleLevelSet has to go through it.
    
    The idea of the patch is to convert all the singleLevelSets to twoLevelSets in parallel
    when the hash sets are mixed, that is, neither all singleLevel nor all twoLevel.
    
    I tested the patch on an Intel 2 x 112 vCPU SPR server with ClickBench and the latest
    upstream ClickHouse.
    Q5 improved by 264%, and 24 queries gained at least 5% in performance.
    The overall geomean across 43 queries is 7.4% better than the base code.
    
    Signed-off-by: Jiebin Sun <jiebin.sun@intel.com>
    jiebinn committed Jun 29, 2023
    09e3509
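
The idea described in commit 09e3509 can be illustrated with a minimal, self-contained sketch. This is not the ClickHouse code: `ToySet`, `NUM_BUCKETS`, and `prepareForMerge` are hypothetical names invented for illustration, and spawning one `std::thread` per set is a simplification (a real implementation would more likely hand the work to a thread pool). The sketch only shows the shape of the technique: detect a mix of single-level and two-level sets, then convert the single-level ones concurrently before any merge happens.

```cpp
#include <cstddef>
#include <functional>
#include <thread>
#include <unordered_set>
#include <vector>

// Hypothetical stand-in for a hash set that is either single-level or two-level.
struct ToySet
{
    static constexpr std::size_t NUM_BUCKETS = 16;

    std::unordered_set<int> single_level;          // used while the set is single-level
    std::vector<std::unordered_set<int>> buckets;  // used once the set is two-level
    bool two_level = false;

    bool isTwoLevel() const { return two_level; }

    // Redistribute every key into NUM_BUCKETS sub-sets; this is the expensive step.
    void convertToTwoLevel()
    {
        if (two_level)
            return;
        buckets.assign(NUM_BUCKETS, {});
        for (int key : single_level)
            buckets[std::hash<int>{}(key) % NUM_BUCKETS].insert(key);
        single_level.clear();
        two_level = true;
    }

    std::size_t size() const
    {
        if (!two_level)
            return single_level.size();
        std::size_t total = 0;
        for (const auto & bucket : buckets)
            total += bucket.size();
        return total;
    }
};

// If the sets are mixed, convert every single-level set on its own thread.
// Returns the number of sets that were converted.
std::size_t prepareForMerge(std::vector<ToySet> & sets)
{
    bool any_single = false;
    bool any_two = false;
    for (const auto & set : sets)
    {
        if (set.isTwoLevel())
            any_two = true;
        else
            any_single = true;
    }

    // All single-level or all two-level: no conversion is needed before merging.
    if (!(any_single && any_two))
        return 0;

    // Each worker touches a distinct set, so no locking is required.
    std::vector<std::thread> workers;
    for (auto & set : sets)
        if (!set.isTwoLevel())
            workers.emplace_back([&set] { set.convertToTwoLevel(); });
    for (auto & worker : workers)
        worker.join();
    return workers.size();
}
```

Calling `prepareForMerge` on a vector that holds two single-level sets and one two-level set converts the two single-level sets concurrently; on a vector where every set has the same layout it returns 0 and leaves the sets untouched.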

Commits on Jul 7, 2023

  1. add resize() for the data_vec in parallelizeMergePrepare()

    Signed-off-by: Jiebin Sun <jiebin.sun@intel.com>
    jiebinn committed Jul 7, 2023
    1c1a9d1

Commits on Jul 21, 2023

  1. Add the performance test prepare_hash_before_merge.xml

    Signed-off-by: Jiebin Sun <jiebin.sun@intel.com>
    jiebinn committed Jul 21, 2023
    ad109c7

Commits on Jul 24, 2023

  1. 4c93d05
  2. Rename the data set from hits_v1 to test.hits to fit the CI.

    Signed-off-by: Jiebin Sun <jiebin.sun@intel.com>
    jiebinn committed Jul 24, 2023
    ad0ca53
  3. b7d9777

Commits on Jul 25, 2023

  1. 2504987

Commits on Jul 26, 2023

  1. remove the redundant branch in UniqExactSet

    Co-authored-by: Nikita Taranov <nickita.taranov@gmail.com>
    jiebinn and nickitat committed Jul 26, 2023
    a8b5b55
  2. Remove the empty methods and throw an exception in parallelizeMergePrepare()
    
    Signed-off-by: Jiebin Sun <jiebin.sun@intel.com>
    jiebinn committed Jul 26, 2023
    635e9d7