Update mixed_join to use experimental row hasher and comparator #13028

Merged 49 commits into rapidsai:branch-23.06 on Apr 21, 2023

Conversation

divyegala
Member

@divyegala divyegala commented Mar 28, 2023

Part of #11844

mixed_join cannot support nested types because the conditional part of the join relies on AST. For this reason, this PR adds no new tests or benchmarks.

Benchmarks

Checklist

  • I am familiar with the Contributing Guidelines.
  • New or existing tests cover these changes.
  • The documentation is up to date with these changes.

divyegala and others added 30 commits February 1, 2023 17:58
Co-authored-by: Nghia Truong <nghiatruong.vn@gmail.com>
@divyegala divyegala changed the base branch from branch-23.04 to branch-23.06 April 6, 2023 17:17
@divyegala
Member Author

divyegala commented Apr 7, 2023

Benchmarks

mixed_inner_join_32bit

[0] Tesla V100-SXM2-32GB

| Key Type | Payload Type | Nullable | Build Table Size | Probe Table Size | Ref Time | Ref Noise | Cmp Time | Cmp Noise | Diff | %Diff | Status |
|---|---|---|---|---|---|---|---|---|---|---|---|
| I32 | I32 | 0 | 100000 | 100000 | 875.917 us | 31.23% | 920.228 us | 31.25% | 44.311 us | 5.06% | PASS |
| I32 | I32 | 0 | 100000 | 400000 | 2.147 ms | 9.78% | 2.074 ms | 11.64% | -73.278 us | -3.41% | PASS |
| I32 | I32 | 0 | 10000000 | 10000000 | 34.967 ms | 0.50% | 31.057 ms | 0.48% | -3909.582 us | -11.18% | FAIL |
| I32 | I32 | 0 | 10000000 | 40000000 | 111.974 ms | 0.08% | 96.741 ms | 0.09% | -15232.559 us | -13.60% | FAIL |
| I32 | I32 | 0 | 10000000 | 100000000 | 265.314 ms | 0.11% | 232.141 ms | 0.04% | -33172.852 us | -12.50% | FAIL |
| I32 | I32 | 0 | 80000000 | 100000000 | 322.752 ms | 0.14% | 287.081 ms | 0.08% | -35671.412 us | -11.05% | FAIL |
| I32 | I32 | 0 | 100000000 | 100000000 | 337.138 ms | 0.10% | 296.636 ms | 0.08% | -40501.154 us | -12.01% | FAIL |
| I32 | I32 | 0 | 10000000 | 240000000 | 624.778 ms | 0.10% | 544.490 ms | 0.05% | -80287.664 us | -12.85% | FAIL |
| I32 | I32 | 0 | 80000000 | 240000000 | 680.880 ms | 0.11% | 599.661 ms | 0.02% | -81218.567 us | -11.93% | FAIL |
| I32 | I32 | 0 | 100000000 | 240000000 | 697.653 ms | 0.05% | 604.076 ms | 0.05% | -93576.949 us | -13.41% | FAIL |

mixed_inner_join_64bit

[0] Tesla V100-SXM2-32GB

| Key Type | Payload Type | Nullable | Build Table Size | Probe Table Size | Ref Time | Ref Noise | Cmp Time | Cmp Noise | Diff | %Diff | Status |
|---|---|---|---|---|---|---|---|---|---|---|---|
| I64 | I64 | 0 | 40000000 | 50000000 | 158.300 ms | 0.27% | 144.077 ms | 0.31% | -14222.988 us | -8.98% | FAIL |
| I64 | I64 | 0 | 50000000 | 50000000 | 165.503 ms | 0.06% | 152.819 ms | 0.10% | -12683.730 us | -7.66% | FAIL |
| I64 | I64 | 0 | 40000000 | 120000000 | 333.902 ms | 0.23% | 296.133 ms | 0.04% | -37768.749 us | -11.31% | FAIL |
| I64 | I64 | 0 | 50000000 | 120000000 | 341.840 ms | 0.06% | 311.494 ms | 0.06% | -30346.150 us | -8.88% | FAIL |

mixed_inner_join_32bit_nulls

[0] Tesla V100-SXM2-32GB

| Key Type | Payload Type | Nullable | Build Table Size | Probe Table Size | Ref Time | Ref Noise | Cmp Time | Cmp Noise | Diff | %Diff | Status |
|---|---|---|---|---|---|---|---|---|---|---|---|
| I32 | I32 | 1 | 100000 | 100000 | 855.482 us | 7.58% | 884.830 us | 7.31% | 29.348 us | 3.43% | PASS |
| I32 | I32 | 1 | 100000 | 400000 | 2.090 ms | 5.45% | 2.104 ms | 6.45% | 14.723 us | 0.70% | PASS |
| I32 | I32 | 1 | 10000000 | 10000000 | 25.663 ms | 1.09% | 25.186 ms | 0.34% | -476.606 us | -1.86% | FAIL |
| I32 | I32 | 1 | 10000000 | 40000000 | 91.612 ms | 0.05% | 90.654 ms | 0.61% | -957.890 us | -1.05% | FAIL |
| I32 | I32 | 1 | 10000000 | 100000000 | 222.254 ms | 0.21% | 219.792 ms | 1.11% | -2462.311 us | -1.11% | FAIL |
| I32 | I32 | 1 | 80000000 | 100000000 | 240.013 ms | 0.08% | 235.942 ms | 0.06% | -4070.863 us | -1.70% | FAIL |
| I32 | I32 | 1 | 100000000 | 100000000 | 246.167 ms | 0.03% | 242.616 ms | 0.03% | -3551.231 us | -1.44% | FAIL |
| I32 | I32 | 1 | 10000000 | 240000000 | 528.932 ms | 0.09% | 519.132 ms | 0.08% | -9799.860 us | -1.85% | FAIL |
| I32 | I32 | 1 | 80000000 | 240000000 | 545.368 ms | 0.53% | 537.740 ms | 0.04% | -7628.056 us | -1.40% | FAIL |
| I32 | I32 | 1 | 100000000 | 240000000 | 550.352 ms | 0.02% | 543.150 ms | 0.03% | -7201.982 us | -1.31% | FAIL |

mixed_inner_join_64bit_nulls

[0] Tesla V100-SXM2-32GB

| Key Type | Payload Type | Nullable | Build Table Size | Probe Table Size | Ref Time | Ref Noise | Cmp Time | Cmp Noise | Diff | %Diff | Status |
|---|---|---|---|---|---|---|---|---|---|---|---|
| I64 | I64 | 1 | 40000000 | 50000000 | 122.723 ms | 0.49% | 120.482 ms | 0.49% | -2241.163 us | -1.83% | FAIL |
| I64 | I64 | 1 | 50000000 | 50000000 | 124.862 ms | 0.04% | 123.097 ms | 0.05% | -1764.631 us | -1.41% | FAIL |
| I64 | I64 | 1 | 40000000 | 120000000 | 277.457 ms | 0.03% | 273.331 ms | 0.75% | -4126.197 us | -1.49% | FAIL |
| I64 | I64 | 1 | 50000000 | 120000000 | 277.129 ms | 0.04% | 273.958 ms | 0.04% | -3170.674 us | -1.14% | FAIL |

mixed_left_join_32bit

[0] Tesla V100-SXM2-32GB

| Key Type | Payload Type | Nullable | Build Table Size | Probe Table Size | Ref Time | Ref Noise | Cmp Time | Cmp Noise | Diff | %Diff | Status |
|---|---|---|---|---|---|---|---|---|---|---|---|
| I32 | I32 | 0 | 100000 | 100000 | 827.773 us | 5.35% | 848.363 us | 6.53% | 20.590 us | 2.49% | PASS |
| I32 | I32 | 0 | 100000 | 400000 | 2.764 ms | 4.84% | 2.687 ms | 5.67% | -76.618 us | -2.77% | PASS |
| I32 | I32 | 0 | 10000000 | 10000000 | 34.666 ms | 0.50% | 31.997 ms | 0.37% | -2669.613 us | -7.70% | FAIL |
| I32 | I32 | 0 | 10000000 | 40000000 | 110.222 ms | 0.47% | 98.915 ms | 0.49% | -11306.492 us | -10.26% | FAIL |
| I32 | I32 | 0 | 10000000 | 100000000 | 262.402 ms | 0.29% | 232.691 ms | 0.09% | -29710.526 us | -11.32% | FAIL |
| I32 | I32 | 0 | 80000000 | 100000000 | 316.661 ms | 0.32% | 294.183 ms | 0.04% | -22478.294 us | -7.10% | FAIL |
| I32 | I32 | 0 | 100000000 | 100000000 | 332.354 ms | 0.09% | 301.118 ms | 0.05% | -31235.909 us | -9.40% | FAIL |
| I32 | I32 | 0 | 10000000 | 240000000 | 619.216 ms | 0.21% | 544.261 ms | 0.05% | -74955.222 us | -12.10% | FAIL |
| I32 | I32 | 0 | 80000000 | 240000000 | 673.325 ms | 0.37% | 602.561 ms | 0.04% | -70764.088 us | -10.51% | FAIL |
| I32 | I32 | 0 | 100000000 | 240000000 | 688.355 ms | 0.22% | 620.849 ms | 0.06% | -67506.181 us | -9.81% | FAIL |

mixed_left_join_64bit

[0] Tesla V100-SXM2-32GB

| Key Type | Payload Type | Nullable | Build Table Size | Probe Table Size | Ref Time | Ref Noise | Cmp Time | Cmp Noise | Diff | %Diff | Status |
|---|---|---|---|---|---|---|---|---|---|---|---|
| I64 | I64 | 0 | 40000000 | 50000000 | 160.282 ms | 0.49% | 146.169 ms | 0.47% | -14113.187 us | -8.81% | FAIL |
| I64 | I64 | 0 | 50000000 | 50000000 | 168.559 ms | 0.28% | 154.476 ms | 0.04% | -14082.513 us | -8.35% | FAIL |
| I64 | I64 | 0 | 40000000 | 120000000 | 338.015 ms | 0.04% | 304.001 ms | 0.07% | -34014.113 us | -10.06% | FAIL |
| I64 | I64 | 0 | 50000000 | 120000000 | 347.587 ms | 0.27% | 313.285 ms | 0.01% | -34301.858 us | -9.87% | FAIL |

mixed_left_join_32bit_nulls

[0] Tesla V100-SXM2-32GB

| Key Type | Payload Type | Nullable | Build Table Size | Probe Table Size | Ref Time | Ref Noise | Cmp Time | Cmp Noise | Diff | %Diff | Status |
|---|---|---|---|---|---|---|---|---|---|---|---|
| I32 | I32 | 1 | 100000 | 100000 | 874.524 us | 7.57% | 901.068 us | 7.61% | 26.544 us | 3.04% | PASS |
| I32 | I32 | 1 | 100000 | 400000 | 2.831 ms | 5.01% | 2.901 ms | 5.38% | 70.177 us | 2.48% | PASS |
| I32 | I32 | 1 | 10000000 | 10000000 | 28.132 ms | 0.11% | 28.528 ms | 0.35% | 396.458 us | 1.41% | FAIL |
| I32 | I32 | 1 | 10000000 | 40000000 | 99.452 ms | 0.05% | 100.024 ms | 0.50% | 572.122 us | 0.58% | FAIL |
| I32 | I32 | 1 | 10000000 | 100000000 | 240.189 ms | 0.05% | 242.774 ms | 0.09% | 2.585 ms | 1.08% | FAIL |
| I32 | I32 | 1 | 80000000 | 100000000 | 261.058 ms | 0.14% | 261.193 ms | 0.03% | 134.508 us | 0.05% | FAIL |
| I32 | I32 | 1 | 100000000 | 100000000 | 265.447 ms | 0.50% | 266.299 ms | 0.08% | 851.936 us | 0.32% | FAIL |
| I32 | I32 | 1 | 10000000 | 240000000 | 579.174 ms | 0.65% | 575.149 ms | 0.02% | -4024.845 us | -0.69% | FAIL |
| I32 | I32 | 1 | 80000000 | 240000000 | 597.026 ms | 0.02% | 598.344 ms | 0.93% | 1.318 ms | 0.22% | FAIL |
| I32 | I32 | 1 | 100000000 | 240000000 | 597.369 ms | 0.06% | 600.454 ms | 0.60% | 3.085 ms | 0.52% | FAIL |

mixed_left_join_64bit_nulls

[0] Tesla V100-SXM2-32GB

| Key Type | Payload Type | Nullable | Build Table Size | Probe Table Size | Ref Time | Ref Noise | Cmp Time | Cmp Noise | Diff | %Diff | Status |
|---|---|---|---|---|---|---|---|---|---|---|---|
| I64 | I64 | 1 | 40000000 | 50000000 | 132.639 ms | 1.00% | 132.736 ms | 0.74% | 97.311 us | 0.07% | PASS |
| I64 | I64 | 1 | 50000000 | 50000000 | 136.233 ms | 0.96% | 135.553 ms | 0.36% | -679.952 us | -0.50% | FAIL |
| I64 | I64 | 1 | 40000000 | 120000000 | 302.531 ms | 1.17% | 302.765 ms | 0.82% | 233.774 us | 0.08% | PASS |
| I64 | I64 | 1 | 50000000 | 120000000 | 303.721 ms | 0.83% | 301.337 ms | 0.05% | -2383.392 us | -0.78% | FAIL |

mixed_full_join_32bit

[0] Tesla V100-SXM2-32GB

| Key Type | Payload Type | Nullable | Build Table Size | Probe Table Size | Ref Time | Ref Noise | Cmp Time | Cmp Noise | Diff | %Diff | Status |
|---|---|---|---|---|---|---|---|---|---|---|---|
| I32 | I32 | 0 | 100000 | 100000 | 1.619 ms | 5.76% | 1.689 ms | 7.48% | 70.280 us | 4.34% | PASS |
| I32 | I32 | 0 | 100000 | 400000 | 3.548 ms | 4.17% | 3.561 ms | 5.35% | 13.589 us | 0.38% | PASS |
| I32 | I32 | 0 | 10000000 | 10000000 | 38.671 ms | 0.60% | 36.384 ms | 0.48% | -2286.466 us | -5.91% | FAIL |
| I32 | I32 | 0 | 10000000 | 40000000 | 120.127 ms | 0.49% | 107.183 ms | 0.28% | -12944.384 us | -10.78% | FAIL |
| I32 | I32 | 0 | 10000000 | 100000000 | 281.015 ms | 0.16% | 247.728 ms | 0.07% | -33286.333 us | -11.85% | FAIL |
| I32 | I32 | 0 | 80000000 | 100000000 | 345.990 ms | 0.14% | 312.403 ms | 0.16% | -33587.017 us | -9.71% | FAIL |
| I32 | I32 | 0 | 100000000 | 100000000 | 362.816 ms | 0.04% | 338.722 ms | 0.07% | -24093.794 us | -6.64% | FAIL |
| I32 | I32 | 0 | 10000000 | 240000000 | 662.235 ms | 0.13% | 576.524 ms | 0.04% | -85711.315 us | -12.94% | FAIL |
| I32 | I32 | 0 | 80000000 | 240000000 | 729.054 ms | 0.10% | 647.950 ms | 0.06% | -81104.237 us | -11.12% | FAIL |
| I32 | I32 | 0 | 100000000 | 240000000 | 744.710 ms | 0.03% | 687.004 ms | 0.10% | -57705.650 us | -7.75% | FAIL |

mixed_full_join_64bit

[0] Tesla V100-SXM2-32GB

| Key Type | Payload Type | Nullable | Build Table Size | Probe Table Size | Ref Time | Ref Noise | Cmp Time | Cmp Noise | Diff | %Diff | Status |
|---|---|---|---|---|---|---|---|---|---|---|---|
| I64 | I64 | 0 | 40000000 | 50000000 | 174.977 ms | 0.66% | 159.482 ms | 0.63% | -15495.777 us | -8.86% | FAIL |
| I64 | I64 | 0 | 50000000 | 50000000 | 183.799 ms | 0.13% | 171.743 ms | 0.11% | -12055.740 us | -6.56% | FAIL |
| I64 | I64 | 0 | 40000000 | 120000000 | 366.850 ms | 0.09% | 324.385 ms | 0.50% | -42465.371 us | -11.58% | FAIL |
| I64 | I64 | 0 | 50000000 | 120000000 | 375.960 ms | 0.13% | 347.154 ms | 0.62% | -28806.794 us | -7.66% | FAIL |

mixed_full_join_32bit_nulls

[0] Tesla V100-SXM2-32GB

| Key Type | Payload Type | Nullable | Build Table Size | Probe Table Size | Ref Time | Ref Noise | Cmp Time | Cmp Noise | Diff | %Diff | Status |
|---|---|---|---|---|---|---|---|---|---|---|---|
| I32 | I32 | 1 | 100000 | 100000 | 1.657 ms | 7.04% | 1.751 ms | 12.84% | 93.132 us | 5.62% | PASS |
| I32 | I32 | 1 | 100000 | 400000 | 3.609 ms | 4.06% | 3.731 ms | 5.23% | 122.430 us | 3.39% | PASS |
| I32 | I32 | 1 | 10000000 | 10000000 | 31.378 ms | 0.50% | 31.703 ms | 1.05% | 324.377 us | 1.03% | FAIL |
| I32 | I32 | 1 | 10000000 | 40000000 | 105.160 ms | 0.80% | 104.737 ms | 0.10% | -422.547 us | -0.40% | FAIL |
| I32 | I32 | 1 | 10000000 | 100000000 | 250.239 ms | 0.06% | 250.823 ms | 0.06% | 583.495 us | 0.23% | FAIL |
| I32 | I32 | 1 | 80000000 | 100000000 | 278.419 ms | 0.65% | 276.691 ms | 0.05% | -1728.547 us | -0.62% | FAIL |
| I32 | I32 | 1 | 100000000 | 100000000 | 286.424 ms | 0.81% | 283.127 ms | 0.11% | -3297.830 us | -1.15% | FAIL |
| I32 | I32 | 1 | 10000000 | 240000000 | 594.717 ms | 0.03% | 594.180 ms | 0.97% | -536.208 us | -0.09% | FAIL |
| I32 | I32 | 1 | 80000000 | 240000000 | 622.416 ms | 1.33% | 618.749 ms | 0.03% | -3666.502 us | -0.59% | FAIL |
| I32 | I32 | 1 | 100000000 | 240000000 | 621.383 ms | 0.15% | 624.727 ms | 0.04% | 3.344 ms | 0.54% | FAIL |

mixed_full_join_64bit_nulls

[0] Tesla V100-SXM2-32GB

| Key Type | Payload Type | Nullable | Build Table Size | Probe Table Size | Ref Time | Ref Noise | Cmp Time | Cmp Noise | Diff | %Diff | Status |
|---|---|---|---|---|---|---|---|---|---|---|---|
| I64 | I64 | 1 | 40000000 | 50000000 | 141.654 ms | 1.29% | 140.827 ms | 0.60% | -826.439 us | -0.58% | PASS |
| I64 | I64 | 1 | 50000000 | 50000000 | 145.382 ms | 0.12% | 145.545 ms | 0.29% | 163.376 us | 0.11% | PASS |
| I64 | I64 | 1 | 40000000 | 120000000 | 311.480 ms | 0.04% | 313.604 ms | 1.00% | 2.124 ms | 0.68% | FAIL |
| I64 | I64 | 1 | 50000000 | 120000000 | 317.613 ms | 0.23% | 316.508 ms | 0.08% | -1105.175 us | -0.35% | FAIL |

mixed_left_semi_join_32bit

[0] Tesla V100-SXM2-32GB

| Key Type | Payload Type | Nullable | Build Table Size | Probe Table Size | Ref Time | Ref Noise | Cmp Time | Cmp Noise | Diff | %Diff | Status |
|---|---|---|---|---|---|---|---|---|---|---|---|
| I32 | I32 | 0 | 100000 | 100000 | 918.193 us | 7.97% | 985.107 us | 6.98% | 66.914 us | 7.29% | FAIL |
| I32 | I32 | 0 | 100000 | 400000 | 2.382 ms | 4.66% | 2.477 ms | 4.83% | 95.143 us | 3.99% | PASS |
| I32 | I32 | 0 | 10000000 | 10000000 | 46.359 ms | 2.80% | 46.723 ms | 0.09% | 364.310 us | 0.79% | FAIL |
| I32 | I32 | 0 | 10000000 | 40000000 | 156.524 ms | 0.50% | 156.436 ms | 0.80% | -88.738 us | -0.06% | PASS |
| I32 | I32 | 0 | 10000000 | 100000000 | 374.372 ms | 0.60% | 370.161 ms | 1.03% | -4211.560 us | -1.12% | FAIL |
| I32 | I32 | 0 | 80000000 | 100000000 | 441.258 ms | 0.06% | 441.189 ms | 0.07% | -68.701 us | -0.02% | PASS |
| I32 | I32 | 0 | 100000000 | 100000000 | 459.026 ms | 0.03% | 453.277 ms | 1.30% | -5749.236 us | -1.25% | FAIL |
| I32 | I32 | 0 | 10000000 | 240000000 | 878.973 ms | 0.20% | 838.687 ms | 4.06% | -40285.768 us | -4.58% | FAIL |
| I32 | I32 | 0 | 80000000 | 240000000 | 956.071 ms | 0.13% | 956.254 ms | 0.16% | 182.645 us | 0.02% | PASS |
| I32 | I32 | 0 | 100000000 | 240000000 | 976.787 ms | 0.02% | 969.631 ms | 0.22% | -7156.461 us | -0.73% | FAIL |

mixed_left_semi_join_64bit

[0] Tesla V100-SXM2-32GB

| Key Type | Payload Type | Nullable | Build Table Size | Probe Table Size | Ref Time | Ref Noise | Cmp Time | Cmp Noise | Diff | %Diff | Status |
|---|---|---|---|---|---|---|---|---|---|---|---|
| I64 | I64 | 0 | 40000000 | 50000000 | 215.053 ms | 6.14% | 223.259 ms | 1.89% | 8.206 ms | 3.82% | FAIL |
| I64 | I64 | 0 | 50000000 | 50000000 | 226.895 ms | 6.11% | 215.685 ms | 7.89% | -11210.475 us | -4.94% | PASS |
| I64 | I64 | 0 | 40000000 | 120000000 | 428.565 ms | 7.55% | 485.504 ms | 4.38% | 56.939 ms | 13.29% | FAIL |
| I64 | I64 | 0 | 50000000 | 120000000 | 479.317 ms | 6.70% | 447.274 ms | 8.72% | -32042.610 us | -6.69% | PASS |

mixed_left_semi_join_32bit_nulls

[0] Tesla V100-SXM2-32GB

| Key Type | Payload Type | Nullable | Build Table Size | Probe Table Size | Ref Time | Ref Noise | Cmp Time | Cmp Noise | Diff | %Diff | Status |
|---|---|---|---|---|---|---|---|---|---|---|---|
| I32 | I32 | 1 | 100000 | 100000 | 926.982 us | 7.51% | 1.005 ms | 9.23% | 77.928 us | 8.41% | FAIL |
| I32 | I32 | 1 | 100000 | 400000 | 2.970 ms | 3.96% | 2.887 ms | 3.87% | -83.107 us | -2.80% | PASS |
| I32 | I32 | 1 | 10000000 | 10000000 | 61.259 ms | 0.81% | 52.022 ms | 1.73% | -9237.515 us | -15.08% | FAIL |
| I32 | I32 | 1 | 10000000 | 40000000 | 228.216 ms | 0.50% | 186.977 ms | 1.54% | -41239.266 us | -18.07% | FAIL |
| I32 | I32 | 1 | 10000000 | 100000000 | 561.330 ms | 0.95% | 453.644 ms | 1.66% | -107685.815 us | -19.18% | FAIL |
| I32 | I32 | 1 | 80000000 | 100000000 | 600.342 ms | 0.95% | 489.422 ms | 2.14% | -110920.400 us | -18.48% | FAIL |
| I32 | I32 | 1 | 100000000 | 100000000 | 604.942 ms | 0.49% | 519.131 ms | 0.28% | -85811.232 us | -14.19% | FAIL |
| I32 | I32 | 1 | 10000000 | 240000000 | 1.341 s | 0.37% | 1.073 s | 2.12% | -267571.470 us | -19.96% | FAIL |
| I32 | I32 | 1 | 80000000 | 240000000 | 1.375 s | 1.62% | 1.159 s | 0.47% | -215711.559 us | -15.69% | FAIL |
| I32 | I32 | 1 | 100000000 | 240000000 | 1.377 s | 0.37% | 1.175 s | 0.24% | -202419.389 us | -14.70% | FAIL |

mixed_left_semi_join_64bit_nulls

[0] Tesla V100-SXM2-32GB

| Key Type | Payload Type | Nullable | Build Table Size | Probe Table Size | Ref Time | Ref Noise | Cmp Time | Cmp Noise | Diff | %Diff | Status |
|---|---|---|---|---|---|---|---|---|---|---|---|
| I64 | I64 | 1 | 40000000 | 50000000 | 299.598 ms | 0.15% | 247.938 ms | 1.13% | -51659.806 us | -17.24% | FAIL |
| I64 | I64 | 1 | 50000000 | 50000000 | 306.092 ms | 1.50% | 261.783 ms | 2.19% | -44309.125 us | -14.48% | FAIL |
| I64 | I64 | 1 | 40000000 | 120000000 | 694.572 ms | 1.37% | 555.701 ms | 1.33% | -138871.070 us | -19.99% | FAIL |
| I64 | I64 | 1 | 50000000 | 120000000 | 693.432 ms | 0.08% | 575.585 ms | 1.07% | -117846.801 us | -16.99% | FAIL |

mixed_left_anti_join_32bit

[0] Tesla V100-SXM2-32GB

| Key Type | Payload Type | Nullable | Build Table Size | Probe Table Size | Ref Time | Ref Noise | Cmp Time | Cmp Noise | Diff | %Diff | Status |
|---|---|---|---|---|---|---|---|---|---|---|---|
| I32 | I32 | 0 | 100000 | 100000 | 918.153 us | 7.64% | 995.259 us | 7.74% | 77.105 us | 8.40% | FAIL |
| I32 | I32 | 0 | 100000 | 400000 | 2.712 ms | 4.84% | 2.806 ms | 3.59% | 93.587 us | 3.45% | PASS |
| I32 | I32 | 0 | 10000000 | 10000000 | 47.054 ms | 0.32% | 47.042 ms | 0.50% | -11.918 us | -0.03% | PASS |
| I32 | I32 | 0 | 10000000 | 40000000 | 156.867 ms | 0.75% | 156.341 ms | 0.71% | -525.685 us | -0.34% | PASS |
| I32 | I32 | 0 | 10000000 | 100000000 | 374.510 ms | 0.37% | 370.025 ms | 1.16% | -4484.904 us | -1.20% | FAIL |
| I32 | I32 | 0 | 80000000 | 100000000 | 443.100 ms | 0.05% | 441.946 ms | 0.24% | -1154.136 us | -0.26% | FAIL |
| I32 | I32 | 0 | 100000000 | 100000000 | 460.366 ms | 0.07% | 459.571 ms | 0.05% | -795.455 us | -0.17% | FAIL |
| I32 | I32 | 0 | 10000000 | 240000000 | 889.670 ms | 2.03% | 879.992 ms | 0.17% | -9677.595 us | -1.09% | FAIL |
| I32 | I32 | 0 | 80000000 | 240000000 | 947.377 ms | 0.47% | 953.130 ms | 0.12% | 5.753 ms | 0.61% | FAIL |
| I32 | I32 | 0 | 100000000 | 240000000 | 975.763 ms | 0.21% | 974.909 ms | 0.14% | -854.753 us | -0.09% | PASS |

mixed_left_anti_join_64bit

[0] Tesla V100-SXM2-32GB

| Key Type | Payload Type | Nullable | Build Table Size | Probe Table Size | Ref Time | Ref Noise | Cmp Time | Cmp Noise | Diff | %Diff | Status |
|---|---|---|---|---|---|---|---|---|---|---|---|
| I64 | I64 | 0 | 40000000 | 50000000 | 225.808 ms | 0.05% | 214.724 ms | 7.33% | -11084.391 us | -4.91% | FAIL |
| I64 | I64 | 0 | 50000000 | 50000000 | 222.122 ms | 7.36% | 229.684 ms | 4.47% | 7.562 ms | 3.40% | PASS |
| I64 | I64 | 0 | 40000000 | 120000000 | 430.324 ms | 8.68% | 417.045 ms | 6.88% | -13278.233 us | -3.09% | PASS |
| I64 | I64 | 0 | 50000000 | 120000000 | 430.292 ms | 6.81% | 487.004 ms | 3.25% | 56.711 ms | 13.18% | FAIL |

mixed_left_anti_join_32bit_nulls

[0] Tesla V100-SXM2-32GB

| Key Type | Payload Type | Nullable | Build Table Size | Probe Table Size | Ref Time | Ref Noise | Cmp Time | Cmp Noise | Diff | %Diff | Status |
|---|---|---|---|---|---|---|---|---|---|---|---|
| I32 | I32 | 1 | 100000 | 100000 | 944.036 us | 6.09% | 1.005 ms | 5.97% | 61.055 us | 6.47% | FAIL |
| I32 | I32 | 1 | 100000 | 400000 | 3.532 ms | 3.71% | 3.495 ms | 3.39% | -37.584 us | -1.06% | PASS |
| I32 | I32 | 1 | 10000000 | 10000000 | 70.217 ms | 0.25% | 62.038 ms | 0.32% | -8179.339 us | -11.65% | FAIL |
| I32 | I32 | 1 | 10000000 | 40000000 | 261.965 ms | 1.00% | 230.093 ms | 0.97% | -31872.124 us | -12.17% | FAIL |
| I32 | I32 | 1 | 10000000 | 100000000 | 643.455 ms | 0.49% | 562.045 ms | 1.29% | -81410.600 us | -12.65% | FAIL |
| I32 | I32 | 1 | 80000000 | 100000000 | 687.748 ms | 0.20% | 600.031 ms | 0.07% | -87716.303 us | -12.75% | FAIL |
| I32 | I32 | 1 | 100000000 | 100000000 | 692.308 ms | 0.82% | 608.738 ms | 0.13% | -83569.846 us | -12.07% | FAIL |
| I32 | I32 | 1 | 10000000 | 240000000 | 1.540 s | 0.66% | 1.339 s | 0.13% | -201690.455 us | -13.09% | FAIL |
| I32 | I32 | 1 | 80000000 | 240000000 | 1.586 s | 0.51% | 1.382 s | 0.07% | -203630.768 us | -12.84% | FAIL |
| I32 | I32 | 1 | 100000000 | 240000000 | 1.599 s | 0.11% | 1.395 s | 0.66% | -204474.926 us | -12.79% | FAIL |

mixed_left_anti_join_64bit_nulls

[0] Tesla V100-SXM2-32GB

| Key Type | Payload Type | Nullable | Build Table Size | Probe Table Size | Ref Time | Ref Noise | Cmp Time | Cmp Noise | Diff | %Diff | Status |
|---|---|---|---|---|---|---|---|---|---|---|---|
| I64 | I64 | 1 | 40000000 | 50000000 | 341.539 ms | 0.62% | 297.213 ms | 0.62% | -44326.053 us | -12.98% | FAIL |
| I64 | I64 | 1 | 50000000 | 50000000 | 349.254 ms | 0.42% | 304.333 ms | 0.16% | -44921.109 us | -12.86% | FAIL |
| I64 | I64 | 1 | 40000000 | 120000000 | 795.228 ms | 0.48% | 682.996 ms | 0.29% | -112231.978 us | -14.11% | FAIL |
| I64 | I64 | 1 | 50000000 | 120000000 | 798.917 ms | 0.24% | 690.432 ms | 0.35% | -108485.524 us | -13.58% | FAIL |
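
A note on reading the tables above: the Diff and %Diff columns are consistent with `Cmp Time - Ref Time` and that difference divided by `Ref Time`, so negative values mean the new code is faster. The Status column appears to flag any row whose |%Diff| exceeds the smaller of the two noise figures, which means FAIL marks a statistically visible change, not necessarily a regression. The sketch below reproduces that arithmetic. The rule is inferred from the rows in this comment, not taken from the comparison tool's source, and the names `bench_row`, `diff`, `pct_diff`, and `status` are hypothetical.

```cpp
#include <algorithm>
#include <cmath>
#include <string>

// One row of a reference-vs-comparison benchmark table.
// Both times must use the same unit (e.g. milliseconds);
// noise values are percentages as printed in the table.
struct bench_row {
  double ref_time;
  double ref_noise_pct;
  double cmp_time;
  double cmp_noise_pct;
};

// Diff column: comparison time minus reference time (negative means faster).
inline double diff(bench_row const& r) { return r.cmp_time - r.ref_time; }

// %Diff column: Diff relative to the reference time, in percent.
inline double pct_diff(bench_row const& r) { return 100.0 * diff(r) / r.ref_time; }

// ASSUMED rule, inferred from the posted rows: PASS when the relative change
// is no larger than the smaller noise figure, FAIL otherwise.
inline std::string status(bench_row const& r)
{
  double const min_noise = std::min(r.ref_noise_pct, r.cmp_noise_pct);
  return std::abs(pct_diff(r)) <= min_noise ? "PASS" : "FAIL";
}
```

For example, the 10000000 x 10000000 row of mixed_inner_join_32bit (34.967 ms vs 31.057 ms, noise 0.50% and 0.48%) gives a %Diff of about -11.18% and is flagged FAIL even though it is an improvement.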

@divyegala divyegala marked this pull request as ready for review April 7, 2023 14:48
@divyegala divyegala requested a review from a team as a code owner April 7, 2023 14:48
@vyasr
Contributor

vyasr commented Apr 7, 2023

These benchmark results are very (pleasantly) surprising. Without any further investigation, there are two possibilities that I see for why the new code is faster:

  • Less likely: The new hasher is somehow either faster or less memory intensive than the old one, and in this algorithm that reduction has a significant positive effect on the algorithm as a whole.
  • More likely: The experimental equality comparator somehow inlines more cleanly into the pair_expression_equality comparator that is used in this algorithm to combine the equality with the evaluator.

Will need to look back at the structs to see which is more likely or if it could be something else entirely. The second is definitely more likely IMO but also seems harder to justify offhand.
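
To make the second hypothesis concrete, here is a minimal host-side sketch of a comparator that combines a key equality check with a conditional predicate. It is not libcudf's `pair_expression_equality`; the types `key_equality`, `less_than_condition`, and `combined_pair_equality` are hypothetical stand-ins. The point is only that when both pieces are template parameters, the compiler sees concrete call operators and is free to inline them into one function body.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a row equality comparator over the key columns.
struct key_equality {
  std::vector<int> const* left_keys;
  std::vector<int> const* right_keys;
  bool operator()(std::size_t l, std::size_t r) const
  {
    return (*left_keys)[l] == (*right_keys)[r];
  }
};

// Hypothetical stand-in for an evaluated conditional expression:
// left.value < right.value.
struct less_than_condition {
  std::vector<int> const* left_values;
  std::vector<int> const* right_values;
  bool operator()(std::size_t l, std::size_t r) const
  {
    return (*left_values)[l] < (*right_values)[r];
  }
};

// A pair of rows matches only if the keys are equal AND the condition holds.
// Equality and Condition are compile-time types, so both calls can be inlined.
template <typename Equality, typename Condition>
struct combined_pair_equality {
  Equality equal;
  Condition condition;
  bool operator()(std::size_t l, std::size_t r) const
  {
    return equal(l, r) && condition(l, r);
  }
};
```

How cleanly the real comparator inlines depends on the device code and compiler, so this only illustrates the shape of the composition, not the measured effect.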

auto& build_conditional_view = swap_tables ? left_conditional_view : right_conditional_view;
row_equality equality_probe{
cudf::nullate::DYNAMIC{has_nulls}, *probe_view, *build_view, compare_nulls};
auto& probe = swap_tables ? right_equality : left_equality;
Contributor


Why do we have swap_tables?

Member Author


The smaller table is used as the build table. Because mixed_join doesn't have a stateful API like hash_join, you see the swapping happen at every step, whereas hash_join does the swapping in its constructor/outer API.

@GregoryKimball
Contributor

GregoryKimball commented Apr 18, 2023

Thank you @divyegala for suggesting this change. The automated microbenchmarks look really good! We see about 10-15% faster performance for mixed joins with nulls, and no impact on the "not nullable" variants. 🎉

@divyegala divyegala requested a review from ttnghia April 18, 2023 16:16
@vyasr
Contributor

vyasr commented Apr 19, 2023

Are we waiting on anything here? Do we want to do some further investigation to understand the benchmarks before moving forward? We might discover something helpful to improve performance of other PRs using the new comparator/hasher, but I doubt it since the characteristics are likely to be very algorithm-specific.

@divyegala
Member Author

@vyasr just waiting on reviews. If there are any insights to be gleaned from this PR that could be applied to other algorithms, that should be async. Please review if you have some time!

@davidwendt
Contributor

Does this not require new unit tests? A description for this PR would be helpful I think.

@divyegala
Member Author

> Does this not require new unit tests? A description for this PR would be helpful I think.

@davidwendt I updated the description

@divyegala
Member Author

/merge

@rapids-bot rapids-bot bot merged commit bccf3ab into rapidsai:branch-23.06 Apr 21, 2023
49 checks passed
Labels: feature request, libcudf, non-breaking