Minor: Add routine to debug join fuzz tests #10970

comphead · 2024-06-18T01:03:42Z

Which issue does this PR close?

Closes #.

Rationale for this change

It is pretty complicated to find the exact root cause for issues triggered by fuzz tests. Those tests are random by nature and sometimes require stable inputs to reproduce the case, especially when the test is flaky. Adding debug procedures and docs on how to use them while debugging #10886

What changes are included in this PR?

Are these changes tested?

Are there any user-facing changes?

alamb

Looks like an improvement to me -- thanks for improving the testing infrastructure @comphead

alamb · 2024-06-18T14:48:13Z

datafusion/core/tests/fuzz_cases/join_fuzz.rs

+
+            // if debug flag is set the test will save randomly generated inputs and outputs to user folders
+            // so it is easy to run debug on top of the failed test data
+            if debug {


👍

It might make sense to add this note about saving inputs and outputs to the overall docstring of this function (run_test) so people don't have to read the implementation to find out what debug does.

alamb · 2024-06-18T14:49:38Z

datafusion/core/tests/fuzz_cases/join_fuzz.rs

@@ -383,7 +419,7 @@ impl JoinFuzzTestCase {

    /// Perform sort-merge join and hash join on same input
    /// and verify two outputs are equal
-    async fn run_test(&self) {
+    async fn run_test(&self, join_test: &[JoinTest], debug: bool) {


Since this is already on a struct, another way to pass in the debug flag would be as a new field on JoinFuzzTestCase

like

struct JoinFuzzTestCase {
    batch_sizes: &'static [usize],
    input1: Vec<RecordBatch>,
    input2: Vec<RecordBatch>,
    join_type: JoinType,
    join_filter_builder: Option<JoinFilterBuilder>,
    // if debug flag is set the test will save randomly generated inputs and outputs to user folders
    // so it is easy to run debug on top of the failed test data
    debug: bool,
}
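To illustrate the field-based approach suggested above, here is a minimal sketch using only the standard library. It is not the code from this PR: `FuzzCase`, `with_debug`, and the log-returning `run_test` are hypothetical stand-ins, and plain integer vectors replace `RecordBatch` so the example compiles on its own.

```rust
/// Simplified stand-in for `JoinFuzzTestCase` (hypothetical, for illustration only).
/// Each input is a list of partitions; each partition is a list of integer rows.
struct FuzzCase {
    input1: Vec<Vec<i32>>,
    input2: Vec<Vec<i32>>,
    /// When set, `run_test` also reports the generated inputs.
    debug: bool,
}

impl FuzzCase {
    /// Builder-style setter, so callers opt in to debugging
    /// without `run_test` needing an extra parameter.
    fn with_debug(mut self, debug: bool) -> Self {
        self.debug = debug;
        self
    }

    /// Returns the lines a run would report. The real test would execute
    /// the joins and compare their outputs; this sketch only shows where
    /// the `debug` field is consulted.
    fn run_test(&self) -> Vec<String> {
        let mut log = vec![format!(
            "ran joins on {} x {} partitions",
            self.input1.len(),
            self.input2.len()
        )];
        if self.debug {
            // Assumption: the real code would save the data to files here.
            log.push(format!("input1 = {:?}", self.input1));
            log.push(format!("input2 = {:?}", self.input2));
        }
        log
    }
}
```

With this shape, a failing case can be re-run as `case.with_debug(true).run_test()` and every existing call site of `run_test()` stays unchanged.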

alamb · 2024-06-18T14:49:55Z

datafusion/core/tests/fuzz_cases/join_fuzz.rs

+    /// This method is useful for debugging fuzz tests.
+    /// It helps to save randomly generated input test data for both join inputs into the user folder
+    /// as parquet files, preserving partitioning.
+    /// Once the data is saved it is possible to run a custom test on top of the saved data and debug
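The save-then-replay idea described in this docstring can be sketched with the standard library alone. This is an illustration under stated assumptions, not the PR's implementation: `save_partitions` and `load_partitions` are hypothetical names, the `part-<i>.csv` layout is invented for the example, and plain text files stand in for the Parquet files the PR writes.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Writes each partition to `<dir>/<name>/part-<i>.csv`, one value per line,
/// so the original partitioning is preserved on disk.
/// Hypothetical helper: text files stand in for Parquet here.
fn save_partitions(dir: &Path, name: &str, partitions: &[Vec<i64>]) -> io::Result<Vec<PathBuf>> {
    let out_dir = dir.join(name);
    fs::create_dir_all(&out_dir)?;
    let mut paths = Vec::with_capacity(partitions.len());
    for (i, rows) in partitions.iter().enumerate() {
        let path = out_dir.join(format!("part-{i}.csv"));
        let body: String = rows.iter().map(|v| format!("{v}\n")).collect();
        fs::write(&path, body)?;
        paths.push(path);
    }
    Ok(paths)
}

/// Reads `count` partitions back in index order so a custom test
/// can replay exactly the data that triggered the failure.
fn load_partitions(dir: &Path, name: &str, count: usize) -> io::Result<Vec<Vec<i64>>> {
    let mut partitions = Vec::with_capacity(count);
    for i in 0..count {
        let text = fs::read_to_string(dir.join(name).join(format!("part-{i}.csv")))?;
        let mut rows = Vec::new();
        for line in text.lines() {
            let value = line
                .parse::<i64>()
                .map_err(|e| io::Error::new(io::ErrorKind::InvalidData, e))?;
            rows.push(value);
        }
        partitions.push(rows);
    }
    Ok(partitions)
}
```

Keeping one file per partition means a replayed test sees the same batch boundaries as the failing run, which matters when a bug depends on how rows are split across batches.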


comphead · 2024-06-18T17:25:39Z

Thanks @alamb for the review

comphead added 11 commits June 17, 2024 16:08

Fix: Sort Merge Join crashes on TPCH Q21

afe0fd7

Fix LeftAnti SMJ join when the join filter is set

7363567

rm dbg

21c3543

Minor: disable fuzz test to avoid CI spontaneous failures

6284875

Minor: disable fuzz test to avoid CI spontaneous failures

e118df7

Fix: Sort Merge Join crashes on TPCH Q21

be263ac

Fix LeftAnti SMJ join when the join filter is set

ec78dc3

rm dbg

aaaea61

Minor: disable fuzz test to avoid CI spontaneous failures

9c5b194

Minor: disable fuzz test to avoid CI spontaneous failures

d3d6839

Minor: Add routine to debug join fuzz tests

c128aa3

github-actions bot added the core Core datafusion crate label Jun 18, 2024

comphead added 2 commits June 17, 2024 18:06

Minor: Add routine to debug join fuzz tests

811a218

Minor: Add routine to debug join fuzz tests

6ef3ace

alamb approved these changes Jun 18, 2024

comphead added 2 commits June 18, 2024 09:06

Minor: Add routine to debug join fuzz tests

6d6fa26

Minor: Add routine to debug join fuzz tests

d131a64

comphead merged commit e9f9a23 into apache:main Jun 18, 2024
23 checks passed
