Skip to content

[CALCITE-6042] Add test cases for ARRAY-related functions by using spark array function#3462

Open
chucheng92 wants to merge 1 commit intoapache:mainfrom
chucheng92:CALCITE-6042
Open

[CALCITE-6042] Add test cases for ARRAY-related functions by using spark array function#3462
chucheng92 wants to merge 1 commit intoapache:mainfrom
chucheng92:CALCITE-6042

Conversation

@chucheng92
Copy link
Member

@chucheng92 chucheng92 commented Oct 12, 2023

https://issues.apache.org/jira/browse/CALCITE-6042

Add test cases by using spark array function rather than calcite array constructor.
these cases only work in spark library.

@chucheng92 chucheng92 changed the title [CALCITE-6042] Add test cases for ARRAY-related functions by using spark array function rather than array constructor(Spark Library) [CALCITE-6042] Add test cases for ARRAY-related functions by using spark array function (Spark Library) Oct 12, 2023
+ "'SORT_ARRAY\\(<INTEGER ARRAY>, <INTEGER>\\)'\\."
+ " Supported form\\(s\\): 'SORT_ARRAY\\(<ARRAY>\\)'\n"
+ "'SORT_ARRAY\\(<ARRAY>, <BOOLEAN>\\)'", false);

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This code is highly repetitive. There should be a way to reduce repetitions and make sure that the exact same code is executed modulo the array constructor. The test should be inside of a loop that iterates twice: once for constructors, once for functions.

Here is a possible way to implement this, not necessarily the best:

The query is generated by some string substitution process, e.g., a helper function "array_query" that is invoked like: array_query(false, "sort_array(%A%0)", "2, null, 1")

First argument is a false for arrays, and true for constructors, the second argument is a pattern, and the following arguments are all strings, with values to substitute in the pattern for %0, %1, etc. In this example %A%0 is substituted with either array(%0) or array[%0]; afterwards %0 is a positional argument that is substituted with the first argument.

If some tests do not work with constructors perhaps a second flag is needed to skip these, but I have proposed in the past to allow constructors for empty arrays. Hopefully we'll implement that as well, and then all the tests should pass in both cases.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mihaibudiu thanks mihai for reviewing, good comments, agree with you. I till try to optimize this duplication.

@chucheng92
Copy link
Member Author

chucheng92 commented Oct 13, 2023

I have wrote a array-expr-generator, it looks like:

the usage:

    // array_append(array(1), 2) & array_append(array[1], 2)
    f.checkScalarArray(ArrayType.ALL, "[1, 2]", "INTEGER NOT NULL ARRAY NOT NULL",
        "array_append(%A%0, %1)", "1", "2");

    // array_append(array(), 1)        
    f.checkScalarArray(ArrayType.FUNCTION, "[1]", "INTEGER NOT NULL ARRAY NOT NULL",
        "array_append(%A%0, %1)", "", "1");

    // array_append(array(array(1,2)), array(3,4)) & array_append(array[array[1,2]], array[3,4]) 
    // -> It's a nested array case.
    f.checkScalarArray(ArrayType.ALL, "[[1, 2], [3, 4]]",
        "INTEGER NOT NULL ARRAY NOT NULL ARRAY NOT NULL",
        "array_append(%A1%0, %A%1)", "1, 2", "3, 4");

the implemented method:

  default void checkScalarArray(
      ArrayType syntaxType,
      String result,
      String resultType,
      String exprPattern,
      String... exprArgs) {
    if (syntaxType == ArrayType.ALL) {
      String constructorExpr = generateArrayExpr(true, exprPattern, exprArgs);
      String functionExpr = generateArrayExpr(false, exprPattern, exprArgs);
      checkScalar(constructorExpr, result, resultType);
      checkScalar(functionExpr, result, resultType);
    } else {
      boolean useConstructor = syntaxType == ArrayType.CONSTRUCTOR;
      String arrayExpr = generateArrayExpr(useConstructor, exprPattern, exprArgs);
      checkScalar(arrayExpr, result, resultType);
    }
  }

generateArrayExpr generate array() or array[] with given syntaxType, and it supports nested array.

  • exprPattern
    %A[n]%i placeholders are used for array syntax, where n is the optional nested level and i is the index of the argument to be inserted. %i placeholders without %A will be replaced without array syntax.

  • exprArgs
    Arguments to replace the placeholders in the expression pattern. They will replace %0, %1, etc.

I feel it will be less readable, but it does remove duplicate code.
I want to know what everyone thinks of this plan? If possible, I will update all array cases to this form.

@mihaibudiu what do you think?

@mihaibudiu
Copy link
Contributor

There is actually an even better way to do it, I should have thought about it sooner. The idea is to use Calcite itself to rewrite the code. The way this is done is by implementing a SqlShuttle which replaces calls to one of the operators with calls to the other one. Then for each test you:

  • run the test as is
  • parse the Sql, run the shuttle, unparse the sql
  • run the resulting sql

Let me know if you need help implementing this solution.

@chucheng92
Copy link
Member Author

chucheng92 commented Oct 15, 2023

thanks for suggestions. @mihaibudiu

Frankly speaking, I'm not sure if this is the best solution, it may be over-optimized, and the duplication in the PR may just be sonar's strategy (since I handle all array functions)

Try to see if others have other opinions, I might be leaning towards good readability.

@chucheng92 chucheng92 changed the title [CALCITE-6042] Add test cases for ARRAY-related functions by using spark array function (Spark Library) [CALCITE-6042] Add test cases for ARRAY-related functions by using spark array function Oct 19, 2023
@sonarqubecloud
Copy link

Kudos, SonarCloud Quality Gate passed!    Quality Gate passed

Bug A 0 Bugs
Vulnerability A 0 Vulnerabilities
Security Hotspot A 0 Security Hotspots
Code Smell A 19 Code Smells

100.0% 100.0% Coverage
43.4% 43.4% Duplication

@github-actions
Copy link

This pull request has been marked as stale due to 30 days of inactivity. It will be closed in 90 days if no further activity occurs. If you think that’s incorrect or this pull request requires a review, please simply write any comment. If closed, you can revive the PR at any time and @mention a reviewer or discuss it on the dev@calcite.apache.org list. Thank you for your contributions.

@github-actions
Copy link

This pull request has been marked as stale due to 30 days of inactivity. It will be closed in 90 days if no further activity occurs. If you think that’s incorrect or this pull request requires a review, please simply write any comment. If closed, you can revive the PR at any time and @mention a reviewer or discuss it on the dev@calcite.apache.org list. Thank you for your contributions.

@github-actions
Copy link

This pull request has been marked as stale due to 30 days of inactivity. It will be closed in 90 days if no further activity occurs. If you think that’s incorrect or this pull request requires a review, please simply write any comment. If closed, you can revive the PR at any time and @mention a reviewer or discuss it on the dev@calcite.apache.org list. Thank you for your contributions.

@github-actions github-actions bot added the stale label Jun 28, 2025
@github-actions github-actions bot removed the stale label Sep 7, 2025
@github-actions
Copy link

This pull request has been marked as stale due to 30 days of inactivity. It will be closed in 90 days if no further activity occurs. If you think that’s incorrect or this pull request requires a review, please simply write any comment. If closed, you can revive the PR at any time and @mention a reviewer or discuss it on the dev@calcite.apache.org list. Thank you for your contributions.

@github-actions github-actions bot added the stale label Oct 25, 2025
@github-actions github-actions bot removed the stale label Jan 12, 2026
@github-actions
Copy link

This pull request has been marked as stale due to 30 days of inactivity. It will be closed in 90 days if no further activity occurs. If you think that’s incorrect or this pull request requires a review, please simply write any comment. If closed, you can revive the PR at any time and @mention a reviewer or discuss it on the dev@calcite.apache.org list. Thank you for your contributions.

@github-actions github-actions bot added the stale label Feb 12, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants