Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[#1573] feat(spark-connector): Spark regression test system #3933

Merged
merged 11 commits into from
Aug 2, 2024

Conversation

FANNG1
Copy link
Contributor

@FANNG1 FANNG1 commented Jun 21, 2024

What changes were proposed in this pull request?

add Spark regression test system to do Regression test for SparkSQL.

Why are the changes needed?

Fix: #1573

Does this PR introduce any user-facing change?

no

How was this patch tested?

add IT

@FANNG1 FANNG1 marked this pull request as draft June 21, 2024 03:34
@FANNG1 FANNG1 force-pushed the r-test branch 11 times, most recently from a6923a2 to 744990b Compare June 26, 2024 15:25
@FANNG1 FANNG1 changed the title [SIP] Spark Regression test [#1573] feat(spark-connector): Spark regression test system Jun 27, 2024
@FANNG1 FANNG1 marked this pull request as ready for review June 27, 2024 01:08
@FANNG1
Copy link
Contributor Author

FANNG1 commented Jun 27, 2024

@yuqi1129 @diqiu50 @jerryshao @qqqttt123 , please help to review when you free, thanks

@diqiu50
Copy link
Contributor

diqiu50 commented Jun 27, 2024

What changes were proposed in this pull request?

add Spark regression test system to do Regression test for SparkSQL.

Why are the changes needed?

Fix: #1573

Does this PR introduce any user-facing change?

no

How was this patch tested?

add IT

The links are not accessed

@FANNG1
Copy link
Contributor Author

FANNG1 commented Jun 27, 2024

What changes were proposed in this pull request?

add Spark regression test system to do Regression test for SparkSQL.

Why are the changes needed?

Fix: #1573

Does this PR introduce any user-facing change?

no

How was this patch tested?

add IT

The links are not accessed

I could access the links, which links are not accessed? could you try again?

@FANNG1
Copy link
Contributor Author

FANNG1 commented Jul 31, 2024

@diqiu50 @yuqi1129 @jerryshao , do you have time to review this?

@jerryshao
Copy link
Contributor

I'm not sure if you add docs about how to test it or not, if not please add the docs.

public String getWarehouseLocation() {
return get(WAREHOUSE_DIR);
}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we really need configurations for tests? I feel think we don't have to formalize them just for test, WDYT?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I prefer to keep the configs to control the test behavior better.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you want to keep this configuration, you should have a better name to clarify its scope and usage, currently I feel the name is randomly picked, some of them are not so meaningful.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

updated

@@ -144,10 +144,14 @@ tasks.test {

val skipITs = project.hasProperty("skipITs")
val skipSparkITs = project.hasProperty("skipSparkITs")
val skipSparkSQLITs = project.hasProperty("skipSparkSQLITs")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need a configuration to enable or disable this test?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel like there are too many configurations to control whether to run something or not, do we have better solutions?

I would incline to use gradle to control whether to run this or not, by default we would not run these SQL tests.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried to define a specific task to run the test, but failed to initIntegrationTest. So I'd like to define a enableSparkSQLTests flag to enable the test explicitly, WDYT?

tasks.register("testSQL") {
  dependsOn("compileTestJava")

  doLast {
    javaexec {
      mainClass.set("org.apache.gravitino.spark.connector.integration.test.sql.SparkSQLRegressionTest33")
      classpath = sourceSets["test"].runtimeClasspath
      environment("GRAVITINO_CI_HIVE_DOCKER_IMAGE", "datastrato/gravitino-ci-hive:0.1.13")
     // failed to init initIntegrationTest in javaexec
      val init = project.extra.get("initIntegrationTest") as (Test) -> Unit
      init(this)
    }
  }
}

Comment on lines 30 to 37
switch (str.toLowerCase()) {
case "hive":
return HIVE;
case "lakehouse-iceberg":
return ICEBERG;
default:
return UNKNOWN;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
switch (str.toLowerCase()) {
case "hive":
return HIVE;
case "lakehouse-iceberg":
return ICEBERG;
default:
return UNKNOWN;
}
for (CatalogType type : CatalogType.values())
if (type.name.equals(str.toUpperCase())) {
//....
}
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LAKEHOUSE-ICEBERG, couldn't be used as enum values

Comment on lines 272 to 282
private static File stringToFile(Path path, String str) throws IOException {
File file = path.toFile();
try (PrintWriter out = new PrintWriter(file, StandardCharsets.UTF_8.toString())) {
out.write(str);
}
return file;
}

private String fileToString(Path filePath) throws IOException {
return new String(Files.readAllBytes(filePath), StandardCharsets.UTF_8);
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FileUtils.writeStringToFile();
FileUtils.readFileToString()

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

* "com.datastrato.gravitino.spark.connector.integration.test.sql.SparkSQLRegressionTest"
* -PconfigFile=/xxx/xx
*/
@Tag("gravitino-docker-it")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

gravitino-docker-test

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

.toString();

private static final ConfigEntry<String> TEST_BASE_DIR =
new ConfigBuilder("gravitino.test.sql.dir")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

gravitino.test.sql.dir -> gravitino.spark.test.dir

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@FANNG1
Copy link
Contributor Author

FANNG1 commented Aug 1, 2024

I'm not sure if you add docs about how to test it or not, if not please add the docs.

update document

@jerryshao
Copy link
Contributor

@yuqi1129 would you please take another look?

}
}

// To find the catalog type from the directory name, the first name in directory match CatalogType
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the first non-unknown CatalogType from parent to child determines the catalog type.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@yuqi1129
Copy link
Contributor

yuqi1129 commented Aug 2, 2024

Others LGTM

@FANNG1
Copy link
Contributor Author

FANNG1 commented Aug 2, 2024

add support exclude test resources from test jar

@jerryshao jerryshao merged commit f288d91 into apache:main Aug 2, 2024
33 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Subtask] [spark-connector] setup a test system to cover existings spark&hive sql tests
4 participants