Skip to content

Commit

Permalink
Add new SnowflakeCatalog implementation to enable directly using Snow…
Browse files Browse the repository at this point in the history
…flake-managed Iceberg tables (apache#6428)

* Initial read-only Snowflake Catalog implementation by @sfc-gh-mparmar (#1)

Initial read-only Snowflake Catalog implementation built on top of the Snowflake JDBC driver,
providing support for basic listing of namespaces, listing of tables, and loading/reads of tables.

Auth options are passthrough to the JDBC driver.

Co-authored-by: Maninder Parmar <maninder.parmar@snowflake.com>
Co-authored-by: Maninder Parmar <maninder.parmar+oss@snowflake.com>
Co-authored-by: Dennis Huo <dennis.huo+oss@snowflake.com>

* Add JdbcSnowflakeClientTest using mocks (#2)

Add JdbcSnowflakeClientTest using mocks; provides full coverage of JdbcSnowflakeClient
and entities' ResultSetHandler logic.

Also update target Spark runtime versions to be included.

* Add test { useJUnitPlatform() } tuple to iceberg-snowflake for
consistency and future interoperability with inheriting from abstact
unittest base classes.

* Extract versions into versions.props per PR review

* Misc test-related refactors per review suggestions
-Convert unittests to all use assertj/Assertions for "fluent assertions"
-Refactor test injection into overloaded initialize() method
-Add test cases for close() propagation
-Use CloseableGroup.

* Fix unsupported behaviors of loadNamedpaceMetadata and defaultWarehouseLocation

* Move TableIdentifier checks out of newTableOps into the
SnowflakTableOperations class itself, add test case.

* Refactor out any Namespace-related business logic from the lower
SnowflakeClient/JdbcSnowflakeClient layers and merge SnowflakeTable
and SnowflakeSchema into a single SnowflakeIdentifier that also
encompasses ROOT and DATABASE level identifiers.

A SnowflakeIdentifier thus functions like a type-checked/constrained
Iceberg TableIdentifier, and eliminates any tight coupling between
a SnowflakeClient and Catalog business logic.

Parsing of Namespace numerical levels into a SnowflakeIdentifier
is now fully encapsulated in NamespaceHelpers so that callsites
don't duplicate namespace-handling/validation logic.

* Finish migrating JdbcSnowflakeClientTest off any usage of org.junit.Assert
in favor of assertj's Assertions.

* Style refactorings from review comments, expanded and moved InMemoryFileIO into core
with its own unittest.

* Fix behavior of getNamespaceMetadata to throw when the namespace doesn't
exist.

Refactor for naming conventions and consolidating identifier
handling into NamespaceHandlers.

Make FileIO instantiated fresh for each newTableOps call.

* Move private constructor to top, add assertion to test case.

* Define minimal ResultSetParser/QueryHarness classes to fully replace
any use of commons-dbutils; refactor ResultSet handling fully into
JdbcSnowflakeClient.java.

* Update snowflake/src/main/java/org/apache/iceberg/snowflake/SnowflakeTableOperations.java

Co-authored-by: Eduard Tudenhöfner <etudenhoefner@gmail.com>

* Refactor style suggestions; remove debug-level logging, arguments in exceptions,
private members if not accessed outside, move precondition checks, add test for
NamespaceHelpers.

* Fix precondition messages, remove getConf()

* Clean up varargs.

* Make data members final, include rawJsonVal in toString for debuggability.

* Combine some small test cases into roundtrip test cases, misc cleanup

* Add comment for why a factory class is exposed for testing purposes.

Co-authored-by: Dennis Huo <dennis.huo@snowflake.com>
Co-authored-by: Maninder Parmar <maninder.parmar@snowflake.com>
Co-authored-by: Maninder Parmar <maninder.parmar+oss@snowflake.com>
Co-authored-by: Eduard Tudenhöfner <etudenhoefner@gmail.com>
  • Loading branch information
5 people committed Jan 14, 2023
1 parent 7943e94 commit e56c1cf
Show file tree
Hide file tree
Showing 19 changed files with 2,415 additions and 4 deletions.
4 changes: 3 additions & 1 deletion .github/labeler.yml
Original file line number Diff line number Diff line change
Expand Up @@ -81,4 +81,6 @@ ALIYUN:
GCP:
- gcp/**/*
DELL:
- dell/**/*
- dell/**/*
SNOWFLAKE:
- snowflake/**/*
18 changes: 18 additions & 0 deletions build.gradle
Original file line number Diff line number Diff line change
Expand Up @@ -697,6 +697,24 @@ project(':iceberg-dell') {
}
}

project(':iceberg-snowflake') {
test {
useJUnitPlatform()
}

dependencies {
implementation project(':iceberg-core')
implementation project(':iceberg-common')
implementation project(path: ':iceberg-bundled-guava', configuration: 'shadow')
implementation "com.fasterxml.jackson.core:jackson-databind"
implementation "com.fasterxml.jackson.core:jackson-core"

runtimeOnly("net.snowflake:snowflake-jdbc")

testImplementation project(path: ':iceberg-core', configuration: 'testArtifacts')
}
}

@Memoized
boolean versionFileExists() {
return file('version.txt').exists()
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -27,12 +27,12 @@
import org.apache.iceberg.CatalogProperties;
import org.apache.iceberg.ClientPoolImpl;

class JdbcClientPool extends ClientPoolImpl<Connection, SQLException> {
public class JdbcClientPool extends ClientPoolImpl<Connection, SQLException> {

private final String dbUrl;
private final Map<String, String> properties;

JdbcClientPool(String dbUrl, Map<String, String> props) {
public JdbcClientPool(String dbUrl, Map<String, String> props) {
this(
Integer.parseInt(
props.getOrDefault(
Expand All @@ -42,7 +42,7 @@ class JdbcClientPool extends ClientPoolImpl<Connection, SQLException> {
props);
}

JdbcClientPool(int poolSize, String dbUrl, Map<String, String> props) {
public JdbcClientPool(int poolSize, String dbUrl, Map<String, String> props) {
super(poolSize, SQLNonTransientConnectionException.class, true);
properties = props;
this.dbUrl = dbUrl;
Expand Down
2 changes: 2 additions & 0 deletions settings.gradle
Original file line number Diff line number Diff line change
Expand Up @@ -34,6 +34,7 @@ include 'hive-metastore'
include 'nessie'
include 'gcp'
include 'dell'
include 'snowflake'

project(':api').name = 'iceberg-api'
project(':common').name = 'iceberg-common'
Expand All @@ -51,6 +52,7 @@ project(':hive-metastore').name = 'iceberg-hive-metastore'
project(':nessie').name = 'iceberg-nessie'
project(':gcp').name = 'iceberg-gcp'
project(':dell').name = 'iceberg-dell'
project(':snowflake').name = 'iceberg-snowflake'

if (null != System.getProperty("allVersions")) {
System.setProperty("flinkVersions", System.getProperty("knownFlinkVersions"))
Expand Down
Loading

0 comments on commit e56c1cf

Please sign in to comment.