
Iceberg: use LocationProvider instead of hardcoded path #8573

Merged: 2 commits, Aug 12, 2021

Conversation

jackye1995 (Member)

Use Iceberg's LocationProvider instead of hard-coding file paths. Because data file paths are currently hard-coded in the Hive-style layout, users on cloud storage cannot benefit from Iceberg's ObjectStorageLocationProvider. This PR fixes that.
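For context, a minimal sketch of the difference, assuming an Iceberg Table handle (this is an illustration, not the PR's actual code):

import org.apache.iceberg.LocationProviders;
import org.apache.iceberg.Table;
import org.apache.iceberg.io.LocationProvider;

public final class DataLocationSketch
{
    private DataLocationSketch() {}

    public static String newDataLocation(Table table, String fileName)
    {
        // Old behavior (roughly): always use the Hive-style layout under the table location,
        // e.g. table.location() + "/data/" + fileName.

        // New behavior: ask Iceberg which layout to use. If the table sets
        // write.object-storage.enabled / write.object-storage.path, the returned provider
        // places data files under the configured object storage path instead.
        LocationProvider locations = LocationProviders.locationsFor(table.location(), table.properties());
        return locations.newDataLocation(fileName);
    }
}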

@findepi (Member) commented Jul 16, 2021

cc @losipiuk @alexjo2144

{
return outputPath;
if (locationProvider == null) {
locationProvider = deserializeFromBytes(serializedLocationProvider);
Member

We do not use Java serialization. In particular, it can lead to security issues.

Member Author

Sure, there are multiple ways we can go with this. We could also get the location provider through dependency injection of the table operations, since there is general agreement that HiveTableOperations will be used across the board.

Member Author

Having thought about it more, it would be inefficient to initialize a table operations instance, read the table metadata, and get the location provider from it. The alternative is to pass in the table properties. Please see if the new version works, thanks!

@alexjo2144 (Member) left a comment

The code changes seem reasonable, just one broader thought. A table could specify any custom LocationProvider implementation in the properties. Is that location guaranteed to be compatible with org.apache.hadoop.fs.Path?

Alternatively, we could scope this down to just supporting DefaultLocationProvider and ObjectStoreLocationProvider by introducing a config or table property option and initializing the LocationProvider directly rather than through LocationProviders#locationsFor.

@jackye1995 (Member Author)

@alexjo2144 thanks for the feedback.

we could scope this down to just supporting DefaultLocationProvider and ObjectStoreLocationProvider

Sure, I can introduce another static util method instead of using the Iceberg one.

Is that location guaranteed to be compatible with org.apache.hadoop.fs.Path?

This is related to the multi-catalog discussion; the conclusion there was that:

  1. multiple catalogs will be supported, but only ones explicitly listed by Trino through an enum; no custom catalog loading is supported.
  2. HdfsFileIO is the only FileIO implementation used across all catalogs.

So yes, it's guaranteed to be compatible.

@@ -296,4 +292,12 @@ public static Object deserializePartitionValue(Type type, String valueString, St

return Collections.unmodifiableMap(partitionKeys);
}

public static LocationProvider getLocationProvider(String tableLocation, Map<String, String> properties)
Member Author

@alexjo2144 I think this is the best we can do on the Trino side. The two location provider classes are not public, so we can only block creation when the write.location-provider.impl table property is set.
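A hedged sketch of what such a guard might look like (not necessarily the PR's exact code; TableProperties.WRITE_LOCATION_PROVIDER_IMPL is Iceberg's constant for write.location-provider.impl):

import java.util.Map;

import org.apache.iceberg.LocationProviders;
import org.apache.iceberg.TableProperties;
import org.apache.iceberg.io.LocationProvider;

import static com.google.common.base.Preconditions.checkArgument;

public final class LocationProviderSketch
{
    private LocationProviderSketch() {}

    public static LocationProvider getLocationProvider(String tableLocation, Map<String, String> tableProperties)
    {
        // Reject custom providers up front; only the built-in default and object storage
        // providers can be reached through LocationProviders.locationsFor below.
        checkArgument(!tableProperties.containsKey(TableProperties.WRITE_LOCATION_PROVIDER_IMPL),
                "Table has a custom location provider: %s",
                tableProperties.get(TableProperties.WRITE_LOCATION_PROVIDER_IMPL));
        return LocationProviders.locationsFor(tableLocation, tableProperties);
    }
}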

@findepi (Member) left a comment

(just skimming)

@@ -159,14 +159,6 @@ public static long resolveSnapshotId(Table table, long snapshotId)
return columns.build();
}

public static String getDataPath(String location)
Member

doesn't seem to belong to "pass in table properties to initialize location provider in sink prov…" commit

return new IcebergPageSink(
schema,
partitionSpec,
tableHandle.getOutputPath(),
locationProvider,
Member

doesn't seem to belong to "pass in table properties to initialize location provider in sink prov…" commit

@alexjo2144 (Member)

@findepi did you have an opinion on allowing custom LocationProviders vs restricting?

@jackye1995 this looks good to me. Can you just add a test case to TestSparkCompatibility that does the following (see the sketch after this list):

  • Create table in Spark with ObjectStorageLocationProvider
  • Insert data from Spark
  • Insert data from Trino
  • Read from both and ensure both get the same data
  • Read the "$path" column and ensure data files have the correct path structure
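A rough sketch of such a test, pieced together from the fragments quoted later in this thread (helper names like sparkTableName/trinoTableName, row, and assertThat are assumed from the existing TestSparkCompatibility product test class; imports are elided):

@Test(groups = {ICEBERG, PROFILE_SPECIFIC_TESTS})
public void testTrinoWritingDataWithObjectStorageLocationProvider()
{
    String baseTableName = "test_object_storage_location_provider";
    String sparkTableName = sparkTableName(baseTableName);
    String trinoTableName = trinoTableName(baseTableName);
    String dataPath = "hdfs://hadoop-master:9000/user/hive/warehouse/test_object_storage_location_provider/obj-data";

    // Create the table from Spark with the object storage location provider enabled
    onSpark().executeQuery(format(
            "CREATE TABLE %s (_string STRING, _bigint BIGINT) USING ICEBERG TBLPROPERTIES (" +
                    "'write.object-storage.enabled'=true," +
                    "'write.object-storage.path'='%s')",
            sparkTableName, dataPath));

    // Insert from Trino so the IcebergPageSink changes are exercised
    onTrino().executeQuery(format("INSERT INTO %s VALUES ('a_string', 1000000000000000)", trinoTableName));

    // Both engines should read back the same row
    Row expected = row("a_string", 1000000000000000L);
    assertThat(onSpark().executeQuery("SELECT * FROM " + sparkTableName)).containsOnly(expected);
    assertThat(onTrino().executeQuery("SELECT * FROM " + trinoTableName)).containsOnly(expected);

    // Every data file should live under the configured object storage path
    QueryResult files = onTrino().executeQuery(
            format("SELECT file_path FROM %s", trinoTableName("\"" + baseTableName + "$files\"")));
    files.rows().forEach(fileRow -> Assertions.assertThat((String) fileRow.get(0)).contains(dataPath));

    onTrino().executeQuery("DROP TABLE " + trinoTableName);
}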

@jackye1995 force-pushed the location-provider-path branch 6 times, most recently from 9bab127 to 0ec481a on July 20, 2021 at 18:19
@jackye1995 (Member Author)

@findepi thanks for the comments. I am still getting familiar with the standards here, so sorry for the bad commit message; I was used to committing with arbitrary messages and merging with squash. I have force-pushed everything into a single commit, please let me know if this works.

@alexjo2144 test added, should be good for another look

"'write.object-storage.enabled'=true," +
"'write.object-storage.path'='" + dataPath + "')";
onSpark().executeQuery(format(sparkTableDefinition, sparkTableName));
onSpark().executeQuery(format("INSERT INTO %s VALUES ('a_string', 1000000000000000)", sparkTableName));
Member

It's important to do an INSERT from Trino here too. That's what runs the changes you made in the IcebergPageSink

Member Author

Yeah, good point: inserting from Spark does not actually test anything here, so I changed the insert to be executed with onTrino().

If we want to test that Trino can read Spark data written with the object storage location provider, that belongs in the testPrestoReadingSparkData test. But LocationProvider is a write-side feature; on the read side we just use the file paths recorded in the manifests, so I think there is little value in adding that test case.

@jackye1995 force-pushed the location-provider-path branch 2 times, most recently from fe90e49 to 34eafea on July 20, 2021 at 22:37
@alexjo2144 (Member) left a comment

Looks good to me. The option to create tables using ObjectStoreLocationProvider from Trino would be nice, but that can be a separate issue/PR.

@jackye1995 (Member Author)

@alexjo2144 @findepi any updates on this? Please let me know if any changes are needed; otherwise, could you merge the PR? Thanks!

@electrum (Member) left a comment

If I'm understanding this correctly, the location provider should be specific to the catalog implementation, correct? If so, we should add this to the upcoming catalog API. Or is this something the user should configure, for example depending on whether they are using HDFS or S3?

Either way, this shouldn't be, or be passed in as, a table property. We can discuss this on Slack if that's easier.

this.fileWriterFactory = requireNonNull(fileWriterFactory, "fileWriterFactory is null");
this.hdfsEnvironment = requireNonNull(hdfsEnvironment, "hdfsEnvironment is null");
this.hdfsContext = requireNonNull(hdfsContext, "hdfsContext is null");
this.jobConf = toJobConf(hdfsEnvironment.getConfiguration(hdfsContext, new Path(outputPath)));
this.jobConf = toJobConf(hdfsEnvironment.getConfiguration(hdfsContext, new Path(locationProvider.newDataLocation(""))));
Member

This parameter is for a filename, so an empty filename would seem to be invalid, and thus a location provider might reject it. We could use a default name like data-file instead.
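Applied to the quoted line, the suggestion would look roughly like this (data-file is just a placeholder name; the provider only needs a non-empty file name to produce a valid path for the configuration lookup):

// use a placeholder file name instead of an empty string
this.jobConf = toJobConf(hdfsEnvironment.getConfiguration(hdfsContext, new Path(locationProvider.newDataLocation("data-file"))));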

@jackye1995 (Member Author)

@electrum thanks for the comment. I will also reply here just for visibility.

We discussed the use of table properties in Trino versus Iceberg last time; what I found is that they are not really the same concept after all. The fundamental difference is that:

On the Trino side, table properties are only used at table creation time, and it is up to the connector to decide what to do with each property. In Iceberg's case, we have:

  1. partitioning: used to supply partition transform information for Iceberg tables, stored as a partition spec in the Iceberg table metadata JSON file.
  2. location: used to specify the table location, stored as the location field of the table metadata.
  3. format: used to specify the file format, stored as write.format.default (Iceberg's TableProperties.DEFAULT_FILE_FORMAT) in the table metadata's properties map.

On the Iceberg side, table properties are stored directly as part of the table metadata JSON file, which is what happens with the format case in the Trino Iceberg connector. These properties determine read/write behavior at runtime.

With this understanding, we can now look at the location provider configuration, which is controlled by the following Iceberg table properties:

  1. write.object-storage.enabled: boolean to determine if object storage location provider should be used or not
  2. write.object-storage.path: root path for data files in object storage mode
  3. write.location-provider.impl: any custom location provider implementation, which Trino will refuse to load.

What this PR is trying to support is the case where:

  1. a user writes an Iceberg table with the above configurations set using some other systems like Spark/Hive
  2. then the user tries to use Trino to also perform additional write operations

In the case described above, Trino should respect those configurations and write to the configured location instead of the hard-coded table location. This has nothing to do with the user configuring any Trino table property.

I think what you are concerned about is the case where a user creates a table with object storage mode enabled or not based on a Trino table property, or on some other table feature such as the storage provider. I agree that behavior should be controlled. I am trying to keep this PR small; we can introduce a boolean table property like object_storage_mode later and map it to the Iceberg table properties behind the scenes, similar to the format case.

@findepi (Member) commented Aug 3, 2021

@jackye1995

  • can write.object-storage.path be used without specifying write.location-provider.impl?
  • can write.object-storage.path be used without specifying write.object-storage.enabled? what should be the behavior then?

perhaps we can narrow down the scope of the change to

  • respect write.object-storage.path (and write.object-storage.enabled, if they must be used in tandem) -- instead of writing to the table's directory (determined from location), write to the location pointed to by write.object-storage.path
  • throw when attempting to write when write.location-provider.impl is set

Such changes shouldn't require significant code changes, should they?

Actually, isn't the following implementing the aforementioned behavior already?

return LocationProviders.locationsFor(metadata.location(), metadata.properties());

@jackye1995 (Member Author)

can write.object-storage.path be used without specifying write.location-provider.impl?
can write.object-storage.path be used without specifying write.object-storage.enabled? what should be the behavior then?

write.object-storage.path is only used by ObjectStorageLocationProvider, and only when write.object-storage.enabled is true. There is no other way to enable it, since the class is not public. Other location provider implementations might leverage the same config, but Trino will reject them through the write.location-provider.impl check.
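In other words, the selection behaves roughly as sketched below; this is a paraphrase of the behavior described above, not the actual Iceberg source:

import java.util.Map;

final class LocationProviderChoice
{
    private LocationProviderChoice() {}

    // Returns a description of which provider would be used for a given set of table properties.
    static String describeProviderChoice(String tableLocation, Map<String, String> properties)
    {
        if (properties.containsKey("write.location-provider.impl")) {
            // a custom implementation is configured; Trino rejects this case outright
            return "custom provider (rejected by Trino)";
        }
        if (Boolean.parseBoolean(properties.getOrDefault("write.object-storage.enabled", "false"))) {
            // object storage mode: write.object-storage.path is the root for data files
            return "object storage provider rooted at " + properties.get("write.object-storage.path");
        }
        // default mode: write.object-storage.path is ignored; data goes under the table location
        return "default provider rooted at " + tableLocation;
    }
}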

perhaps we can narrow down the scope of the change to ...

Yes, this PR is trying to achieve exactly the two goals you listed.

@jackye1995 (Member Author)

Actually, isn't the following implementing the aforementioned behavior already?

Yes, but to use it in the page sink provider we would need to construct a new HiveTableOperations, which is quite expensive. That's why IcebergUtil.getLocationProvider is called directly.

@@ -423,6 +424,33 @@ public void testTrinoShowingSparkCreatedTables()
onTrino().executeQuery("DROP TABLE " + trinoTableName(trinoTable));
}

@Test(groups = {ICEBERG, PROFILE_SPECIFIC_TESTS})
public void testPrestoWritingDataWithObjectStorageLocationProvider()
Member

Presto -> Trino

Member

We should run this test for all supported formats, since the page sink can have format-dependent behavior (currently it does not). Requires #8751

String trinoTableName = trinoTableName(baseTableName);
String dataPath = "hdfs://hadoop-master:9000/user/hive/warehouse/test_object_storage_location_provider/obj-data";

String sparkTableDefinition = "CREATE TABLE %s (_string STRING, _bigint BIGINT) USING ICEBERG OPTIONS (" +
Member

is using OPTIONS different from using TBLPROPERTIES?

Member Author

It's the same. I can change it to TBLPROPERTIES for consistency.

String sparkTableDefinition = "CREATE TABLE %s (_string STRING, _bigint BIGINT) USING ICEBERG OPTIONS (" +
"'write.object-storage.enabled'=true," +
"'write.object-storage.path'='" + dataPath + "')";
onSpark().executeQuery(format(sparkTableDefinition, sparkTableName));
Member

Inline the sparkTableDefinition variable; it's redundant.

(I know it's a copy of existing test code; I am doing this cleanup for existing tests in 566d41a.)

assertThat(prestoSelect).containsOnly(result);

QueryResult filesTableResult = onTrino().executeQuery(format("SELECT file_path FROM %s", trinoTableName("\"" + baseTableName + "$files\"")));
filesTableResult.rows().forEach(row -> assertTrue(((String) row.get(0)).contains(dataPath)));
Member

In case of failure, this assertion will produce a useless error message.

Use Assertions.assertThat.

@jackye1995 (Member Author)

@findepi thanks for the comments.

We should run this test for all supported formats

It looks like we just need to rebase, depending on which PR gets in first?

Apart from that, hopefully I have addressed all your comments except for the table deletion one.

we should have two delete modes

This is actually an interesting topic that I would like to also get fixed, because the current behavior is broken even for Spark Iceberg tables that are not using the object storage location provider.

In Iceberg there is another config, write.folder-storage.path, that allows writing to a different location with the default location provider. Currently Trino just calls Hive to purge data, which does not work with this config.

The correct behavior is to call Iceberg's CatalogUtil.dropTableData, which recursively walks all manifests and data files in the table and deletes them. I think we should have a PR specifically for that fix; it does not need a Spark-Trino compatibility test, since we can test it directly with a normal Trino Iceberg unit test.
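For reference, a minimal sketch of that fix, assuming the table's TableOperations are available (this is not part of this PR):

import org.apache.iceberg.CatalogUtil;
import org.apache.iceberg.TableOperations;

public final class PurgeSketch
{
    private PurgeSketch() {}

    public static void purgeTableData(TableOperations operations)
    {
        // Recursively deletes the metadata files, manifest lists, manifests and data files
        // referenced by the table, regardless of which LocationProvider produced their paths.
        CatalogUtil.dropTableData(operations.io(), operations.current());
    }
}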

@@ -700,6 +705,12 @@ public ColumnHandle getDeleteRowIdColumnHandle(ConnectorSession session, Connect
public void dropTable(ConnectorSession session, ConnectorTableHandle tableHandle)
{
IcebergTableHandle handle = (IcebergTableHandle) tableHandle;
org.apache.iceberg.Table table = getIcebergTable(session, handle.getSchemaTableName());
Member Author

@findepi I think this is the best we can do for now for all of the Iceberg path overrides. We know they come from external systems like Spark, so we refuse to drop the table rather than leave files behind after a DROP TABLE operation. I can work on a proper solution after this PR is merged, when we start enabling the creation of Trino Iceberg tables with those overrides.

@losipiuk (Member) left a comment

LGTM % minor comments.
