Closed
2 changes: 1 addition & 1 deletion .github/trigger_files/IO_Iceberg_Integration_Tests.json
@@ -1,4 +1,4 @@
{
"comment": "Modify this file in a trivial way to cause this test suite to run.",
"modification": 1
"modification": 2
}
@@ -1,4 +1,4 @@
{
"comment": "Modify this file in a trivial way to cause this test suite to run.",
"modification": 1
"modification": 2
}
10 changes: 10 additions & 0 deletions sdks/java/io/expansion-service/build.gradle
@@ -57,6 +57,16 @@ configurations.runtimeClasspath {

// Pin nimbus-jose-jwt to 9.37.4 to fix CVE-2025-53864 (transitive via hadoop-auth)
resolutionStrategy.force 'com.nimbusds:nimbus-jose-jwt:9.37.4'

// [iceberg]
// bigdataoss:gcs-connector and parquet:parquet-hadoop have conflicts with global hadoop-common:3.4.2
// upgrading gcs-connector to 4.0.0 would be fine, because it uses hadoop-common 3.4.2
// but parquet-hadoop is still at 3.3.0
// so for now we need to pin hadoop to 3.3.6 until parquet-hadoop releases a version that uses hadoop 3.4.2+
resolutionStrategy.force 'org.apache.hadoop:hadoop-common:3.3.6'
resolutionStrategy.force 'org.apache.hadoop:hadoop-client:3.3.6'
resolutionStrategy.force 'org.apache.hadoop:hadoop-hdfs:3.3.6'
resolutionStrategy.force 'org.apache.hadoop:hadoop-hdfs-client:3.3.6'
Comment on lines +61 to +69
Contributor

medium

This block of version pinning and the accompanying comment are duplicated in sdks/java/io/iceberg/build.gradle. To improve maintainability and ensure that this temporary override is managed in one place, consider defining the Hadoop version as a variable in a central location (e.g., BeamModulePlugin or a project-level property) and referencing it in both files.
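
One way the suggestion above could look, as a minimal sketch rather than the project's actual mechanism. The property name `icebergHadoopPin` and the helper list `pinnedHadoopModules` are hypothetical and do not exist in Beam; the version and module coordinates are the ones pinned in this PR.

```groovy
// Sketch only: `icebergHadoopPin` is a hypothetical property name, not an existing Beam setting.
// It could be declared once, for example in gradle.properties as: icebergHadoopPin=3.3.6
def hadoopPin = project.findProperty('icebergHadoopPin') ?: '3.3.6'

// The four Hadoop modules pinned in this PR (hypothetical helper list).
def pinnedHadoopModules = ['hadoop-common', 'hadoop-client', 'hadoop-hdfs', 'hadoop-hdfs-client']

configurations.runtimeClasspath {
  pinnedHadoopModules.each { module ->
    // Forces one module per iteration, replacing the four explicit force lines.
    resolutionStrategy.force "org.apache.hadoop:${module}:${hadoopPin}"
  }
}
```

With something like this in both build files, lifting the pin once parquet-hadoop supports Hadoop 3.4.2+ would be a one-line change to the shared property. The iceberg module applies its pin via `configurations.all`, so the same loop would go inside that block there.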

}

shadowJar {
11 changes: 10 additions & 1 deletion sdks/java/io/iceberg/build.gradle
@@ -59,7 +59,7 @@ dependencies {
implementation "org.apache.iceberg:iceberg-parquet:$iceberg_version"
implementation "org.apache.iceberg:iceberg-orc:$iceberg_version"
implementation "org.apache.iceberg:iceberg-data:$iceberg_version"
implementation library.java.hadoop_common
implementation "org.apache.hadoop:hadoop-common:3.3.6"
Contributor

medium

The dependency hadoop-common is hardcoded to version 3.3.6 here, while hadoop_client on line 74 continues to use the library.java catalog. Since resolutionStrategy.force is applied later in this file (lines 125-128) to ensure version 3.3.6 is used across all configurations, it is more consistent and maintainable to keep using the library catalog here.

    implementation library.java.hadoop_common

// TODO(https://github.com/apache/beam/issues/21156): Determine how to build without this dependency
provided "org.immutables:value:2.8.8"
permitUnusedDeclared "org.immutables:value:2.8.8"
@@ -70,6 +70,7 @@ dependencies {
runtimeOnly "org.apache.iceberg:iceberg-azure:$iceberg_version"
runtimeOnly "org.apache.iceberg:iceberg-azure-bundle:$iceberg_version"
runtimeOnly library.java.bigdataoss_gcs_connector
runtimeOnly library.java.bigdataoss_util_hadoop
runtimeOnly library.java.hadoop_client

testImplementation project(":sdks:java:managed")
@@ -117,6 +118,14 @@ dependencies {
configurations.all {
// iceberg-core needs avro:1.12.0
resolutionStrategy.force 'org.apache.avro:avro:1.12.0'
// bigdataoss:gcs-connector and parquet:parquet-hadoop have conflicts with global hadoop-common:3.4.2
// upgrading gcs-connector to 4.0.0 would be fine, because it uses hadoop-common 3.4.2
// but parquet-hadoop is still at 3.3.0
// so for now we need to pin hadoop to 3.3.6 until parquet-hadoop releases a version that uses hadoop 3.4.2+
resolutionStrategy.force 'org.apache.hadoop:hadoop-common:3.3.6'
resolutionStrategy.force 'org.apache.hadoop:hadoop-client:3.3.6'
resolutionStrategy.force 'org.apache.hadoop:hadoop-hdfs:3.3.6'
resolutionStrategy.force 'org.apache.hadoop:hadoop-hdfs-client:3.3.6'
}

hadoopVersions.each {kv ->
@@ -19,7 +19,7 @@

import static org.apache.beam.vendor.guava.v32_1_2_jre.com.google.common.base.Preconditions.checkArgument;
import static org.apache.beam.vendor.guava.v32_1_2_jre.com.google.common.base.Preconditions.checkNotNull;
import static org.apache.hadoop.util.Sets.newHashSet;
import static org.apache.beam.vendor.guava.v32_1_2_jre.com.google.common.collect.Sets.newHashSet;

import com.google.auto.value.AutoValue;
import java.io.Serializable;