GCP: Add bundle jar for GCP-related dependencies #8231
gcp-bundle/NOTICE
Outdated
--------------------------------------------------------------------------------

This binary artifact includes Project Nessie with the following in its NOTICE
@bryanck I think we need to revisit the LICENSE and NOTICE files. If this is just a GCP bundle, it shouldn't include all of the Iceberg and related dependencies. Just the GCP notice.
Yes, thanks. I updated these.
LGTM, just left some minor comments on the license file
gcp-bundle/LICENSE
Outdated
Group: com.google.code.gson Name: gson Version: 2.10.1
Project URL: https://github.com/google/gson/gson
License: "Apache-2.0";link="https://www.apache.org/licenses/LICENSE-2.0.txt" (Not packaged)
should we remove this line maybe?
gcp-bundle/LICENSE
Outdated
--------------------------------------------------------------------------------

Group: org.checkerframework Name: checker-qual Version: 3.33.0
License: MIT (Not packaged)
can probably be removed?
Thanks, I removed these lines.
This PR packages the Iceberg GCP library in a similar way to the Iceberg AWS library. The `iceberg-gcp` project is packaged with the engine runtimes (Spark, Flink, Hive) without the GCP dependencies, similar to how `iceberg-aws` is included without the AWS dependencies. This has a negligible impact on the engine runtime size.

In the `iceberg-gcp` project, the GCP dependencies are changed to compile-only. This mirrors how the AWS dependencies are declared in `iceberg-aws`. While this could impact those using `iceberg-gcp` today, the thought is that there may not be many heavy users of it, as there was a critical bug in the GCP reader that was only recently fixed and is not in a release yet (#8071). Current users would need to add the GCP dependencies to their build.

Finally, a new `iceberg-gcp-bundle` project is added that packages the necessary GCP libraries and shades any packages that might conflict with engine libraries. This allows users to simply include two dependencies when using Iceberg on GCP, e.g. for Spark, `iceberg-spark-runtime` and `iceberg-gcp-bundle`. (AWS users can similarly specify the AWS bundle, but Google doesn't provide something similar.)
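
For illustration, the two-dependency setup described above might look roughly like this in a downstream Gradle build (Kotlin DSL). This is only a sketch: the `org.apache.iceberg` group ID, the Spark/Scala suffix on the runtime artifact, and the `icebergVersion` placeholder are assumptions, not something this PR specifies.

```kotlin
// build.gradle.kts of a hypothetical downstream Spark project (sketch only)

// Placeholder, not a real version: substitute a release that contains this PR.
val icebergVersion = "<iceberg-version>"

dependencies {
    // Engine runtime; it already includes iceberg-gcp, without the GCP client libraries.
    // The "-3.4_2.12" suffix is an assumed Spark/Scala combination.
    implementation("org.apache.iceberg:iceberg-spark-runtime-3.4_2.12:$icebergVersion")

    // New bundle from this PR: the GCP libraries, shaded to avoid clashes with engine libraries.
    implementation("org.apache.iceberg:iceberg-gcp-bundle:$icebergVersion")
}
```

Users who depend on `iceberg-gcp` directly and skip the bundle would instead have to declare the GCP client libraries themselves, since those become compile-only with this change.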