Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Azure: Add FileIO that supports ADLSv2 storage #8303
Azure: Add FileIO that supports ADLSv2 storage #8303
Changes from 33 commits
6d965d2
8c59928
e76a1a5
2576250
e2fe211
42debe3
f5a67c8
6b53fbb
571b562
bc36c1d
f7b2221
f86811b
dd066b9
c8a4b5a
93a00c3
9c5c73e
852bbf0
0d9b7f0
6764f70
902a39b
3de84de
c72f3ac
339f066
29d5700
60ad696
e048e2f
4da61a3
ed7c4d4
7450fca
217a5b6
3d4cedc
3efb647
f19d5af
de97b78
9f02e72
0ea6e05
e758e08
File filter
Filter by extension
Conversations
Jump to
There are no files selected for viewing
Large diffs are not rendered by default.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For other integrations, we don't bundle the dependencies and only ship the Iceberg side. That keeps our bundle small and doesn't force any particular version on downstream consumers. It also avoids needing to do a lot of license and notice documentation work. Is that possible here? Is there a dependency bundle that we can use at runtime?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The
azure-bundle
project build was set up for bundling all of the necessary Azure dependencies in one shadow jar as an (optional) convenience to users, similar to theaws-bundle
andgcp-bundle
projects, which include only the necessary runtime libraries at the same version used for the Iceberg build and shades conflicting libraries. A user can opt to include their own Azure dependencies if desired and not use this at all. For example, all you would need to run with Spark is the Spark runtime and Azure/AWS/GCP bundle. Neither Microsoft nor Google provide such a bundle. Amazon has one for AWS but it is very large, which causes issues with some systems.This is separate from the
azure
project build which declares the Azure dependencies ascompileOnly
so they are not included with any runtime.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, I see. I didn't know about the other bundle projects. Looks like the LICENSE file is updated for those, but not the NOTICE. Did you check whether each bundled project has a NOTICE that we need to include?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did not, I'll do that now for all three.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added this. I'll open a separate PR for the AWS and GCP bundles.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The PR for the AWS and GCP bundles is here: #8323
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks! It's amazing that we can automate this now. It was such a giant pain to do this in the past!