Change the default CompressionCodec.Factory to leverage compression support transparently #43469
Comments
ccciudatu
added a commit
to hstack/arrow
that referenced
this issue
Jul 29, 2024
…ssion support transparently (apache#43469)
danepitkin
pushed a commit
that referenced
this issue
Jul 30, 2024
…age compression support transparently (#43471)

### Rationale for this change

Add compression support to Flight RPC and others by just including the `arrow-compression` jar in the module path (or classpath).

### What changes are included in this PR?

Change the default compression factory to the new `CompressionCodec.Factory.INSTANCE`, a ServiceLoader-backed singleton that delegates to the best suited available implementation in the module/class path for each codec type.

### Are these changes tested?

yes

### Are there any user-facing changes?

No.

* GitHub Issue: #43469

Authored-by: Costi Ciudatu <ccciudatu@gmail.com>
Signed-off-by: Dane Pitkin <dpitkin@apache.org>

Issue resolved by pull request 43471

@danepitkin this also applies to #41457
Describe the enhancement requested
Application code is currently required to choose upfront between handling compressed vs. uncompressed data by specifying one of the two (mutually exclusive) `CompressionCodec.Factory` implementations: `NoCompressionCodec.Factory` and `CommonsCompressionFactory`.

While this is totally acceptable (or even required) for the write path (e.g. `ArrowWriter`), it makes it really tedious to support compression on the read path, as it's not reasonable to choose between handling uncompressed-data-only and compressed-data-only when writing (e.g.) a client app for Arrow Flight.

As already reported in #41457, the Java FlightClient currently fails with the following error when trying to decode a compressed stream:
The `FlightStream` class does not explicitly pass a compression codec factory when creating a `VectorLoader`, which then uses the default `NoCompressionCodec.Factory`. Changing the default to `CommonsCompressionFactory` is not an option because:

- `CommonsCompressionFactory` does not support uncompressed data
- `arrow-compression` is not a dependency for `arrow-vector`

Instead of challenging these two design decisions, the proposed solution (upcoming PR) is to make the default `CompressionCodec.Factory` use a `ServiceLoader` to gather all the available implementations and combine them to support as many `CodecType`s as possible, falling back to `NoCompressionCodec.Factory.INSTANCE` (i.e. the same default as today).

The `arrow-compression` module would then act as a service provider, so that whenever it's present in the module- (or class-) path, it will transparently fill in the gaps of the default factory.

As a side note, this is in fact the literal meaning of the above error message ("Please add arrow-compression module to use CommonsCompressionFactory"), so we can assume this might have been the original intention.
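
To make the proposal more concrete, here is a minimal, self-contained sketch of the idea. It does not use the Arrow classes: `CodecType`, `Codec`, `CodecFactory` (including its `supports` method), `ZstdOnlyFactory` and `combine` are hypothetical stand-ins invented for illustration, and the "first provider wins" rule is an assumption, not necessarily what the actual PR does. The only real API used is `java.util.ServiceLoader`.

```java
import java.util.EnumMap;
import java.util.List;
import java.util.Map;
import java.util.ServiceLoader;

final class DefaultCodecFactorySketch {
  /** Hypothetical stand-in for Arrow's codec type enum. */
  public enum CodecType { NO_COMPRESSION, LZ4_FRAME, ZSTD }

  /** Hypothetical stand-in for a codec; only reports its type. */
  public interface Codec { CodecType type(); }

  /** Hypothetical stand-in for CompressionCodec.Factory, with a capability query. */
  public interface CodecFactory {
    boolean supports(CodecType type);
    Codec createCodec(CodecType type);
  }

  /** Plays the role of the no-compression factory: the default of last resort. */
  static final CodecFactory FALLBACK = new CodecFactory() {
    @Override public boolean supports(CodecType type) {
      return type == CodecType.NO_COMPRESSION;
    }
    @Override public Codec createCodec(CodecType type) {
      if (!supports(type)) {
        throw new IllegalArgumentException("No codec provider found for " + type);
      }
      return () -> type;
    }
  };

  /** Example provider, as a compression module might register one (hypothetical). */
  public static final class ZstdOnlyFactory implements CodecFactory {
    @Override public boolean supports(CodecType type) {
      return type == CodecType.ZSTD;
    }
    @Override public Codec createCodec(CodecType type) {
      if (!supports(type)) {
        throw new IllegalArgumentException("Unsupported codec type: " + type);
      }
      return () -> type;
    }
  }

  /** Merges providers into one factory; the first provider to claim a type wins. */
  static CodecFactory combine(Iterable<CodecFactory> providers) {
    Map<CodecType, CodecFactory> byType = new EnumMap<>(CodecType.class);
    for (CodecFactory provider : providers) {
      for (CodecType type : CodecType.values()) {
        if (provider.supports(type)) {
          byType.putIfAbsent(type, provider);
        }
      }
    }
    return new CodecFactory() {
      @Override public boolean supports(CodecType type) {
        return byType.containsKey(type) || FALLBACK.supports(type);
      }
      @Override public Codec createCodec(CodecType type) {
        // Types nobody claimed are delegated to the fallback.
        return byType.getOrDefault(type, FALLBACK).createCodec(type);
      }
    };
  }

  /** Default instance: whatever providers ServiceLoader finds on the module/class path. */
  static final CodecFactory INSTANCE = combine(ServiceLoader.load(CodecFactory.class));

  public static void main(String[] args) {
    // No provider is registered in this single-file sketch, so INSTANCE
    // behaves like the fallback alone.
    System.out.println("default supports ZSTD: " + INSTANCE.supports(CodecType.ZSTD));

    // Simulate a compression module being present by passing a provider explicitly.
    List<CodecFactory> providers = List.of(new ZstdOnlyFactory());
    CodecFactory combined = combine(providers);
    System.out.println("combined supports ZSTD: " + combined.supports(CodecType.ZSTD));
    System.out.println("combined codec: " + combined.createCodec(CodecType.NO_COMPRESSION).type());
  }
}
```

In a real deployment the provider would be discovered rather than passed in: a jar on the classpath would list its factory class in a `META-INF/services/` file named after the service interface, or a named module would declare it with `provides ... with ...` in `module-info.java`.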
Component(s)
FlightRPC, Java