New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: Implement dedup and filter for vectors #245
Conversation
59901e1
to
8502409
Compare
1f01dd8
to
92703af
Compare
35d07f2
to
ab3fc5d
Compare
Codecov Report
@@ Coverage Diff @@
## develop #245 +/- ##
===========================================
+ Coverage 83.33% 83.87% +0.53%
===========================================
Files 295 297 +2
Lines 25338 25737 +399
===========================================
+ Hits 21116 21587 +471
+ Misses 4222 4150 -72
Flags with carried forward coverage won't be shown. Click here to find out more.
Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here. |
535effa
to
f6622e1
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Use field val instead of tuple for variant ListValueRef::Ref to keep consistence with ListValueRef::Indexed
Also implements ScalarVectorBuilder for ListVectorBuilder, Scalar for ListValue and ScalarRef for ListValueRef
Move compute operations to VectorOp trait and acts as an super trait of Vector. So we could later put dedup/filter methods to VectorOp trait, avoid to define too many methods in Vector trait.
Move Scalar and ScalarRef trait bounds to PrimitiveElement, so for each native type which implements PrimitiveElement, its PrimitiveVector always implements ScalarVector, so we could use it as ScalarVector without adding additional trait bounds
Remove compute mod and move dedup logic to operations::dedup
Also fix NullVector::dedup and ConstantVector::dedup
Also fix TimestampVector eq not implemented.
9864253
to
43dc7c7
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
* feat: Dedup vector * refactor: Re-export Date/DateTime/Timestamp * refactor: Named field for ListValueRef::Ref Use field val instead of tuple for variant ListValueRef::Ref to keep consistence with ListValueRef::Indexed * feat: Implement ScalarVector for ListVector Also implements ScalarVectorBuilder for ListVectorBuilder, Scalar for ListValue and ScalarRef for ListValueRef * test: Add tests for ScalarVector implementation of ListVector * feat: Implement dedup using match_scalar_vector * refactor: Move dedup func to individual mod * chore: Update ListValueRef comments * refactor: Move replicate to VectorOp Move compute operations to VectorOp trait and acts as an super trait of Vector. So we could later put dedup/filter methods to VectorOp trait, avoid to define too many methods in Vector trait. * refactor: Move scalar bounds to PrimitiveElement Move Scalar and ScalarRef trait bounds to PrimitiveElement, so for each native type which implements PrimitiveElement, its PrimitiveVector always implements ScalarVector, so we could use it as ScalarVector without adding additional trait bounds * refactor: Move dedup to VectorOp Remove compute mod and move dedup logic to operations::dedup * feat: Implement VectorOp::filter * test: Move replicate test of primitive to replicate.rs * test: Add more replicate tests * test: Add tests for dedup and filter Also fix NullVector::dedup and ConstantVector::dedup * style: fix clippy * chore: Remove unused scalar.rs * test: Add more tests for VectorOp and fix failed tests Also fix TimestampVector eq not implemented. * chore: Address CR comments * chore: mention vector should be sorted in comment * refactor: slice the vector directly in replicate_primitive_with_type
Changes
This PR adds some necessary compute operations for vectors in order to implement the
DedupReader
.VectorOp
super trait to provide computation support forVector
dedup
andfilter
replicate
method toVectorOp
ScalarVector
forListVector
, so all exceptNullVector
could share the samededup
implementation based on theScalarVector
trait.ListVector
now implementsScalarVector
, it could usereplicate_scalar()
to implementreplicate()
ListValueRef::Ref
be a struct variant instead of tuple variant, for consistence betweenRef
andIndexed
variantPrimitiveElement
trait and move theScalar
andScalarRef
trait bound to thePrimitiveElement
trait, soPrimitiveVector<T: PrimitiveElement>
implies theScalarVector
trait bound