Improve performance of Row API by storing TypePtr instead of String #204
pub fn make_row(fields: Vec<(TypePtr, Field)>) -> Row { Row { fields } }

impl PartialEq for Row {
    fn eq(&self, other: &Row) -> bool {
Equality is only used in tests; normally a user would not run it in production code.
@@ -706,6 +724,20 @@ mod tests {
}};
}

// NOTE:
I think we should refactor our tests; there is a lot of code duplication in them.
Fails to run
Yes, there are some issues in the rustup toolchain right now. See the Reddit thread. Let me see if we can pin a nightly from a particular date to fix this issue.
Pull Request Test Coverage Report for Build 706
💛 - Coveralls
Sorry @sadikovi I forgot about this PR. Since
Yes, no problem. I think it would be good to mark this repo as moved or something similar.
Yes. Will update the README for the merge.
This PR updates the `Row` struct to store `Vec<(TypePtr, Field)>` instead of `Vec<(String, Field)>`. This reduces the time spent reading Parquet files.

For this I had to change the record reader to pass a type pointer instead of a field name. This unfortunately makes it difficult to test, so I added a macro to generate a dummy type with a field name.
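The idea can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: `Type`, `Field`, `dummy_type!`, `field_name` and `build_rows` are hypothetical stand-ins, and `TypePtr` is assumed to be a reference-counted pointer (`Rc<Type>`) to the schema type. Under that assumption, each row only bumps a reference count per column instead of allocating a fresh `String` for the field name.

```rust
use std::rc::Rc;

// Hypothetical stand-in for the schema type; only the name matters here.
#[derive(Debug)]
pub struct Type {
    name: String,
}

// Assumption: the type pointer is a reference-counted handle to the schema type.
pub type TypePtr = Rc<Type>;

// Hypothetical, heavily reduced value enum.
#[derive(Debug, Clone, PartialEq)]
pub enum Field {
    Int(i32),
    Str(String),
}

#[derive(Debug)]
pub struct Row {
    fields: Vec<(TypePtr, Field)>,
}

pub fn make_row(fields: Vec<(TypePtr, Field)>) -> Row {
    Row { fields }
}

impl Row {
    // The field name is read through the shared type pointer, so no
    // per-row copy of the name is stored.
    pub fn field_name(&self, i: usize) -> &str {
        &self.fields[i].0.name
    }
}

// Hypothetical test helper in the spirit of the macro mentioned above:
// builds a dummy type that carries only a field name.
macro_rules! dummy_type {
    ($name:expr) => {
        Rc::new(Type { name: $name.to_string() })
    };
}

// Builds `n` rows for a two-column schema. Cloning an `Rc` increments a
// counter; it does not allocate or copy the name.
pub fn build_rows(schema: &[TypePtr], n: i32) -> Vec<Row> {
    (0..n)
        .map(|i| {
            make_row(vec![
                (Rc::clone(&schema[0]), Field::Int(i)),
                (Rc::clone(&schema[1]), Field::Str(format!("row{}", i))),
            ])
        })
        .collect()
}
```

In a test, a schema could be built with `vec![dummy_type!("id"), dummy_type!("name")]` and passed to `build_rows`; every row then points at the same two `Type` values.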
Benchmark results:
Before
After