IntSeries / IntMutableList for joins and filters #26
Note that an implementation for joins is fairly straightforward. However, implementing the "sort" operation is trickier, as the JDK libraries do not support sorting an int[] with a custom Comparator. We will need to write our own sort algorithm.
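To illustrate the kind of sorter this would take, here is a minimal sketch of a stable merge sort over int[] driven by a primitive comparator. The names IntComparator and IntSort are hypothetical and exist only for this example; they are not taken from the JDK or from this project, and the actual implementation may use a different algorithm.

```java
// Hypothetical primitive counterpart of java.util.Comparator (assumption, not a JDK type).
@FunctionalInterface
interface IntComparator {
    int compare(int a, int b);
}

// Hypothetical helper: stable top-down merge sort of an int[] using an IntComparator.
final class IntSort {

    private IntSort() {
    }

    static void sort(int[] values, IntComparator comparator) {
        if (values.length < 2) {
            return;
        }
        // A single scratch buffer is reused by every merge step.
        int[] buffer = new int[values.length];
        sort(values, buffer, 0, values.length, comparator);
    }

    // Sorts the half-open range [from, to).
    private static void sort(int[] values, int[] buffer, int from, int to, IntComparator c) {
        if (to - from < 2) {
            return;
        }
        int mid = (from + to) >>> 1;
        sort(values, buffer, from, mid, c);
        sort(values, buffer, mid, to, c);

        int left = from;
        int right = mid;
        int out = from;
        while (left < mid && right < to) {
            // "<= 0" takes from the left half on ties, which keeps the sort stable.
            if (c.compare(values[left], values[right]) <= 0) {
                buffer[out++] = values[left++];
            } else {
                buffer[out++] = values[right++];
            }
        }
        while (left < mid) {
            buffer[out++] = values[left++];
        }
        while (right < to) {
            buffer[out++] = values[right++];
        }
        System.arraycopy(buffer, from, values, from, to - from);
    }
}
```

A typical use would be sorting an array of row positions by the values those positions point to, e.g. IntSort.sort(index, (i, j) -> names[i].compareTo(names[j])), so the data itself is never boxed or moved.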
andrus added a commit that referenced this issue on Apr 10, 2019
andrus changed the title from "IntSeries / IntMutableList - let's try using primitives" to "IntSeries / IntMutableList for joins and filters" on Apr 10, 2019
Latest performance measurements:
andrus added a commit that referenced this issue on Apr 10, 2019
After the related #27 was implemented, the numbers improved. Latest performance measurements:
This was referenced Apr 10, 2019
Let's create IntMutableList (an appendable collection of primitive "int" values) that can be converted to IntSeries, which is immutable.
While working with collections of primitives in Java is painful, there can be real performance gains. My prototype of the data structures above speeds up joins by ~25-30% when used for indexing joined DataFrames.
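The shape of the two structures could look roughly like the sketch below. Only the names IntMutableList and IntSeries come from this issue; every method name, the growth policy, and the copy-on-conversion behavior are assumptions made for illustration, not the prototype's actual code.

```java
import java.util.Arrays;

// Sketch only: an immutable, fixed-size sequence of primitive ints.
final class IntSeries {

    private final int[] data;

    // Copies the first "length" elements so the series never shares a mutable array.
    IntSeries(int[] source, int length) {
        this.data = Arrays.copyOf(source, length);
    }

    int size() {
        return data.length;
    }

    int getInt(int index) {
        return data[index];
    }
}

// Sketch only: an appendable int collection backed by a growing array (no boxing).
final class IntMutableList {

    private int[] data;
    private int size;

    IntMutableList(int initialCapacity) {
        // Keep at least one slot so that doubling always grows the array.
        this.data = new int[Math.max(initialCapacity, 1)];
    }

    void add(int value) {
        if (size == data.length) {
            // Assumed growth policy: double the backing array when it is full.
            data = Arrays.copyOf(data, data.length * 2);
        }
        data[size++] = value;
    }

    int size() {
        return size;
    }

    // Snapshot of the current contents; later add() calls do not affect it.
    IntSeries toIntSeries() {
        return new IntSeries(data, size);
    }
}
```

In a join, the list would collect matching row positions from each side as they are found, and the resulting series would then serve as the index for building the joined DataFrame.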
This task will switch joins and filters to an int-based implementation. Sorters and groupers will be switched separately, as that requires our own custom sorter.