I found code in `sortAndSave`. Is this the only part related to the multi-core support mentioned in the README?
`mergeSortedFiles` seems to read one line at a time from the sorted files.
`sortInBatch` seems to sort one block at a time: `files.add(sortAndSave(tmplist, cmp, cs, tmpdirectory, distinct, usegzip, parallel));`
Can we add concurrency to `mergeSortedFiles` (for example, reading blocks concurrently) and to `sortInBatch` (for example, one thread merging the smaller files and another merging the larger ones, going from 20 temporary files to 10, then to 4, then to 1)?
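To make the tiered-merge idea above concrete, here is a rough sketch of what it could look like. It is not code from the library: `TieredMergeSketch`, `mergeTwo`, and `mergeAll` are hypothetical names, the files are assumed to be plain UTF-8 text with one record per line, and each round simply halves the number of files (20, 10, 5, 3, 2, 1) rather than following the exact 20, 10, 4, 1 schedule.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical illustration only; none of these names exist in the library.
class TieredMergeSketch {

    // Merge two sorted line-oriented files into a new temporary file.
    // The inputs are treated as temporary files and deleted afterwards.
    static Path mergeTwo(Path a, Path b, Comparator<String> cmp) throws IOException {
        Path out = Files.createTempFile("merge", ".txt");
        try (BufferedReader ra = Files.newBufferedReader(a, StandardCharsets.UTF_8);
             BufferedReader rb = Files.newBufferedReader(b, StandardCharsets.UTF_8);
             BufferedWriter w = Files.newBufferedWriter(out, StandardCharsets.UTF_8)) {
            String la = ra.readLine();
            String lb = rb.readLine();
            while (la != null || lb != null) {
                // Take from a when b is exhausted or a's line sorts first.
                boolean takeA = lb == null || (la != null && cmp.compare(la, lb) <= 0);
                w.write(takeA ? la : lb);
                w.newLine();
                if (takeA) la = ra.readLine(); else lb = rb.readLine();
            }
        }
        Files.delete(a);
        Files.delete(b);
        return out;
    }

    // Merge pairs of files in parallel, round after round, until one remains.
    static Path mergeAll(List<Path> files, Comparator<String> cmp, int threads)
            throws InterruptedException, ExecutionException {
        if (files.isEmpty()) throw new IllegalArgumentException("no files to merge");
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Path> current = new ArrayList<>(files);
            while (current.size() > 1) {
                List<Future<Path>> pending = new ArrayList<>();
                List<Path> next = new ArrayList<>();
                for (int i = 0; i + 1 < current.size(); i += 2) {
                    Path a = current.get(i);
                    Path b = current.get(i + 1);
                    Callable<Path> task = () -> mergeTwo(a, b, cmp);
                    pending.add(pool.submit(task));
                }
                // An odd file out is carried over to the next round unmerged.
                if (current.size() % 2 == 1) next.add(current.get(current.size() - 1));
                for (Future<Path> f : pending) next.add(f.get());
                current = next;
            }
            return current.get(0);
        } finally {
            pool.shutdown();
        }
    }
}
```

Each pairwise merge still reads one line at a time, so memory per task stays small, but every record is rewritten once per round. Whether the extra I/O pays off would have to be measured.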
jiamo changed the title from "Don't find function use multi core?" to "How to use multi core maximize?" on Jun 21, 2021
As you have yourself observed, there are no parameters.
You seem to believe that we can do much better. I am sure that is true, but consider that the library is meant to sort very large files using very little memory. So it is not a simple matter of throwing more cores and more memory at the problem. Please consider the following points:
- Memory usage should be kept constant. It is easy to improve performance by sorting multiple chunks in parallel, but that is not a fair comparison. The actual comparison is between sorting one chunk in memory, two half-chunks, or four quarter-chunks. That is, the more cores you use, the less memory you have on a per-core basis.
- Your pull request should include reasonable benchmarks so that we can measure the benefits as the number of cores grows.
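The memory point can be illustrated with a bit of arithmetic. The sketch below is only a simplified model, not library code: `ChunkBudgetSketch` and its methods are hypothetical, and it assumes the memory budget is split evenly between threads and ignores per-object overhead.

```java
// Hypothetical model of a fixed memory budget shared by several sorting threads.
class ChunkBudgetSketch {

    // Bytes each thread may hold in memory when the budget is split evenly.
    static long perThreadChunkBytes(long memoryBudgetBytes, int threads) {
        if (threads < 1) throw new IllegalArgumentException("threads must be >= 1");
        long chunk = memoryBudgetBytes / threads;
        if (chunk < 1) throw new IllegalArgumentException("budget too small for thread count");
        return chunk;
    }

    // Number of sorted temporary files produced for an input of the given size
    // (ceiling division of the input size by the per-thread chunk size).
    static long chunkCount(long fileBytes, long memoryBudgetBytes, int threads) {
        long chunk = perThreadChunkBytes(memoryBudgetBytes, threads);
        return (fileBytes + chunk - 1) / chunk;
    }
}
```

Under this model, a 10 GiB input with a 1 GiB budget yields 10 temporary files with one thread and 40 with four threads, so the gain from parallel sorting has to be weighed against a larger merge step.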