Response to Issue #15:
Changed the computation of statistics in the DescriptiveStatistics and Correlation classes to the one-pass algorithm described in J. Bennett et al., "Numerically Stable, Single-Pass, Parallel Statistics Algorithms," Proc. IEEE International Conference on Cluster Computing (2009). DescriptiveStatistics still needs two passes: one to compute the mean, variance, etc., and one to compute the median. The current version needs three: the first computes the mean, the second computes the other statistics except the median, and the third computes the median. The methods in the static Statistics class are already one-pass.
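For reviewers unfamiliar with the approach, here is a minimal sketch of a single-pass mean/variance accumulator with a pairwise merge step. It is written in Python for brevity and is not the code in this PR; the `RunningStats` class and its member names are hypothetical, and it covers only the second moment, whereas the cited paper also treats higher moments.

```python
import math
from dataclasses import dataclass


@dataclass
class RunningStats:
    """Hypothetical single-pass accumulator for count, mean and variance."""
    n: int = 0
    mean: float = 0.0
    m2: float = 0.0  # running sum of squared deviations from the mean

    def push(self, x: float) -> None:
        # Incremental update: no second pass over the data is needed,
        # and no large sums of squares are subtracted from each other.
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)

    def merge(self, other: "RunningStats") -> "RunningStats":
        # Combine two partial results, e.g. from two chunks of the data.
        if self.n == 0:
            return RunningStats(other.n, other.mean, other.m2)
        if other.n == 0:
            return RunningStats(self.n, self.mean, self.m2)
        n = self.n + other.n
        delta = other.mean - self.mean
        mean = self.mean + delta * other.n / n
        m2 = self.m2 + other.m2 + delta * delta * self.n * other.n / n
        return RunningStats(n, mean, m2)

    @property
    def variance(self) -> float:
        # Sample variance (n - 1 denominator); undefined for n < 2.
        return self.m2 / (self.n - 1) if self.n > 1 else math.nan


def accumulate(values) -> RunningStats:
    stats = RunningStats()
    for v in values:
        stats.push(v)
    return stats
```

The median cannot be folded into this kind of accumulator, which is consistent with DescriptiveStatistics still needing a separate pass for it.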
Changed the choice of pivot in the private OrderSelect method in the Statistics class. The current pivot choice performs badly on a partially sorted list, which is a problem for large datasets with an even number of entries: in that case the Median method calls OrderSelect twice, and the first call leaves the buffer partially sorted, which leads to poor performance in the second call.
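To illustrate the kind of problem described, here is a minimal Python sketch of an in-place selection routine whose pivot is less sensitive to partially sorted input. The median-of-three pivot used here is an assumption chosen for illustration; this description does not state which pivot rule the PR adopts, and the function names `order_select` and `median` are hypothetical stand-ins, not the library's API.

```python
def order_select(buf, k):
    """Return the k-th smallest element (0-based) of buf.

    Reorders buf in place, as a quickselect normally does.
    """
    lo, hi = 0, len(buf) - 1
    while True:
        if lo == hi:
            return buf[lo]
        mid = (lo + hi) // 2
        # Median-of-three: sort buf[lo], buf[mid], buf[hi] so that the
        # pivot is not an extreme value on (partially) sorted input.
        if buf[mid] < buf[lo]:
            buf[lo], buf[mid] = buf[mid], buf[lo]
        if buf[hi] < buf[lo]:
            buf[lo], buf[hi] = buf[hi], buf[lo]
        if buf[hi] < buf[mid]:
            buf[mid], buf[hi] = buf[hi], buf[mid]
        pivot = buf[mid]
        # Hoare-style partition around the pivot value.
        i, j = lo, hi
        while i <= j:
            while buf[i] < pivot:
                i += 1
            while buf[j] > pivot:
                j -= 1
            if i <= j:
                buf[i], buf[j] = buf[j], buf[i]
                i += 1
                j -= 1
        if k <= j:
            hi = j          # target lies in the left part
        elif k >= i:
            lo = i          # target lies in the right part
        else:
            return buf[k]   # target sits between the parts, equals pivot


def median(values):
    buf = list(values)      # work on a copy; the caller's data is untouched
    n = len(buf)
    if n % 2 == 1:
        return buf[n // 2] if n == 1 else order_select(buf, n // 2)
    # Even count: two selections on the same buffer. The second call
    # sees a buffer that the first call has already partially ordered.
    lower = order_select(buf, n // 2 - 1)
    upper = order_select(buf, n // 2)
    return (lower + upper) / 2.0
```

A pivot taken from a fixed end of the range tends toward quadratic behavior on nearly sorted data, which is the situation the second call in the even-count case runs into.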
All these changes have passed the existing unit tests.