@ChaiTRex ChaiTRex commented Sep 18, 2018

Summary with benchmark results

In benchmarks of merge sort on a vector of 250,000,000 elements, timing results are unaffected by these commits. Peak memory usage (maximum resident set size), however, drops. The benchmarks use:

about 9.32 GiB before these commits
$ time -v dist/build/simple-bench/simple-bench -A MergeSort -n 250000000
Testing: merge sort
Random             : 105.641547237 seconds
Sorted             : 43.755033358 seconds
Reverse-sorted     : 47.927451066 seconds
Random duplicates  : 94.516762071 seconds
Median killer      : 55.749034796 seconds
	Command being timed: "dist/build/simple-bench/simple-bench -A MergeSort -n 250000000"
	User time (seconds): 355.31
	System time (seconds): 4.27
	Percent of CPU this job got: 99%
	Elapsed (wall clock) time (h:mm:ss or m:ss): 5:59.68
	Average shared text size (kbytes): 0
	Average unshared data size (kbytes): 0
	Average stack size (kbytes): 0
	Average total size (kbytes): 0
	Maximum resident set size (kbytes): 9772512
	Average resident set size (kbytes): 0
	Major (requiring I/O) page faults: 0
	Minor (reclaiming a frame) page faults: 2442312
	Voluntary context switches: 1
	Involuntary context switches: 5833
	Swaps: 0
	File system inputs: 0
	File system outputs: 0
	Socket messages sent: 0
	Socket messages received: 0
	Signals delivered: 0
	Page size (bytes): 4096
	Exit status: 0

$ time -v dist/build/simple-bench/simple-bench -A MergeSort -n 250000000
Testing: merge sort
Random             : 102.0173054 seconds
Sorted             : 43.215281981 seconds
Reverse-sorted     : 46.977218873 seconds
Random duplicates  : 90.620823144 seconds
Median killer      : 54.175290449 seconds
	Command being timed: "dist/build/simple-bench/simple-bench -A MergeSort -n 250000000"
	User time (seconds): 344.88
	System time (seconds): 3.93
	Percent of CPU this job got: 99%
	Elapsed (wall clock) time (h:mm:ss or m:ss): 5:48.85
	Average shared text size (kbytes): 0
	Average unshared data size (kbytes): 0
	Average stack size (kbytes): 0
	Average total size (kbytes): 0
	Maximum resident set size (kbytes): 9772684
	Average resident set size (kbytes): 0
	Major (requiring I/O) page faults: 0
	Minor (reclaiming a frame) page faults: 2442312
	Voluntary context switches: 1
	Involuntary context switches: 1626
	Swaps: 0
	File system inputs: 0
	File system outputs: 0
	Socket messages sent: 0
	Socket messages received: 0
	Signals delivered: 0
	Page size (bytes): 4096
	Exit status: 0
about 7.45 GiB after the first commit
$ time -v dist/build/simple-bench/simple-bench -A MergeSort -n 250000000
Testing: merge sort
Random             : 102.570755197 seconds
Sorted             : 43.153803419 seconds
Reverse-sorted     : 48.570614044 seconds
Random duplicates  : 93.463369144 seconds
Median killer      : 54.63930903 seconds
	Command being timed: "dist/build/simple-bench/simple-bench -A MergeSort -n 250000000"
	User time (seconds): 349.71
	System time (seconds): 3.49
	Percent of CPU this job got: 99%
	Elapsed (wall clock) time (h:mm:ss or m:ss): 5:53.24
	Average shared text size (kbytes): 0
	Average unshared data size (kbytes): 0
	Average stack size (kbytes): 0
	Average total size (kbytes): 0
	Maximum resident set size (kbytes): 7819640
	Average resident set size (kbytes): 0
	Major (requiring I/O) page faults: 0
	Minor (reclaiming a frame) page faults: 1954059
	Voluntary context switches: 1
	Involuntary context switches: 2245
	Swaps: 0
	File system inputs: 0
	File system outputs: 0
	Socket messages sent: 0
	Socket messages received: 0
	Signals delivered: 0
	Page size (bytes): 4096
	Exit status: 0

$ time -v dist/build/simple-bench/simple-bench -A MergeSort -n 250000000
Testing: merge sort
Random             : 104.736320866 seconds
Sorted             : 44.565925792 seconds
Reverse-sorted     : 49.598375232 seconds
Random duplicates  : 92.925600618 seconds
Median killer      : 56.065786839 seconds
	Command being timed: "dist/build/simple-bench/simple-bench -A MergeSort -n 250000000"
	User time (seconds): 355.34
	System time (seconds): 3.49
	Percent of CPU this job got: 99%
	Elapsed (wall clock) time (h:mm:ss or m:ss): 5:58.87
	Average shared text size (kbytes): 0
	Average unshared data size (kbytes): 0
	Average stack size (kbytes): 0
	Average total size (kbytes): 0
	Maximum resident set size (kbytes): 7819500
	Average resident set size (kbytes): 0
	Major (requiring I/O) page faults: 0
	Minor (reclaiming a frame) page faults: 1954058
	Voluntary context switches: 1
	Involuntary context switches: 2174
	Swaps: 0
	File system inputs: 0
	File system outputs: 0
	Socket messages sent: 0
	Socket messages received: 0
	Signals delivered: 0
	Page size (bytes): 4096
	Exit status: 0
about 5.59 GiB after both commits
$ time -v dist/build/simple-bench/simple-bench -A MergeSort -n 250000000
Testing: merge sort
Random             : 100.119498798 seconds
Sorted             : 44.544652572 seconds
Reverse-sorted     : 47.439132265 seconds
Random duplicates  : 89.077430416 seconds
Median killer      : 52.108055876 seconds
	Command being timed: "dist/build/simple-bench/simple-bench -A MergeSort -n 250000000"
	User time (seconds): 341.05
	System time (seconds): 2.96
	Percent of CPU this job got: 99%
	Elapsed (wall clock) time (h:mm:ss or m:ss): 5:44.05
	Average shared text size (kbytes): 0
	Average unshared data size (kbytes): 0
	Average stack size (kbytes): 0
	Average total size (kbytes): 0
	Maximum resident set size (kbytes): 5866496
	Average resident set size (kbytes): 0
	Major (requiring I/O) page faults: 0
	Minor (reclaiming a frame) page faults: 1465782
	Voluntary context switches: 1
	Involuntary context switches: 3009
	Swaps: 0
	File system inputs: 0
	File system outputs: 0
	Socket messages sent: 0
	Socket messages received: 0
	Signals delivered: 0
	Page size (bytes): 4096
	Exit status: 0

$ time -v dist/build/simple-bench/simple-bench -A MergeSort -n 250000000
Testing: merge sort
Random             : 99.84921236 seconds
Sorted             : 40.872955826 seconds
Reverse-sorted     : 46.306745789 seconds
Random duplicates  : 89.254555669 seconds
Median killer      : 53.497425148 seconds
	Command being timed: "dist/build/simple-bench/simple-bench -A MergeSort -n 250000000"
	User time (seconds): 337.80
	System time (seconds): 2.78
	Percent of CPU this job got: 99%
	Elapsed (wall clock) time (h:mm:ss or m:ss): 5:40.62
	Average shared text size (kbytes): 0
	Average unshared data size (kbytes): 0
	Average stack size (kbytes): 0
	Average total size (kbytes): 0
	Maximum resident set size (kbytes): 5866352
	Average resident set size (kbytes): 0
	Major (requiring I/O) page faults: 0
	Minor (reclaiming a frame) page faults: 1465779
	Voluntary context switches: 1
	Involuntary context switches: 3351
	Swaps: 0
	File system inputs: 0
	File system outputs: 0
	Socket messages sent: 0
	Socket messages received: 0
	Signals delivered: 0
	Page size (bytes): 4096
	Exit status: 0
This also reduces merge sort's memory usage outside of benchmarks.
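
As a cross-check on the figures above, the following Haskell sketch converts the maximum resident set sizes reported by `time -v` from KiB to GiB. The names `maxRssKiB`, `kibToGiB`, `vectorGiB`, and `report` are made up for this illustration, and the 8-byte element width used for comparison is an assumption, since the element type isn't stated here.

```haskell
import Data.List (intercalate)
import Text.Printf (printf)

-- Maximum resident set sizes in KiB, copied from the `time -v` output above
-- (two runs per configuration).
maxRssKiB :: [(String, [Int])]
maxRssKiB =
  [ ("before these commits",   [9772512, 9772684])
  , ("after the first commit", [7819640, 7819500])
  , ("after both commits",     [5866496, 5866352])
  ]

-- Convert KiB to GiB (1 GiB = 1024 * 1024 KiB).
kibToGiB :: Int -> Double
kibToGiB k = fromIntegral k / (1024 * 1024)

-- Size in GiB of n elements of the given width in bytes.
-- The width is an assumption supplied by the caller.
vectorGiB :: Int -> Int -> Double
vectorGiB n bytesPerElem =
  fromIntegral n * fromIntegral bytesPerElem / 1024 ^ (3 :: Int)

-- One formatted line per configuration, rounded to two decimals.
report :: [String]
report =
  [ printf "%-22s %s GiB" label
      (intercalate ", " [printf "%.2f" (kibToGiB k) | k <- ks])
  | (label, ks) <- maxRssKiB
  ]
```

Loaded in GHCi, `mapM_ putStrLn report` prints the three configurations. Each step lowers the peak by roughly 1.86 GiB, which is also what `vectorGiB 250000000 8` gives for 250,000,000 elements of 8 bytes each.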

Compiling and testing

I'm not sure whether the BoundsChecks, UnsafeChecks, and InternalChecks flags have any effect here, since I didn't pass them when running cabal install vector. Either way, all tests passed in all three cases above.

$ cabal configure -O2 --flags="BoundsChecks UnsafeChecks InternalChecks bench properties llvm" --enable-benchmarks --enable-tests
Resolving dependencies...
Configuring vector-algorithms-0.8.0.0...

$ cabal build --ghc-options="-fforce-recomp -O2 -fllvm -optlo -O3 -optlc -O3"
Preprocessing library for vector-algorithms-0.8.0.0..
Building library for vector-algorithms-0.8.0.0..
[ 1 of 10] Compiling Data.Vector.Algorithms.Common ( src/Data/Vector/Algorithms/Common.hs, dist/build/Data/Vector/Algorithms/Common.o )
⋮
[2 of 2] Compiling Main             ( bench/simple/Main.hs, dist/build/simple-bench/simple-bench-tmp/Main.o )
Linking dist/build/simple-bench/simple-bench ...

$ dist/build/properties/properties
Int tests:
+++ OK, passed 1000 tests (100.0% introsort).
⋮
+++ OK, passed 1 tests (100% flagsort empty).

Even though the vector lengths used in each test are identical,
`simple-bench` allocates a new vector for every test it runs rather than
reinitializing the same vector repeatedly.

This can cause system freezes if the extra vectors overfill RAM and
start causing aggressive swapping.
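
The alternative mentioned above, reinitializing one vector for every test, can be sketched as follows. This is not the code from `simple-bench`: it uses `IOUArray` from the `array` package in place of the `vector` types to stay dependency-free, and `Fill`, `sortedFill`, `reverseFill`, `isAscending`, and `runAll` are hypothetical names.

```haskell
import Control.Monad (forM, forM_)
import Data.Array.IO (IOUArray, getElems, newArray, writeArray)

-- Hypothetical input generator: vector length -> index -> element.
type Fill = Int -> Int -> Int

sortedFill :: Fill
sortedFill _ i = i

reverseFill :: Fill
reverseFill n i = n - 1 - i

-- Stand-in for a real benchmark body; it only inspects the buffer.
isAscending :: IOUArray Int Int -> IO Bool
isAscending buf = do
  xs <- getElems buf
  return (and (zipWith (<=) xs (drop 1 xs)))

-- Allocate one buffer of length n, then overwrite its contents before
-- each test instead of allocating a fresh buffer per test.
runAll :: Int -> [(String, Fill)] -> (IOUArray Int Int -> IO a) -> IO [(String, a)]
runAll n fills test = do
  buf <- newArray (0, n - 1) 0
  forM fills $ \(name, fill) -> do
    forM_ [0 .. n - 1] $ \i -> writeArray buf i (fill n i)
    result <- test buf
    return (name, result)
```

With this shape, only one length-`n` input buffer exists at a time regardless of how many tests run, for example `runAll 5 [("sorted", sortedFill), ("reverse", reverseFill)] isAscending`.
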
The documentation at the top of `Merge.hs` states that "The temporary buffer
is preallocated to 1/2 the size of the input array". With this change, that
statement is accurate.

We also eliminate a few comparisons, which lets us spend one on checking
whether the temporary vector is needed and skip allocating it when it won't
be used at all.

Ceiling (`(len + 1)/2`) is used instead of floor (`len/2`) in order to
be able to fit the larger half of odd-length vectors.
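
To illustrate why a buffer of `(len + 1)/2` elements is enough, here is a minimal merge sort sketch with a half-size temporary buffer. It is not the code from `Merge.hs`: it uses `STUArray` from the `array` package rather than `vector`, it always gives the extra element to the left half (a choice made for this sketch), and `halfCeil`, `sortList`, `mergeSort`, and `merge` are hypothetical names.

```haskell
import Control.Monad (forM_, when)
import Control.Monad.ST (ST)
import Data.Array.ST (STUArray, newArray_, newListArray, readArray, runSTUArray, writeArray)
import Data.Array.Unboxed (UArray, elems)

-- Size of the larger half: ceiling of len / 2.
halfCeil :: Int -> Int
halfCeil len = (len + 1) `div` 2

sortList :: [Int] -> [Int]
sortList xs = elems sorted
  where
    n = length xs
    sorted :: UArray Int Int
    sorted = runSTUArray $ do
      arr <- newListArray (0, n - 1) xs
      -- The buffer is only allocated when there is something to merge.
      when (n > 1) $ do
        tmp <- newArray_ (0, halfCeil n - 1)
        mergeSort arr tmp 0 n
      return arr

-- Sort the half-open range [lo, hi) of arr, using tmp as scratch space.
mergeSort :: STUArray s Int Int -> STUArray s Int Int -> Int -> Int -> ST s ()
mergeSort arr tmp lo hi
  | hi - lo < 2 = return ()
  | otherwise = do
      let mid = lo + halfCeil (hi - lo) -- left half takes the extra element
      mergeSort arr tmp lo mid
      mergeSort arr tmp mid hi
      merge arr tmp lo mid hi

-- Copy the left run into tmp, then merge tmp and the right run back into arr.
merge :: STUArray s Int Int -> STUArray s Int Int -> Int -> Int -> Int -> ST s ()
merge arr tmp lo mid hi = do
  let leftLen = mid - lo
  forM_ [0 .. leftLen - 1] $ \i -> readArray arr (lo + i) >>= writeArray tmp i
  let go i j k -- i: position in tmp, j: position in right run, k: write position
        | i >= leftLen = return () -- remaining right elements are already in place
        | j >= hi = do -- right run exhausted: flush what is left in tmp
            readArray tmp i >>= writeArray arr k
            go (i + 1) j (k + 1)
        | otherwise = do
            a <- readArray tmp i
            b <- readArray arr j
            if a <= b
              then writeArray arr k a >> go (i + 1) j (k + 1)
              else writeArray arr k b >> go i (j + 1) (k + 1)
  go 0 mid lo
```

In this sketch a length-3 input copies a left run of 2 elements into the buffer, so a floor-sized buffer of `3 div 2 = 1` element would be too small, while `halfCeil 3 = 2` fits.
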

@erikd erikd left a comment

Nice!

@erikd erikd merged commit 499241f into erikd:master Sep 18, 2018