The partitions.bin
used to be a 2D image of pixels, with each pixel describing a “q ring”, as such:
Each pixel is an index describing which ring it belongs to, starting at 0 (blue) in the center.
partitions.bin
contains this 2D image as a list of Word8
index values. To process it further, we want a different format though: for each partition p
, we want a list of all pixel indices (as Int
) for this partition. It’s a translation like this:
toIndices :: [Word8] -> [[Int]]
Assuming contiguous indices for the partitions.
The Minimal.hs
file implements this logic, based on Data.Vector.Unboxed
for the pixels instead of slow, big Haskell lists.
I’d like to make this as fast as possible, and I thought, I could parallelize the work: go through the image “number of partitions” times in parallel and gather the indices. Instead of the plain map:
(\partitionNumber -> UV.findIndices (== partitionNumber) pixels) <$> partitionIndices
…one could do:
((\partitionNumber -> UV.findIndices (== partitionNumber) pixels) <$> partitionIndices) `using` parList rseq
To spark each iteration through the image and gather the results.
This works, in the sense that we get sparks, and they are all converted. But the result, for some reason, is slower than before.
I don’t know how to debug this problem, and ask for help.
Easy enough:
cabal run problem -- +RTS -s
To run the program without parallelization (with -N1
). Takes about 3.2s on my machine.
With parallelization:
cabal run problem -- +RTS -s -N
Takes about 5s on my machine.