We could use `findall` instead of the current approach. It would then be faster with packages such as https://github.com/andyferris/AcceleratedArrays.jl.
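A rough sketch of what I mean, using only Base Julia. The data, the `target` value, and the `manual_indices` helper are made up for illustration; the helper just stands in for a hand-written scan and is not the actual current code. The point is that once the lookup goes through `findall` with a predicate, an array type that provides its own specialized `findall` method can speed it up without further changes here.

```julia
# Hypothetical data; the real array and lookup value are not shown in this thread.
v = [3, 1, 4, 1, 5, 9, 2, 6, 5, 3]
target = 5

# Stand-in for a hand-written scan (assumed; the actual current code may differ).
function manual_indices(v, target)
    idx = Int[]
    for i in eachindex(v)
        isequal(v[i], target) && push!(idx, i)
    end
    return idx
end

# Same lookup through Base's predicate form of findall.
found = findall(isequal(target), v)

# Both approaches return the same indices.
@assert found == manual_indices(v, target)
```

On a plain `Vector` this is still a linear scan, so the gain would only show up once the array is wrapped by something like the package above.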