In [1]:
using BenchmarkTools

n = 10^3

nu_ = rand((1, 2, 3), n)

;

Generators do not allocate.

In [13]:
@btime sum(nu == 1 for nu in $nu_)

;

  134.143 ns (0 allocations: 0 bytes)


Materializing a vector with a comprehension adds an unnecessary allocation.

In [14]:
@btime sum([nu == 1 for nu in $nu_])

;

  303.450 ns (1 allocation: 1.06 KiB)


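Mapping a function with `map` allocates an intermediate vector.
Passing the function to `sum` instead applies it during the reduction, which avoids the allocation and speeds up the computation.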

In [15]:
@btime sum(map(==(1), $nu_))

@btime sum(==(1), $nu_)

;

  397.801 ns (1 allocation: 1.06 KiB)
  137.111 ns (0 allocations: 0 bytes)
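
The variants above differ only in how the intermediate values are produced, not in the result. A minimal sketch checking that, assuming the same kind of data as the setup cell (the `n_*` names are made up for illustration, and `count` is included only as another function-accepting reduction from Base):

```julia
# Illustrative data, mirroring the setup cell: 1000 draws from (1, 2, 3).
nu_ = rand((1, 2, 3), 10^3)

# Lazy generator: no intermediate array is built.
n_generator = sum(nu == 1 for nu in nu_)

# `map` first materializes a Vector{Bool}, then `sum` reduces it.
n_map = sum(map(==(1), nu_))

# `sum(f, itr)` applies `f` to each element during the reduction.
n_function = sum(==(1), nu_)

# `count(f, itr)` counts the elements for which `f` returns true.
n_count = count(==(1), nu_)

@assert n_generator == n_map == n_function == n_count
```

Since all four agree, the choice between them comes down to allocation and speed.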


Vectorization (dot broadcasting) is usually fast, here on par with a comprehension.

In [16]:
@btime $nu_ .* 1 ./ 2 .+ 3 .- 4 .^ 5

@btime [nu * 1 / 2 + 3 - 4 ^ 5 for nu in $nu_]

;

  232.868 ns (1 allocation: 7.94 KiB)
  247.361 ns (1 allocation: 7.94 KiB)


But vectorization is slow here because it makes a `BitVector` instead of a `Vector{Bool}`. A `BitVector` is slower to make, but it is smaller and faster to compute on.

In [17]:
@btime sum($nu_ .== 1)

;

  454.107 ns (3 allocations: 4.41 KiB)


In [18]:
bo_ = [nu == 1 for nu in nu_]

println(supertypes(typeof(bo_)))

println(sizeof(bo_))

@btime sum($bo_)

;

(Vector{Bool}, DenseVector{Bool}, AbstractVector{Bool}, Any)
1000
  27.513 ns (0 allocations: 0 bytes)


In [19]:
bi_ = nu_ .== 1

println(supertypes(typeof(bi_)))

println(sizeof(bi_))

@btime sum($bi_)

;

(BitVector, AbstractVector{Bool}, Any)
128
  4.875 ns (0 allocations: 0 bytes)
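
The two `sizeof` results follow from the storage layout: a `Vector{Bool}` uses one byte per element, while a `BitVector` packs its elements into 64-bit chunks. A small sketch of that arithmetic, assuming the same 1000-element setup (`n_chunk` is a name made up for illustration):

```julia
nu_ = rand((1, 2, 3), 10^3)

bo_ = [nu == 1 for nu in nu_]  # Vector{Bool}: one byte per element
bi_ = nu_ .== 1                # BitVector: one bit per element

# 1000 bits need cld(1000, 64) = 16 chunks of 8 bytes each.
n_chunk = cld(length(bi_), 64)

@assert sizeof(bo_) == 1000
@assert sizeof(bi_) == n_chunk * 8 == 128

# Both containers hold the same values, so the reductions agree.
@assert sum(bo_) == sum(bi_) == count(bi_)
```

The packed layout is likely why summing the `BitVector` is faster: a reduction can handle a whole 64-bit chunk at a time rather than one element at a time.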


Generators are usually the best choice, but prefer a function-accepting method like `sum(function, container)` when one exists.

Check the intermediate types that vectorization produces.
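
One way to check is to assert the type directly. The sketch below also shows one possible way to keep a `Vector{Bool}` while still using dot syntax, by broadcasting into a preallocated array; whether that is faster should be measured with `@btime`, not assumed.

```julia
nu_ = rand((1, 2, 3), 10^3)

# Broadcasting a comparison produces a BitVector.
@assert (nu_ .== 1) isa BitVector

# `map` and comprehensions produce a Vector{Bool}.
@assert map(==(1), nu_) isa Vector{Bool}
@assert [nu == 1 for nu in nu_] isa Vector{Bool}

# Writing the broadcast into a preallocated Vector{Bool} keeps that type.
bo_ = Vector{Bool}(undef, length(nu_))
bo_ .= nu_ .== 1

@assert bo_ isa Vector{Bool}
@assert bo_ == (nu_ .== 1)
```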