In [13]:
] activate ../../

Activating environment at `C:\Users\carsten\Desktop\Oulu2020\Project.toml`


# STREAM Benchmark

Resources: https://blogs.fau.de/hager/archives/8263, https://www.cs.virginia.edu/stream/

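The goal is to obtain a realistic estimate of the maximum achievable memory bandwidth by timing four simple streaming kernels (copy, scale, add, triad) on arrays that are much larger than the caches.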

## Kernel

In [7]:
function copy(C,A)
    @assert length(C) == length(A)
    @inbounds for i in eachindex(C,A)
        C[i] = A[i]
    end
    nothing
end

function scale(B,C,s)
    @assert length(C) == length(B)
    @inbounds for i in eachindex(B,C)
        B[i] = s * C[i]
    end
    nothing
end

function add(C,A,B)
    @assert length(C) == length(B) == length(A)
    @inbounds for i in eachindex(C,A,B)
        C[i] = A[i] + B[i]
    end
    nothing
end

function triad(A,B,C,s)
    @assert length(C) == length(B) == length(A)
    @inbounds for i in eachindex(A,B,C)
        A[i] = B[i] + s*C[i]
    end
    nothing
end

triad (generic function with 2 methods)

In [35]:
using CpuId

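# Rule of thumb: make each array several times larger than the last-level cache
# (here 8x its size in bytes) so the kernels stream from main memory, not from cache.
N = 2 * Int(4 * last(cachesize()) / 8) # number of Float64 (8-byte) elements per array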
A, B, C, D = zeros(N), zeros(N), zeros(N), zeros(N);
s = rand();
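
To make the sizing rule concrete, here is a small sketch of the same arithmetic with a fixed cache size. The 6 MiB value is an assumption for illustration (it is not read from the hardware), and the names `l3_bytes`, `N_example`, and `array_bytes` are hypothetical; on a real machine the cache size comes from `last(cachesize())` as in the cell above.

```julia
# Assumption: a 6 MiB last-level cache (illustrative value, not queried via CpuId).
l3_bytes = 6 * 1024^2

# Same formula as in the notebook cell: number of Float64 elements per array.
N_example = 2 * Int(4 * l3_bytes / 8)

# Size of one array in bytes, and how many times it exceeds the assumed cache.
array_bytes = N_example * sizeof(Float64)

println("elements per array: ", N_example)
println("array size / cache: ", array_bytes / l3_bytes)
```

With this formula every array is eight times the size of the last-level cache, so each kernel sweep has to fetch its data from main memory.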

In [41]:
using BenchmarkTools

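# Assuming write-allocate: every store first loads the target cache line, so
# copy/scale move 3 x 8 = 24 bytes and add/triad move 4 x 8 = 32 bytes per iteration.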
t_copy = @belapsed copy($C, $A)
membw_copy = N*24*1e-9/t_copy
println("COPY:  ", membw_copy, " GB/s")

t_scale = @belapsed scale($B, $C, $s)
membw_scale = N*24*1e-9/t_scale
println("SCALE: ", membw_scale, " GB/s")

t_add = @belapsed add($C, $A, $B)
membw_add = N*32*1e-9/t_add
println("ADD:   ", membw_add, " GB/s")

t_triad = @belapsed triad($A, $B, $C, $s)
membw_triad = N*32*1e-9/t_triad
println("TRIAD: ", membw_triad, " GB/s")

COPY:  30.319661050982916 GB/s
SCALE: 30.193547922664337 GB/s
ADD:   27.856793759242 GB/s
TRIAD: 28.070409637210343 GB/s
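
The factors 24 and 32 in the bandwidth formulas come from counting 8-byte `Float64` transfers per loop iteration under the write-allocate assumption. The sketch below spells out that count; `bytes_per_iteration` is a hypothetical helper introduced only for illustration and is not part of the notebook.

```julia
# Hypothetical helper: bytes moved between memory and CPU per loop iteration.
# Assumes Float64 elements. With write-allocate, each store stream costs two
# transfers: one read of the target cache line plus the write-back itself.
function bytes_per_iteration(nloads, nstores; elsize = sizeof(Float64), write_allocate = true)
    store_transfers = write_allocate ? 2 * nstores : nstores
    return (nloads + store_transfers) * elsize
end

copy_bytes  = bytes_per_iteration(1, 1)   # copy and scale: 1 load, 1 store
add_bytes   = bytes_per_iteration(2, 1)   # add and triad:  2 loads, 1 store
copy_no_wa  = bytes_per_iteration(1, 1; write_allocate = false)

println("copy/scale: ", copy_bytes, " bytes per iteration")
println("add/triad:  ", add_bytes, " bytes per iteration")
println("copy without write-allocate: ", copy_no_wa, " bytes per iteration")
```

If the stores bypassed the cache (no write-allocate), the same timings would have to be converted with 16 and 24 bytes per iteration instead, giving correspondingly lower bandwidth numbers.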


In [50]:
using Statistics
membw_mean = mean([membw_copy, membw_scale, membw_add, membw_triad])
println("Mean max. memory bandwidth: ", round(membw_mean, digits=2), " GB/s")

Mean max. memory bandwidth: 29.11 GB/s


According to the [Intel specification for the i5-6600](https://ark.intel.com/content/www/de/de/ark/products/88188/intel-core-i5-6600-processor-6m-cache-up-to-3-90-ghz.html), the theoretical maximum memory bandwidth is 34.1 GB/s. Dividing the measured mean by this value gives the fraction of the peak that the kernels actually reach:

In [53]:
membw_theory = 34.1;

In [54]:
membw_mean/membw_theory

0.8536687123907595
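
As a plausibility check on the 34.1 GB/s figure, the sketch below recomputes a theoretical peak from memory parameters. The configuration used here (dual-channel DDR4 at 2133 MT/s with a 64-bit bus per channel) is an assumption about the machine and is not stated in the notebook; all variable names are hypothetical.

```julia
# Assumed memory configuration (not taken from the notebook):
transfers_per_second = 2133e6   # DDR4-2133: 2133 million transfers per second
bytes_per_transfer   = 8        # 64-bit wide channel
channels             = 2        # dual-channel operation

# Theoretical peak bandwidth in GB/s (1 GB = 1e9 bytes, as in the cells above).
peak_gbs = transfers_per_second * bytes_per_transfer * channels * 1e-9

println("Theoretical peak: ", round(peak_gbs, digits=1), " GB/s")
```

Under these assumptions the product rounds to the same 34.1 GB/s quoted in the specification, so the measured mean corresponds to roughly 85% of that peak.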