Docs: add table with amount of memory required at different matrix sizes #3
Worth pointing out that it tries to be reasonably efficient, i.e. it reuses the same memory for all arrays being benchmarked, so only the largest array matters. This gives the memory requirement (in GiB) for the largest matrix:

```julia
julia> mem_req(s, T = Float64) = 3s^2*sizeof(T) / (1 << 30)
mem_req (generic function with 2 methods)

julia> mem_req(10_000)
2.2351741790771484

julia> mem_req(20_000)
8.940696716308594
```

So 2.25 GiB isn't too bad. However, if you have an 8 core (16 thread) computer capable of running full-rate AVX2 at 4 GHz, you have

```julia
julia> 4 * (4 + 4) * 2 * 8
512
```

512 theoretical peak GFLOPS.

```julia
julia> time_est(sz, gflops) = 2e-9 * sz^3 / gflops
time_est (generic function with 1 method)

julia> time_est(10_000, 512)
3.9062500000000004

julia> time_est(20_000, 512)
31.250000000000004
```

That is, 3.9 or 31 seconds. Running a lot of benchmarks with big matrices can take a while.
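For reference, here is how I read the factors in that peak-GFLOPS arithmetic (the labels are my interpretation; the comment above only gives the bare expression):

```julia
# Unpacking 4 * (4 + 4) * 2 * 8 == 512 theoretical peak GFLOPS.
ghz       = 4      # clock speed in GHz
flops_fma = 4 + 4  # one AVX2 FMA on 4 Float64s = 4 multiplies + 4 adds
fma_units = 2      # assumed two FMA ports per core
cores     = 8      # physical cores (hyperthreads don't add FLOPS)

peak_gflops = ghz * flops_fma * fma_units * cores  # 512
```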
Based on the above, here's a function that prints the table for a given element type (`Float64`, `Float32`, `Int64`, or `Int32`):

```julia
mem_req(s, T) = 3s^2*sizeof(T) / (1 << 30)

function f(::Type{T}, Ns = nothing) where {T}
    println("| Matrix Size | Memory |")
    println("| ----------- | ------ |")
    if Ns isa Nothing
        _Ns = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 30, 40, 50, 60, 70, 80, 90, 100]
    else
        _Ns = Ns
    end
    for N in _Ns
        mem = mem_req(N * 1_000, T)
        m = round(mem; digits = 2)
        println("| $(N)k by $(N)k | $(m) GiB |")
    end
    return nothing
end
```
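For example, restricting to the two smallest sizes produces the following output (`mem_req` is repeated here so the snippet runs standalone; this loop is the body of `f(Float64, [1, 2])` inlined):

```julia
mem_req(s, T) = 3s^2 * sizeof(T) / (1 << 30)

# Equivalent of calling f(Float64, [1, 2]):
println("| Matrix Size | Memory |")
println("| ----------- | ------ |")
for N in [1, 2]
    m = round(mem_req(N * 1_000, Float64); digits = 2)
    println("| $(N)k by $(N)k | $(m) GiB |")
end
# Prints:
# | Matrix Size | Memory |
# | ----------- | ------ |
# | 1k by 1k | 0.02 GiB |
# | 2k by 2k | 0.09 GiB |
```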
I'll make a PR to add those tables to a page in the docs.
Sorry, I realized it creates two
Ah good point. I'll regenerate the tables.
E.g. here's what I'm envisioning:
I just made those numbers up - obviously they are incorrect. But it would be nice to have a table like that, with the actual numbers.
That way, when people want to run these benchmarks on their own computer, they can just look up how much memory they'll need in the table.
Also, if someone wants to run the benchmarks on a cluster, they can ask their scheduler (e.g. SLURM) for the necessary amount of memory.
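For instance, a SLURM batch script for the 20k-by-20k `Float64` benchmarks (≈8.94 GiB per the numbers above) might request a bit of headroom on top of that; the script name and thread count here are illustrative, not from this project:

```shell
#!/bin/bash
#SBATCH --mem=10G           # ~8.94 GiB for 20k-by-20k Float64, plus headroom
#SBATCH --cpus-per-task=16  # match the 8-core / 16-thread machine above
julia --threads=16 benchmarks.jl  # hypothetical benchmark script
```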