Description
I've written a simple example to test the performance of MbedTLS. It's unoptimized and probably incorrect in some aspects, but I hope it shows the issue that I'm facing when using MbedTLS through HTTP.jl.
```julia
using MbedTLS
using Sockets
using Base.Threads

function tls_test(num_iters, concurrency)
    # One entropy source and RNG shared by all connections.
    entropy = MbedTLS.Entropy()
    rng = MbedTLS.CtrDrbg()
    MbedTLS.seed!(rng, entropy)

    # 1 MiB payload; the contents do not matter for a throughput test.
    nbytes = 1024 * 1024
    buffer = Vector{UInt8}(undef, nbytes)

    # Limit the number of connections in flight.
    sem = Base.Semaphore(concurrency)

    @sync begin
        for i in 1:num_iters
            @spawn begin
                Base.acquire(sem)
                try
                    sock = connect("httpbin.org", 443)
                    ctx = MbedTLS.SSLContext()
                    conf = MbedTLS.SSLConfig()
                    MbedTLS.config_defaults!(conf)
                    MbedTLS.authmode!(conf, MbedTLS.MBEDTLS_SSL_VERIFY_REQUIRED)
                    MbedTLS.rng!(conf, rng)
                    function show_debug(level, filename, number, msg)
                        @show level, filename, number, msg
                    end
                    MbedTLS.dbg!(conf, show_debug)
                    MbedTLS.ca_chain!(conf)
                    MbedTLS.setup!(ctx, conf)
                    MbedTLS.set_bio!(ctx, sock)
                    MbedTLS.handshake(ctx)
                    # Take the address of the array data and keep `buffer`
                    # rooted for the duration of the raw write.
                    GC.@preserve buffer Base.unsafe_write(ctx, pointer(buffer), nbytes)
                    close(sock)
                finally
                    # Always free the slot, even if the connection fails.
                    Base.release(sem)
                end
            end
        end
    end
end

tls_test(4096, 512)
```
On a machine with 8 cores and a NIC capable of 1.5 GB/s, this achieves a bit less than 200 MB/s (4096 × 1 MiB in ~22 s). CPU usage is at 100% for the whole run.
mbedtls_gcm_update accounts for 40% of the profile, which means the CPU time spent in that function is ~70 s (0.4 × 22 s × 8 cores). My assumption is that this function neither performs nor triggers network communication and does pure computation.
So the throughput of mbedtls_gcm_update is effectively ~58 MB/s per core on this machine (4096 MB / ~70 s).
This means that although the NIC can do 1.5 GB/s, the time spent in mbedtls_gcm_update alone caps throughput at around 464 MB/s across 8 cores under ideal conditions (no other CPU usage in the call stack), and it would take about 26 cores (1500 / 58) to saturate the NIC.
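For anyone who wants to check the arithmetic, here is a minimal sketch that reproduces the estimates from the figures quoted above. The variable names are only illustrative, the inputs are the rounded numbers from this report, and MiB and MB are treated as interchangeable, so the results are rough.

```julia
# Inputs taken from the report (all approximate).
iters        = 4096      # connections made by tls_test
payload_mib  = 1         # MiB written per connection
wall_s       = 22.0      # wall-clock time of the run, in seconds
cores        = 8
gcm_fraction = 0.40      # share of CPU time attributed to mbedtls_gcm_update
nic_mb_s     = 1500.0    # NIC throughput, reading 1.5 GB/s as 1500 MB/s

total_mib      = iters * payload_mib              # data sent in total
overall_mib_s  = total_mib / wall_s               # observed end-to-end throughput
gcm_cpu_s      = gcm_fraction * wall_s * cores    # CPU-seconds spent in GCM
per_core_mib_s = total_mib / gcm_cpu_s            # GCM throughput per core
ceiling_mib_s  = per_core_mib_s * cores           # best case with 8 cores doing only GCM
cores_for_nic  = ceil(Int, nic_mb_s / per_core_mib_s)  # cores needed to fill the NIC

println("overall:        ", round(overall_mib_s; digits=1), " MiB/s")
println("GCM CPU time:   ", round(gcm_cpu_s; digits=1), " s")
println("GCM per core:   ", round(per_core_mib_s; digits=1), " MiB/s")
println("8-core ceiling: ", round(ceiling_mib_s; digits=1), " MiB/s")
println("cores for NIC:  ", cores_for_nic)
```

The per-core figure is only as good as the 40% attribution in the profile, so it should be read as an estimate rather than a measurement of mbedtls_gcm_update in isolation.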
For comparison, a similar test in Go (at a slightly higher level of abstraction, using HTTP PUT requests) on the same machine reaches ~1.5 GB/s, at which point the NIC's throughput is the bottleneck.
Are there any ideas for how mbedtls_gcm_update could be optimized? Is this worth submitting as an issue at https://github.com/Mbed-TLS/mbedtls ? I am not sure whether the same thing happens when the library is used directly, without the Julia wrapper.
PProf profile file:
prof_ssl1.pb.gz
