In many cases, the current implementation uses the Overlap and Save method for linear convolution.
While this method is optimized for streamed data, it is not optimized for cases where all the data is already in memory.
See https://discourse.julialang.org/t/convolution-conv-with-same-size-output/38260/11:

I think it would be better to implement a direct loop with @simd and @inbounds. It would probably be faster than the current method in most cases. For longer kernels / signals, it should keep using the frequency domain, as it is implemented now.
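
To make the suggestion concrete, here is a minimal sketch of what such a direct loop could look like for the 1D case. The name `direct_conv` is hypothetical and not part of the existing API. The sketch assumes plain `Vector` inputs (1-based indexing) and returns the full linear convolution of length `length(x) + length(h) - 1`.

```julia
# Hypothetical helper (not an existing library function): full linear
# convolution of two vectors by direct summation in the time domain.
function direct_conv(x::Vector{T}, h::Vector{S}) where {T<:Number, S<:Number}
    R = promote_type(T, S)
    nx, nh = length(x), length(h)
    # Convolution with an empty input is empty.
    (nx == 0 || nh == 0) && return R[]

    y = zeros(R, nx + nh - 1)
    # Outer loop over kernel taps, inner loop over the signal. Each inner
    # iteration writes a different element of y, so the loop has no
    # loop-carried dependency and is a candidate for @simd.
    @inbounds for j in 1:nh
        hj = h[j]
        @simd for i in 1:nx
            y[i + j - 1] += x[i] * hj
        end
    end
    return y
end

y = direct_conv([1.0, 2.0, 3.0], [1.0, 1.0])
```

The length at which the frequency domain method overtakes this loop depends on the element type and the hardware, so the switch-over threshold would have to be chosen by benchmarking.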