
improve read_pcm_samples #95

Merged: 2 commits merged into dancasimiro:main from ymtoo:patch-1 on Jun 26, 2021
Conversation

@ymtoo (Contributor) commented Nov 25, 2020

Use broadcasting to reduce number of allocations.

using BenchmarkTools, WAV

path = mktempdir()
y = trunc.(Int16, 10000 .* sin.((0:9999999)/48000*2pi*440))
wavwrite(y, joinpath(path, "test.wav"), Fs=48000)

Before:

julia> @benchmark wavread(joinpath(path, "test.wav"); format="native")
BenchmarkTools.Trial: 
  memory estimate:  486.24 MiB
  allocs estimate:  29366204
  --------------
  minimum time:     1.229 s (1.05% GC)
  median time:      1.321 s (5.87% GC)
  mean time:        1.324 s (5.91% GC)
  maximum time:     1.426 s (10.15% GC)
  --------------
  samples:          4
  evals/sample:     1

After:

@benchmark wavread(joinpath(path, "test.wav"); format="native")
BenchmarkTools.Trial: 
  memory estimate:  114.44 MiB
  allocs estimate:  58
  --------------
  minimum time:     114.778 ms (0.18% GC)
  median time:      119.207 ms (3.60% GC)
  mean time:        125.500 ms (6.17% GC)
  maximum time:     188.214 ms (35.16% GC)
  --------------
  samples:          40
  evals/sample:     1
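
For illustration of the "use broadcasting" description above, here is a hedged sketch of the general idea (hypothetical buffer and variable names, not the actual patch): filling a sample column with one broadcast assignment over a reinterpreted byte buffer, instead of constructing each sample in a per-element loop, leaves only a handful of allocations.

# Hypothetical illustration only, not the code from this PR.
raw = rand(UInt8, 2 * 1000)                  # interleaved 16-bit PCM bytes
samples = Array{Int16}(undef, 1000, 1)

# Per-element version: a small temporary array is allocated on every iteration.
for i in 1:1000
    samples[i, 1] = reinterpret(Int16, raw[2i-1:2i])[1]
end

# Broadcast version: one reinterpret wrapper and one fused in-place assignment.
samples[:, 1] .= reinterpret(Int16, raw)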

@coveralls commented Nov 25, 2020

Coverage decreased (-0.07%) to 68.352% when pulling 634312c on ymtoo:patch-1 into 249b072 on dancasimiro:main.

@mgkuhn (Contributor) commented Nov 25, 2020

Thanks for pointing out this huge inefficiency! However, is first allocating the samples array as UInt64 and then bulk-converting it to sample_type at the end really the best we can do here? Shouldn't sample_type really be a type parameter of the function, so that the compiler can specialize all of read_pcm_samples for each value of sample_type and dispatch to a function specialized for that type?

I think how this function deals with types needs a slightly bigger rethink.
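
As a hedged sketch (hypothetical function name, not WAV.jl code), this is the kind of signature being suggested: when the element type arrives as a ::Type{T} argument, Julia compiles a specialized body for each sample type, rather than working with an element type that is only known at run time.

# Hypothetical sketch of dispatching on a type parameter.
function read_samples_sketch(io::IO, n, ::Type{T}) where {T}
    samples = Vector{T}(undef, n)   # T is a compile-time constant for each call
    for i in 1:n
        samples[i] = read(io, T)    # read(io, Int16) etc. resolves statically
    end
    return samples
end

# read_samples_sketch(io, 1024, Int16) and read_samples_sketch(io, 1024, Int32)
# each get their own specialized inner loop.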

@mgkuhn (Contributor) commented Nov 25, 2020

I see that

function read_ieee_float_samples(io::IO, chunk_size,
                                 fmt::WAVFormat, subrange,
                                 ::Type{floatType}) where {floatType}

already uses a type parameter floatType, and I suspect read_pcm_samples should do something similar. The call to pcm_container_type(nbits) needs to move out of read_pcm_samples so that the compiler can dispatch on the result. For performance-critical code, always decide on types at compile time.
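
In other words, the suggestion amounts to the usual Julia function-barrier pattern, sketched below with hypothetical names (not the actual WAV.jl code): the caller computes the container type once from nbits and passes it as a type parameter, so that everything inside the inner function is type-stable.

# Hypothetical sketch of the function-barrier pattern, not the actual WAV.jl code.
container_type_sketch(nbits) = nbits <= 16 ? Int16 : nbits <= 32 ? Int32 : Int64

# Caller: the type is computed once, at a single dynamic-dispatch point ...
read_chunk_sketch(io, n, nbits) = _read_chunk_sketch(io, n, container_type_sketch(nbits))

# ... and the inner function is then fully specialized on T.
function _read_chunk_sketch(io::IO, n, ::Type{T}) where {T}
    out = Vector{T}(undef, n)
    for i in 1:n
        out[i] = read(io, T)
    end
    return out
end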

@ymtoo (Contributor, Author) commented Nov 25, 2020

This PR is a workaround to avoid the large number of allocations with minimal changes to the function. I am not familiar with how the compiler works, but I agree that a rework is needed so that the method is determined at compile time by calling the function with a type parameter.

@mgkuhn (Contributor) commented Nov 25, 2020

Here is an alternative quick workaround (which makes sample_type a constant rather than a variable):

diff --git a/src/WAV.jl b/src/WAV.jl
index a41a6f1..be99782 100644
--- a/src/WAV.jl
+++ b/src/WAV.jl
@@ -313,13 +313,13 @@ end
 
 ieee_float_container_type(nbits) = (nbits == 32 ? Float32 : (nbits == 64 ? Float64 : error("$nbits bits is not supported for WAVE_FORMAT_IEEE_FLOAT.")))
 
-function read_pcm_samples(io::IO, chunk_size, fmt::WAVFormat, subrange)
+function read_pcm_samples(io::IO, chunk_size, fmt::WAVFormat, subrange,
+                          ::Type{sample_type}) where {sample_type}
     nbits = bits_per_sample(fmt)
     if isempty(subrange)
-        return Array{pcm_container_type(nbits), 2}(undef, 0, fmt.nchannels)
+        return Array{sample_type, 2}(undef, 0, fmt.nchannels)
     end
-    samples = Array{pcm_container_type(nbits), 2}(undef, length(subrange), fmt.nchannels)
-    sample_type = eltype(samples)
+    samples = Array{sample_type, 2}(undef, length(subrange), fmt.nchannels)
     nbytes = ceil(Integer, nbits / 8)
     bitshift = [0x0, 0x8, 0x10, 0x18, 0x20, 0x28, 0x30, 0x38, 0x40]
     mask = UInt64(0x1) << (nbits - 1)
@@ -605,7 +605,8 @@ function read_data(io::IO, chunk_size, fmt::WAVFormat, format, subrange)
         subrange = 1:convert(UInt, chunk_size / fmt.block_align)
     end
     if isformat(fmt, WAVE_FORMAT_PCM)
-        samples = read_pcm_samples(io, chunk_size, fmt, subrange)
+        samples = read_pcm_samples(io, chunk_size, fmt, subrange,
+                                   pcm_container_type(bits_per_sample(fmt)))
         convert_to_double = x -> convert_pcm_to_double(x, bits_per_sample(fmt))
     elseif isformat(fmt, WAVE_FORMAT_IEEE_FLOAT)
         samples = read_ieee_float_samples(io, chunk_size, fmt, subrange)

(But the more I look at this function, the more I develop an urge to rewrite it from scratch at some point ...)

@ymtoo (Contributor, Author) commented Nov 26, 2020

Your workaround for the type stability of read_pcm_samples works even better!

@benchmark wavread(joinpath(path, "test.wav"); format="native")
BenchmarkTools.Trial: 
  memory estimate:  38.15 MiB
  allocs estimate:  53
  --------------
  minimum time:     82.005 ms (0.00% GC)
  median time:      83.332 ms (0.38% GC)
  mean time:        86.543 ms (3.02% GC)
  maximum time:     155.182 ms (43.36% GC)
  --------------
  samples:          58
  evals/sample:     1

@dancasimiro merged commit 8564ac9 into dancasimiro:main on Jun 26, 2021
dancasimiro added a commit that referenced this pull request on Jun 26, 2021:
Includes fixes for
#95 -- optimize read_pcm_samples
#101 -- Fix bug where tags with multiple nulls exploded.