(let [max-t (double (* 44100 1.0))
inc-t (double (/ 1.0 44100))]
(time (loop [t 0.0]
(when (< t max-t)
(recur (+ t inc-t))))))
"Elapsed time: 5100.59828 msecs"
Holy crap!
Only wait, this is for 44100 seconds of audio, not one second like I thought.
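A quick check of that arithmetic, as a sketch: with `max-t` at 44100 seconds and a step of 1/44100 of a second, the loop runs about 44100 squared times, so the 5.1 seconds measured above comes to a few nanoseconds per iteration. The names `sample-rate`, `iterations` and `ns-per-iteration` are made up for illustration.

```clojure
;; Illustrative arithmetic only; these names are hypothetical.
(def sample-rate 44100)

;; max-t was (* 44100 1.0) seconds, stepped at 1/44100 s per iteration,
;; so the loop body runs roughly sample-rate * sample-rate times.
(def iterations (* sample-rate sample-rate))

(defn ns-per-iteration
  "Nanoseconds per loop iteration, given elapsed wall-clock milliseconds."
  [elapsed-ms]
  (/ (* elapsed-ms 1e6) iterations))

(println iterations)                     ; 1944810000
(println (ns-per-iteration 5100.59828))  ; roughly 2.6 ns per iteration
```

Seen that way, the bare loop is not slow at all; the run was simply about 44100 times longer than intended.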
(let [^double max-t (double (* 44100 1.0))
^double inc-t (double (/ 1.0 44100))]
(time (loop [t 0.0]
(when (< t max-t)
(recur (+ t inc-t))))))
This fails because you can’t put a primitive hint on locals like that. The `(double)` coercion ought to be enough to get it to use primitive math. So WTF is it so slow?
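A minimal sketch of both points, assuming current Clojure compiler behavior: a `^double` hint on a local whose initializer is already primitive should be rejected at compile time, while the `(double ...)` coercion alone should give primitive locals. `hint-rejected?` and `count-steps` are hypothetical names, not part of the benchmark.

```clojure
;; Sketch: a ^double hint on a local with a primitive initializer is
;; expected to fail compilation, so eval should throw here.
(defn hint-rejected? []
  (try
    (eval '(let [^double x (double 1.5)] x))
    false
    (catch Throwable _ true)))

;; Without the hint, the (double ...) coercion already yields primitive
;; locals, and a loop local initialized with 0.0 stays a primitive double.
(defn count-steps [max-t inc-t]
  (let [max-t (double max-t)
        inc-t (double inc-t)]
    (loop [t 0.0, n 0]
      (if (< t max-t)
        (recur (+ t inc-t) (inc n))
        n))))

(println (hint-rejected?))
(println (count-steps 1.0 0.25))
```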
(let [max-t (double (* 44100 1.0))
inc-t (double (/ 1.0 44100))]
(time (doseq [t (range 0 max-t inc-t)])))
"Elapsed time: 55371.522445 msecs"
Yaa!
Only wait, this is for 44100 seconds of audio, not one second like I thought.
Hmm. I have the debugging enabled in my project. Wonder if that matters? Turning it off and doing this again:
(let [max-t (double (* 44100 1.0))
inc-t (double (/ 1.0 44100))]
(time (loop [t 0.0]
(when (< t max-t)
(recur (+ t inc-t))))))
"Elapsed time: 5087.389305 msecs"
Nope.
(let [max-t (* 44100 1.0)
inc-t (/ 1.0 44100)]
(time (loop [t 0.0]
(when (< t max-t)
(recur (+ t inc-t))))))
"Elapsed time: 5101.189955 msecs"
(let [max-t (* 44100 1.0)
inc-t (/ 1.0 44100)]
(time (loop [t 0]
(when (< t max-t)
(recur (+ t inc-t))))))
"Elapsed time: 7947.990526 msecs"
(let [max-t (* 44100 1.0)
inc-t (/ 1.0 44100)]
(time (loop [t (long 0)]
(when (< t max-t)
(recur (+ t inc-t))))))
"Elapsed time: 7411.203103 msecs"
(let [max-t (double (* 44100 1.0))
inc-t (/ 1.0 44100)]
(time (loop [t (int 0)]
(when (< t max-t)
(recur (+ t inc-t))))))
"Elapsed time: 7299.571628 msecs"
(let [max-t (double (* 44100 1.0))
inc-t (/ 1.0 44100)]
(time (loop [t (int 0)]
(when (< t max-t)
(recur (+ t inc-t))))))
"Elapsed time: 7406.9407 msecs"
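One possible reading of the timings above, offered as a guess rather than a measured conclusion: when the loop local starts as a long (`0`, `(long 0)`, `(int 0)`) but `recur` passes a double, the compiler cannot keep `t` primitive and falls back to boxing it, whereas starting at `0.0` keeps it a primitive double. The sketch below compiles both variants with `*warn-on-reflection*` on and captures what the compiler prints; `compiler-warnings` is a hypothetical helper.

```clojure
;; Sketch: compile (and run) a form with *warn-on-reflection* enabled and
;; return whatever the compiler printed to *err*.
(defn compiler-warnings [form]
  (let [w (java.io.StringWriter.)]
    (binding [*err* w
              *warn-on-reflection* true]
      (eval form))
    (str w)))

;; Loop local starts as a long, but recur passes a double.
(def long-start
  '(let [inc-t (/ 1.0 44100)]
     (loop [t 0]
       (when (< t 1.0)
         (recur (+ t inc-t))))))

;; Loop local starts as a double and stays one.
(def double-start
  '(let [inc-t (/ 1.0 44100)]
     (loop [t 0.0]
       (when (< t 1.0)
         (recur (+ t inc-t))))))

(println (compiler-warnings long-start))    ; expected to mention auto-boxing of t
(println (compiler-warnings double-start))  ; expected to print nothing
```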
(time
(loop [t 0]
(when (< t 1.0)
(recur (+ t 2.2675736961451248E-5)))))
"Elapsed time: 3.055637 msecs"
(time
(loop [t 0]
(when (< t 1.0)
(recur (+ t (/ 1.0 44100))))))
"Elapsed time: 4.770925 msecs"
(let [inc-t (/ 1.0 44100)]
(time
(loop [t 0]
(when (< t 1.0)
(recur (+ t inc-t))))))
"Elapsed time: 3.64431 msecs"
(let [inc-t (/ 1.0 44100)
max-t 1.0]
(time
(loop [t 0]
(when (< t max-t)
(recur (+ t inc-t))))))
"Elapsed time: 4.422404 msecs"
(let [inc-t (/ 1.0 44100)
max-t 3600.0]
(time
(loop [t 0]
(when (< t max-t)
(recur (+ t inc-t))))))
(let [inc-t (/ 1.0 44100)
max-t 3600.0]
(time
(loop [t 0.0]
(when (< t max-t)
(Math/sin t)
(recur (+ t inc-t))))))
(let [inc-t (/ 1.0 44100)
max-t 60.0
s (null-sound)]
(time
(loop [t 0.0]
(when (< t max-t)
(sample s t)
(recur (+ t inc-t))))))
(let [inc-t (/ 1.0 44100)
max-t 60.0
s (silence 60)]
(time
(loop [t 0.0]
(when (< t max-t)
(sample s t)
(recur (+ t inc-t))))))
Wow - that’s much faster than the null sound. Something in `sample` must be slow. I suspect the call to vec. Let’s try taking that out.
(let [inc-t (/ 1.0 44100)
max-t 60.0
s (null-sound)]
(time
(loop [t 0.0]
(when (< t max-t)
(sample s t)
(recur (+ t inc-t))))))
Even slower without the vec! Maybe we should try to make generating the zero-samples faster.
(let [inc-t (/ 1.0 44100)
max-t 60.0
s (null-sound)]
(time
(loop [t 0.0]
(when (< t max-t)
(sample s t)
(recur (+ t inc-t))))))
Still slow! Maybe it’s calling channels that’s slow, not the zeros. Let’s try memoizing based on the sound, too.
(let [inc-t (/ 1.0 44100)
max-t 60.0
s (null-sound)]
(time
(loop [t 0.0]
(when (< t max-t)
(sample s t)
(recur (+ t inc-t))))))
Wow - much, much better. OK, so channels is probably slow. Why?
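Memoizing on the sound, as tried above, might look roughly like the sketch below. `slow-channels` is a hypothetical stand-in for an expensive per-sound lookup and a plain map stands in for a sound; neither is dynne's actual code.

```clojure
;; Sketch only: slow-channels is a hypothetical stand-in, not dynne's
;; channels function. The counter records how often the real body runs.
(def call-count (atom 0))

(defn slow-channels [sound]
  (swap! call-count inc)
  (:channels sound))

;; memoize caches results keyed on the argument, so repeated calls with
;; the same sound skip the body entirely.
(def fast-channels (memoize slow-channels))

(let [s {:channels 2}]
  (dotimes [_ 1000]
    (fast-channels s))
  (println @call-count))   ; the body should have run once
```

The trade-off is that `memoize` keeps every argument it has seen alive in its cache, so this hides the cost of a slow function rather than explaining it.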
(let [n 1000000]
(time
(loop [n n]
(when (pos? n)
(recur (dec n))))))
(let [n 1000000
s (linear 1.0 1 0)]
(time
(loop [n n]
(when (pos? n)
(channels s)
(recur (dec n))))))
So, yes: slow.
(let [n 1000000
s (null-sound)]
(time
(loop [n n]
(when (pos? n)
(satisfies? impl/Sound s)
(recur (dec n))))))
Nope.
(let [n 1000000
s (linear 1.0 1 0)]
(time
(loop [n n]
(when (pos? n)
(channels s)
(recur (dec n))))))
Hah. No.
(let [inc-t (/ 1.0 44100)
max-t 60.0
s (null-sound)]
(time
(loop [t 0.0]
(when (< t max-t)
(sample s t)
(recur (+ t inc-t))))))
Much faster!
(let [inc-t (/ 1.0 44100)
max-t 60.0
s (read-sound "sin.wav")]
(time
(loop [t 0.0]
(when (< t max-t)
(sample s t)
(recur (+ t inc-t))))))
Huh! Not bad. Let’s whip out a profiler and see what’s slow. Signs point to it being vec inside of read-sound’s reification of amplitudes.
(let [inc-t (/ 1.0 44100)
max-t 60.0
s (read-sound "sin.wav")]
(time
(loop [t 0.0]
(when (< t max-t)
(sample s t)
(recur (+ t inc-t))))))
Quite a bit better. And over 60x realtime. I call that a victory. Still, the profiler says that we’re spending a lot of time in `second`. Let’s see if replacing that with nth is any better.
(let [inc-t (/ 1.0 44100)
max-t 60.0
s (read-sound "sin.wav")]
(time
(loop [t 0.0]
(when (< t max-t)
(sample s t)
(recur (+ t inc-t))))))
Hmm. Not much change. But I found other places where we’re calling `first` and `second`.
(let [inc-t (/ 1.0 44100)
max-t 60.0
s (read-sound "sin.wav")]
(time
(loop [t 0.0]
(when (< t max-t)
(sample s t)
(recur (+ t inc-t))))))
Wow! Much better again! Up to 120x realtime.
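A rough way to see why swapping `first` and `second` for `nth` can matter on vectors: `second` is defined as `(first (next x))`, so it goes through the seq abstraction, while `nth` on a vector indexes directly. The sketch below uses a hypothetical two-channel `frame`; the timings it prints depend on JVM warm-up and hardware, so treat them as indicative only.

```clojure
;; Sketch: compare seq-based and indexed access on a small vector.
(def frame [0.25 -0.5])   ; hypothetical two-channel sample

(defn right-via-second [v] (second v))   ; builds a seq, then takes first
(defn right-via-nth [v] (nth v 1))       ; direct indexed lookup

;; Crude timing loop; numbers will vary from run to run.
(let [n 1000000]
  (time (dotimes [_ n] (right-via-second frame)))
  (time (dotimes [_ n] (right-via-nth frame))))
```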
Now the profiler says it’s get. So let’s see if we can refactor that out of there.
(let [inc-t (/ 1.0 44100)
max-t 60.0
s (read-sound "sin.wav")]
(time
(loop [t 0.0]
(when (< t max-t)
(sample s t)
(recur (+ t inc-t))))))
A little better. I think we might be at the point of diminishing returns. Profiler says the dominating factor is now self-time in sample. Which I’m not sure how to optimize any further easily. Let’s move on.
(let [inc-t (/ 1.0 44100)
max-t 60.0
s (read-sound "sin.wav")]
(time
(loop [t 0.0]
(when (< t max-t)
(sample s t)
(recur (+ t inc-t))))))
Slightly worse, but the other way was actually broken due to some laziness issue I never figured out. Let’s call this good for now.
(let [inc-t (/ 1.0 44100)
delta-t (/ inc-t 4)
s (read-sound "sin.wav")
max-t 60.0]
(time
(loop [t 0.0]
(when (< t max-t)
(oversample s t 4 delta-t)
(recur (+ t inc-t))))))
Wow. Slow, slow, slow. Profiler says it’s probably seq-related, since oversample does a whole bunch of sequence processing. Seems like a good time to try reducers out…
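For illustration, a reducers-based version of that kind of averaging could look something like this. It is a sketch under assumptions: `mean-of-subsamples` is a hypothetical stand-in for what an oversampler does, and `amp` is any function from time to amplitude, not dynne's `sample`.

```clojure
(require '[clojure.core.reducers :as r])

;; Sketch only: average n sub-samples of amp, starting at t and spaced
;; delta-t apart, using r/map and r/fold instead of lazy sequences.
(defn mean-of-subsamples [amp t n delta-t]
  (let [times (mapv #(+ t (* % delta-t)) (range n))]
    (/ (r/fold + (r/map amp times)) n)))

(println (mean-of-subsamples #(Math/sin %) 0.0 4 (/ 1.0 44100 4)))
```

With only a handful of sub-samples per call, `r/fold` has nothing to split across threads (its default partition size is 512 elements), so any gain would come from avoiding intermediate lazy seqs rather than from parallelism.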
(let [inc-t (/ 1.0 44100)
delta-t (/ inc-t 4)
s (read-sound "sin.wav")
max-t 60.0]
(time
(loop [t 0.0]
(when (< t max-t)
(oversample s t 4 delta-t)
(recur (+ t inc-t))))))
Which is only slightly better. We need something else.
(let [inc-t (/ 1.0 44100)
delta-t (/ inc-t 4)
s (read-sound "sin.wav")
max-t 60.0]
(time
(loop [t 0.0]
(when (< t max-t)
(oversample s t 4 delta-t)
(recur (+ t inc-t))))))
Much better. The profiler tells us that nth is the culprit. Maybe we can get rid of that.
(let [inc-t (/ 1.0 44100)
delta-t (/ inc-t 4)
s (read-sound "sin.wav")
max-t 60.0]
(time
(loop [t 0.0]
(when (< t max-t)
(oversample s t 4 delta-t)
(recur (+ t inc-t))))))
At this point, I’m not sure I can do much better without restructuring the way the code works.
(let [inc-t (/ 1.0 44100)
delta-t (/ inc-t 4)
s (read-sound "sin.wav")
max-t 60.0]
(time
(loop [t 0.0]
(when (< t max-t)
(sample s t 0)
(recur (+ t inc-t))))))
(let [inc-t (/ 1.0 44100)
delta-t (/ inc-t 4)
s (read-sound "sin.wav")
max-t 60.0]
(time
(loop [t 0.0]
(when (< t max-t)
(sample s t 0)
(recur (+ t inc-t))))))
Which is pretty good. At this point we think the performance is gated by the boxing that’s going on because we’re using a protocol. Next step: use an interface instead.
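The idea behind moving from a protocol to an interface can be sketched as follows. Protocol functions take and return objects, whereas an interface declared with `definterface` can carry primitive parameter and return types. `IAmp` and `constant-sound` are hypothetical names chosen for this sketch; they are not dynne's actual interface.

```clojure
;; Sketch only: an interface whose method takes a primitive double and
;; long and returns a primitive double, so hinted calls need no boxing.
(definterface IAmp
  (^double amplitude [^double t ^long c]))

(defn constant-sound
  "A sound whose amplitude is v at every time and on every channel."
  [^double v]
  (reify IAmp
    (amplitude [_ t c] v)))

;; The ^IAmp hint lets the compiler emit a direct interface call.
(let [^IAmp s (constant-sound 0.5)]
  (println (.amplitude s 0.0 0)))
```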
(let [inc-t (/ 1.0 44100)
delta-t (/ inc-t 4)
s (read-sound "sin.wav")
max-t 60.0]
(time
(loop [t 0.0]
(when (< t max-t)
(sample s t 0)
(recur (+ t inc-t))))))
No faster. Hmm.
(let [inc-t (/ 1.0 44100)
delta-t (/ inc-t 4)
s (read-sound "sin.wav")
max-t 60.0]
(time
(loop [t 0.0]
(when (< t max-t)
(sample s t 0)
(recur (+ t inc-t))))))
Now, how does that compare to a not-file sound?
(let [inc-t (/ 1.0 44100)
delta-t (/ inc-t 4)
s (silence 60.0)
max-t 60.0]
(time
(loop [t 0.0]
(when (< t max-t)
(sample s t 0)
(recur (+ t inc-t))))))
OK, so that’s sort of the theoretical minimum. What about oversampling?
(let [inc-t (/ 1.0 44100)
delta-t (/ inc-t 4)
s (silence 60.0)
max-t 60.0]
(time
(loop [t 0.0]
(when (< t max-t)
(oversample4 s t 0 delta-t)
(recur (+ t inc-t))))))
And how does that compare to a file sound?
(let [inc-t (/ 1.0 44100)
delta-t (/ inc-t 4)
s (read-sound "sin.wav")
max-t 60.0]
(time
(loop [t 0.0]
(when (< t max-t)
(oversample4 s t 0 delta-t)
(recur (+ t inc-t))))))
Interesting. What about that is slow, exactly?
(let [inc-t (/ 1.0 44100 4.0)
delta-t (/ inc-t 4)
s ^dynne.sound.impl.ISound (read-sound "sin.wav")
max-t 60.0]
(time
(loop [t 0.0]
(when (< t max-t)
(.amplitude s t 0)
(recur (+ t inc-t))))))
OK. And I just noticed that there are reflection and performance warnings. Let’s fix those.
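As a reminder of what fixing a reflection warning usually involves, here is a minimal sketch. It assumes the code is loaded at a REPL or from a file, where `*warn-on-reflection*` can be `set!`; the function names are hypothetical and unrelated to dynne.

```clojure
;; Ask the compiler to report calls it cannot resolve at compile time.
(set! *warn-on-reflection* true)

;; Reflective: the type of s is unknown, so .length is looked up at run
;; time on every call. Compiling this should print a reflection warning.
(defn len-reflective [s] (.length s))

;; Hinted: ^String lets the compiler emit a direct method call.
(defn len-hinted [^String s] (.length s))

(println (len-reflective "sin.wav") (len-hinted "sin.wav"))
```

Both functions return the same value; only the hinted one avoids the reflective lookup inside a hot loop.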
(let [inc-t (/ 1.0 44100)
delta-t (/ inc-t 4)
s (silence 60.0)
max-t 60.0]
(time
(loop [t 0.0]
(when (< t max-t)
(oversample4 s t 0 delta-t)
(recur (+ t inc-t))))))
Much better! Was about 480 msecs before.
Previous results: 1200 ms
(let [inc-t (/ 1.0 44100)
delta-t (/ inc-t 4)
s (read-sound "sin.wav")
max-t 60.0]
(time
(loop [t 0.0]
(when (< t max-t)
(oversample4 s t 0 delta-t)
(recur (+ t inc-t))))))
So not really any better. But that has to mostly be in the file stuff, because sampling silence is substantially faster.
(let [inc-t (/ 1.0 44100 4.0)
delta-t (/ inc-t 4)
s ^dynne.sound.impl.ISound (read-sound "sin.wav")
max-t 60.0]
(time
(loop [t 0.0]
(when (< t max-t)
(.amplitude s t 0)
(recur (+ t inc-t))))))
Yep. So most of the time we’re spending in oversample is actually calling into .amplitude. The profiler says that most of our time is in self-time of the above function, which doesn’t seem likely, since if I comment out the call to .amplitude, it’s only 30msec. But other than that, it complains that “clojure.lang.Numbers.lt” is the next-biggest thing!
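One hedged way to check whether a comparison like `(< t max-t)` is going through boxed math is the `:warn-on-boxed` setting of `*unchecked-math*`, which I believe needs Clojure 1.7 or later. The sketch compiles two small functions and captures the compiler output; `boxed-math-warnings` is a hypothetical helper.

```clojure
;; Sketch: compile a form with boxed-math warnings on and return whatever
;; the compiler printed to *err*.
(defn boxed-math-warnings [form]
  (let [w (java.io.StringWriter.)]
    (binding [*err* w
              *unchecked-math* :warn-on-boxed]
      (eval form))
    (str w)))

;; Untyped arguments: < has to compare boxed objects.
(println (boxed-math-warnings '(fn [t max-t] (< t max-t))))

;; Primitive doubles on both sides: expected to produce no warning.
(println (boxed-math-warnings '(fn [^double t ^double max-t] (< t max-t))))
```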
Calling into `.amplitude` directly on a file-based sound again, but with casting to try to get `lt` to be faster:
(let [inc-t (/ 1.0 44100 4.0)
delta-t (/ inc-t 4)
s ^dynne.sound.impl.ISound (read-sound "sin-long.wav")
max-t 3600.0]
(time
(loop [t 0.0]
(when (< t max-t)
(.amplitude s t 0)
(recur (+ t inc-t))))))
(let [inc-t (/ 1.0 44100 4.0)
delta-t (/ inc-t 4)
s ^dynne.sound.impl.ISound (read-sound "sin-long.wav")
max-t 3600.0]
(time
(loop [t 0.0]
(when (< t max-t)
(.amplitude s t 0)
(recur (+ t inc-t))))))
That’s actually a fair improvement. Profiler says nothing useful at this point. Might need to try YourKit to see if it’s any better. Still: good progress. Let’s see what oversampling looks like.
Previous results:
(let [inc-t (/ 1.0 44100)
delta-t (/ inc-t 4)
s (read-sound "sin.wav")
max-t 60.0]
(time
(loop [t 0.0]
(when (< t max-t)
(oversample4 s t 0 delta-t)
(recur (+ t inc-t))))))
Yep: much better. Going to call it a day and commit this.
(let [inc-t (/ 1.0 44100)
delta-t (/ inc-t 4)
s (read-sound "sin.wav")
max-t (duration s)]
(time
(loop [t 0.0]
(when (< t max-t)
;;(sample s t)
(oversample4 s t 0 delta-t)
(recur (+ t inc-t))))))
Previous results:
At commit 96de01a
(let [inc-t (/ 1.0 44100)
delta-t (/ inc-t 4)
s (read-sound "sin.wav")
max-t 60.0]
(time
(loop [t 0.0]
(when (< t max-t)
(oversample4 s t 0 delta-t)
(recur (+ t inc-t))))))
Wow - much slower. Let’s try again with an older commit.
(let [inc-t (/ 1.0 44100)
delta-t (/ inc-t 4)
s (read-sound "sin.wav")
max-t 60.0]
(time
(loop [t 0.0]
(when (< t max-t)
(oversample4 s t 0 delta-t)
(recur (+ t inc-t))))))
Still really slow. WTF?!
(let [inc-t (/ 1.0 44100)
delta-t (/ inc-t 4)
s (read-sound "sin.wav")
max-t 60.0]
(time
(loop [t 0.0]
(when (< t max-t)
(oversample4 s t 0 delta-t)
(recur (+ t inc-t))))))
Do not understand…
(let [inc-t (/ 1.0 44100)
delta-t (/ inc-t 4)
s (sinusoid 600 440)
max-t 600.0]
(time
(loop [t 0.0]
(when (< t max-t)
(oversample4 s t 0 delta-t)
(recur (+ t inc-t))))))
Wow. That should not be anywhere near that slow…how was it as fast as it was before?
The weird thing is, the profiler is reporting that all the time is in the benchmarking code, not in the dynne code itself. Maybe I should try this on a different OS.
(let [inc-t (/ 1.0 44100)
delta-t (/ inc-t 4)
s (sinusoid 600 440)
max-t 600.0]
(time
(loop [t 0.0]
(when (< t max-t)
(oversample4 s t 0 delta-t)
(recur (+ t inc-t))))))
None, because it ran so long I killed it. Going to reboot and see if that helps.
(let [inc-t (/ 1.0 44100)
delta-t (/ inc-t 4)
s (sinusoid 600 440)
max-t 1.0]
(time
(loop [t 0.0]
(when (< t max-t)
(oversample4 s t 0 delta-t)
(recur (+ t inc-t))))))
Holy crap, we’re actually slower than real-time. Profiler time!
(let [inc-t (/ 1.0 44100)
delta-t (/ inc-t 4)
s (sinusoid 600 440)
max-t 1.0]
(time
(loop [t 0.0]
(when (< t max-t)
(oversample4 s t 0 delta-t)
(recur (+ t inc-t))))))
Still slower than real-time.
(let [inc-t (/ 1.0 44100)
delta-t (/ inc-t 4)
s (sinusoid 600 440)
max-t 10.0]
(time
(loop [t 0.0]
(when (< t max-t)
(oversample4 s t 0 delta-t)
(recur (+ t inc-t))))))
Still incredibly slow. Profiler says it’s java.lang.reflect.getParameterTypes, followed by java.lang.Class.forName, seemingly from within sample. Hmm.
(let [inc-t (/ 1.0 44100)
delta-t (/ inc-t 4)
s (sinusoid 600 440)
max-t 60.0]
(time
(loop [t 0.0]
(when (< t max-t)
(oversample4 s t 0 delta-t)
(recur (+ t inc-t))))))
Wah! That certainly helped.
Previous results: "Elapsed time: 819.078953 msecs"
(let [inc-t (/ 1.0 44100)
delta-t (/ inc-t 4)
s (read-sound "sin.wav")
max-t 10.0]
(time
(loop [t 0.0]
(when (< t max-t)
(oversample4 s t 0 delta-t)
(recur (+ t inc-t))))))
Back to slower than real time.
Previous results:
;; "Elapsed time: 1278.849554 msecs"
(let [inc-t (/ 1.0 44100)
delta-t (/ inc-t 4)
s (sinusoid 600 440)
max-t 60.0]
(time
(loop [t 0.0]
(when (< t max-t)
(oversample4 s t 0 delta-t)
(recur (+ t inc-t))))))
Wah! That certainly helped.
(let [inc-t (/ 1.0 44100)
delta-t (/ inc-t 4)
s (read-sound "sin.wav")
max-t 60.0]
(time
(loop [t 0.0]
(when (< t max-t)
(oversample4 s t 0 delta-t)
(recur (+ t inc-t))))))
Woot!
(let [dur 60
path (str "test" dur ".wav")]
(when-not (.exists (io/file path))
(save (->stereo (sinusoid dur 440)) path 44100))
(-> path
read-sound
;;(sinusoid dur 440)
->stereo
(fade-in 20)
(fade-out 20)
(mix (-> (square-wave 60 880)
(timeshift (/ dur 2))
(->stereo)))
;;(visualize)
;;(save (str "test-out" dur ".wav") 44100)
oversample-all
time
))
Meh. OK at 6.1x realtime. Hope to do better.
(let [dur 60
path (str "test" dur ".wav")]
(when-not (.exists (io/file path))
(save (->stereo (sinusoid dur 440)) path 44100))
(-> path
read-sound
;;(sinusoid dur 440)
->stereo
(fade-in 20)
(fade-out 20)
(mix (-> (square-wave 60 880)
(timeshift (/ dur 2))
(->stereo)))
;;(visualize)
;;(save (str "test-out" dur ".wav") 44100)
oversample-all
time
))
Significantly better: 6.9x realtime. Still not great.
(let [dur 60
path (str "test" dur ".wav")]
(when-not (.exists (io/file path))
(save (->stereo (sinusoid dur 440)) path 44100))
(-> path
read-sound
;;(sinusoid dur 440)
->stereo
(fade-in 20)
(fade-out 20)
(mix (-> (square-wave 60 880)
(timeshift (/ dur 2))
(->stereo)))
;;(visualize)
;;(save (str "test-out" dur ".wav") 44100)
oversample-all
time
))
I’m pretty sure that the way I was using :inline metadata was totally wrong.
Previous results:
;; "Elapsed time: 8605.301345 msecs"
(let [dur 60
path (str "test" dur ".wav")]
(when-not (.exists (io/file path))
(save (->stereo (sinusoid dur 440)) path 44100))
(-> path
read-sound
;;(sinusoid dur 440)
->stereo
(fade-in 20)
(fade-out 20)
(mix (-> (square-wave 60 880)
(timeshift (/ dur 2))
(->stereo)))
;;(visualize)
;;(save (str "test-out" dur ".wav") 44100)
oversample-all
time
))
Yep, quite a bit better again: 16x realtime. And at this point I’m pretty much out of ideas for how to make it faster without changing the underlying metaphor of functional composition. Maybe something to talk to other people about.
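For reference, the "x realtime" figures are just audio duration divided by wall-clock time. A small helper makes the conversion explicit; `realtime-factor` is a hypothetical name. The 8605 ms figure quoted as a previous result for 60 seconds of audio works out to just under 7x, and 16x on the same 60 seconds would correspond to roughly 3.75 seconds of wall-clock time.

```clojure
;; Illustrative helper; realtime-factor is a hypothetical name.
(defn realtime-factor
  "Seconds of audio processed per second of wall-clock time."
  [audio-seconds elapsed-ms]
  (/ audio-seconds (/ elapsed-ms 1000.0)))

(println (realtime-factor 60 8605.301345))  ; just under 7x
(println (realtime-factor 60 3750.0))       ; 16.0
```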
(let [dur 60
s1 (sinusoid dur 440)
s2 (sinusoid dur 1234)]
(-> s1
(multiply s2)
(multiply s2)
(multiply s2)
(multiply s2)
(multiply s2)
(multiply s2)
(multiply s2)
oversample-all
time
))
(let [dur 60
s1 (sinusoid dur 440)
s2 (sinusoid dur 1234)
i1 (ops/input :s1)
i2 (ops/input :s2)
op (ops/compile
(-> i1
(ops/multiply i2)
(ops/multiply i2)
(ops/multiply i2)
(ops/multiply i2)
(ops/multiply i2)
(ops/multiply i2)
(ops/multiply i2)))]
(-> (op {:s1 s1 :s2 s2})
oversample-all
time
))
That’s not bad. Maybe 25% better.
(require '[hiphip.double :as dbl])
(require '[primitive-math :as p])
(defn mono-chunk-seq [chunk-size]
(repeat [(double-array chunk-size 1.0)]))
(defn stereo-chunk-seq [chunk-size]
(repeat [(double-array chunk-size 1.0) (double-array chunk-size 1.0)]))
How fast can we simply iterate through a sequence of vectors of double arrays, just looking at each element without doing anything to it?
(time
(let [chunk-size 44100
num-chunks (* 60 60)]
(doseq [[chunk] (take num-chunks (mono-chunk-seq chunk-size))]
(dotimes [n chunk-size]
(dbl/aget chunk n)))))
Not bad! That’s an hour’s worth of samples at 44100 Hz, with a 1.0-second chunk size.
(time
(let [chunk-size 10000
num-chunks (->> chunk-size (/ 44100) (* 60 60))]
(doseq [[chunk-l chunk-r] (take num-chunks (stereo-chunk-seq chunk-size))]
(dbl/amap [l chunk-l r chunk-r] (p/+ l r)))))
OK, wow, that’s pretty good. Let’s go build something on this.
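To show the shape of the chunked approach without the `hiphip` and `primitive-math` dependencies, here is a sketch that mixes two channels of one chunk using only `clojure.core`. `mix-chunk` is a hypothetical name, and this is not necessarily how the library built on these experiments does it.

```clojure
;; Sketch only: element-wise sum of two equally sized double arrays,
;; returned as a new array. The ^doubles hints keep the array access
;; primitive and free of reflection.
(defn mix-chunk
  ^doubles [^doubles l ^doubles r]
  (amap l i out (+ (aget l i) (aget r i))))

(let [chunk-size 10000
      l (double-array chunk-size 1.0)
      r (double-array chunk-size 1.0)
      mixed (mix-chunk l r)]
  (println (alength mixed) (aget mixed 0)))   ; 10000 2.0
```

The point of the design is that the per-sample work happens in a tight primitive loop over an array, and the sequence machinery is only touched once per chunk.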