Add 0-latency convolution reverbs, OSX Mojave support, bump stack lts versions (#25)
OlivierSohn · Oct 18, 2018 · 14a745f
1 parent c99dfaa
Showing 26 changed files with 737 additions and 203 deletions.
diff --git a/.travis.yml b/.travis.yml
@@ -8,12 +8,13 @@ os:
   - osx
   - linux
 
-# earlier clang versions may induce "symbols not found" errors
-osx_image: xcode9.3
+# not sure if that is required: at the time of writing this comment,
+# the default travis xcode version is 9.4 and may work fine as well.
+osx_image: xcode10
 
 env:
-  - RESOLVER="lts-11.9"           # ghc 8.2.2
-  - RESOLVER="nightly-2018-06-26" # ghc 8.4.3
+  - RESOLVER="lts-11.22"          # ghc 8.2.2
+  - RESOLVER="lts-12.13"          # ghc 8.4.3
   - RESOLVER="nightly"
 
 matrix:
@@ -55,20 +56,11 @@ before_install:
     sudo unlink /usr/bin/gcc && sudo ln -s /usr/bin/gcc-7 /usr/bin/gcc
     gcc --version
   fi
-# We need to install clang on osx
 - |
   if [[ $TRAVIS_OS_NAME == 'osx' ]]
   then
     brew bundle --verbose
     brew info llvm
-    brew update
-    brew install llvm
-    brew info llvm
-    export CC="/usr/local/opt/llvm/bin/clang"
-    export CXX="/usr/local/opt/llvm/bin/clang++"
-    export LDFLAGS="-L/usr/local/opt/llvm/lib"
-    export CPPFLAGS="-I/usr/local/opt/llvm/include"
-    export PATH="/usr/local/opt/llvm/bin:$PATH"
   fi
 - mkdir -p ~/.local/bin
 - export PATH=$HOME/.local/bin:$PATH

diff --git a/BACKLOG.md b/BACKLOG.md
@@ -1,13 +1,76 @@
-- add reverb in game-synths.
+- we could stream audio at 200 kBytes per second (4 bytes per frame, i.e. 16 bits per channel in stereo),
+i.e. 1.6 Mb/s
+
+- write a paper on auto optimizations used in the singlethread 0-latency convolution reverb algorithm.
+Buzzwords:
+. Robust (no need to synchronize with other threads)
+. Predictable (pure computations, auto-benchmarks)
+. Callback-length aware : we optimize for
+  worst case cost per audio callback (max duration over all possible phases)
+
+- release imj-audio
+. Document linux performance limitation for fft:
+we could optimize imj-fft by using assume_aligned
+and/or using fft libraries
+
+. documentation will likely fail to build, I'll need to upload it manually.
+
+- test MIDI input on linux
+
+- choose a better default for MIDI polling, it is using 40% CPU!
+. Also, ideally Haskell could be used to set up the thread, but the MIDI polling should occur outside ghc's
+scope to avoid GC pauses, and the Haskell overhead.
+
+- Effects
+. Allow to play multiple notes of the same wind at the same time (I'm not sure it works today)
+. Allow to play different kinds of Wind at the same time
+. give each wind preset a Haskell constructor.
+. do the same thing for robots
+
+- Panning
+. allow panning of instruments
+. in game-synths, each player could be automatically assigned a different position
+in the stereo field.
+When using true stereo reverbs, we could pan each player to far left / far right.
+
+- reverbs in imj-game-synths
+
+. embed some reverbs
+. make an argument with "path to reverbs" to allow adding more reverbs
+
+. let user adjust reverb gain (in addition to dry/wet)
+
+. send the reverb file OTN : the key is "file path + hash", to see if the client has it.
+(hash should be part of spaceResponse_t)
+
+. take the cost of the first convolution into account when dynamically optimizing:
+
+worst case:
+
+early part (once every 2^n samples):
+----------
+for k : [nDropped..n]
+  1 forward FFT (size 2^k),
+  1 inverse FFT (size 2^k),
+  2 multiply add  (size 2^k)
+
+late part ():
+---------
+
+if the number of late coefficients is much bigger than the number of early ones,
+the cost of early will be negligible.
+but if it's similar, we may have a problem, because early worst is 1.5x late worst in that case
+
+A first approach is to measure the peak overhead, by sample, of early coefficients handling,
+and give this information to the algorithm, along with the timing (what is in sync with which grain?).
+
+- visual feedback when the compressor kicks in (rt thread sets a flag, non rt thread checks every second)
 
 - should we compress network messages? to analyze traffic:
 sudo tcpdump -i lo0 -v -nnXSs 0
 
-- change the sound in real time, when harmonics change.
-.. enqueue parameter changes :
-   change volume / phase of harmonic 6 to ...
-the changes should be linearily smoothed over a long time so that there is
-  no sound discontinuity
+- harmonics param changes should change the sound in real time
+(use same technique as reverb wet)
 
 - When playing a loop, the server should offset the miditimestamps by
 period of the loop * loop number, else jitter compensation will not work.
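
The streaming estimate in the first backlog item can be checked with a short calculation. This is only a sketch: the backlog fixes 4 bytes per frame (2 channels of 16 bits), but the sample rates used in the comments (44.1 kHz and 48 kHz) are assumptions, and the function names are hypothetical. Both rates land slightly under the rounded 200 kBytes per second and 1.6 Mb/s quoted in the backlog.

```cpp
#include <cassert>
#include <cstdint>

// Raw (uncompressed) PCM bandwidth in bytes per second.
// Hypothetical helper, not part of the repository.
inline std::int64_t pcmBytesPerSecond(std::int64_t sampleRateHz,
                                      std::int64_t channels,
                                      std::int64_t bytesPerSample) {
  assert(sampleRateHz > 0 && channels > 0 && bytesPerSample > 0);
  // One frame holds one sample per channel.
  return sampleRateHz * channels * bytesPerSample;
}

// Converts a byte rate to a bit rate.
inline std::int64_t toBitsPerSecond(std::int64_t bytesPerSecond) {
  return bytesPerSecond * 8;
}

// With stereo 16-bit samples (4 bytes per frame):
//   44100 Hz (assumed) gives 176400 bytes/s, i.e. 1411200 bits/s
//   48000 Hz (assumed) gives 192000 bytes/s, i.e. 1536000 bits/s
```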

diff --git a/README.md b/README.md
@@ -60,12 +60,13 @@ List of packages, inverse-topologically sorted wrt dependencies, with keywords /
      - In the terminal
   - Player input, window management.
 - [imj-space](/imj-space)
-  - Creates random 2D game levels, given some topological constraints.
+  - Randomized creation of 2D game levels, given some topological constraints.
 - [imj-particlesystem](/imj-particlesystem) (formerly `imj-animation`)
   - Physics-based and geometric particle systems.
 - [imj-measure-stdout](/imj-measure-stdout)
   - An executable to measure the maximum capacity of stdout, and observe
-  the effect of different buffering modes.
+  the effect of different buffering modes. This was used while developing
+	[delta-rendering](/imj-base/src/Imj/Graphics/Render/Delta.hs).
 - [imj-server](/imj-server)
   - Using [websockets] to communicate between server and clients.
   - Broadcast messages to all clients
@@ -146,11 +147,13 @@ Passing no command line argument will run the games in single player mode:
 stack exec <game-executable>
 ```
 
-# Run the games in Multi-player mode
+# Deploy a game server for Multi-player mode
 
 Use the [deployment script] to host the games on a [Heroku] server.
 
-## Connect to a running game server
+# Run the games in Multi-player mode
+
+### Connect to a running game server
 
 ```shell
 stack exec -- <game-executable> -n <serverName> -p<serverPort>

diff --git a/imj-audio/README.md b/imj-audio/README.md
@@ -1,12 +1,85 @@
 # What is it?
 
-Haskell bindings to a C++ audio engine
+Haskell bindings to a C++ lockfree audio engine, using portaudio underneath to connect to
+the audio platform.
+
+# Design goals
+
+## audio callback deadlines
+
+To minimize the likelihood of a missed audio deadline,
+we use no lock, and we make sure that our data is properly placed in memory
+to benefit from data locality and to minimize cache misses.
+
+When appropriate, we use auto-adjusting algorithms that probe the run-time conditions
+to choose the best (fastest) way of performing a computation.
+For example, the 0-latency convolution reverb adjusts its parameters to the actual hardware,
+and to the length of the audio callback, so as to minimize
+the estimated worst case cost per audio callback.
+
+## Numeric noise
+
+We chose to run all computations in double precision, so as to
+minimize the numeric noise induced when doing long FFTs.
+
+## Features:
+
+- Polyphonic synthesizers:
+  - Using oscillators:
+    - sine
+    - sine with loudness volume compensation, to produce a sound of equal perceived loudness
+    on all frequencies
+    - triangle
+    - saw
+    - square
+  - Multiple oscillators running at different frequencies (harmonics) can be combined
+    to produce a complex tone.
+  - AHDSR envelopes are used to shape the amplitude of the sound.
+    Attack, Decay and Release interpolations can be customized with multiple
+    easing options.
+  - Autorelease mode to skip the Sustain phase.
+- The count of simultaneously used synthesizers is limited only by the amount of RAM
+  that is present on your machine.
+- Postprocessing:
+  - Zero-latency, single thread convolution reverbs. Very long responses can be used
+  and the computation scheme uses dynamic optimization to figure out the best
+  way to carry out the computation, so that every audio callback finishes in time.
+
+  No response compression occurs, so responses are used at their full resolution,
+  even for the tail of very long responses.
+
+  CPU usage (in percentages of a single core) for a 2015 MacBook Air laptop,
+    Intel Core i7 / 2.2 GHz:
+
+                                      using FFTs from:
+                                     Accelerate   imj-fft (naive)
+    2-channels, 17 seconds long   :   12%          30%
+    4-channels, 12 seconds long   :   17%          45%, with buffer underruns.
+
+  Accelerate is available on OSX, so on Linux only shorter room responses can be used
+  without underruns. This could be fixed by using an optimized FFT library on linux, too.
+
+  - A compressor limits the audio output to prevent it from clipping.
+
+# What's next ?
+
+- The ability to modify the sound characteristics while a note is being played.
+- Make more audio engine instruments available:
+  some are based on frequency sweeps, to emulate birds singing, others are
+  based on markov chains to emulate robotic sounds.
+- Better convolution performance on Linux by using the equivalent of Accelerate on OSX (Blas or FFTW)
+- Investigate using OpenCL for (some FFTs of a) convolution reverbs, using
+  fast submission to avoid OpenCL submission latency
+  (see https://www.iwocl.org/wp-content/uploads/iwocl-2016-gpu-daemon.pdf).
+  We could parallelize early and late coefficients handling.
+- Lower the room response tail resolution to reduce CPU usage (only when there
+  is not enough CPU available)
 
 # Supported platforms
 
 Officially supported client platforms are macOS and Ubuntu.
 
 # Build
 
-The c++ sources use C++17, hence recent enough compilers (`clang`, `gcc`)
+The C++ sources use C++17, hence recent enough compilers (`clang`, `gcc`)
 are needed to build the package.
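
The "Numeric noise" design goal above can be illustrated with a toy accumulation. This is not code from the audio engine, and `accumulateOnes` is a hypothetical name; it only shows how a long chain of single-precision additions loses information that double precision keeps. With IEEE 754 arithmetic and the default rounding mode, a `float` has a 24-bit significand, so once the accumulator reaches 2^24 = 16777216, adding 1 no longer changes it.

```cpp
#include <cassert>
#include <cstddef>

// Adds 1 to an accumulator of type T, n times, one addition per iteration.
// Hypothetical helper used only to compare float and double behaviour.
template <typename T>
T accumulateOnes(std::size_t n) {
  T acc = 0;
  for (std::size_t i = 0; i < n; ++i) {
    acc += static_cast<T>(1);
  }
  return acc;
}

// For n = 16777316 (2^24 + 100):
//   accumulateOnes<double>(n) is exactly 16777316
//   accumulateOnes<float>(n) stays stuck at 2^24, 100 short of the true sum
```

A long FFT also combines a very large number of terms per output value, which is the kind of situation where the extra 29 significand bits of `double` help.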
diff --git a/imj-audio/c/cpp.algorithms b/imj-audio/c/cpp.algorithms
diff --git a/imj-audio/c/cpp.audio b/imj-audio/c/cpp.audio
diff --git a/imj-audio/c/extras.h b/imj-audio/c/extras.h
@@ -17,7 +17,7 @@
 namespace imajuscule {
   namespace audioelement {
 
-    using AudioFloat = float;
+    using AudioFloat = double;
 
     // in sync with the corresponding Haskel Enum instance
     enum class OscillatorType {
@@ -121,18 +121,18 @@ namespace imajuscule {
     };
 
     template<typename Env>
-    std::pair<std::vector<float>, int> envelopeGraphVec(typename Env::Param const & envParams) {
+    std::pair<std::vector<double>, int> envelopeGraphVec(typename Env::Param const & envParams) {
       Env e;
       e.setAHDSR(envParams);
       // emulate a key-press
       e.onKeyPressed(0);
       int splitAt = -1;
 
-      std::vector<float> v, v2;
+      std::vector<double> v, v2;
       v.reserve(10000);
       for(int i=0; e.getRelaxedState() != EnvelopeState::EnvelopeDone1; ++i) {
         e.step();
-        v.push_back(static_cast<float>(e.value()));
+        v.push_back(e.value());
         if(!e.afterAttackBeforeSustain()) {
           splitAt = v.size();
           if constexpr (Env::Release == EnvelopeRelease::WaitForKeyRelease) {

diff --git a/imj-audio/c/wrapper.cpp b/imj-audio/c/wrapper.cpp
@@ -36,8 +36,8 @@ namespace imajuscule::audioelement {
 
 
   template<typename Env>
-  float* envelopeGraph(typename Env::Param const & rawEnvParams, int*nElems, int*splitAt) {
-    std::vector<float> v;
+  double* envelopeGraph(typename Env::Param const & rawEnvParams, int*nElems, int*splitAt) {
+    std::vector<double> v;
     int split;
     std::tie(v, split) = envelopeGraphVec<Env>(rawEnvParams);
     if(nElems) {
@@ -49,10 +49,10 @@ namespace imajuscule::audioelement {
     auto n_bytes = v.size()*sizeof(decltype(v[0]));
     auto c_arr = imj_c_malloc(n_bytes); // will be freed by haskell finalizer.
     memcpy(c_arr, v.data(), n_bytes);
-    return static_cast<float*>(c_arr);
+    return static_cast<double*>(c_arr);
   }
 
-  float* analyzeEnvelopeGraph(EnvelopeRelease t, AHDSR p, int* nElems, int*splitAt) {
+  double* analyzeEnvelopeGraph(EnvelopeRelease t, AHDSR p, int* nElems, int*splitAt) {
     static constexpr auto A = getAtomicity<audio::Ctxt::policy>();
     switch(t) {
       case EnvelopeRelease::ReleaseAfterDecay:
@@ -277,7 +277,7 @@ extern "C" {
     return convert(midiEventAHDSR(osc, t, {hars, har_sz}, p, n, maybeMts));
   }
 
-  float* analyzeAHDSREnvelope_(imajuscule::audioelement::EnvelopeRelease t, int a, int ai, int h, int d, int di, float s, int r, int ri, int*nElems, int*splitAt) {
+  double* analyzeAHDSREnvelope_(imajuscule::audioelement::EnvelopeRelease t, int a, int ai, int h, int d, int di, float s, int r, int ri, int*nElems, int*splitAt) {
     using namespace imajuscule;
     using namespace imajuscule::audio;
     using namespace imajuscule::audioelement;
@@ -301,6 +301,37 @@ extern "C" {
     }
     return convert(stopPlaying(windVoice(),getAudioContext().getChannelHandler(),*getXfadeChannels(),pitch));
   }
+
+  bool getConvolutionReverbSignature_(const char * dirPath, const char * filePath, spaceResponse_t * r) {
+    using namespace imajuscule::audio;
+    return getConvolutionReverbSignature(dirPath, filePath, *r);
+  }
+
+  bool dontUseReverb_() {
+    using namespace imajuscule::audio;
+    if(unlikely(!getAudioContext().Initialized())) {
+      return false;
+    }
+    dontUseConvolutionReverbs(getAudioContext().getChannelHandler());
+    return true;
+  }
+  bool useReverb_(const char * dirPath, const char * filePath) {
+    using namespace imajuscule::audio;
+    if(unlikely(!getAudioContext().Initialized())) {
+      return false;
+    }
+    return useConvolutionReverb(getAudioContext().getChannelHandler(), dirPath, filePath);
+  }
+  bool setReverbWetRatio(double wet) {
+    using namespace imajuscule::audio;
+    if(unlikely(!getAudioContext().Initialized())) {
+      return false;
+    }
+    getAudioContext().getChannelHandler().enqueueOneShot([wet](auto & chans) {
+      chans.getPost().transitionConvolutionReverbWetRatio(wet);
+    });
+    return true;
+  }
 }
 
 #endif
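
`setReverbWetRatio` above does not write the ratio directly: it enqueues a one-shot lambda that calls `transitionConvolutionReverbWetRatio` on the post-processing stage. The body of that transition is not part of this diff. The sketch below shows one plausible way such a transition could work, a linear ramp advanced once per frame, assuming a simple linear dry/wet mix. `LinearRamp` and `mixDryWet` are hypothetical names, not the engine's API.

```cpp
#include <cassert>

// Hypothetical helper, not part of imj-audio: moves a value linearly
// from its current level to a target over a fixed number of frames.
class LinearRamp {
public:
  explicit LinearRamp(double initial) : current_(initial), target_(initial) {}

  // Starts a transition that completes after `frames` calls to next().
  void transitionTo(double target, int frames) {
    assert(frames > 0);
    target_ = target;
    remaining_ = frames;
    increment_ = (target - current_) / frames;
  }

  // Advances by one frame and returns the new value.
  double next() {
    if (remaining_ > 0) {
      --remaining_;
      // Snap to the target on the last step to avoid accumulated rounding.
      current_ = (remaining_ == 0) ? target_ : current_ + increment_;
    }
    return current_;
  }

private:
  double current_;
  double target_;
  double increment_ = 0.0;
  int remaining_ = 0;
};

// Assumed linear mix of one dry and one wet sample; `wet` is expected in [0, 1].
inline double mixDryWet(double drySample, double wetSample, double wet) {
  return (1.0 - wet) * drySample + wet * wetSample;
}
```

In such a design the audio thread would call `next()` once per frame and pass the result to `mixDryWet`, so a change of ratio requested from another thread never produces a jump in the output signal.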
diff --git a/imj-audio/example/Main.hs b/imj-audio/example/Main.hs
@@ -17,8 +17,13 @@ import           Imj.Music.Compositions.Vivaldi
 main :: IO ()
 main = void $ usingAudioOutput -- WithMinLatency 0
      $ do
-  --stressTest
-  --threadDelay 10000
+  -- comment the following line out to do benchmarks: it will generate a lot of note events
+  -- in a short period of time, and allows to produce the priority inversion effect
+  -- when a global lock is used:
+  {-
+  _ <- stressTest
+  threadDelay 10000
+  --}
   putStrLn "playing tech"
   uncurry (flip playVoicesAtTempo techInstrument) tech >>= print
   threadDelay 10000