# Auditory Demonstrations 

These demos are wrappers around selected demos from the AUDITORY DEMONSTRATIONS CD
made in 1987 at the Institute for Perception Research (IPO)
Eindhoven, The Netherlands with
support by the Acoustical Society of America.

**Reference**: [**Auditory Demonstrations**](https://research.tue.nl/en/publications/auditory-demonstrations), *A.J.M. Houtsma, Th.D. Rossing, W.M. Wagenaars*, Technische Universiteit Eindhoven, Institute for Perception Research, 1987. More detailed documentation may be found [here](https://pure.tue.nl/ws/portalfiles/portal/79033010/402660.pdf).

The **Jupyter notebook** embedding allows for additional functionality such as viewing waveforms, spectrograms and feature functions in a simple framework. You are encouraged to develop your own feature functions and test them against the presented demonstrations.

*Dirk Van Compernolle, June 2021*

In [None]:
%matplotlib inline

from urllib.request import urlopen
from IPython.display import display, Audio, HTML, clear_output
import ipywidgets as widgets
import math
import numpy as np


### REQUIRED PACKAGES
# 
# You need the pyspch package to be installed to run these demos
# installing pyspch will automatically install additional packages such as soundfile, librosa
# 
try:
  import google.colab
  IN_COLAB = True
except ImportError:
  IN_COLAB = False

try:
  import pyspch
except ImportError:
  ! pip install git+https://github.com/compi1234/pyspch.git
    
from pyspch import spectrogram as specg
from pyspch import audio
import pyspch.display as spch_disp

In [None]:
# Energy function in dB
# we measure intensity as  10. * log10(SUM(x^2)/N + EPS) + OFFSET
# - the +15 dB OFFSET is arbitrary
# - the EPS of 1.e-10 sets a floor of 10*log10(EPS) + OFFSET = -85 dB for silent frames
# 
EPS = 1.e-10
OFFSET = 15.0

def en_db(y,sample_rate=16000,f_shift=0.01):
    n_shift = int(f_shift*sample_rate)
    nfr = int(len(y)/n_shift)
    energy = np.zeros((nfr,))
    for ifr in range(0,nfr):
        energy[ifr] = np.sum(np.square(y[ifr*n_shift:(ifr*n_shift+n_shift)]))/float(n_shift)
    ftr = 10*np.log10(energy+EPS) + OFFSET
    return(ftr)


def ftr_en(fig,data):
    spch_disp.add_line_plot(fig,en_db(data),row=3,ylabel='Energy (dB)',yrange=[-50,0],dx=.01)

# a default boxed layout
def box_layout():
     return widgets.Layout(
        border='solid 1px black',
        margin='0px 10px 10px 0px',
        padding='5px 5px 5px 5px'
     )
demos = {}
demos['Demo4: The Decibel Scale (a)'] = {'filename':'TrackNo08.wav','split':12.,
                      'commentary':'Broadband Noise is reduced in 10 steps of 6 decibels. Demonstrations are repeated once.'}
demos['Demo4: The Decibel Scale (b)'] = {'filename':'TrackNo09.wav','split':7.,
                      'commentary':'Broadband noise is reduced in 15 steps of 3 decibels.'}
demos['Demo4: The Decibel Scale (c)'] = {'filename':'TrackNo10.wav','split':1.,
                      'commentary':'Broadband noise is reduced in 20 steps of 1 decibel.'}
demos['Demo4: The Decibel Scale (d)'] = {'filename':'TrackNo11.wav','split':7.,
                      'commentary':'Free-field speech of constant power at various distances from the microphone.'}

demos['Demo29: Effect of Tone Envelope on Timbre (a)'] = {'filename':'TrackNo54.wav','split':10.,
         'commentary': "You will hear a recording of a Bach chorale played on a piano"}
demos['Demo29: Effect of Tone Envelope on Timbre (b)'] = {'filename':'TrackNo55.wav','split':4.0,
         'commentary': "Now the same chorale will be played backwards"}
demos['Demo29: Effect of Tone Envelope on Timbre (c)'] = {'filename':'TrackNo56.wav','split':9.0,
         'commentary': "Finally the tape of the last recording is played backwards so that the chorale is heard forward again, but with an interesting difference"}




class AuditoryDemos(widgets.HBox):
    def __init__(self,feature=None):
        super().__init__()
            
        self.name = 'Demo4: The Decibel Scale (a)'
        self.demo = demos[self.name]
        self.root = 'http://homes.esat.kuleuven.be/~compi/demos/AuditoryDemonstrations/'
        self.fig = None
        self.dpi = 200
        self.feature = feature
        
        # create the widgets
        self.wg_name = widgets.Dropdown(options=demos.keys(),value=self.name,description="",layout=widgets.Layout(height='60px',width='100%'))
        self.wg_name.observe(self.update_demo,'value')
        self.output = widgets.Output()

        self.text_commentary = widgets.Output() 
        self.audio_commentary = widgets.Output()
        self.audio_demo = widgets.Output()
        self.left = widgets.VBox([self.output, self.audio_demo],layout=box_layout())
        self.left.layout.width = '60%'
        self.right = widgets.VBox([self.wg_name, self.text_commentary], layout=box_layout())
        self.right.layout.width = '40%'
        self.children = [self.left, self.right]
        self.update()
    
    def update_demo(self,change):
        self.name = change.new
        self.demo = demos[self.name]
        self.update()
        
    def update(self):
        with self.text_commentary:
            print("\n\nloading new demo , hang on   ...")
        
        self.wavdata, self.sample_rate = audio.load(self.root+self.demo['filename'],sample_rate=16000,mono=True)
            
        indx = int(self.demo['split']*self.sample_rate)
        instr_data = self.wavdata[:indx]
        demo_data = self.wavdata[indx:] 
        with self.text_commentary:
            clear_output(wait=True)
            print("\n",self.name,"\n")
            print(self.demo['commentary'])
        #with self.audio_commentary:
        #    clear_output(wait=True)
        #    print("Listen to the commentary")
        #    display(Audio(data=instr_data,rate=self.sample_rate,autoplay=False))
        with self.audio_demo:
            clear_output(wait=True)
            print("Listen to the demo")
            display(Audio(data=demo_data,rate=self.sample_rate,autoplay=False))            
            
        spg = specg.spectrogram(demo_data,sample_rate=self.sample_rate)
        if self.feature is None:
            self.fig = spch_disp.plot_spg(spg,wav=demo_data,sample_rate=self.sample_rate,dpi=self.dpi)
        else :
            self.fig = spch_disp.plot_spg(spg,wav=demo_data,sample_rate=self.sample_rate,dpi=self.dpi,figsize=(10,5),ftr_axis=True,ftr_height=2)
            self.feature(self.fig,demo_data)
        with self.output:
            clear_output(wait=True)
            display(self.fig)
           
        


## Sound Pressure, Power and Loudness

In a sound wave there are extremely small periodic variations in atmospheric pressure to which our ears respond in a rather complex manner. The minimum pressure
fluctuation to which the ear can respond is less than one billionth ($10^{-9}$) of atmospheric
pressure. (This is even more remarkable when we consider that storm fronts can cause
the atmospheric pressure to change by as much as 5 to 10% in a few minutes.) The
threshold of audibility, which varies from person to person, typically corresponds to
a sound pressure amplitude of about $2 \times 10^{-5}\ \mathrm{N/m^2}$ at a frequency of 1000 Hz. The
threshold of pain corresponds to a pressure amplitude approximately one million ($10^6$)
times greater, but still less than 1/1000 of atmospheric pressure.

Because of the wide range of pressure stimuli, it is convenient to measure sound
pressures on a logarithmic scale, called the decibel (dB) scale. Although a decibel scale
is actually a means for comparing two sounds, we can define a decibel scale of sound level
by comparing sounds with a reference sound having a pressure amplitude $p_0 = 2 \times 10^{-5}\ \mathrm{N/m^2}$, which is assigned a sound pressure level of 0 dB. Thus we define sound pressure level as:
$$
L_p = 20 \log_{10} \frac{p}{p_0}
$$
Expressed in other units,
$$
p_0 = 20\ \mu\mathrm{Pa} = 2 \times 10^{-4}\ \mathrm{dyn/cm^2} = 2 \times 10^{-4}\ \mu\mathrm{bar}
$$
For comparison, atmospheric pressure is $10^5\ \mathrm{N/m^2}$, or $10^6\ \mu\mathrm{bar}$. Sound pressure levels are
measured by a sound level meter, consisting of a microphone, an amplifier, and a meter
that reads in decibels.
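
The definition of sound pressure level can be checked numerically. The following is a minimal sketch, not part of the original demonstrations; the function name `spl_db` is an illustrative choice, and the three test pressures are taken from the figures quoted in the text.

```python
import math

P0 = 2e-5  # reference pressure amplitude in N/m^2 (Pa), as given in the text


def spl_db(p, p0=P0):
    """Sound pressure level in dB for a pressure amplitude p (same units as p0)."""
    return 20.0 * math.log10(p / p0)


# Threshold of audibility: the reference itself, so 0 dB
level_hearing = spl_db(P0)

# Threshold of pain: a pressure amplitude about one million times greater
level_pain = spl_db(1e6 * P0)

# For comparison: a pressure amplitude equal to atmospheric pressure (1e5 N/m^2)
level_atmosphere = spl_db(1e5)

print(f"hearing: {level_hearing:.1f} dB, pain: {level_pain:.1f} dB, "
      f"atmospheric: {level_atmosphere:.1f} dB")
```

A factor of $10^6$ in pressure amplitude corresponds to $20 \times 6 = 120$ dB, which is the range between the two thresholds.
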

In addition to the sound pressure level, there are other levels expressed in decibels,
so one must be careful when reading technical articles about sound or regulations on
environmental noise. One such level is the sound power level, which identifies the total
sound power emitted by a source in all directions. Sound power, like electrical power,
is measured in watts (one watt equals one joule of energy per second). In the case of
sound, the amount of power is very small, so the reference selected for comparison is
the picowatt ($10^{-12}$ watt). The sound power level (in decibels) is defined as
$$
L_w = 10 \log_{10} \frac{W_p}{W_0}
$$
where $W_p$ is the sound power emitted by the source, and the reference power is $W_0 = 10^{-12}\ \mathrm{W}$.

Another quantity described by a decibel level is sound intensity, which is the rate
of energy flow across a unit area. The reference for measuring sound intensity level is
$I_0 = 10^{-12}\ \mathrm{W/m^2}$, and the sound intensity level is defined as
$$
L_i = 10 \log_{10} \frac{I}{I_0}
$$

For a free progressive wave in air (e.g., a plane wave traveling down a tube or a spherical
wave traveling outward from a source), sound pressure level and sound intensity level
are nearly equal ($L_p \approx L_i$). This is not true in general, however, because sound waves
from many directions contribute to sound pressure at a point.

The relationship between sound pressure level and sound power level depends on
several factors, including the geometry of the source and the room. If the sound power
level of a source is increased by 10 dB, the sound pressure level also increases by 10
dB, provided everything else remains the same. If a source radiates sound equally in all
directions and there are no reflecting surfaces nearby (a free field), the sound pressure
level decreases by 6 dB each time the distance from the source doubles.
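
The two rules of thumb in this paragraph can be reproduced with a short calculation. This is a sketch under an explicit assumption that is not spelled out in the text: for an omnidirectional source in a free field, the power $W_p$ is taken to spread evenly over a sphere, so that $I = W_p / (4 \pi r^2)$. The function names and the example source power of 1 mW are illustrative.

```python
import math

W0 = 1e-12  # reference sound power in W, as given in the text
I0 = 1e-12  # reference sound intensity in W/m^2, as given in the text


def power_level_db(w):
    """Sound power level L_w in dB for a source power w in watts."""
    return 10.0 * math.log10(w / W0)


def free_field_intensity_level_db(w, r):
    """Intensity level L_i in dB at distance r (m) from an omnidirectional source.

    Assumes spherical spreading in a free field: I = w / (4 * pi * r^2).
    """
    intensity = w / (4.0 * math.pi * r ** 2)
    return 10.0 * math.log10(intensity / I0)


source_power = 1e-3  # hypothetical 1 mW source

# Doubling the distance (1 m -> 2 m) lowers the level by 10*log10(4), about 6 dB
drop_per_doubling = (free_field_intensity_level_db(source_power, 1.0)
                     - free_field_intensity_level_db(source_power, 2.0))

# Raising the power level by 10 dB (factor 10 in watts) raises L_i by 10 dB
gain_for_10db_power = (free_field_intensity_level_db(10.0 * source_power, 1.0)
                       - free_field_intensity_level_db(source_power, 1.0))

print(f"L_w = {power_level_db(source_power):.1f} dB")
print(f"drop per distance doubling = {drop_per_doubling:.2f} dB")
print(f"gain for +10 dB power level = {gain_for_10db_power:.2f} dB")
```

Since $L_p \approx L_i$ for a free progressive wave, the same figures apply to the sound pressure level in this idealised situation.
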

Loudness is a subjective quality. While loudness depends very much on the sound
pressure level, it also depends upon such things as the frequency, the spectrum, the
duration, and the amplitude envelope of the sound, plus the environmental conditions
under which it is heard and the auditory condition of the listener.

Loudness is frequently expressed in sones. One sone is equal to the loudness of a
1000-Hz tone at a 40-dB sound pressure level, and two sones describe a sound that is
judged twice as loud, etc. The dependence of subjective loudness on sound pressure is
discussed in connection with Demonstration 7.

## The Decibel Scale (Demo 4, Tracks 08-11) 

##### The Demo

In the first part of this demonstration, we hear broadband noise reduced in steps of
6, 3, and 1 dB in order to obtain a feeling for the decibel scale.
In the latter part, a voice is heard at distances of 25, 50, 100, and 200 cm from an
omni-directional microphone in an anechoic room. Under these conditions, the sound
pressure level decreases about 6 dB each time the distance is doubled. (In a normal
room this will not be the case, since considerable sound energy reaches the microphone
via reflections from walls, ceiling, floor, and objects within the room.)  
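
To get a feeling for what such a staircase looks like in terms of samples, the following sketch synthesizes a comparable signal. It is not the CD recording: white Gaussian noise stands in for the broadband noise, the burst length of 0.5 s is arbitrary, and "10 steps" is read here as ten reductions, giving eleven bursts. The helper `level_db` is a hypothetical name and returns an uncalibrated relative level.

```python
import numpy as np


def level_db(x):
    """Mean-square level of x in dB (relative, uncalibrated)."""
    return 10.0 * np.log10(np.mean(np.square(x)))


rng = np.random.default_rng(0)                  # fixed seed for repeatability
sample_rate = 16000
burst = rng.standard_normal(sample_rate // 2)   # 0.5 s of white noise

step_db = 6.0
n_steps = 10
# A level change of -k*step_db dB is an amplitude factor of 10^(-k*step_db/20)
gains = 10.0 ** (-step_db * np.arange(n_steps + 1) / 20.0)
staircase = np.concatenate([g * burst for g in gains])

levels = np.array([level_db(g * burst) for g in gains])
steps = np.diff(levels)                         # each entry is -6 dB

# Expected free-field levels for the speech part, relative to the 25 cm position
distances_cm = np.array([25.0, 50.0, 100.0, 200.0])
distance_levels = -20.0 * np.log10(distances_cm / distances_cm[0])

print(np.round(steps, 2))
print(np.round(distance_levels, 2))
```

Setting `step_db` to 3 or 1 gives the other two staircases. The array `staircase` can be passed to a feature function such as `en_db` to compare with the energy contour shown in the figure.
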

##### The figure contains the following displays:
- the time waveform
- spectrogram
- feature function: energy on a dB scale

##### Things to investigate
1. The full dynamic range of the auditory system is about 120 dB. However, we need much less for day-to-day use. From the above examples, what level difference can you have between a soft and a loud speech signal such that both are comfortable and understandable without excessive effort from the listener?

2. Explain the difference in acoustic behaviour between an anechoic room and the computer lab that you are in. Would you dare to estimate the SPL differences when someone is talking to you from 25, 50, 100 and 200 cm in this room?

3. What is the theoretical dynamic range that can be captured using 16-bit quantization, as on a standard CD recording?

In [1]:
AuditoryDemos(feature=ftr_en)


## The Effect of Spectrum on Timbre (Demo , Tracks )

##### Background
Timbre can be defined as "that attribute of auditory sensation in terms of which a listener can judge that two sounds similarly presented and having the same loudness and pitch are dissimilar...". According to this definition, timbre is the subjective correlate of all those sound properties that do not directly influence pitch or loudness. These properties include the sound's spectral power distribution, its temporal envelope, rate and depth of amplitude or frequency modulation, and the degree of non-harmonicity of its partials. The timbre of a sound therefore depends on many physical variables.

The concept of timbre plays a very important role in the orchestration of traditional music and in the composition of computer music. There is, however, no satisfying comprehensive theory of timbre perception. Neither is there a uniform nomenclature to designate or classify timbre. This poses considerable problems in communicating or teaching the skills of orchestration and computer score writing to student-composers.

##### The demonstration
In the following demonstrations, one can hear the influence of spectral make-up on the perceived timbre of sounds of musical instruments.

Demonstration 1: Carillon bell
 
The sound of a Hemony carillon bell, having a strike-note pitch around 500 Hz (B4), is synthesized in eight steps by adding successive partials with their original frequency, phase and temporal envelope. The partials added in successive steps are:

- Hum note (251 Hz)
- Prime or Fundamental (501 Hz)
- Minor Third and Fifth (603, 750 Hz)
- Octave or Nominal (1005 Hz)
- Duodecime or Twelfth (1506 Hz)
- Upper Octave (2083 Hz)
- Next two partials (2421, 2721 Hz)
- Remainder of partials

Demonstration 2: Guitar
 
The sound of a guitar tone with a fundamental frequency of 251 Hz is analyzed and re-synthesized in a similar manner. The partials added in successive steps are:
 
- Fundamental
- 2nd harmonic
- 3rd harmonic
- 4th harmonic
- 5th and 6th harmonics
- 7th and 8th harmonics
- 9th, 10th and 11th harmonics
- Remainder of partials
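
The step-by-step build-up described in both lists can be approximated with simple additive synthesis. The sketch below is only a rough stand-in for the recordings: the original demonstration used the measured frequency, phase and temporal envelope of each partial, whereas here every partial is assumed to have equal amplitude, zero phase and the same exponential decay. The final "remainder of partials" step is left out because its frequencies are not listed. The function name `additive_steps` and its parameters are illustrative.

```python
import numpy as np


def additive_steps(groups, sample_rate=16000, duration=1.0, decay=3.0):
    """Return one waveform per step; step k contains all partials of groups[0..k].

    Assumptions: equal amplitude, zero phase and a common exponential
    decay exp(-decay * t) for every partial.
    """
    t = np.arange(int(duration * sample_rate)) / sample_rate
    envelope = np.exp(-decay * t)
    tone = np.zeros_like(t)
    steps = []
    for group in groups:
        for freq in group:
            tone = tone + envelope * np.sin(2.0 * np.pi * freq * t)
        steps.append(tone / np.max(np.abs(tone)))  # normalise each step to peak 1
    return steps


# Carillon bell: partial frequencies in Hz, grouped as in the list above
bell_groups = [[251], [501], [603, 750], [1005], [1506], [2083], [2421, 2721]]

# Guitar: harmonics of the 251 Hz fundamental, grouped as in the list above
f0 = 251.0
harmonic_groups = ([1], [2], [3], [4], [5, 6], [7, 8], [9, 10, 11])
guitar_groups = [[f0 * h for h in hs] for hs in harmonic_groups]

bell_steps = additive_steps(bell_groups)
guitar_steps = additive_steps(guitar_groups)
```

In the notebook, each step can be auditioned with `Audio(data=bell_steps[k], rate=16000)`. The bell partials are not integer multiples of a common fundamental, while the guitar partials are, which is one reason the two series sound so different even with identical envelopes.
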

##### Things to investigate
- How will the guitar tone be perceived when the fundamental frequency is left out?
- Can you give a common situation where the fundamental frequency is absent?

## The Effect of Tone Envelope on Timbre (Demo 29, Tracks 54-56)

##### In this demo you will hear
- "A recording of a J.S. Bach chorale played on a piano"  ("Als der gütige Gott") 
- "Next the same chorale will be played backwards" 
- "Finally the tape of the last recording is played backwards so that the chorale is heard forward again, but with an interesting difference". 

##### Some Background
The purpose of this demonstration is to show that the temporal envelope of a sound, i.e. the time course of the sound's smoothed amplitude, has a significant influence on the perceived timbre of the sound.

A typical tone envelope may include an attack, a steady-state, and a decay portion (e.g. wind instrument tones), or may merely have an attack immediately followed by a decay portion (e.g. plucked or struck string tones).

By removing the attack segment of an instrument's sound, or by substituting the attack segment of another musical instrument, the perceived timbre of the sound may change so drastically that the instrument can no longer be recognised.
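
The notion of a temporal envelope can be made concrete with a small sketch. It uses a synthetic tone rather than the piano recording: a 440 Hz sinusoid with an assumed exponential decay stands in for a struck-string note, and the envelope is estimated as a moving RMS over 10 ms. The function name `smoothed_envelope` and all parameter values are illustrative choices.

```python
import numpy as np


def smoothed_envelope(x, sample_rate=16000, win=0.01):
    """Moving-RMS amplitude envelope of x, one value per sample."""
    n = max(1, int(win * sample_rate))
    kernel = np.ones(n) / n
    return np.sqrt(np.convolve(np.square(x), kernel, mode="same"))


sample_rate = 16000
t = np.arange(sample_rate) / sample_rate                    # 1 s time axis
note = np.exp(-5.0 * t) * np.sin(2.0 * np.pi * 440.0 * t)   # abrupt attack, slow decay
reversed_note = note[::-1]                                  # same samples, played backwards

env_fwd = smoothed_envelope(note, sample_rate)
env_rev = smoothed_envelope(reversed_note, sample_rate)

# Time at which each envelope reaches its maximum
peak_time_fwd = np.argmax(env_fwd) / sample_rate
peak_time_rev = np.argmax(env_rev) / sample_rate

print(f"forward peak at {peak_time_fwd:.3f} s, reversed peak at {peak_time_rev:.3f} s")
```

The forward tone peaks almost immediately and then dies away, while the reversed tone swells slowly and stops abruptly. A function like `smoothed_envelope` could also be wrapped as a feature function and passed to `AuditoryDemos` in place of `ftr_en`.
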

##### Things to investigate
- How does the instrument sound when the tape is played in reverse?
- What is the effect of the reversal on the power spectrum of each note?

In [None]:
AuditoryDemos(feature=ftr_en)