## Computational Perception Assignment 3

Brennan McFarland

### Imports

In [1]:
import numpy as np

### Exercise 1. The Duplex Theory (25 points)

**Figure IIDvsAz: Interaural Intensity Difference vs. Azimuth (at different frequencies)**

<img src="moore-fig6-2.jpg" align="left" style="width: 500px;"/>

**Figure MAAvsF: Minimum audible angle vs. Frequency (plotted by azimuth)**

<img src="moore-fig6-5.jpg" align="left" style="width: 400px;"/>

1.1 (10 pts) Describe three features of figure IIDvsAz that would be predicted by duplex theory.

In [2]:
# duplex theory described in L9-Sound Localization, TODO: I think he means figure IIDvsAz, but double check

Duplex theory postulates that human hearing uses interaural time differences (ITDs) to localize lower-frequency sounds and interaural intensity differences (IIDs) to localize higher-frequency sounds.

1. One feature of the figure predicted by duplex theory is that as frequency increases, IID as a function of azimuth changes from an almost entirely flat line at around 200 Hz to more of an arch or parabola at around 6000 Hz.  That is, at higher frequencies the IID varies more with azimuth.  Duplex theory predicts this because greater variation in the underlying signal we are trying to detect means a higher SNR (assuming a fixed noise level) and thus a higher capacity (in the information-theoretic sense) for what can be recovered from the noisy signal.  At higher frequencies, then, the IID alone gives a more accurate estimate of azimuth than it does at lower frequencies, which fits duplex theory's assertion that IIDs are used for sound localization at higher frequencies, where they are more effective.

2. Another feature duplex theory would predict, given its assertion that IIDs are useful for sound localization at all, is that the IID curve for a given frequency is approximately symmetric about 90 degrees.  This might seem like evidence against the utility of IIDs: even when a curve varies enough to be distinguished from noise, as explained in 1., the localization ambiguity is only reduced to two possible locations (or regions, given the noise), namely the two points with the same IID, which lie at approximately equal angles on either side of 90 degrees.  However, this makes sense for human biology because sound localization is aided by vision: if the source of a sound is within our visual field, we can infer from what we see where the sound is coming from.  Our brains can therefore decide between the two candidate points.  If we can see the source, its azimuth is less than 90 degrees (in front of us); if we cannot, its azimuth is more than 90 degrees (behind us).  Although the human visual span is not quite 180 degrees, the missing range of angles is small, and the two candidate points on the curve are close enough together there that we still get a good estimate of the direction the sound is coming from.  Thus duplex theory predicts that fewer resources need to be dedicated to localizing sound with IIDs because other cues can be used in conjunction, and IIDs remain an effective tool for sound localization.

3. Similarly, duplex theory could predict how the slope of each IID curve changes with azimuth.  As the figure shows, the slopes are greatest near 0 and 180 degrees, which means the IID changes most per degree in those regions and thus, for the reasons explained in 1., gives more accurate localization there than near 90 degrees, where the slope is usually close to 0.  This implies higher-resolution localization for sounds from nearly directly in front or behind than for sounds from the sides.  That would likely be more valuable from an evolutionary standpoint: it would let us better localize sounds in whatever area we are looking at, searching, or observing, and better pinpoint the direction of potential threats chasing or sneaking up on us from behind.  Thus duplex theory would predict that better sound localization in front and behind is favored, which is apparent from the IID vs. azimuth measurements in the figure.

1.2 (10 pts) Describe a feature of the curves in either figure IIDvsAz or MAAvsF that is not explained by duplex theory.  What might be the source of this?

In the IIDvsAz figure, the IID curves are generally noisier between 90 and 180 degrees than between 0 and 90 degrees.  This would seem to hamper sound localization using IIDs even at higher frequencies, where duplex theory says IIDs are mainly used, and from behind, where, as stated in part 3 of the previous problem, localization would be especially useful from an evolutionary standpoint.  The source of this noise is likely the shape of the head and ears.  The IIDs of incoming sounds depend on how the head and the pinnae muffle and obstruct the sound, and the pinnae in particular, because of their shape, would have a more pronounced disruptive effect on sounds coming from behind.

1.3 (5 pts) Describe the functional significance of the pinnae.

Several theories have attempted to explain the functional significance of the pinnae.  These include their use as a mechanism for gathering sound waves, as vestigial structures from when we could move our ears, and as a way to shape incoming sounds so that sounds coming from the front can be better distinguished from sounds coming from the back.  More recent theories, like Batteau's, postulate that the pinnae produce echoes that provide lateral and elevation cues to aid in localization.  The pinna, together with the ear canal, thus forms a system of acoustic resonators that enables more accurate sound localization in 3D space.

### Exercise 2. Lateralization with ITD (30 points)

<img src="head-soundwaves.pdf" align="left" style="width: 250px;"/>

In this problem you will create a set of sound waveforms to listen to and experiment with.  You will need headphones or earphones.

We define locations in front of the head to have 0$^\circ$
azimuth and locations in front of the left ear to have -90$^\circ$
azimuth.  Assume a simplified model of the human head in which the
head is spherical.  Also assume that sound sources are infinitely far away, so that sound reaches the ears in straight lines (see
figure).  With this model the ITD ($\Delta t$) is:

$$ \Delta t = \frac{r}{c} (\theta+\sin(\theta)) $$

where $\theta$ is the azimuth in radians and $r=9~\textrm{cm}$ is the radius of the head. Assume the speed of sound is $c= 345~\textrm{m/s}$ (at $23^\circ$C).
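As a quick numerical check of the formula above, here is a minimal sketch that evaluates $\Delta t$ for a few azimuths. The function name `itd_seconds` and the constant names are illustrative choices, not part of the assignment; the only substantive assumption is that the angle is converted from degrees to radians before it is used.

```python
import numpy as np

HEAD_RADIUS_M = 0.09     # r = 9 cm, from the problem statement
SPEED_OF_SOUND = 345.0   # c in m/s, from the problem statement

def itd_seconds(azimuth_deg, r=HEAD_RADIUS_M, c=SPEED_OF_SOUND):
    """ITD of the spherical-head model; azimuth is given in degrees."""
    theta = np.deg2rad(azimuth_deg)  # the formula needs radians
    return (r / c) * (theta + np.sin(theta))

# Print the ITD in microseconds for a few example azimuths.
for az in (0, 30, 60, 90):
    print(az, "deg ->", itd_seconds(az) * 1e6, "microseconds")
```

With these values the model gives an ITD of roughly 0.67 ms at 90$^\circ$, and negative ITDs for sources on the left.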

2.1 (5 points) Write a function to create a sine wave of a specified frequency (Hz), duration (seconds), and sampling rate (Hz).  Illustrate it with plots.
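One possible starting point for 2.1 is sketched below. The name `sine_wave`, the default sampling rate of 44100 Hz, and the choice to return the time axis along with the samples are all assumptions made for illustration; plotting is left out so the block depends only on NumPy.

```python
import numpy as np

def sine_wave(freq_hz, duration_s, fs_hz=44100):
    """Return (t, x): sample times in seconds and a unit-amplitude sine."""
    n_samples = int(round(duration_s * fs_hz))
    t = np.arange(n_samples) / fs_hz
    x = np.sin(2 * np.pi * freq_hz * t)
    return t, x

# Example: 10 ms of a 200 Hz tone (two full periods).
t, x = sine_wave(200, 0.01)
print(len(x), "samples, peak amplitude", np.max(np.abs(x)))
```

The returned `t` can be passed straight to a plotting call such as matplotlib's `plot(t, x)` to illustrate the waveform.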

2.2 (10 points) Write a function that takes a sound waveform and an angle (in degrees) as input and returns (or creates) a stereo sound composed of both the left and right signals that reach the ears of the model listener.  Onset or offset transients will adversely affect the lateralization perception, so your function should remove them (using for example a Hanning window).  Demonstrate your function with a sine wave and with a real sound.
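A rough sketch of one way to approach 2.2 follows. It assumes that a positive azimuth means the source is on the right, so the left ear is the far ear and receives the delayed copy; it applies the delay with linear interpolation (`np.interp`), and it removes transients with raised-cosine ramps taken from the two halves of a Hanning window. The names `lateralize` and `ramp_s` and the 20 ms ramp length are illustrative assumptions, and no intensity difference is modelled.

```python
import numpy as np

def itd_seconds(azimuth_deg, r=0.09, c=345.0):
    """ITD of the spherical-head model; azimuth in degrees."""
    theta = np.deg2rad(azimuth_deg)
    return (r / c) * (theta + np.sin(theta))

def lateralize(x, azimuth_deg, fs_hz, ramp_s=0.02):
    """Return an (n, 2) array [left, right] with the model ITD applied."""
    x = np.asarray(x, dtype=float)
    n = len(x)

    # Onset/offset ramps: rising and falling halves of a Hanning window.
    n_ramp = min(int(ramp_s * fs_hz), n // 2)
    envelope = np.ones(n)
    if n_ramp > 0:
        window = np.hanning(2 * n_ramp)
        envelope[:n_ramp] = window[:n_ramp]
        envelope[-n_ramp:] = window[n_ramp:]
    x = x * envelope

    # Delay the far-ear signal by |ITD| using linear interpolation.
    t = np.arange(n) / fs_hz
    dt = itd_seconds(azimuth_deg)
    delayed = np.interp(t - abs(dt), t, x, left=0.0, right=0.0)

    # Assumed convention: positive azimuth = source on the right.
    if dt >= 0:
        left, right = delayed, x
    else:
        left, right = x, delayed
    return np.column_stack([left, right])

# Example: a 200 Hz tone, 0.1 s long, placed at +45 degrees.
fs = 44100
t = np.arange(int(0.1 * fs)) / fs
stereo = lateralize(np.sin(2 * np.pi * 200 * t), 45, fs)
print(stereo.shape)
```

Linear interpolation slightly low-pass filters the delayed channel; a design with less high-frequency loss would use a proper fractional-delay filter, which is beyond this sketch.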

2.3 (5 points)  Using the functions you created above, create a set of sounds at different "pure-tone" frequencies (include at least 200Hz and 2000Hz) that come from different angles.  Show the output by plotting the sounds with the left and right channels in different colors on the same axes.  

2.4 (5 points) Listen to those sounds with headphones.  Do you
perceive the sounds as coming from the same location or different
locations?  Describe your observations and explain.

2.5 (5 points) Now try the 'note.wav' and 'sound.wav' sounds included with the assignment.  How does the perception of spatial position of these sounds compare to the sine waves?

### Exercise 3. Estimating ITD (25 points)

In the previous problem, you synthesized binaural sounds with given ITD. Now you will go the other direction.

3.1 (5 points) As a preliminary exercise, use your functions above to illustrate phase ambiguity.
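One way to see the phase ambiguity numerically: for a steady pure tone of frequency $f$, shifting the ITD by any whole number of periods $1/f$ leaves the interaural phase unchanged, so every such shifted value that is still physically plausible is an equally valid candidate. The sketch below lists those candidates. The function name `ambiguous_itds` and the plausibility bound of 700 microseconds (slightly above the model's maximum ITD of about 0.67 ms) are assumptions for illustration.

```python
import numpy as np

def ambiguous_itds(freq_hz, true_itd_s, max_itd_s=700e-6):
    """ITDs that a pure tone cannot tell apart from true_itd_s."""
    period = 1.0 / freq_hz
    k = np.arange(-10, 11)              # whole-period shifts to consider
    candidates = true_itd_s + k * period
    return candidates[np.abs(candidates) <= max_itd_s]

# At 200 Hz the period (5 ms) exceeds any plausible ITD: one candidate.
print(ambiguous_itds(200, 250e-6))
# At 2000 Hz the period (0.5 ms) is short enough to allow a second one.
print(ambiguous_itds(2000, 250e-6))
```

In the 2000 Hz case the second candidate has the opposite sign, which corresponds to a source on the other side of the head.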

3.2 (5 points) Show how increasing the bandwidth of the sound eliminates the ambiguity.

3.3 (10 points) Write a function to estimate the ITD, $\Delta t$, from a binaural sound as input.  Illustrate with examples.
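A common way to estimate a delay between two channels is to find the peak of their cross-correlation, and the sketch below takes that approach; the assignment does not prescribe it. It assumes the input is an `(n, 2)` array ordered `[left, right]`, returns a positive ITD when the left channel lags, and only searches lags up to an assumed bound of 1 ms. The names `estimate_itd` and `max_itd_s` are illustrative.

```python
import numpy as np

def estimate_itd(stereo, fs_hz, max_itd_s=1e-3):
    """Estimate the ITD in seconds from an (n, 2) [left, right] array.

    Positive result: the left channel lags the right channel.
    """
    left, right = stereo[:, 0], stereo[:, 1]
    xcorr = np.correlate(left, right, mode="full")
    # Lag (in samples) associated with each entry of the 'full' output.
    lags = np.arange(-(len(right) - 1), len(left))
    plausible = np.abs(lags) <= int(max_itd_s * fs_hz)
    best_lag = lags[plausible][np.argmax(xcorr[plausible])]
    return best_lag / fs_hz

# Example: broadband noise with the left channel delayed by 20 samples.
fs = 44100
rng = np.random.default_rng(0)
noise = rng.standard_normal(2000)
delay = 20
left = np.concatenate([np.zeros(delay), noise[:-delay]])
stereo = np.column_stack([left, noise])
print(estimate_itd(stereo, fs) * fs, "samples")
```

The estimate is quantized to whole samples (about 23 microseconds at 44.1 kHz); interpolating around the correlation peak would be one way to refine it.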

3.4 (5 points)  Use the idealized model above to estimate the lateralization $\theta$ from the ITD.  Illustrate with examples.  Does this correspond to your own perception?  Try different cases and report your results.

**Considerations**

Since the function for $\Delta t$ in terms of $\theta $ cannot be rearranged algebraically into an analytic expression for $\theta $ in terms of $\Delta t$, you will need to invert it numerically.  You can do this, for instance, by building a lookup table that maps values of $\Delta t$ to $\theta$, or by iteratively
searching for $\theta$ by repeatedly computing $\Delta t(\theta )$ for some estimate of $\theta$ and using the result to improve your estimate, or by solving for $\theta$ with an optimization package.
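The iterative-search option can be sketched with bisection, which works here because $\Delta t(\theta)$ is non-decreasing on $[-90^\circ, 90^\circ]$ (its derivative is $\frac{r}{c}(1+\cos\theta) \ge 0$). The function name `azimuth_from_itd`, the tolerance, and the decision to clip out-of-range ITDs to the nearest valid value are assumptions for illustration.

```python
import numpy as np

def itd_seconds(theta_rad, r=0.09, c=345.0):
    """ITD of the spherical-head model; theta in radians."""
    return (r / c) * (theta_rad + np.sin(theta_rad))

def azimuth_from_itd(dt_s, tol=1e-9):
    """Invert the ITD model by bisection; returns azimuth in degrees."""
    lo, hi = -np.pi / 2, np.pi / 2
    # Clip ITDs outside the range the model can produce.
    dt_s = np.clip(dt_s, itd_seconds(lo), itd_seconds(hi))
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if itd_seconds(mid) < dt_s:
            lo = mid   # the solution lies to the right of mid
        else:
            hi = mid   # the solution lies at or to the left of mid
    return np.rad2deg(0.5 * (lo + hi))

# Round trip: azimuth -> ITD -> azimuth.
for az in (-60, 0, 30, 90):
    dt = itd_seconds(np.deg2rad(az))
    print(az, "->", azimuth_from_itd(dt))
```

A lookup table built with `np.interp` over a dense grid of $\theta$ values would be a comparable alternative when many ITDs need to be inverted at once.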

### Exercise 4. Exploration (20 points)

Select a concept or topic you want to understand better that is related to the problems in this assignment or is in the readings.  Explore it, and write up and illustrate what you tried and learned.  The general idea is for you to teach yourself, and it should read like a (relatively brief) tutorial.

Here is the grading rubric:
- Clarity of explanation. Could another student read and do this? (5 pts)
- Novelty or distinctness. Does it complement or go beyond what was covered above? (5 pts)
- Does the exercise teach something about the concept(s)? (5 pts)
- How deeply does it explore the concept(s)? (5 pts)