# Audio Channel measurements

**Insight**
* What does a real audio channel typically look like?
* Is it well estimated using the deconvolution algorithm(s)?
* Which transmit sequence should be used?
* If we compare two channel estimates, can we estimate a difference in propagation time? And how does that translate into physical distance?

These insights and results are vital for this project.


## Introduction to Audio Beacon

The audio beacon, once switched on, continuously transmits a sequence of pulses. It is controlled by a programmable microcontroller. It is possible to change the number of pulses, duration of each pulse, the sequence itself, and the period after which the sequence is repeated. The pulse sequence is a modulated binary code sequence with "on-off keying" (OOK). If a bit in the sequence is 0, nothing is transmitted; if the bit is 1, a modulation carrier frequency is transmitted during a certain period.

Besides the actual bit sequence (code word), the parameters that determine the signal are:
* Length of the code sequence (number of bits; parameter `NCodebits`), at most 64 bits;
* Modulation carrier frequency (parameter `Timer0`), at most 30 kHz, although this is probably beyond the specs of the loudspeaker and microphones;
* Duration of a single bit (parameter `Timer1`), this defines the rate at which the modulation carrier signal is switched on or off by the bits in the code word;
* Repetition rate of the bit sequence (parameter `Timer3`). After the code sequence has been played over the loudspeaker, it will be silent for a certain period, and then the sequence is transmitted again, at a rate determined by Timer 3.
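
As an illustration of how these parameters combine, the sketch below builds one on-off keyed code sequence. The function name `ook_waveform`, the sine-wave carrier, the MSB-first bit order, and the 44.1 kHz sample rate are assumptions made for this example; the beacon firmware and any reference-signal function provided in the course may behave differently.

```python
import numpy as np

def ook_waveform(code, n_bits, f_carrier, f_bit, fs):
    """Hypothetical sketch: on-off keyed waveform for one code sequence.

    Assumptions: bits are sent MSB first and the carrier is a sine wave.
    """
    # Unpack the integer code word into 0/1 values, MSB first.
    bits = np.array([(code >> (n_bits - 1 - i)) & 1 for i in range(n_bits)])
    # Each bit lasts 1 / f_bit seconds, so the sequence lasts n_bits / f_bit.
    n_samples = int(round(n_bits * fs / f_bit))
    t = np.arange(n_samples) / fs
    # Map every sample to the bit that is active at that time.
    bit_index = np.minimum((t * f_bit).astype(int), n_bits - 1)
    # The carrier is present while the bit is 1 and absent while it is 0.
    carrier = np.sin(2 * np.pi * f_carrier * t)
    return t, bits[bit_index] * carrier

t, x = ook_waveform(0x92340F0F, n_bits=32, f_carrier=20e3, f_bit=5e3, fs=44100)
print(len(x), "samples,", 1e3 * len(x) / 44100, "ms")
```

The repetition rate is not modelled here: it would only add a stretch of zeros after the sequence before the next copy starts.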

To see what these different parameters mean, consider the image below:

```{figure} Beacon.png
---
height/width: 150px
name: beacon-figure
---
Beacon description
```

In the audio beacon in EPO4, these parameters can be modified by sending specific commands to the car, and this programs the microcontroller that generates the audio signal.

The maximal repetition rate is 10 Hz (corresponding to a period of 100 ms). If 64 bits are used at the lowest rate of Timer 1 (1 kHz), then the duration of the sequence will be 64 ms. You will have to choose settings such that the channel impulse response dies out during the remaining period of silence, before the next pulse sequence starts.

Possible values for the Timer parameters are listed in the table below; instead of the actual values, the Timer Index values are used. In the microcontroller, instead of setting the frequencies of the various timers directly, one would instead select the corresponding index (0 through 9).

## Timer Frequencies

| Timer Index           | 0     | 1      | 2      | 3      | 4      | 5      | 6      | 7      | 8      | 9      |
|-----------------------|-------|--------|--------|--------|--------|--------|--------|--------|--------|--------|
| Carrier Freq (Timer0) | 5 kHz | 10 kHz | 15 kHz | 20 kHz | 25 kHz | 30 kHz |        |        |        |        |
| Code Freq (Timer1)    | 1.0 kHz | 1.5 kHz | 2.0 kHz | 2.5 kHz | 3.0 kHz | 3.5 kHz | 4.0 kHz | 4.5 kHz | 5.0 kHz |        |
| Repeat Freq (Timer3)  | 1 Hz  | 2 Hz   | 3 Hz   | 4 Hz   | 5 Hz   | 6 Hz   | 7 Hz   | 8 Hz   | 9 Hz   | 10 Hz  |
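
The table can be turned into a small lookup to check how long a sequence lasts and how much silence remains in each repetition period. This is a minimal sketch; the list names and the helper `sequence_timing` are made up for the example, and the frequencies are copied from the table above.

```python
# Timer index -> frequency in Hz, copied from the table above.
CARRIER_HZ = [5e3, 10e3, 15e3, 20e3, 25e3, 30e3]                            # Timer0
CODE_HZ = [1.0e3, 1.5e3, 2.0e3, 2.5e3, 3.0e3, 3.5e3, 4.0e3, 4.5e3, 5.0e3]   # Timer1
REPEAT_HZ = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]                                 # Timer3

def sequence_timing(n_bits, timer1_index, timer3_index):
    """Return (sequence duration, remaining silent period) in milliseconds."""
    duration_ms = 1e3 * n_bits / CODE_HZ[timer1_index]
    period_ms = 1e3 / REPEAT_HZ[timer3_index]
    return duration_ms, period_ms - duration_ms

# 64 bits at the lowest code rate, repeated at the highest repetition rate.
duration_ms, silence_ms = sequence_timing(n_bits=64, timer1_index=0, timer3_index=9)
print(duration_ms, silence_ms)  # 64.0 36.0
```

A negative silent period would mean that the chosen sequence does not fit in one repetition period.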

## Default Setting of the Audio Beacon

The default setting of the audio beacon is a code sequence bit-stream of 32 bits with:

| Setting             | Value  | Parameter           |
|---------------------|--------|---------------------|
| Code Length         | 32 bits | `NCODEBITS = 32`    |
| Carrier Frequency   | 20 kHz | `TIMER0_INDEX = 3`  |
| Code Frequency      | 5 kHz  | `TIMER1_INDEX = 8`  |
| Repeat Frequency    | 2 Hz   | `TIMER3_INDEX = 2`  |
| Code Word (hex)     | 92340f0f |                     |

These are example values and you cannot assume that these are "optimal"!


## Design Considerations

* It is important to choose settings that will give an optimal channel estimate in the presence of noise and interference. This will generally require long sequences. However, we have to wait until one or two pulse sequences have been received before we can do a channel estimation, and this will form the basis for the location estimate. Faster updates will result in better tracking, and for this it is important to have short sequences! [Activity: determine the maximum duration of a single pulse sequence. Which code parameters determine this?]

* Furthermore, we need to plan for a "guard interval" of silence between two sequences, long enough for the channel response to return to zero. For a large room and a maximal distance of 5 to 6 meters between the beacon and the microphone, what is that duration (in ms)? This determines the maximal repetition rate that you can hope to achieve.

* Another aspect to consider is the dynamic range. The microphone gain will have to be set such that it will not clip even if the transmitter is very close to it, because we want to avoid nonlinear effects. At the other extreme, over a distance of 5 to 6 meters, the audio signal is already significantly attenuated and may drown in the noise. However, we will have to be able to estimate the channel even over such distances, also in the presence of a nasty interferer close to the microphone. For this, we need long sequences, or we need to average over several repetitions of the sequence, so that the noise is averaged out and only the channel impulse response remains.

* We found that the best channel estimation results are obtained if the probing signal is wideband, i.e. covers a large bandwidth. Which timer parameter determines the bandwidth of the signal?  

* We also looked at the effect of carrier frequencies. Which parameter determines this? Is there a reason to use very high carrier frequencies?

* What sample rate should be used? Considerations are the Nyquist condition, but also the computational complexity (at a higher rate you will need to process more samples $N$ and the channel length $L$ is also higher) and the time resolution at which you can detect peaks in the impulse response.

* Finally, we have to think of the practical situation where the microphone signal contains beacon signals of more than one user. Consider what happens if you do the deconvolution using a reference signal that does not match the transmitted signal. E.g., in the Matched Filter, we correlate the received signal with our own code sequence, and hopefully the correlation of someone else's code with our own code is small. Thus, the filter will suppress the other signals.
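
The last point can be explored at bit level with a short sketch. It compares the correlation peak of the default code word with itself against the peak obtained when a different code is received. The second code word is arbitrary and chosen only for illustration, and the sketch ignores the carrier, the channel, and noise.

```python
import numpy as np

def code_bits(code, n_bits):
    """Unpack an integer code word into 0/1 values (MSB first, an assumption)."""
    return np.array([(code >> (n_bits - 1 - i)) & 1 for i in range(n_bits)],
                    dtype=float)

own = code_bits(0x92340F0F, 32)     # default code word from the table above
other = code_bits(0xAAAA4321, 32)   # arbitrary second code, illustration only

# Matched filter: correlate the received bits with our own code sequence.
auto_peak = np.max(np.correlate(own, own, mode="full"))
cross_peak = np.max(np.correlate(other, own, mode="full"))
print(f"own code peak: {auto_peak:.0f}, other code peak: {cross_peak:.0f}")
```

How well two codes separate depends on the code words themselves, so it is worth repeating this check for the codes you actually plan to use.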

```python
import numpy as np
import matplotlib.pyplot as plt

# Exercise template: complete the lines marked "your code here" before running.
# `refsignal` must already be defined or imported in your environment.
Fs_TX = 44100
Nbits = 64
Timer0 = 3
Timer1 = 8
Timer3 = 2
code = 0x92340f0faaaa4321

x, _ = refsignal(Nbits, Timer0, Timer1, Timer3, code, Fs_TX)
X = # your code here

period = # your code here
t = # your code here
f = # your code here

fig, ax = plt.subplots(2, 1, figsize=(10, 7))

ax[0].plot(t, x)
ax[0].set_title("Audio Beacon in the Time Domain")
ax[0].set_xlabel("Time [s]")
ax[0].set_ylabel("Amplitude")
ax[0].set_xlim([0, 0.015])
ax[0].set_ylim([0, 2])

# Plot the magnitude of the spectrum (np.abs), not only its real part.
ax[1].plot(f, np.abs(X))
ax[1].set_title("Audio Beacon in the Frequency Domain")
ax[1].set_xlabel("Frequency [kHz]")
ax[1].set_ylabel("Magnitude")
ax[1].set_xlim([0, Fs_TX / 1000])
ax[1].set_ylim([0, max(np.abs(X))])

fig.tight_layout()
```


## TDOA Estimation

In the EPO4 project, we will try to locate a car using an audio beacon. We will use time-difference of arrival (TDOA) measurements made at microphones positioned at known locations. The audio beacon transmits signals which are received by up to 5 microphones. Depending on the distance to each microphone, the signal arrives a little bit earlier or later, and we can convert that into physical distances. For each pair of microphones, we will compute this TDOA, or the physical difference in propagation distance. If we have a large enough number of microphones (4 should work), then we can calculate the $(x,y)$ location of the transmitter using a Least Squares algorithm. You will do this in the EPO4 project.
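
To get a feeling for the numbers, the sketch below computes the range difference and the corresponding TDOA for every microphone pair, given a known transmitter position. The microphone layout, the source position, the helper name `range_differences`, and the speed of sound of 343 m/s are all assumptions made for the example.

```python
import itertools
import numpy as np

SPEED_OF_SOUND = 343.0  # m/s, assumed value for air at about 20 degrees C

def range_differences(mics, source):
    """Range difference d_i - d_j in metres for every microphone pair (i, j)."""
    dist = np.linalg.norm(mics - source, axis=1)
    return {(i, j): dist[i] - dist[j]
            for i, j in itertools.combinations(range(len(mics)), 2)}

# Hypothetical layout: four microphones at the corners of a 4 m x 4 m field.
mics = np.array([[0.0, 0.0], [4.0, 0.0], [4.0, 4.0], [0.0, 4.0]])
source = np.array([1.0, 1.0])

for pair, diff in range_differences(mics, source).items():
    tdoa_ms = 1e3 * diff / SPEED_OF_SOUND
    print(pair, f"{diff:+.3f} m", f"{tdoa_ms:+.2f} ms")
```

This is the forward model only: localization works the other way around, from measured TDOAs to the position.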

Before we can do localization, we have to work on this question: Given the impulse responses measured by two microphones, how is the Time Difference of Arrival (TDOA) estimated? That is the topic of this assignment.

### Algorithm Outline

The audio beacon transmits a continuous stream of pulses, using a certain repetition period $T_R$ (specified via the `Timer3` parameter, e.g., 100 ms). If we are not synchronized to the beacon, there is no guarantee that an entire pulse sequence is captured in a period $T_R$: you might have the tail of one sequence, and the head of the next. It is probably easier to capture samples for at least $2T_R$ seconds, i.e., 2 intervals.

We aim to apply the deconvolution algorithm to a single full interval in the received data. So the next step is to synchronize to the start of a received pulse in our data. For this Labday, we will do this step by hand, although in EPO4 it will have to be automated:
* Locate the start of a pulse
* Isolate the entire pulse; try to remove as much of the "silent period" as possible.
* Since we have two microphones, we have to crop two received signals. Make sure you crop them on precisely the same intervals.

To guide the cropping, consider the duration of the beacon pulse, how much the audio channel can lengthen it through convolution, and the length of the silent period. Use your beacon parameters, assume a maximal propagation distance of 5 m, and convert all times to numbers of samples.

For an automated method, we would look for a silent period of at least some duration $T_I$, followed by a sample that is above some threshold. As a threshold, we could take 50% of the maximal amplitude in the data. To be sure, we would include some of the silent interval in the cropped signal.
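
One possible way to automate this is sketched below. The function name `find_pulse_start` and its arguments are hypothetical. A sample counts as "silent" when it lies below the threshold, so $T_I$ has to be longer than the longest run of zero bits inside the code, otherwise a gap within the pulse is mistaken for the silent period.

```python
import numpy as np

def find_pulse_start(y, fs, t_silence, rel_threshold=0.5, margin=0.0):
    """First sample above threshold that follows at least t_silence of quiet.

    `margin` (seconds) moves the start earlier so that part of the silent
    interval is kept in the crop. Returns None if no start is found.
    """
    threshold = rel_threshold * np.max(np.abs(y))
    loud = np.flatnonzero(np.abs(y) > threshold)
    n_silence = int(round(t_silence * fs))
    # Number of quiet samples in front of every loud sample.
    gaps = np.diff(loud, prepend=-1) - 1
    hits = loud[gaps >= n_silence]
    if hits.size == 0:
        return None
    return max(int(hits[0]) - int(round(margin * fs)), 0)

# Synthetic recording: tail of one pulse, silence, then a complete pulse.
fs = 1000
y = np.zeros(300)
y[20:40] = 1.0     # tail of the previous sequence
y[150:200] = 1.0   # next sequence
start = find_pulse_start(y, fs, t_silence=0.1)
print(start)  # 150
```

With two microphones, the start index found on one channel would be applied to both signals so that they are cropped on the same interval.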

Next, apply your deconvolution algorithm to the cropped received data. We can use two methods:
* Deconvolve using a reference signal. A suitable reference signal is obtained from a recording made at a distance of 1 cm from the beacon.
* Deconvolve one microphone signal using the signal from the second microphone (preferably, the strongest one).

Using the first method, you obtain two channel estimates. Locate the peaks in both estimates: their time difference is the TDOA.
For the second method, you have to locate only a single peak. Its time index is the TDOA. But you have to be careful: the TDOA could be negative, and if you used a frequency-domain deconvolution algorithm, the channel estimate is periodic and the peak for a negative TDOA occurs at the far end of the estimate.
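
The wrap-around of negative delays can be handled by mapping peak positions in the second half of the estimate to negative lags. The sketch below demonstrates this on synthetic data. It uses a circular cross-correlation computed with the FFT as a stand-in for your deconvolution algorithm, because it is periodic in the same way; the helper `signed_peak_lag` is a hypothetical name.

```python
import numpy as np

def signed_peak_lag(h):
    """Peak position of a periodic estimate, mapped to the range [-N/2, N/2)."""
    n = len(h)
    k = int(np.argmax(np.abs(h)))
    return k - n if k >= n // 2 else k   # second half holds the negative lags

# Synthetic test: y2 is y1 delayed by 5 samples (circularly, for simplicity).
rng = np.random.default_rng(0)
y1 = rng.standard_normal(256)
y2 = np.roll(y1, 5)

# Circular cross-correlation via the FFT (not a deconvolution, but periodic too).
h12 = np.fft.ifft(np.fft.fft(y1) * np.conj(np.fft.fft(y2))).real
h21 = np.fft.ifft(np.fft.fft(y2) * np.conj(np.fft.fft(y1))).real

print(np.argmax(h12), signed_peak_lag(h12))   # 251 -5
print(np.argmax(h21), signed_peak_lag(h21))   # 5 5
```

Which of the two signs means "microphone 1 is closer" depends on which signal is used as the reference, so fix that convention once and keep it throughout.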

Finally, convert the TDOA into a physical distance (knowing the speed of sound).
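
For a TDOA of $\Delta n$ samples at sample rate $F_s$, the difference in path length is $\Delta d = c\,\Delta n / F_s$. A minimal sketch of this conversion follows, assuming a speed of sound of 343 m/s (this varies with temperature) and the 44.1 kHz sample rate used earlier; `lag_to_distance` is a hypothetical helper name.

```python
SPEED_OF_SOUND = 343.0  # m/s, assumed (air at roughly 20 degrees C)

def lag_to_distance(lag_samples, fs, c=SPEED_OF_SOUND):
    """Convert a TDOA in samples into a difference in path length in metres."""
    return c * lag_samples / fs

fs = 44100
print(f"{lag_to_distance(1, fs) * 1e3:.1f} mm per sample")   # 7.8 mm per sample
print(f"{lag_to_distance(-5, fs) * 1e2:.1f} cm")             # -3.9 cm
```

The first number is the distance resolution that a single sample gives you at this rate, which is one of the considerations when choosing the sample rate.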

```{figure} TDOA.png
---
height/width: 150px
name: tdoa-figure
---
TDOA Estimation
```

The image above shows TDOA estimation: for the impulse response $h_1[n]$ of the first microphone, find the first peak after the silent interval, then go to the second microphone $h_2[n]$ and look for a matching peak in the search window.