# Introduction to electromagnetics and antennas

## 1. Introduction to electromagnetics  
All electromagnetism starts with Maxwell's equations.  Maxwell's equations are a set of coupled partial differential equations that model how electric and magnetic fields are generated by charges, currents, and the variation of the fields themselves.  Maxwell's equations are:

| Equation    | Name |
| -------- | ------- |
| $\nabla \cdot \mathbf{E} = \frac{\rho}{\epsilon_0}$  |   Gauss' law  |
| $\nabla \cdot \mathbf{B} = 0$  | Gauss' law of magnetism |
| $\nabla \times \mathbf{E} = -\frac{\partial \mathbf{B}}{\partial t}$     | Faraday's law    |
| $\nabla \times \mathbf{B} = \mu_0 (\mathbf{J} + \epsilon_0 \frac{\partial \mathbf{E}}{\partial t})$     | Ampere-Maxwell law    |

A few notes about these equations:
* $\mathbf{E}$ is the electric field in units of volts/meter
* $\mathbf{B}$ is the magnetic field in units of teslas
* $\rho$ is the electric charge density and $\mathbf{J}$ is the electric current density
* $\epsilon_0$ is the permittivity of free space ~ $8.85 \times 10^{-12}$ F/m
* $\mu_0$ is the magnetic permeability of a vacuum ~ $1.26 \times 10^{-6}$ H/m
* $\mathbf{E}$ and $\mathbf{B}$ are vector quantities and are functions of position and time, i.e. $\mathbf{E} = \mathbf{E}(x, y, z, t)$ and $\mathbf{B} = \mathbf{B}(x, y, z, t)$
* $\nabla \cdot \mathbf{A}$ is referred to as the divergence operator.  Sometimes written as $\text{div} \mathbf{A} = \nabla \cdot \mathbf{A}$
    - Defined as $\nabla \cdot \mathbf{A} = \langle \frac{\partial}{\partial x}, \frac{\partial}{\partial y}, \frac{\partial}{\partial z}\rangle\cdot \langle A_x, A_y, A_z\rangle = \frac{\partial A_x}{\partial x} + \frac{\partial A_y}{\partial y} + \frac{\partial A_z}{\partial z}$
* $\nabla \times \mathbf{A}$ is referred to as the curl operator.  Sometimes written as $\text{curl} \mathbf{A} = \nabla \times \mathbf{A}$
    - Defined as $\nabla \times \mathbf{A} = (\frac{\partial A_z}{\partial y} - \frac{\partial A_y}{\partial z})\mathbf{\hat{x}} + (\frac{\partial A_x}{\partial z} - \frac{\partial A_z}{\partial x})\mathbf{\hat{y}} + (\frac{\partial A_y}{\partial x} - \frac{\partial A_x}{\partial y})\mathbf{\hat{z}}$  

We will now derive a solution for the electric and magnetic fields in a source-free, homogeneous medium (free space).  Since the medium is source free, both $\rho$ and $\mathbf{J}$ are zero and our equations can be rewritten as  

$$\nabla \cdot \mathbf{E} = 0$$
$$\nabla \cdot \mathbf{B} = 0$$
$$\nabla \times \mathbf{E} = -\frac{\partial \mathbf{B}}{\partial t}$$
$$\nabla \times \mathbf{B} = \mu_0 \epsilon_0 \frac{\partial \mathbf{E}}{\partial t}$$  

We will start by taking the Fourier transform of these equations to arrive at the time-harmonic form of Maxwell's equations.  Here we use the convention that the fields vary in time as $e^{-j\omega t}$, so every time derivative becomes a multiplication by $-j\omega$:
$$\nabla \cdot \widetilde{\mathbf{E}} = 0$$
$$\nabla \cdot \widetilde{\mathbf{B}} = 0$$
$$\nabla \times \widetilde{\mathbf{E}} = j\omega \widetilde{\mathbf{B}}$$
$$\nabla \times \widetilde{\mathbf{B}} = -j\omega \mu_0 \epsilon_0 \widetilde{\mathbf{E}}$$  

We can take the curl of Faraday's law to arrive at
$$\nabla \times \nabla \times \widetilde{\mathbf{E}} = \nabla \times j\omega \widetilde{\mathbf{B}}$$
$$\nabla \times \nabla \times \widetilde{\mathbf{E}} = j\omega \nabla \times \widetilde{\mathbf{B}}$$

We can now plug in Ampere's law 
$$\nabla \times \nabla \times \widetilde{\mathbf{E}} = j\omega (-j\omega \mu_0 \epsilon_0) \widetilde{\mathbf{E}}$$
$$\nabla \times \nabla \times \widetilde{\mathbf{E}} = \omega^2 \mu_0 \epsilon_0 \widetilde{\mathbf{E}}$$  

Now using the identity $\nabla \times \nabla \times \mathbf{A} = \nabla (\nabla \cdot \mathbf{A}) - \nabla^2\mathbf{A}$ we can rewrite this as
$$\nabla (\nabla \cdot \widetilde{\mathbf{E}}) - \nabla^2\widetilde{\mathbf{E}} = \omega^2 \mu_0 \epsilon_0 \widetilde{\mathbf{E}}$$  
But from Gauss' law we know that $\nabla \cdot \widetilde{\mathbf{E}} = 0$ so we can rewrite this as

$$- \nabla^2\widetilde{\mathbf{E}} = \omega^2 \mu_0 \epsilon_0 \widetilde{\mathbf{E}}$$  
$$\nabla^2\widetilde{\mathbf{E}} + \omega^2 \mu_0 \epsilon_0 \widetilde{\mathbf{E}} = 0$$
We define $k^2 = \omega^2 \mu_0 \epsilon_0$, where $k$ is the propagation constant (wavenumber), which means we can write this as
$$\nabla^2\widetilde{\mathbf{E}} + k^2\widetilde{\mathbf{E}} = 0$$  
This result is known as the Helmholtz equation and is the "spatial" part of solving the wave equation (https://en.wikipedia.org/wiki/Helmholtz_equation)  
This same derivation can be followed to arrive at the same equation for the magnetic field:
$$\nabla^2\widetilde{\mathbf{B}} + k^2\widetilde{\mathbf{B}} = 0$$  


This is a well-studied equation, and its solution (which can be derived using separation of variables) for our specific boundary conditions is of the form
$$\widetilde{\mathbf{E}}(\mathbf{r}) = \mathbf{E}_0e^{j\mathbf{k}\cdot \mathbf{r}}$$  

where $\mathbf{E}_0$ is a constant amplitude vector, $\mathbf{k} = k_x \hat{x} + k_y \hat{y} + k_z \hat{z}$ is the wave vector, and $|\mathbf{k}| = \omega \sqrt{\mu_0 \epsilon_0} = \frac{2\pi}{\lambda}$ 

The unit vector $\hat{k} = \mathbf{k}/|\mathbf{k}|$ denotes the direction of propagation, and the phase of the electric field is constant in any plane perpendicular to that direction.  Essentially $\hat{k}\cdot \mathbf{r} = \text{constant}$ defines a plane perpendicular to $\hat{k}$ (every point $\mathbf{r}$ on that plane has the same projection onto $\hat{k}$, and therefore the same phase). We call this type of function $e^{j\mathbf{k}\cdot \mathbf{r}}$ a plane wave  
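
As a quick numerical sanity check (a sketch, not part of the derivation), the snippet below evaluates a scalar plane wave $e^{j\mathbf{k}\cdot \mathbf{r}}$ and uses finite differences to confirm that its Laplacian is approximately $-k^2$ times the wave itself. The wavelength, propagation direction, test point, and step size are arbitrary assumptions chosen for illustration.

```python
import numpy as np

# Assumed example values (not taken from the text above)
wavelength = 0.3                          # metres, roughly 1 GHz in free space
k_mag = 2 * np.pi / wavelength            # |k| = 2*pi/lambda
k_hat = np.array([1.0, 2.0, 2.0]) / 3.0   # unit vector, since sqrt(1 + 4 + 4) = 3
k_vec = k_mag * k_hat

def plane_wave(r):
    """Scalar plane wave exp(j k.r) evaluated at a position r = (x, y, z)."""
    return np.exp(1j * np.dot(r, k_vec))

# Second-order central differences approximate the Laplacian at one test point
h = wavelength / 1000
r0 = np.array([0.05, -0.02, 0.11])
laplacian = 0.0
for axis in range(3):
    step = np.zeros(3)
    step[axis] = h
    laplacian += (plane_wave(r0 + step) - 2 * plane_wave(r0) + plane_wave(r0 - step)) / h**2

# Helmholtz equation: laplacian + k^2 * E should be (nearly) zero
residual = laplacian + k_mag**2 * plane_wave(r0)
relative_error = abs(residual) / k_mag**2
print(f"relative Helmholtz residual: {relative_error:.2e}")
```

The residual is not exactly zero because the finite-difference Laplacian is itself an approximation; shrinking `h` reduces it until floating-point round-off takes over.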

![plane_wave](./images/plane_wave.png)  
We have just shown that the wave equation can be derived from Maxwell's equations and that plane waves are solutions to it.  To find the solution for the magnetic field we can use  $\nabla \times \widetilde{\mathbf{E}} = j\omega \widetilde{\mathbf{B}}$.  If we rearrange this we can see  
$$\widetilde{\mathbf{B}} = \frac{1}{j\omega} \nabla \times \widetilde{\mathbf{E}}$$
We can plug in our plane wave solution for the electric field, with constant amplitude vector $\mathbf{E}_0$, and use $\nabla \times (\mathbf{E}_0e^{j\mathbf{k}\cdot \mathbf{r}}) = j\mathbf{k} \times \mathbf{E}_0e^{j\mathbf{k}\cdot \mathbf{r}}$ to get  
$$\widetilde{\mathbf{B}} = \frac{1}{j\omega} j\mathbf{k} \times \mathbf{E}_0e^{j\mathbf{k}\cdot \mathbf{r}}$$

Since $|\mathbf{k}|/\omega = \sqrt{\mu_0 \epsilon_0} = 1/c$, where $c$ is the speed of light, this can be simplified to 
$$\widetilde{\mathbf{B}} = \frac{1}{c}\hat{k} \times \mathbf{E}_0e^{j\mathbf{k}\cdot \mathbf{r}}$$  

which means that the magnetic field is perpendicular to both the direction of propagation and the electric field.  So we have derived that:

$$\widetilde{\mathbf{E}}(\mathbf{r}) = \mathbf{E}_0e^{j\mathbf{k}\cdot \mathbf{r}}$$  
$$\widetilde{\mathbf{B}}(\mathbf{r}) = \frac{1}{c}\hat{k} \times \mathbf{E}_0e^{j\mathbf{k}\cdot \mathbf{r}}$$  

![em_wave](./images/EMwave.jpg)  


# 2. Antennas
Antennas are devices which convert a guided wave propagating in a circuit into a wave which propagates in free space (and vice versa).  Antennas can be used in two modes:
* Transmit
    - sending electromagnetic waves through free space
* Receive
    - receiving electromagnetic waves from free space and converting them into an AC signal in a circuit

## Types of antennas
There are many types of antennas.  Here are a few common ones:
### Dipole
![dipole](./images/dipole.png)  
### Horn
![horn](./images/horn_antenna.jpg)  

### Patch
![patch](./images/patch.jpg)  

### Helical
![helical](./images/helical_antenna.jpg)  

### Parabolic reflector
![parabolic](./images/parabolic_reflector.jpg)  

## Radiation conditions (near field vs far field)
Radiation occurs because of time-varying voltages and currents.  There are three regions of radiated energy:
* Reactive near field
* Radiating near field (Fresnel region) 
* Far field (Fraunhofer region)  

![radiation_regions](./images/radiation_regions.PNG)  

### Reactive near field
The reactive near-field region is defined as "that portion of the near-field region immediately surrounding the antenna wherein the reactive field predominates".  For most antennas, the outer boundary of this region is commonly taken to exist at a distance $R < 0.62\sqrt{\frac{D^3}{\lambda}}$ from the antenna surface, where $\lambda$ is the wavelength and $D$ is the largest dimension of the antenna.
### Radiating near field (Fresnel region) 
The radiating near-field (Fresnel) region is defined as "that region of the field of an antenna between the reactive near-field region and the far-field region wherein radiation fields predominate and wherein the angular field distribution is dependent upon the distance from the antenna".
### Far field (Fraunhofer region)
The far-field (Fraunhofer) region is defined as "that region of the field of an antenna where the angular field distribution is essentially independent of the distance from the antenna".  This is known as the plane wave condition, where the electric field, magnetic field, and propagation direction are all mutually perpendicular.  

This region is defined by $R > \frac{2D^2}{\lambda}$.  Antenna performance is typically measured in the far field, and antennas are expected to operate in the far-field region.
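
To make the two boundaries concrete, here is a minimal sketch that evaluates both formulas for one antenna. The function name `field_region_boundaries` and the example antenna (a 0.5 m aperture at 10 GHz) are assumptions made up for illustration; the formulas are the ones quoted above, which are usually applied to antennas that are large compared to the wavelength.

```python
import math

def field_region_boundaries(largest_dimension_m, wavelength_m):
    """Return (reactive near-field limit, far-field limit) in metres."""
    reactive_limit = 0.62 * math.sqrt(largest_dimension_m**3 / wavelength_m)
    far_field_limit = 2 * largest_dimension_m**2 / wavelength_m
    return reactive_limit, far_field_limit

# Assumed example: a 0.5 m aperture operating at 10 GHz
speed_of_light = 299_792_458          # m/s
wavelength = speed_of_light / 10e9    # metres
reactive_limit, far_field_limit = field_region_boundaries(0.5, wavelength)

print(f"reactive near field:  R < {reactive_limit:.2f} m")
print(f"radiating near field: {reactive_limit:.2f} m < R < {far_field_limit:.2f} m")
print(f"far field:            R > {far_field_limit:.2f} m")
```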
## radiation patterns
A radiation pattern (sometimes referred to as an antenna pattern) is a mathematical function or graphical representation of the radiation properties of the antenna as a function of spatial coordinates.  It describes the intensity of the electric and magnetic fields as a function of direction.  

We will define the following angular quantities from rectangular coordinates. I'll use the definition from this website: https://www.antenna-theory.com/definitions/sphericalCoordinates.php

$$R = \sqrt{x^2 + y^2 + z^2}$$
$$\theta = \text{arccos}{\left(\frac{z}{\sqrt{x^2 + y^2 + z^2}}\right)}$$
$$\phi = \text{arctan}{\left(\frac{y}{x}\right)}$$
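
A small sketch of this conversion is given here, assuming $\theta$ is measured from the $+z$ axis and $\phi$ from the $+x$ axis. The helper name `cartesian_to_spherical` is made up for illustration, and `atan2` is used instead of a plain arctangent so that the quadrant of $\phi$ is preserved.

```python
import math

def cartesian_to_spherical(x, y, z):
    """Return (R, theta, phi): theta measured from +z, phi measured from +x."""
    r = math.sqrt(x**2 + y**2 + z**2)
    theta = math.acos(z / r) if r > 0 else 0.0
    phi = math.atan2(y, x)  # atan2 keeps the correct quadrant, unlike atan(y / x)
    return r, theta, phi

r, theta, phi = cartesian_to_spherical(1.0, 1.0, math.sqrt(2.0))
print(r, math.degrees(theta), math.degrees(phi))

# A point on the negative x axis: atan(y / x) would return 0 here instead of 180 degrees
_, theta_neg_x, phi_neg_x = cartesian_to_spherical(-1.0, 0.0, 0.0)
print(math.degrees(theta_neg_x), math.degrees(phi_neg_x))
```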


The far-field radiation pattern's "shape" is the Fourier transform of the antenna aperture distribution.  This is a well-known phenomenon and is explained in more depth here https://en.wikipedia.org/wiki/Fraunhofer_diffraction_equation

In [None]:
import matplotlib.pyplot as plt
import numpy as np
phi = np.linspace(-np.pi, np.pi, 1024)

dipole_pattern = np.sin(phi)**2
horn_pattern = np.sinc(phi)

fig, axs = plt.subplots(subplot_kw={'projection': 'polar'}, ncols=2, figsize=(12, 12))
axs = axs.flatten()
axs[0].plot(phi, 20*np.log10(np.abs(dipole_pattern)))
axs[0].set_rmax(0)
axs[0].set_rmin(-40)
axs[0].set_title('dipole antenna')

axs[1].plot(phi, 20*np.log10(np.abs(horn_pattern)))
axs[1].set_rmax(0)
axs[1].set_rmin(-40)
axs[1].set_title('horn antenna')
plt.show()

fig, axs = plt.subplots(ncols=2, figsize=(12, 6))
axs = axs.flatten()
axs[0].plot(np.rad2deg(phi), 20*np.log10(np.abs(dipole_pattern)))
axs[0].set_ylim([-40, 0])
axs[0].set_title('dipole antenna')
axs[0].set_xlabel(r'$\phi$ (deg)')  # raw string avoids the invalid '\p' escape

axs[1].plot(np.rad2deg(phi), 20*np.log10(np.abs(horn_pattern)))
axs[1].set_ylim([-40, 0])
axs[1].set_title('horn antenna')
axs[1].set_xlabel(r'$\phi$ (deg)')  # raw string avoids the invalid '\p' escape
plt.show()


Above we looked at a slice of a three dimensional radiation pattern.  Let's examine what the full radiation pattern in three dimensional space looks like for the examples above

In [None]:
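# theta is the polar angle measured from the z axis, phi is the azimuth angle
theta = np.linspace(0, np.pi, 1024)
phi = np.linspace(0, 2*np.pi, 1024)
theta, phi = np.meshgrid(theta, phi)
# horn-like pattern from the previous cell, rotationally symmetric about the z axis;
# the magnitude is used so that the radius is never negative
r = np.abs(np.sinc(theta))

# spherical to cartesian conversion
x = r * np.sin(theta) * np.cos(phi)
y = r * np.sin(theta) * np.sin(phi)
z = r * np.cos(theta)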

fig = plt.figure()
ax = fig.add_subplot(111, projection='3d')
ax.plot_surface(x, y, z, cmap='viridis')
ax.set_title('3D Antenna Pattern')
plt.show()


## beamwidth
Beamwidth is defined as the angular separation between two identical points on opposite sides of the pattern's main beam.  There are two quantities commonly used for beamwidth
* Half power beamwidth (HPBW)
    - Angular width of the main beam between the two points where the power drops to half of its peak value (-3 dB)
* First null beamwidth (FNBW)
    - Angular width between the first nulls on either side of the main beam  

Beamwidth is used to describe the resolution of the antenna, mainly its capability to distinguish between two adjacent sources. This is analogous to range resolution in radar!  

Antenna beamwidth is characterised in both the $\theta$ and $\phi$ dimensions



In [None]:
import matplotlib.pyplot as plt
import numpy as np
phi = np.linspace(-np.pi, np.pi, 1024)

horn_pattern = np.cos(phi)
power = np.abs(horn_pattern)**2

# The main beam points at phi = 0.  The first nulls are the local minima of the
# power pattern closest to boresight on either side.
boresight_idx = np.argmin(np.abs(phi))
null_idx = np.argwhere((power[1:-1] < power[:-2]) & (power[1:-1] <= power[2:])).ravel() + 1
fnbw_lower_bound = phi[null_idx[null_idx < boresight_idx].max()]
fnbw_upper_bound = phi[null_idx[null_idx > boresight_idx].min()]
fnbw = fnbw_upper_bound - fnbw_lower_bound 

# Half power points: only search inside the main beam so that the back lobe of
# the cosine pattern is not picked up
in_main_beam = (phi > fnbw_lower_bound) & (phi < fnbw_upper_bound)
half_power_bw_idx = np.argwhere(in_main_beam & (power >= 0.5)).ravel()
hpbw_lower_bound = phi[half_power_bw_idx.min()]
hpbw_upper_bound = phi[half_power_bw_idx.max()]
half_power_bw = hpbw_upper_bound - hpbw_lower_bound 

fig, axs = plt.subplots(subplot_kw={'projection': 'polar'})
axs.plot(phi, 20*np.log10(np.abs(horn_pattern)))
axs.plot([hpbw_upper_bound, hpbw_upper_bound], [-40, 0], '--r', label=f'HPBW = {np.rad2deg(half_power_bw):.1f} deg')
axs.plot([hpbw_lower_bound, hpbw_lower_bound], [-40, 0], '--r')
axs.plot([fnbw_upper_bound, fnbw_upper_bound], [-40, 0], '--m', label=f'FNBW = {np.rad2deg(fnbw):.1f} deg')
axs.plot([fnbw_lower_bound, fnbw_lower_bound], [-40, 0], '--m')
axs.set_rmax(0)
axs.set_rmin(-40)
axs.set_title('horn antenna')
plt.legend()
plt.show()

## bandwidth
Bandwidth describes the range of frequencies over which the antenna can properly radiate or receive energy.  Bandwidth is typically characterized by the voltage standing wave ratio (VSWR) or by inspecting the reflection coefficient:
$$\text{VSWR} = \frac{1 + |\Gamma|}{1 - |\Gamma|}$$  
Where $\Gamma$ is the reflection coefficient  
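
As an illustration of how bandwidth might be read off from reflection coefficient data, the sketch below converts $|\Gamma|$ to VSWR and keeps the frequencies whose VSWR is at most 2. Both the $\text{VSWR} \leq 2$ criterion and the measurement values are assumptions invented for this example, not data from a real antenna.

```python
import math

def vswr_from_gamma(gamma):
    """VSWR from a (possibly complex) reflection coefficient."""
    magnitude = abs(gamma)
    if magnitude >= 1:
        return math.inf  # total reflection
    return (1 + magnitude) / (1 - magnitude)

# Hypothetical |Gamma| measurements keyed by frequency in Hz
measurements = {2.30e9: 0.60, 2.40e9: 0.25, 2.45e9: 0.05, 2.50e9: 0.30, 2.60e9: 0.65}

vswr_limit = 2.0  # assumed acceptance threshold
in_band = [freq for freq, gamma in measurements.items() if vswr_from_gamma(gamma) <= vswr_limit]
bandwidth_hz = max(in_band) - min(in_band)

for freq, gamma in measurements.items():
    print(f"{freq / 1e9:.2f} GHz: VSWR = {vswr_from_gamma(gamma):.2f}")
print(f"bandwidth (VSWR <= {vswr_limit}): {bandwidth_hz / 1e6:.0f} MHz")
```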
![antenna_bandwidth](./images/antenna_bandwidth.png)  
## directivity  
Directivity of an antenna is defined as the ratio of radiation intensity in a given direction from the antenna to the radiation intensity averaged over all directions.
$$D = \frac{U}{U_0} = \frac{4\pi U}{P_{rad}}$$  

Often people will reference directivity to the direction of maximum radiation intensity and express it as 
$$D_{max} = \frac{4\pi U_{max}}{P_{rad}}$$  
Where 
* $U$ is the radiation intensity (watts per unit solid angle)
* $P_{rad}$ is the total radiated power in watts  

More generally, we can compute directivity as 
$$D(\theta, \phi) = \frac{4\pi U(\theta, \phi)}{\int_0^{2\pi} \int_0^{\pi} U(\theta, \phi)\sin\theta d\theta d\phi}$$  
One important quantity arises from this equation, known as the beam solid angle.  The beam solid angle $\Theta_A$ is defined as the same integral with the radiation intensity normalized by its maximum:

$$\Theta_A = \int_0^{2\pi} \int_0^{\pi} \frac{U(\theta, \phi)}{U_{max}}\sin\theta d\theta d\phi$$  

so that $D_{max} = \frac{4\pi}{\Theta_A}$.  You can think of the beam solid angle as the solid angle through which all the power of the antenna would flow if its radiation intensity were constant (and equal to $U_{max}$) for all angles within $\Theta_A$  

Often we don't have closed form expressions for the radiation pattern of an antenna and instead approximate the beam solid angle in terms of its beamwidths in the two angular dimensions:
$$\Theta_A \approx \Theta_{\theta} \Theta_{\phi}$$  
Where $\Theta_{\theta}$ and  $\Theta_{\phi}$ are the respective half power beamwidths (in radians) in both angular dimensions
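
The directivity integral can also be evaluated numerically. The sketch below does this with a simple trapezoidal rule for an assumed test pattern $U(\theta, \phi) = \sin^2\theta$, for which the closed-form maximum directivity is 1.5. The helper names and grid sizes are illustrative choices, and the beam solid angle is computed from the radiation intensity normalized by its maximum.

```python
import numpy as np

def trapezoid(values, spacing, axis):
    """Trapezoidal rule on a uniformly spaced grid along one axis."""
    values = np.moveaxis(values, axis, 0)
    return spacing * (values.sum(axis=0) - 0.5 * (values[0] + values[-1]))

def directivity_and_beam_solid_angle(radiation_intensity, n_theta=361, n_phi=721):
    """Return (D_max, beam solid angle) for a function U(theta, phi)."""
    theta = np.linspace(0.0, np.pi, n_theta)
    phi = np.linspace(0.0, 2.0 * np.pi, n_phi)
    theta_grid, phi_grid = np.meshgrid(theta, phi, indexing="ij")
    u = radiation_intensity(theta_grid, phi_grid)

    # P_rad = double integral of U(theta, phi) * sin(theta) over the sphere
    integrand = u * np.sin(theta_grid)
    over_phi = trapezoid(integrand, phi[1] - phi[0], axis=1)
    p_rad = trapezoid(over_phi, theta[1] - theta[0], axis=0)

    u_max = u.max()
    d_max = 4.0 * np.pi * u_max / p_rad
    beam_solid_angle = p_rad / u_max  # integral of the normalized intensity
    return d_max, beam_solid_angle

# Assumed test pattern: U = sin^2(theta), independent of phi
d_max, beam_solid_angle = directivity_and_beam_solid_angle(lambda theta, phi: np.sin(theta) ** 2)
print(f"D_max = {d_max:.4f} ({10 * np.log10(d_max):.2f} dB)")
print(f"beam solid angle = {beam_solid_angle:.4f} sr")
```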
## antenna efficiency
Antenna efficiency (sometimes called radiation efficiency) is how effective the antenna is at converting RF power at the input to radiated power at the output.  This quantity is typically measured for each antenna and is denoted by $\eta$
## gain
Gain describes how much the antenna concentrates the signal in a particular direction, accounting for losses in the antenna.  This is similar to amplifier gain, except now it's a function of the radiation direction.  Mathematically this is defined as:
$$G = \eta D$$

where $\eta$ is the antenna efficiency and $D$ is the directivity of the antenna


# 3. Antenna arrays/beamforming  
Antenna arrays are formed by placing multiple antennas close together and combining their signals so that they act as a single antenna.
![real_arr](./images/real_antenna_array.jfif)  

Initially we'll study a well-known antenna array configuration known as the uniform linear array.  We will use the following coordinate system:
![phased_array_coord](./images/phased_array_geometry.png)  

## Uniform linear array (ULA)  
A uniform linear array is an array of antennas positioned along a line such that each antenna is a distance $d$ from its neighbors.  

![ula_geom](./images/ula_example.png)  

For this linear array we will have $N$ antennas spaced a distance $d$ meters apart and sum the signals from each antenna together.  Using the diagram above we can see that a plane wave travels an extra distance $d\cos{\theta}$ to reach each successive antenna, so it arrives with an additional phase shift of $kd\cos{\theta}$.  This means for a plane wave we can write the sum of the received signals as:
$$Y = e^{jkd(0)\cos{\theta}} + e^{jkd(1)\cos{\theta}} + e^{jkd(2)\cos{\theta}} + ...$$  
$$Y = \sum_{n=0}^{N-1}e^{jkdn\cos{\theta}}$$

The expression $Y$ is commonly known as the array factor and captures the effect of spacing your antennas a distance $d$ apart.  With that in mind we can rewrite this as:
$$AF = \sum_{n=0}^{N-1}e^{jkdn\cos{\theta}}$$  

Here we assumed that each antenna was radiating isotropically (had unit gain), however in reality that is not the case.  Let's add in the contribution of each antenna pattern $R_n(\theta, \phi)$
$$Y = \sum_{n=0}^{N-1}R_n(\theta, \phi)e^{jkdn\cos{\theta}}$$  
If we assume that we use the same antenna for each element then we can remove the dependence on $n$ in the equation, use $R_n(\theta, \phi) = R(\theta, \phi)$, and rewrite our expression as

$$Y = R(\theta, \phi)\sum_{n=0}^{N-1}e^{jkdn\cos{\theta}}$$
$$Y = R(\theta, \phi)\cdot AF$$  

Which means the radiation pattern of our antenna array will be the radiation pattern of an individual element times the array factor.  
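
The array factor sum is a geometric series, so its magnitude has the closed form $\left|\frac{\sin(N\psi/2)}{\sin(\psi/2)}\right|$ with $\psi = kd\cos{\theta}$. The sketch below compares the direct sum with that closed form and then applies pattern multiplication. The element pattern $R(\theta) = \sin\theta$, the element count, and the spacing are assumptions chosen only for illustration.

```python
import numpy as np

def array_factor(n_elements, spacing_wavelengths, theta):
    """Direct sum AF = sum_n exp(j*k*d*n*cos(theta)), with d given in wavelengths."""
    psi = 2 * np.pi * spacing_wavelengths * np.cos(theta)  # k*d*cos(theta)
    n = np.arange(n_elements)[:, None]
    return np.exp(1j * n * psi[None, :]).sum(axis=0)

n_elements = 8
spacing_wavelengths = 0.5
# An even number of samples avoids landing exactly on theta = 90 deg,
# where the closed form below becomes 0/0
theta = np.linspace(0.01, np.pi - 0.01, 500)

af = array_factor(n_elements, spacing_wavelengths, theta)
psi = 2 * np.pi * spacing_wavelengths * np.cos(theta)
closed_form = np.sin(n_elements * psi / 2) / np.sin(psi / 2)

# Pattern multiplication with an assumed element pattern R(theta) = sin(theta)
element_pattern = np.sin(theta)
total_pattern = element_pattern * af

print("max |AF| =", np.abs(af).max())
print("closed form matches direct sum:", np.allclose(np.abs(af), np.abs(closed_form)))
```

The peak of $|AF|$ approaches $N$ at broadside ($\theta = 90°$), where all elements add in phase.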

### Effect of number of antennas
To see how the number of antennas affects the array response, let's do some plotting:

In [None]:
import numpy as np
import matplotlib.pyplot as plt
from scipy.constants import speed_of_light
freq = 1e9
wavelen = speed_of_light / freq
d = wavelen / 2
n_antenna = 8
k = 2*np.pi/wavelen
theta = np.linspace(0, 2*np.pi, 1024)
AF = np.zeros(theta.size, dtype=np.complex128)
for n in range(n_antenna):
    AF += np.exp(-1j * k * n * d * np.cos(theta))

fig = plt.figure(figsize=(12, 6))
ax1 = plt.subplot(121, projection='polar')
ax2 = plt.subplot(122)

ax1.plot(theta, np.abs(AF))
ax1.set_title('Array Factor')
distance = d * np.arange(n_antenna)
ax2.plot(distance, np.zeros(n_antenna), 'rx')
ax2.set_title('Antenna positions')
ax2.set_xlabel('Distance (meters)')
plt.show()

Re-running the cell above with larger values of `n_antenna` shows that increasing the number of elements narrows our beamwidth!  

### Effect of antenna spacing
To see how antenna spacing affects the array response, let's do some plotting:

In [None]:
import numpy as np
import matplotlib.pyplot as plt
from scipy.constants import speed_of_light
freq = 1e9
wavelen = speed_of_light / freq
d = wavelen / 2
k = 2*np.pi/wavelen
n_antenna = 4
theta = np.linspace(0, 2*np.pi, 1024)
AF = np.zeros(theta.size, dtype=np.complex128)
for n in range(n_antenna):
    AF += np.exp(-1j * k * n * d * np.cos(theta))

fig, axs = plt.subplots(subplot_kw={'projection': 'polar'})
axs.plot(theta, np.abs(AF))
plt.show()

We see that the largest spacing we can use without seeing duplicates (aliases) of our beam for any steering direction is $d = \frac{\lambda}{2}$, i.e. we need $d \leq \frac{\lambda}{2}$.  You can think of this as the "spatial Nyquist theorem" for antennas, meaning that if we don't spatially sample at a spacing of $\frac{\lambda}{2}$ or finer then we will have spatial aliases.  These aliases are known as grating lobes in antenna theory.  
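
One way to see grating lobes without plotting is to count how many lobes of the array factor reach (nearly) the full height $N$. The sketch below does this for a broadside array at a few spacings. The function name, the 99% threshold, and the spacings tested are assumptions made for illustration.

```python
import numpy as np

def count_full_height_lobes(spacing_wavelengths, n_elements=8, n_points=2001):
    """Count lobes of |AF| that reach at least 99% of the peak value N."""
    theta = np.linspace(0.0, np.pi, n_points)
    psi = 2 * np.pi * spacing_wavelengths * np.cos(theta)  # k*d*cos(theta)
    n = np.arange(n_elements)[:, None]
    af = np.abs(np.exp(1j * n * psi[None, :]).sum(axis=0))

    mask = af >= 0.99 * n_elements
    # Number of contiguous runs of True: rising edges, plus a run that starts at index 0
    return int(np.count_nonzero(mask[1:] & ~mask[:-1]) + int(mask[0]))

for spacing in (0.25, 0.5, 1.0, 2.0):
    lobes = count_full_height_lobes(spacing)
    print(f"d = {spacing:>4} wavelengths: {lobes} full-height lobe(s)")
```

With $d = \lambda/2$ only the main beam at broadside reaches full height, while $d = \lambda$ adds full-height lobes at the two endfire directions.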

### Steering the beam
Up until this point we have worked with broadside arrays, meaning they radiate directly down their line of sight (boresight).  Without any additional processing, focusing our energy in another direction would require physically moving the antenna.  

Let's examine what happens to the array factor if we apply a progressive phase shift of $\beta = kd\cos{\theta_0}$ between adjacent elements, i.e.
$$AF = \sum_{n=0}^{N-1}e^{jn(kd\cos{\theta} - \beta)}$$  


In [None]:
import numpy as np
import matplotlib.pyplot as plt
from scipy.constants import speed_of_light
import ipywidgets as widgets

freq = 1e9
wavelen = speed_of_light / freq
d = wavelen / 2
n_antenna = 16
k = 2*np.pi/wavelen
theta = np.linspace(-np.pi, np.pi, 1024)

def show_steering(steer_angle_deg = 20):
    AF = np.zeros(theta.size, dtype=np.complex128)
    for n in range(n_antenna):
        AF += np.exp(-1j * k * n * d * (np.cos(theta) - np.cos(np.deg2rad(steer_angle_deg))))

    fig = plt.figure(figsize=(12, 6))
    ax1 = plt.subplot(121, projection='polar')
    ax2 = plt.subplot(122)

    ax1.plot(theta, np.abs(AF))
    ax1.set_title('Array Factor')
    distance = d * np.arange(n_antenna)
    ax2.plot(distance, np.zeros(n_antenna), 'rx')
    ax2.set_title('Antenna positions')
    ax2.set_xlabel('Distance (meters)')
    plt.show()

widgets.interact(show_steering, steer_angle_deg=(-180, 180), )

We see that this gives us the ability to steer the direction in which the array is focusing energy.  This technique of focusing energy of the array is known as beamforming.  This can be achieved in a few ways:
### Analog
* Butler matrix
* phased array
### Digital
* Digital beamforming
    - Capon/Bartlett beamforming


# 4. Angle of arrival estimation  

Angle of arrival estimation (sometimes referred to as direction of arrival estimation) is a technique for estimating the direction from which a signal arrived.  Angle of arrival estimation and beamforming are closely related but not the same.  For instance, most beamforming techniques can be used to estimate the angle of arrival of signals; however, not all angle of arrival estimation techniques can be used for beamforming  


## Bartlett beamformer (delay and sum)
The delay and sum beamforming technique computes the array response over a range of angles. The angle which produces the highest array response is taken as the angle of arrival for the given signal.  

For the Bartlett beamformer we will use a uniform linear array with $N$ elements and define our signal model as follows:
$$\mathbf{x}(t) = \mathbf{a}(\theta)\mathbf{f}(t) + \mathbf{n}(t)$$  
where:
* $\mathbf{x}(t) \in \mathbb{C}^{N\times 1}$ is the received signal at each array element at time $t$
* $\mathbf{a}(\theta) \in \mathbb{C}^{N\times D}$ is the matrix of steering vectors for $D$ total signals
* $\mathbf{f}(t) \in \mathbb{C}^{D\times 1}$ is a zero mean random vector that contains our desired signal and possibly other undesired signals
* $\mathbf{n}(t)\in \mathbb{C}^{N\times 1}$ is complex Gaussian noise  

We make the following assumptions:
1. our desired signals contained in $\mathbf{f}(t)$ are sinusoidal with a constant single frequency
2. There is only a single signal of interest in $\mathbf{f}(t)$ and all the other signals are interferers
3. $\mathbf{f}(t)$ is a wide sense stationary process
   - Its mean and autocorrelation do not vary with time 

The beamformer output for this is:
$$\mathbf{y}(t) = \mathbf{w}^{H} \mathbf{x}(t)$$  
Where $\mathbf{w}$ is the vector of beamforming weights.  If the beamforming is ideal then we can obtain a perfect reconstruction of the original signal, i.e. if we set the weights of the beamformer to the steering vector (spatial vector) of the signal's true direction then we will reconstruct the original signal.  Neglecting noise and a constant scale factor (since $\mathbf{a}^H\mathbf{a} = N$), this means mathematically

$$\mathbf{f}(t) = \mathbf{y}(t) = \mathbf{a}^H(\theta)\mathbf{x}(t)$$  

For simplicity I'm going to drop the dependence on time and angle.  Let's look at estimating the angular power spectrum of this equation.
$$P = Var(\mathbf{f})$$
$$P = E[|\mathbf{f} - \mu_f|^2]$$  
But we assumed our signal was zero mean
$$P = E[|\mathbf{f}|^2]$$
$$P = E[|\mathbf{a}^H\mathbf{x}|^2]$$
Approximating the expected value by the sample mean over $T$ snapshots, we can rewrite this as
$$P = \frac{1}{T}\sum_{t=0}^{T-1}|\mathbf{a}^H\mathbf{x}|^2$$  

To derive the beamformer we'll examine this expression in more depth
$$P = \frac{1}{T}\sum_{t=0}^{T-1}|\mathbf{a}^H\mathbf{x}|^2$$  
$$P = \frac{1}{T}\sum_{t=0}^{T-1}(\mathbf{a}^H\mathbf{x})(\mathbf{a}^H\mathbf{x})^{H}$$
$$P = \frac{1}{T}\sum_{t=0}^{T-1}\mathbf{a}^H\mathbf{x}\mathbf{x}^{H}\mathbf{a}$$
$$P = \mathbf{a}^H\left(\frac{1}{T}\sum_{t=0}^{T-1}\mathbf{x}\mathbf{x}^{H}\right)\mathbf{a}$$  
$$P = \mathbf{a}^H\left(E[\mathbf{x}\mathbf{x}^H]\right)\mathbf{a}$$ 
$$P = \mathbf{a}^H\mathbf{R}\mathbf{a}$$ 
Where $\mathbf{R}$ is the signal's estimated covariance matrix.  
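
The identity $P = \mathbf{a}^H\mathbf{R}\mathbf{a}$ can be checked numerically. The sketch below builds a small simulated data set, compares the quadratic form against the directly averaged output power, and then scans the quadratic form over angle. The array size, noise level, source angle, and the $\sin\theta$ steering convention with half-wavelength spacing are assumptions for this example.

```python
import numpy as np

rng = np.random.default_rng(0)
n_elements, n_snapshots = 8, 2000
d = 0.5                       # element spacing in wavelengths (assumed)
true_angle_deg = 20.0

def steering_vector(theta_deg):
    """ULA steering vector with the angle measured from broadside."""
    return np.exp(-2j * np.pi * d * np.arange(n_elements) * np.sin(np.deg2rad(theta_deg)))

# One unit-power tone arriving from true_angle_deg, plus complex Gaussian noise
signal = np.exp(2j * np.pi * 0.02 * np.arange(n_snapshots))
noise = 0.1 * (rng.standard_normal((n_elements, n_snapshots))
               + 1j * rng.standard_normal((n_elements, n_snapshots)))
x = np.outer(steering_vector(true_angle_deg), signal) + noise

# Sample covariance matrix R = (1/T) * sum_t x x^H
R = x @ x.conj().T / n_snapshots

a = steering_vector(true_angle_deg)
power_quadratic = np.real(a.conj() @ R @ a)        # a^H R a
power_direct = np.mean(np.abs(a.conj() @ x) ** 2)  # (1/T) * sum_t |a^H x|^2
print(power_quadratic, power_direct)

# Scanning a^H(theta) R a(theta) over angle gives the Bartlett spectrum
scan_deg = np.linspace(-90, 90, 361)
spectrum = np.array([np.real(steering_vector(t).conj() @ R @ steering_vector(t)) for t in scan_deg])
estimate_deg = scan_deg[np.argmax(spectrum)]
print(f"estimated angle of arrival: {estimate_deg:.1f} deg")
```

Computing $\mathbf{R}$ once and reusing it for every scan angle is the practical benefit of the quadratic form: the snapshots do not need to be revisited for each angle.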


In [None]:
import numpy as np
import matplotlib.pyplot as plt

sample_rate = 1e6
N = 10000 # number of samples to simulate

# Create a tone to act as the transmitter signal
t = np.arange(N)/sample_rate
f_tone = 0.02e6
tx = np.exp(2j * np.pi * f_tone * t)

d = 0.5 # half wavelength spacing
Nr = 32  # Number of receive antennas
angle_of_arrival = 20
theta = angle_of_arrival / 180 * np.pi 
s = np.exp(-2j * np.pi * d * np.arange(Nr) * np.sin(theta)) # Steering Vector

s = s[:, None] # Nrx1
tx = tx[None, :] # 1xN

X = s @ tx # NrxN

# Add noise
n = np.random.randn(Nr, N) + 1j*np.random.randn(Nr, N)
X = X + 0.2*n

In [None]:
plt.figure()
for ii in range(Nr):
    plt.plot(np.asarray(X[ii,:]).squeeze().real[0:50])
plt.show()

Plot beamformer output

In [None]:
n_scan = 1000
theta_scan = np.linspace(-1*np.pi, np.pi, n_scan)
results = np.zeros(n_scan, dtype=float)
for ii, theta_i in enumerate(theta_scan):
   # Conventional, aka delay-and-sum, beamformer (Bartlett)
   w = np.exp(-2j * np.pi * d * np.arange(Nr) * np.sin(theta_i)) 
   X_weighted = w.conj().T @ X # apply weights
   results[ii] = 10*np.log10(np.abs(np.var(X_weighted)))

results -= np.max(results)

estimated_angle_of_arrival = np.rad2deg(theta_scan[np.argmax(results)])
print(f'Configured angle of arrival is: {angle_of_arrival:.2f}')
print(f'Estimated angle of arrival is {estimated_angle_of_arrival}')

plt.figure()
plt.plot(theta_scan*180/np.pi, results)
plt.plot([estimated_angle_of_arrival], [0], 'rx')
plt.xlabel("Theta [Degrees]")
plt.ylabel("DOA Metric")
plt.grid()
plt.show()

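In practice we can only estimate angles between -90 and 90 degrees, because anything beyond that is behind our array.  A uniform linear array cannot distinguish an angle $\theta$ from $180° - \theta$, since both have the same value of $\sin\theta$.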

In [None]:
plt.figure()
plt.plot(theta_scan*180/np.pi, results)
plt.plot([estimated_angle_of_arrival], [0], 'rx')
plt.xlabel("Theta (Degrees)")
plt.ylabel("DOA Metric")
plt.xlim([-90, 90])
plt.grid()
plt.show()

Plot spectrum of our array

In [None]:
Nr = 32
d = 0.5
N_fft = 1000
angle_of_arrival = 20
theta = angle_of_arrival / 180 * np.pi
# conventional beamformer (Bartlett)
w = np.exp(-2j * np.pi * d * np.arange(Nr) * np.sin(theta))
w = np.conj(w)
w_padded = np.concatenate((w, np.zeros(N_fft - Nr)))
w_fft_dB = 10*np.log10(np.abs(np.fft.fftshift(np.fft.fft(w_padded)))**2)
w_fft_dB -= np.max(w_fft_dB)

# Map the FFT bins (spatial frequency d*sin(theta), in cycles per element) to angles in radians
theta_bins = np.arcsin(np.fft.fftshift(np.fft.fftfreq(N_fft)) / d)

theta_max = np.rad2deg(theta_bins[np.argmax(w_fft_dB)])

plt.figure()
plt.plot(np.rad2deg(theta_bins), w_fft_dB)
plt.plot([theta_max], [np.max(w_fft_dB)], 'rx')
plt.grid()
plt.show()


Angle of arrival estimation is the same thing as estimating the spatial spectrum of our array.  Bartlett's method, Capon's method, and MUSIC are all spectral estimation techniques that try to estimate the array spectrum.  The plot below overlays the two results.

In [None]:
fig, ax = plt.subplots()
ax.plot(np.rad2deg(theta_bins), w_fft_dB, label='angle spectrum')
plt.plot(np.rad2deg(theta_scan), results, label='angle of arrival') 
plt.grid()
plt.legend()
plt.xlim([-90, 90])
plt.xlabel('angle of arrival (deg)')
plt.show()

## Capon beamformer (minimum variance distortionless response)  

We'll now explore a more complicated beamformer known as the MVDR (Capon) beamformer.  The idea behind MVDR is to keep the signal at the angle of interest at a fixed gain of 1 (0 dB), while minimizing the total variance/power of the resulting beamformed signal. If the signal of interest is kept fixed then minimizing the total power means minimizing interferers and noise as much as possible.  

Using our same signal model and assumptions from before we write the beamformer output as:
$$\mathbf{y}(t) = \mathbf{w}^{H} \mathbf{x}(t)$$  
Last time we didn't fully expand this. Let's do that now by substituting the signal model $\mathbf{x} = \mathbf{a}\mathbf{f} + \mathbf{n}$:
$$\mathbf{y} = \mathbf{w}^{H} (\mathbf{a}\mathbf{f} + \mathbf{n})$$  
$$\mathbf{y} = \mathbf{w}^{H} \mathbf{a}\mathbf{f} + \mathbf{w}^{H} \mathbf{n}$$  
and, if the weights pass the signal of interest with unit gain ($\mathbf{w}^{H}\mathbf{a} = 1$),
$$\mathbf{y} = \mathbf{f} + \mathbf{w}^{H} \mathbf{n}$$  

The noise term will cause issues with our estimate.  What we want to do is minimize the variance of the noise term to minimize its power, but the issue is that what we sense is the signal plus noise (they are not separable).  This means that we really need to minimize the variance of the overall response $\mathbf{y}$.  Since our objective is to minimize the variance of $\mathbf{y}$ let's compute what the variance is:

$$Var(\mathbf{y}) = E[|\mathbf{y}|^2] - |E[\mathbf{y}]|^2$$
We assumed our signal and noise are zero mean which means we can ignore the second term
$$Var(\mathbf{y}) = E[|\mathbf{y}|^2]$$
$$Var(\mathbf{y}) = E[|\mathbf{w}^H\mathbf{x}|^2]$$
which is the same expression we arrived at for the Bartlett beamformer (with $\mathbf{w}$ in place of $\mathbf{a}$), which we know eventually becomes:
$$Var(\mathbf{y}) = \mathbf{w}^H\mathbf{R}\mathbf{w}$$


We'll now present an optimization problem to solve for this:
$$\min_{\mathbf{w}} \mathbf{w}^H\mathbf{R}\mathbf{w} \text{  subject to } \mathbf{w}^H\mathbf{a} = 1$$  

This requires a bit of effort (and Lagrange multipliers) to solve, but it can be shown that the solution for the weights is:
$$\mathbf{w} = \frac{\mathbf{R}^{-1}\mathbf{a}}{\mathbf{a}^{H}\mathbf{R}^{-1}\mathbf{a}}$$
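
As a quick check of this result, the sketch below computes the MVDR weights for a simulated scene and verifies two properties: the distortionless constraint $\mathbf{w}^H\mathbf{a} = 1$ holds, and the output power is no larger than that of scaled Bartlett weights $\mathbf{a}/N$, which satisfy the same constraint. The scene (one desired source, one stronger interferer, the noise level, and the $\sin\theta$ steering convention) is an assumption for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
n_elements, n_snapshots, d = 8, 4000, 0.5

def steering_vector(theta_deg):
    """ULA steering vector with the angle measured from broadside."""
    return np.exp(-2j * np.pi * d * np.arange(n_elements) * np.sin(np.deg2rad(theta_deg)))

# Assumed scene: desired source at 20 deg, stronger interferer at -35 deg
desired = np.exp(2j * np.pi * rng.random(n_snapshots))           # random-phase signal
interferer = 2.0 * np.exp(2j * np.pi * rng.random(n_snapshots))  # independent random phases
noise = 0.1 * (rng.standard_normal((n_elements, n_snapshots))
               + 1j * rng.standard_normal((n_elements, n_snapshots)))
x = np.outer(steering_vector(20.0), desired) + np.outer(steering_vector(-35.0), interferer) + noise
R = x @ x.conj().T / n_snapshots

a = steering_vector(20.0)
R_inv_a = np.linalg.solve(R, a)            # R^{-1} a without forming the inverse explicitly
w_mvdr = R_inv_a / (a.conj() @ R_inv_a)    # w = R^{-1} a / (a^H R^{-1} a)
w_bartlett = a / n_elements                # also satisfies w^H a = 1

def output_power(w):
    """Beamformer output power w^H R w."""
    return np.real(w.conj() @ R @ w)

closed_form_power = 1.0 / np.real(a.conj() @ R_inv_a)  # 1 / (a^H R^{-1} a)
print("w^H a          =", w_mvdr.conj() @ a)
print("MVDR power     =", output_power(w_mvdr), "closed form:", closed_form_power)
print("Bartlett power =", output_power(w_bartlett))
```

Using `np.linalg.solve` instead of an explicit inverse is a numerical design choice here; it gives the same weights when $\mathbf{R}$ is well conditioned.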

In [None]:
def w_mvdr(theta, r):
    s = np.exp(-2j * np.pi * d * np.arange(r.shape[0]) * np.sin(theta))
    s = s.reshape(-1,1)
    R = np.cov(r) # covariance matrix
    Rinv = np.linalg.pinv(R)
    # MVDR weights 
    w = (Rinv @ s)/(s.conj().T @ Rinv @ s)
    return w

def power_mvdr(theta, r):
    s = np.exp(-2j * np.pi * d * np.arange(r.shape[0]) * np.sin(theta))
    s = s.reshape(-1,1)
    R = np.cov(r)
    Rinv = np.linalg.pinv(R)
    return 1/(s.conj().T @ Rinv @ s).squeeze()

Nr = 32
theta1 = 20 / 180 * np.pi
theta2 = 30 / 180 * np.pi
theta3 = -40 / 180 * np.pi
s1 = np.exp(-2j * np.pi * d * np.arange(Nr) * np.sin(theta1)).reshape(-1,1)
s2 = np.exp(-2j * np.pi * d * np.arange(Nr) * np.sin(theta2)).reshape(-1,1)
s3 = np.exp(-2j * np.pi * d * np.arange(Nr) * np.sin(theta3)).reshape(-1,1)

tone1 = np.exp(2j*np.pi*0.01e6*t).reshape(1,-1)
tone2 = np.exp(2j*np.pi*0.02e6*t).reshape(1,-1)
tone3 = np.exp(2j*np.pi*0.03e6*t).reshape(1,-1)
r = s1 @ tone1 + s2 @ tone2 + 0.1 * s3 @ tone3
n = np.random.randn(Nr, N) + 1j*np.random.randn(Nr, N)
r = r + 0.05*n

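theta_scan = np.linspace(-1*np.pi, np.pi, 1000)
results = []
for theta_i in theta_scan:
    w = w_mvdr(theta_i, r)
    r_weighted = w.conj().T @ r
    # power measured directly from the beamformed output
    power_dB = 10*np.log10(np.var(r_weighted))
    # the closed-form MVDR power should match power_dB; np.abs drops the
    # negligible imaginary part left over from the complex arithmetic
    results.append(10*np.log10(np.abs(power_mvdr(theta_i, r))))

results = np.array(results)
results -= np.max(results)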



# fig, ax = plt.subplots(subplot_kw={'projection': 'polar'})
# ax.set_ylim([-10, 0])
plt.figure()
plt.plot(np.rad2deg(theta_scan), results)
plt.xlim([-90, 90])
plt.show()

Compare capon spectral estimate to array spectrum 

In [None]:
Nr = 32
d = 0.5
N_fft = 1000
angle_of_arrival = 20
theta = angle_of_arrival / 180 * np.pi
theta1 = 20 / 180 * np.pi
theta2 = 30 / 180 * np.pi
theta3 = -40 / 180 * np.pi
s1 = np.exp(-2j * np.pi * d * np.arange(Nr) * np.sin(theta1))
s2 = np.exp(-2j * np.pi * d * np.arange(Nr) * np.sin(theta2))
s3 = np.exp(-2j * np.pi * d * np.arange(Nr) * np.sin(theta3))
w = s1 + s2 + s3
w = np.conj(w)
w_padded = np.concatenate((w, np.zeros(N_fft - Nr)))
w_fft_dB = 10*np.log10(np.abs(np.fft.fftshift(np.fft.fft(w_padded)))**2)
w_fft_dB -= np.max(w_fft_dB)
plt.figure()
plt.plot(np.rad2deg(theta_bins), w_fft_dB)
plt.plot(np.rad2deg(theta_scan), results)
plt.xlim([-90, 90])
plt.show()

## MUSIC
MUltiple SIgnal Classification (MUSIC) is an angle of arrival (not beamforming) approach for estimating the angle of arrival of a signal.  The previous two techniques fell into the "delay and sum" category, but MUSIC is known as a subspace technique.  

MUSIC assumes that you know the number of sources in your signal and will categorize $N$ signals as the sources and the others as noise.  How it achieves this is by performing an eigen decomposition on the estimated signal covariance matrix and grouping $N$ eigenvectors with the $N$ strongest eigenvalues as the signal subspace and the other remaining eigenvectors as noise.  

We saw that the signals covariance matrix $\mathbf{R}$ is computed by:
$$\mathbf{R} = \mathbf{a}\mathbf{R}_{s}\mathbf{a}^{H}$$  

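The derivation is somewhat involved, so I'll skip to the result:
$$\theta = \underset{\theta}{\operatorname{arg\,max}}{\left(\frac{1}{\mathbf{a}^{H}\mathbf{V}_{n}\mathbf{V}_{n}^{H}\mathbf{a}}\right)}$$  

Where $\mathbf{V}_{n}$ is the matrix whose columns are the noise-subspace eigenvectors and $\mathbf{a}$ is the steering vector evaluated at the candidate angle $\theta$.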

When a steering vector is orthogonal to the noise subspace, the denominator is zero and the peak of the pseudospectrum is infinite (remember that the dot product of two orthogonal vectors is 0). In practice, because there is noise, and because the true covariance matrix is estimated by the sample covariance matrix, the arrival vectors are never exactly orthogonal to the noise subspace, so the peaks are finite. The angles at which the MUSIC pseudospectrum peaks are the sources' angles of arrival. Because the pseudospectrum can have more peaks than there are sources, the algorithm requires that you specify the number of sources, $D$, as a parameter. The algorithm then picks the $D$ largest peaks.
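
The orthogonality argument can be checked numerically. The sketch below is a minimal illustration under assumed values (8 elements, two uncorrelated unit-power sources at 20 and -40 degrees, no noise). It builds the noise-free covariance, splits off the noise subspace with `np.linalg.eigh`, and evaluates the denominator $\mathbf{a}^{H}\mathbf{V}_{n}\mathbf{V}_{n}^{H}\mathbf{a}$ at a source angle and at an empty direction.

```python
import numpy as np

Nr = 8    # number of array elements (assumed for this sketch)
d = 0.5   # element spacing in wavelengths
D = 2     # number of sources

def steering_vector(theta_deg):
    """Steering vector of a uniform linear array as an (Nr, 1) column."""
    theta = np.deg2rad(theta_deg)
    return np.exp(-2j * np.pi * d * np.arange(Nr) * np.sin(theta)).reshape(-1, 1)

# Noise-free covariance of two uncorrelated unit-power sources
A = np.hstack([steering_vector(20), steering_vector(-40)])
R = A @ A.conj().T

# eigh returns eigenvalues in ascending order for a Hermitian matrix
eigvals, eigvecs = np.linalg.eigh(R)
V_n = eigvecs[:, :Nr - D]  # noise subspace: the Nr - D smallest eigenvalues

def denominator(theta_deg):
    """a^H V_n V_n^H a, the squared length of a projected onto the noise subspace."""
    a = steering_vector(theta_deg)
    return np.real(a.conj().T @ V_n @ V_n.conj().T @ a).item()

at_source = denominator(20)   # steering vector of a true source
off_source = denominator(0)   # steering vector of a direction with no source

print(at_source, off_source)
```

In this noise-free case the denominator at the source angle is zero up to rounding, which is why the pseudospectrum blows up there, while the empty direction keeps most of its energy in the noise subspace.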

In [None]:
Nr = 32
theta1 = 20 / 180 * np.pi
theta2 = 25 / 180 * np.pi
theta3 = -40 / 180 * np.pi
s1 = np.exp(-2j * np.pi * d * np.arange(Nr) * np.sin(theta1)).reshape(-1,1)
s2 = np.exp(-2j * np.pi * d * np.arange(Nr) * np.sin(theta2)).reshape(-1,1)
s3 = np.exp(-2j * np.pi * d * np.arange(Nr) * np.sin(theta3)).reshape(-1,1)
tone1 = np.exp(2j*np.pi*0.01e6*t).reshape(1,-1)
tone2 = np.exp(2j*np.pi*0.02e6*t).reshape(1,-1)
tone3 = np.exp(2j*np.pi*0.03e6*t).reshape(1,-1)
r = s1 @ tone1 + s2 @ tone2 + 0.1 * s3 @ tone3
n = np.random.randn(Nr, N) + 1j*np.random.randn(Nr, N)
r = r + 0.05*n

# MUSIC Algorithm
num_expected_signals = 3
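R = r @ r.conj().T  # sample covariance matrix (unnormalized; scaling does not change the eigenvectors)
w, v = np.linalg.eig(R)  # eigenvalues in w, eigenvectors in the columns of v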

fig, (ax1) = plt.subplots(1, 1, figsize=(7, 3))
ax1.plot(10*np.log10(np.abs(w)),'.-')
ax1.set_xlabel('Index')
ax1.set_ylabel('Eigenvalue [dB]')
plt.show()

eig_val_order = np.argsort(np.abs(w))  # indices that sort the eigenvalues from smallest to largest
v = v[:, eig_val_order]  # reorder the eigenvectors to match
# Noise subspace: eigenvectors of the (Nr - num_expected_signals) smallest eigenvalues
V = v[:, :(Nr - num_expected_signals)]


theta_scan = np.linspace(-1*np.pi, np.pi, 1000)
results = []
for theta_i in theta_scan:
    s = np.exp(-2j * np.pi * d * np.arange(Nr) * np.sin(theta_i)).reshape(-1,1)
    # MUSIC pseudospectrum
    metric = 1 / (s.conj().T @ V @ V.conj().T @ s)
    metric = np.abs(metric.squeeze())
    metric = 10*np.log10(metric)
    results.append(metric)
results = np.array(results) - np.max(results)  # normalize so the peak sits at 0 dB

plt.figure()
plt.plot(np.rad2deg(theta_scan), results)
plt.xlim([-90, 90])
plt.show()
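
The cell above plots the pseudospectrum but leaves the last step, picking the largest peaks, to the eye. A minimal sketch of that step follows. `pick_largest_peaks` is a hypothetical helper, and the array size, source angles, and noise level are assumptions chosen for illustration. The scan is limited to ±90 degrees because a linear array cannot distinguish $\theta$ from $180^\circ - \theta$.

```python
import numpy as np

def pick_largest_peaks(spectrum_dB, angles_deg, num_sources):
    """Hypothetical helper: angles of the num_sources largest local maxima."""
    s = np.asarray(spectrum_dB)
    # Interior samples greater than the left neighbour and not less than the right
    is_peak = (s[1:-1] > s[:-2]) & (s[1:-1] >= s[2:])
    peak_idx = np.flatnonzero(is_peak) + 1
    # Keep the highest peaks, then report them in ascending angle order
    strongest = peak_idx[np.argsort(s[peak_idx])[::-1][:num_sources]]
    return np.sort(np.asarray(angles_deg)[strongest])

rng = np.random.default_rng(0)
Nr, d, N, D = 16, 0.5, 2000, 2   # assumed array and signal parameters
n = np.arange(Nr).reshape(-1, 1)
t = np.arange(N).reshape(1, -1)

# Two tones from 20 and -40 degrees plus complex white noise
r = np.zeros((Nr, N), dtype=complex)
for angle_deg, freq in [(20.0, 0.01), (-40.0, 0.02)]:
    a = np.exp(-2j * np.pi * d * n * np.sin(np.deg2rad(angle_deg)))
    r += a @ np.exp(2j * np.pi * freq * t)
r += 0.05 * (rng.standard_normal((Nr, N)) + 1j * rng.standard_normal((Nr, N)))

# Noise subspace from the sample covariance (eigh sorts eigenvalues ascending)
R = r @ r.conj().T / N
eigvals, eigvecs = np.linalg.eigh(R)
V_n = eigvecs[:, :Nr - D]

# MUSIC pseudospectrum on a 0.25 degree grid, evaluated for all angles at once
angles_deg = np.linspace(-90, 90, 721)
A_scan = np.exp(-2j * np.pi * d * n * np.sin(np.deg2rad(angles_deg)).reshape(1, -1))
denom = np.sum(np.abs(V_n.conj().T @ A_scan) ** 2, axis=0)
spectrum_dB = 10 * np.log10(1 / denom)

estimated_deg = pick_largest_peaks(spectrum_dB, angles_deg, D)
print(estimated_deg)
```

The accuracy of the estimate is bounded by the scan grid, so a finer grid or interpolation around each peak would be needed for sub-grid resolution.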


Introduce ideas of adaptive beamforming by training the covariance (least mean squares, MMSE (Wiener filter/STAP solution)).  Talk about 2D planar arrays and what 2D lets you do (estimate azimuth and elevation).

In [None]:
%matplotlib widget
from matplotlib.animation import FuncAnimation

# Parameters
num_elements = 8  # Number of array elements
wavelength = 1.0  # Wavelength of the signal
d = 0.5 * wavelength  # Distance between array elements
theta_target = 0  # np.pi / 4  # Target beam direction (in radians)
c = 3e8  # Speed of light
f = c / wavelength  # Frequency of the signal
omega = 2 * np.pi * f  # Angular frequency

# Time vector
t = np.linspace(0, 10 / f, 100)

# Spatial grid for visualization
extent = 20
x = np.linspace(-extent, extent, 200)
y = np.linspace(-extent, extent, 200)
X, Y = np.meshgrid(x, y)

# Array element positions
positions = np.array([[(i - (num_elements - 1) / 2) * d, 0] for i in range(num_elements)])  # centered on the origin

# Steering delays for target direction
def steering_delay(position, theta):
    return np.sin(theta) * position[0] / c

# Compute the wave field
def wave_field(X, Y, t):
    field = np.zeros_like(X, dtype=np.complex128)
    for pos in positions:
        delay = steering_delay(pos, theta_target)
        distance = np.sqrt((X - pos[0])**2 + (Y - pos[1])**2)
        field += np.exp(1j * (2 * np.pi * distance / wavelength - omega * (t - delay)))
    return np.real(field)

# Create the figure and axis
fig, ax = plt.subplots(figsize=(6, 6))
ax.set_xlim(-5, 5)
ax.set_ylim(-5, 5)
ax.set_title("Beamforming Wave Animation")
ax.set_xlabel("x")
ax.set_ylabel("y")

# Initialize the wave plot
im = ax.imshow(np.zeros_like(X), extent=(-extent, extent, -extent, extent), origin='lower', cmap='viridis', animated=True)  # extent must match the X, Y grid

# Animation function
def update(frame):
    field = wave_field(X, Y, t[frame])
    im.set_data(field)
    im.set_clim(vmin=np.min(field), vmax=np.max(field))
    return im,

# Create the animation
ani = FuncAnimation(
    fig, update, frames=len(t), interval=50, blit=True
)

plt.show()
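
As a non-interactive check of the steering delays used in the animation, the sketch below samples the same field expression on a distant arc at $t = 0$ and finds the angle where the summed field is strongest. The steering angle of 30 degrees, the arc radius, and the centered element positions are assumptions made for this check; the cell above uses `theta_target = 0`.

```python
import numpy as np

num_elements = 8
wavelength = 1.0
d = 0.5 * wavelength
c = 3e8
f = c / wavelength
omega = 2 * np.pi * f
theta_target = np.deg2rad(30)  # assumed steering angle for this check

# Element x-positions, centered on the origin
x_elem = (np.arange(num_elements) - (num_elements - 1) / 2) * d
delays = np.sin(theta_target) * x_elem / c  # same rule as steering_delay()

# Sample the field on a far-away arc; the angle is measured from the +y axis (broadside)
radius = 1000 * wavelength
angles = np.deg2rad(np.linspace(-90, 90, 721))
px = radius * np.sin(angles)
py = radius * np.cos(angles)

field = np.zeros_like(angles, dtype=np.complex128)
for x_n, delay in zip(x_elem, delays):
    distance = np.sqrt((px - x_n) ** 2 + py ** 2)
    # Same phase expression as wave_field(), evaluated at t = 0
    field += np.exp(1j * (2 * np.pi * distance / wavelength + omega * delay))

peak_angle_deg = np.rad2deg(angles[np.argmax(np.abs(field))])
print(peak_angle_deg)
```

If the delays are applied with the sign used in `wave_field`, the element contributions add in phase along the target direction, so the strongest field should appear close to the chosen steering angle.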
