# Introduction to Gravitational Lensing

FIXME redo with physical deflection angle, reduced deflection angle, and dimensionless deflection angle???

Here we provide a primer course on gravitational lensing. It includes interactive examples using `caustics` to demo various effects.

In [None]:
%load_ext autoreload
%autoreload 2

import torch
import matplotlib.pyplot as plt
import matplotlib.animation as animation
from IPython.display import HTML
import numpy as np

import caustics

In [None]:
# setup stuff to define a grid of points to make images

n_pix = 100
res = 0.05
upsample_factor = 2
z_l = torch.tensor(0.5, dtype=torch.float32)
z_s = torch.tensor(1.0, dtype=torch.float32)

## Lensing Formalism

### Configuration of a Lens

Before we get too deep into the weeds, let's set the stage. A photon has arrived in your telescope and you are wondering where it came from. Well, if it encountered some mass on its path to you (and really what photon hasn't?) then the path it took might look something like this:

![Lens configuration](assets/lensconfiguration.png)

*Figure taken from Bartelmann and Schneider 2001 (figure 11)*

The center of your image is your (0,0) coordinate and this corresponds to the *optical axis*, which is the dashed line. The source of the photon is said to exist in the *source plane*, which is a distance of $D_s$ from the observer. The mass that deflected the photon is said to exist in the *lens plane*, which is a distance $D_l$ (I use $D_l$ rather than $D_d$ for the sake of my sanity) from the observer and $D_{ls}$ from the source plane. Finally, you receive the photons in the *image plane* at the observer. Note that cosmological distances are so large that these may be very accurately treated as infinitely thin planes, even if the source and the mass are in fact 3D structures. We'll talk a little about extended mass distributions in the [Multiplane Lensing](#multiplane-lensing) section. In the source plane the photons are coming from a point $\vec{\eta}$ away from the optical axis; this is a physical distance you could measure in kilometers if you like. In the lens plane, the photon intersected at $\vec{\xi}$ away from the optical axis (again in physical units). In the image plane we can say the photon was essentially at the optical axis, though in reality it was probably a few centimeters or meters away from it when it struck the telescope.

As an observer, you measure everything in angular units. The photon is the solid line which you see at some position $\vec{\theta}$ off from the optical axis. Had there been no mass intercepting the photon, it would have never reached your telescope, instead a photon coming from the same source would have arrived at $\vec{\beta}$ angular units from your optical axis. The correction at the source plane which shows how much the photon deviated from its original path is called the *deflection angle* $\hat{\vec{\alpha}}$ and it is essentially a measure of the impact lensing has had on your observation. Note that in the figure the deflection angle is drawn as if the photon traveled backwards, leaving your telescope at angle $\vec{\theta}$ getting to the lens plane, getting deflected by $\hat{\vec{\alpha}}$, and finally arriving at the source plane. This is on purpose since the geometry of lensing works in either direction, but computationally it is easier to go backwards (we'll see why in [the lens equation section](#the-lens-equation)).

These are the names and symbols used to describe the configuration of most lensing systems. So far we haven't told you how any of this came to be, but hopefully it will give you a reference point to understand what comes next.

### The Lens Equation

With some simple trigonometry it is possible to relate the physical sizes and the angular sizes:

$$\vec{\eta} = \vec{\beta} D_s$$

$$\vec{\xi} = \vec{\theta} D_l$$

Just note that the distances must be in angular diameter distances for these relations to hold (essentially by definition). We can also go a step further and get $\vec{\eta}$ using $\vec{\theta}$ and $\hat{\vec{\alpha}}$:

$$\vec{\eta} = \vec{\theta} D_s - \hat{\vec{\alpha}} D_{ls}$$

From here we get the lens equation using the two $\vec{\eta}$ formulas:

$$\vec{\beta} D_s = \vec{\theta} D_s - \hat{\vec{\alpha}} D_{ls}$$

We are tantalizingly close to just cancelling those $D_s$ distances. How about we make a definition $\vec{\alpha} \equiv \frac{D_{ls}}{D_s}\hat{\vec{\alpha}}$ then we can write the lens equation as:

$$\vec{\beta} = \vec{\theta} - \vec{\alpha}$$

Excellent! Just note that we had to change $\hat{\vec{\alpha}}$ which is the real angular deflection angle that we can do geometry with, to now be $\vec{\alpha}$ which has a convenient refactoring such that the equation is easier to work with. The modified $\vec{\alpha}$ is called the *reduced deflection angle* and usually other quantities that have had a similar transformation to make the lens equation come out nice are called reduced quantities.
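As a small numerical illustration of this rescaling, the sketch below converts a physical deflection angle into a reduced one and then applies the lens equation. The function names and all of the numbers are made up for illustration; none of this is part of `caustics`.

```python
def reduced_deflection(alpha_hat, D_ls, D_s):
    """Rescale the physical deflection angle by D_ls / D_s."""
    return alpha_hat * D_ls / D_s


def source_position(theta, alpha):
    """Lens equation: beta = theta - alpha (all angles in the same units)."""
    return theta - alpha


# Hypothetical angular diameter distances in Mpc; only their ratio matters here
D_s, D_ls = 2000.0, 1200.0
theta = 1.5  # observed image position [arcsec]
alpha_hat = 2.0  # physical deflection angle [arcsec]

alpha = reduced_deflection(alpha_hat, D_ls, D_s)
beta = source_position(theta, alpha)
print(f"alpha = {alpha:.2f} arcsec, beta = {beta:.2f} arcsec")
```

Only the ratio $D_{ls}/D_s$ enters, which is why the reduced angle is the convenient one to carry around.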

This is a very nice and simple equation; one might be fooled into thinking it is easy to work with. Now let's explain why one typically works backwards, with photons "coming from" the observer and going to the source plane. The deflection angle for almost all lenses is a function of the position in the sky (more deflection closer to the mass, less deflection further away), meaning that it is in fact a function $\vec{\alpha}(\vec{\theta})$. So to compute $\vec{\beta}$ when you know $\vec{\theta}$ means simply computing $\vec{\alpha}(\vec{\theta})$ and then taking the difference $\vec{\theta} - \vec{\alpha}(\vec{\theta})$. But if you want to compute $\vec{\theta}$ only knowing $\vec{\beta}$, then you will need to search around for a value to input into $\vec{\alpha}(\vec{\theta})$ until you can get the equation to be satisfied, which means you need to know $\vec{\theta}$ to compute $\vec{\theta}$! This is a source of great consternation among gravitational lensing researchers, but it is a fact we must deal with. In the next sections we'll get into more specifics on all the interesting results that come from this simple equation.
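To make the asymmetry concrete, here is a minimal sketch for a point mass on the optical axis, restricted to one dimension, where the reduced deflection is $R_e^2/\theta$ (this form appears in the point mass section). The forward direction is a single function evaluation, while the inverse direction needs a root finder. The bisection helper, the bracket values, and the choice $R_e = 1$ are illustrative assumptions rather than anything `caustics` provides.

```python
import math

R_E = 1.0  # Einstein radius [arcsec], illustrative value


def beta_of_theta(theta):
    """Forward direction: one evaluation of the lens equation."""
    return theta - R_E**2 / theta


def bisect(func, lo, hi, tol=1e-12):
    """Plain bisection; assumes func(lo) and func(hi) have opposite signs."""
    f_lo = func(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = func(mid)
        if (f_lo < 0) == (f_mid < 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


beta = 0.5  # true source position [arcsec]

# Inverse direction: search for theta such that beta_of_theta(theta) == beta.
# The deflection diverges at theta = 0, so each side is bracketed separately.
theta_plus = bisect(lambda t: beta_of_theta(t) - beta, 1e-6, 10.0)
theta_minus = bisect(lambda t: beta_of_theta(t) - beta, -10.0, -1e-6)

# For this particular lens the two images are also known in closed form
root = math.sqrt(beta**2 + 4 * R_E**2)
print(theta_plus, 0.5 * (beta + root))
print(theta_minus, 0.5 * (beta - root))
```

Note that one source position already maps to two image positions here. A closed form like this is the exception; for most lenses only the numerical search is available.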

### Lens Convergence

Lensing is sensitive to all mass on the line of sight. Typically, we describe astronomical massive objects via a 3D density $\rho(\vec{\theta},z)$, but in gravitational lensing we may describe objects as existing on an infinitely thin lens plane. The relevant quantity is then the *surface density* which we may determine by integrating over $z$ to get:

$$\Sigma(\vec{\theta}) = \int\rho(\vec{\theta}, z)dz$$

This is often normalized by the *critical surface density* at the lens plane $\Sigma_{cr} = \frac{c^2D_s}{4\pi G D_lD_{ls}}$ which is a cosmological quantity. This allows us to compute a dimensionless surface density or *convergence*:

$$\kappa(\vec{\theta}) = \frac{\Sigma(\vec{\theta})}{\Sigma_{cr}}$$

The convergence is very useful as it is a dimensionless way to describe the cumulative mass distribution along the line of sight (at least within our lens plane; we'll get more into this in the [multiplane section](#multiplane-lensing)). It is the quantity most easily translated to, say, numerical simulations, since we can simply take a projected mass histogram and divide by the critical surface density to get the convergence.
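The projected mass histogram idea can be sketched in a few lines of NumPy. Everything below is a toy: the particle positions are random draws rather than simulation output, and `sigma_cr` is a placeholder number, not a value computed from a cosmology.

```python
import numpy as np

rng = np.random.default_rng(0)

# Mock "simulation": particle positions [kpc] and equal masses [M_sun]
n_particles = 10_000
x = rng.normal(0.0, 5.0, n_particles)
y = rng.normal(0.0, 3.0, n_particles)
mass = np.full(n_particles, 1e8)

# Placeholder critical surface density [M_sun / kpc^2]; in practice it
# follows from the cosmology and the lens and source redshifts
sigma_cr = 2.0e9

# Projected mass histogram on a regular grid (rows are y, columns are x)
half_width = 20.0  # [kpc]
n_bins = 64
edges = np.linspace(-half_width, half_width, n_bins + 1)
mass_map, _, _ = np.histogram2d(y, x, bins=(edges, edges), weights=mass)

pixel_area = (edges[1] - edges[0]) ** 2  # [kpc^2]
surface_density = mass_map / pixel_area  # Sigma [M_sun / kpc^2]
kappa = surface_density / sigma_cr  # dimensionless convergence

print(kappa.shape, kappa.max())
```

Dividing by the pixel area is the step that is easy to forget: the histogram holds mass per pixel, while $\Sigma$ is mass per unit area.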

Below we have plotted some convergence maps for various lensing mass distributions. In the second subfigure we included a convergence map computed from a numerical simulation (TNG) by essentially taking a 2D histogram of particle masses.

In [None]:
cosmology = caustics.FlatLambdaCDM()
theta_x, theta_y = caustics.utils.meshgrid(res, n_pix)

fig, axarr = plt.subplots(1, 5, figsize=(25, 5))
plt.subplots_adjust(wspace=0.01)
fig.suptitle("Lens Convergence Examples")
lenses = [
    caustics.SIE(cosmology=cosmology, x0=0, y0=0, q=0.6, phi=np.pi / 3, b=1, z_l=z_l),
    caustics.PixelatedConvergence(
        cosmology=cosmology,
        x0=0,
        y0=0,
        convergence_map=np.load("assets/kappa_maps.npz")["kappa_maps"][1],
        pixelscale=res,
        z_l=z_l,
    ),
    caustics.NFW(cosmology=cosmology, x0=0, y0=0, m=1e12, c=5, z_l=z_l),
    caustics.PseudoJaffe(
        cosmology=cosmology,
        x0=0,
        y0=0,
        mass=1e12,
        core_radius=0.2,
        scale_radius=1.0,
        z_l=z_l,
    ),
    caustics.Multipole(
        cosmology=cosmology, m=(3,), x0=0, y0=0, a_m=[2], phi_m=[np.pi / 3], z_l=z_l
    ),
]
for i, lens in enumerate(lenses):
    k = lens.convergence(theta_x, theta_y, z_s=z_s)
    k = k.log() if i < 4 else k.tanh()
    axarr[i].imshow(k.numpy())
    axarr[i].set_title(lens.__class__.__name__)
    axarr[i].axis("off")
plt.show()

### Lens Potential

There are many ways to describe a gravitational lens, with various advantages. Here we will describe the lensing potential $\Psi$, which is much like the gravitational potential except projected into the 2D plane of our infinitely thin lens:

$$\hat{\Psi}(\vec{\theta}) = \frac{2D_{ls}}{D_lD_sc^2}\int\phi(D_l\vec{\theta},z)dz$$

And the dimensionless version of this potential is:

$$\Psi(\vec{\theta}) = \frac{D_l^2}{\xi_0^2}\hat{\Psi}$$

Here $\xi_0$ is an arbitrary reference length in the lens plane. A convenient property of the lensing potential is that it encodes all the lens properties locally.

### Relating the deflection angle, convergence, and potential

The simplest to convert is the potential:

$$\vec{\alpha}(\vec{\theta}) = \vec{\nabla}_{\vec{\theta}}\Psi(\vec{\theta})$$

$$\kappa(\vec{\theta}) = \frac{1}{2}\triangle_{\vec{\theta}}\Psi(\vec{\theta})$$

The potential encodes the lensing information locally, what this means is that one may take derivatives to get the other quantities. There is no need to perform any integrals over all space with the potential. This is a useful property, especially in `caustics` where all functions are automatically differentiable.
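As a rough check of these two relations, the sketch below differentiates a potential numerically and compares against what we expect for a point mass. It assumes the point-mass potential $\Psi = R_e^2\ln|\vec{\theta}|$ and uses central finite differences as a stand-in for automatic differentiation; the helper names and the step size are illustrative choices.

```python
import math

R_E = 1.0  # Einstein radius [arcsec], illustrative value
H = 1e-4  # finite-difference step [arcsec]


def potential(x, y):
    """Point-mass lensing potential, Psi = R_E^2 ln|theta|."""
    return R_E**2 * math.log(math.hypot(x, y))


def deflection(x, y):
    """alpha = grad(Psi), by central differences."""
    ax = (potential(x + H, y) - potential(x - H, y)) / (2 * H)
    ay = (potential(x, y + H) - potential(x, y - H)) / (2 * H)
    return ax, ay


def convergence(x, y):
    """kappa = 0.5 * laplacian(Psi), by a five-point stencil."""
    lap = (
        potential(x + H, y)
        + potential(x - H, y)
        + potential(x, y + H)
        + potential(x, y - H)
        - 4 * potential(x, y)
    ) / H**2
    return 0.5 * lap


px, py = 0.8, 0.6  # a position with |theta| = 1
ax, ay = deflection(px, py)
print(math.hypot(ax, ay))  # close to R_E^2 / |theta|
print(convergence(px, py))  # close to zero away from the mass
```

The convergence vanishes everywhere except at the location of the mass itself, which is what we expect for a point mass.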

Next we can see how the convergence transforms:

$$\Psi(\vec{\theta}) = \frac{1}{\pi}\int\kappa(\vec{\theta}')\ln|\vec{\theta} - \vec{\theta}'|d\vec{\theta}'$$

$$\vec{\alpha}(\vec{\theta}) = \frac{1}{\pi}\int\kappa(\vec{\theta}')\frac{\vec{\theta}-\vec{\theta}'}{|\vec{\theta}-\vec{\theta}'|^2}d\vec{\theta}'$$

Opposite to the potential, one must perform integrals over all space to convert the convergence into other quantities. One convenience is that these integrals are framed as convolutions, so one may use efficient numerical algorithms (Fast Fourier Transforms) to compute reasonable approximations of these integrals.
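Here is a minimal sketch of the FFT approach for the deflection angle, using the kernel $(\vec{\theta}-\vec{\theta}')/(\pi|\vec{\theta}-\vec{\theta}'|^2)$ sampled on the pixel grid. It is plain NumPy written for this tutorial, not how `caustics` implements it, and it treats every pixel as a point mass at its center, so it is only an approximation. The function name and the test map are made up.

```python
import numpy as np


def deflection_from_convergence(kappa, pixelscale):
    """Approximate the reduced deflection angle of a square convergence map.

    Uses zero-padded FFTs so that the result matches a direct (non-periodic)
    sum over pixels.
    """
    n = kappa.shape[0]
    # Kernel sampled at every possible pixel separation, -(n-1) ... +(n-1)
    offsets = np.arange(-(n - 1), n) * pixelscale
    dx, dy = np.meshgrid(offsets, offsets, indexing="xy")
    r2 = dx**2 + dy**2
    r2[n - 1, n - 1] = np.inf  # a pixel does not deflect rays at its own center
    kernel_x = dx / (np.pi * r2)
    kernel_y = dy / (np.pi * r2)

    # Pad to the full linear-convolution size: n + (2n - 1) - 1
    shape = (3 * n - 2, 3 * n - 2)
    kappa_ft = np.fft.rfft2(kappa, s=shape)
    conv_x = np.fft.irfft2(kappa_ft * np.fft.rfft2(kernel_x, s=shape), s=shape)
    conv_y = np.fft.irfft2(kappa_ft * np.fft.rfft2(kernel_y, s=shape), s=shape)

    # The zero-separation kernel entry sits at index n - 1, so crop from there
    crop = slice(n - 1, 2 * n - 1)
    area = pixelscale**2  # area element of one pixel
    return conv_x[crop, crop] * area, conv_y[crop, crop] * area


# A single "hot" pixel acts like a point mass at the map center
n_side, pix = 21, 0.1
kappa_map = np.zeros((n_side, n_side))
kappa_map[10, 10] = 1.0
defl_x, defl_y = deflection_from_convergence(kappa_map, pix)
print(defl_x[10, 15], defl_y[10, 15])
```

The zero padding matters: without it the FFT would wrap the map around periodically and the lens would feel copies of itself on every side.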

It is not common to convert from the deflection angle to either the potential or convergence as this would involve inverting the above equations. This cannot be done in general and so an iterative method would likely be needed.

## Deflection of a Point Mass

In this tutorial we will not be too concerned with the derivation of lensing quantities, instead focusing on intuition and core concepts (see [Fleury et al. 2022](https://iopscience.iop.org/article/10.1088/1361-6382/abea2d), the [Meneghetti lecture notes](https://www.ita.uni-heidelberg.de/~jmerten/misc/meneghetti_lensing.pdf), and [Bartelmann and Schneider 2001](https://doi.org/10.1016/S0370-1573(00)00082-X)). So now let's see the deflection angle for a point mass:

$$|\hat{\vec{\alpha}}| = \frac{4GM}{c^2D_l|\vec{\theta}|}$$

Here $D_l|\vec{\theta}| = |\vec{\xi}|$ is the physical impact parameter in the lens plane. We use $|\hat{\vec{\alpha}}|$ and $|\vec{\theta}|$ because angular positions on the sky are 2D vectors, but in this equation we only need to know the magnitudes. A point mass is symmetric about its center with the deflection always pointing inwards. For simplicity we have assumed the point mass is on the optical axis, though you can of course translate the point mass to other parts of the field of view. Notice that the deflection drops off with distance from the point mass; it should not be too surprising that the amount of deflection decreases further from the mass.

The above deflection is the physical deflection angle which is in regular units. If we convert to reduced deflection angle then things become simpler (as usual).

$$|\vec{\alpha}| = \frac{R_e^2}{|\vec{\theta}|}$$

where $R_e$ is the Einstein radius, which is the angular radius at which the point mass has a critical curve with (technically) infinite magnification. We'll get to [magnification](#magnification-and-shear) in a later section. The Einstein radius is defined as:

$$R_e = \sqrt{\frac{4GMD_{ls}}{c^2D_lD_s}}$$
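To get a feel for the numbers, the sketch below evaluates this formula for a galaxy-scale mass. The constants are rounded, and the three distances are hypothetical values picked for illustration (they are not computed from a cosmology; note that angular diameter distances do not simply add, so $D_{ls}$ is specified separately).

```python
import math

G = 6.674e-11  # gravitational constant [m^3 kg^-1 s^-2]
C = 2.998e8  # speed of light [m / s]
M_SUN = 1.989e30  # solar mass [kg]
MPC = 3.086e22  # megaparsec [m]
RAD_TO_ARCSEC = 180.0 / math.pi * 3600.0


def einstein_radius(mass_kg, D_l, D_s, D_ls):
    """Einstein radius of a point mass in radians (distances in metres)."""
    return math.sqrt(4 * G * mass_kg * D_ls / (C**2 * D_l * D_s))


# Hypothetical lens: 1e12 solar masses with made-up angular diameter distances
r_e = einstein_radius(
    1e12 * M_SUN, D_l=1000 * MPC, D_s=2000 * MPC, D_ls=1200 * MPC
)
print(f"Einstein radius: {r_e * RAD_TO_ARCSEC:.2f} arcsec")
```

The result is of order an arcsecond, which is why galaxy-scale lensing is studied with images that are only a few arcseconds across.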

Let's see what it looks like for something to be lensed by a point mass. We will use a Gaussian blob on the source plane and have it pass behind the point mass.

In [None]:
# Define a gaussian blob for the source plane
source = caustics.Sersic(q=1.0, phi=0.0, n=0.5, Re=0.3, Ie=1.0)
# Define a cosmology for the lensing
cosmology = caustics.FlatLambdaCDM()
# Define a point mass for the lens plane
lens = caustics.Point(cosmology=cosmology, x0=0.0, y0=0.0, th_ein=1.0, z_l=z_l)

# Make a bunch of theta values at which to raytrace
theta_x, theta_y = caustics.utils.meshgrid(res, n_pix)
# Evaluate the reduced deflection angles
alpha_x, alpha_y = lens.reduced_deflection_angle(theta_x, theta_y, z_s)
# Compute the lens equation at these coordinates
beta_x, beta_y = theta_x - alpha_x, theta_y - alpha_y

# Evaluate the brightness with the source moving along the x-axis
source_x = torch.linspace(-2, 2, 100)
source_y = torch.zeros(100)
unlensed_images = torch.vmap(lambda x: source.brightness(theta_x, theta_y, x))(
    torch.stack((source_x, source_y), dim=1)
)
lensed_images = torch.vmap(lambda x: source.brightness(beta_x, beta_y, x))(
    torch.stack((source_x, source_y), dim=1)
)

In [None]:
# Visualize the reduced deflection angle
fig, (ax1, ax2, ax3) = plt.subplots(1, 3, figsize=(10, 4))
ax1.imshow(alpha_x, origin="lower", cmap="seismic")
ax1.set_title("Reduced deflection angle [$\\alpha_x$]")
ax1.axis("off")
im = ax2.imshow(alpha_y, origin="lower", cmap="seismic")
fig.colorbar(im, ax=ax3, label="deflection magnitude [arcsec]")
ax3.axis("off")
ax2.set_title("Reduced deflection angle [$\\alpha_y$]")
ax2.axis("off")
plt.show()

In [None]:
# Create animation
fig, (ax1, ax2) = plt.subplots(2, 1, figsize=(4, 8))

# Display the first frame of the image in the first subplot
im1 = ax1.imshow(
    unlensed_images[0].numpy(), cmap="grey", interpolation="bilinear", vmin=0, vmax=2
)
ax1.set_title("unlensed")
ax1.axis("off")
im2 = ax2.imshow(
    lensed_images[0].numpy(),
    cmap="grey",
    interpolation="bilinear",
    vmin=0,
    vmax=2,
    extent=[-2, 2, -2, 2],
)
ax2.scatter([0], [0], color="r")
ax2.set_title("lensed")
ax2.axis("off")


def update(frame):
    """Update function for the animation."""
    # Update the image in the first subplot
    im1.set_array(unlensed_images[frame].numpy())
    im2.set_array(lensed_images[frame].numpy())

    return im1, im2


ani = animation.FuncAnimation(fig, update, frames=lensed_images.shape[0], interval=60)

plt.close()
HTML(ani.to_jshtml())

## Deflection of an Extended Mass

Some astronomical objects like black holes and stars are often represented as point masses, but many others such as galaxies and clusters are extended mass distributions. These may be thought of as many point masses combined:

$$\hat{\vec{\alpha}}(\vec{\theta}) = \sum_i\hat{\vec{\alpha}}_i(\vec{\theta} - \vec{\theta}_i) = \frac{4G}{c^2D_l}\sum_iM_i\frac{\vec{\theta} - \vec{\theta}_i}{|\vec{\theta} - \vec{\theta}_i|^2}$$

In the limit that the extended mass is a continuous distribution, the $M_i$ become a density distribution $\rho(\vec{\theta}, z)$ where $z$ is the dimension along the line of sight. 
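The sum over point masses is straightforward to write down with NumPy broadcasting. The sketch below works in reduced units, where each mass contributes $R_{e,i}^2(\vec{\theta}-\vec{\theta}_i)/|\vec{\theta}-\vec{\theta}_i|^2$, with the sign chosen to match the lens equation $\vec{\beta} = \vec{\theta} - \vec{\alpha}$. The function and the ring of masses are invented for this example.

```python
import numpy as np


def deflection_of_point_masses(eval_x, eval_y, mass_x, mass_y, r_e):
    """Sum reduced point-mass deflections at the positions (eval_x, eval_y).

    mass_x, mass_y and r_e are 1D arrays with one entry per point mass.
    """
    dx = eval_x[..., None] - mass_x
    dy = eval_y[..., None] - mass_y
    r2 = dx**2 + dy**2
    sum_x = np.sum(r_e**2 * dx / r2, axis=-1)
    sum_y = np.sum(r_e**2 * dy / r2, axis=-1)
    return sum_x, sum_y


# 50 equal point masses on a small ring, with the R_e^2 values summing to 1
n_mass = 50
phi = np.linspace(0.0, 2 * np.pi, n_mass, endpoint=False)
mass_x, mass_y = 0.05 * np.cos(phi), 0.05 * np.sin(phi)
r_e = np.full(n_mass, 1.0 / np.sqrt(n_mass))

eval_x, eval_y = np.array([2.0]), np.array([0.0])
sum_x, sum_y = deflection_of_point_masses(eval_x, eval_y, mass_x, mass_y, r_e)
print(sum_x, sum_y)
```

Far from the ring the result approaches that of a single point mass with $R_e = 1$, i.e. a deflection of magnitude $R_e^2/|\vec{\theta}|$. Since $R_e^2$ is proportional to mass, it is the $R_{e,i}^2$ values that add, not the $R_{e,i}$ themselves.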

### Singular Isothermal Sphere (SIS)

There are many mass distributions defined by different surface densities. One of the simplest comes from assuming that the mass of the lens is in the form of an ideal gas in a spherical potential. In thermal and hydrostatic equilibrium the density is given as:

$$\rho(r) = \frac{\sigma_v^2}{2\pi Gr^2}$$

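Projecting along the line of sight gives the surface density at the physical radius $D_l|\vec{\theta}|$:

$$\Sigma(|\vec{\theta}|) = \frac{\sigma_v^2}{2GD_l|\vec{\theta}|}$$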

<!--fixme reduced/dimensionless??? -->

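Dividing by the critical surface density, and writing the Einstein radius of the SIS as $R_e = 4\pi\frac{\sigma_v^2}{c^2}\frac{D_{ls}}{D_s}$, the convergence is:

$$\kappa(\vec{\theta}) = \frac{R_e}{2|\vec{\theta}|}$$

And the deflection angle is given by:

$$\vec{\alpha}(\vec{\theta}) = R_e\frac{\vec{\theta}}{|\vec{\theta}|}$$

If angles are measured in units of $R_e$, these reduce to $\kappa = \frac{1}{2|\vec{\theta}|}$ and $\vec{\alpha} = \frac{\vec{\theta}}{|\vec{\theta}|}$.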
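Because the SIS deflection has constant magnitude, the lens equation can be solved by hand, which makes it a nice test case. The sketch below assumes the SIS Einstein radius $R_e = 4\pi(\sigma_v/c)^2D_{ls}/D_s$ and works in one dimension along the line through the lens and the source. The velocity dispersion and the distance ratio are illustrative values, and the helper names are made up.

```python
import math

C_KM_S = 299_792.458  # speed of light [km / s]
RAD_TO_ARCSEC = 180.0 / math.pi * 3600.0


def sis_einstein_radius(sigma_v, D_ls_over_D_s):
    """SIS Einstein radius in arcsec, for sigma_v in km/s."""
    radius_rad = 4 * math.pi * (sigma_v / C_KM_S) ** 2 * D_ls_over_D_s
    return radius_rad * RAD_TO_ARCSEC


def sis_images(beta, r_e):
    """Image positions along the source direction for a source at beta >= 0."""
    images = [beta + r_e]
    if beta < r_e:
        images.append(beta - r_e)  # second image on the far side of the lens
    return images


# Hypothetical massive galaxy: sigma_v = 250 km/s and D_ls / D_s = 0.5
r_e_sis = sis_einstein_radius(250.0, 0.5)
print(f"Einstein radius: {r_e_sis:.2f} arcsec")
print(sis_images(0.3, r_e_sis))  # source inside the Einstein radius: two images
print(sis_images(1.5, r_e_sis))  # source outside: a single image
```

A source inside the Einstein radius produces two images separated by exactly $2R_e$, no matter where inside it sits.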

### Elliptical Lens Mass

For any spherically symmetric mass distribution, such as the SIS, it is possible to determine a change of coordinates such that the mass has been compressed along one axis, making it an elliptical mass distribution. The *axis ratio* of the ellipse $q = \frac{b}{a}$ is the ratio of semi-minor to semi-major axis lengths and it defines the shape of the ellipse. The transformation looks like:

$$r\to R = \sqrt{\frac{x_1^2}{q} + qx_2^2}$$

In this transformed coordinate space the deflection angles become:

$$\alpha_x = \frac{x_1}{qR}\tilde{\alpha}_x(R)$$

$$\alpha_y = \frac{qx_2}{R}\tilde{\alpha}_y(R)$$

where $\tilde{\alpha}$ is the unmodified deflection angle of the spherical mass distribution.
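The coordinate change itself is a one-liner. The sketch below only implements the elliptical radius $R$ from the formula above, for an ellipse aligned with the coordinate axes; the function name is made up, and rotating the ellipse by a position angle is left out.

```python
import numpy as np


def elliptical_radius(x1, x2, q):
    """Elliptical radius R = sqrt(x1^2 / q + q * x2^2) for axis ratio q."""
    return np.sqrt(x1**2 / q + q * x2**2)


# With q = 1 we recover the ordinary circular radius
print(elliptical_radius(3.0, 4.0, 1.0))

# With q = 0.5 the same R is reached at different distances along each axis
print(elliptical_radius(1.0, 0.0, 0.5), elliptical_radius(0.0, 1.0, 0.5))
```

With this convention the contour $R = 1$ crosses the $x_1$ axis at $\sqrt{q}$ and the $x_2$ axis at $1/\sqrt{q}$, so the area enclosed by a contour does not change with $q$.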

### Singular Isothermal Ellipsoid (SIE)

An SIE is just like an SIS except we've converted it into an elliptical mass distribution. Let's try a video like we had in the [point mass section](#deflection-of-a-point-mass). Now we will use an SIE with an axis ratio of 0.5 to see what happens to the background source.

In [None]:
# Define a gaussian blob for the source plane
source = caustics.Sersic(q=1.0, phi=0.0, n=0.5, Re=0.3, Ie=1.0)
# Define a cosmology for the lensing
cosmology = caustics.FlatLambdaCDM()
# Define an SIE for the lens plane
lens = caustics.SIE(cosmology=cosmology, x0=0.0, y0=0.0, q=0.5, phi=0, b=1.0, z_l=z_l)
# Define a caustics simulator to handle the raytracing
sim = caustics.LensSource(lens, source, pixelscale=res, pixels_x=n_pix, z_s=z_s)

# Evaluate the brightness with the source moving along the x-axis
source_x = torch.linspace(-2, 2, 100)
source_y = torch.zeros(100)
unlensed_images = torch.vmap(lambda x: sim(x, lens_source=False))(
    torch.stack((source_x, source_y), dim=1)
)
lensed_images = torch.vmap(sim)(torch.stack((source_x, source_y), dim=1))

In [None]:
# Create animation
fig, (ax1, ax2) = plt.subplots(2, 1, figsize=(4, 8))

# Display the first frame of the image in the first subplot
im1 = ax1.imshow(
    unlensed_images[0].numpy(), cmap="grey", interpolation="bilinear", vmin=0, vmax=2
)
ax1.set_title("unlensed")
ax1.axis("off")
im2 = ax2.imshow(
    lensed_images[0].numpy(),
    cmap="grey",
    interpolation="bilinear",
    vmin=0,
    vmax=2,
    extent=[-2, 2, -2, 2],
)
ax2.scatter([0], [0], color="r")
ax2.set_title("lensed")
ax2.axis("off")


def update(frame):
    """Update function for the animation."""
    # Update the image in the first subplot
    im1.set_array(unlensed_images[frame].numpy())
    im2.set_array(lensed_images[frame].numpy())

    return im1, im2


ani = animation.FuncAnimation(fig, update, frames=lensed_images.shape[0], interval=60)

plt.close()
HTML(ani.to_jshtml())

Interestingly we see a lot more complexity in this example. The elliptical mass distribution allows for multiple images of the source to appear, rather than just a ring. If you download the tutorial for yourself, you can play around with the parameters to see all sorts of interesting configurations!

## Point Mass Potential

For a point mass the lensing potential is:

$$\Psi(\vec{\theta}) = R_e^2\ln(|\vec{\theta}|)$$

In [None]:
F = 100
# Define a cosmology for the lensing
cosmology = caustics.FlatLambdaCDM()
# Define an SIE lens for comparison with the summed point masses
sie_lens = caustics.SIE(
    cosmology=cosmology, x0=0.0, y0=0.0, q=0.5, phi=0, b=1.0, z_l=z_l
)
# Define a batched plane lens to combine many point masses
point_lens = caustics.BatchedPlane(
    cosmology=cosmology, lens=caustics.Point(cosmology=cosmology, z_l=z_l), z_l=z_l
)

thx, thy = caustics.utils.meshgrid(res, n_pix)
r = np.abs(np.random.randn(F)) * 0.1
th = np.random.rand(F) * 2 * np.pi
point_locs = torch.tensor(
    np.stack((r * np.cos(th), r * np.sin(th)), axis=1), dtype=torch.float32
)
images = [torch.zeros(n_pix, n_pix)]
for n in range(1, F):
    images.append(
        point_lens.potential(
            thx,
            thy,
            z_s,
            [point_locs[:n, 0], point_locs[:n, 1], torch.ones(n) / np.sqrt(n)],
        )
    )

In [None]:
# Create animation
fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(8, 4))

# Display the first frame of the image in the first subplot
im1 = ax1.imshow(
    np.zeros((n_pix, n_pix)), cmap="grey", interpolation="bilinear", vmin=0, vmax=2
)
ax1.set_title("point masses [000]")
ax1.axis("off")
im2 = ax2.imshow(
    sie_lens.potential(thx, thy, z_s).numpy(),
    cmap="grey",
    interpolation="bilinear",
    extent=[-2, 2, -2, 2],
)
ax2.set_title("SIE")
ax2.axis("off")


def update(frame):
    """Update function for the animation."""
    # Update the image in the first subplot
    im1.set_array(images[frame].numpy())
    ax1.set_title(f"point masses [{frame:03d}]")
    # im2.set_array(lensed_images[frame].numpy())

    return im1


ani = animation.FuncAnimation(fig, update, frames=F, interval=100)

plt.close()
HTML(ani.to_jshtml())

### Convergence

### Magnification and Shear

<!--- fixme maybe show an image like einstein under a mass sheet and under a shear field--->

### Time Delay

<!--- fixme make a fluctuating source and show the fluctuating lens image --->

### Multiple Images

## Critical Lines and Caustics

## Multiplane Lensing

![Multiplane Lensing](assets/multiplanelensing.png)