# Astronomical Telescopes and Instruments 2020

# Lecture 1 - Foundations of Optics

## 1.1 - Origin of Electromagnetic Waves

![](figs/fig1_1.png)

Maxwell's equations lead directly to electromagnetic waves. However, they describe only the fields, not matter. Since optics is the study of the interaction of electromagnetic waves with matter, we also need equations that describe matter. This can be done with the linear material equations.

$$ \vec{D} = \epsilon \vec{E} \quad \quad \vec{B} = \mu \vec{H} \quad \quad \vec{j} = \sigma \vec{E} $$

The dielectric constant $\epsilon$ (also known as the relative permittivity) relates the electric displacement $\vec{D}$ to the electric field $\vec{E}$. A dielectric (material) is an electric insulator that can be polarized by an applied electric field. In a dielectric material, the presence of an electric field $\vec{E}$ causes the bound charges in the material (atomic nuclei and their electrons) to slightly separate, inducing a local electric dipole moment.  
The permeability $\mu$ is the measure of magnetization that a material obtains in response to an applied magnetic field. $\vec{B}$ is the magnetic induction and $\vec{H}$ the magnetic field.  
The electrical conductivity $\sigma$ represents a material's ability to conduct electric current. $\vec{j}$ is the electric current density.
    
These equations are linear relations: the simplest possible assumptions, which nevertheless explain a lot of optics. They are also a good approximation in astronomy, since astronomical light levels are far too low to change the material properties.

In vacuum, $\epsilon = 1$, $\mu = 1$ and $\sigma = 0$, because there is no matter and there are no charge carriers. In isotropic media, $\epsilon$ and $\mu$ are scalars. In anisotropic media (e.g. crystals), $\epsilon$ and $\mu$ are tensors of rank 2 (matrices), because the matter behaves differently in different directions. However, the assumption of isotropy depends on wavelength. For example, if the wavelength becomes as small as the spacing between atoms, you cannot assume isotropy anymore.

Electromagnetic waves are vector waves, which leads to polarization.

Assuming a static, homogeneous medium with no net electric charges ($\rho = 0$, and for most materials $\mu = 1$), combining the Maxwell equations and the linear material equations leads to these differential equations for a damped (vector) wave.

![](figs/fig1_2.png)

There is a second derivative with respect to space, a second derivative with respect to time, and a first derivative with respect to time, which is where the damping comes from. The damping is controlled by the conductivity. The electric and magnetic fields are the same except for a multiplicative, complex scalar, so it is sufficient to consider only the electric field. However, at boundaries (the transition from one piece of matter to another) you have to consider both, because their boundary conditions are different.

## 1.2 - Description of Electromagnetic Waves

Guess the solution to the wave equation (ansatz):

$$ \vec{E} = \vec{E}_0 e^{i(\vec{k} \cdot \vec{x} - \omega t)} $$

We hope that this ansatz is a solution.

The absolute value of the wave vector $\vec{k}$ is called the wave number. If it is complex, it describes damping. Plane waves are so useful because the wave equation is linear: any sum of solutions is also a solution.

If you carry out the temporal derivatives in the wave equation, you obtain the Helmholtz equation. Carrying out the spatial derivatives as well leads to the dispersion relation between $\vec{k}$ and $\omega$. This leads to the complex index of refraction $\tilde{n}$. You can split this complex index of refraction $\tilde{n} = n + ik$ into a real part $n$ and an imaginary part $k$, which is the extinction coefficient (not to be confused with the wave vector $\vec{k}$).
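To make the role of the imaginary part concrete, here is a minimal sketch that propagates the plane-wave ansatz through a medium with a complex index. It assumes the wave number is $2\pi\tilde{n}/\lambda$ with $\lambda$ the vacuum wavelength; the function names are hypothetical, and the extinction coefficient is called `kappa` in the code to keep it apart from the wave vector.

```python
import cmath
import math

def plane_wave_amplitude(n_complex, wavelength, x):
    """Relative complex amplitude exp(i * k * x) after a path length x.

    wavelength is the vacuum wavelength, in the same units as x.
    """
    k = 2.0 * math.pi * n_complex / wavelength  # complex wave number
    return cmath.exp(1j * k * x)

def intensity_absorption_coefficient(kappa, wavelength):
    """Intensity decays as exp(-alpha * x) with alpha = 4 * pi * kappa / wavelength."""
    return 4.0 * math.pi * kappa / wavelength

# Illustrative values only: a weakly absorbing medium at 0.5 um.
n_tilde = 1.5 + 0.01j
alpha = intensity_absorption_coefficient(n_tilde.imag, 0.5)
depth = 1.0 / alpha  # path length after which the intensity drops by 1/e
relative_intensity = abs(plane_wave_amplitude(n_tilde, 0.5, depth)) ** 2
```

A purely real index only changes the phase of the wave, while a non-zero imaginary part makes the amplitude decay exponentially.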

When substituting the plane-wave solution into Maxwell's equations, additional constraints arise. The electric field, magnetic field and the wave vector are all orthogonal to each other, and neither the electric nor the magnetic field points along the propagation direction, so these are transverse waves. They form a right-handed vector triple. All of this holds in isotropic media. In conducting media, the index of refraction is complex, and the electric and magnetic fields are out of phase.

In isotropic media, the energy flow is parallel to the wave vector and proportional to the absolute value of the complex index of refraction. In anisotropic materials (e.g. crystals), energy propagation and wave vector are not parallel.

Different types of light:
- Monochromatic light. Light of a single frequency; a purely theoretical concept. It is always fully polarized. That means it oscillates in one given direction. A laser comes close, but it has a small frequency range, so it is actually quasi-monochromatic.
- Quasi-monochromatic light. A superposition of mutually incoherent monochromatic light beams. $\delta \lambda / \lambda \ll 1$. Material properties are assumed to be constant over $\delta \lambda$.
- Polychromatic light or white light. Think of light from the sun. $\delta \lambda / \lambda \sim 1$. Incoherent sum of quasi-monochromatic beams. You cannot write the electric field vector in plane-wave form. You have to take into account that material properties change as a function of frequency. The intensity of polychromatic light is given by the sum of the intensities of the constituent quasi-monochromatic beams.





## 1.3 - Material Properties

No electric conductivity means there is only a real index of refraction. This means that transparent, dielectric materials have real indices of refraction and conducting materials (metals) have a complex index of refraction.

**Dispersion**: The index of refraction depends on wavelength. It also changes with temperature and is roughly proportional to the density. The index typically increases with decreasing wavelength. The Abbe number measures how small or large the dispersion is; it is computed from the indices of refraction at three standard wavelengths, $V_d = (n_d - 1)/(n_F - n_C)$, so a large Abbe number means low dispersion.

You can of course measure $n$ at each wavelength for each glass and tabulate everything, and such tables are available. The most common alternative to looking $n$ up in an empirical table is the Sellmeier equation. Each glass has its own set of Sellmeier coefficients (three pairs $B_i$, $C_i$ in the usual three-term form); the equation takes a wavelength as input and returns an index of refraction. However, it is only applicable over a limited wavelength range.
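As an illustration, the sketch below evaluates the three-term Sellmeier equation, $n^2 = 1 + \sum_i B_i \lambda^2 / (\lambda^2 - C_i)$, and derives an Abbe number from it. The coefficients are the commonly quoted catalogue values for Schott N-BK7 (wavelength in $\mu$m) and should be checked against the manufacturer's data sheet before use; the function names are hypothetical.

```python
import math

# Commonly quoted Sellmeier coefficients for Schott N-BK7 (C_i in um^2).
# Treat them as an example; verify against the glass data sheet.
BK7_B = (1.03961212, 0.231792344, 1.01046945)
BK7_C = (0.00600069867, 0.0200179144, 103.560653)

def sellmeier_index(wavelength_um, B=BK7_B, C=BK7_C):
    """Index of refraction from the three-term Sellmeier equation."""
    lam2 = wavelength_um ** 2
    n_squared = 1.0 + sum(b * lam2 / (lam2 - c) for b, c in zip(B, C))
    return math.sqrt(n_squared)

def abbe_number(index_fn=sellmeier_index):
    """V_d = (n_d - 1) / (n_F - n_C) at the d, F and C spectral lines."""
    n_d = index_fn(0.58756)  # helium d line
    n_F = index_fn(0.48613)  # hydrogen F line
    n_C = index_fn(0.65627)  # hydrogen C line
    return (n_d - 1.0) / (n_F - n_C)

n_d = sellmeier_index(0.58756)
v_d = abbe_number()
```

The index comes out higher in the blue than in the red, which is the normal dispersion described above.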

Internal transmission describes how transparent the glass itself is. Almost all glass absorbs above 2 $\mu$m. There is typically also strong absorption in the blue and UV.

## 1.4 - Electromagnetic Waves Across Interfaces

What do the electric and magnetic fields look like at boundaries/interfaces?

![](figs/fig1_3.png)

Snell's law can be obtained from Maxwell's equations and their boundary conditions.

$$ \tilde{n}_1 \sin \theta_i = \tilde{n}_1 \sin \theta_r = \tilde{n}_2 \sin \theta_t $$
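A minimal sketch of Snell's law for real indices of refraction is given below. The function name is hypothetical, and returning `None` for total internal reflection is a design choice of this example, not something prescribed by the lecture.

```python
import math

def refraction_angle(n1, n2, theta_i):
    """Transmitted angle (radians) from n1 * sin(theta_i) = n2 * sin(theta_t).

    Returns None when no real transmitted angle exists
    (total internal reflection). Real indices only.
    """
    sin_t = n1 * math.sin(theta_i) / n2
    if abs(sin_t) > 1.0:
        return None
    return math.asin(sin_t)

# Air to glass (n = 1.5) at 30 degrees incidence.
theta_t = refraction_angle(1.0, 1.5, math.radians(30.0))

# Glass to air at 45 degrees: beyond the critical angle asin(1 / 1.5).
blocked = refraction_angle(1.5, 1.0, math.radians(45.0))
```

The reflected angle needs no computation: the first equality in Snell's law gives $\theta_r = \theta_i$.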

What happens to the amplitude at interfaces (the angles are already described by Snell's law)? See slide 36.

## 1.5 - Fresnel Equations

Two special polarization cases that show how reflection and transmission happens in terms of its amplitude:

1. p/TM: Electric field parallel to plane of incidence, transverse magnetic field.
2. s/TE: Electric field perpendicular (German: senkrecht) to plane of incidence, transverse electric field.

The ratios of reflected and transmitted to incident wave amplitudes give the Fresnel equations.

![](figs/fig1_4.png)

## 1.6 - Application of Fresnel Equations

Generally, $t_s, t_p, r_s, r_p$ are complex. 

Only the ratio of refractive indices is relevant, so indices are usually quoted relative to air, whose index is set to 1 by definition.
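The sketch below evaluates the Fresnel amplitude coefficients for a single interface. It assumes one common sign convention (conventions for $r_p$ differ between textbooks, so compare with the figure above before relying on the signs) and uses `cmath` so that complex indices are accepted. The function name is hypothetical.

```python
import cmath
import math

def fresnel_coefficients(n1, n2, theta_i):
    """Return (r_s, r_p, t_s, t_p) for light going from index n1 to n2.

    theta_i is the angle of incidence in radians; n1 and n2 may be complex.
    """
    cos_i = cmath.cos(theta_i)
    sin_t = n1 * cmath.sin(theta_i) / n2   # Snell's law
    cos_t = cmath.sqrt(1.0 - sin_t ** 2)
    r_s = (n1 * cos_i - n2 * cos_t) / (n1 * cos_i + n2 * cos_t)
    t_s = 2.0 * n1 * cos_i / (n1 * cos_i + n2 * cos_t)
    r_p = (n2 * cos_i - n1 * cos_t) / (n2 * cos_i + n1 * cos_t)
    t_p = 2.0 * n1 * cos_i / (n2 * cos_i + n1 * cos_t)
    return r_s, r_p, t_s, t_p

# Air to glass at normal incidence: about 4 percent of the intensity is reflected.
r_s0, r_p0, t_s0, t_p0 = fresnel_coefficients(1.0, 1.5, 0.0)
reflectance_normal = abs(r_s0) ** 2

# At the Brewster angle atan(n2 / n1) the p-polarized reflection vanishes.
brewster = math.atan(1.5)
r_p_brewster = fresnel_coefficients(1.0, 1.5, brewster)[1]
```

Scaling both indices by the same factor leaves all four coefficients unchanged, which is the statement that only the ratio of indices matters.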

# Lecture 2 - Geometrical Optics

## 2.1 - Lenses

For a lens to be perfect, all rays through the lens have to reach the same focal point. The solution is a hyperbolic surface with eccentricity $e$ equal to the index of refraction $n$, which is always larger than 1. In principle, we can make a perfect lens.

**Paraxial optics**: In paraxial optics a number of assumptions and approximations are made, that greatly simplify all calculations:

- Snell's law for small angles: $\sin \phi \approx \phi$, $\tan \phi \approx \phi = r/f$
- The ray height $r$ is small $\rightarrow$ optics curvature can be neglected $\rightarrow$ plane optics $\cos \phi \approx 1$

In the paraxial approximation imaging is perfect (free of aberrations), and this approximation is available in most optical design software. However, it only holds for angles smaller than roughly 10 degrees.

Spherical lenses: biconvex, plano-convex, convex-concave, meniscus, plano-concave, biconcave.

The surface error is required to be less than $\lambda/10$. For a typical lens with a 5 cm diameter this means an accuracy of about 1 ppm (roughly 50 nm over 5 cm for visible light). So making a lens requires a lot of accuracy. Most optical surfaces are spherical due to ease of manufacturing.

Nomenclature of lenses:

- For a positive/converging spherical lens: The two surfaces of a lens have two radii of curvature $R_1$ and $R_2$. $R_1 > 0$ because its center of curvature is on the positive side, and $R_2 < 0$ because its center is on the negative side. Center thickness $d$. Positive focal length $f > 0$.
- For a negative/diverging spherical lens: $R_1 < 0$ and $R_2 > 0$, virtual focal point.
- General lens setup for real image: The axis through the center at normal incidence is the optical axis. Lens focal length $f$, object distance $S_1 > 0$, object height $h_1$. Image distance $S_2$, image height $h_2$. Distance $S_2 > 0$ for a real image. The chief ray through the center maintains its direction.
- General lens setup for virtual image: Object is inside the focal distance, $|S_1| < |f|$. Distance $S_2 < 0$ for a virtual image.

Lens equations

![](figs/fig1_5.png)
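As a worked example of the sign conventions above, the sketch below assumes the figure shows the standard thin-lens relations, $1/S_1 + 1/S_2 = 1/f$ and magnification $M = -S_2/S_1$. That assumption, and the function names, are not taken from the notes.

```python
def image_distance(f, s1):
    """Image distance S2 from the thin-lens equation 1/S1 + 1/S2 = 1/f."""
    if s1 == f:
        raise ValueError("object in the focal plane: image at infinity")
    return 1.0 / (1.0 / f - 1.0 / s1)

def magnification(s1, s2):
    """Transverse magnification h2 / h1; negative means an inverted image."""
    return -s2 / s1

# Object outside the focal distance: real, inverted image (S2 > 0).
s2_real = image_distance(10.0, 30.0)
m_real = magnification(30.0, s2_real)

# Object inside the focal distance: virtual, upright image (S2 < 0).
s2_virtual = image_distance(10.0, 5.0)
m_virtual = magnification(5.0, s2_virtual)
```

The sign of `s2_virtual` reproduces the rule in the list above: $S_2 < 0$ whenever the object sits inside the focal distance.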

Because of dispersion, different wavelengths have different focal points. To solve this, you can make an achromatic lens. This lens consists of two lenses made of glasses with different dispersions. This gives a much better result, and it also reduces spherical aberration. An achromatic lens will always give a better performance, even at a single wavelength.

## 2.2 - Mirrors

Mirrors vs. lenses:
- Mirrors are completely achromatic; same performance for different frequencies.
- Mirrors are reflective over a large wavelength range (UV to radio).
- Big advantage for mirrors: they can be supported from the back and can be segmented.
- Disadvantage of mirrors: because they reflect light, the wavefront error is twice the surface error; for a lens it is only $n-1$ times the surface error.
- Another disadvantage is that a mirror has only one surface to play with.

Spherical mirrors:
- Easy to manufacture
- Really good at focusing light from center of curvature onto itself.

A parabolic mirror forms a perfect image of an on-axis object at infinity. A parabolic mirror always has an optical axis.

Mirrors are often conic sections. You can define a conic constant $K$, a single parameter that, together with the radius of curvature $R$, describes all conic sections. $K$ is nothing more than minus the eccentricity squared, $K = -e^2$.
- Sphere $K = 0$
- Oblate ellipsoid $K > 0$
- Prolate ellipsoid $0 > K > -1$
- Parabola $K = -1$
- Hyperbola $K < -1$

All conic sections are almost spherical close to origin. Analytical ray intersections allow fast computation.

A conic mirror images perfectly between its focal points. A sphere has a single focus. An ellipse has two foci, which means it perfectly images from one focal point to the other. A parabola has one finite focus and one focus at infinity. A hyperbola has two focal points, one of which lies behind the mirror and is therefore virtual.

## 2.3 - Optical Systems

An optical system is a combination of lenses, mirrors and other optical elements, in a multitude of possible arrangements. Examples: cameras, microscopes, telescopes, instruments. Multiple optical elements provide more possibilities.

The simplest optical system is the combination of two thin lenses. Distance $>$ sum of focal lengths $\rightarrow$ real image between lenses $\rightarrow$ apply single-lens equation successively.

Every optical system has an aperture or stop that limits the incoming beam; it typically has a maximum diameter. Aperture size is important for flux and diffraction effects.

The F-number is defined as the ratio of focal length to aperture diameter, $F = f/D$. It describes how much light can be gathered by an optical system. Also known as the $f$-ratio, written as $f/F$ (e.g. $f/2$). The bigger $F$, the better the paraxial approximation works. A system with $F < 2$ is called fast, one with $F > 2$ slow.

In general, chief rays from different directions are not parallel before the image. That means the magnification changes with focus position. The pupil image may be very close to the image. In a telecentric arrangement (imaging lens one focal length from the pupil), the pupil appears at infinity when seen from the image. The chief rays are then parallel and the magnification does not change with focus position.

## 2.4 - Aberrations

The plane of least confusion is the location where the image of a point source has the smallest diameter. The spot diagram shows ray locations in the plane of least confusion. Spot diagrams are closely connected with wavefronts. A wavefront is the set of all points where the wave has the same phase. In the ideal case a perfect lens creates a spherical wavefront that converges to the focus. Aberrations are deviations from this spherical wavefront.

**Seidel aberrations**: Introduced by Ludwig von Seidel (1857). The Taylor expansion of $\sin \phi$ is a series of odd powers of $\phi$. Paraxial optics can be seen as the first-order Taylor polynomial, which is why paraxial optics is also known as first-order optics, $\sin \phi \sim \phi$. Seidel aberrations are third-order optics, $\sin \phi \sim \phi - \frac{\phi^3}{3!}$. The Seidel aberrations are spherical aberration, astigmatism, coma, field curvature and distortion.
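To get a feeling for when first-order optics is good enough, this small sketch compares $\sin\phi$ with its first-order and third-order Taylor approximations. The function name is hypothetical.

```python
import math

def taylor_errors(phi_deg):
    """Relative errors of the first- and third-order approximations of sin(phi)."""
    phi = math.radians(phi_deg)
    exact = math.sin(phi)
    first_order = phi                      # paraxial optics
    third_order = phi - phi ** 3 / 6.0     # Seidel (third-order) optics
    return (abs(first_order - exact) / exact,
            abs(third_order - exact) / exact)

err1_10deg, err3_10deg = taylor_errors(10.0)
err1_30deg, err3_30deg = taylor_errors(30.0)
```

At 10 degrees the paraxial error is about half a percent, while at 30 degrees it has grown to several percent; the third-order term removes most of that error.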

Trivial aberrations are tip, tilt and defocus.

**Zernike polynomials**: Introduced by Frits Zernike in 1934. Orthonormal basis on the unit circle. Radial order $n$, azimuthal order $m$, $n \geq m \geq 0$. The lowest-order Zernike polynomials correspond to the Seidel aberrations.

Spherical aberration of a spherical lens: rays close to the axis and rays far from the axis have different focal lengths. The spread of focus positions along the optical axis is the longitudinal spherical aberration. The spot size measured in the image plane is known as the transverse (or lateral) spherical aberration. Spherical aberration is more prominent in fast systems than in slow systems. To minimize spherical aberration you choose the orientation of the lens such that the rays stay as paraxial as possible. An aspheric lens with a single surface with conic constant $K = -n^2$ makes a perfect lens. These are difficult to manufacture, but it is possible. Another trick is a spherical glass lens with an aspherical plastic layer, which is more affordable.

## 2.5 - Off-Axis Aberrations

**Coma**: Typically seen for object points away from the optical axis. Leads to 'tails' on stars, which is where the name coma comes from: it looks like a comet tail. It is an aberration resulting from off-axis point sources, inherent to certain optical designs.

**Astigmatism**: Focal length not the same in two orthogonal planes. The spot diagram through focus shows good focusing in only one direction at a time. It results from tilted glass plates in a converging beam, such as beam shifters, filters, beamsplitters etc. Optometrists refer to astigmatism as cylinder.

**Field curvature**: The image lies on a curved surface, but detectors are typically flat. The curvature comes from the lens thickness variation across the aperture, so it depends on the index of the glass and the focal length of the lens. Petzval curvature is independent of lens position! A possible solution is a field flattener (lens) close to the focus. A field flattener also makes the image much more telecentric.

**Distortion**: Image is sharp but distorted. Pincushion (positive) distortion. Barrel (negative) distortion. Crucial where the position of an object must be known accurately, for example in astrometry in general and in exoplanet orbit determination.

**Vignetting**: Image dims toward the edges. Typically happens when something is obstructing the beam. It only influences transmission, not image sharpness. The effective aperture stop depends on the position in the object.

# Lecture 3 - Physical Optics

## 3.1 - Diffraction

Diffraction occurs when an obstacle blocks part of a wave; the wave then bends around the edge of the obstacle and spreads out. If the aperture is large, the spread is small and vice versa.

To calculate diffraction, you have to solve the wave equation together with the boundary conditions. This is difficult, and analytical solutions exist only for a few special cases. There are many numerical approaches to solve such problems.

The Huygens-Fresnel Principle is useful for many applications. This principle states that every point on a wavefront is itself the source of spherical wavelets, and the secondary wavelets emanating from different points mutually interfere.

**The full diffraction calculation: Angular spectrum**:

The angular spectrum method is a way to calculate diffraction.

[Insert a lot of math]

$$ E(k_x, k_y) = \int \int A(x,y) e^{i(k_x x+k_y y)} \text{ d}x \text{ d}y $$

**Validity of Diffraction Approximations**:

The Fresnel number determines how accurate certain diffraction approximations are. It is given by:

$$ F = \frac{a^2}{R\lambda} $$

Where $a$ is the greatest width of the aperture, $\lambda$ is the wavelength and $R$ is the distance between the detector and the aperture.

$$ F > 1 \quad \quad \text{ Near-field} $$

$$ F < 1 \quad \quad \text{ Far-field} $$

If $F \gg 1$ you have to use the angular spectrum method; it is the only way to get the correct diffraction pattern. If, on the other hand, $F \approx 1$, you can use the Fresnel approximation. Lastly, if $F \ll 1$, you can use the Fraunhofer approximation, because the waves can be approximated by plane waves in the far field.
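The regime selection can be sketched as a small helper. The factor-of-ten margin used to decide what counts as "much larger" or "much smaller" than 1 is an arbitrary assumption of this example, and the function names are hypothetical.

```python
def fresnel_number(a, R, wavelength):
    """F = a^2 / (R * wavelength); all lengths in the same units."""
    return a ** 2 / (R * wavelength)

def diffraction_regime(F, margin=10.0):
    """Suggest a diffraction calculation for Fresnel number F.

    margin is an assumed threshold for 'much larger/smaller than 1'.
    """
    if F > margin:
        return "angular spectrum"
    if F < 1.0 / margin:
        return "Fraunhofer"
    return "Fresnel"

# 1 mm aperture, 500 nm light, detector at 1 m and at 1 km.
F_near = fresnel_number(1e-3, 1.0, 500e-9)
F_far = fresnel_number(1e-3, 1e3, 500e-9)
regimes = (diffraction_regime(F_near), diffraction_regime(F_far))
```

Because $F$ scales with $a^2$, even a modest increase in aperture size pushes the far field out to much larger distances.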

In a telescope, the Fraunhofer approximation is valid, because there are optics between the aperture and the detector. These optics bring the far field to a finite distance.

## 3.2 - Fraunhofer Diffraction

Fraunhofer diffraction for an arbitrary aperture is easy to calculate.

Complex electric field distribution in Fraunhofer diffraction given by
Fourier transform of electric field distribution across aperture:

$$ E(k_x, k_y) = \int \int A(x,y) e^{i(k_x x+k_y y)} \text{ d}x \text{ d}y $$

The complex aperture function $A(x,y)$ describes transmissivity and phase
delays. When this is computed numerically on a sampled array, the size of the aperture relative to the array determines the resolution in the focal plane, and the dimension of the array determines the field of view in the focal plane.

Another thing that you can do with Fraunhofer diffraction is **controlled aberrations**: you can modify the phase in the aperture to change the PSF at will, for example to create a dark hole for high-contrast imaging of exoplanets.

Another application of Fraunhofer diffraction has to do with spectrographs. Using a grating, the illumination will not be uniform; instead the intensity scales with a sinc$^2$ profile:

$$ I(\theta) = I(0) \bigg( \frac{\sin\beta}{\beta} \bigg)^2 $$
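The link between the Fourier transform of the aperture and the sinc$^2$ profile can be checked numerically. The sketch below is a one-dimensional toy model: a uniformly illuminated slit sampled on an array, transformed with a direct discrete Fourier sum. It assumes $\beta$ is half the phase difference across the slit, which for frequency index `k` becomes $\beta = \pi k w / N$; the notes do not define $\beta$, so this is an assumption, as are all names.

```python
import cmath
import math

def slit_intensity(k, slit_width, n_samples):
    """Normalized |DFT|^2 at frequency index k of a 1-D slit.

    The slit covers slit_width samples of an array of n_samples samples.
    """
    field = sum(cmath.exp(-2j * math.pi * k * n / n_samples)
                for n in range(slit_width))
    return abs(field) ** 2 / slit_width ** 2

def sinc_squared(beta):
    """Analytic single-slit profile (sin(beta) / beta)^2."""
    return 1.0 if beta == 0 else (math.sin(beta) / beta) ** 2

N, w = 1024, 32
numeric = slit_intensity(8, w, N)
analytic = sinc_squared(math.pi * 8 * w / N)
first_zero = slit_intensity(N // w, w, N)  # expected at k = N / w
```

Making the slit narrower relative to the array moves the first zero to a larger `k`, which is the sampling statement above: the aperture-to-array size ratio sets how finely the focal plane is resolved.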



## 3.3 - Transfer Functions

Point Spread Function (PSF): the image produced by an on-axis point source. It is the squared absolute value of the Fourier transform of the complex aperture function.

If we then want to know what the image of an arbitrary object is, assuming it is made of incoherent emitters, the image that we observe is nothing more than the convolution of the true object and the PSF.

**PSF of a Two-pinhole interferometer**:

PSF is:

$$ I_0 \cos^2 \bigg( \frac{yd\pi}{s\lambda} \bigg) $$

Tilting the incoming beam by an angle $\alpha$ changes the phase/position of the fringe pattern. The separation of the fringes is determined by the wavelength and the distance between the pinholes. If you tilt by an angle $\lambda/d$, you regain the original pattern.

If $\lambda$ gets smaller, the fringes repeat more frequently.

Fringe spacing only depends on the separation of the holes and the wavelength. The larger the apertures, the smaller the 'illuminated' area. The fringe envelope is an Airy pattern (the diffraction pattern of a single hole).
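The fringe pattern is easy to evaluate directly. In this sketch $y$ is taken to be the position on the screen, $d$ the pinhole separation and $s$ the distance from the pinholes to the screen; the notes do not define $s$ explicitly, so that reading is an assumption, and the Airy envelope is ignored.

```python
import math

def fringe_intensity(y, d, s, wavelength, i0=1.0):
    """Two-pinhole fringe pattern I0 * cos^2(pi * y * d / (s * wavelength))."""
    return i0 * math.cos(math.pi * y * d / (s * wavelength)) ** 2

def fringe_spacing(d, s, wavelength):
    """Distance on the screen between neighbouring bright fringes."""
    return s * wavelength / d

# 1 mm pinhole separation, screen at 1 m, 500 nm light.
spacing = fringe_spacing(1e-3, 1.0, 500e-9)
bright = fringe_intensity(spacing, 1e-3, 1.0, 500e-9)
dark = fringe_intensity(spacing / 2.0, 1e-3, 1.0, 500e-9)
```

Halving the wavelength halves `spacing`, so the fringes repeat twice as often, as stated above.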

**Optical Transfer function**:

Another concept that is useful to be aware of, is the optical transfer function.

$$ i = o \ast s $$

$i$ is image, $o$ is object, and $s$ is PSF. If we now take the inverse Fourier transform of this, we get:

$$ I = O \cdot S $$

$I$ is the inverse Fourier transform of the image, $O$ is the inverse Fourier transform of the object and $S$ is the **Optical transfer function (OTF)**. So the OTF is the inverse Fourier transform of the PSF. This also implies that the OTF is the autocorrelation of complex aperture function $A$.

The Modulation Transfer Function (MTF) is nothing more than the absolute value of the OTF. The MTF describes what happens to the image of a sinusoidal function. It simply tells you by how much the observed variability of the sinusoidal amplitude is reduced.
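As a toy example of the OTF being the autocorrelation of the aperture function, the sketch below uses a one-dimensional, uniformly illuminated slit. Each lag of the autocorrelation stands for one spatial frequency; the mapping to physical units is left out, and the function names are hypothetical.

```python
def otf_from_aperture(aperture):
    """Normalized autocorrelation of a sampled 1-D complex aperture function.

    Index = lag in samples, which is proportional to spatial frequency.
    """
    n = len(aperture)
    acf = [sum(aperture[i + lag] * aperture[i].conjugate()
               for i in range(n - lag))
           for lag in range(n)]
    return [value / acf[0] for value in acf]

def mtf_from_aperture(aperture):
    """MTF = absolute value of the OTF."""
    return [abs(value) for value in otf_from_aperture(aperture)]

# Slit of 8 samples embedded in a 16-sample array.
slit = [0.0] * 4 + [1.0] * 8 + [0.0] * 4
mtf = mtf_from_aperture(slit)
```

For the slit the MTF falls linearly from 1 at zero frequency to 0 at a lag equal to the slit width: fine sinusoidal detail is transferred with reduced contrast, and nothing beyond the cutoff gets through.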

**General Transfer Functions**:

Generally, a transfer function starts with a black box, characterized by transfer function $h$. It operates on time, position, angle, etc. If you want to measure the black box, you put in a $\delta$-function, and out comes the transfer function $h$. The output for a general input is the convolution of the input with the transfer function. If a transfer function is known with good accuracy $\rightarrow$ deconvolution to recover the original signal (e.g. Richardson-Lucy deconvolution).

![](figs/fig3_1.png)

## 3.4 - Coherence and Interference

The two-pinhole interferometer: Extended source $S$, fully coherent waves from holes $S_1, S_2$ with intensities $I_0$. A useful tool is the **mutual coherence function** of the electric field at $S_1$ and $S_2$ with time delay $\tau$:

$$ \tilde{\Gamma_{12}}(\tau) = \mathcal{E} \Big[ \tilde{E_1}(t+\tau) \tilde{E^\ast_2}(t) \Big] $$

Normalizing the mutual coherence function defines the complex degree of coherence.

The complex degree of coherence measures both the spatial coherence at $S_1$ and $S_2$, and the temporal coherence through the time lag $\tau$.

The **Van Cittert–Zernike theorem**, named after physicists Pieter Hendrik van Cittert and Frits Zernike, is a formula in coherence theory that states that under certain conditions the Fourier transform of the intensity distribution function of a distant, incoherent source is equal to its complex visibility.

Last part not entirely clear, but might also be not that important.



# Lecture 4 - Telescopes

## 4.1 - History

First telescope, the Netherlands 1608, Hans Lippershey.

The first telescopes used two transmitting lenses, one with a positive power and one with a negative power, to produce a non-inverted image of distant objects. They did not have large magnifications, typically three times or so.

The patent inspired Galilei to make his own telescope.

**Limitations of refractive telescopes**:
- **Chromatic aberrations**: Different wavelengths will have different focal lengths. This can be solved using multiple kinds of glass.
- **Long telescope tubes**: Because astronomers desire large magnifications, telescopes became larger and larger. This is a limitation, because it increases the weight and the mounting of the telescope needs to be stronger, which makes it harder to fabricate.
- **Magnification requires stabilisation and guiding**: A larger telescope also means a larger sensitivity to external forces.

The hard thing about making a good lens is making the glass homogeneous. The cooling process of the glass needs to be controlled in order to prevent air bubbles or striae. Glass also sags under gravity. This means that there is a limit to the maximum size of a lens.

## 4.2 - Reflecting Telescopes and Conic Sections

**Newtonian Telescope**: This telescope uses a mirror as its primary. There are no more chromatic aberrations, and there is only one surface to polish. There is a central obscuration: less light, extra diffraction. By having the mirror set up at the bottom of the telescope tube, much larger telescopes could be made. The primary focus is awkward to get to. Introducing a secondary mirror can relay the focus to a more convenient location.

The majority of mirrors used in telescopes can be described by conic sections. All of the curves of the conic sections can be described by the following equation:

$$ y^2 - 2Rz + (1 - e^2)z^2 =  0 \quad \quad K = -e^2 \text{  , the conic constant} $$

The value of $e$ (or equivalently $K$) determines which of the conic sections is selected.

Unless $K = -1$, the focal distance $f$ changes with radius $r$, meaning you have spherical aberration.

The family of conic mirrors:
- $e = 1, K = -1$: One focus at infinity, concave mirror - paraboloid.
- $0 < e < 1, -1 < K < 0$: Two finite foci, concave mirror - ellipsoid.
- $e > 1, K < -1$: Two finite foci, convex mirror - hyperboloid

## 4.3 - Two Mirror Telescopes

There are two families of telescopes with a paraboloid as primary mirror.
- **The Gregorian telescope**, which has an ellipsoidal mirror as secondary mirror. The advantage of the Gregorian over the Cassegrain becomes clear for a solar telescope: because the primary mirror forms a real focus in front of the secondary, you can place a heat stop there, which is important for a solar telescope. For large telescopes with adaptive optics, a concave (ellipsoidal) mirror makes testing and calibration during the day easier than a convex (hyperbolic) mirror.
- **The Cassegrain telescope**, which has a hyperboloidal mirror as secondary mirror. This shortens the telescope by up to 20 percent w.r.t. the Gregorian, but access to the primary focus is worse. By choosing the right value for the conic constant of the second mirror, the spherical aberration can be canceled out. However, if you move off-axis, there is still coma and astigmatism.

All two mirror telescopes have field curvature. All astronomical sources off-axis require a curved focal plane to get them in focus.

Refracting telescopes require a very long tube to obtain high magnifications. With a two-mirror telescope, you can obtain the same magnification with a much shorter tube.

The **Ritchey-Chrétien Telescope**: There is an infinite number of combinations of $K_1$ and $K_2$ that give zero spherical aberration and coma. Telescopes with these values are known as Ritchey-Chrétien telescopes. Many telescopes use this design, for example the VLT, GTC, Subaru, Keck (2x) and Gemini (2x).

## 4.4 - Wide Fields, Big Mirrors

Schmidt telescopes have a larger field of view than the Cassegrain and Gregorian designs, which made large catalogues of stars and galaxies possible. The principle of the Schmidt telescope is that it has a spherical mirror as primary and a curved focal plane. The spherical aberration is minimized by an undersized stop at the center of curvature. An aspheric corrector plate widens the field of view and reduces the spherical aberration. The addition of a secondary mirror can send the light through a hole in the primary, making it a Schmidt-Cassegrain design.

The classical method of mirror making is to take a solid block of glass and grind and polish it until it reaches the required conic shape. However, large mirrors made in this way cannot follow the temperature of the outside air fast enough, and because of this the quality of the mirror drops quickly. A new method of mirror making was developed which solves several of these problems. This method is known as spin casting: the glass melts while the oven spins. Also, with this method air can go through the hole in the middle, and there is temperature control through air jet cooling.

Fused silica doped with titanium makes sure the coefficient of thermal expansion (CTE) is close to 0 at room temperature. This is about 20 times lower than regular float glass.

## 4.5 - Segmented Telescopes and Steering Telescopes

Mirrors can be put next to each other to form one coherent optical structure. The challenge is adaptive optics; the whole structure of mirrors needs to be aligned with each other through adaptive optics.

**Stressed-mirror polishing**: The problem is that off-axis paraboloids are difficult to make, but spherical surfaces are easy to polish and test. Mirror segments are bent before polishing. A spherical surface is polished into the stressed segment. Release the stress and it bends back into a paraboloid.

An altitude-azimuth mounting is the preferred option now that mounts are computer controlled. Each object can be tracked using the two axes of rotation. However, the zenith is inaccessible due to the azimuth drive speed it would require. Also, the image rotates in the field of view throughout the night. This can be canceled by rotating the entire instrument or by a derotator: K-mirror, dove prism or anything rotatable with an odd number of reflections.

Other unconventional designs also exist, such as the coelostat in China. Only one mirror to steer across sky. 

A cheaper alternative for building a large telescope is the fixed elevation mount. 

# Lecture 5 - Polarization

## 5.1 - Origin of Polarization

$$ \vec{E} = \vec{E}_0 e^{i(\vec{k} \cdot \vec{x} - \omega t)} $$

$\vec{E}_0$ is constant in space and time, and in plane perpendicular to $\vec{k}$. To understand polarization, represent $\vec{E}_0$ in 2-D basis with unit vectors $\vec{u}_x$ and $\vec{u}_y$.

Polarization is nothing more than the $\vec{E}_0$ vector:

$$ \vec{E}_0 = e_x \vec{u}_x + e_y \vec{u}_y $$

$e_x$ and $e_y$ are two arbitrary complex scalars, 4 degrees of freedom. From this, multiple kinds of polarization can develop. There is linear polarization (vertical and horizontal), circular polarization (left and right), and elliptical polarization.

![](figs/fig5_1.png)

All light is polarized. If you look in the universe you will find it everywhere. Polarization is due to anisotropy $\rightarrow$ not all directions are equal. Typical anisotropies are introduced by geometry (scattering, reflection, refraction), by gradients and fields (temperature gradients in stellar atmospheres, density gradients, magnetic fields, electric fields), or by anisotropic materials (crystals, stressed glass, elongated dust grains).

Scattering part is not entirely clear to me yet.

## 5.2 - Jones Formalism

Another way to represent polarization is known as the Jones representation.

![](figs/fig5_2.png)

The nice thing about Jones vectors is that you can add them together: the Jones vector of a sum of (coherent) waves equals the sum of the Jones vectors of the individual waves. The Jones formalism describes fully polarized light only. It operates on amplitudes, not on intensities. However, elements of Jones vectors are not directly observable. What you can measure are products of Jones vector elements, e.g. the intensity $I = \vec{e} \cdot \vec{e}^* = e_x e_x^* + e_y e_y^* $.

The influence of a medium on polarization is described by a complex 2 by 2 Jones matrix $J$. This assumes that the medium is not affected by the polarization state of the light, which is almost always true in astronomy. Different media $1$ to $N$ in order of wave direction $\rightarrow$ combined influence described by $J = J_N J_{N-1} ... J_2 J_1$. The reverse order of the matrices in the product is crucial.
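The importance of the multiplication order can be shown with three ideal linear polarizers. The sketch assumes the usual Jones matrix of an ideal linear polarizer with its transmission axis at angle $\theta$ from the $x$-axis; helper names are hypothetical, and `combine` takes the elements in the order the light meets them.

```python
import math

def matmul2(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def apply(m, v):
    """Apply a 2x2 Jones matrix to a Jones vector [e_x, e_y]."""
    return [m[0][0] * v[0] + m[0][1] * v[1],
            m[1][0] * v[0] + m[1][1] * v[1]]

def intensity(v):
    """Observable intensity e_x e_x* + e_y e_y*."""
    return sum(abs(c) ** 2 for c in v)

def linear_polarizer(theta):
    """Ideal linear polarizer, transmission axis at angle theta (radians)."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c * c, c * s], [c * s, s * s]]

def combine(*elements):
    """J = J_N ... J_2 J_1 for elements listed in the order the light meets them."""
    total = [[1.0, 0.0], [0.0, 1.0]]
    for m in elements:
        total = matmul2(m, total)
    return total

H = linear_polarizer(0.0)
D = linear_polarizer(math.pi / 4.0)
V = linear_polarizer(math.pi / 2.0)
light_in = [1.0, 0.0]  # horizontally polarized, unit intensity

crossed = intensity(apply(combine(H, V), light_in))
with_diagonal = intensity(apply(combine(H, D, V), light_in))
reversed_order = intensity(apply(combine(V, D, H), light_in))
```

Crossed polarizers block everything, inserting a diagonal polarizer between them lets a quarter of the intensity through, and reversing the sequence blocks the horizontally polarized input again.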

## 5.3 - Stokes and Mueller Formalisms

Another way to represent polarized light: Stokes and Mueller formalisms.

A Stokes vector is a formalism to describe the polarization of quasi-monochromatic light. The beauty of the Stokes formalism is that it is linear in intensity: you can add the Stokes vectors of incoherent beams together, and they are directly related to measurable intensities. You can directly observe the components of the Stokes vector (contrary to the Jones formalism).

![](figs/fig5_3.png)

Stokes vectors can be directly derived from Jones vectors. Stokes vectors also have the advantage that they can describe unpolarized light, for which $Q = U = V = 0$.

![](figs/fig5_4.png)

![](figs/fig5_5.png)
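The relation between the two formalisms can be sketched as follows. The expressions are the usual ones for fully polarized light; the sign of $V$ depends on the handedness convention, so compare it with the definition in the figures above before relying on it. Function names are hypothetical.

```python
import math

def stokes_from_jones(ex, ey):
    """Stokes vector (I, Q, U, V) of the fully polarized wave with Jones vector (ex, ey).

    The sign of V is convention dependent.
    """
    ex, ey = complex(ex), complex(ey)
    cross = ex * ey.conjugate()
    i = abs(ex) ** 2 + abs(ey) ** 2
    q = abs(ex) ** 2 - abs(ey) ** 2
    u = 2.0 * cross.real
    v = 2.0 * cross.imag
    return (i, q, u, v)

def degree_of_polarization(stokes):
    """sqrt(Q^2 + U^2 + V^2) / I."""
    i, q, u, v = stokes
    return math.sqrt(q * q + u * u + v * v) / i

a = 1.0 / math.sqrt(2.0)
horizontal = stokes_from_jones(1.0, 0.0)
diagonal = stokes_from_jones(a, a)
circular = stokes_from_jones(a, 1j * a)

# Incoherent sum of equal horizontal and vertical beams: Stokes vectors add.
half_h = stokes_from_jones(a, 0.0)
half_v = stokes_from_jones(0.0, a)
unpolarized = tuple(x + y for x, y in zip(half_h, half_v))
```

Every Jones vector maps to a Stokes vector with degree of polarization 1, while the incoherent sum at the end has $Q = U = V = 0$, which no single Jones vector can represent.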

Just as Jones matrices act on Jones vectors, *Mueller matrices* act on Stokes vectors. Real 4x4 Mueller matrices describe the (linear) transformation between Stokes vectors when light passes through or reflects from media. Not all 16 matrix elements are independent of each other; Mueller matrices never have 16 degrees of freedom. Combining Mueller matrices works like combining Jones matrices: you multiply them in reverse order.

If we have an optical element, sometimes it will be handy to know what its Mueller matrix is when it is rotated:

$$ \boldsymbol{M}(\theta) = \boldsymbol{R}(-\theta) \boldsymbol{M} \boldsymbol{R}(\theta) $$

![](figs/fig5_6.png)

It is simply a change of coordinate system around the propagation direction.
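
As a hedged illustration of the rotation rule, the sketch below rotates an ideal horizontal linear polarizer by 45 degrees. The explicit form of $\boldsymbol{R}(\theta)$ used here (a rotation by $2\theta$ in the $Q$-$U$ plane) is the commonly used one and is an assumption. The figure above is the reference for the lecture's own convention.

```python
import math


def matmul(a, b):
    """Product a @ b of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def rotation(theta):
    """Assumed Mueller rotation matrix R(theta); theta in radians."""
    c, s = math.cos(2 * theta), math.sin(2 * theta)
    return [[1, 0, 0, 0],
            [0, c, s, 0],
            [0, -s, c, 0],
            [0, 0, 0, 1]]


def rotate_element(mueller, theta):
    """M(theta) = R(-theta) M R(theta)."""
    return matmul(rotation(-theta), matmul(mueller, rotation(theta)))


# Ideal linear polarizer transmitting +Q (horizontal) light.
POLARIZER_H = [[0.5, 0.5, 0, 0],
               [0.5, 0.5, 0, 0],
               [0, 0, 0, 0],
               [0, 0, 0, 0]]

polarizer_45 = rotate_element(POLARIZER_H, math.radians(45))
for row in polarizer_45:
    print([round(x, 3) for x in row])
```

With this convention the rotated element couples $I$ and $U$ instead of $I$ and $Q$, i.e. it has become a polarizer at 45 degrees.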

## 5.4 - Polarizers

Polarizers are optical elements that produce polarized light from unpolarized input light. There are multiple kinds of polarizers: linear, circular or, in general, elliptical. Linear polarizers are by far the most common because they are technically the easiest to produce. There is a large variety of ways to manufacture polarizers.

![](figs/fig5_7.png)

**Wire grid polarization**:
A very different way to make a polarizer is a wire grid. Parallel, conducting (metal) wires with spacing $d < \lambda$ act as a polarizer. An electric field parallel to the wires induces electrical currents in the wires. The induced currents reflect the polarization parallel to the wires, while the polarization perpendicular to the wires is transmitted.

Rule of thumb:

$$ d < \lambda/2 \quad \quad \text{strong polarization} $$

$$ d \gg \lambda \quad \quad \text{high transmission of both polarization states} $$

These wire grids are easy to make, work very well and are very nice to use.

**Polaroid polarizers**:

This is a somewhat cheaper way to make polarizers, and it was developed quite a while ago: Edwin Land developed polaroid sheet polarizers in 1938. A polyvinyl alcohol (PVA) sheet is laminated to a sheet of cellulose acetate butyrate and treated with iodine. The PVA-iodine complex is analogous to short, conducting wires, and mechanical stretching aligns the PVA-iodine molecules.

The beauty of this approach is that all materials are very cheap, and you can make them really large, and they can be mass manufactured, which explains why they are so cheap.

## 5.5 - Crystal Polarizers

When you want to make high-quality polarizers that produce really pure polarized light, crystals are by far the best way to go. This is because the atoms and molecules in a crystal are extremely well aligned, and the crystal is anisotropic. A crystal can separate an incoming beam into two beams with precisely orthogonal linear polarization states. Crystal polarizers also work well over a large wavelength range, and there are many different configurations. Calcite is most often used in crystal-based polarizers because of its large birefringence and low absorption in the visible. Many other crystals are suitable, including quartz, which is far more abundant and can be grown artificially.

In anisotropic materials, the dielectric constant is a tensor, so think of it as a 3 by 3 matrix. Maxwell's equations imply a symmetric dielectric tensor. A Cartesian coordinate system exists in which the tensor is diagonal $\rightarrow$ 3 principal indices of refraction in the coordinate system spanned by the principal axes.

**Uniaxial Materials:**

Anisotropic materials $n_x \neq n_y \neq n_z$

Uniaxial materials $n_x = n_y \neq n_z$. This means that two of the indices of refraction are the same.

**Optic axis not to be confused with optical axis.** The optic axis is the axis along which the index of refraction differs from the other two. It is also called the c-axis or crystallographic axis.

Ordinary axis $n_o = n_x = n_y$. A plane in which the index of refraction stays the same.

Extraordinary axis $n_e = n_z$, along optic axis.

Fast axis: the axis with the smallest index, because light polarized along it travels fastest.

Most materials used in polarimetry are (almost) uniaxial, e.g. calcite.

**Wave propagation in Uniaxial Media:**

Two solutions to the wave equation with orthogonal linear polarizations.

Ordinary ray propagates as in isotropic medium with index $n_o$. Extraordinary ray sees direction-dependent index of refraction:

$$ n_2(\theta) = \frac{n_o n_e}{\sqrt{n_o^2 \sin^2 \theta + n_e^2 \cos^2\theta}} $$

$\theta$ is the angle between the wave vector and the optic axis. For $\theta = 0$, $n_2 = n_o$; for $\theta = 90 \text{ deg}$, $n_2 = n_e$.
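
The direction dependence can be checked numerically with a short sketch. The function name is hypothetical, and the calcite indices ($n_o \approx 1.658$, $n_e \approx 1.486$ in the visible) are approximate values assumed here for illustration.

```python
import math


def extraordinary_index(theta_deg, n_o, n_e):
    """Index n_2(theta) seen by the extraordinary ray.

    theta_deg is the angle between the wave vector and the optic axis, in degrees.
    """
    theta = math.radians(theta_deg)
    return n_o * n_e / math.sqrt(
        n_o ** 2 * math.sin(theta) ** 2 + n_e ** 2 * math.cos(theta) ** 2
    )


N_O, N_E = 1.658, 1.486  # assumed approximate values for calcite

for angle in (0, 30, 60, 90):
    print(angle, round(extraordinary_index(angle, N_O, N_E), 4))
```

The index moves monotonically from $n_o$ along the optic axis to $n_e$ perpendicular to it.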

**How can we make this into a polarizer?**

![](figs/fig5_8.png)

One linear polarization goes straight through as in an isotropic material (ordinary ray). The linear polarization of the ordinary ray is perpendicular to the optic axis and the wave vector. The perpendicular linear polarization propagates at the dispersion angle $\alpha$ (extraordinary ray). The linear polarization of the extraordinary ray is perpendicular to that of the ordinary ray. Optical problems: different optical path lengths and crystal aberrations (astigmatism).

**Total Internal Reflection (TIR):**

Another way to build a crystal polarizer is based on TIR. The principle is this: $n_o \neq n_e \rightarrow$ one beam can be totally reflected while the other is transmitted. This is the principle of most crystal polarizers.
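
A rough numerical sketch of this idea follows. It assumes a calcite-to-air interface, approximate calcite indices, and an extraordinary ray that sees the full $n_e$ (optic axis perpendicular to the plane of incidence). All names are hypothetical.

```python
import math


def critical_angle_deg(n_inside, n_outside=1.0):
    """Critical angle for total internal reflection, in degrees."""
    if n_outside >= n_inside:
        raise ValueError("no total internal reflection when n_outside >= n_inside")
    return math.degrees(math.asin(n_outside / n_inside))


def is_totally_reflected(incidence_deg, n_inside, n_outside=1.0):
    """True if a ray hitting the interface at incidence_deg is totally reflected."""
    return incidence_deg > critical_angle_deg(n_inside, n_outside)


N_O, N_E = 1.658, 1.486  # assumed approximate values for calcite

crit_o = critical_angle_deg(N_O)
crit_e = critical_angle_deg(N_E)
print(round(crit_o, 1), round(crit_e, 1))

# Any internal incidence angle between the two critical angles separates the beams.
print(is_totally_reflected(40.0, N_O), is_totally_reflected(40.0, N_E))
```

Between the two critical angles the ordinary ray is totally reflected while the extraordinary ray is still transmitted, so the window is only a few degrees wide for these values.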

**Common Crystal Polarizers:**

Savart Plate; path length stays the same. Wollaston Prism does the same with a different mechanism. Note on the side: The angles in a Wollaston Prism are not exactly the same. The Foster Prism is another way.

![](figs/fig5_9.png)

## 5.6 - Retarders

**Linear retarders or Wave Plates:**

A linear retarder is a uniaxial crystal with its optic axis parallel to the surface ($\theta = 90 \text{ deg}$). The fast axis (f) has the lowest index, the slow axis (s) has the highest index. An example is the half-wave retarder: one polarization state is delayed by 180 degrees with respect to the other state.

Retarders do not change the intensity or the degree of polarization. They simply introduce a phase difference between two orthogonal polarization components.

The retardation of a retarder depends very much on wavelength. Achromatic retarders are combinations of different materials, or of the same material with different fast axis directions.

**Phase Change on Total Internal Reflection (TIR):**

Another way to build a retarder is to use a phase change on TIR. TIR induces phase change that depends on polarization.

**Variable retarders:**

A very different approach is the variable retarder. Sensitive polarimeters require retarders whose properties (retardance, fast axis orientation) can be varied quickly (modulated). Either the retardance changes (change of birefringence) or the fast-axis orientation changes (change of c-axis direction).

**Liquid Crystals (LC):**

Liquid crystal molecules are elongated, often with a dipole moment. The material is a crystal at low temperatures and an isotropic liquid at high temperatures. At intermediate temperatures, the liquid crystal is ordered in orientation and sometimes also in space in one or more dimensions. However, there is always one spatial dimension in which the molecules are random.

Now, how do you make retarders out of LC?

![](figs/fig5_10.png)

# Lecture 6 - Thin Films

## 6.1 - Thin-film Coatings

A thin film is a layer of material ranging from fractions of a nanometer (monolayer) to several micrometers in thickness. The controlled synthesis of materials as thin films (a process referred to as deposition) is a fundamental step in many applications.

The layer thickness $d_i < \lambda$ leads to interference between reflected and refracted waves. An infinite number of multiple reflections and refractions must be considered, which is of course not handy.

Ghosts are multiple reflections within a lens.

Thin film coating performance depends on polarization. Some polarizers are based on thin-film coatings. Coatings can also reduce polarizing effects of optical components. All aluminium mirrors have a (protective) aluminium oxide thin film.

## 6.2 - Calculating Thin Film Properties Part 1

If your thin film stack contains many layers, you will have to consider all interferences between reflected and refracted rays of each interface. The complexity of calculation was significantly reduced by the matrix approach (Abelès 1950). This approach is the basis for all thin-film software, and it applies only to isotropic thin films. For anisotropic materials, *Berreman calculus* exists.

**Add some more stuff from the slides here.**

## 6.3 - Calculating Thin Film Properties Part 2

We want to figure out how the electric field propagates between two layers/interfaces. So we are in one medium, with an interface at the top and one at the bottom.

$$ \delta = \frac{2\pi}{\lambda} \tilde{n}_1 d_1 \cos \theta_1 $$

**Note**: There is a strange cosine factor. If you tilt the wave, it actually means that the path length difference gets shorter, which is very counter intuitive. This is something that is not explained in many textbooks, but here is the explanation.

![](figs/fig6_1.png)

This can be developed into a matrix formalism.

![](figs/fig6_2.png)
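
To make the matrix approach concrete, here is a minimal sketch for a single layer at normal incidence. It uses one common textbook form of the characteristic matrix, which may differ in sign convention from the figure above, and the indices for the coating ($n_1 = 1.38$, MgF2-like) and the glass substrate ($n_s = 1.52$) are assumed example values.

```python
import cmath
import math


def layer_phase(n1, d1, wavelength, theta1=0.0):
    """Phase thickness delta = (2 pi / lambda) n1 d1 cos(theta1)."""
    return 2 * math.pi / wavelength * n1 * d1 * math.cos(theta1)


def single_layer_reflectance(n0, n1, ns, d1, wavelength):
    """Reflectance of one film (n1, d1) between media n0 and ns at normal incidence."""
    delta = layer_phase(n1, d1, wavelength)
    # Characteristic matrix of the layer (assumed convention).
    m11, m12 = cmath.cos(delta), 1j * cmath.sin(delta) / n1
    m21, m22 = 1j * n1 * cmath.sin(delta), cmath.cos(delta)
    # Propagate the substrate boundary values through the layer.
    b = m11 + m12 * ns
    c = m21 + m22 * ns
    r = (n0 * b - c) / (n0 * b + c)
    return abs(r) ** 2


WAVELENGTH = 550e-9          # metres
N_AIR, N_FILM, N_GLASS = 1.0, 1.38, 1.52
quarter_wave = WAVELENGTH / (4 * N_FILM)

print(single_layer_reflectance(N_AIR, N_FILM, N_GLASS, 0.0, WAVELENGTH))           # bare glass
print(single_layer_reflectance(N_AIR, N_FILM, N_GLASS, quarter_wave, WAVELENGTH))  # coated
```

For a stack of several layers, the individual layer matrices would be multiplied together before applying the substrate boundary values. The quarter-wave layer lowers the reflectance from about 4 percent to about 1 percent in this example.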

## 6.4 - Materials, Manufacturing and Properties

Depending on the index of refraction and the wavelength range that you need to transmit, you may choose different materials.

**Deposition**: *Evaporation* of solid material through a high electric current (e.g. classic Al mirror coatings). *Sputtering*: a plasma glow discharge ejects material from a solid substance; the ejected material follows uncontrolled ballistic flights, so mechanical shields are used to homogenize the beam. *Ion-assisted deposition*.

Dichroic filter: It transmits one wavelength range and reflects another wavelength range. Very handy in astronomy.

![](figs/fig6_3.png)

## 6.5 - Fabry-Perot Filters

The Fabry-Perot (tunable) filter was fabricated by Fabry and Perot in 1899.

![](figs/fig6_4.png)

The nice thing about these filters is that the bandpass can be adjusted by changing the plate separation. Interference occurs between partially transmitting plates enclosing a medium with index of refraction $n$. The path difference between successive beams is $\Delta = 2 n d \cos \theta$.

The phase difference: $\delta = 2\pi \Delta / \lambda = 4 \pi n d \cos \theta/\lambda$.

The finesse expresses how narrow the transmission peaks of the filter are relative to the separation between adjacent peaks. This is an important quantity.

$$ F = \frac{\pi \sqrt{R}}{1-R} $$
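
The following sketch evaluates the finesse together with the transmission of an ideal, lossless etalon. The Airy transmission formula $T = 1 / (1 + (2F/\pi)^2 \sin^2(\delta/2))$ is not given in these notes. It is the standard textbook result and is used here as an assumption.

```python
import math


def finesse(reflectivity):
    """F = pi sqrt(R) / (1 - R)."""
    return math.pi * math.sqrt(reflectivity) / (1 - reflectivity)


def phase_difference(n, d, wavelength, theta=0.0):
    """delta = 4 pi n d cos(theta) / lambda."""
    return 4 * math.pi * n * d * math.cos(theta) / wavelength


def transmission(delta, reflectivity):
    """Airy transmission of an ideal, lossless Fabry-Perot (assumed model)."""
    coefficient = (2 * finesse(reflectivity) / math.pi) ** 2
    return 1.0 / (1.0 + coefficient * math.sin(delta / 2) ** 2)


R = 0.9
print(round(finesse(R), 1))
print(transmission(2 * math.pi * 5, R))   # on a transmission peak
print(transmission(math.pi, R))           # halfway between two peaks
```

Raising the plate reflectivity increases the finesse, which narrows the peaks and deepens the minima between them.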

# Lecture 7 - Optical Design

## 7.1 - Optical Design

Optical design has a life cycle. It is not a linear process. It is an iterative process where you have to go back and change things. The reason for that is that the optical design is very close to other designs, such as mechanical designs, electrical designs, software design etc. You start with the requirements review.

![](figs/fig7_1.png)

**Tolerancing**: This means how accurate things have to be manufactured.

**Requirements**:
- *Examples directly related to optical design*: FoV, angular resolution, spectral range, spectral resolution, spectral sampling, minimum transmission.
- *Examples that (may) influence optical design*: Telescope interfaces & instrument location, cooling, weight limit, size limit, schedule, detector pixel size (angular sampling, custom detectors unlikely), temperature range.

Once you have all the above together, you want to do a requirement review for completeness and consistency. Then you derive the optical design requirements from the overall requirements.

Optical design influences most instrument aspects: SNR, mass, volume, cost, etc. To track the whole instrument with budgets (whole is sum of parts), you look at the different parts: For example, wavefront error budget, transmission/photon budget, weight budget, volume budget, cost budget, etc. This allows trade-offs between components while fulfilling overall requirements.

Once you have all the optics, you need to assemble, integrate and test them. This is referred to as AIT, or sometimes AIV (verification instead of testing). Make the AIT plan before production starts. The manufacturer may carry out part of the AIT and can add alignment guides during optics manufacturing. Alignment: put the optics into the correct position and orientation. This often requires mechanical and optical equipment; lasers can be helpful but have no field. Testing: performance and science verification.

## 7.2 - Optical Design Principles

1. **Minimize the number of optical components**: Additional elements add cost, ghosts, scattered light. However, additional elements increase performance.
2. **Maximize the Radii of Curvature**: Eases manufacturing and alignment, reduces aberrations. However, it might require more elements and/or longer designs and/or larger detectors.
3. **Maximize the Allowed Tolerances**: Tolerances are errors in manufacturing, assembly and alignment that you have to cope with. Larger tolerances simplify manufacturing, mechanics and operational requirements.
4. **Place Low-Quality Components Close to Focus**: Low quality here means large wavefront aberrations. If you put such components close to the pupil, they have a very large influence on the performance. If you place the same component close to the focus, you see hardly any influence. There is full aberration close to the pupil, minor aberration close to the focus.
5. **Place Spatially Varying Components Close to Pupil**: If all field points must pass the same part of a component, then place the component close to the pupil. But: rays from different fields pass filter at different angle $\rightarrow$ changes with field position.
6. **Place Angle-Sensitive Components in Collimated Beam**: If all rays from one field point should pass the component under the same inclination angle, then place the component in a collimated beam.
7. **Place Inclination-Sensitive Components in Telecentric Beam**: If the component is sensitive to the inclination angle, then place the component in a telecentric beam.
8. **Oversize Optical Elements**: The edges of optical elements/lenses always have worse optical quality; typically this affects about 10 percent. You also need to hold/mount optical elements, so you can't use the outer edge. Thermal and stress effects are most pronounced at the edges. So a good rule of thumb is to oversize the optical element by 10-20 percent.

## 7.3 - First-Order and Realistic Designs

First you have to make global design choices: Are you going to use lenses or mirrors (depends on wavelength range)? Choice of dispersing elements (prism, grating). Location of the aperture stop. Location of image and pupil planes. The sampling that you will have with the detector (Nyquist: > 2 pixels per resolution element). You may decide to split the light into different arms, because detectors cover a limited wavelength range. (De-)magnification. The F-number is an important one, because aberrations scale with a high power of the F-number. Operating temperature range, pressure.

First-order designs use ideal optical elements (e.g. paraxial surfaces), central and extreme field points and rays, and image and pupil locations. A first-order design establishes the general configuration. It is often based on existing designs, and can be sketched on paper or in a spreadsheet. It provides a first idea of the size of different designs.

After the first-order design you make a realistic design with raytracing software. This design replaces your ideal components with real components. You assess the performance of the realistic design with raytracing. Raytracing is based on the geometrical optics approximation, so it has its limits. These programs are only useful once the major design decisions have been made.

## 7.4 - Optimization

Optimization is built into most optical design software and automatically improves performance. Degrees of freedom = variables of the optical design. Examples are radii of curvature of optical surfaces, spacing between elements, conic constants, glass thicknesses etc. Sometimes you can automatically change glass types. Optimization does not add or remove optical elements, or change their order. Optimization requires a merit function (that can be optimized). The process is a non-linear minimization of the merit function.

The merit function depends on optical design parameters, boundary conditions such as maximum length, system parameters such as f-number, and aberration parameters such as rms spot size, rms wavefront aberration and field curvature, often as a function of field angle and wavelength. The function is a weighted sum of individual merit function terms.

Before you do global minimization, you can do local optimization (gradient descent). Local optimization depends on the starting point. Global optimization can be seen as a large number of local optimizations with different starting points, which often brings large improvements. You can never be sure whether the result you reach is actually the global optimum, so you run the code for a night or so.
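
The difference between local and global optimization can be illustrated with a toy problem. The one-dimensional merit function below is an invented stand-in for a real optical merit function, and the fixed grid of starting points replaces the random starts a real optimizer would typically use.

```python
def merit(x):
    """Toy merit function with one local and one global minimum (illustrative only)."""
    return x ** 4 - 3 * x ** 2 + x


def merit_gradient(x):
    """Analytic derivative of the toy merit function."""
    return 4 * x ** 3 - 6 * x + 1


def local_optimize(start, step=0.01, iterations=5000):
    """Plain gradient descent: slides into the nearest minimum below the start."""
    x = start
    for _ in range(iterations):
        x -= step * merit_gradient(x)
    return x


def global_optimize(starts):
    """Multi-start search: run a local optimization from each start, keep the best."""
    candidates = [local_optimize(s) for s in starts]
    return min(candidates, key=merit)


x_local = local_optimize(2.0)
x_global = global_optimize([-2.0 + 0.5 * i for i in range(9)])
print(round(x_local, 3), round(merit(x_local), 3))
print(round(x_global, 3), round(merit(x_global), 3))
```

Starting from $x = 2$, the local optimizer settles in the shallower minimum, while the multi-start search finds the deeper one. Even so, the multi-start result is only the best minimum among the starts that were tried.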

## 7.5 - Tolerancing

Tolerancing is based on tolerancing analysis. Everything that is manufactured will have deviations from the design values. The tolerancing analysis determines the maximum allowed tolerances to which optical elements have to be manufactured and positioned, and to which environmental parameters (e.g. temperature) have to be controlled. You can trade off tolerances between optical elements to increase manufacturability, reduce costs, make alignment easier, and make performance less sensitive to environmental changes.

The simplest form of a tolerance analysis is a sensitivity analysis. Sensitivity analysis requires assumed/known tolerances for each design parameter, and a scalar merit function to quantify the performance change due to tolerances. It is often the same merit function as used for optimization. This analysis reveals the sensitivity of the merit function with respect to the assumed tolerances, so it focuses attention on the most sensitive parameters.

**Compensators**: Some effects of tolerances can be compensated during alignment or manufacturing of other components. A compensator is a parameter that is optimized after tolerances have been applied. The image focus is the most common compensator. The focus compensator greatly reduces sensitivity to radii and thicknesses.

Sensitivity analysis determines merit function change for given tolerances. However, we want maximum tolerances for given maximum allowed merit change $\rightarrow$ inverse sensitivity analysis.

**Inverse Sensitivity Analysis**: 
- Determines maximum tolerance in each individual design parameter separately for given maximum allowed change in merit function.
- Maximum allowed merit change for inverse sensitivity analysis should be fraction of total maximum merit change.
- Provides a first approximation to tolerances to be specified.
- Does not consider coupling of simultaneous errors in all design parameters.

**Monte Carlo Tolerance Analysis**: This is really what you want to do in the end.
- Considers coupling of simultaneous errors in all design parameters
- Many realizations of random parameter errors within tolerances
- Provides realistic estimate of expected performance
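
A toy version of the sensitivity and Monte Carlo analyses is sketched below. It assumes a single thin lens described by the lensmaker's equation, a merit function equal to the focal-length error, uniformly distributed errors, and invented tolerance values. No compensator is applied.

```python
import random

DESIGN = {"r1": 100.0, "r2": -100.0, "n": 1.5}    # radii in mm, refractive index
TOLERANCE = {"r1": 0.5, "r2": 0.5, "n": 0.001}    # assumed +/- tolerances


def focal_length(p):
    """Thin-lens lensmaker's equation: 1/f = (n - 1)(1/r1 - 1/r2)."""
    return 1.0 / ((p["n"] - 1.0) * (1.0 / p["r1"] - 1.0 / p["r2"]))


F_DESIGN = focal_length(DESIGN)


def merit(p):
    """Merit function: absolute focal-length error in mm."""
    return abs(focal_length(p) - F_DESIGN)


def sensitivity():
    """Merit change when each parameter alone is perturbed by its tolerance."""
    result = {}
    for name, tol in TOLERANCE.items():
        perturbed = dict(DESIGN)
        perturbed[name] += tol
        result[name] = merit(perturbed)
    return result


def monte_carlo(n_trials, seed=0):
    """Sorted merit values for random simultaneous errors within all tolerances."""
    rng = random.Random(seed)
    merits = []
    for _ in range(n_trials):
        trial = {k: DESIGN[k] + rng.uniform(-TOLERANCE[k], TOLERANCE[k])
                 for k in DESIGN}
        merits.append(merit(trial))
    return sorted(merits)


print(sensitivity())
merits = monte_carlo(1000)
print("95th percentile focal-length error [mm]:", round(merits[949], 3))
```

The sensitivity analysis looks at one parameter at a time, while the Monte Carlo run perturbs all parameters simultaneously and therefore captures their coupling. A percentile of the resulting distribution is one possible summary of the expected performance.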

# Lecture 8 - Adaptive Optics

## 8.1 - The Atmosphere

Astronomers want to obtain the highest possible spatial resolution. Unfortunately there is an atmosphere.

The largest fraction of the atmosphere by mass is in the first 20 kilometres. The mixing of warm and cold air generates turbulence in the atmosphere.

To describe the mixing of warm air bubbles with the cool air in the atmosphere above, we turn to the Kolmogorov model.

**Theory of turbulence**: The Kolmogorov model (1941) describes  turbulence in velocity fields. The flow is turbulent if the *Reynolds number* $R_e$ is large.

$$ R_e = \frac{V_0 L_0}{\nu_0} $$

Where $L_0$ is the scale size, $\nu_0$ the viscosity of air and $V_0$ the velocity. Turbulent flow breaks up into smaller scales, and the Reynolds number drops until the inner scale $l_0$ is reached, where the kinetic energy is dissipated as heat.

The following expression relates the inner and outer scale:

$$ l_0 = \frac{L_0}{R_e^{3/4}} $$

So if the average flow velocity $V$ is larger, Reynolds number is larger, and inner scale is smaller.
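
As a quick numerical check of these two relations, the sketch below evaluates them for assumed, order-of-magnitude inputs: a 15 m outer scale, a 10 m/s flow velocity and a kinematic viscosity of air of about $1.5 \times 10^{-5}$ m$^2$/s. The function names are hypothetical.

```python
def reynolds_number(velocity, outer_scale, viscosity):
    """Re = V0 L0 / nu0."""
    return velocity * outer_scale / viscosity


def inner_scale(outer_scale, reynolds):
    """l0 = L0 / Re^(3/4)."""
    return outer_scale / reynolds ** 0.75


NU_AIR = 1.5e-5   # m^2/s, assumed kinematic viscosity of air
L_OUTER = 15.0    # m, assumed outer scale

for velocity in (5.0, 10.0, 20.0):
    re = reynolds_number(velocity, L_OUTER, NU_AIR)
    print(velocity, f"{re:.2e}", f"{inner_scale(L_OUTER, re):.2e}")
```

The Reynolds number is many orders of magnitude above unity for such inputs, and doubling the velocity reduces the inner scale by a factor of $2^{3/4}$.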

At each scale size $l$, there are fluctuations in the velocity field. The power spectrum of these fluctuations is then

$$ \Psi(\kappa) \propto \kappa^{-5/3} $$

This is the one dimensional Kolmogorov spectrum.

If we have two clumps inside the turbulent velocity field separated by distance $r$ and they have two components of velocity in the $x$-direction, then:

$$ D_V(r) = \langle |V(x) - V(x+r)|^2 \rangle = C_V^2 r^{2/3} $$

$C_V^2$ is the velocity structure constant.

Changes in temperature are the dominant source of changes in refractive index of the air.

Adding up the separate contributions of each layer of turbulence by integrating through all heights above the telescope yields the expression for the Fried length parameter $r_0$

$$ r_0 = \bigg[ 0.423 k^2 (\text{sec}\zeta) \int \text{d}h C_N^2(h) \bigg]^{-3/5} $$

The atmosphere is a complex 3D column of air, which we simplify by describing it with Kolmogorov statistics. There is an outer scale and an inner scale length ($L_0$ and $l_0$), and the power spectrum of index fluctuations runs between these two scales. Turbulence isn't homogeneous throughout the atmosphere, but is concentrated at different altitudes, which can be from 2 to 5 km depending on the telescope site and the season. Each thin layer of this turbulent atmosphere is described by its Fried length and the mean velocity of the layer. By integrating over the $C_N^2$ profile, an estimate of the seeing can be calculated.
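
The integration over the $C_N^2$ profile can be sketched as follows. The uniform toy profile is invented for illustration, and the conversion from $r_0$ to a seeing disk, FWHM $\approx 0.98 \lambda / r_0$, is the standard relation and is an assumption here since it is not given in these notes.

```python
import math


def fried_parameter(cn2_profile, wavelength, zenith_deg=0.0):
    """r0 in metres from a list of (height [m], C_N^2 [m^(-2/3)]) samples.

    The integral over height uses the trapezoidal rule.
    """
    k = 2 * math.pi / wavelength
    integral = 0.0
    for (h0, c0), (h1, c1) in zip(cn2_profile, cn2_profile[1:]):
        integral += 0.5 * (c0 + c1) * (h1 - h0)
    sec_zeta = 1.0 / math.cos(math.radians(zenith_deg))
    return (0.423 * k ** 2 * sec_zeta * integral) ** (-3 / 5)


def seeing_arcsec(r0, wavelength):
    """Seeing FWHM ~ 0.98 lambda / r0, converted to arcseconds (assumed relation)."""
    return math.degrees(0.98 * wavelength / r0) * 3600


# Toy profile: constant C_N^2 over the lowest 5 km (illustrative values only).
PROFILE = [(0.0, 1e-16), (5000.0, 1e-16)]

r0_visible = fried_parameter(PROFILE, 500e-9)
print(round(r0_visible, 3), "m")
print(round(seeing_arcsec(r0_visible, 500e-9), 2), "arcsec")
print(round(fried_parameter(PROFILE, 1000e-9) / r0_visible, 2))
```

Because $k = 2\pi/\lambda$ enters squared inside the $-3/5$ power, $r_0$ grows as $\lambda^{6/5}$, which is why the same atmosphere is more benign in the infrared.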

## 8.2 - Lucky Imaging and Tip Tilt Sensors

Ground-based telescopes do not reach the diffraction limit for diameters larger than 0.1 m. Atmospheric turbulence smears diffraction-limited images into seeing-limited images, typically 1 arcsecond in diameter. We can then take a given patch of free atmosphere as intercepted by a telescope and decompose its wavefront into Zernike polynomials.

**Lucky imaging**: EMCCDs have little to no read noise. Rapid readouts allow freezing of atmospheric turbulence. Keep the images with the smallest amount of turbulence and shift and stack them together.
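
The select, shift and stack steps can be sketched on one-dimensional toy frames. Using the peak pixel value as the sharpness metric and the brightest pixel as the reference for the shift are assumptions made for this illustration; real pipelines use more careful metrics and sub-pixel registration.

```python
def lucky_stack(frames, keep_fraction=0.1):
    """Select the sharpest frames, re-centre them on their peak and add them up.

    frames: list of equal-length lists of pixel values (1-D toy images).
    """
    # A sharper frame concentrates its light into a higher peak.
    ranked = sorted(frames, key=max, reverse=True)
    n_keep = max(1, int(len(ranked) * keep_fraction))
    selected = ranked[:n_keep]

    size = len(frames[0])
    centre = size // 2
    stacked = [0.0] * size
    for frame in selected:
        shift = centre - frame.index(max(frame))  # move the peak to the centre
        for i, value in enumerate(frame):
            j = i + shift
            if 0 <= j < size:
                stacked[j] += value
    return stacked


frames = [
    [0, 0, 5, 1, 0],   # sharp, already centred
    [0, 3, 3, 0, 0],   # blurred
    [0, 0, 0, 1, 6],   # sharp, but displaced by tip-tilt
    [1, 2, 2, 1, 0],   # blurred
]
print(lucky_stack(frames, keep_fraction=0.5))
```

Only the two sharpest frames are kept, and the displaced one is shifted back before stacking so that its peak adds to the centred one instead of broadening the image.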

## 8.3 - Adaptive Optics and Wavefront Sensors

![](figs/fig8_1.png)

You determine the quality of the corrected wavefront with the Strehl ratio $S$: the ratio of the measured peak intensity to the diffraction-limited peak intensity. It can be very tricky to measure accurately.

The achromaticity of the atmospheric optical path difference (OPD) is exploited in adaptive optics (AO). Measuring the wavefront at shorter wavelengths means that you can correct for the atmosphere at longer wavelengths. Many systems measure in the visible and provide correction for red and infra-red wavelengths.

## 8.4 - Deformable Mirrors

## 8.5 - Laser Guide Star Systems

# Lecture 9 - Imagers and Detectors

## 9.1 - Detectors

The human eye has a theoretical angular resolution of 14'' and, in practice, about 1 arcminute. The development of photographic plates allowed the accurate recording of astronomical events. Their heyday lasted from the 1920s until the 1980s.

Canals on Mars: not important. Photographic plates: not important.



## 9.2 - Sampling Theorem

## 9.3 - Case study: SDSS

## 9.4 - Case study: Atmospheric Dispersion Correctors

# Lecture 10 - Spectrographs

## 10.1 - Filters and basic spectrographs

**Spectrograph**: An instrument used to measure properties of light over a specific portion of the electromagnetic spectrum.

Important questions when designing spectrographs:
- What kind of spectral resolution do you need?
$$ R = \frac{\lambda}{\Delta\lambda} $$
- What bandwidth (wavelength range) do you want to cover? How wide a range of wavelengths should the spectrograph work over?
- Maximise throughput for best efficiency.

![](figs/fig10_1.png)

Original UBV filters by Johnson and Morgan 1953. VRIJKLMNQ filters by Johnson 1960. After that, each set of filters has a specific science goal, i.e. measuring properties of stars or galaxy observations.

**Slitless spectrograph**: Put a dispersing element in front of the telescope aperture. Each individual image of a star is smeared out across the detector into a small spectrum $\rightarrow$ handy when you want to characterise large numbers of stars simultaneously. The solution to spectra overlapping each other is to take multiple observations from different directions. Slitless spectrographs can also be used on extended objects.

Layout of a basic spectrograph:

![](figs/fig10_2.png)

**Collimated beam**: Light that has parallel rays, and therefore will spread minimally as it propagates. Perfect collimated beam would mean no divergence, so no dispersion with distance. Diffraction prevents this.

**Anamorphic magnification**: Means that the magnification along mutually perpendicular radii is different. Anamorphic magnification arises because the diffracting element may send light off at a large angle from the camera normal, and is defined as $r$.

**Spatial (de)magnification**: Spatial (de)magnification occurs because of the different focal lengths of the camera and collimator so that detector pixels are Nyquist sampled.

**Resolution element**: The minimum resolution of the spectrograph. This depends on the spectral size of the image, which is a function of image size, spectral magnification and the linear dispersion.

**The slit**: We cannot record three dimensions of data (x, y, wavelength) onto a two-dimensional detector, so we need to choose how we fill up our detector area. For a seeing-limited object, such as a star, choosing the slit width is a balance between spectral resolution and throughput. If the slit is too wide, the spectral resolution goes down. If the slit is too narrow, flux from the seeing-limited object is lost.

Spectrographic slits are given in terms of their angular size on the sky, either in arc seconds or in radians.

## 10.2 - Dispersing Elements

**Angular dispersion**:
$$ A = \frac{d\beta}{d\lambda} $$

**Linear dispersion**:
$$ \frac{dl}{d\lambda} = f_\text{cam} A $$

**Glass prisms** are one of the earliest dispersing elements used in spectrographs. Prisms are used near minimum deviations so that rays inside the prism are parallel to the base. The input and output beams are the same size. Dispersion is higher at bluer wavelengths.

**Diffraction grating**: An optical element with a periodic structure on one side. It provides coherent interference in a preferred direction. Gratings can be transmissive or reflective, and consist of thousands of periodic features on an optically flat surface. A flat wavefront passes through the periodic structure, which changes the amplitude and/or phase. The direction of constructive interference is wavelength dependent.

**Formula diffraction grating**: From diffraction theory, the grating equation relates the order $m$ and the wavelength $\lambda$ to the groove spacing $\sigma$ (the distance, e.g. in mm, between adjacent ruled lines), the angle of incidence $\alpha$ and the angle of diffraction $\beta$. Constructive interference is achieved when this equation is satisfied.

$$ m\lambda = \sigma(\sin \alpha \pm \sin \beta) $$
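
The grating equation can be solved for the diffraction angle and combined with the dispersion definitions from the start of this section. The sketch below picks the $+$ sign of the equation, uses $A = \mathrm{d}\beta/\mathrm{d}\lambda = m / (\sigma \cos\beta)$ obtained by differentiating it at fixed $\alpha$, and assumes an example grating with 600 lines/mm and a 200 mm camera. All of these are illustrative choices.

```python
import math


def diffraction_angle(m, wavelength, sigma, alpha_deg):
    """Solve m lambda = sigma (sin alpha + sin beta) for beta, in degrees."""
    s = m * wavelength / sigma - math.sin(math.radians(alpha_deg))
    if abs(s) > 1:
        raise ValueError("this order does not propagate for the given geometry")
    return math.degrees(math.asin(s))


def angular_dispersion(m, sigma, beta_deg):
    """A = d(beta)/d(lambda) = m / (sigma cos beta), in rad per metre."""
    return m / (sigma * math.cos(math.radians(beta_deg)))


def linear_dispersion(f_cam, dispersion):
    """dl/d(lambda) = f_cam * A."""
    return f_cam * dispersion


SIGMA = 1e-3 / 600      # groove spacing in metres for 600 lines/mm
ALPHA = 10.0            # angle of incidence in degrees
F_CAM = 0.2             # camera focal length in metres

beta = diffraction_angle(1, 500e-9, SIGMA, ALPHA)
A = angular_dispersion(1, SIGMA, beta)
print(round(beta, 2), "deg")
print(f"{A:.3e} rad/m")
print(f"{linear_dispersion(F_CAM, A) * 1e-9 * 1e6:.1f} micron per nm")
```

The last line converts the linear dispersion into microns on the detector per nanometre of wavelength, which is the quantity to compare with the detector pixel size.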

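**Higher spectral orders**: Higher-order dispersion from the grating results in overlapping spectra. We can use either an order-blocking filter or a cross disperser to separate the different spectral orders. The free spectral range follows from the condition that wavelength $\lambda^\prime$ in order $m$ coincides with wavelength $\lambda$ in order $m+1$: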

$$ m\lambda^\prime = (m+1)\lambda $$
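
Solving this condition for the wavelength interval gives the free spectral range explicitly. The symbol $\Delta\lambda_\text{FSR}$ is introduced here for convenience.

$$ \Delta\lambda_\text{FSR} = \lambda^\prime - \lambda = \frac{(m+1)\lambda}{m} - \lambda = \frac{\lambda}{m} $$

The free spectral range therefore shrinks as the order increases: at $\lambda = 500$ nm it is 500 nm in first order but only 10 nm in order $m = 50$, so high-order spectrographs need order separation.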

You can optimize the grating efficiency by setting a different blaze normal.

**Volume phase holographic gratings**: Another type of diffraction grating. This is typically a very thin layer of material that holds a periodic pattern. The advantage is that they can have very high transmission efficiencies, almost 100 percent.

By working at higher $m$, you can get a higher spectral resolution $R$.

## 10.3 - Immersed Gratings and other Spectrographs

**Immersion grating**: A grating that is cut into one side of a high refractive index material. Light enters the high refractive index material through one face, ideally at normal incidence, propagates inside the material, and is diffracted off the grating cut into the material. The advantage is a much smaller volume, which is ideal for cryogenic cooling and space-based applications.

$$ m\lambda = n\sigma(\sin \alpha \pm \sin \beta) $$

Angular dispersion within material:

$$ A = \frac{m}{n\sigma \cos \beta} $$

An IG spectrograph takes up a much smaller volume. Silicon is an ideal material for making immersion gratings at thermal IR wavelengths: it is transparent from 1 to 7 microns and has a relatively high index of 3.4. Pro: extremely clean grooves mean little scattered light. Con: you have to deal with the crystal geometry of silicon.

**The Littrow spectrograph**: The incident angle equals the diffracted angle, $\alpha = \beta$. This simplifies the grating design: the blaze angle is set so that the optimum efficiency is at $\alpha$.

**Detector**: The smallest resolution element of the spectrograph should be sampled at least at the Nyquist rate, which is 2 pixels per resolution element.

**Fourier Transform Spectrograph**: A Michelson interferometer with one moving arm. PROS: Simple, compact, absolute calibration of spectral lines possible. CONS: very susceptible to any change in background flux.

## 10.4 - Multi Object Spectrographs

Multi object spectroscopy is the technology that enables us to take spectra of many hundreds of targets simultaneously, within a telescope's field of view. There are several different ways of doing this.

- Drilled spectro slits. Old-fashioned way. Slits are drilled at specific locations of where all the sources will be. 
- Laser cut slits. Looks like little barcodes. These are individual spectra of the night sky. Highly efficient way.
- Configurable slits (on MOSFIRE, Keck).

## 10.5 - Fibre fed Spectrographs

**Fibre optics**: Made of glass, typically fused silica. Diameters range from 200 microns down to 10 microns. Fibres can be used for reformatting the focal plane of the telescope into an arbitrary shape, which makes it possible to observe hundreds of objects simultaneously. They rely on the principle of total internal reflection.

Optical fibre causes azimuthal scrambling, because of the large amount of internal reflections. The output light will not be a dot, but a circular bright wing of light.

A tremendous advantage of optical fibres is that the optics of the spectrograph can be completely decoupled from the input of the optical fibres. This allows for tremendous flexibility in the optical design.

**Focal Ratio Degradation**: Ideal preservation of input beam angle and output beam angle. In reality, deformation and stress causes light to 'spread' in output angle cone. If you don't take this into consideration you will lose a lot of light.

Typical layout optical fibre spectrograph:

![](figs/fig10_3.png)

Because gluing optical fibres onto a glass plate is an awful process to do by hand, robotic positioners were quickly developed to automate the whole process.

The main limit for observing highly crowded fields is the physical size of the fibre carriers: in the case of 2dF, two fibres cannot be placed closer together than 30 arcseconds. Densely crowded fields therefore require multiple observations.

## 10.6 - Three Dimensional Spectroscopy

Ideally, you want to record $x, y$ and $\lambda$ all 3 simultaneously. In 2D there are pixels, in 3D they are called spaxels.

**Optical Fibre Image Reformatter**: Use the flexibility of fibres to reformat the 2D sky into a 1D entrance slit. Pros: complete reformatting of the sky possible, decouples IFU from spectrograph, typically small (10 arcsec) fields of view. Cons: variable fibre transmission.

# Lecture 11 - Coronagraphs

## 11.1 - History

Bernard Lyot was a French astronomer and instrument maker who wanted to study the Sun's corona without waiting for an eclipse. He invented the coronagraph in 1939 to block sunlight and take photos of the corona.

It took a long time before coronagraphs led to discoveries beyond the Sun. The first circumstellar disk was imaged in 1984. It was the first evidence that the infrared excesses measured by all-sky surveys in the IR were due to flattened disks of dust, the sites of planet formation.

It would take a long time again before the next disk was discovered: AU Mic in 2004.

Coronagraphs are also installed on space telescopes, for example HST.

The Fomalhaut debris disk is one of the most famous debris disk images.

The first brown dwarf, Gliese 229B, was discovered in 1994 with the Palomar 60 inch telescope. It was also imaged with the HST.

Coronagraphs are a way of blocking the light of the central star, while the light from just beside the star (where planets are) passes through. The goal is to minimize the diffracted light close to the star.

## 11.2 - The Lyot Coronagraph

Coronagraphs are angular filters. 

## 11.3 - Small IWA Coronagraphs

The Inner Working Angle (IWA) is the smallest angle on the sky at which a coronagraph can reach its designed contrast.

## 11.4 - Diversity and Algorithms

## 11.5 - New Concepts in Coronagraphs

# Lecture 12 - Polarimeter

Polarimetry is the noble science of measuring polarization. 

## 12.1 - Astronomical polarimetry: opportunities, challenges & definitions

**Astronomical polarimetry**:
- Polarization is created (and/or modified) whenever the rotational symmetry is broken as seen from the observer.
- Polarization therefore carries information about remote sources that cannot be obtained in any other way!
- If you measure carefully enough, everything is polarized...
- ... including many things that you didn’t expect to see or really don’t want to see.

The basics: how do we describe a polarization measurement? We do this with the Stokes parameters. Most detectors, if not all, measure only intensity: they count photons. It is therefore useful to describe the Stokes parameters in terms of polarization-filtered intensity measurements. $I$ is the intensity, $Q$ and $U$ describe linear polarization and $V$ describes circular polarization. Polarimetry is therefore differential photometry, because each of $Q$, $U$ and $V$ is obtained by subtracting two different intensity measurements from each other. Fractional Stokes parameters are the Stokes parameters divided by $I$, so $Q/I$, $U/I$ and $V/I$ all have a value between $-1$ and $1$.
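As a minimal sketch of this description, the Stokes parameters can be built from six filtered intensities. The choice of reference angles (0°, 45°, 90°, 135°) and the sign convention for $V$ (right minus left circular) are assumptions; conventions differ between authors, and the function names are hypothetical.

```python
def stokes_from_intensities(i0, i90, i45, i135, i_rcp, i_lcp):
    """Stokes parameters from intensities behind ideal polarization filters.

    i0, i90, i45, i135: linear polarizer at 0, 90, 45 and 135 degrees.
    i_rcp, i_lcp: right- and left-handed circular polarization filters.
    """
    total = i0 + i90      # I: total intensity
    q = i0 - i90          # Q: horizontal minus vertical
    u = i45 - i135        # U: +45 degrees minus -45 degrees
    v = i_rcp - i_lcp     # V: assumed sign convention
    return total, q, u, v


def fractional(stokes):
    """Fractional Stokes parameters (Q/I, U/I, V/I)."""
    total, q, u, v = stokes
    return q / total, u / total, v / total


# Fully horizontally polarized light of unit intensity.
horizontal = stokes_from_intensities(1.0, 0.0, 0.5, 0.5, 0.5, 0.5)
# Unpolarized light: every filter passes half of the intensity.
unpolarized = stokes_from_intensities(0.5, 0.5, 0.5, 0.5, 0.5, 0.5)

print(fractional(horizontal))
print(fractional(unpolarized))
```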

![](figs/fig12_1.png)

![](figs/fig12_2.png)

Those were the real basics. Now to the performance parameters:
- Polarimetric sensitivity: The noise level in $Q/I$. Photon noise is a fundamental limit to polarimetric sensitivity. Polarimetry is a photon-hungry technique!
- Polarimetric accuracy.
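Why polarimetry is photon-hungry can be made concrete with a short estimate. The sketch below assumes pure Poisson noise and a single normalized difference $q = (N_1 - N_2)/(N_1 + N_2)$ of two photon counts; standard error propagation then gives $\sigma_q = \sqrt{4 N_1 N_2 / N^3}$ with $N = N_1 + N_2$, which reduces to $1/\sqrt{N}$ for weakly polarized light. The function names are hypothetical.

```python
import math


def sigma_q(n1, n2):
    """Photon-noise error on q = (n1 - n2) / (n1 + n2) for Poisson counts."""
    n = n1 + n2
    return math.sqrt(4.0 * n1 * n2 / n**3)


def photons_needed(target_sigma):
    """Total photons needed for a given sensitivity, assuming small q."""
    return 1.0 / target_sigma**2


# A sensitivity of 1e-4 in Q/I requires of order 1e8 detected photons.
print(sigma_q(5e7, 5e7))
print(photons_needed(1e-4))
```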

![](figs/fig12_3.png)

![](figs/fig12_4.png)

Three challenges for polarimetry:
1. Each polarization measurement is a differential measurement. $\rightarrow$ The measurement strategy needs to be adapted to differential effects (noise/systematics) that can induce polarimetric errors.
2. Each optic that is not perfectly rotationally symmetric induces instrumental polarization and/or modifies the incoming polarization. $\rightarrow$ The end-to-end polarimetric response needs to be calibrated.
3. The polarization measurement needs to be integrated with imaging and/or spectral measurements. $\rightarrow$ Polarimeters need to be optimized at systems level, and fully integrated in the design from the very beginning.

## 12.2 - Basics of polarization measurements

Polarization modulation transforms polarization information into at least two intensity measurements. The modulation can be in one or several dimensions: time, space, or wavelength.

A rotating polarizer samples the whole region of $U/I$ and $Q/I$ space, which provides a good polarization measurement for linear polarization. Maximum separation of modulation states provides optimal polarimetric efficiency: optimal noise propagation to $Q/I$ and $U/I$.
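The effect of the separation of the modulation states on noise propagation can be checked numerically. The sketch below assumes an ideal linear polarizer, for which the transmitted intensity at angle $\theta$ is $\frac{1}{2}(I + Q\cos 2\theta + U\sin 2\theta)$, and equal, uncorrelated noise on every intensity measurement. The angle sets and the input Stokes values are hypothetical.

```python
import numpy as np


def polarizer_matrix(angles_deg):
    """Rows map (I, Q, U) to the intensity behind an ideal polarizer."""
    t = np.deg2rad(np.asarray(angles_deg, dtype=float))
    return 0.5 * np.column_stack([np.ones_like(t), np.cos(2 * t), np.sin(2 * t)])


def noise_amplification(angles_deg):
    """Standard deviation of the least-squares (I, Q, U) estimates
    for unit noise on each intensity measurement."""
    m = polarizer_matrix(angles_deg)
    cov = np.linalg.inv(m.T @ m)
    return np.sqrt(np.diag(cov))


well_spread = [0, 45, 90, 135]   # maximally separated states
clustered = [0, 10, 20, 30]      # poorly separated states

# Simulate a noise-free measurement and recover the input.
stokes_in = np.array([1.0, 0.03, -0.02])  # hypothetical I, Q, U
intensities = polarizer_matrix(well_spread) @ stokes_in
recovered, *_ = np.linalg.lstsq(polarizer_matrix(well_spread), intensities, rcond=None)

print(noise_amplification(well_spread))
print(noise_amplification(clustered))
```

The clustered angles give much larger errors on $Q$ and $U$ for the same number of measurements, which is what the statement about maximum separation expresses.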

**Full-Stokes polarization modulation**: The Poincaré sphere.

**The modulation matrix**: Is derived from the Mueller matrix.
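As an illustration of how a modulation matrix follows from Mueller matrices, the sketch below assumes one particular modulator: a rotating quarter-wave plate followed by a fixed ideal polarizer at 0°. Because the detector only measures intensity, each row of the modulation matrix is the first row of the Mueller matrix of the optical train for one modulation state. The sign conventions of the retarder matrix and the eight plate angles are assumptions; the recovery of the input Stokes vector with a pseudo-inverse does not depend on that sign choice.

```python
import numpy as np


def rotation(theta):
    """Mueller rotation matrix for an angle theta (radians)."""
    c, s = np.cos(2 * theta), np.sin(2 * theta)
    return np.array([[1, 0, 0, 0],
                     [0, c, s, 0],
                     [0, -s, c, 0],
                     [0, 0, 0, 1.0]])


def retarder(theta, delta):
    """Linear retarder with fast axis at theta and retardance delta."""
    c, s = np.cos(delta), np.sin(delta)
    m0 = np.array([[1, 0, 0, 0],
                   [0, 1, 0, 0],
                   [0, 0, c, s],
                   [0, 0, -s, c]])
    return rotation(-theta) @ m0 @ rotation(theta)


# Ideal linear polarizer with its transmission axis at 0 degrees.
POLARIZER_0 = 0.5 * np.array([[1, 1, 0, 0],
                              [1, 1, 0, 0],
                              [0, 0, 0, 0],
                              [0, 0, 0, 0.0]])


def modulation_matrix(angles_deg, delta=np.pi / 2):
    """One row per modulation state: first row of the train's Mueller matrix."""
    rows = []
    for a in np.deg2rad(angles_deg):
        mueller = POLARIZER_0 @ retarder(a, delta)
        rows.append(mueller[0])
    return np.array(rows)


angles = np.arange(8) * 22.5              # hypothetical plate angles in degrees
mod = modulation_matrix(angles)           # shape (8, 4)
demod = np.linalg.pinv(mod)               # demodulation matrix

stokes_in = np.array([1.0, 0.02, -0.01, 0.005])  # hypothetical I, Q, U, V
intensities = mod @ stokes_in
stokes_out = demod @ intensities
print(stokes_out)
```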

**Demodulation**:

## 12.3 - Temporal modulation & Dual-beam exchange

## 12.4 - Spectral modulation

## 12.5 - Instrumental polarization & polarization calibration

## 12.6 - Polarimetric systems