# Clustering and Symmetries in Cosmology

DESI is designed to study dark energy by measuring the expansion history of the Universe. The idea is as follows: We can measure how matter is spatially distributed at different points in the Universe's history. On the large scales DESI observes, the distribution of matter is due to gravity and how spacetime "carries" matter as it expands. Albert Einstein's theory of general relativity (GR) explains how gravity works (side note: DESI can actually test GR), so we can use it to isolate effects caused by the expanding Universe. When we study how expansion has changed over time, we see that expansion has sped up (or accelerated) over the last few billion years. Dark energy is the name we give to whatever is causing accelerated expansion. Thus any study of expansion history is also a study of dark energy.

Great, so now we understand that looking at a bunch of matter can tell us about dark energy. Finding matter sounds easy enough in principle, but doing so turns out to be difficult in practice. Why is that? Most matter in the Universe (about 84% of it) is dark matter, which, as the name implies, neither emits nor absorbs light. We have to be creative to figure out where matter is! A common approach is to look for normal matter, like we see in galaxies, and use that to estimate the distribution of total matter. This works because normal and dark matter are gravitationally attracted to each other, causing them to cluster.


<img src="images/Coma-Cluster-Justin-Ng.jpg" width="600px"/>
The Coma Cluster is a gravitationally bound collection of about 10,000 galaxies. The bright galaxies in this picture can be used to infer the distribution of dark matter and total (dark + normal) matter. Image credit: Justin Ng


### The equation of clustering

Cosmological perturbation theory is the tool we use to describe matter clustering based on observed galaxy clustering. The equation relating these quantities may look confusing at first glance, but it's really just a polynomial with more than one independent variable. Most of you have experience working with polynomials that have one independent variable, e.g. the equation of a line $f(x)= mx + b$ and the equation of a quadratic function $f(x) = ax^2 + bx + c$. Others may have worked with equations possessing two independent variables such as the equation of a circle: $x^2+y^2=r^2$. If you're used to seeing these equations, the equation for clustering won't look too bad. 

Without further ado, I present one of the many equations used to describe galaxy clustering:

$\delta_g = b_1\delta_m + \frac{b_2}{2}\left[\delta^2_m-\langle \delta^2_m \rangle\right] +\frac{b_s}{2}\left[s^2-\langle s^2 \rangle\right] + \ldots$

Let's take a minute to talk about what each piece of this means. The Greek letter delta ($\delta$) is an overdensity, a measure of how the density of the region you're looking at compares to the average density of the Universe. The subscript $g$ stands for galaxies and $m$ stands for total matter. The letter $s$ symbolizes the tidal tensor, which describes how the gravitational acceleration differs between two pieces of matter that are close to each other. Angle brackets $\langle \rangle$ mean "take the average value of the quantity between them." The unknown numbers $b_1$, $b_2$, and $b_s$ are called bias coefficients. Many other more complicated models exist for galaxy clustering, but the simplified one above does a good job of explaining the basic idea.
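To make the pieces of the equation concrete, here is a minimal Python sketch that evaluates the first three terms on made-up numbers. Everything in it is an assumption for illustration: the function names `overdensity` and `galaxy_overdensity`, the toy density values, the toy $s^2$ values (in a real analysis $s^2$ would be computed from the matter field, not drawn at random), and the bias coefficients. The overdensity is taken to be the usual $\delta = \rho/\bar{\rho} - 1$.

```python
import numpy as np

def overdensity(density):
    """Return delta = density / mean(density) - 1 for an array of densities."""
    density = np.asarray(density, dtype=float)
    return density / density.mean() - 1.0

def galaxy_overdensity(delta_m, s2, b1, b2, bs):
    """Evaluate the first three terms of the clustering equation.

    delta_m : matter overdensity at each point
    s2      : value of s^2 at each point
    b1, b2, bs : bias coefficients (unknown in reality, chosen by hand here)
    """
    delta_m = np.asarray(delta_m, dtype=float)
    s2 = np.asarray(s2, dtype=float)
    linear = b1 * delta_m
    # Subtracting the averages makes each of these terms average to zero.
    quadratic = 0.5 * b2 * (delta_m**2 - np.mean(delta_m**2))
    tidal = 0.5 * bs * (s2 - np.mean(s2))
    return linear + quadratic + tidal

# Toy inputs (hypothetical, not real survey or simulation data).
rng = np.random.default_rng(0)
density = rng.lognormal(mean=0.0, sigma=0.3, size=1000)
delta_m = overdensity(density)
s2 = 0.1 * rng.random(1000)

delta_g = galaxy_overdensity(delta_m, s2, b1=1.5, b2=0.5, bs=-0.2)

# Both averages are zero up to rounding error.
print(delta_m.mean(), delta_g.mean())
```

Notice that the angle-bracket terms are subtracted so that $\delta_g$ averages to zero, just like $\delta_m$ does. Try changing `b1` to see how strongly the galaxies follow the matter.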

<img src="images/millennium_sim_z0_600px.jpg" width="600px"/>
This is a picture from the Millennium Simulation showing part of the cosmic web of matter in the Universe. Galaxies (yellow) clearly lie on top of dark matter (purple). Galaxies are "positively biased" since they are located in areas where the matter density is higher than average.

### Where did the equation come from?

The only quantity in the galaxy clustering equation we can directly measure from observations is $\delta_g$. How then do we know what terms should appear on the right-hand side of the equation? Fully answering this question requires answering two distinct questions: (a) What are the fields we should care about? (b) In what ways are those fields "allowed" to be included? (By the way, a "field" is a physical quantity that takes on a value at each point in spacetime.) 

(a) A fundamental principle of cosmological perturbation theory is that gravity is the only long-range force we care about. This means that every field we include must be connected to gravity in one way or another. One type of field that makes sense to include is the matter overdensity. We could treat normal matter and dark matter with different fields, but it's more common to put them together. Another type of field we could include is related to changes in the velocity of our matter, since gravitational acceleration causes those changes. We don't include constant velocity fields because of the principle of relativity (the idea at the heart of special relativity): motion at a constant velocity can't be detected, so it can't change how matter clusters. A tidal tensor like $s$ should be included because, by definition, it describes a gravitational effect. Other fields exist and could be included, but we'll stop here.

(b) Once we have the fields we care about we can think about how to use them to make a clustering equation. Symmetries give us the answer to this question. Try this: Take a piece of paper and draw a square. Now rotate that square by 90 degrees. Notice how the square looks the same as it did before you rotated it. We call this rotation one of the symmetries of the square because doing that rotation doesn't change anything. If instead you now rotate the square by some other angle that isn't a multiple of 90 degrees, you'll see that the square looks different. Those rotations are not symmetries of the square.

<img src="images/square-symmetry.png" width="600px"/>
Image credit: David Eck
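If you'd rather not draw, the square experiment can be sketched in Python. This is only an illustration: the helper names `rotate` and `same_shape` are made up here, and the square is represented by its four corners, so "looks the same" is taken to mean "the rotated corners land on the original corners."

```python
import numpy as np

def rotate(points, angle_deg):
    """Rotate 2D points counterclockwise about the origin by angle_deg degrees."""
    theta = np.radians(angle_deg)
    rotation = np.array([[np.cos(theta), -np.sin(theta)],
                         [np.sin(theta),  np.cos(theta)]])
    return points @ rotation.T

def same_shape(points_a, points_b, tol=1e-9):
    """True if every point in points_a coincides with some point in points_b."""
    return all(np.any(np.linalg.norm(points_b - p, axis=1) < tol)
               for p in points_a)

# Corners of a square centered on the origin.
square = np.array([[1.0, 1.0], [-1.0, 1.0], [-1.0, -1.0], [1.0, -1.0]])

for angle in (90, 180, 45):
    print(angle, same_shape(rotate(square, angle), square))
# prints:
# 90 True
# 180 True
# 45 False
```

Rotations by 90 and 180 degrees are symmetries of the square, while a rotation by 45 degrees is not.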

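When we think about how to build the clustering equation, we need to ensure that the symmetries of all our terms match. This is why the equation above is allowed to have $s^2$ but not $s$ by itself. The galaxy overdensity $\delta_g$ is a single number at each point, so it doesn't change if we rotate our point of view. The components of the tensor $s$ do change under such a rotation, but the combination $s^2$ does not. In other words, $s^2$ and $\delta_g$ have the same symmetries while $s$ has a different type of symmetry.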
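Here is a small numerical check of that statement, written as a sketch under a few assumptions: $s$ is represented by a made-up symmetric, traceless 3×3 matrix, $s^2$ is taken to mean the sum of the squares of all its components (a common convention, but not spelled out above), and the helper names `rotation_about_z` and `s_squared` are hypothetical.

```python
import numpy as np

def rotation_about_z(angle_deg):
    """Rotation matrix for a counterclockwise turn about the z-axis."""
    theta = np.radians(angle_deg)
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c,  -s,  0.0],
                     [s,   c,  0.0],
                     [0.0, 0.0, 1.0]])

def s_squared(tensor):
    """Sum of the squares of all components of the tensor (assumed meaning of s^2)."""
    return np.sum(tensor * tensor)

# Toy tidal tensor: symmetric, and its diagonal adds up to zero.
s_tensor = np.array([[1.0,  0.5,  0.0],
                     [0.5, -0.3,  0.2],
                     [0.0,  0.2, -0.7]])

# A tensor transforms as R s R^T when the coordinates are rotated by R.
R = rotation_about_z(30)
s_rotated = R @ s_tensor @ R.T

print(np.allclose(s_rotated, s_tensor))                        # False: components change
print(np.isclose(s_squared(s_rotated), s_squared(s_tensor)))   # True: s^2 does not
```

The individual entries of the matrix depend on how we orient our axes, which is why $s$ alone can't sit next to $\delta_g$. The single number $s^2$ comes out the same for any rotation, so it can.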

# Coding exercise goes here