# Computation of Expectations

## Definition: Borel-Measurable Function

A function $g: \mathbb{R} \to \mathbb{R}$ is called **Borel-measurable** if for every Borel set $B \subseteq \mathbb{R}$, the preimage $g^{-1}(B)$ is a Borel set in $\mathbb{R}$. 

In other words, for any $B \in \mathcal{B}(\mathbb{R})$, where $\mathcal{B}(\mathbb{R})$ is the Borel $\sigma$-algebra on $\mathbb{R}$:

$$
g^{-1}(B) = \{x \in \mathbb{R} : g(x) \in B\} \in \mathcal{B}(\mathbb{R}).
$$

### Key Intuition

- A Borel-measurable function pulls Borel sets in the codomain $\mathbb{R}$ back to Borel sets in the domain $\mathbb{R}$: the preimage of every Borel set is again a Borel set.
- Borel-measurability ensures that the function is compatible with the structure of the Borel $\sigma$-algebra, allowing integration and other operations to be well-defined.

### Examples

1. **Continuous Functions**: Every continuous function $g: \mathbb{R} \to \mathbb{R}$ is Borel-measurable. The preimage of an open set under a continuous function is open, hence Borel, and because the open sets generate $\mathcal{B}(\mathbb{R})$, it follows that the preimage of every Borel set is Borel.

2. **Step Functions**: Functions that take only finitely many values and are constant on intervals are Borel-measurable.

3. **Indicator Functions**: The function $\mathbb{I}_A(x)$, which is $1$ if $x \in A$ and $0$ otherwise, is Borel-measurable if $A \in \mathcal{B}(\mathbb{R})$.

### Why It Matters

Borel-measurability is essential in probability and measure theory because:

1. It ensures that the integral of a function can be well-defined with respect to a measure.
2. It guarantees that if $X$ is a random variable, then $g(X)$ is again a random variable, so that $\mathbb{E}[g(X)]$ and probabilities such as $P(g(X) \in B)$ are well-defined.

---

## Definition of Expectation

Let $X$ be a random variable on a probability space $(\Omega, \mathcal{F}, P)$. The **expectation** of $X$ is defined as:

$$
\mathbb{E}[X] = \int_\Omega X(\omega) \, dP(\omega),
$$

which averages the values of $X(\omega)$ over $\Omega$, weighting each outcome by its probability. This is the Lebesgue integral defined in the previous chapter: it partitions the range of $X$ (the $y$-axis) into arbitrarily fine intervals and weights each interval by the probability of the set of outcomes that $X$ maps into it.
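
When $\Omega$ is finite, the integral over $\Omega$ reduces to a probability-weighted sum, which makes the definition easy to check by hand. The following is a minimal sketch under that assumption; the two-coin-toss sample space and the names `omega`, `P`, `X`, and `expectation` are illustrative choices, not part of the text above.

```python
from fractions import Fraction

# Hypothetical sample space: two tosses of a fair coin.
omega = ["HH", "HT", "TH", "TT"]
P = {w: Fraction(1, 4) for w in omega}

def X(w):
    """Number of heads in outcome w."""
    return w.count("H")

def expectation(rv, P):
    """Sum rv(w) * P(w) over all outcomes: the integral over Omega in the finite case."""
    return sum(rv(w) * p for w, p in P.items())

e_x = expectation(X, P)
print(e_x)  # 1
```

Exact fractions are used so that the result is not blurred by floating-point rounding.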


### Key Properties

1. **Linearity**: If $X$ and $Y$ are integrable, expectation satisfies:
   $$
   \mathbb{E}[X + Y] = \mathbb{E}[X] + \mathbb{E}[Y].
   $$

2. **Computation**: For practical computations, we rely on the distribution of $X$ rather than integrating over the abstract space $\Omega$.
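
Linearity does not require $X$ and $Y$ to be independent. A small sketch, assuming a single roll of a fair die as the sample space (a hypothetical choice), checks the identity for two clearly dependent variables:

```python
from fractions import Fraction

# Hypothetical example: one roll of a fair six-sided die.
P = {w: Fraction(1, 6) for w in range(1, 7)}

def expectation(rv):
    """E[rv] as a probability-weighted sum over the finite sample space."""
    return sum(rv(w) * p for w, p in P.items())

X = lambda w: w        # face value
Y = lambda w: w * w    # squared face value, fully determined by X

lhs = expectation(lambda w: X(w) + Y(w))  # E[X + Y]
rhs = expectation(X) + expectation(Y)     # E[X] + E[Y]
print(lhs, rhs)  # 56/3 56/3
```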


### Distribution Measure of a Random Variable

The **distribution measure** $\mu_X$ of a random variable $X$ is defined as:

$$
\mu_X(B) = P(X \in B), \quad \text{for every Borel subset } B \subseteq \mathbb{R}.
$$

This allows us to express expectations in terms of integrals over $\mathbb{R}$ instead of $\Omega$.
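
On a finite sample space the distribution measure can be computed directly from its definition. The sketch below assumes two fair coin tosses with $X$ counting heads, and represents a Borel set $B$ by a finite Python set of values, which is a simplification that only works when $X$ takes finitely many values.

```python
from fractions import Fraction
from itertools import product

# Hypothetical sample space: two tosses of a fair coin, all outcomes equally likely.
omega = ["".join(t) for t in product("HT", repeat=2)]
P = {w: Fraction(1, len(omega)) for w in omega}

def X(w):
    """Number of heads in outcome w."""
    return w.count("H")

def mu_X(B):
    """mu_X(B) = P(X in B), where B is given as a finite set of real values."""
    return sum(p for w, p in P.items() if X(w) in B)

print(mu_X({1}))        # 1/2
print(mu_X({1, 2}))     # 3/4
print(mu_X({0, 1, 2}))  # 1
```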

---

## Theorem 1.5.1: Relating Abstract and Real Integrals

Let $X$ be a random variable on $(\Omega, \mathcal{F}, P)$, and let $g$ be a Borel-measurable function on $\mathbb{R}$. Then:

1. For any such $g$ (the integrand $|g|$ is non-negative, so both sides are well-defined, possibly $+\infty$):
   $$
   \mathbb{E}[|g(X)|] = \int_\mathbb{R} |g(x)| \, d\mu_X(x).
   $$

2. If $\mathbb{E}[|g(X)|] < \infty$, then:
   $$
   \mathbb{E}[g(X)] = \int_\mathbb{R} g(x) \, d\mu_X(x).
   $$
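
The theorem can be verified exactly on a finite sample space, where both integrals are sums. The sketch below is only an illustration: the two-dice setup, the function $g(x) = (x - 7)^2$, and all variable names are assumptions made for this example.

```python
from collections import defaultdict
from fractions import Fraction
from itertools import product

# Hypothetical setup: two fair dice, X = sum of the faces, g(x) = (x - 7)^2.
omega = list(product(range(1, 7), repeat=2))
P = {w: Fraction(1, 36) for w in omega}
X = lambda w: w[0] + w[1]
g = lambda x: (x - 7) ** 2

# Left side: integrate g(X(w)) over the sample space Omega.
lhs = sum(g(X(w)) * p for w, p in P.items())

# Build the distribution measure on single points: mu_X({x}) = P(X = x).
mu_X = defaultdict(Fraction)
for w, p in P.items():
    mu_X[X(w)] += p

# Right side: integrate g(x) over the real line against mu_X.
rhs = sum(g(x) * m for x, m in mu_X.items())

print(lhs, rhs)  # 35/6 35/6
```

The left side sums over 36 outcomes, the right side over only 11 values of $X$, which is the practical gain of working with $\mu_X$.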

### Explanation for Theorem 1.5.1: Relating Abstract and Real Integrals

**Theorem 1.5.1** establishes a relationship between the abstract Lebesgue integral over a probability space $(\Omega, \mathcal{F}, P)$ and the Lebesgue integral over the real numbers $\mathbb{R}$, using the distribution measure $\mu_X$ of a random variable $X$.

#### Key Idea:
- Instead of integrating over the abstract sample space $\Omega$, we can integrate over the real line $\mathbb{R}$ with respect to $\mu_X$.
- The distribution measure $\mu_X$ captures the probabilities of $X$ taking values in subsets of $\mathbb{R}$.

#### Why Is This Useful?
The probability space $\Omega$ is often abstract and complex, making direct computation challenging. By mapping the problem to $\mathbb{R}$, we can use familiar tools, such as densities or sums, to compute expectations.

#### Practical Impact:
1. For any Borel-measurable function $g$ (so that $|g|$ is non-negative):
   $$
   \mathbb{E}[|g(X)|] = \int_\mathbb{R} |g(x)| \, d\mu_X(x).
   $$

2. If $\mathbb{E}[|g(X)|] < \infty$, the expectation of $g(X)$ itself is given by:
   $$
   \mathbb{E}[g(X)] = \int_\mathbb{R} g(x) \, d\mu_X(x).
   $$

#### Example:
If $X$ takes only finitely many values $\{x_k\}$ with probabilities $\{P_k\}$, then:
$$
\mathbb{E}[g(X)] = \sum_{k} g(x_k) P_k.
$$
For a continuous random variable with density $f(x)$:
$$
\mathbb{E}[g(X)] = \int_{-\infty}^\infty g(x) f(x) \, dx.
$$

---

## Special Cases of $\mu_X$

### 1. Discrete Random Variables
If $X$ takes only finitely many values $\{x_0, x_1, \dots, x_n\}$, then $\mu_X$ places a mass $P_k = P(X = x_k)$ at each $x_k$. The expectation becomes:

$$
\mathbb{E}[g(X)] = \sum_{k=0}^n g(x_k) P_k.
$$
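
The discrete formula translates directly into a weighted sum. A minimal sketch, assuming $X$ is the number of heads in three fair coin tosses (the helper name `expect` and the chosen distribution are hypothetical):

```python
from fractions import Fraction

# Hypothetical distribution: number of heads in three fair coin tosses.
values = [0, 1, 2, 3]
probs = [Fraction(1, 8), Fraction(3, 8), Fraction(3, 8), Fraction(1, 8)]

def expect(g, values, probs):
    """E[g(X)] = sum_k g(x_k) * P_k for a finitely supported X."""
    assert sum(probs) == 1, "probabilities must sum to 1"
    return sum(g(x) * p for x, p in zip(values, probs))

print(expect(lambda x: x, values, probs))       # 3/2
print(expect(lambda x: x ** 2, values, probs))  # 3
```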

### 2. Continuous Random Variables with Density
If $X$ has a density $f(x)$, then:
$$
\mu_X(B) = \int_B f(x) \, dx.
$$

Using this density, the expectation is:
$$
\mathbb{E}[g(X)] = \int_{-\infty}^\infty g(x) f(x) \, dx.
$$
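
Both density formulas can be approximated numerically. The sketch below assumes a standard normal density purely as an example and uses a simple midpoint rule (the helper `integrate` is hypothetical, and truncating the real line to $[-10, 10]$ is an approximation that relies on the negligible tails of this particular density).

```python
import math

def f(x):
    """Standard normal density (chosen here only as an illustration)."""
    return math.exp(-x * x / 2) / math.sqrt(2 * math.pi)

def integrate(h, a, b, n=20_000):
    """Midpoint-rule approximation of the integral of h over [a, b]."""
    step = (b - a) / n
    return step * sum(h(a + (i + 0.5) * step) for i in range(n))

# mu_X([-1, 1]) = integral of f over [-1, 1]; the exact value is erf(1 / sqrt(2)).
mass = integrate(f, -1.0, 1.0)

# E[X^2] with the real line truncated to [-10, 10], where the tails are negligible.
second_moment = integrate(lambda x: x * x * f(x), -10.0, 10.0)

print(round(mass, 4), round(second_moment, 4))  # approximately 0.6827 and 1.0
```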

---

## Theorem 1.5.2: Expectation with Densities

If $X$ has a density $f(x)$, then for a Borel-measurable function $g$:

1. For any such $g$ (the integrand $|g| f$ is non-negative, so both sides are well-defined, possibly $+\infty$):
   $$
   \mathbb{E}[|g(X)|] = \int_{-\infty}^\infty |g(x)| f(x) \, dx.
   $$

2. If $\mathbb{E}[|g(X)|] < \infty$, then:
   $$
   \mathbb{E}[g(X)] = \int_{-\infty}^\infty g(x) f(x) \, dx.
   $$
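
The finiteness condition in part 2 is not a formality. As an illustration that goes slightly beyond the examples in these notes, take the standard Cauchy density $f(x) = \frac{1}{\pi(1 + x^2)}$ and $g(x) = x$: the truncated integral $\int_{-T}^{T} |x| f(x) \, dx = \frac{1}{\pi} \ln(1 + T^2)$ grows without bound as $T \to \infty$, so $\mathbb{E}[|X|] = \infty$ and part 2 does not apply. The sketch below (with a hypothetical helper `truncated_abs_moment`) compares a midpoint-rule approximation against this closed form.

```python
import math

def cauchy_density(x):
    """Standard Cauchy density, used here as an example with infinite E|X|."""
    return 1.0 / (math.pi * (1.0 + x * x))

def truncated_abs_moment(T, n=100_000):
    """Midpoint-rule approximation of the integral of |x| f(x) over [-T, T]."""
    step = T / n
    half = step * sum(
        (i + 0.5) * step * cauchy_density((i + 0.5) * step) for i in range(n)
    )
    return 2.0 * half  # the integrand is even, so double the [0, T] part

for T in (10.0, 100.0, 1000.0):
    exact = math.log(1.0 + T * T) / math.pi  # closed form of the truncated integral
    print(T, round(truncated_abs_moment(T), 3), round(exact, 3))
```

Each tenfold increase in $T$ adds roughly the same amount to the integral, which is the logarithmic divergence showing up numerically.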

### Explanation for Theorem 1.5.2: Expectation with Densities

**Theorem 1.5.2** provides a specific computation of expectations when a random variable $X$ has a **probability density function** $f(x)$.

#### Key Idea:
- If $X$ has a density $f(x)$, the distribution measure $\mu_X$ is determined by:
  $$
  \mu_X(B) = \int_B f(x) \, dx,
  $$
  for any Borel subset $B \subseteq \mathbb{R}$.

- This allows us to compute expectations directly using the density.

#### Why Is This Useful?
When $X$ has a density, integrals with respect to $\mu_X$ reduce to ordinary integrals weighted by $f(x)$. Instead of working with abstract measures, we compute expectations directly in terms of $f(x)$.

#### Results:
1. For any Borel-measurable function $g$ (so that $|g|$ is non-negative):
   $$
   \mathbb{E}[|g(X)|] = \int_{-\infty}^\infty |g(x)| f(x) \, dx.
   $$

2. If $\mathbb{E}[|g(X)|] < \infty$:
   $$
   \mathbb{E}[g(X)] = \int_{-\infty}^\infty g(x) f(x) \, dx.
   $$

#### Example:
Suppose $X$ has a density $f(x) = e^{-x} \mathbb{I}_{\{x \geq 0\}}$ (Exponential distribution with rate 1). To compute $\mathbb{E}[X^2]$:
1. Use the formula:
   $$
   \mathbb{E}[X^2] = \int_{-\infty}^\infty x^2 f(x) \, dx.
   $$
2. Substitute $f(x)$ and integrate:
   $$
   \mathbb{E}[X^2] = \int_0^\infty x^2 e^{-x} \, dx = 2.
   $$
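
The value $2$ can be sanity-checked numerically. The sketch below uses a midpoint rule and truncates the upper limit at $50$, which is an approximation; the tail beyond that point is negligible for this density. The helper `integrate` is a hypothetical name introduced for the example.

```python
import math

def f(x):
    """Density of the exponential distribution with rate 1."""
    return math.exp(-x) if x >= 0 else 0.0

def integrate(h, a, b, n=100_000):
    """Midpoint-rule approximation of the integral of h over [a, b]."""
    step = (b - a) / n
    return step * sum(h(a + (i + 0.5) * step) for i in range(n))

# Truncate the upper limit at 50; the remaining tail is far below rounding error.
second_moment = integrate(lambda x: x * x * f(x), 0.0, 50.0)
print(round(second_moment, 6))  # approximately 2.0
```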

#### Step-by-Step Solution for Step 2

1. **Start with the integral**:
   $$
   \mathbb{E}[X^2] = \int_0^\infty x^2 e^{-x} \, dx.
   $$

2. **Use integration by parts**:
   Let:
   - $u = x^2$ (so that $du = 2x \, dx$),
   - $dv = e^{-x} \, dx$ (so that $v = -e^{-x}$).

   Using the integration by parts formula:

   $$
   \int u \, dv = uv - \int v \, du,
   $$

   we get:

   $$
   \int_0^\infty x^2 e^{-x} \, dx = \left[ -x^2 e^{-x} \right]_0^\infty + \int_0^\infty 2x e^{-x} \, dx.
   $$

3. **Evaluate the boundary term**:
   As $x \to \infty$, $-x^2 e^{-x} \to 0$ because $e^{-x}$ decays faster than $x^2$ grows.  
   At $x = 0$, $-x^2 e^{-x} = 0$.  
   So, the boundary term is $0$.

   This leaves:

   $$
   \int_0^\infty x^2 e^{-x} \, dx = \int_0^\infty 2x e^{-x} \, dx.
   $$

4. **Simplify the second integral**:
   Use integration by parts again for $\int_0^\infty 2x e^{-x} \, dx$:
   - Let $u = 2x$ (so that $du = 2 \, dx$),
   - Let $dv = e^{-x} \, dx$ (so that $v = -e^{-x}$).

   Then:

   $$
   \int_0^\infty 2x e^{-x} \, dx = \left[ -2x e^{-x} \right]_0^\infty + \int_0^\infty 2 e^{-x} \, dx.
   $$

   - The boundary term $\left[ -2x e^{-x} \right]_0^\infty$ is $0$ for the same reason as above.
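   - The remaining integral is:

     $$
     \int_0^\infty 2 e^{-x} \, dx = 2 \int_0^\infty e^{-x} \, dx = 2 \left[ -e^{-x} \right]_0^\infty = 2 (0 - (-1)) = 2.
     $$

5. **Combine the results**:
   Substituting back into the identity from step 3:

   $$
   \mathbb{E}[X^2] = \int_0^\infty x^2 e^{-x} \, dx = \int_0^\infty 2x e^{-x} \, dx = 2.
   $$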
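
The two integrations by parts above follow one pattern: the boundary term vanishes and the power of $x$ drops by one, so $I_n = \int_0^\infty x^n e^{-x} \, dx$ satisfies $I_n = n \, I_{n-1}$ with $I_0 = 1$. This generalization is not stated in the notes; the sketch below (with the hypothetical function name `moment`) simply unrolls that recurrence and compares it with the factorial.

```python
import math

def moment(n):
    """E[X^n] for X ~ Exponential(1), via the integration-by-parts recurrence.

    Each integration by parts kills the boundary term and lowers the power by one:
    I_n = n * I_{n-1}, with I_0 = integral of e^{-x} over [0, inf) = 1.
    """
    return 1 if n == 0 else n * moment(n - 1)

print(moment(2))                      # 2
print([moment(n) for n in range(6)])  # [1, 1, 2, 6, 24, 120]
```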

---