# Linearity of expectations and product rule in calculus

# Problems

- (**Card game**) What is the expected number of cards that need to be turned over in a regular $52$-card deck in order to see the first ace?

- (**Coupon collections**) What is the expected number of boxes that need to be opened to get a complete set of $N$ coupons? What is the expected number of distinct coupons in a collection of $n$ coupons?

- (**Divide a disk with $n$ lines**) Choose six points on a circle independently and uniformly at random. Connecting points 1 and 2, 3 and 4, and 5 and 6
gives us three lines, which divide the disk into parts.
  - What’s the expectation of the number of parts?
  - What if we have $2n$ points?

- (**Packet network**) In packet networks, a packet can be crudely modeled as a string of IID binary digits with $P(0) = P(1) = \frac{1}{2}$. Packets are usually separated from each other by a special bit pattern, $01111110$, called a flag. If this pattern appears within a packet, it could be misinterpreted as a flag indicating the end of the packet. To prevent this problem, an extra binary digit of value $0$ is inserted after each appearance of $011111$ in the original string (the inserted digit can be deleted after reception).
  - Find the expected number of inserted bits in a string of length $n$.

# Mathematics: Product rule and integrals in calculus
Derivatives, integrals, and inner products are the same type of operator from the viewpoint of the Fourier transform.

---
(Theorem) Whether or not $X_{1}, X_{2}, \dots, X_{n}$ are **independent**, we always have
$$E[X_{1}+\dots+X_{n}]=E[X_{1}]+\dots+E[X_{n}].$$
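
The theorem can be checked exactly on a small example. The sketch below is only an illustration, not part of the original notes: it takes two fair dice, sets $X_{1}$ to the first die and $X_{2}$ to the maximum of the two dice (an arbitrary choice of clearly dependent variables), and compares the expectations using exact rational arithmetic. The helper name `expectation` is hypothetical.

```python
from fractions import Fraction
from itertools import product

# Assumed toy model: two fair dice, 36 equally likely outcomes (a, b).
# X1 = a (first die), X2 = max(a, b); X2 clearly depends on X1.
outcomes = list(product(range(1, 7), repeat=2))
p = Fraction(1, len(outcomes))  # probability of each outcome


def expectation(f):
    """Exact expectation of f(a, b) over the 36 outcomes."""
    return sum(p * f(a, b) for a, b in outcomes)


e_x1 = expectation(lambda a, b: a)
e_x2 = expectation(lambda a, b: max(a, b))
e_sum = expectation(lambda a, b: a + max(a, b))
e_prod = expectation(lambda a, b: a * max(a, b))

# Linearity holds even though X1 and X2 are dependent ...
print("E[X1 + X2] == E[X1] + E[X2]:", e_sum == e_x1 + e_x2)
# ... whereas the product formula E[X1 X2] = E[X1] E[X2] does not.
print("E[X1 X2] == E[X1] E[X2]:", e_prod == e_x1 * e_x2)
```

The contrast with the product is the point of the example: additivity of expectation needs no assumption on the joint distribution, while multiplicativity does.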

---
# Divide a disk with $n$ lines
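First, we consider the expected number of parts for two lines. By symmetry, the first endpoint of the first line can be fixed at angle $0$; let its second endpoint be at angle $\theta$. Let $(x,y)$ denote the positions of the two endpoints of the second line. The two lines do not cross, giving $3$ parts, exactly when $x$ and $y$ fall on the same arc, either the one of length $\theta$ or the one of length $2\pi-\theta$. Hence the probability of getting $3$ parts is
$$\frac{1}{(2\pi)^{3}}\int_{0}^{2\pi}\left[\theta^{2}+(2\pi-\theta)^{2}\right]d\theta=\frac{2}{3}.$$
Thus the probability of getting $4$ parts is $\frac{1}{3}$, and the expected number of parts is $3\cdot\frac{2}{3}+4\cdot\frac{1}{3}=\frac{10}{3}$.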
Now the trick is to realize that the total number of parts is determined by the parts cut out by the three pairs of lines. Denote the numbers of parts cut out by the three pairs of lines by $X, Y, Z$ respectively. Each pair gives $3$ parts plus one more if it crosses, while three lines give $4$ parts plus one more for each crossing, so the total number of parts is $X+Y+Z-5$. Lastly, by the **linearity of expectation**, we get $E(X+Y+Z-5)=E(X)+E(Y)+E(Z)-5=\frac{10}{3}\times 3 -5 =5$.

For the general case, we change our point of view slightly. Let $Y_{k}$ be the number of new intersection points of the $k$-th line with the first $k-1$ lines. For example, $Y_{2}$ is a random variable taking value $0$ with probability $\frac{2}{3}$ and value $1$ with probability $\frac{1}{3}$. The idea is that the number of new parts created by adding the $k$-th line equals $Y_{k}+1$. Let $Y_{k,j}$ be the number of intersection points between the $k$-th line and the $j$-th line. Outside a set of measure zero (for instance, where three lines meet at a single point), we have
$$Y_{k}=Y_{k,1}+\dots+Y_{k,k-1}$$
Though the $Y_{k,j}$ are **not** independent of each other, each has $E[Y_{k,j}]=\frac{1}{3}$, so by linearity of expectation we have
$$E[Y_{k}]=E[Y_{k,1}]+\dots+E[Y_{k, k-1}]=\frac{k-1}{3}.$$
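Thus, starting from the $\frac{10}{3}$ expected parts for two lines and adding $E[Y_{k}]+1$ for each further line,
$$E[\text{number of parts with $n$ lines}]=(E[Y_{n}]+1)+\dots+(E[Y_{3}]+1)+\frac{10}{3}=1+n+\frac{n(n-1)}{6}=\frac{(n+2)(n+3)}{6}.$$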
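
The closed form can be compared against a Monte Carlo estimate. The following is a minimal sketch under stated assumptions, not the author's method: endpoints are drawn independently and uniformly on the circle (parametrised by $[0,1)$), the number of parts is taken to be $1 + n + (\text{number of crossings})$, which holds almost surely, and the function names `chords_cross` and `simulate_parts` are hypothetical.

```python
import random


def expected_parts_exact(n):
    """Closed form (n + 2)(n + 3) / 6 from the derivation above."""
    return (n + 2) * (n + 3) / 6


def chords_cross(a, b, c, d):
    """Chords (a, b) and (c, d), endpoints given as positions in [0, 1).

    They cross iff exactly one of c, d lies on the arc between a and b.
    """
    lo, hi = min(a, b), max(a, b)
    return (lo < c < hi) != (lo < d < hi)


def simulate_parts(n, trials, seed=0):
    """Monte Carlo estimate of the expected number of parts for n chords."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        chords = [(rng.random(), rng.random()) for _ in range(n)]
        crossings = sum(
            chords_cross(*chords[i], *chords[j])
            for i in range(n)
            for j in range(i + 1, n)
        )
        # Almost surely: parts = 1 + (number of chords) + (number of crossings).
        total += 1 + n + crossings
    return total / trials


if __name__ == "__main__":
    for n in (2, 3, 5):
        print(n, expected_parts_exact(n), simulate_parts(n, trials=20000))
```

The estimate should land close to $\frac{10}{3}$ for $n=2$ and close to $5$ for $n=3$, up to sampling noise.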

# Packet network
For each position $i\geq6$ in the
original string, define $X_{i}$ as a random variable whose value is $1$ if an insertion occurs
after the $i$-th data bit (that is, if bits $i-5,\dots,i$ equal $011111$) and $0$ otherwise. The total number of insertions is then just the sum of the $X_{i}$ from $i=6$ to
$n$ inclusive. Since $E[X_{i}] = 2^{-6}$ for each $i$, the expected number of insertions is $(n-5)2^{-6}$. Note that
the positions at which the insertions occur are highly dependent, and the problem would be quite difficult if one did not use the linearity of expectation to avoid worrying about the dependence.
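
For small $n$ the formula can be verified by brute force. The sketch below enumerates all $2^{n}$ equally likely bit strings and counts the appearances of $011111$ in the original string, which is where the text says an insertion occurs. The function names are hypothetical, and the enumeration is only practical for small $n$.

```python
from fractions import Fraction
from itertools import product

PATTERN = "011111"  # an extra 0 is inserted after each appearance


def count_insertions(bits):
    """Number of appearances of PATTERN in the original bit string."""
    m = len(PATTERN)
    return sum(bits[i:i + m] == PATTERN for i in range(len(bits) - m + 1))


def expected_insertions(n):
    """Exact expected number of insertions over all 2**n strings."""
    total = sum(
        count_insertions("".join(s)) for s in product("01", repeat=n)
    )
    return Fraction(total, 2 ** n)


if __name__ == "__main__":
    for n in range(6, 13):
        # Compare brute force with the closed form (n - 5) * 2**-6.
        print(n, expected_insertions(n), Fraction(n - 5, 64))
```

Because $011111$ cannot overlap with itself, counting appearances in the original string and counting inserted bits give the same number.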


# Card game (Xinfeng Zhou, Page 95). 

Sometimes, we need to come up with "clever" random variables to count things correctly. 
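
One standard choice of such random variables (stated here as an assumption, not necessarily the route taken in the cited book) is an indicator for each of the $48$ non-ace cards, equal to $1$ if that card appears before all four aces. By symmetry each indicator has expectation $\frac{1}{5}$, so the expected position of the first ace is $1+\frac{48}{5}=\frac{53}{5}$. The sketch below compares this with a direct computation from the distribution of the first ace's position; the function names are hypothetical.

```python
from fractions import Fraction
from math import comb


def expected_first_ace_direct(deck=52, aces=4):
    """Sum k * P(first ace at position k) over all possible k.

    The first ace is at position k exactly when the other aces - 1
    aces all lie among the last deck - k cards.
    """
    total = comb(deck, aces)
    return sum(
        Fraction(k * comb(deck - k, aces - 1), total)
        for k in range(1, deck - aces + 2)
    )


def expected_first_ace_by_linearity(deck=52, aces=4):
    """1 (the ace itself) + expected number of non-aces before all aces.

    Each non-ace precedes all the aces with probability 1 / (aces + 1).
    """
    return 1 + Fraction(deck - aces, aces + 1)


if __name__ == "__main__":
    print(expected_first_ace_direct(), expected_first_ace_by_linearity())
```

The indicators are dependent on each other, but only their individual expectations are needed.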

# Coupon collections (Xinfeng Zhou, Page 97). 
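
A common way to apply linearity here, assuming each box contains one coupon drawn uniformly at random from the $N$ types (an assumption not spelled out in the problem statement): the number of boxes needed splits into $N$ waiting times, where the wait for a new type when $k$ types are already held has expectation $\frac{N}{N-k}$; and the number of distinct coupons among $n$ boxes is a sum of $N$ indicators, each with expectation $1-\left(1-\frac{1}{N}\right)^{n}$. A minimal sketch with hypothetical function names:

```python
from fractions import Fraction


def expected_boxes(N):
    """Expected boxes to complete a set of N coupon types.

    Sum of expected waiting times N / (N - k) for k = 0, ..., N - 1.
    """
    return sum(Fraction(N, N - k) for k in range(N))


def expected_distinct(N, n):
    """Expected number of distinct types among n boxes.

    Each of the N types is missing from all n boxes with
    probability ((N - 1) / N) ** n.
    """
    return N * (1 - Fraction(N - 1, N) ** n)


if __name__ == "__main__":
    print(expected_boxes(3), expected_distinct(3, 2))
```

Both functions return exact fractions, so small cases can be checked by hand.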

# Code: Compute derivatives of Fourier transform of convolutions of distributions with Wolfram Language/Mathematica


# Remarks

Some interesting questions:

- What if the target space of the random variables $\{X_{1}, \dots, X_{n}\}$ is not linear? For example, what if each $X_{i}$ takes values in $S^{1}$?
- What if the target is not even abelian? 
- What if the random variables are given by distributions on a non-trivial bundle, so that the target is very non-linear and topologically non-trivial?