forked from matloff/probstatbook
-
Notifications
You must be signed in to change notification settings - Fork 0
/
Transform.tex
298 lines (232 loc) · 8.89 KB
/
Transform.tex
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
\chapter{Transform Methods}
\label{chap:xform}
We often use the idea of \textbf{transform} functions. For example, you
may have seen \textbf{Laplace transforms} in a math or engineering
course. The functions we will see here differ from this by just
a change of variable.
Though in the form used here they involve only univariate distributions,
their applications are often multivariate, as will be the case here.
\section{Generating Functions}
Let's start with the \textbf{generating function}.
For any nonnegative-integer valued
random variable V, its generating function is defined by
\begin{equation}
\label{genftn}
g_V(s)=E(s^{V}) = \sum_{i=0}^\infty s^i p_V(i), ~ 0 \leq s \leq 1
\end{equation}
For instance, suppose N has a geometric distribution with parameter p,
so that $p_N(i) = (1-p) p^{i-1}$, i = 1,2,... Then
\begin{equation}
g_N(s) = \sum_{i=1}^\infty s^i \cdot (1-p) p^{i-1}
= \frac{1-p}{p} \sum_{i=1}^\infty s^i \cdot p^i
= \frac{1-p}{p} \frac{ps}{1-ps}
= \frac{(1-p)s}{1-ps}
\end{equation}
Why restrict s to the interval [0,1]? The answer is that for $s > 1$
the series in (\ref{genftn}) may not converge. for $0 \leq s \leq 1$,
the series does converge. To see this, note that if s = 1, we just get
the sum of all probabilities, which is 1.0. If a nonnegative s is less
than 1, then $s^i$ will also be less than 1, so we still have
convergence.
One use of the generating function is, as its name implies, to generate
the probabilities of values for the random variable in question. In
other words, if you have the generating function but not the
probabilities, you can obtain the probabilities from the function.
Here's why: For clarify, write (\ref{genftn}) as
\begin{equation}
g_V(s) = P(V=0) + s P(V=1) + s^2 P(V=2) + ...
\end{equation}
From this we see that
\begin{equation}
g_V(0) = P(V = 0)
\end{equation}
So, we can obtain P(V = 0) from the generating function. Now
differentiating (\ref{genftn}) with respect to s, we have
\begin{eqnarray}
g'_V(s) &=&
\frac{d}{ds} \left [ P(V=0) + s P(V=1) + s^2 P(V=2) + ... \right ]
\nonumber \\
&=&
P(V=1) + 2s P(V=2) + ...
\end{eqnarray}
So, we can obtain P(V = 2) from $g'_V(0)$, and in a similar manner can
calculate the other probabilities from the higher derivatives.
\section{Moment Generating Functions}
The generating function is handy, but it is limited to discrete random
variables. More generally, we can use the {\bf moment generating
function}, defined for any random variable X as
\begin{equation}
\label{momgen}
m_X(t) = E[e^{tX}]
\end{equation}
for any t for which the expected value exists.
That last restriction is anathema to mathematicians, so they use the
characteristic function,
\begin{equation}
\phi_X(t) = E[e^{itX}]
\end{equation}
which exists for any t. However, it makes use of pesky complex numbers,
so we'll stay clear of it here.
Differentiating (\ref{momgen}) with respect to t, we have
\begin{equation}
m_X'(t) = E[X e^{tX}]
\end{equation}
We see then that
\begin{equation}
m_X'(0) = EX
\end{equation}
So, if we just know the moment-generating function of X, we can obtain
EX from it. Also,
\begin{equation}
m_X''(t) = E(X^2 e^{tX})
\end{equation}
so
\begin{equation}
m_X''(0) = E(X^2)
\end{equation}
In this manner, we can for various k obtain $E(X^k)$, the {\bf k$th$
moment} of X, hence the name.
\section{Transforms of Sums of Independent Random Variables}
\label{tranfssums}
Suppose X and Y are independent and their moment generating functions
are defined. Let Z = X+Y. then
\begin{equation}
m_Z(t) = E[e^{t(X+Y)}] = E[e^{tX} e^{tY}] = E(e^{tX}) \cdot E(e^{tY}) =
m_X(t) m_Y(t)
\end{equation}
In other words, the mgf of the sum is the product of the mgfs! This is
true for other transforms, by the same reasoning.
Similarly, it's clear that the mgf of a sum of three independent
variables is again the product of their mgfs, and so on.
\section{Example: Network Packets}
As an example, suppose say the number of packets N received on a network
link in a given time period has a Poisson distribution with mean $\mu$,
i.e.
\begin{equation}
P(N = k) = \frac{e^{-\mu} \mu^k}{k!}, k = 0,1,2,3,...
\end{equation}
\subsection{Poisson Generating Function}
Let's first find its generating function.
\begin{equation}
g_N(t) = \sum_{k=0}^{\infty} t^k \frac{e^{-\mu} \mu^k}{k!}
= e^{-\mu} \sum_{k=0}^{\infty} \frac{(\mu t)^k}{k!}
= e^{-\mu+\mu t}
\end{equation}
where we made use of the Taylor series from calculus,
\begin{equation}
e^u = \sum_{k=0}^{\infty} u^k/k!
\end{equation}
\subsection{Sums of Independent Poisson Random Variables Are Poisson
Distributed}
\label{sumpoi}
Supposed packets come in to a network node from two independent links,
with counts $N_1$ and $N_2$, Poisson distributed with means $\mu_1$ and
$\mu_2$. Let's find the distribution of $N = N_1+N_2$, using a
transform approach.
From Section \ref{tranfssums}:
\begin{equation}
\label{sumpoieqn}
g_N(t) = g_{N_1}(t) g_{N_2}(t) = e^{-\nu + \nu t}
\end{equation}
where $\nu = \mu_1 + \mu_2$.
But the last expression in (\ref{sumpoieqn}) is the generating function for
a Poisson distribution too! And since there is a one-to-one
correspondence between distributions and transforms, we can conclude
that N has a Poisson distribution with parameter $\nu$. We of
course knew that N would have mean $\nu$ but did not know that N would
have a Poisson distribution.
So: A sum of two independent Poisson variables itself has a Poisson
distribution. By induction, this is also true for sums of k independent
Poisson variables.
% \section{Random Number of Bits in Packets on One Link}
% \label{randomn}
%
% Consider just one of the two links now, and for convenience denote the
% number of packets on the link by N, and its mean as $\mu$. Continue to
% assume that N has a Poisson distribution.
%
% Let B denote the number of bits in a packet, with $B_1,...,B_N$ denoting
% the bit counts in the N packets. We assume the $B_i$ are independent
% and identically distributed. The total number of bits received during
% that time period is
%
% \begin{equation}
% T=B_1+...+B_N
% \end{equation}
%
% Suppose the generating function of B is known to be h(s). Then what is
% the generating function of T?
%
% \begin{eqnarray}
% g_T(s) & = & E(s^{T})\\
% & = & E[E(s^{T}|N)]\\
% & = & E[E(s^{B_1+...+B_N}|N)]\\
% & = & E[E(s^{B_1}|N)...E(s^{B_N}|N)]\\
% & = & E[h(s)^{N}]\\
% & = & g_N[h(s)]\\
% & = & e^{-\mu+\mu h(s)}
% \end{eqnarray}
%
% Here is how these steps were made:
%
% \begin{itemize}
%
% \item From the first line to the second, we used
% the Theorem of Total Expectation.
%
% \item From the second to the third, we just used the definition of T.
%
% \item From the third to the fourth lines, we have used algebra plus the
% fact that the expected value of a product of independent random
% variables is the product of their individual expected
% values.
%
% \item From the fourth to the fifth, we used the definition of h(s).
%
% \item From the fifth to the sixth, we used the definition of $g_N$.
%
% \item From the sixth to the last we used the formula for the generating
% function for a Poisson distribution with mean $\mu$.
%
% \end{itemize}
%
% We can then get all the information about T we need from this formula, such
% as its mean, variance, probabilities and so on, as seen previously.
\section{Other Uses of Transforms}
Transform techniques are used heavily in queuing analysis, including for
models of computer networks. The techniques are also used extensively
in modeling of hardware and software reliability.
Transforms also play key roles in much of theoretical probability, the
Central Limit Theorems\footnote{The plural is used here because there
are many different versions, which for instance relax the condition that
the summands be independent and identically distributed.} being a good
example. Here's an outline of the proof of the basic CLT, assuming the
notation of Section \ref{formalclt}:
First rewrite Z as
\begin{equation}
Z = \sum_{i=1}^n \frac{X_i - m}{v \sqrt{n}}
\end{equation}
Then work with the characteristic function of Z:
\begin{eqnarray}
c_Z(t) &=& E(e^{itZ}) ~~~~ \textrm{(def.)} \\
&=& \Pi_{i=1}^n E[e^{it(X_i-m)/(v\sqrt{n})}] ~~~~ \textrm{(indep.)} \\
&=& \Pi_{i=1}^n E[e^{it(X_1-m)/(v\sqrt{n})}] ~~~~ \textrm{(ident. distr.)} \\
&=& [g(\frac{it}{\sqrt{n}})]^n
\label{gpower}
\end{eqnarray}
where g(s) is the characteristic function of $(X_1-m)/v$, i.e.
\begin{equation}
g(s) = E[e^{is \cdot \frac{X_1 - m}{v}}]
\end{equation}
Now expand (\ref{gpower}) in a Taylor series around 0, and use the fact
that $g'(0)$ is the expected value of $(X_1-m)/v$, which is 0:
\begin{eqnarray}
[g(\frac{t}{\sqrt{n}})]^n
&=& \left [1 - \frac{t^2}{2n} + o(\frac{t^2}{n}) \right ]^n \\
&\rightarrow& e^{-t^2/2} \textrm{ as } n \rightarrow \infty
\label{ourcltgoal}
\end{eqnarray}
where we've also used the famous fact that $(1 - s/n)^n$ converges to
$e^{-s}$ as $n \rightarrow \infty$.
But (\ref{ourcltgoal}) is the density of N(0,1), so we have proved the
Central Limit Theorem.