In [1]:
from nbmetalog import nbmetalog as nbm
import sympy


In [2]:
nbm.print_metadata()


context: ci
hostname: e9f8a06047df
interpreter: 3.8.12 (default, Jan 15 2022, 18:39:47)  [GCC 7.5.0]
nbcellexec: 2
nbname: extrema_product_probability_density_function
nbpath: /opt/hereditary-stratigraph-concept/binder/popsize/extrema_product_probability_density_function.ipynb
revision: null
session: a901e11a-a6c1-4bca-aaa6-09fbf8c4df80
timestamp: 2022-12-06T08:38:36Z00:00


IPython==7.16.1
keyname==0.4.1
yaml==5.3.1
nbmetalog==0.2.6
sympy==1.5.1
re==2.2.1
ipython_genutils==0.2.0
logging==0.5.1.2
zmq==22.3.0
json==2.0.9
ipykernel==5.5.3


# Goal

Suppose we have independent random variables $\boldsymbol{X}_i \sim p_i(x_i) = nx_i^{n-1}$ on $[0, 1]$ and 0 otherwise.
How is $\hat{\boldsymbol{X}}_k = \prod_{i=1}^k \boldsymbol{X}_i \sim \hat{p}_k(\hat{x}_k)$ distributed?


# Strategy

Derive the probability density function for $\boldsymbol{X}_1 \times \boldsymbol{X}_2$, then derive a result for the more general case $\hat{\boldsymbol{X}}_k \times \boldsymbol{X}_{k+1}$ to set up an inductive proof of the general form of the probability density function for $\hat{\boldsymbol{X}}_k$.


# Product of Two Random Variables over [0,1]

Adapted from <https://math.stackexchange.com/a/659278>.

Suppose we have two independent random variables $\boldsymbol{Y}_1 \sim p_1(y_1)$ and $\boldsymbol{Y}_2 \sim p_2(y_2)$, each taking values in $[0, 1]$.
Denote their cumulative distribution functions as $P_1(y_1)$ and $P_2(y_2)$, respectively.
Let $\boldsymbol{Z} = \boldsymbol{Y}_1 \boldsymbol{Y}_2$.
Denote the probability density function for $\boldsymbol{Z}$ as $p(z)$ and the cumulative distribution function for $\boldsymbol{Z}$ as $P(z)$.

It follows that,

$\begin{align*}
P(z)
&= \int_0^1 P_2\Big(\frac{z}{y_1}\Big) p_1(y_1) \, \mathrm{d}y_1\\
&= \int_0^1 \int_0^{\min(1,\frac{z}{y_1})} p_2(y_2) \, \mathrm{d}y_2 \, p_1(y_1) \, \mathrm{d}y_1\\
&= \int_0^z \int_0^{\min(1,\frac{z}{y_1})} p_2(y_2) \, \mathrm{d}y_2 \, p_1(y_1) \, \mathrm{d}y_1
  + \int_z^1 \int_0^{\min(1,\frac{z}{y_1})} p_2(y_2) \, \mathrm{d}y_2 \, p_1(y_1) \, \mathrm{d}y_1\\
&= \int_0^z \int_0^{1} p_2(y_2) \, \mathrm{d}y_2 \, p_1(y_1) \, \mathrm{d}y_1
  + \int_z^1 \int_0^{\frac{z}{y_1}} p_2(y_2) \, \mathrm{d}y_2 \, p_1(y_1) \, \mathrm{d}y_1.
\end{align*}$

So,

$\begin{align*}
p(z)
= \frac{\mathrm{d}}{\mathrm{d}z}
\Big(
    \int_0^z \int_0^{1} p_2(y_2) \, \mathrm{d}y_2 \, p_1(y_1) \, \mathrm{d}y_1
      + \int_z^1 \int_0^{\frac{z}{y_1}} p_2(y_2) \, \mathrm{d}y_2 \, p_1(y_1) \, \mathrm{d}y_1
\Big).
\end{align*}$
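
As a sanity check on the split integral above, consider the special case where both factors are uniform on $[0, 1]$, so $p_1 = p_2 = 1$. This case is an illustrative assumption and not part of the derivation. The formula then reduces to $P(z) = z + \int_z^1 \frac{z}{y_1} \, \mathrm{d}y_1 = z - z \log(z)$. The sketch below compares that expression against a seeded Monte Carlo estimate; the helper name `product_cdf_uniform` is hypothetical.

```python
import math
import random


def product_cdf_uniform(z):
    """CDF of the product of two independent Uniform(0, 1) variables.

    Obtained from the split integral with p_1 = p_2 = 1:
    P(z) = z + integral_z^1 (z / y_1) dy_1 = z - z * log(z).
    """
    return z - z * math.log(z)


# seeded generator so the comparison is reproducible
rng = random.Random(0)
samples = [rng.random() * rng.random() for _ in range(100_000)]

for z in (0.1, 0.5, 0.9):
    # fraction of simulated products that fall at or below z
    empirical = sum(s <= z for s in samples) / len(samples)
    print(f"z={z}: empirical={empirical:.4f} formula={product_cdf_uniform(z):.4f}")
```

The two columns should agree to within Monte Carlo noise, which is on the order of $10^{-3}$ for this sample size.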


# Compute Probability Density Functions for $\hat{\boldsymbol{X}}_2, \hat{\boldsymbol{X}}_3, \dots$


We can compute the probability density function for $\hat{\boldsymbol{X}}_2 = \boldsymbol{X}_1 \times \boldsymbol{X}_2$ by combining the probability density functions for $\boldsymbol{X}_1$ and $\boldsymbol{X}_2$ as above.
Let's use computer algebra to do the hard work.


In [3]:
x = sympy.Symbol('x', positive=True, real=True,)

def pdf_of_unitrv_product(
    left_rv_pdf: sympy.Expr,
    right_rv_pdf: sympy.Expr,
) -> sympy.Expr:
    """Calculate the probability density function for a product of two random variables that only take on values between 0 and 1.

    Parameters
    ----------
    left_rv_pdf : sympy.Expr
        Probability density function of left random variable multiplicand, in terms of sympy variable x.
        Left rv multiplicand must only take on values between 0 and 1.
    right_rv_pdf : sympy.Expr
        Probability density function of right random variable multiplicand, in terms of sympy variable x.
        Right rv multiplicand must only take on values between 0 and 1.

    Returns
    -------
    product_pdf : sympy.Expr
        Probability density function of random variable representing product of left and right random variable multiplicands, in terms of sympy variable x.
    """

    x1 = sympy.Symbol('x_1', positive=True, real=True,)
    x2 = sympy.Symbol('x_2', positive=True, real=True,)

    product_cdf = sympy.integrate(
        sympy.Integral(
            left_rv_pdf.subs(x,x1,),
            (x1, 0, 1),
        ) * right_rv_pdf.subs(x,x2,),
        (x2, 0, x),
    # note use of Integral instead of integrate:
    # x appears in the integral bounds,
    # so the integral disappears under differentiation below
    ) + sympy.Integral(
        sympy.integrate(
                left_rv_pdf.subs(x,x1,),
               (x1, 0, x/x2),
        ) * right_rv_pdf.subs(x,x2,),
        (x2, x, 1,)
    ).simplify()

    product_pdf = sympy.diff(
        product_cdf,
        x,
    ).simplify()

    return product_pdf


In [4]:
n = sympy.Symbol('n', positive=True, real=True,)

pdf_X = n * x**(n-1)
pdf_X


n*x**(n - 1)

In [5]:
pdf_hatX_2 = pdf_of_unitrv_product(pdf_X, pdf_X)
pdf_hatX_2


-n**2*x**(n - 1)*log(x)

We can continue multiplying each $\hat{\boldsymbol{X}}_k$ by a further $\boldsymbol{X}_{k+1}$ to generate the probability densities of $\hat{\boldsymbol{X}}_3, \hat{\boldsymbol{X}}_4, \dots$


In [6]:
pdf_hatX_3 = pdf_of_unitrv_product(pdf_hatX_2, pdf_X)
pdf_hatX_3


n**3*x**(n - 1)*log(x)**2/2

In [7]:
pdf_hatX_4 = pdf_of_unitrv_product(pdf_hatX_3, pdf_X)
pdf_hatX_4


-n**4*x**(n - 1)*log(x)**3/6

We now have a hypothesis for the general form of the probability density of $\hat{\boldsymbol{X}}_k$,

$\begin{align*}
\hat{\boldsymbol{X}}_k
\sim
p(\hat{x}_k)
&=
\frac{(-1)^{k+1} n^{k} \hat{x}_k^{n-1} \log^{k-1}(\hat{x}_k)}{(k-1)!}.
\end{align*}$

Let's test our hypothesis.


In [8]:
def generate_hatx_pdf(*, k: int,):
    return (-1)**(k+1) * x**(n-1) * n**(k) * sympy.log(x)**(k-1) / sympy.factorial(k-1)


In [9]:
generate_hatx_pdf(k=1,)


n*x**(n - 1)

In [10]:
generate_hatx_pdf(k=2,)


-n**2*x**(n - 1)*log(x)

In [11]:
generate_hatx_pdf(k=3,)


n**3*x**(n - 1)*log(x)**2/2

In [12]:
generate_hatx_pdf(k=4,)


-n**4*x**(n - 1)*log(x)**3/6
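
Matching the computed expressions does not by itself confirm that the hypothesized form is a valid density. A quick numerical sketch can check that it is nonnegative and integrates to 1 over $[0, 1]$. The choice $n = 3$, the midpoint rule, and the helper names `hatx_pdf` and `integrate_unit_interval` are illustrative assumptions, not part of the notebook.

```python
import math


def hatx_pdf(x, n, k):
    """Hypothesized density of the product of k variables with density n*x**(n-1)."""
    return (
        (-1) ** (k + 1) * n**k * x ** (n - 1)
        * math.log(x) ** (k - 1) / math.factorial(k - 1)
    )


def integrate_unit_interval(f, steps=100_000):
    """Midpoint rule on (0, 1); midpoints avoid evaluating log at 0."""
    h = 1.0 / steps
    return h * sum(f((i + 0.5) * h) for i in range(steps))


totals = {}
for k in (1, 2, 3, 4):
    totals[k] = integrate_unit_interval(lambda x: hatx_pdf(x, n=3.0, k=k))
    print(f"k={k}: integral over [0, 1] = {totals[k]:.6f}")
```

The alternating sign $(-1)^{k+1}$ cancels the sign of $\log^{k-1}(x)$ on $(0, 1)$, which is why the density stays nonnegative for every $k$.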

# Inductive Step

We hypothesize

$\begin{align*}
\hat{\boldsymbol{X}}_k
\sim
p(\hat{x}_k)
&=
\frac{(-1)^{k+1} n^{k} \hat{x}_k^{n-1} \log^{k-1}(\hat{x}_k)}{(k-1)!}.
\end{align*}$

Note the base case

$\begin{align*}
\boldsymbol{X}_1
=
\hat{\boldsymbol{X}}_1
\sim
p(\hat{x}_1)
=
\frac{(-1)^{2} n^{1} \hat{x}_1^{n-1} \log^{0}(\hat{x}_1)}{0!}
\stackrel{\checkmark}{=}
nx_1^{n-1}
=
p(x_1).
\end{align*}$

To prove this general form for all $k \in \mathbb{Z}_+$, we must show that if the hypothesis holds for $k$ then

$\begin{align*}
\hat{\boldsymbol{X}}_{k+1}
=
\hat{\boldsymbol{X}}_k \times \boldsymbol{X}_{k+1}
\sim
p(\hat{x}_{k+1})
&=
\frac{(-1)^{k+2} n^{k+1} \hat{x}_{k+1}^{n-1} \log^{k}(\hat{x}_{k+1})}{k!}.
\end{align*}$

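Recalling our formula for computing the probability density of the product of two unit random variables, with $p_1$ the density of $\boldsymbol{X}_{k+1}$ and $p_2$ the hypothesized density of $\hat{\boldsymbol{X}}_k$,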

$\begin{align*}
p(\hat{x}_{k+1})
=& \frac{\mathrm{d}}{\mathrm{d}\hat{x}_{k+1}}
\Big(
    \int_0^{\hat{x}_{k+1}} \int_0^{1} p_1(x_1) \, \mathrm{d}x_1 \, p_2(x_2) \, \mathrm{d}x_2
      + \int_{\hat{x}_{k+1}}^1 \int_0^{\frac{\hat{x}_{k+1}}{x_2}} p_1(x_1) \, \mathrm{d}x_1 \, p_2(x_2) \, \mathrm{d}x_2
\Big)\\
=& \frac{\mathrm{d}}{\mathrm{d}\hat{x}_{k+1}}
\Big(
    \int_0^{\hat{x}_{k+1}} \int_0^{1} nx_1^{n-1} \, \mathrm{d}x_1 \, \frac{(-1)^{k+1} n^{k} x_2^{n-1} \log^{k-1}(x_2)}{(k-1)!} \, \mathrm{d}x_2\\
& + \int_{\hat{x}_{k+1}}^1 \int_0^{\frac{\hat{x}_{k+1}}{x_2}} nx_1^{n-1} \, \mathrm{d}x_1 \, \frac{(-1)^{k+1} n^{k} x_2^{n-1} \log^{k-1}(x_2)}{(k-1)!} \, \mathrm{d}x_2
\Big)\\
=& \frac{\mathrm{d}}{\mathrm{d}\hat{x}_{k+1}}
\frac{(-1)^{k+1} n^{k}}{(k-1)!} \Big(
    \int_0^{\hat{x}_{k+1}}  (x_1^n|_0^1) \, x_2^{n-1} \log^{k-1}(x_2) \, \mathrm{d}x_2 + \int_{\hat{x}_{k+1}}^1  (x_1^n|_0^{\frac{\hat{x}_{k+1}}{x_2}}) x_2^{n-1} \log^{k-1}(x_2) \, \mathrm{d}x_2
\Big)\\
=& \frac{\mathrm{d}}{\mathrm{d}\hat{x}_{k+1}}
\frac{(-1)^{k+1} n^{k}}{(k-1)!} \Big(
    \int_0^{\hat{x}_{k+1}} x_2^{n-1} \log^{k-1}(x_2) \, \mathrm{d}x_2 + \int_{\hat{x}_{k+1}}^1 \frac{\hat{x}_{k+1}^n}{x_2^n} x_2^{n-1} \log^{k-1}(x_2) \, \mathrm{d}x_2
\Big)\\
    =& \frac{(-1)^{k+1} n^{k}}{(k-1)!} \frac{\mathrm{d}}{\mathrm{d}\hat{x}_{k+1}}
\Big(
    \int_0^{\hat{x}_{k+1}} x_2^{n-1} \log^{k-1}(x_2) \, \mathrm{d}x_2 + \hat{x}_{k+1}^n \int_{\hat{x}_{k+1}}^1 \frac{1}{x_2} \log^{k-1}(x_2) \, \mathrm{d}x_2
\Big).
\end{align*}$

The rightmost integral can be computed directly.


In [13]:
x = sympy.Symbol('x', positive=True, real=True,)
k = sympy.Symbol('k', integer=True, positive=True, real=True,)
s = sympy.Symbol('s', positive=True, real=True,)

density = sympy.log(x)**(k-1)/x
sympy.Integral(
    density,
    (x, s, 1),
)


Integral(log(x)**(k - 1)/x, (x, s, 1))

In [14]:
# substitute k -> k+1 before integrating, then k+1 -> k afterwards,
# because sympy can't handle k-1 in exponent
sympy.integrate(
    density.subs(k, k+1),
    (x, s, 1),
).subs(k+1, k).simplify()


-log(s)**k/k
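
As a cross-check on the computer algebra result, the same value follows by hand from the substitution $u = \log(x)$, $\mathrm{d}u = \frac{1}{x} \, \mathrm{d}x$. The symbol $u$ is introduced here purely for illustration.

$\begin{align*}
\int_s^1 \frac{\log^{k-1}(x)}{x} \, \mathrm{d}x
&= \int_{\log(s)}^{0} u^{k-1} \, \mathrm{d}u\\
&= \frac{u^{k}}{k} \Big|_{\log(s)}^{0}\\
&= -\frac{\log^{k}(s)}{k}.
\end{align*}$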

Continuing our derivation,

$\begin{align*}
p(\hat{x}_{k+1})
    =& \frac{(-1)^{k+1} n^{k}}{(k-1)!} \frac{\mathrm{d}}{\mathrm{d}\hat{x}_{k+1}}
\Big(
    \int_0^{\hat{x}_{k+1}} x_2^{n-1} \log^{k-1}(x_2) \, \mathrm{d}x_2 + \hat{x}_{k+1}^n \int_{\hat{x}_{k+1}}^1 \frac{1}{x_2} \log^{k-1}(x_2) \, \mathrm{d}x_2
\Big)\\
    =& \frac{(-1)^{k+1} n^{k}}{(k-1)!} \frac{\mathrm{d}}{\mathrm{d}\hat{x}_{k+1}}
\Big(
    \int_0^{\hat{x}_{k+1}} x_2^{n-1} \log^{k-1}(x_2) \, \mathrm{d}x_2 - \hat{x}_{k+1}^n \frac{\log^{k}(\hat{x}_{k+1})}{k}
\Big)\\
    =& \frac{(-1)^{k+1} n^{k}}{(k-1)!}
\Big(
    \hat{x}_{k+1}^{n-1} \log^{k-1}(\hat{x}_{k+1}) - \frac{1}{k}\frac{\mathrm{d}}{\mathrm{d}\hat{x}_{k+1}} \hat{x}_{k+1}^n \log^{k}(\hat{x}_{k+1})
\Big)\\
    =& \frac{(-1)^{k+1} n^{k}}{(k-1)!}
\Big(
    \hat{x}_{k+1}^{n-1} \log^{k-1}(\hat{x}_{k+1}) - \frac{1}{k}\Big[ n\hat{x}_{k+1}^{n-1} \log^{k}(\hat{x}_{k+1}) + k\hat{x}_{k+1}^n \log^{k-1}(\hat{x}_{k+1}) \frac{1}{\hat{x}_{k+1}} \Big]
\Big)\\
    =& \frac{(-1)^{k+1} n^{k}\log^{k-1}(\hat{x}_{k+1})}{(k-1)!}
\Big(
    \hat{x}_{k+1}^{n-1} - \frac{1}{k}\Big[ n\hat{x}_{k+1}^{n-1} \log(\hat{x}_{k+1}) + k\hat{x}_{k+1}^n \frac{1}{\hat{x}_{k+1}} \Big]
\Big)\\
    =& \frac{(-1)^{k+1} n^{k}\log^{k-1}(\hat{x}_{k+1})}{(k-1)!}
\Big(
    \hat{x}_{k+1}^{n-1} - \frac{1}{k}\Big[ n\hat{x}_{k+1}^{n-1} \log(\hat{x}_{k+1}) + k\hat{x}_{k+1}^{n-1} \Big]
\Big)\\
    =& \frac{(-1)^{k+1} n^{k}\log^{k-1}(\hat{x}_{k+1})\hat{x}_{k+1}^{n-1}}{(k-1)!}
\Big(
    1 - \frac{1}{k}\Big[ n \log(\hat{x}_{k+1}) + k \Big]
\Big)\\
    =& \frac{(-1)^{k+1} n^{k}\log^{k-1}(\hat{x}_{k+1})\hat{x}_{k+1}^{n-1}}{(k-1)!}
\Big(
    1 - \frac{n \log(\hat{x}_{k+1})}{k} - 1
\Big)\\
    =& \frac{(-1)^{k+1} n^{k}\log^{k-1}(\hat{x}_{k+1})\hat{x}_{k+1}^{n-1}}{(k-1)!}
\Big(
    - \frac{n \log(\hat{x}_{k+1})}{k}
\Big)\\
    =& \frac{(-1)^{k+2} n^{k}\log^{k-1}(\hat{x}_{k+1})\hat{x}_{k+1}^{n-1}}{(k-1)!}
\Big(
    \frac{n \log(\hat{x}_{k+1})}{k}
\Big)\\
    =& \frac{(-1)^{k+2} n^{k}\log^{k-1}(\hat{x}_{k+1})\hat{x}_{k+1}^{n-1}}{k!}
\Big(
    n \log(\hat{x}_{k+1})
\Big)\\
    =& \frac{(-1)^{k+2} n^{k+1}\log^{k-1}(\hat{x}_{k+1})\hat{x}_{k+1}^{n-1}}{k!}
\Big(
    \log(\hat{x}_{k+1})
\Big)\\
    \stackrel{\checkmark}{=}& \frac{(-1)^{k+2} n^{k+1}\log^{k}(\hat{x}_{k+1})\hat{x}_{k+1}^{n-1}}{k!}.
\end{align*}$


# Result

We have shown that for the product of $k$ independent random variables, each $\boldsymbol{X}_i \sim n x_i^{n-1}$ on $[0, 1]$,

$$
\hat{\boldsymbol{X}}_k \sim p(x)
= \frac{
   (-1)^{k+1} x^{n-1} n^{k} \log^{k-1}(x)
}{
(k-1)!
}.
$$
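
The closed form can also be checked empirically. The sketch below draws products of $k$ independent samples, using inverse-transform sampling ($U^{1/n}$ has density $n x^{n-1}$ because its cumulative distribution function is $x^n$), and compares the fraction falling at or below a threshold with a numerical integral of the density above. The parameter values $n = 3$, $k = 4$, the threshold $0.5$, and all helper names are illustrative assumptions.

```python
import math
import random


def hatx_pdf(x, n, k):
    """Closed-form density of the product of k variables with density n*x**(n-1)."""
    return (
        (-1) ** (k + 1) * n**k * x ** (n - 1)
        * math.log(x) ** (k - 1) / math.factorial(k - 1)
    )


def cdf_by_midpoint(x_hi, n, k, steps=20_000):
    """Integrate the density over (0, x_hi) with the midpoint rule."""
    h = x_hi / steps
    return h * sum(hatx_pdf((i + 0.5) * h, n, k) for i in range(steps))


def sample_product(n, k, rng):
    """Draw one product of k independent variables with density n*x**(n-1)."""
    return math.prod(rng.random() ** (1 / n) for _ in range(k))


rng = random.Random(0)  # seeded for reproducibility
n, k, x0 = 3.0, 4, 0.5
draws = [sample_product(n, k, rng) for _ in range(100_000)]

empirical = sum(d <= x0 for d in draws) / len(draws)
predicted = cdf_by_midpoint(x0, n, k)
print(f"empirical={empirical:.4f} predicted={predicted:.4f}")
```

Agreement to within Monte Carlo noise at a handful of thresholds is supporting evidence only; the inductive argument above is what establishes the result.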
