In [1]:
from IPython.display import Image

## basics

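- If $\phi$ is a convex function and $X$ is a random variable, then:

    $$
    \phi(E(X))\leq E(\phi(X))
    $$
    
    - If $\phi$ is concave, the inequality reverses: $\phi(E(X))\geq E(\phi(X))$
    - For convex $\phi$, the function of the expectation is at most the expectation of the function;
        - Reading the expectation as a mean: the function of the mean <= the mean of the function;
    - Typical cases (here "convex" means convex downward, i.e. bowl-shaped)
        - When $\phi$ is linear, equality holds;
        - $\log(\cdot)$ is concave
- A practice problem:
    - https://www.probabilitycourse.com/chapter6/6_2_5_jensen's_inequality.php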

In [6]:
Image(url='https://francisbach.com/wp-content/uploads/2023/03/jensen-3-1024x456.png', width=500)

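- Informally, a function is convex when every chord lies on or above the graph (the arc) between its endpoints;
    - A convexity test: for a twice-differentiable function, the second derivative is >= 0 everywhere (e.g. $\phi(x)=x^2$)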
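The chord condition can be checked numerically. The sketch below is only an illustration: the helper name `is_chord_above`, the grid size, and the chosen endpoints are assumptions, and a grid check is evidence of convexity on one interval, not a proof.

```python
import numpy as np

def is_chord_above(phi, a, b, n_points=101):
    """Check phi(t*a + (1-t)*b) <= t*phi(a) + (1-t)*phi(b) on a grid of t in [0, 1]."""
    t = np.linspace(0.0, 1.0, n_points)
    curve = phi(t * a + (1 - t) * b)        # the graph (arc) between a and b
    chord = t * phi(a) + (1 - t) * phi(b)   # the straight line joining the endpoints
    # small tolerance so floating-point noise at the endpoints does not fail the check
    return bool(np.all(curve <= chord + 1e-12))

square_ok = is_chord_above(lambda x: x ** 2, -1.0, 3.0)  # x^2 is convex
log_ok = is_chord_above(np.log, 1.0, 6.0)                # log is concave, so the chord lies below
print(square_ok, log_ok)
```

For a concave function such as `np.log`, the same check fails because the graph sits above the chord.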

In [4]:
Image(url='https://i.stack.imgur.com/qQNZu.gif', width=500)

## examples

In [10]:
import numpy as np

# transform function
def payoff(x):
    return x*x

In [12]:
# each possible roll of the dice
outcomes = np.asarray([1, 2, 3, 4, 5, 6])

In [13]:
np.mean(payoff(outcomes))

15.166666666666666

In [14]:
payoff(np.mean(outcomes))

12.25
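For $\phi(x)=x^2$ the Jensen gap has a closed form: $E(X^2)-(E(X))^2=\text{Var}(X)$. A quick check on the fair die (the variable names here are illustrative, not from the cells above):

```python
import numpy as np

faces = np.arange(1, 7)                  # the six equally likely outcomes of a fair die
mean_of_squares = np.mean(faces ** 2)    # E[X^2]
square_of_mean = np.mean(faces) ** 2     # (E[X])^2
gap = mean_of_squares - square_of_mean   # the Jensen gap for phi(x) = x^2
variance = np.var(faces)                 # population variance (ddof=0)
print(gap, variance)
```

Both values equal $35/12$, which is the difference between the two results above.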

In [17]:
# transform function
def payoff(x):
    return x*x

# repeated trials
n_trials = 10
n_samples = 50
for i in range(n_trials):
    # roll the dice [1,6] many times (e.g. 50)
    outcomes = np.random.randint(1, 7, n_samples)
    # calculate the payoff for each outcome
    payoffs = [payoff(x) for x in outcomes]
    # calculate the mean of the payoffs
    v1 = np.mean(payoffs)
    # calculate the payoff of the mean outcome
    v2 = payoff(np.mean(outcomes))
    # confirm the expectation
    assert v1 >= v2
    # summarize the result
    print('->%d: %.2f >= %.2f' % (i, v1, v2))

->0: 14.04 >= 11.02
->1: 16.46 >= 13.69
->2: 14.34 >= 11.42
->3: 15.94 >= 12.82
->4: 13.52 >= 11.02
->5: 16.12 >= 12.96
->6: 14.76 >= 11.56
->7: 13.70 >= 10.89
->8: 17.54 >= 14.59
->9: 18.42 >= 15.52


In [19]:
# transform function
def payoff(x):
    return np.log(x)

# repeated trials
n_trials = 10
n_samples = 50
for i in range(n_trials):
    # roll the dice [1,6] many times (e.g. 50)
    outcomes = np.random.randint(1, 7, n_samples)
    # calculate the payoff for each outcome
    payoffs = [payoff(x) for x in outcomes]
    # calculate the mean of the payoffs
    v1 = np.mean(payoffs)
    # calculate the payoff of the mean outcome
    v2 = payoff(np.mean(outcomes))
    # confirm the expectation
    assert v1 <= v2
    # summarize the result
    print('->%d: %.2f <= %.2f' % (i, v1, v2))

->0: 1.16 <= 1.32
->1: 1.04 <= 1.25
->2: 1.13 <= 1.30
->3: 1.20 <= 1.33
->4: 1.10 <= 1.25
->5: 1.05 <= 1.23
->6: 1.12 <= 1.26
->7: 1.12 <= 1.29
->8: 1.06 <= 1.23
->9: 1.28 <= 1.39


## ELBO

- $\log(\cdot)$ is a concave function, so $\log \mathbb{E}[\cdot] \geq \mathbb{E}[\log(\cdot)]$:

$$
\begin{aligned}
\log p(\mathbf{x})
&=\log \int p(\mathbf{x}, \mathbf{z}) d\mathbf{z} \\
&=\log \int p(\mathbf{x}, \mathbf{z}) \frac{q(\mathbf{z})}{q(\mathbf{z})} d\mathbf{z} \\
&=\log \mathbb{E}_{q} \bigg[ \frac{p(\mathbf{x}, \mathbf{z})}{q(\mathbf{z})} \bigg] \\
&\geq \underbrace{\mathbb{E}_{q} \bigg[ \log \frac{p(\mathbf{x}, \mathbf{z})}{q(\mathbf{z})} \bigg]}_{\text{ELBO}}\\
\end{aligned}
$$

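- In fact, since $\log p(\mathbf{x})$ does not depend on $\mathbf{z}$, the gap can be written out exactly: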

$$
\begin{aligned}
\log p(\mathbf{x})
&=\mathbb{E}_{q}[\log p(\mathbf{x})]\\
&=\mathbb{E}_{q} \bigg[ \log \frac{p(\mathbf{x}, \mathbf{z})}{p(\mathbf{z} \vert \mathbf{x})} \bigg] \\
&=\mathbb{E}_{q} \bigg[ \log \frac{p(\mathbf{x}, \mathbf{z})}{p(\mathbf{z} \vert \mathbf{x})} \frac{q(\mathbf{z})}{q(\mathbf{z})} \bigg] \\
&=\mathbb{E}_{q} \bigg[ \log \frac{p(\mathbf{x}, \mathbf{z})}{q(\mathbf{z})} \frac{q(\mathbf{z})}{p(\mathbf{z} \vert \mathbf{x})} \bigg] \\
&=\underbrace{\mathbb{E}_{q} \bigg[ \log \frac{p(\mathbf{x}, \mathbf{z})}{q(\mathbf{z})} \bigg]}_{\text{ELBO}} + \underbrace{\mathbb{E}_{q} \bigg[ \log \frac{q(\mathbf{z})}{p(\mathbf{z} \vert \mathbf{x})} \bigg]}_{\text{KL}(q(\mathbf{z}) \,\Vert\, p(\mathbf{z} \vert \mathbf{x})) \,\geq\, 0}\\
\end{aligned}
$$
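The decomposition $\log p(\mathbf{x}) = \text{ELBO} + \text{KL}$ can be verified on a toy discrete model. Everything numeric below is an assumption made for illustration: a latent $\mathbf{z}$ with three states, made-up joint probabilities $p(\mathbf{x}, \mathbf{z})$ for one fixed observation $\mathbf{x}$, and an arbitrary $q(\mathbf{z})$.

```python
import numpy as np

# Hypothetical toy model: z takes 3 values, x is one fixed observation.
p_joint = np.array([0.10, 0.25, 0.05])   # assumed values of p(x, z) for z = 0, 1, 2
q = np.array([0.5, 0.3, 0.2])            # an arbitrary variational distribution q(z)

p_x = p_joint.sum()                      # marginal p(x) = sum_z p(x, z)
posterior = p_joint / p_x                # p(z | x)
log_evidence = np.log(p_x)

elbo = np.sum(q * np.log(p_joint / q))   # E_q[log p(x, z) / q(z)]
kl = np.sum(q * np.log(q / posterior))   # KL(q(z) || p(z | x))

# The bound is tight when q equals the true posterior.
elbo_tight = np.sum(posterior * np.log(p_joint / posterior))

print(log_evidence, elbo + kl, elbo_tight)
```

Because the KL term is non-negative, the ELBO never exceeds $\log p(\mathbf{x})$, and it reaches it only when $q(\mathbf{z}) = p(\mathbf{z} \vert \mathbf{x})$.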

In [7]:
Image(url='https://jejjohnson.github.io/research_notebook/_images/elbo_inequality.png')

## investment

> Don’t put all your eggs in one basket. (diversification)



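- There are two investment options (two assets), A and B.
    - Asset A is a low-risk, low-return asset, like a government bond.
    - Asset B is a high-risk, high-return asset, like a tech startup’s stock.
- Portfolio: a combination of assets
    - $r_a,r_b$ are the returns of A and B, respectively
    - The weight $x\in[0,1]$ is the fraction invested in A, and $1-x$ is the fraction invested in B;
    - $r_p=xr_a+(1-x)r_b$
    - Its variance: $\text{Var}(r_p)=x^2\text{Var}(r_a)+(1-x)^2\text{Var}(r_b)+2x(1-x)\text{Cov}(r_a,r_b)$
        - Its second derivative with respect to $x$ is $2\sigma_a^2-4\sigma_{ab}+2\sigma_b^2=2\text{Var}(r_a-r_b)\geq 0$ (this also follows from $|\sigma_{ab}|\leq \sigma_a\sigma_b$)
        - So $\text{Var}(r_p)$ (the portfolio variance) is a convex function of the weight $x$
    - Because it is convex, comparing it with the chord between $x=0$ and $x=1$ gives $\text{Var}(xr_a+(1-x)r_b)\leq x\text{Var}(r_a)+(1-x)\text{Var}(r_b)$
        - The risk of a diversified portfolio is less than or equal to the weighted average of the risks of its individual assets.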
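A small numeric check of the portfolio inequality. The variances and covariance are made-up numbers (chosen so that $|\sigma_{ab}|\leq \sigma_a\sigma_b$), and `portfolio_variance` is a helper introduced only for this sketch.

```python
import numpy as np

# Assumed inputs: variance of the bond-like asset A, the stock-like asset B, and their covariance.
var_a, var_b = 0.01, 0.09
cov_ab = 0.006                      # satisfies |cov_ab| <= sqrt(var_a * var_b) = 0.03

def portfolio_variance(x):
    """Variance of r_p = x * r_a + (1 - x) * r_b for weight x in [0, 1]."""
    return x ** 2 * var_a + (1 - x) ** 2 * var_b + 2 * x * (1 - x) * cov_ab

weights = np.linspace(0.0, 1.0, 11)
diversified = portfolio_variance(weights)              # Var(x r_a + (1 - x) r_b)
weighted_avg = weights * var_a + (1 - weights) * var_b # x Var(r_a) + (1 - x) Var(r_b)
gap = weighted_avg - diversified                       # risk reduction from diversifying

for x, d, w in zip(weights, diversified, weighted_avg):
    print('x=%.1f: %.4f <= %.4f' % (x, d, w))
```

The gap works out to $x(1-x)\text{Var}(r_a-r_b)$, so it vanishes at $x=0$ and $x=1$ and is largest at an even split.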

- https://medium.com/@hosamedwee/jensens-inequality-2-unlocking-optimization-and-decision-making-power-9109914db6f5
- $f(X)$: If you invest in a single risky asset, your risk is simply $f(X)$, the risk of that single asset.
- $f(E(X))$: If you diversify across multiple assets, your risk is $f(E(X))$, the risk of the diversified portfolio.
- $f(E(X))\leq E(f(X))$: By Jensen’s inequality (for convex $f$), the risk of the diversified portfolio is at most $E(f(X))$, the average risk of the individual assets.
    - Therefore, the risk from diversifying, $f(E(X))$, is no greater than the average risk of holding a single asset, $E(f(X))$.