just update ch10. bjk expression #59

Merged
merged 1 commit into from
Jan 8, 2019
Empty file modified .github/ISSUE_TEMPLATE/custom.md
100644 → 100755
Empty file.
Empty file modified APP/README.md
100644 → 100755
Empty file.
Empty file modified CH01/README.md
100644 → 100755
Empty file.
Empty file modified CH02/Input/data_2-1.txt
100644 → 100755
Empty file.
Empty file modified CH02/Input/latice.xml
100644 → 100755
Empty file.
Empty file modified CH02/Input/logic_data_1.txt
100644 → 100755
Empty file.
Empty file modified CH02/Input/logic_data_2.txt
100644 → 100755
Empty file.
Empty file modified CH02/README.md
100644 → 100755
Empty file.
Empty file modified CH02/assets/XOR_3D.gif
100644 → 100755
Empty file modified CH03/Input/data_3-1.txt
100644 → 100755
Empty file.
Empty file modified CH03/Input/data_3-2.txt
100644 → 100755
Empty file.
Empty file modified CH03/README.md
100644 → 100755
Empty file.
Empty file modified CH03/assets/fig3_2.png
100644 → 100755
Empty file modified CH03/knn.py
100644 → 100755
Empty file.
Empty file modified CH03/unit_test.py
100644 → 100755
Empty file.
Empty file modified CH04/Input/data_4-1.txt
100644 → 100755
Empty file.
Empty file modified CH04/README.md
100644 → 100755
Empty file.
Empty file modified CH04/assets/nb_pgm.png
100644 → 100755
Empty file modified CH04/nb.py
100644 → 100755
Empty file.
Empty file modified CH04/unit_test.py
100644 → 100755
Empty file.
Empty file modified CH05/README.md
100644 → 100755
Empty file.
Empty file modified CH05/assets/Economics_Gini_coefficient2.svg
100644 → 100755
Empty file modified CH05/assets/熵与概率的关系.png
100644 → 100755
Empty file modified CH05/dt.py
100644 → 100755
Empty file.
Empty file modified CH05/unit_test.py
100644 → 100755
Empty file.
Empty file modified CH06/Input/data.txt
100644 → 100755
Empty file.
Empty file modified CH06/README.md
100644 → 100755
Empty file.
Empty file modified CH06/demo.py
100644 → 100755
Empty file.
Empty file modified CH06/maxent.py
100644 → 100755
Empty file.
Empty file modified CH07/README.md
100644 → 100755
Empty file.
Empty file modified CH07/svm.py
100644 → 100755
Empty file.
Empty file modified CH07/unit_test.py
100644 → 100755
Empty file.
36 changes: 12 additions & 24 deletions CH09/README.md
100644 → 100755
@@ -145,12 +145,12 @@ while flag_a or flag_b:

The three-coin model can be written as
$$
\begin{equation}
\begin{aligned}
P(y|\theta)&=\sum_z P(y,z|\theta) \\
&=\sum_z P(z|\theta)P(y|z,\theta) \\
&=\pi p^y (1-p)^{1-y} + (1-\pi)q^y(1-q)^{1-y}
\end{aligned}
\end{equation}
$$
In the above

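@@ -162,7 +162,6 @@ $$
Denote the observed data by $Y=(Y_1, Y_2, Y_3, \dots, Y_n)^T$ and the unobserved data by $Z=(Z_1,Z_2, Z_3,\dots, Z_n)^T$; the likelihood function of the observed data is then

> I actually think these should be lowercase: $y=(y_1,y_2,\dots,y_n), z=(z_1, z_2, \dots,z_n)$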

$$
P(Y|\theta) = \sum\limits_{Z}P(Z|\theta)P(Y|Z,\theta)
$$
@@ -178,7 +177,6 @@ $$
$$
\hat \theta = \arg\max\limits_{\theta}\log P(Y|\theta)
$$

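The ground-truth answer to this problem is in fact also unknown, because the hypothesis space of parameters that could generate such observations is too large.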
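To make the "hypothesis space is too large" remark concrete, here is a minimal sketch of the observed-data log-likelihood $\log P(Y|\theta)$ for the three-coin model. The function names and the sample data are my own assumptions for illustration, and the observations are assumed independent. Two different parameter settings give exactly the same likelihood, so the data alone cannot tell them apart.

```python
import math


def three_coin_prob(y, pi, p, q):
    """P(y | theta) for a single observation y in {0, 1}."""
    first = pi * p**y * (1 - p) ** (1 - y)         # component weighted by pi
    second = (1 - pi) * q**y * (1 - q) ** (1 - y)  # component weighted by 1 - pi
    return first + second


def log_likelihood(ys, pi, p, q):
    """log P(Y | theta), assuming independent observations."""
    return sum(math.log(three_coin_prob(y, pi, p, q)) for y in ys)


# Illustrative observations (an assumption, not taken from the text above).
ys = [1, 1, 0, 1, 0, 0, 1, 0, 1, 1]

# Two different thetas with pi*p + (1-pi)*q = 0.6 in both cases.
ll_a = log_likelihood(ys, 0.5, 0.6, 0.6)
ll_b = log_likelihood(ys, 0.4, 0.6, 0.6)
print(ll_a, ll_b)
```

Any $\theta$ with the same value of $\pi p + (1-\pi)q$ yields the same likelihood here, which is one way to see why the maximizer is not unique.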

#### EM algorithm for the three-coin model
@@ -195,7 +193,6 @@ The EM algorithm first chooses initial parameter values, denoted $\theta^{(0)}=(\pi^{(0)},p^{(0)}, q^{(0)})$,
$$
\mu_j^{(i+1)} = \frac{\pi^{(i)}(p^{(i)})^{y_j}(1-p^{(i)})^{1-y_j}}{\pi^{(i)}(p^{(i)})^{y_j}(1-p^{(i)})^{1-y_j} + (1-\pi^{(i)})(q^{(i)})^{y_j}(1-q^{(i)})^{1-y_j}}
$$

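Because these are coins, there are only two possible outcomes, 0 and 1, which is why the expression above holds.

This expression can also be split into the following cases
@@ -206,7 +203,6 @@ $$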
\frac{\pi^{(i)}(1-p^{(i)})}{\pi^{(i)}(1-p^{(i)}) + (1-\pi^{(i)})(1-q^{(i)})}&, y_j = 0\\
\end{cases}
$$

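So what does this step (computing $\mu_j$) do, and what role do the samples play?

Using the assumed parameters, this step computes the responsibility ($\mu_j$) of the hypothesized model for each sample. Note that because the samples ($y_j$) are binary, $\{y_j, 1-y_j\}$ forms a one-hot encoding that indicates which hypothesis the sample belongs to.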
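The E-step described above can be sketched as follows. This is only an illustration: `e_step` is a name I made up, and the exponents `y` and `1 - y` play the role of the one-hot selector mentioned above.

```python
def e_step(ys, pi, p, q):
    """Return the responsibility mu_j for every observation y_j in {0, 1}."""
    mus = []
    for y in ys:
        # y and (1 - y) act as a one-hot selector: only one factor of each
        # product is "switched on" for a given observation.
        first = pi * p**y * (1 - p) ** (1 - y)
        second = (1 - pi) * q**y * (1 - q) ** (1 - y)
        mus.append(first / (first + second))
    return mus


# Hypothetical parameter values, chosen only for illustration.
mus = e_step([1, 0], pi=0.4, p=0.6, q=0.7)
print(mus)
```

Because $y_j$ only takes two values, there are only two distinct responsibilities, one for $y_j=1$ and one for $y_j=0$.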
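@@ -216,7 +212,6 @@ $$
What is this step the expectation of? The book says: **the probability that the observation comes from coin $B$. In the binomial case, responsibility and probability are the same concept.** This remark helps in understanding the M-step formulas that follow.

##### 3. M-step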

$$
\begin{align}
\pi^{(i+1)} &= \frac{1}{n}\sum_{j=1}^{n}\mu_j^{(i+1)}\\
@@ -226,7 +221,6 @@ p^{(i+1)} &= \frac{\sum_{j=1}^{n}\mu_j^{(i+1)}y_j}{\sum_{j=1}^{n}\mu_j^{(i+1)}}\
q^{(i+1)} &= \frac{\sum_{j=1}^{n}(1-\mu_j^{(i+1)})y_j}{\sum_{j=1}^{n}(1-\mu_j^{(i+1)})}
\end{align}
$$

Above, the formulas marked in red should be read in light of the sentence `the probability that the observation comes from coin B`.
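Putting the E-step and the three M-step updates together gives a small runnable sketch. The function name, the iteration count, and the data are assumptions for illustration. The code also assumes the responsibilities never sum to exactly $0$ or $n$, since otherwise a denominator would vanish.

```python
def em_three_coin(ys, pi, p, q, n_iter=20):
    """Run n_iter EM iterations for the three-coin model; return (pi, p, q)."""
    n = len(ys)
    for _ in range(n_iter):
        # E-step: responsibility mu_j of the pi-weighted component.
        mus = []
        for y in ys:
            first = pi * p**y * (1 - p) ** (1 - y)
            second = (1 - pi) * q**y * (1 - q) ** (1 - y)
            mus.append(first / (first + second))
        # M-step: the update formulas for pi, p and q.
        total = sum(mus)
        pi = total / n
        p = sum(m * y for m, y in zip(mus, ys)) / total
        q = sum((1 - m) * y for m, y in zip(mus, ys)) / (n - total)
    return pi, p, q


# Illustrative observations (an assumption, not taken from the text above).
ys = [1, 1, 0, 1, 0, 0, 1, 0, 1, 1]
fit_a = em_three_coin(ys, 0.5, 0.5, 0.5)
fit_b = em_three_coin(ys, 0.4, 0.6, 0.7)
print(fit_a)
print(fit_b)
```

The two starting points end at different estimates, so the result depends on where the iteration starts.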

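##### Effect of initial values
@@ -344,11 +338,9 @@ $$
Parameter estimation for Gaussian mixture models is an important application of the EM algorithm; unsupervised learning of hidden Markov models is another.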

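1. The book describes the one-dimensional Gaussian mixture model. The $d$-dimensional form is given below[^2]; it is known as the multivariate normal distribution, also called the multivariate Gaussian distribution

$$
\phi(y|\theta_k)=\frac{1}{\sqrt{(2\pi)^d|\Sigma|}}\exp\left(-\frac{(y-\mu_k)^T\Sigma^{-1}(y-\mu_k)}{2}\right)
$$

where the covariance matrix $\Sigma\in \R^{d\times d}$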
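As a rough check of the $d$-dimensional density, here is a standard-library sketch. It evaluates the formula through a Cholesky factorization $\Sigma = LL^T$, which is my own choice rather than anything prescribed by the book, and assumes $\Sigma$ is symmetric positive definite. `cholesky` and `mvn_pdf` are hypothetical helper names.

```python
import math


def cholesky(S):
    """Lower-triangular L with L L^T = S (S symmetric positive definite)."""
    d = len(S)
    L = [[0.0] * d for _ in range(d)]
    for i in range(d):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(S[i][i] - s)
            else:
                L[i][j] = (S[i][j] - s) / L[j][j]
    return L


def mvn_pdf(y, mu, S):
    """Density of the d-dimensional Gaussian N(mu, S) at the point y."""
    d = len(y)
    L = cholesky(S)
    # Solve L z = (y - mu) by forward substitution; then z.z equals
    # (y - mu)^T S^{-1} (y - mu).
    z = []
    for i in range(d):
        r = y[i] - mu[i] - sum(L[i][k] * z[k] for k in range(i))
        z.append(r / L[i][i])
    quad = sum(v * v for v in z)
    log_det = 2.0 * sum(math.log(L[i][i]) for i in range(d))  # log |S|
    return math.exp(-0.5 * (d * math.log(2 * math.pi) + log_det + quad))


print(mvn_pdf([0.0, 0.0], [0.0, 0.0], [[1.0, 0.0], [0.0, 1.0]]))
```

For $d=1$ this reduces to the one-dimensional Gaussian density used in the book.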

#### Graphical model of the GMM
@@ -374,7 +366,6 @@ P(y|\theta)=&\prod_{j=1}^NP(y_j|\theta)\\
=&\prod_{j=1}^N\sum_{k=1}^K\alpha_k\phi(y_j|\theta_k)
\end{align}
$$

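Use the EM algorithm to estimate the GMM parameters $\theta$

##### 1. Identify the latent variables and initial values
@@ -385,16 +376,14 @@ $$
1. Generate the observation $y_j$ according to the probability distribution $\phi(y|\theta_k)$ of the $k$-th component model
1. Which component model the observation $y_j$ comes from is **unknown**; for $k=1,2,\dots,K$ this is represented by the **latent variable $\gamma_{jk}$**
**Note that the dimension of $\gamma_{jk}$ here is $(N\times K)$**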
$$
\gamma_{jk}=
\begin{cases}
1, &\text{the $j$-th observation comes from the $k$-th component model}\\
0, &\text{otherwise}
\end{cases}\\
j=1,2,\dots,N; k=1,2,\dots,K; \gamma_{jk}\in\{0,1\}
$$
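Note that the description above makes several assumptions:

1. Latent variables and observations correspond one-to-one: each observation has its own latent variable, and $\gamma_{jk}$ is in one-hot form.
@@ -403,21 +392,20 @@ $$
- The complete data are $(y_j,\gamma_{j1},\gamma_{j2},\dots,\gamma_{jK}), j=1,2,\dots,N$

- Complete-data likelihood function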
$$
\begin{aligned}
P(y,\gamma|\theta)=&\prod_{j=1}^NP(y_j,\gamma_{j1},\gamma_{j2},\dots,\gamma_{jK}|\theta)\\
=&\prod_{k=1}^K\prod_{j=1}^N\left[\alpha_k\phi(y_j|\theta_k)\right]^{\gamma_{jk}}\\
=&\prod_{k=1}^K\alpha_k^{n_k}\prod_{j=1}^N\left[\phi(y_j|\theta_k)\right]^{\gamma_{jk}}\\
=&\prod_{k=1}^K\alpha_k^{n_k}\prod_{j=1}^N\left[\frac{1}{\sqrt{2\pi}\sigma_k}\exp\left(-\frac{(y_j-\mu_k)^2}{2\sigma_k^2}\right)\right]^{\gamma_{jk}}\\
\end{aligned}
$$
where $n_k=\sum_{j=1}^N\gamma_{jk}, \sum_{k=1}^Kn_k=N$

- Complete-data log-likelihood function
$$
\log P(y,\gamma|\theta)=\sum_{k=1}^K\left\{n_k\log \alpha_k+\sum_{j=1}^N\gamma_{jk}\left[\log \left(\frac{1}{\sqrt{2\pi}}\right)-\log \sigma_k -\frac{1}{2\sigma_k^2}(y_j-\mu_k)^2\right]\right\}
$$
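The complete-data log-likelihood can be evaluated directly once a one-hot $\gamma$ is given. The sketch below is only meant to make the roles of $n_k$ and $\gamma_{jk}$ concrete. The function name and the toy numbers are assumptions, `gamma` is stored as an $N\times K$ list of rows, and every $\alpha_k$ and $\sigma_k$ is assumed strictly positive.

```python
import math


def complete_log_likelihood(ys, gamma, alphas, mus, sigmas):
    """log P(y, gamma | theta) for a one-dimensional GMM.

    gamma[j][k] is 1 if observation j came from component k, else 0.
    """
    total = 0.0
    for k in range(len(alphas)):
        n_k = sum(row[k] for row in gamma)  # number of points assigned to k
        total += n_k * math.log(alphas[k])
        for y, row in zip(ys, gamma):
            total += row[k] * (
                math.log(1 / math.sqrt(2 * math.pi))
                - math.log(sigmas[k])
                - (y - mus[k]) ** 2 / (2 * sigmas[k] ** 2)
            )
    return total


# Toy example: two points, each sitting exactly on its own component mean.
ys = [0.0, 2.0]
gamma = [[1, 0], [0, 1]]
value = complete_log_likelihood(ys, gamma, [0.5, 0.5], [0.0, 2.0], [1.0, 1.0])
print(value)
```

In EM the true $\gamma_{jk}$ is unknown, so the E-step replaces it with its expectation; this function only covers the fully observed case.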
##### 2. E-step: determine the Q function

Express the $Q$ function in terms of the parameters
Empty file modified CH09/assets/gmm_graph_model.png
100644 → 100755
Empty file modified CH09/bmm.py
100644 → 100755
Empty file.
Empty file modified CH09/gmm.py
100644 → 100755
Empty file.
Empty file modified CH09/model.py
100644 → 100755
Empty file.
Empty file modified CH09/unit_test.py
100644 → 100755
Empty file.
Empty file modified CH10/Input/data_10-1.txt
100644 → 100755
Empty file.
Empty file modified CH10/Input/data_10-2.txt
100644 → 100755
Empty file.
19 changes: 18 additions & 1 deletion CH10/README.md
100644 → 100755
@@ -247,6 +247,7 @@ $$
>
>
>
>

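- **Note** that in the backward algorithm $\beta$ gets smaller and smaller as the recursion proceeds; with many time steps, $P(O|\lambda)$ is likely to vanish (underflow numerically)
- Another point to note is $\color{red}o_{t+1}\beta_{t+1}$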
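One common way around the shrinking $\beta$ is to run the recursion in log space. The sketch below is my own illustration rather than the book's procedure: it assumes all entries of $A$, $B$ and $\pi$ are strictly positive (so every `math.log` is defined), observations are encoded as integer indices, and the model numbers are illustrative.

```python
import math


def logsumexp(xs):
    """Numerically stable log(sum(exp(x) for x in xs))."""
    m = max(xs)
    return m + math.log(sum(math.exp(x - m) for x in xs))


def log_backward(A, B, pi, obs):
    """Return log P(O | lambda) computed with the backward recursion."""
    N = len(A)
    log_beta = [0.0] * N  # log beta_T(i) = log 1 = 0
    for t in range(len(obs) - 2, -1, -1):
        o_next = obs[t + 1]  # note the o_{t+1} paired with beta_{t+1}
        log_beta = [
            logsumexp([
                math.log(A[i][j]) + math.log(B[j][o_next]) + log_beta[j]
                for j in range(N)
            ])
            for i in range(N)
        ]
    return logsumexp([
        math.log(pi[i]) + math.log(B[i][obs[0]]) + log_beta[i]
        for i in range(N)
    ])


# Illustrative model: 3 states, 2 observation symbols.
A = [[0.5, 0.2, 0.3], [0.3, 0.5, 0.2], [0.2, 0.3, 0.5]]
B = [[0.5, 0.5], [0.4, 0.6], [0.7, 0.3]]
pi = [0.2, 0.4, 0.4]

short_logp = log_backward(A, B, pi, [0, 1, 0])
long_logp = log_backward(A, B, pi, [0] * 2000)  # far too small for a plain float
print(math.exp(short_logp), long_logp)
```

For the long sequence the probability itself cannot be represented as a float, but its logarithm stays finite, which is usually all that training and comparison need.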
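@@ -316,7 +317,23 @@ $$

The dimension of $\gamma$ should be $N\times T$; summing with $\sum\limits_{t=1}^T$ reduces it to $N$, but $B$ actually has dimension $N\times M$, hence this expression. In my humble opinion it could be written as $b_{jk}$ here. The corresponding expression in the book is (10.3) on $P_{172}$, which also gives the precise definition of $b_j(k)$.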

In an actual implementation, one can convert the observation sequence $O$ to one-hot form: $O_{one\_hot}$ has dimension $M\times T$ and $B$ has dimension $N\times M$, so the product $B\cdot O_{one\_hot}$ gives the emission probability matrix for the observation sequence, with dimension $N\times T$.
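The one-hot trick can be sketched in a few lines. `one_hot` and `matmul` are hypothetical helpers written with plain lists, and the numbers are illustrative; a real implementation would more likely use an array library.

```python
def one_hot(obs, M):
    """M x T matrix whose column t has a single 1 in row obs[t]."""
    return [[1 if o == k else 0 for o in obs] for k in range(M)]


def matmul(X, Y):
    """Plain-list matrix product X @ Y."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]


# Illustrative emission matrix B (N=3 states, M=2 symbols) and sequence O (T=3).
B = [[0.5, 0.5], [0.4, 0.6], [0.7, 0.3]]
obs = [0, 1, 0]

emission = matmul(B, one_hot(obs, M=2))  # N x T, entry (j, t) is b_j(o_t)
for row in emission:
    print(row)
```

Each column of the result is just the column of $B$ selected by $o_t$, so the same matrix could also be built by indexing instead of multiplying.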

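As a supplement, $o_t=v_k$ can also be expressed as $\sigma_{o_t,v_k}$, the Kronecker delta (conventionally denoted $\delta$).

The Kronecker delta is a function of two arguments, usually two integers: it outputs 1 if they are equal and 0 otherwise.

It is essentially an indicator function whose condition is restricted to equality.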
$$
\sigma_{ij}=
\begin{cases}
1, & i = j\\
0, & i\ne j
\end{cases}
\\
b_j(k)=\frac{\sum\limits_{t=1,o_t=v_k}^{T}\gamma_t(j)}{\sum\limits_{t=1}^T\gamma_t(j)}=\frac{\sum\limits_{t=1}^{T}\sigma_{o_t,v_k}\gamma_t(j)}{\sum\limits_{t=1}^T\gamma_t(j)}
$$
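The Kronecker-delta form of the $b_j(k)$ update translates almost literally into code. In this sketch `kronecker` and `update_B` are names I made up, `gamma` is stored as an $N\times T$ list of rows, observations are symbol indices, and the numbers are illustrative.

```python
def kronecker(i, j):
    """Kronecker delta: 1 if the two arguments are equal, else 0."""
    return 1 if i == j else 0


def update_B(gamma, obs, M):
    """Re-estimate the N x M emission matrix from gamma (N x T) and obs (length T)."""
    B_new = []
    for row in gamma:  # one row of gamma per state j
        denom = sum(row)
        B_new.append([
            sum(kronecker(o, k) * g for g, o in zip(row, obs)) / denom
            for k in range(M)
        ])
    return B_new


# Illustrative responsibilities for N=2 states over T=3 time steps.
gamma = [[0.2, 0.3, 0.5], [0.8, 0.7, 0.5]]
obs = [0, 1, 0]
B_new = update_B(gamma, obs, M=2)
print(B_new)
```

The numerator is the same as multiplying $\gamma$ by the transposed one-hot observation matrix, which is where the $N\times M$ shape comes from.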


#### Understanding the $E$-step and $M$-step

Empty file modified CH10/assets/graph_model.png
100644 → 100755
Empty file modified CH10/hmm.py
100644 → 100755
Empty file.
Empty file modified CH10/unit_test.py
100644 → 100755
Empty file.
Empty file modified CH11/Input/template
100644 → 100755
Empty file.
Empty file modified CH11/README.md
100644 → 100755
Empty file.
Empty file modified CH11/assets/1537524145846.png
100644 → 100755
Empty file modified CH11/crf.py
100644 → 100755
Empty file.
Empty file modified CH11/unit_test.py
100644 → 100755
Empty file.
Empty file modified CH12/README.md
100644 → 100755
Empty file.
Empty file modified CH12/assets/loss.png
100644 → 100755
Empty file modified README.md
100644 → 100755
Empty file.
Empty file modified Refs/README.md
100644 → 100755
Empty file.
Empty file modified assets/content_distribution.png
100644 → 100755
Empty file modified assets/data_algo_map.png
100644 → 100755
2 changes: 1 addition & 1 deletion errata.md
100644 → 100755
@@ -54,5 +54,5 @@

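1. $P_{140}$: the example is attributed to http://www.csie.edu.tw, which should probably be http://www.csie.ntu.edu.tw. However, I could not find the corresponding example page there either.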

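1. $P_{148}$: in the boosting tree section, the final boosting tree is written as $f_M(x)$, whereas the earlier introduction of the additive model gives $f(x)$. They mean the same thing, but the notation differs between the two places. This one probably doesn't really count as an erratum.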

Empty file modified glossary_index.md
100644 → 100755
Empty file.
Empty file modified math_markdown.md
100644 → 100755
Empty file.
Empty file modified math_markdown.pdf
100644 → 100755
Empty file.
Empty file modified notebook/README.md
100644 → 100755
Empty file.
Empty file modified ref_downloader.sh
100644 → 100755
Empty file.
Empty file modified requirements.txt
100644 → 100755
Empty file.