# Validating the GMM model

In [1]:
import numpy as np
import matplotlib.pyplot as plt
import GMM as _G

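Suppose we have the following scenario:
the heights and weights of a number of boys and girls in some region were recorded (each group can be assumed to follow a normal distribution),
but for some reason the boys' and girls' records were mixed together. We now want to fit the mixed data with a GMM.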
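
As a sketch of the model this setup implies, each record $x$ = (height, weight) can be treated as a draw from a two-component Gaussian mixture. The notation is an assumption chosen here rather than taken from the `GMM` module; $\pi_k$, $\mu_k$ and $\Sigma_k$ presumably correspond to the `gmm.pi`, `gmm.mu` and `gmm.sigma` attributes inspected later in this notebook.

```latex
p(x) = \sum_{k=1}^{2} \pi_k \, \mathcal{N}(x \mid \mu_k, \Sigma_k),
\qquad \pi_k \ge 0, \quad \pi_1 + \pi_2 = 1
```

Fitting the model means estimating the weights, means and covariances from the mixed records alone, without the boy/girl labels.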

### Assume a set of parameters

In [2]:
girl_num=1500
boy_num=1000
girl_h_mean=160
girl_w_mean=50
girl_h_std=4
girl_w_std=5
boy_h_mean=170
boy_w_mean=70
boy_h_std=5
boy_w_std=4

### Generate data from the parameters

In [3]:
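# Draw standard normal samples, one row per boy: column 0 is height, column 1 is weight
boy_=np.random.normal(size=boy_num*2).reshape(boy_num,2)
# Scale and shift each column to the assumed standard deviation and mean
boy_[:,0]=boy_[:,0]*boy_h_std+boy_h_mean
boy_[:,1]=boy_[:,1]*boy_w_std+boy_w_mean
# Same procedure for the girls
girl_=np.random.normal(size=girl_num*2).reshape(girl_num,2)
girl_[:,0]=girl_[:,0]*girl_h_std+girl_h_mean
girl_[:,1]=girl_[:,1]*girl_w_std+girl_w_mean
# Stack both groups into one unlabeled data set (boys first, then girls)
mixture_=np.append(boy_,girl_,axis=0)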
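
Before fitting, it can help to confirm that the generated data match the assumed parameters. The sketch below regenerates the data with a seeded generator and prints the empirical mean and standard deviation of each group. The seed and the use of `np.random.default_rng` are assumptions made here for reproducibility; the original cell is unseeded, so its samples will differ.

```python
import numpy as np

# Assumed seed for reproducibility; the original notebook does not fix one.
rng = np.random.default_rng(0)

girl_num, boy_num = 1500, 1000
girl_mean, girl_std = np.array([160.0, 50.0]), np.array([4.0, 5.0])
boy_mean, boy_std = np.array([170.0, 70.0]), np.array([5.0, 4.0])

# Height and weight are independent: scale and shift standard normal draws.
boy_ = rng.normal(size=(boy_num, 2)) * boy_std + boy_mean
girl_ = rng.normal(size=(girl_num, 2)) * girl_std + girl_mean
mixture_ = np.concatenate([boy_, girl_], axis=0)

print(mixture_.shape)
print("boys :", boy_.mean(axis=0), boy_.std(axis=0))
print("girls:", girl_.mean(axis=0), girl_.std(axis=0))
```

The printed statistics should be close to the assumed means and standard deviations, with small deviations due to sampling noise.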

### Fit the data with the GMM

In [4]:
gmm=_G.GMM(clu=2)

In [5]:
gmm.solve(mixture_)

1. Lower bound: -1728.1122734203414
2. Lower bound: -1692.2000837629346
3. Lower bound: -1685.0542730728716
4. Lower bound: -1679.2889548699338
5. Lower bound: -1673.7040052760997
6. Lower bound: -1668.5891059200435
7. Lower bound: -1664.3187546745055
8. Lower bound: -1660.8917353916413
9. Lower bound: -1657.98463503263
10. Lower bound: -1655.1180531464759
11. Lower bound: -1651.6789871808317
12. Lower bound: -1646.9242122835715
13. Lower bound: -1640.1546137551024
14. Lower bound: -1630.9584867317737
15. Lower bound: -1620.9069427617014
16. Lower bound: -1615.2108411407075
17. Lower bound: -1614.159252662651
18. Lower bound: -1614.0587304408227
19. Lower bound: -1614.049705609962
20. Lower bound: -1614.0488992141952
21. Lower bound: -1614.0488273276055
22. Lower bound: -1614.0488209238601
23. Lower bound: -1614.048820353544


0
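
The log above shows the lower bound rising at every iteration until the change becomes negligible, which is the behavior expected of an EM-style fit. The source of the `GMM` module is not shown here, so the following is an independent minimal sketch of EM for a full-covariance Gaussian mixture, not the module's implementation. The function names, initialization, stopping rule and seed are all assumptions, and the log-likelihood it tracks is not expected to be on the same scale as the `Lower bound` values printed above.

```python
import numpy as np

def log_gaussian(X, mu, cov):
    """Log density of N(mu, cov) evaluated at each row of X."""
    d = X.shape[1]
    diff = X - mu
    # Mahalanobis distance via a linear solve instead of an explicit inverse.
    maha = np.einsum("ij,ij->i", diff, np.linalg.solve(cov, diff.T).T)
    _, logdet = np.linalg.slogdet(cov)
    return -0.5 * (d * np.log(2 * np.pi) + logdet + maha)

def fit_gmm_em(X, k=2, max_iter=500, tol=1e-6, seed=0):
    """Hypothetical helper: plain EM for a k-component Gaussian mixture."""
    rng = np.random.default_rng(seed)
    n, _ = X.shape
    pi = np.full(k, 1.0 / k)                         # equal initial weights
    mu = X[rng.choice(n, size=k, replace=False)]     # random data points as means
    sigma = np.stack([np.cov(X.T) for _ in range(k)])  # overall covariance
    history = []
    for _ in range(max_iter):
        # E-step: responsibilities, computed in log space for stability.
        log_p = np.stack(
            [np.log(pi[j]) + log_gaussian(X, mu[j], sigma[j]) for j in range(k)],
            axis=1,
        )
        log_norm = np.logaddexp.reduce(log_p, axis=1)
        resp = np.exp(log_p - log_norm[:, None])
        history.append(log_norm.sum())               # log-likelihood of current parameters
        if len(history) > 1 and history[-1] - history[-2] < tol:
            break
        # M-step: re-estimate weights, means and covariances.
        nk = resp.sum(axis=0)
        pi = nk / n
        mu = (resp.T @ X) / nk[:, None]
        for j in range(k):
            diff = X - mu[j]
            sigma[j] = (resp[:, j, None] * diff).T @ diff / nk[j]
    return pi, mu, sigma, history

# Usage on synthetic data generated with the parameters assumed in this notebook.
rng = np.random.default_rng(0)
girls = rng.normal(size=(1500, 2)) * [4.0, 5.0] + [160.0, 50.0]
boys = rng.normal(size=(1000, 2)) * [5.0, 4.0] + [170.0, 70.0]
X = np.concatenate([boys, girls], axis=0)

pi, mu, sigma, history = fit_gmm_em(X, k=2)
order = np.argsort(mu[:, 0])  # sort components by mean height
print(pi[order])
print(mu[order])
print(len(history), "iterations")
```

Component order is arbitrary in a mixture model, so the sketch sorts components by mean height before printing. The tracked log-likelihood should never decrease between iterations, which gives a quick check that an EM implementation is behaving correctly.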

### Analysis of the results

In [6]:
gmm.pi

array([0.60015477, 0.39984523])

In [7]:
gmm.mu

array([[159.96991805,  49.24012381],
       [169.91205492,  70.20472254]])

In [8]:
gmm.sigma

array([[[17.20830932,  1.34460804],
        [ 1.34460804, 19.85936414]],

       [[24.07581662, -1.39642132],
        [-1.39642132, 17.32223925]]])

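The estimated parameters show that the GMM fits the sample data fairly accurately. The mixing weights (0.600, 0.400) match the true proportions 1500/2500 and 1000/2500, and every estimated mean is within about 0.8 of its assumed value. The covariance estimates are rougher: the diagonal entries (17.2, 19.9) and (24.1, 17.3) compare with the true variances (16, 25) and (25, 16), while the off-diagonal entries are small, as expected for independently generated height and weight.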
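
To make the comparison concrete, the estimates printed above can be checked against the ground truth implied by the assumed parameters. The sketch below copies the printed values by hand and assumes that the first component corresponds to the girls, since its mean is close to (160, 50). The true covariance matrices are diagonal because height and weight were generated independently.

```python
import numpy as np

# Ground truth implied by the assumed parameters (girls first, then boys).
true_pi = np.array([1500, 1000]) / 2500
true_mu = np.array([[160.0, 50.0], [170.0, 70.0]])
true_sigma = np.array([np.diag([4.0**2, 5.0**2]), np.diag([5.0**2, 4.0**2])])

# Estimates copied from the gmm.pi, gmm.mu and gmm.sigma outputs above.
est_pi = np.array([0.60015477, 0.39984523])
est_mu = np.array([[159.96991805, 49.24012381],
                   [169.91205492, 70.20472254]])
est_sigma = np.array([
    [[17.20830932, 1.34460804], [1.34460804, 19.85936414]],
    [[24.07581662, -1.39642132], [-1.39642132, 17.32223925]],
])

# Largest absolute deviation for each group of parameters.
pi_err = np.abs(est_pi - true_pi).max()
mu_err = np.abs(est_mu - true_mu).max()
sigma_err = np.abs(est_sigma - true_sigma).max()

print("max weight error:    ", pi_err)
print("max mean error:      ", mu_err)
print("max covariance error:", sigma_err)
```

In a different run the component order may be swapped, in which case the estimates need to be reordered before comparing.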