
stacking notes #20

Closed
JiaxiangBU opened this issue Jan 7, 2020 · 7 comments
Labels: enhancement, good first issue

Comments


JiaxiangBU commented Jan 7, 2020

10.2 stacking
https://jiaxiangbu.github.io/learn_kaggle/learning_notes.html

During training, the only requirement on the first-layer models is that no single sample is used to train both layers.
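A minimal sketch of this out-of-fold rule, assuming scikit-learn and synthetic data (the models and names here are illustrative, not from the notes): each sample's first-layer prediction comes from a fold that did not train on it, so the second layer never sees a leaked value.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import KFold

X, y = make_classification(n_samples=1000, random_state=0)

base_model = RandomForestClassifier(n_estimators=100, random_state=0)
oof = np.zeros(len(y))  # out-of-fold predictions, one per sample

kf = KFold(n_splits=5, shuffle=True, random_state=0)
for train_idx, valid_idx in kf.split(X):
    base_model.fit(X[train_idx], y[train_idx])
    # Each sample's first-layer prediction comes from a model that
    # never saw that sample, so the second layer trains on clean data.
    oof[valid_idx] = base_model.predict_proba(X[valid_idx])[:, 1]

meta_model = LogisticRegression()
meta_model.fit(oof.reshape(-1, 1), y)  # second layer sees only OOF values
```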

JiaxiangBU added the enhancement and good first issue labels on Jan 7, 2020

JiaxiangBU commented Jan 7, 2020

1. How many models can the first layer have?

No restriction.

2. Can the second-layer model itself be an ensemble?

Preferably not. See the Huatai Securities research report; I will send it to you later.

3. How do we judge whether the stacked model overfits?

The usual way: compare the evaluation metrics on the training set and the test set.

4. How should the first-layer models be tuned?

With at least k-fold cross-validation. (A sketch pulling these four points together follows below.)
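The sketch, again assuming scikit-learn and synthetic data (the specific models, parameter grid, and metric are illustrative choices, not from the thread):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import (GridSearchCV, KFold, cross_val_predict,
                                     train_test_split)

X, y = make_classification(n_samples=2000, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

kf = KFold(n_splits=5, shuffle=True, random_state=0)

# Point 4: tune each first-layer model with (at least) k-fold CV.
rf = GridSearchCV(RandomForestClassifier(random_state=0),
                  {"max_depth": [3, 6]}, cv=kf).fit(X_tr, y_tr).best_estimator_
gb = GradientBoostingClassifier(random_state=0).fit(X_tr, y_tr)

# Point 1: any number of first-layer models; stack their OOF predictions.
oof = np.column_stack([
    cross_val_predict(m, X_tr, y_tr, cv=kf, method="predict_proba")[:, 1]
    for m in (rf, gb)
])

# Point 2: keep the second layer simple (logistic regression, not an ensemble).
meta = LogisticRegression().fit(oof, y_tr)

# Point 3: judge overfitting by comparing training and test metrics.
test_feats = np.column_stack([m.predict_proba(X_te)[:, 1] for m in (rf, gb)])
print("train AUC:", roc_auc_score(y_tr, meta.predict_proba(oof)[:, 1]))
print("test  AUC:", roc_auc_score(y_te, meta.predict_proba(test_feats)[:, 1]))
```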


JiaxiangBU commented Jan 7, 2020

Differing model principles and sample separation are both workable ways to get diversity; a more direct check, when stacking, is to show that the individual models' predictions have low correlation.
See the Huatai Securities research report on this; we used the technique at Phoenix Finance before.

林晓明, 陈烨, and 李子钰. 2018. 人工智能选股之stacking集成学习. 华泰证券股份有限公司.

For details, see https://jiaxiangbu.github.io/phoenix-finance/output/fcontest_output30.html

https://github.com/JiaxiangBU/tutoring/issues/54
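A minimal sketch of the low-correlation check, with synthetic stand-ins for the out-of-fold predictions of three first-layer models (everything here is illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
signal = rng.random(500)
oof_a = signal + 0.1 * rng.standard_normal(500)  # e.g. random forest OOF scores
oof_b = signal + 0.1 * rng.standard_normal(500)  # e.g. gradient boosting OOF scores
oof_c = rng.random(500)                          # a genuinely different model

corr = np.corrcoef(np.column_stack([oof_a, oof_b, oof_c]), rowvar=False)
print(corr.round(2))
# High off-diagonal values (oof_a vs oof_b) mean those models add little
# diversity; stacking benefits most when the correlations are low.
```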

JiaxiangBU commented

https://www.kaggle.com/lijiaxiang/stacking — I have open-sourced this.

JiaxiangBU commented

https://jiaxiangbu.github.io/learn_fe/target_encoding_learning_notes.html#%E6%80%BB%E7%BB%93
The idea behind stacking is very similar to target encoding. I give an example there: doing target encoding incorrectly can make a pure-noise variable look significant.
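A sketch of the leakage being warned about, assuming pandas/scikit-learn and a purely random category column (all names illustrative): naive target encoding includes each row's own label in its category mean, so even noise looks predictive, while out-of-fold encoding does not.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import KFold

rng = np.random.default_rng(0)
df = pd.DataFrame({
    "cat": rng.integers(0, 500, size=2000),  # random category, no real signal
    "y": rng.integers(0, 2, size=2000),
})

# Naive encoding: the category mean includes the row's own target.
naive = df.groupby("cat")["y"].transform("mean")
print("naive corr with y:", np.corrcoef(naive, df["y"])[0, 1])  # clearly > 0

# Out-of-fold encoding: compute each row's encoding without its own fold.
oof = pd.Series(index=df.index, dtype=float)
for tr, va in KFold(5, shuffle=True, random_state=0).split(df):
    means = df.iloc[tr].groupby("cat")["y"].mean()
    oof.iloc[va] = df["cat"].iloc[va].map(means).to_numpy()
oof = oof.fillna(df["y"].mean())  # categories unseen in the training folds
print("OOF corr with y:", np.corrcoef(oof, df["y"])[0, 1])      # near 0
```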


JiaxiangBU commented

@Ricardo627721141 What I told you last time about the stacking procedure was slightly off. The correct understanding is: "do not reuse the same training set" means do not reuse it across the two layers; within the same layer it can be reused.
https://www.kaggle.com/lijiaxiang/stacking is a demo.
