During model training, both Precision and F1 stay below 0.1. Is this caused by data processing? #24

Open
VVesley opened this issue Mar 26, 2021 · 8 comments

Comments

@VVesley

VVesley commented Mar 26, 2021

No description provided.

@RandolphVI
Owner

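@VVesley Check the data you are using: 1. whether the annotations are reasonable and correct; 2. whether the word segmentation contains errors; 3. whether the preprocessing is reasonable. The models here are all very simple. If normal data runs through them, Precision and F1 are generally not this low, so the problem is strongly tied to your data.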

@VVesley
Author

VVesley commented Mar 26, 2021

I only have 9 label classes in total, and the data is in Chinese.

@VVesley
Author

VVesley commented Mar 26, 2021

After I changed the number of label classes to 9, I got the following error:
File "../utils/data_helpers.py", line 351, in _create_onehot_labels
label[int(item)] = 1
IndexError: list assignment index out of range
What is going on here?

@RandolphVI
Owner

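@VVesley I suggest working through the code logic yourself before asking. (When you use your own dataset, both data_helpers.py and param_parser.py need to be changed according to the statistics of your data, including the number of labels, your preset maximum sentence length, and so on. For the parameters, please read Usage carefully and in detail.) You can also refer to issues that others have raised.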
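A minimal sketch of how this kind of IndexError can arise, assuming `_create_onehot_labels` builds a zero-filled list whose length is the configured number of labels and then sets the listed positions to 1. The traceback line `label[int(item)] = 1` suggests this, but the rest of the function is not shown in this thread, so the function name `create_onehot_labels` and its arguments below are hypothetical. Under that assumption, any label index outside `0 .. num_labels - 1` (for example, labels numbered 1 to 9 with 9 classes) fails on assignment.

```python
def create_onehot_labels(label_indices, num_labels):
    """Build a multi-hot vector of length num_labels (hypothetical helper).

    Assumes label indices are zero-based, i.e. in range(num_labels).
    """
    label = [0] * num_labels
    for item in label_indices:
        index = int(item)
        # A plain `label[index] = 1` would raise IndexError when index >= num_labels;
        # checking explicitly gives a more informative message.
        if not 0 <= index < num_labels:
            raise ValueError(
                f"label index {index} is outside the valid range 0..{num_labels - 1}"
            )
        label[index] = 1
    return label


# With 9 classes, valid indices are 0 through 8.
print(create_onehot_labels(["0", "8"], num_labels=9))
```

If the labels in the data file start at 1, or the configured number of classes is smaller than the largest label index, remapping the labels to a zero-based range is one way to resolve the mismatch.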

@VVesley
Author

VVesley commented Mar 26, 2021

What does TopK mean? I read the documentation but still do not quite understand it.

@RandolphVI
Owner

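@VVesley As an example: your dataset has 9 labels in total, so the network ends up outputting a 1 * 9 tensor in which each element is the predicted probability of the corresponding label. The usual approaches are thresholding and TopK. With thresholding, you fix a threshold in advance (for example 0.5); a label whose probability exceeds it is marked positive, otherwise negative. This corresponds to the threshold hyperparameter set in param_parser.py (default 0.5). With TopK, you take the top K of the 9 labels and mark them positive regardless of their predicted probabilities. This corresponds to the TopK hyperparameter set in param_parser.py (default 5, i.e. the 5 most likely classes).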
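The two decision rules can be sketched roughly as follows. This is an illustration in plain Python, not the repository's actual implementation; the function names `predict_by_threshold` and `predict_by_topk` and the example scores are made up for this sketch.

```python
def predict_by_threshold(scores, threshold=0.5):
    """Mark a label positive when its score exceeds the threshold."""
    return [1 if s > threshold else 0 for s in scores]


def predict_by_topk(scores, k=5):
    """Mark the k highest-scoring labels positive, whatever their scores are."""
    # Indices sorted by descending score; keep the first k.
    top_indices = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]
    preds = [0] * len(scores)
    for i in top_indices:
        preds[i] = 1
    return preds


# Made-up probabilities for one sample over 9 labels.
scores = [0.9, 0.1, 0.6, 0.05, 0.3, 0.2, 0.45, 0.7, 0.02]

print(predict_by_threshold(scores, threshold=0.5))  # [1, 0, 1, 0, 0, 0, 0, 1, 0]
print(predict_by_topk(scores, k=5))                 # [1, 0, 1, 0, 1, 0, 1, 1, 0]
```

Note that with only 9 labels, a K of 5 marks more than half of the labels positive for every sample, so K may need to be adjusted to the data.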

@VVesley
Author

VVesley commented Mar 27, 2021

Oh, I see now. Thank you very much.

@VVesley
Author

VVesley commented Mar 27, 2021

When I train the CNN and the RNN at the same time, the performance metrics on both sides are identical. Why is that?
