Question: large gap between Label and Entity results on my test set #16
Comments
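I segmented the text with jieba in precise mode. I noticed that the processed resume dataset you released has a 'word' field, so I would like to ask which segmentation method you used.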
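Although we did segment the resume data, the code does not actually use the segmentation; extraction is still essentially done at the character level.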
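On a flat dataset, predict returns nested results. For example, the entity m17文化广场 is predicted as m17, 文化广场, and m17文化广场, even though the training data contains no nested entity annotations. If I want to use the model purely as a flat NER model, what should I do?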
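You could consider the approach used in Named Entity Recognition as Dependency Parsing: for entities that come out nested, compare the probabilities of their head-tail relations and keep the one with the highest probability.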
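The suggestion above can be sketched roughly as follows. This is a minimal illustration and not code from this repository: it assumes you have already collected, for each predicted entity, a single confidence score (for example the head-tail relation probability), and the `Span` layout, the `keep_flat` helper, the `LOC` label, and the scores are all hypothetical.

```python
from typing import List, Tuple

# Hypothetical span layout: (start, end_inclusive, label, score).
# The score stands in for the head-tail relation probability of the entity.
Span = Tuple[int, int, str, float]


def overlaps(a: Span, b: Span) -> bool:
    """Return True if the two spans share at least one character position."""
    return a[0] <= b[1] and b[0] <= a[1]


def keep_flat(spans: List[Span]) -> List[Span]:
    """Greedily keep the highest-scoring spans that do not overlap or nest."""
    kept: List[Span] = []
    for span in sorted(spans, key=lambda s: s[3], reverse=True):
        # Drop a span if it clashes with any higher-scoring span already kept.
        if not any(overlaps(span, k) for k in kept):
            kept.append(span)
    # Return the survivors in reading order.
    return sorted(kept, key=lambda s: s[0])


# "m17文化广场" occupies character positions 0..6; the scores are made up.
predicted = [
    (0, 2, "LOC", 0.71),  # m17
    (3, 6, "LOC", 0.64),  # 文化广场
    (0, 6, "LOC", 0.93),  # m17文化广场
]
flat = keep_flat(predicted)
print(flat)
```

With these made-up scores only the full span m17文化广场 survives. If the two shorter spans had scored higher than the full span, they would both be kept instead, since they do not overlap each other.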
For the Chinese dataset you built yourself, how did you set the hyperparameters? Are they the same as for resume-zh?
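Replacing BERT-base with BERT-wwm is feasible, and the experiment is easy to run. You could also try incorporating word information into the convolution module.
Different datasets may need different hyperparameters, so they have to be tuned according to your experimental results.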
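Hello, and first of all thank you for sharing this work. I have a question. When I run experiments on a dataset I built myself (Chinese, flat), the Label and Entity precision and recall differ considerably on test, while on val the difference is small. Could something be going wrong during decoding?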
2022-04-01 17:47:44 - INFO: Epoch: 9
2022-04-01 17:48:01 - INFO:
+---------+--------+--------+-----------+--------+
| Train 9 | Loss | F1 | Precision | Recall |
+---------+--------+--------+-----------+--------+
| Label | 0.0061 | 0.9698 | 0.9694 | 0.9703 |
+---------+--------+--------+-----------+--------+
2022-04-01 17:48:02 - INFO: EVAL Label F1 [0.99797655 0.98461538 0.94179894 0.90293454 0.85561497 0.99166667
0.8 0.98181818]
2022-04-01 17:48:02 - INFO:
+--------+--------+-----------+--------+
| EVAL 9 | F1 | Precision | Recall |
+--------+--------+-----------+--------+
| Label | 0.9321 | 0.9258 | 0.9389 |
| Entity | 0.9207 | 0.9187 | 0.9226 |
+--------+--------+-----------+--------+
2022-04-01 17:48:03 - INFO: TEST Label F1 [0.99777767 0.985705 0.91578947 0.90581162 0.81385281 0.98876404
0.85964912 1. ]
2022-04-01 17:48:03 - INFO:
+--------+--------+-----------+--------+
| TEST 9 | F1 | Precision | Recall |
+--------+--------+-----------+--------+
| Label | 0.9334 | 0.9176 | 0.9513 |
| Entity | 0.8928 | 0.8799 | 0.9061 |
+--------+--------+-----------+--------+
2022-04-01 17:48:03 - INFO: Best DEV F1: 0.9230
2022-04-01 17:48:03 - INFO: Best TEST F1: 0.8848
2022-04-01 17:48:08 - INFO: TEST Label F1 [0.99751797 0.98505523 0.9197861 0.904 0.79831933 0.98876404
0.84581498 1. ]
2022-04-01 17:48:08 - INFO:
+------------+--------+-----------+--------+
| TEST Final | F1 | Precision | Recall |
+------------+--------+-----------+--------+
| Label | 0.9299 | 0.9131 | 0.9486 |
| Entity | 0.8848 | 0.8688 | 0.9014 |
+------------+--------+-----------+--------+
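As a reading aid for the log above, the short check below recomputes the two TEST Final F1 values from numbers printed in the log. It suggests, without confirming from the code, that the Label row is an unweighted mean over the per-label F1 array while the Entity row is a single F1 over all entities, so the two rows are not measured on the same scale. The variable names are hypothetical.

```python
# Per-label F1 values copied from the last "TEST Label F1" line of the log.
label_f1 = [0.99751797, 0.98505523, 0.9197861, 0.904,
            0.79831933, 0.98876404, 0.84581498, 1.0]

# Unweighted (macro) mean over labels; compare with the Label F1 of 0.9299.
macro_label_f1 = sum(label_f1) / len(label_f1)

# Entity precision and recall copied from the "TEST Final" table.
entity_p, entity_r = 0.8688, 0.9014

# Harmonic mean of precision and recall; compare with the Entity F1 of 0.8848.
entity_f1 = 2 * entity_p * entity_r / (entity_p + entity_r)

print(round(macro_label_f1, 4), round(entity_f1, 4))
```

Because a macro average gives every label equal weight regardless of how often it occurs, a few frequent, easy labels can hold the Label row up even when entity-level errors grow.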