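Hello, a quick question: can I continue training from your released model using only my own data?
I took 10,000 in-domain sentences and applied data augmentation (replacing characters with homophones; inserting, deleting, or changing letters in English words), which produced 110,000 augmented samples. Trained this way, correction on the training samples looks barely acceptable, but performance drops noticeably on the test set. I'm not sure what to do.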
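For readers unfamiliar with this kind of augmentation, here is a minimal sketch of the two noise types described above. It is not the poster's actual script: the `HOMOPHONES` table, the function names, and the choice of one edit per call are all assumptions made for illustration.

```python
import random
import string

# Hypothetical, tiny homophone confusion table; a real one would be far larger.
HOMOPHONES = {"在": ["再"], "的": ["地", "得"], "做": ["作"]}


def replace_homophone(sentence, rng):
    """Replace one character that has a homophone; return (noisy_text, wrong_ids)."""
    candidates = [i for i, ch in enumerate(sentence) if ch in HOMOPHONES]
    if not candidates:
        return sentence, []
    i = rng.choice(candidates)
    noisy = sentence[:i] + rng.choice(HOMOPHONES[sentence[i]]) + sentence[i + 1:]
    return noisy, [i]


def perturb_english_word(word, rng):
    """Apply one random letter-level edit: insert, delete, or substitute."""
    if len(word) < 2:
        return word
    op = rng.choice(["insert", "delete", "substitute"])
    pos = rng.randrange(len(word))
    if op == "insert":
        return word[:pos] + rng.choice(string.ascii_lowercase) + word[pos:]
    if op == "delete":
        return word[:pos] + word[pos + 1:]
    # Substitute with a letter that differs from the current one.
    choices = [c for c in string.ascii_lowercase if c != word[pos].lower()]
    return word[:pos] + rng.choice(choices) + word[pos + 1:]
```

Note that letter insertions and deletions change the sentence length, while homophone replacement keeps it fixed; whether the training pipeline accepts length-changing pairs is something to verify separately.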
Yes, you can continue training. I recommend merging in my training set and training from scratch.
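One possible way to do the merge suggested here, as a rough sketch. The `original_text` key and the function name `merge_training_sets` are assumptions for illustration; adjust them to match the actual JSON files.

```python
import random


def merge_training_sets(own_samples, released_samples, seed=42):
    """Concatenate two sample lists, drop duplicate source texts, and shuffle."""
    merged = []
    seen = set()
    for sample in list(own_samples) + list(released_samples):
        key = sample["original_text"]  # assumed field name
        if key in seen:
            continue  # keep only the first occurrence of each source text
        seen.add(key)
        merged.append(sample)
    # A fixed seed keeps the shuffle reproducible across runs.
    random.Random(seed).shuffle(merged)
    return merged
```

Shuffling keeps in-domain and general samples mixed within each batch instead of arriving in two separate blocks.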
If I fine-tune MacBERT from scratch, is it enough to change BERT_CKPT in train_macbert4csc.yml to hfl/chinese-macbert-base? Also, MacBERT's input length limit is 512, right?
Yes.
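On the length question: BERT-base style checkpoints generally have 512 position embeddings, and two of those positions go to `[CLS]` and `[SEP]`. The sketch below shows one way to pre-split long inputs under that assumption. It approximates one token per character, which is usually close for Chinese text but not exact for English words or numbers; the function name and the punctuation set are illustrative choices, not part of the project.

```python
MAX_TOKENS = 512     # assumed position-embedding limit
SPECIAL_TOKENS = 2   # [CLS] and [SEP]
SENTENCE_ENDS = "。!?!?"


def split_for_bert(text, max_chars=MAX_TOKENS - SPECIAL_TOKENS):
    """Split text into pieces of at most max_chars characters,
    cutting after sentence-ending punctuation when one is available."""
    pieces = []
    start = 0
    while start < len(text):
        end = min(start + max_chars, len(text))
        if end < len(text):
            # Look for the last sentence end inside the current window.
            cut = max(text.rfind(p, start, end) for p in SENTENCE_ENDS)
            if cut > start:
                end = cut + 1
        pieces.append(text[start:end])
        start = end
    return pieces
```

For an exact count, tokenize with the checkpoint's own tokenizer instead of counting characters.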
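Thanks. One more thing: I found some public datasets in which some cases contain no errors, that is, the JSON has no wrong_ids. Can the model train with such data in the training set? Will it hurt model performance?
Yes, you can include them.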
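If records from different sources are mixed, it may help to give them a uniform shape before training. The sketch below fills in an empty `wrong_ids` list for error-free records and reports how large a share they make up. The `original_text` and `correct_text` keys and both function names are assumptions for illustration; only `wrong_ids` is named in this thread.

```python
def normalize_sample(sample):
    """Return a copy where error-free records carry explicit empty fields."""
    normalized = dict(sample)
    if "wrong_ids" not in normalized:
        normalized["wrong_ids"] = []  # no errors in this sentence
    if "correct_text" not in normalized:
        # Assumed field: for a clean sentence the target equals the input.
        normalized["correct_text"] = normalized["original_text"]
    return normalized


def error_free_ratio(samples):
    """Fraction of samples that contain no errors."""
    if not samples:
        return 0.0
    clean = sum(1 for s in samples if not s.get("wrong_ids"))
    return clean / len(samples)
```

Tracking the ratio makes it easier to see how the share of clean sentences relates to results on a held-out set.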