the accuracy between "AttaCut" and "newmm". #26

Closed
charlesfufu opened this issue Dec 31, 2020 · 3 comments

@charlesfufu

Why is the accuracy of the "AttaCut" model worse than that of "newmm" (run through PyThaiNLP) on the TNHC (Thai National Historical Corpus) dataset?
Is there anything I need to pay attention to?
[Screenshot, 2020-12-31 5:36:01 PM]
And, do you know the meaning of "、#、$" in the TNHC dataset?

@p16i
Collaborator

p16i commented Dec 31, 2020

Regarding the first question, we have some discussion in https://arxiv.org/pdf/1911.07056.pdf, Section 5.4. I'll post the snapshot here:

[image: snapshot of Section 5.4 of the paper]

> And, do you know the meaning of "、#、$" in the TNHC dataset?
I don't know either.

@charlesfufu
Author

https://arxiv.org/pdf/1911.07056.pdf

Thanks for your reply.
So, can you provide some of the BEST dataset (tokenized data) for evaluation?
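
For anyone reproducing this kind of comparison on tokenized data, here is a minimal sketch of a boundary-level score for one sentence. It is an illustration only: `boundaries` and `boundary_prf` are hypothetical helper names, the sample tokens are placeholders rather than corpus data, and this is not necessarily the metric used in the paper or in PyThaiNLP's own benchmark.

```python
from typing import List, Set, Tuple


def boundaries(tokens: List[str]) -> Set[int]:
    """Return the character offsets at which each token ends."""
    ends: Set[int] = set()
    pos = 0
    for tok in tokens:
        pos += len(tok)
        ends.add(pos)
    return ends


def boundary_prf(gold: List[str], pred: List[str]) -> Tuple[float, float, float]:
    """Boundary-level precision, recall, and F1 for one sentence.

    Both token lists must spell the same underlying text; otherwise the
    character offsets are not comparable.
    """
    if "".join(gold) != "".join(pred):
        raise ValueError("gold and predicted tokens must spell the same text")
    g, p = boundaries(gold), boundaries(pred)
    tp = len(g & p)  # boundaries placed at the same offset by both
    precision = tp / len(p) if p else 0.0
    recall = tp / len(g) if g else 0.0
    f1 = 2 * precision * recall / (precision + recall) if tp else 0.0
    return precision, recall, f1


# Placeholder tokens (not real corpus data): the two segmentations
# agree on the boundary after "ab" and on the end of the string.
gold_tokens = ["ab", "cd", "e"]
pred_tokens = ["ab", "c", "de"]
print(boundary_prf(gold_tokens, pred_tokens))
```

In practice the predicted tokens would come from each tokenizer, for example `pythainlp.tokenize.word_tokenize(text, engine="attacut")` and `engine="newmm"`; check those engine names against your installed PyThaiNLP version. Note that the end-of-string boundary always matches, so very short sentences score optimistically.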
