Please provide performance metrics in the benchmarks #122
Comments
In chapter 9.5 of the README it says: "The statistical models in Lingua are larger than those of similar libraries. So querying them takes more time. There is a benchmark script in this repo which gives you a clue how performant the library is. You can run it locally with poetry."
Thanks, I'll have to give that a try and share some rough results here. I do think it would be nice/useful to present such stats in the official benchmark comparisons, as there's no way to know what "noticeably slower" means. I know that Fasttext and cld2 tend to be exceptionally fast, so perhaps noticeably slower is still quite acceptable. But if it's a difference of 0.001s vs 1s, then obviously that's a problem.
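For anyone wanting rough numbers like the ones discussed above, a per-text latency harness could look something like this. This is a minimal sketch using only the standard library; `detect` is a placeholder, not Lingua's actual API — to benchmark Lingua itself you would swap in a real detector call (e.g. `LanguageDetector.detect_language_of`) and repeat the run for both the low- and high-accuracy modes.

```python
import statistics
import time

def detect(text):
    # Placeholder standing in for a real language-detection call.
    # Replace with e.g. detector.detect_language_of(text) for Lingua.
    return "en" if text.isascii() else "??"

def benchmark(texts, runs=100):
    """Return mean and p95 per-text latency in milliseconds."""
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        for t in texts:
            detect(t)
        elapsed_ms = (time.perf_counter() - start) * 1000
        timings.append(elapsed_ms / len(texts))
    return {
        "mean_ms": statistics.mean(timings),
        # quantiles(n=20) yields 19 cut points; the last is the 95th percentile.
        "p95_ms": statistics.quantiles(timings, n=20)[-1],
    }

stats = benchmark(["languages are awesome", "Sprachen sind großartig"])
print(stats)
```

Reporting a percentile alongside the mean matters here, since model-loading or cache effects can make the first few calls much slower than steady-state throughput.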
@nickchomey I'm relatively new to this repo, but it has more languages than the translation repo I have been using. I could help test and produce an "output chart", or help craft and submit a PR for this, so I'm willing to collaborate with you to look at a few options for generating the stats.
@datatalking this isn't a focus for me at the moment and probably won't be for at least a few months, so I'm not able to collaborate on anything. But if you have the time and desire to do so, that would be great!
Performance metrics are now provided in the README.
I'm impressed by the accuracy of Lingua compared even to fasttext, but it would be very useful to also see performance metrics in the benchmarks, to determine whether that accuracy comes at a cost. Likewise, it would be useful for comparing Lingua's low- and high-accuracy modes.