Add low accuracy mode #119

pemistahl · 2022-11-05T09:01:13Z

Lingua's high detection accuracy comes at the cost of being noticeably slower than other language detectors. The large language models also consume significant amounts of memory. Systems running low on resources may not be able to meet these requirements.

For users who want to classify mostly long texts or need to save resources, a so-called low accuracy mode will be implemented that loads only a small subset of the language models into memory. The API will be as follows:

LanguageDetectorBuilder::from_all_languages().with_low_accuracy_mode().build();

The downside of this approach is that detection accuracy for short texts of fewer than 120 characters will drop significantly. However, detection accuracy for texts longer than 120 characters will remain mostly unaffected.
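To make the trade-off concrete, here is a minimal, self-contained sketch of how a detector might decide which n-gram models to consult. It is not Lingua's actual implementation: the `AccuracyMode` enum, the `ngram_orders` function, and the choice of "trigrams only" as the small subset are assumptions made for illustration. Only the 120-character figure comes from the description above, and treating a text of exactly 120 characters as long is also an assumption.

```rust
use std::ops::RangeInclusive;

/// Threshold taken from the 120-character figure in the issue text.
/// Treating exactly 120 characters as "long" is an assumption.
const LONG_TEXT_THRESHOLD: usize = 120;

/// Hypothetical mode switch; not part of Lingua's public API.
#[derive(Clone, Copy, Debug, PartialEq)]
enum AccuracyMode {
    High,
    Low,
}

/// Returns the n-gram orders a detector would consult for `text`.
///
/// Assumption: the "small subset" kept in low accuracy mode is the trigram
/// models, while high accuracy mode uses orders 1 to 5 for short texts.
/// Long texts use trigrams in both modes, which is why their accuracy
/// would stay mostly unaffected.
fn ngram_orders(mode: AccuracyMode, text: &str) -> RangeInclusive<usize> {
    let char_count = text.chars().count();
    if mode == AccuracyMode::Low || char_count >= LONG_TEXT_THRESHOLD {
        3..=3
    } else {
        1..=5
    }
}

fn main() {
    let short_text = "languages are awesome";
    let long_text = "x".repeat(200);

    for mode in [AccuracyMode::High, AccuracyMode::Low] {
        println!(
            "{:?}: short text uses orders {:?}, long text uses orders {:?}",
            mode,
            ngram_orders(mode, short_text),
            ngram_orders(mode, &long_text),
        );
    }
}
```

Under these assumptions the two modes differ only for short input, where low accuracy mode gives up the extra n-gram orders in exchange for loading fewer models.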

pemistahl added the new feature label Nov 5, 2022

pemistahl added this to the Lingua 1.5.0 milestone Nov 5, 2022

pemistahl added a commit that referenced this issue Feb 14, 2023

Add low accuracy mode (#119)

53c9ada

pemistahl closed this as completed Feb 14, 2023

pemistahl added a commit that referenced this issue Feb 14, 2023

Fix low accuracy mode in WASM module (#119)

bf59430
