NgramRef is used instead of Ngram during the detection phase #148

serega · 2023-01-21T12:44:15Z

The Model stores Ngrams in a HashMap, where Ngram is just a wrapper around String. TestDataLanguageModel::from and the Iterator of NgramRange allocate a lot of small temporary Strings on the heap during detection. I introduced NgramRef, which is just like Ngram but holds a &str instead of a String. NgramRange now iterates over slices of the input string, which is more efficient. I used the Borrow trait trick to borrow Ngram as &str. Initially I tried borrowing Ngram as NgramRef, only to discover that this is not possible. However, the LanguageModel API is still type-safe: fn get_relative_frequency<'a>(&self, ngram: &NgramRef<'a>).
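For readers unfamiliar with the Borrow trick, here is a minimal sketch of the idea, not the code from this patch. Ngram, NgramRef and the get_relative_frequency signature come from the description above; the Model struct, its frequencies field, the Option<f64> return type and the ngrams helper are assumptions made only for illustration.

```rust
use std::borrow::Borrow;
use std::collections::HashMap;

/// Owned n-gram, used as the key type of the model's map.
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
struct Ngram {
    value: String,
}

/// Borrowed n-gram: a slice of the input text, so no heap allocation.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
struct NgramRef<'a> {
    value: &'a str,
}

// Lets a HashMap<Ngram, _> be queried with a plain &str.
// This is sound because the derived Hash and Eq of a struct with a single
// String field agree with the Hash and Eq of the underlying str.
impl Borrow<str> for Ngram {
    fn borrow(&self) -> &str {
        &self.value
    }
}

/// Hypothetical stand-in for the real language model type.
struct Model {
    frequencies: HashMap<Ngram, f64>,
}

impl Model {
    /// The public API still takes an NgramRef, so callers cannot pass
    /// arbitrary strings; only the map lookup itself goes through &str.
    fn get_relative_frequency<'a>(&self, ngram: &NgramRef<'a>) -> Option<f64> {
        self.frequencies.get(ngram.value).copied()
    }
}

/// Hypothetical helper: yields every n-gram of `text` as a slice of `text`.
/// It collects the char boundaries once instead of building one String per n-gram.
fn ngrams(text: &str, n: usize) -> Vec<NgramRef<'_>> {
    let mut bounds: Vec<usize> = text.char_indices().map(|(i, _)| i).collect();
    bounds.push(text.len());
    if n == 0 || bounds.len() <= n {
        return Vec::new();
    }
    (0..bounds.len() - n)
        .map(|i| NgramRef { value: &text[bounds[i]..bounds[i + n]] })
        .collect()
}
```

Borrowing Ngram as NgramRef directly does not work because Borrow::borrow must return a reference to something that already lives inside self, whereas an NgramRef would have to be constructed as a temporary value. Borrowing as str avoids the problem, since &self.value already points at the String's buffer.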

Benchmark against current main branch

For the benchmark I used the accuracy reports test data. The benchmark code is here. I tested both single-threaded and multi-threaded/parallel mode.

Results

I tested the patch on two machines:

Intel(R) Core(TM) i7-6800K CPU @ 3.40GHz on Linux
M1 Max on Mac

The Before and After columns show throughput in detections per second.
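As a rough illustration of how such numbers can be derived, the sketch below measures detections per second and computes the ratio shown in the Change column. It is not the benchmark code linked above; the function names and the detect closure are hypothetical.

```rust
use std::time::{Duration, Instant};

/// Throughput: number of detections divided by elapsed wall-clock seconds.
fn detections_per_second(count: usize, elapsed: Duration) -> f64 {
    count as f64 / elapsed.as_secs_f64()
}

/// Runs the hypothetical `detect` closure once per example and returns the throughput.
fn throughput<F: FnMut(&str)>(examples: &[&str], mut detect: F) -> f64 {
    let start = Instant::now();
    for text in examples {
        detect(text);
    }
    detections_per_second(examples.len(), start.elapsed())
}

/// The Change column: throughput after the patch divided by throughput before it.
fn speedup(before: f64, after: f64) -> f64 {
    after / before
}
```

For example, speedup(8033.0, 12183.0) is roughly 1.52, which corresponds to the 1.5x reported for the i7-6800K when the change is applied on top of #82.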

Single threaded benchmark

cargo run --release --bin bench -- --max-examples 30000

Machine	Before	After	Change
i7-6800K	1202	1635	1.3x
M1 Max	1704	2198	1.3x

Multi threaded benchmark

cargo run --release --bin bench -- --max-examples 50000 --parallel

Machine	Before	After	Change
i7-6800K	4402	4823	1.09x
M1 Max	1279	1539	1.2x

The numbers for the multi-threaded benchmark are much higher when the change is applied on top of #82.

Multi threaded benchmark (on top of #82)

cargo run --release --bin bench -- --max-examples 50000 --parallel

Machine	Before	After	Change
i7-6800K	8033	12183	1.5x
M1 Max	5858	6860	1.17x

pemistahl · 2023-05-24T07:24:25Z

This concept has now been implemented in a93ef8c, so this PR can be closed. Thanks again, @serega, for your valuable input, and sorry for the long delay.

NgramRef is used instead of Ngram during the detection phase

f33172c

serega mentioned this pull request Jan 21, 2023

Load models slightly more eagerly and reuse for all ngrams during detection. #82

Closed

pemistahl force-pushed the main branch from 29164b1 to dfbb165 Compare January 30, 2023 12:52

pemistahl force-pushed the main branch from a578d46 to 53ea6b0 Compare April 23, 2023 19:39

serega mentioned this pull request May 8, 2023

Performance optimizations (up to 3518% faster language detection) #177

Closed

pemistahl closed this May 24, 2023

pemistahl added this to the Lingua 1.5.0 milestone Jun 10, 2023
