Commit

Update benchmark results
phoerious committed Sep 21, 2021
1 parent b587d64 commit 991d8ad
Showing 1 changed file with 4 additions and 4 deletions.
8 changes: 4 additions & 4 deletions docs/man/parse/lang.rst
@@ -57,14 +57,14 @@ If you know your text is from one of several candidate languages, you can restri

Benchmarks
^^^^^^^^^^
-On inputs the size of an average webpage, Resiliparse's fast language detector is about 2--5x as fast as `FastText <https://fasttext.cc/blog/2017/10/02/blog-post.html>`_ (depends on FastText's convergence speed) and even 40x as fast as `langid <https://github.com/saffsd/langid.py>`_:
+On inputs the size of an average webpage, Resiliparse's fast language detector is about 3--5x as fast as `FastText <https://fasttext.cc/blog/2017/10/02/blog-post.html>`_ (depends on FastText's convergence speed) and even 40x as fast as `langid <https://github.com/saffsd/langid.py>`_:

::

Benchmarking language detectors (10,000 rounds):
-Resiliparse: 2.4s
-FastText: 8.4s
-Langid: 101.3s
+Resiliparse: 1.7s
+FastText: 6.8s
+Langid: 101.7s

Resiliparse's performance advantage comes mostly from the fact that the language detector does not need to tokenize the text or build a vocabulary map at all, which makes it very low-latency, independent of the vocabulary size, and guarantees a fixed memory ceiling and linear runtime complexity.
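
The properties named in the paragraph above (no tokenization, no vocabulary map, fixed memory ceiling, linear runtime) can be illustrated with a small sketch. This is not Resiliparse's actual implementation. It is a toy detector that hashes byte trigrams into a fixed-size vector, and every name in it (``fingerprint``, ``detect``, ``PROFILES``, the sample sentences, the vector size) is a hypothetical chosen for illustration only.

```python
import zlib

VECTOR_SIZE = 256  # assumed bucket count; memory stays fixed regardless of input
NGRAM = 3          # assumed n-gram length (bytes)

# Hypothetical training samples, far too small for real use.
EN_SAMPLE = "the quick brown fox jumps over the lazy dog and this is the way it was"
DE_SAMPLE = "der schnelle braune fuchs springt über den faulen hund und so war es nicht"


def fingerprint(text, size=VECTOR_SIZE, n=NGRAM):
    """Hash overlapping byte n-grams into a normalized fixed-size count vector.

    One pass over the bytes (linear time), no word splitting, no vocabulary.
    """
    data = text.lower().encode("utf-8")
    vec = [0] * size
    for i in range(len(data) - n + 1):
        vec[zlib.crc32(data[i:i + n]) % size] += 1
    total = sum(vec) or 1  # avoid division by zero on very short input
    return [count / total for count in vec]


def distance(a, b):
    """L1 distance between two fingerprints of equal length."""
    return sum(abs(x - y) for x, y in zip(a, b))


# One precomputed fingerprint per candidate language.
PROFILES = {"en": fingerprint(EN_SAMPLE), "de": fingerprint(DE_SAMPLE)}


def detect(text, profiles=PROFILES):
    """Return the language code whose profile is closest to the text."""
    fp = fingerprint(text)
    return min(profiles, key=lambda lang: distance(fp, profiles[lang]))


if __name__ == "__main__":
    print(detect("this is the way the dog jumps"))
```

Because the vector has a fixed number of buckets, memory use does not depend on how many distinct n-grams the input contains, and the cost of classifying a text grows only with its length and the number of candidate profiles. Hash collisions are accepted as noise in this sketch.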
