v2.1.2

ti250 released this 01 Apr 17:41
· 67 commits to master since this release
152a45f

The new class PlainTextCacher lets you cache the results of expensive computations such as tagging or tokenising. Call cache_document on an existing document to store these results, then call hydrate_document on a later run to reuse them instead of recomputing.

In an extraction scenario for a real document, this can give a speedup of more than 10x, because from the second run onwards only parsing needs to be done.
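
The cache-then-hydrate pattern described in these notes can be illustrated with a small standalone sketch. This is not the library's PlainTextCacher: ToyDocument, ToyTextCacher, expensive_tokenise, the name argument, and the JSON file layout are all hypothetical stand-ins chosen for illustration. Only the method names cache_document and hydrate_document are taken from the notes, and their real signatures may differ.

```python
import json
import tempfile
from pathlib import Path


def expensive_tokenise(text):
    """Stand-in for a costly step such as tokenising or tagging (hypothetical)."""
    expensive_tokenise.calls += 1
    return text.split()


# Count how often the expensive step actually runs.
expensive_tokenise.calls = 0


class ToyDocument:
    """Hypothetical document holding raw text and lazily computed tokens."""

    def __init__(self, text):
        self.text = text
        self._tokens = None

    @property
    def tokens(self):
        # Compute tokens only if they have not been computed or hydrated yet.
        if self._tokens is None:
            self._tokens = expensive_tokenise(self.text)
        return self._tokens


class ToyTextCacher:
    """Toy cacher that stores computed tokens as plain-text JSON files."""

    def __init__(self, cache_dir):
        self.cache_dir = Path(cache_dir)
        self.cache_dir.mkdir(parents=True, exist_ok=True)

    def _path(self, name):
        return self.cache_dir / f"{name}.json"

    def cache_document(self, document, name):
        # Accessing document.tokens triggers the expensive step if needed.
        self._path(name).write_text(json.dumps(document.tokens), encoding="utf-8")

    def hydrate_document(self, document, name):
        # Load stored tokens so the document never runs the expensive step.
        document._tokens = json.loads(self._path(name).read_text(encoding="utf-8"))
        return document


def run_demo():
    text = "The melting point of benzene is 5.5 C"
    with tempfile.TemporaryDirectory() as cache_dir:
        cacher = ToyTextCacher(cache_dir)

        # First run: tokenise once and write the result to the cache.
        first = ToyDocument(text)
        cacher.cache_document(first, "paper_1")
        calls_after_first = expensive_tokenise.calls

        # Later run: a fresh document is hydrated from the cache instead.
        second = ToyDocument(text)
        cacher.hydrate_document(second, "paper_1")
        tokens = second.tokens
        return tokens, calls_after_first, expensive_tokenise.calls


if __name__ == "__main__":
    print(run_demo())
```

In this sketch the second document never calls expensive_tokenise, which mirrors the idea that only the remaining work (parsing, in the library's case) has to be repeated on later runs.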