- Add #39: Export to C shared library.
- Add better handling of the version.
- Fix #42: No null pointers in verifier results.
- Fix #41: More words in black list.
- Add #37: add to git protob and version files.
- Add #36: Refactor GNfinder options.
- Add #35: Add version info to gRPC server.
- Add #34: Better language detection.
- Add #33: Make it possible to force Bayes not only "on" but also "off".
- Add #32: Add benchmarks to
- Add #31: Speed up name-finding for large numbers of small texts. Solved only partially, by preloading Bayes training data. We are going to do other optimizations later.
- Fix #30: Tokenizer breaks if a text ends on a dash followed by space.
- Add #29: Enhance verification results. Now preferred data sources have the same fields as the best result. Classification has IDs and ranks.
- Add: Update dictionaries, setting Latin common names to the grey dictionary.
- Add: Dictionaries update.
- Add #28: Generic names from ICN (botanical) code might have authors in parentheses that look the same as subgenus part of ICZN names. As a result parsing such names creates fake uninomials. We removed such fake uninomials from uninomial white dictionary.
- Add #27: Refactor code to make it more maintainable
- Add #26: Command line app tests
- Fix #25: Make CLI app work again (cobra-based CLI does not allow root command with input without flags, so `gnfinder text.txt` was broken).
- Fix #24: Canonical form for matched names
- Fix #23: ExactMatch results have editDistance > 0 sometimes
- Add more tests for gnindex.
- Add #21: support updated gnindex API
- Add #22: Go module support for more stable builds
- Add #19: bring gRPC output close to cli output. Breaks backward compatibility of gRPC.
- Add #20: update API interaction with gnindex.
- Add #17: return offsets for the start and the end of name-strings.
- Fix #18: gRPC works with diacritics in text input.
- Add #16: Docker support. Command `make docker` creates a Docker image.
- Add #15: enable gRPC to set data-source IDs for verification.
- Add #14: setting for name verification data-sources as well as command line flag. Currently tests for gRPC are located in the Ruby gem gnfinder project.
- Add #12: gRPC-based HTTP API to access gnfinder from other languages.
- Add StemEditDistance for fuzzy matching by stem.
- Add #11: Quality Summary and Preferred data sources in verification.
- Add #9: Additional information how to install in README.md.
- Add #8: Retry verification if any error happens in the process.
- Add #7: Add EditDistance field to verification output.
- Add #6: Add 'NoMatch' value to verification 'MatchType'.
- Fix #5: Hide verification "data" if it is empty.
- Remove #6: Remove Verified field, as it repeats 'NoMatch' information.
- Add #4: Name resolution is attempted several times in case of a timeout.
- Fix #3: Name verification breaks on large documents with thousands of words.
- Add: Tokenizer for breaking a text into tokens.
- Add: Heuristic rules for scientific name finding.
- Add: Bayes rules for scientific name finding.
- Add: common European words dictionary.
- Add: Bayes training script to create reference data for Bayes algorithms.
- Add: Command line application `gnfinder` is created using
- Add: Name-verification via gnindex.
- Add: Makefile to simplify compilation of the command line tool.
This document follows changelog guidelines