Global Names Finder


Finds scientific names using dictionary and NLP approaches.

Features

  • Multiplatform packages (Linux, Windows, Mac OS X).
  • Self-contained, no external dependencies: only the gnfinder binary (gnfinder.exe on Windows, ~15 MB) is needed.
  • Takes UTF8-encoded text and returns JSON-formatted output that contains the detected scientific names.
  • Automatically detects the language of the text and adjusts the Bayes algorithm for that language. English and German are currently supported.
  • Uses complementary heuristic and natural language processing algorithms.
  • Does not use Bayes algorithm if language cannot be detected. There is an option that can override this rule.
  • Optionally verifies found names against multiple biodiversity databases using gnindex service.
  • The library can be used concurrently to significantly improve speed. On a server with 40 threads it is able to detect names on 50 million pages in approximately 3 hours using both heuristic and Bayes algorithms. Check the bhlindex project for an example.
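The concurrent use mentioned in the last feature can be sketched with a fixed pool of goroutines. This is only an illustration of the pattern: `findFunc`, `findConcurrently`, and `toyFind` are hypothetical names, and `toyFind` is a placeholder that does not reflect gnfinder's API or algorithms.

```go
package main

import (
	"fmt"
	"strings"
	"sync"
)

// findFunc is a hypothetical stand-in for a name-finding call.
// It is NOT part of gnfinder's API.
type findFunc func(text string) []string

// findConcurrently runs find over pages with a fixed number of workers and
// returns the results in the same order as the input pages.
func findConcurrently(pages []string, workers int, find findFunc) [][]string {
	if workers < 1 {
		workers = 1
	}
	results := make([][]string, len(pages))
	jobs := make(chan int)
	var wg sync.WaitGroup

	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for i := range jobs {
				// Each index is written by exactly one worker, so no lock is needed.
				results[i] = find(pages[i])
			}
		}()
	}
	for i := range pages {
		jobs <- i
	}
	close(jobs)
	wg.Wait()
	return results
}

// toyFind is a placeholder finder: it reports every word that starts with
// an upper-case ASCII letter. A real program would call the library here.
func toyFind(text string) []string {
	var out []string
	for _, w := range strings.Fields(text) {
		if w[0] >= 'A' && w[0] <= 'Z' {
			out = append(out, w)
		}
	}
	return out
}

func main() {
	pages := []string{"Pomatomus saltator and Parus major", "no names here"}
	for i, names := range findConcurrently(pages, 4, toyFind) {
		fmt.Println(i, names)
	}
}
```

Writing each result to its own slice index keeps the output order stable without a mutex, which matters when results must be matched back to page numbers.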

Install as a command line app

Download the binary executable for your operating system from the latest release.

Linux or OS X

Move the gnfinder executable somewhere in your PATH (for example /usr/local/bin):

sudo mv path_to/gnfinder /usr/local/bin

Windows

One possible way would be to create a default folder for executables and place gnfinder there.

Press the Windows+R key combination and type "cmd". In the terminal window that appears, type:

mkdir C:\bin
copy path_to\gnfinder.exe C:\bin

Add C:\bin directory to your PATH environment variable.

Go

go get github.com/gnames/gnfinder
cd $GOPATH/src/github.com/gnames/gnfinder
make install

Usage

Usage as a command line app

To see flags and usage:

gnfinder --help
# or just
gnfinder

To see the version of the binary:

gnfinder -v

Examples:

Getting data from a pipe forcing English language and verification

echo "Pomatomus saltator and Parus major" | gnfinder find -c -l eng

Verifying data against NCBI and Encyclopedia of Life

echo "Pomatomus saltator and Parus major" | gnfinder find -c -l eng -s "4,12"

Getting data from a file and redirecting result to another file

gnfinder find file1.txt > file2.json
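The pipe examples above can also be driven from a Go program. The sketch below assumes a gnfinder binary is in PATH and uses only the flags that appear in the examples; their meaning (-c for verification, -l for language, -s for data source IDs) is inferred from the example descriptions. The helper names `findArgs` and `runFind` are hypothetical.

```go
package main

import (
	"bytes"
	"fmt"
	"os/exec"
	"strings"
)

// findArgs builds the argument list for "gnfinder find" from the flags
// shown in the README examples: -c, -l and -s.
func findArgs(lang string, verify bool, sources []string) []string {
	args := []string{"find"}
	if verify {
		args = append(args, "-c")
	}
	if lang != "" {
		args = append(args, "-l", lang)
	}
	if len(sources) > 0 {
		args = append(args, "-s", strings.Join(sources, ","))
	}
	return args
}

// runFind pipes text to gnfinder's standard input and returns whatever
// the program writes to standard output.
func runFind(text string, args []string) ([]byte, error) {
	cmd := exec.Command("gnfinder", args...)
	cmd.Stdin = strings.NewReader(text)
	var out bytes.Buffer
	cmd.Stdout = &out
	err := cmd.Run()
	return out.Bytes(), err
}

func main() {
	if _, err := exec.LookPath("gnfinder"); err != nil {
		fmt.Println("gnfinder is not in PATH; nothing to run")
		return
	}
	// Same call as: echo "..." | gnfinder find -c -l eng -s "4,12"
	args := findArgs("eng", true, []string{"4", "12"})
	out, err := runFind("Pomatomus saltator and Parus major", args)
	if err != nil {
		fmt.Println("gnfinder failed:", err)
		return
	}
	fmt.Printf("%s\n", out)
}
```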

Usage as gRPC service

Start gnfinder as a gRPC server:

# using default 8778 port
gnfinder grpc

# using some other port
gnfinder grpc -p 8901

Use a gRPC client for gnfinder. To learn how to make one, check a Ruby implementation of a client.

Usage as a library

cd $GOPATH/src/github.com/gnames/gnfinder
make deps

import (
  "github.com/gnames/gnfinder"
  "github.com/gnames/gnfinder/dict"
)

bytesText := []byte(utfText)

jsonNames := FindNamesJSON(bytesText)
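Because the result is JSON, it can be validated and indented for reading without knowing its schema. The following is a minimal sketch using only the Go standard library; `prettyJSON` is a hypothetical helper, and the sample payload is made up for illustration and does not show gnfinder's real output fields.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
)

// prettyJSON checks that raw is valid JSON and returns it indented by
// two spaces. It makes no assumption about the fields inside.
func prettyJSON(raw []byte) (string, error) {
	var buf bytes.Buffer
	if err := json.Indent(&buf, raw, "", "  "); err != nil {
		return "", err
	}
	return buf.String(), nil
}

func main() {
	// Hypothetical payload; real gnfinder output has its own structure.
	sample := []byte(`{"example":["Pomatomus saltator","Parus major"]}`)
	out, err := prettyJSON(sample)
	if err != nil {
		fmt.Println("invalid JSON:", err)
		return
	}
	fmt.Println(out)
}
```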

Usage as a docker container

docker pull gnames/gnfinder

# run gnfinder server, and map it to port 8888 on the host machine
docker run -d -p 8888:8778 --name gnfinder gnames/gnfinder
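After starting the container, it can be useful to confirm that the mapped host port accepts TCP connections before pointing a gRPC client at it. The sketch below assumes the 8888 host port from the command above; `portOpen` is a hypothetical helper and only tests connectivity, not the gRPC service itself.

```go
package main

import (
	"fmt"
	"net"
	"time"
)

// portOpen reports whether a TCP connection to addr succeeds within
// timeoutMs milliseconds.
func portOpen(addr string, timeoutMs int) bool {
	timeout := time.Duration(timeoutMs) * time.Millisecond
	conn, err := net.DialTimeout("tcp", addr, timeout)
	if err != nil {
		return false
	}
	conn.Close()
	return true
}

func main() {
	// 8888 is the host port chosen in the docker run example.
	addr := "localhost:8888"
	if portOpen(addr, 500) {
		fmt.Println(addr, "is accepting connections")
	} else {
		fmt.Println(addr, "is not reachable")
	}
}
```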

Development

To install the latest gnfinder:

go get github.com/gnames/gnfinder
cd $GOPATH/src/github.com/gnames/gnfinder
make deps
make
gnfinder -h

Testing

Install ginkgo, a BDD testing framework for Go.

make deps

To run the tests, go to the root directory of the project and run:

ginkgo

# or

go test

# or

make test