@jrwishart

See Jira DS-2663

… input character vector.

* Fixed up documentation to be more consistent.
Travis CI needs to be able to download the CoreNLP Java library and then run a unit test.

* Updated travis.yml to download CoreNLP with wget, unzip it, and set an environment variable for CoreNLP.
* Added a unit test using the Tom Cruise data.
* Updated the `cnlp_init_corenlp_custom` function to find CoreNLP when a system environment variable points to it.
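
The environment-variable lookup described in the last bullet can be sketched roughly as follows. This is a language-neutral illustration in Python, not the package's R code: the variable name `CORENLP_HOME` and the helper `find_corenlp` are hypothetical, since the commit message does not name the variable or describe the lookup order.

```python
import os
from pathlib import Path
from typing import Mapping, Optional

# Hypothetical variable name; the commit message does not say which one is used.
ENV_VAR = "CORENLP_HOME"


def find_corenlp(explicit_path: Optional[str] = None,
                 env: Optional[Mapping[str, str]] = None) -> Optional[Path]:
    """Return the CoreNLP directory, preferring an explicit path over the
    environment variable. Returns None when neither is an existing directory."""
    env = os.environ if env is None else env
    for candidate in (explicit_path, env.get(ENV_VAR)):
        if candidate and Path(candidate).is_dir():
            return Path(candidate)
    return None
```

On a CI machine, the build script would set the variable after unzipping the download, so that the initialisation code can locate the library without a hard-coded path.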
…it test calculation size for entity detection.
…onsolidate entity unit tests.

Remove the old code that used the tab-separated CoNLL file approach, which cannot handle combined entities and relies on custom regex commands to extract entities. Replace it with an entity extraction function that uses a JSON approach. The generated file is larger, but entity combining is handled automatically and no regex is required to extract entities.
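
To illustrate why the JSON approach needs no regex, here is a minimal sketch in Python (not the package's R implementation of `NERAnnotate`). It assumes the CoreNLP JSON output carries an `entitymentions` list per sentence, where multi-token entities are already combined. The sample document is hand-written and trimmed for illustration; it is not real CoreNLP output, and `extract_entities` is a hypothetical helper.

```python
import json
from typing import List, Tuple

# Hand-written, trimmed sample in the shape of CoreNLP's JSON output.
SAMPLE = """
{"sentences": [
  {"index": 0,
   "entitymentions": [
     {"text": "Tom Cruise", "ner": "PERSON"},
     {"text": "New York", "ner": "LOCATION"}
   ]}
]}
"""


def extract_entities(corenlp_json: str) -> List[Tuple[str, str]]:
    """Collect (entity text, entity type) pairs from every sentence."""
    doc = json.loads(corenlp_json)
    entities = []
    for sentence in doc.get("sentences", []):
        # Each mention is already a combined entity, e.g. "Tom Cruise",
        # so no token stitching or regex is needed.
        for mention in sentence.get("entitymentions", []):
            entities.append((mention["text"], mention["ner"]))
    return entities


if __name__ == "__main__":
    for text, label in extract_entities(SAMPLE):
        print(f"{label}\t{text}")
```

With a token-per-line CoNLL file, "Tom" and "Cruise" arrive as separate rows and have to be merged by hand, which is the work the JSON mentions make unnecessary.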

Other general housekeeping.

* Change aut to ctb in DESCRIPTION
* Update NAMESPACE for required jsonlite function
* Remove multiarch R CMD check to stop rJava 32-bit errors on the local development machine
* Move the custom JSON entity extraction function NERAnnotate into a separate R file.
… and tweak travis code to remove Warnings.

The output directory of the CoreNLP English model jar files needed to be set correctly. Also ignoring vignettes in the R check, since vignette output is cleared on each package build and Travis can't see the output.
@jrwishart jrwishart requested a review from JustinCCYap October 8, 2019 02:03
@JustinCCYap JustinCCYap merged commit 1051806 into master Oct 8, 2019