Add parameter to enable lower-cased lookup of first word in sentence in SfstAnnotator #1041
Thanks, that looks very useful. I see the default value of the parameter is "false". Should it maybe be "true"? Did you notice our contribution guidelines? It would be great if you could provide us with a CLA so we can merge the PR.
* master: (653 commits)
  #1299 - Update to CoreNLP 3.9.2
  #1337 - Connl2012 writer uses WordSense, but does not declare it
  #1299 - Update to CoreNLP 3.9.2
  No issue. Fixed JavaDoc error.
  #1340 - Upgrade dependencies (1.11.0)
  #1358 - Improve error messages in TSV3
  #1357 - Upgrade to ICU4J 64.2
  #1340 - Upgrade dependencies (1.11.0)
  #1343 - Segmenter for Chinese
  No issue. Formatting and remove dewac test since dewac model is no longer available.
  #186 - Change artifactId to "dkpro-core-XXX"
  #186 - Change artifactId to "dkpro-core-XXX"
  #186 - Change artifactId to "dkpro-core-XXX"
  #186 - Change artifactId to "dkpro-core-XXX"
  #186 - Change artifactId to "dkpro-core-XXX"
  #186 - Change artifactId to "dkpro-core-XXX"
  #186 - Change artifactId to "dkpro-core-XXX"
  #186 - Change artifactId to "dkpro-core-XXX"
  #186 - Change artifactId to "dkpro-core-XXX"
  #186 - Change artifactId to "dkpro-core-XXX"
  ...
% Conflicts:
%   dkpro-core-sfst-gpl/src/main/java/de/tudarmstadt/ukp/dkpro/core/sfst/SfstAnnotator.java
@ramonziai @rziai Since this is a commit to a GPL module, we cannot simply accept the PR under the contribution section of the Apache License - would you be able to provide a CLA?
Can one of the admins verify this patch?
Yes, I just sent a signed ICLA to licenses@ukp.informatik.tu-darmstadt.de. Will that be sufficient?
Jenkins, can you test this please?
* master: #186 - Change artifactId to "dkpro-core-XXX"
@ramonziai yep, thanks :)
…entence in SfstAnnotator - Formatting - Adjusting to changes in DKPro Core package names
Jenkins, can you test this please?
…entence in SfstAnnotator - Add missing JavaDoc
Jenkins, can you test this please?
- Commented out logging test because it fails on Jenkins/Windows
Jenkins, can you test this please?
Jenkins, can you test this please?
…entence in SfstAnnotator - Move changes to the right source file
Jenkins, can you test this please?
… of github.com:dkpro/dkpro-core into bugfix/1362-NifWriter-does-not-write-out-NE-identifier
* 'bugfix/1362-NifWriter-does-not-write-out-NE-identifier' of github.com:dkpro/dkpro-core:
  #1041 - Add parameter to enable lower-cased lookup of first word in sentence in SfstAnnotator
  #1366 - Added support in CONLL-U reader for document and paragraph IDs
  #1041 - Add parameter to enable lower-cased lookup of first word in sentence in SfstAnnotator
  #1041 - Add parameter to enable lower-cased lookup of first word in sentence in SfstAnnotator
  Added parameter to enable lower-cased lookup of first word in sentence.
* master:
  #1041 - Add parameter to enable lower-cased lookup of first word in sentence in SfstAnnotator
  #1362 - NifWriter does not write out NE identifier
  #1362 - NifWriter does not write out NE identifier
  #1152 - Introduce "order" feature on tokens
  #1366 - Added support in CONLL-U reader for document and paragraph IDs
  #1041 - Add parameter to enable lower-cased lookup of first word in sentence in SfstAnnotator
  #1041 - Add parameter to enable lower-cased lookup of first word in sentence in SfstAnnotator
  Added parameter to enable lower-cased lookup of first word in sentence.
% Conflicts:
%   dkpro-core-io-conll-asl/src/test/java/org/dkpro/core/io/conll/ConllUReaderTest.java
%   dkpro-core-io-json-asl/src/test/resources/conll/2000/chunk2000_ref.json
%   dkpro-core-io-xmi-asl/src/test/resources/xmi/english.xmi
* master:
  #1041 - Add parameter to enable lower-cased lookup of first word in sentence in SfstAnnotator
  #1362 - NifWriter does not write out NE identifier
  #1362 - NifWriter does not write out NE identifier
  #1152 - Introduce "order" feature on tokens
  #1366 - Added support in CONLL-U reader for document and paragraph IDs
  #1041 - Add parameter to enable lower-cased lookup of first word in sentence in SfstAnnotator
  #1366 - Added support in CONLL-U reader for document and paragraph IDs
  #1367 - Support TCF orthography via SofaChangeAnnotations
  #1041 - Add parameter to enable lower-cased lookup of first word in sentence in SfstAnnotator
  #1327 - Update LIF support
  #1366 - Added support in CONLL-U reader for document and paragraph IDs
  #1367 - Support TCF orthography via SofaChangeAnnotations
  Forgot to commit the list declaration
  Warn if CONLL-U file contains multiple documents
  Added support in CONLL-U reader for document and paragraph IDs
  #186 - Change artifactId to "dkpro-core-XXX"
  #1299 - Update to CoreNLP 3.9.2
  #1337 - Connl2012 writer uses WordSense, but does not declare it
  #1299 - Update to CoreNLP 3.9.2
  Added parameter to enable lower-cased lookup of first word in sentence.
Many full-form lexicons underlying morphological models do not contain capitalized variants of words that are normally written in lowercase. As a result, such words are not found and cannot be analyzed when they are capitalized at the start of a sentence. This fix adds a parameter that enables looking up the lowercased version of sentence-initial words; the lowercasing uses the locale of the document language.
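To illustrate the idea, here is a minimal sketch of locale-aware lowercasing for sentence-initial tokens. It is not the code from this PR: the class name `SentenceInitialLookup`, the method `lookupForm`, and all parameter names are hypothetical, and the real SfstAnnotator may handle the lookup differently. The only library call used is the standard `String.toLowerCase(Locale)`.

```java
import java.util.Locale;

// Hypothetical helper, not part of DKPro Core: shows why the locale matters
// when lowercasing the first word of a sentence before a lexicon lookup.
public class SentenceInitialLookup {

    /**
     * Returns the form that would be sent to the morphological analyzer.
     *
     * @param tokenText          surface form of the token
     * @param firstInSentence    whether the token starts a sentence
     * @param lowercaseFirstWord the (assumed) switch corresponding to the new parameter
     * @param documentLocale     locale derived from the document language
     */
    static String lookupForm(String tokenText, boolean firstInSentence,
            boolean lowercaseFirstWord, Locale documentLocale) {
        if (lowercaseFirstWord && firstInSentence) {
            // Locale-sensitive: e.g. Turkish maps "I" to dotless "ı", not "i".
            return tokenText.toLowerCase(documentLocale);
        }
        // All other tokens are looked up unchanged.
        return tokenText;
    }

    public static void main(String[] args) {
        Locale turkish = Locale.forLanguageTag("tr");

        System.out.println(lookupForm("Das", true, true, Locale.GERMAN));  // das
        System.out.println(lookupForm("Das", true, false, Locale.GERMAN)); // Das
        System.out.println(lookupForm("Haus", false, true, Locale.GERMAN)); // Haus
        System.out.println(lookupForm("IŞIK", true, true, turkish));        // ışık
    }
}
```

Note that unconditional lowercasing would also affect words that are legitimately capitalized (for example German nouns at the start of a sentence), so a real implementation might prefer to try the original form as well; the sketch leaves that decision open.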