Update datasets API #1384
reckart added a commit that referenced this issue on Jun 12, 2019:
- Add SHA512 hash support
- Update dataset descriptions
- Add option to calculate hash sum of plain text files using normalized whitespace
- Added option to set a default verification policy via a system property
- Fixed a few bad dataset descriptions
- Updated documentation regarding the new verificationMode option and SHA512
- Improved log messages
reckart added a commit that referenced this issue on Jun 12, 2019:
- Update GUM 5.0.0 UD dataset license info
reckart added a commit that referenced this issue on Jun 12, 2019:
- Removed comment which shouldn't apply to GUM UD version
reckart added a commit that referenced this issue on Jul 4, 2019:
* master:
  #1382 - TEI reader seems not to be trimming whitespace
  #1378 - BratReader crashes when an annotation covers more than two spans of text
  #1379 - Add generic XML types
  #1382 - TEI reader seems not to be trimming whitespace
  #1384 - Update datasets API
  #1384 - Update datasets API
  #1384 - Update datasets API
  #1384 - Update datasets API
reckart added a commit that referenced this issue on Jul 19, 2019:
* 1.11.x: (371 commits)
  No issue. Set version to 1.11.0-SNAPSHOT.
  No issue. Fix checkstyle issue.
  No issue. Fix more JavaDoc issues.
  No issue. Upgrade to DKPro Meta 0.2.0.
  #1382 - TEI reader seems not to be trimming whitespace
  #1382 - TEI reader seems not to be trimming whitespace
  #1378 - BratReader crashes when an annotation covers more than two spans of text
  #1379 - Add generic XML types
  #1382 - TEI reader seems not to be trimming whitespace
  #1384 - Update datasets API
  #1384 - Update datasets API
  #1384 - Update datasets API
  #1384 - Update datasets API
  #1381 - Annotations starting/ending in inter-token space cause exception
  #1379 - Add generic XML types
  #1379 - Add generic XML types
  #1379 - Add generic XML types
  #1376 - Update TreeTagger build.xml
  #1346 - Reader for Annotated Gigaword
  #186 - Change artifactId to "dkpro-core-XXX"
  ...

% Conflicts:
%   dkpro-core-io-brat-asl/src/main/java/de/tudarmstadt/ukp/dkpro/core/io/brat/BratReader.java
%   dkpro-core-io-brat-asl/src/main/java/de/tudarmstadt/ukp/dkpro/core/io/brat/internal/model/BratAnnotationDocument.java
%   dkpro-core-io-brat-asl/src/main/java/de/tudarmstadt/ukp/dkpro/core/io/brat/internal/model/TypeMapping.java
%   dkpro-core-io-brat-asl/src/test/java/de/tudarmstadt/ukp/dkpro/core/io/brat/BratReaderWriterTest.java
Many of the URLs listed in the dataset descriptions now redirect elsewhere (usually http -> https) and need to be updated since the DatasetFactory cannot deal with redirects yet.
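For context on why an http -> https redirect trips up a downloader: Java's `HttpURLConnection` follows redirects on its own only when the protocol stays the same, so a cross-protocol redirect has to be handled by hand. The following is a minimal sketch of one way to do that, not the DatasetFactory's code. The class `RedirectSketch` and all of its method names are hypothetical, as is the choice of a redirect limit supplied by the caller.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URI;

// Hypothetical helper, not part of DKPro Core.
class RedirectSketch {
    // HTTP status codes that announce a new location for the resource.
    static boolean isRedirect(int status) {
        return status == 301 || status == 302 || status == 303
                || status == 307 || status == 308;
    }

    // The Location header may be absolute or relative, so resolve it
    // against the URL that produced the redirect.
    static String resolveLocation(String currentUrl, String location) {
        return URI.create(currentUrl).resolve(location).toString();
    }

    // Follows up to maxRedirects redirects manually, including http -> https.
    static InputStream openFollowingRedirects(String url, int maxRedirects)
            throws IOException {
        String current = url;
        for (int i = 0; i <= maxRedirects; i++) {
            HttpURLConnection conn =
                    (HttpURLConnection) URI.create(current).toURL().openConnection();
            // Disable the built-in handling so every hop is visible here.
            conn.setInstanceFollowRedirects(false);
            int status = conn.getResponseCode();
            if (!isRedirect(status)) {
                return conn.getInputStream();
            }
            String location = conn.getHeaderField("Location");
            conn.disconnect();
            if (location == null) {
                throw new IOException("Redirect without Location header: " + current);
            }
            current = resolveLocation(current, location);
        }
        throw new IOException("Too many redirects starting from " + url);
    }
}
```

Capping the number of hops guards against redirect loops. Until something along these lines exists, updating the URLs in the dataset descriptions remains the direct fix.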
The text file for CC-BY 4.0 has had a whitespace-only change. It would be good to have the option to ignore whitespace when validating plain text files to be more resilient against such changes.
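To illustrate the idea of whitespace-insensitive validation, here is a minimal sketch that normalizes whitespace before hashing. It is not the implementation used in the project. The class `NormalizedHashSketch` and its method names are hypothetical, and the normalization rule (trim, then collapse every whitespace run to a single space) is an assumption about what "normalized whitespace" could mean.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper, not part of DKPro Core.
class NormalizedHashSketch {
    // Assumed rule: trim the text and collapse each run of whitespace
    // (spaces, tabs, line breaks) into a single space.
    static String normalizeWhitespace(String text) {
        return text.trim().replaceAll("\\s+", " ");
    }

    // Returns the SHA-512 digest of the UTF-8 encoded text as lowercase hex.
    static String sha512Hex(String text) {
        try {
            MessageDigest digest = MessageDigest.getInstance("SHA-512");
            byte[] hash = digest.digest(text.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder();
            for (byte b : hash) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // Every Java platform is required to provide SHA-512.
            throw new IllegalStateException(e);
        }
    }

    // Hash that stays the same under whitespace-only edits.
    static String normalizedSha512Hex(String text) {
        return sha512Hex(normalizeWhitespace(text));
    }

    public static void main(String[] args) {
        String before = "Attribution 4.0 International\n\nLicense text";
        String after = "Attribution 4.0  International\r\nLicense text\n";
        // The raw hashes differ, the normalized ones match.
        System.out.println(sha512Hex(before).equals(sha512Hex(after)));
        System.out.println(normalizedSha512Hex(before).equals(normalizedSha512Hex(after)));
    }
}
```

The trade-off is that a normalized hash no longer detects whitespace-only tampering, which is usually acceptable for plain text license files but not for binary artifacts.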