- The PDF filtering code has been hardened to withstand processing uncharacteristic PDF files with excessively large in-memory representations, without filling up the heap and without requiring changes to existing plugins.
The proxy failed to normalize URLs in requests that include an AUID.
Cancelling hashes started from DebugPanel or HasherService frequently did not work, and sometimes crashed the daemon.
Aborting crawls using crawlPriorityAuMap did not work.
- Upgraded third-party libraries to address security vulnerabilities reported against them. Updated versions include Apache PDFBox 1.8.16 (CVE-2018-11797), Apache Commons Compress 1.18 (CVE-2018-11771) and FasterXML Jackson 2.9.7 (CVE-2018-7489).
- Some of the ways ServeContent can be invoked failed in some cases on AUs having multiple crawl-start URLs, when some of the start URLs do not exist.
The new metadata type "File" supports indexing of arbitrary publication types. Support is in place for both publication-level items (MetadataField.PUBLICATION_TYPE_FILE) and article-level items (MetadataField.ARTICLE_TYPE_FILE). Article-level file items will be assumed to have a publication-level file parent even if not explicitly defined. Item metadata beyond the standard access URL, publisher, and provider may be stored as arbitrary key-value pairs in a
Content Configuration web service now adds AUs from their TDB definition rather than by AUID, matching the way other subsystems add AUs: it includes non-definitional parameters and chooses the least full repository.
Deep crawl status information (lastCompletedDeepCrawlDepth) is tracked and reported in the UI, and through the
Debug Panel and AU Status now include a "Validate Files" action which runs the plugin's ContentValidator on all files in the AU, reporting any
In lieu of a MIME-type content validator factory, plugins may specify an
ValidationFailures will occur for URLs that match one of the patterns but whose Content-Type does not match the corresponding MIME type. E.g.,
<entry>
  <string>au_url_mime_validation_map</string>
  <list>
    <string>/doi/pdf(plus)?/, application/pdf</string>
    <string>/doi/(abs|full)/, text/html</string>
  </list>
</entry>
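The matching rule described above can be illustrated with a small Python sketch. It is not the daemon's implementation: the helper names (`parse_entries`, `base_mime`, `mime_mismatches`) are hypothetical, and it assumes that each entry is split at its last comma, that patterns are applied as unanchored regular-expression searches, and that Content-Type parameters such as a charset are ignored when comparing.

```python
import re

# Entries copied from the example map above: "<URL pattern>, <MIME type>".
RAW_ENTRIES = [
    "/doi/pdf(plus)?/, application/pdf",
    "/doi/(abs|full)/, text/html",
]

def parse_entries(raw_entries):
    """Split each entry at its last comma into (compiled pattern, MIME type)."""
    parsed = []
    for entry in raw_entries:
        pattern, mime = entry.rsplit(",", 1)
        parsed.append((re.compile(pattern.strip()), mime.strip().lower()))
    return parsed

def base_mime(content_type):
    """Drop parameters such as '; charset=UTF-8' and normalize case (assumption)."""
    return content_type.split(";", 1)[0].strip().lower()

def mime_mismatches(url, content_type, entries):
    """Return the expected MIME types that the URL's Content-Type fails to match."""
    actual = base_mime(content_type)
    return [mime for pattern, mime in entries
            if pattern.search(url) and mime != actual]

ENTRIES = parse_entries(RAW_ENTRIES)
```

Under these assumptions, a URL containing `/doi/pdf/` that is served as `text/html` yields a mismatch against `application/pdf`, while a URL matching none of the patterns is never flagged.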
ContentValidationException.LogOnly to record a warning message without causing validation failure.
The "Files" list from AU Status now includes a
SubscriptionManager omitted non-definitional parameters when adding subscribed AUs.
The Link Rewriter rewrote in-page links ("#ref"), breaking them.
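As a rough illustration of why in-page links need special handling, the Python sketch below leaves fragment-only links untouched and rewrites everything else. It is not the Link Rewriter's code: the function `rewrite_link` and the `/ServeContent?url=` prefix are assumptions made for the example.

```python
from urllib.parse import quote, urljoin

def rewrite_link(href, page_url, serve_prefix="/ServeContent?url="):
    """Route href through a serving endpoint, leaving in-page links alone.

    serve_prefix is a hypothetical endpoint used only for illustration.
    """
    if href.startswith("#"):
        # Fragment-only link: it targets an anchor within the current page,
        # so rewriting it would send the browser to a different URL.
        return href
    # Resolve relative links against the page URL, then percent-encode the result.
    absolute = urljoin(page_url, href)
    return serve_prefix + quote(absolute, safe="")
```

With this approach `"#ref"` is returned unchanged, while a relative link such as `"c.html"` is resolved against the page URL before being wrapped.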
Metadata item type inference reversed
BOOKVOLUME in some circumstances.
In the queryAus() web service, selecting the newContentCrawlUrls field caused a fatal error.
queryAus() web services was not accessible using
Fixed unsafe database resource closings and incorrect comparisons in metadata-handling code.
Fixed active task removal when metadata indexing for an AU is disabled.
- Allow the content configuration Web Service to use the same storage volume selection logic as the UI when adding AUs.
Bug fixes in ServeContent link rewriting and OpenURL resolver.
Properly trigger configuration of AUs after synchronizing whole title subscriptions.