This repository has been archived by the owner on Jul 3, 2023. It is now read-only.

Bump tika.version from 1.24 to 1.27 #179

Merged
@lewismc merged 1 commit into master from dependabot/maven/tika.version-1.27 on Sep 14, 2021

Conversation

dependabot[bot] (Contributor) commented on behalf of github on Sep 14, 2021

Bumps tika.version from 1.24 to 1.27.
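
In a Maven build, a bump like this is usually a one-line change to a shared property. The excerpt below is a minimal sketch, assuming the project's pom.xml defines `tika.version` under `<properties>` and that both Tika artifacts reference it. The actual pom.xml is not shown on this page, so the layout is hypothetical; only the property name, artifact coordinates, and version numbers come from the PR itself.

```xml
<!-- Hypothetical pom.xml excerpt (not the project's real file); other required POM elements omitted. -->
<project>
  <properties>
    <!-- Was 1.24 before this PR -->
    <tika.version>1.27</tika.version>
  </properties>

  <dependencies>
    <dependency>
      <groupId>org.apache.tika</groupId>
      <artifactId>tika-core</artifactId>
      <version>${tika.version}</version>
    </dependency>
    <dependency>
      <groupId>org.apache.tika</groupId>
      <artifactId>tika-parsers</artifactId>
      <version>${tika.version}</version>
    </dependency>
  </dependencies>
</project>
```

Because both dependencies resolve their version from the one property, changing the property value moves tika-core and tika-parsers together, which would explain why a single PR updates both.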
Updates tika-core from 1.24 to 1.27

Changelog

Sourced from tika-core's changelog.

Release 2.1.1 - ???

  • Improve robustness and features of the httpfetcher (TIKA-3543)

  • Add optional fetch ranges to FetchEmitTuple to allow range fetching from, e.g. http or s3 (TIKA-3542).

  • Exclude dependencies on jsoup and ehcache in ucar grib/cdm (TIKA-3003).

Release 2.1.0 - 08/18/2021

MAJOR CHANGES in 2.1.0:

  • Improved packaging for tika-parsers-extended. Use the tika-parser-scientific-package and tika-parser-sqlite3-package artifacts if you want fat jars with dependencies. (TIKA-3510)

  • Tika app writes UTF-8 when an encoding is not specified; the legacy behavior was UTF-8 on Mac OS, but System default on other OSs (TIKA-3515).

  • Change the default rendering strategy for PDFs from NO_TEXT to ALL (TIKA-3520).

Other changes:

  • Fixed bug that pointed to the wrong tessdata directory if the user specified a tesseract path but not also a tessdata path (TIKA-3518).

  • Fixed bug in Icu4j's encoding detector where it would return non-standard names for charsets, e.g. IBM424_rtl is now returned as IBM424 (TIKA-3516).

  • Add a simple UrlFetcher in tika-core as a basic alternative to tika-fetcher-http (TIKA-3527).

  • Add tika-pipes support for Google Cloud Storage (TIKA-3524).

  • Fix markup ordering errors in xhtml output for ODT files (TIKA-2242).

  • Fix serialization of embedded docs in OpenSearch emitter and fix embedded documents not being indexed in some use cases in the Solr emitter (TIKA-3490).

  • Add pipesClientId system property to PipesServer so that each forked process can log to its own logger (TIKA-3480).

  • Add DateNormalizingMetadataFilter to let users ensure that all dates emitted to Solr/OpenSearch are in UTC. Users can configure which timezone they'd like to use in cases where the file format does not store a timezone (TIKA-3496).

  • Breaking change in the Solr and OpenSearch emitters. To achieve

... (truncated)

Commits
  • ccf9442 [maven-release-plugin] prepare release 1.27-rc1
  • 31d44e9 prep for 1.27-rc1
  • f414130 TIKA-3459 -- integrate Drew Noakes metadata-extractor as the underlying MP4 p...
  • 74c5e5a TIKA-3460 -- add missing properties files for jaiimageio-core
  • 57f5912 TIKA-3457 -- general upgrades for 1.27
  • 4ba5fd7 TIKA-3456 -- LanguageDetector should chunk long strings and test for hasEnoug...
  • 90c6ea4 TIKA-3444 -- upgrade to pdfbox 2.0.24
  • 1224f88 TIKA-3441 -- improve likelihood that tesseract processes will be shutdown on ...
  • e8ec223 Merge remote-tracking branch 'origin/branch_1x' into branch_1x
  • d7fa2cd TIKA-3441 -- improve likelihood that tesseract processes will be shutdown on ...
  • Additional commits viewable in compare view

Updates tika-parsers from 1.24 to 1.27

Changelog and commits

Identical to the tika-core entries above: both artifacts are sourced from the same Apache Tika changelog and share the same commit range (1.24...1.27).

Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.


Dependabot commands and options

You can trigger Dependabot actions by commenting on this PR:

  • @dependabot rebase will rebase this PR
  • @dependabot recreate will recreate this PR, overwriting any edits that have been made to it
  • @dependabot merge will merge this PR after your CI passes on it
  • @dependabot squash and merge will squash and merge this PR after your CI passes on it
  • @dependabot cancel merge will cancel a previously requested merge and block automerging
  • @dependabot reopen will reopen this PR if it is closed
  • @dependabot close will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
  • @dependabot ignore this major version will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
  • @dependabot ignore this minor version will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
  • @dependabot ignore this dependency will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)

Bumps `tika.version` from 1.24 to 1.27.

Updates `tika-core` from 1.24 to 1.27
- [Release notes](https://github.com/apache/tika/releases)
- [Changelog](https://github.com/apache/tika/blob/main/CHANGES.txt)
- [Commits](apache/tika@1.24...1.27)

Updates `tika-parsers` from 1.24 to 1.27
- [Release notes](https://github.com/apache/tika/releases)
- [Changelog](https://github.com/apache/tika/blob/main/CHANGES.txt)
- [Commits](apache/tika@1.24...1.27)

---
updated-dependencies:
- dependency-name: org.apache.tika:tika-core
  dependency-type: direct:production
  update-type: version-update:semver-minor
- dependency-name: org.apache.tika:tika-parsers
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
@dependabot added the dependencies label (Pull requests that update a dependency file) on Sep 14, 2021
@lewismc merged commit 736267b into master on Sep 14, 2021
@lewismc deleted the dependabot/maven/tika.version-1.27 branch on September 14, 2021 01:16